An event-driven ASP.NET Core backend that accepts document-analysis jobs through a Web API, stores job state in PostgreSQL, and processes jobs asynchronously using RabbitMQ and a background worker.
This portfolio project is built to demonstrate employable .NET backend skills beyond basic CRUD APIs, including:
- asynchronous job processing
- worker-based background execution
- message-driven architecture
- explicit job lifecycle management
- outbox pattern
- PostgreSQL persistence with EF Core
- fully containerised application and infrastructure
- layered architecture
- automated testing
- GitHub Actions CI
- idempotent consumer design
In the current version, the API accepts text input rather than real file uploads. A client submits a document-processing job; the API stores the job together with its outbox message and marks the job as queued. The publisher periodically polls the outbox for unpublished messages, processes them in batches, and publishes them to RabbitMQ. A worker then processes the job asynchronously, and the client can query the job status and results afterwards.
This project is designed to show practical backend skills that map to real systems:
- ASP.NET Core Web API design
- EF Core + PostgreSQL persistence
- RabbitMQ message publishing and consumption
- background worker processing
- explicit domain state transitions
- polling-based async job tracking
- outbox pattern
- clean separation of concerns across layers
- fully containerised local development with Docker Compose
- integration, domain, and worker unit testing
- GitHub Actions CI
- dead-letter queue handling
- consumer-side retry tracking
- idempotent consumer
- .NET 10
- ASP.NET Core Web API
- Worker Service
- EF Core 10
- PostgreSQL
- RabbitMQ
- Docker / Docker Compose
- xUnit
- GitHub Actions
- Document Processing UI — React frontend
src/
  DocumentProcessing.Api
  DocumentProcessing.Application
  DocumentProcessing.Domain
  DocumentProcessing.Infrastructure
  DocumentProcessing.Worker
tests/
  DocumentProcessing.Api.Tests
  DocumentProcessing.Domain.Tests
  DocumentProcessing.Worker.Tests
  DocumentProcessing.E2E.Tests
- DocumentProcessing.Api: HTTP endpoints, request/response contracts, JSON configuration, and API wiring.
- DocumentProcessing.Application: Use cases, DTOs, service abstractions, repository abstractions, and messaging abstractions.
- DocumentProcessing.Domain: Core business model, job lifecycle rules, and domain invariants.
- DocumentProcessing.Infrastructure: EF Core persistence, repository implementations, RabbitMQ publisher, and infrastructure registration.
- DocumentProcessing.Worker: Background consumer that reads RabbitMQ messages, loads jobs from the database, performs analysis, and updates job state.
The core aggregate is DocumentJob.
A job starts in Pending, is moved to Queued when the job and outbox message are persisted, then moves to Processing when the worker begins analysis. From Processing, the job can become either Completed or Failed.
There is also a dispatch-failure path from Queued to Failed. It is taken when the outbox publisher still cannot publish the job message to RabbitMQ after exceeding the maximum retry count. In that case, the job fails before worker processing begins.
Transition rules:
- Pending -> Queued
- Queued -> Processing
- Processing -> Completed
- Processing -> Failed
- Queued -> Failed (for outbox dispatch failure)
For v1, Completed and Failed are treated as terminal states.
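The transition rules above can be sketched as a small lookup table. This is a minimal illustration rather than the project's actual DocumentJob code: the JobStatus enum and the JobTransitions class are hypothetical names, and only the allowed transitions themselves come from the rules listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names for illustration; the real aggregate is DocumentJob.
public enum JobStatus { Pending, Queued, Processing, Completed, Failed }

public static class JobTransitions
{
    // Allowed target states per source state, taken from the transition rules.
    private static readonly Dictionary<JobStatus, JobStatus[]> Allowed = new()
    {
        [JobStatus.Pending]    = new[] { JobStatus.Queued },
        [JobStatus.Queued]     = new[] { JobStatus.Processing, JobStatus.Failed },
        [JobStatus.Processing] = new[] { JobStatus.Completed, JobStatus.Failed },
        // Completed and Failed are terminal in v1, so nothing is allowed after them.
        [JobStatus.Completed]  = Array.Empty<JobStatus>(),
        [JobStatus.Failed]     = Array.Empty<JobStatus>(),
    };

    public static bool CanTransition(JobStatus from, JobStatus to) =>
        Array.IndexOf(Allowed[from], to) >= 0;

    // Guard-style helper: throws instead of silently ignoring an illegal move.
    public static JobStatus Transition(JobStatus from, JobStatus to)
    {
        if (!CanTransition(from, to))
            throw new InvalidOperationException($"Cannot move a job from {from} to {to}.");
        return to;
    }
}
```

With this table, `JobTransitions.CanTransition(JobStatus.Queued, JobStatus.Failed)` is true (the dispatch-failure path), while every transition out of Completed or Failed is rejected.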
POST /api/jobs
Accepts text input and returns a queued job response.
GET /api/jobs/{id}
Returns the current status and any available analysis results.
GET /api/jobs
Returns jobs ordered by SubmittedAtUtc descending.
Example request body:

{
"inputText": "This is a test document.\nIt has multiple lines.\n"
}

Example response immediately after submission:

{
"id": "c6147881-a3a7-41fb-97f7-f96d12e62e58",
"status": "Queued",
"inputText": "This is a test document.\nIt has multiple lines.\n",
"submittedAtUtc": "2026-04-15T05:30:40.209483Z",
"updatedAtUtc": "2026-04-15T05:30:40.283789Z",
"completedAtUtc": null,
"errorMessage": null,
"wordCount": null,
"characterCount": null,
"lineCount": null,
"keywordHits": null,
"category": null,
"summary": null
}

Example response after processing completes:

{
"id": "c6147881-a3a7-41fb-97f7-f96d12e62e58",
"status": "Completed",
"inputText": "This is a test document.\nIt has multiple lines.\n",
"submittedAtUtc": "2026-04-15T05:30:40.209483Z",
"updatedAtUtc": "2026-04-15T05:30:40.796823Z",
"completedAtUtc": "2026-04-15T05:30:40.796823Z",
"errorMessage": null,
"wordCount": 9,
"characterCount": 48,
"lineCount": 2,
"keywordHits": 0,
"category": "General",
"summary": "This is a test document.\nIt has multiple lines.\n"
}

- Client submits a document job to POST /api/jobs; the job's initial state is Pending.
- Application marks the job as Queued.
- Application persists the job and the outbox message atomically in a single transaction.
- Background outbox publisher periodically polls for unpublished outbox messages, publishes them to RabbitMQ, and marks them as published.
- Worker consumes the message from RabbitMQ.
- Worker loads the job from PostgreSQL.
- Worker marks the job as Processing. In the current version, this intermediate state may not be visible to the client during fast processing or retry scenarios.
- Worker performs simple text analysis.
- Worker marks the job as Completed or Failed.
- Client retrieves job status using GET /api/jobs/{id} or GET /api/jobs.
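From the client's side, the last step amounts to polling until the job reaches a terminal state. The helper below is a rough sketch under assumptions: JobPolling and the getStatus delegate are hypothetical, with the delegate standing in for a GET /api/jobs/{id} call that returns the value of the status field.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class JobPolling
{
    // Polls until the job reports Completed or Failed, or until the attempts run out.
    // getStatus is a stand-in for an HTTP call to GET /api/jobs/{id}.
    public static async Task<string> WaitForTerminalStatusAsync(
        Func<CancellationToken, Task<string>> getStatus,
        int maxAttempts,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        var status = "Unknown";
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            status = await getStatus(cancellationToken);
            if (status is "Completed" or "Failed")
                return status;

            // Wait between attempts, but not after the final one.
            if (attempt < maxAttempts)
                await Task.Delay(delay, cancellationToken);
        }

        // Still non-terminal: the caller decides whether to keep waiting.
        return status;
    }
}
```

Because Processing may never be observed for fast jobs, the loop only checks for the two terminal states rather than expecting every intermediate status to appear.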
The worker currently produces:
- word count
- character count
- line count
- keyword hit count
- default category
- truncated summary
Notes
- Line counting ignores trailing newline characters.
- keywordHits is currently a placeholder implementation.
- category is currently a simple default value.
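The counts in the example response can be reproduced with a few lines of string handling. This is a sketch of one way to compute them, not the worker's actual implementation: AnalysisResult, TextAnalysis, and the 200-character summary limit are assumptions, and keywordHits and category are left out because they are placeholders.

```csharp
using System;

// Hypothetical result type; the field names mirror the JSON response.
public sealed record AnalysisResult(int WordCount, int CharacterCount, int LineCount, string Summary);

public static class TextAnalysis
{
    // Assumed limit: the real summary length is not stated in this README.
    private const int MaxSummaryLength = 200;

    public static AnalysisResult Analyze(string text)
    {
        // Words are runs of characters separated by spaces, tabs, or line breaks.
        var wordCount = text
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Length;

        // Trailing newline characters do not start a new line.
        var trimmed = text.TrimEnd('\r', '\n');
        var lineCount = trimmed.Length == 0 ? 0 : trimmed.Split('\n').Length;

        // The summary is the input itself, cut off at the assumed limit.
        var summary = text.Length <= MaxSummaryLength
            ? text
            : text.Substring(0, MaxSummaryLength);

        return new AnalysisResult(wordCount, text.Length, lineCount, summary);
    }
}
```

For the example input `"This is a test document.\nIt has multiple lines.\n"` this yields 9 words, 48 characters, and 2 lines, matching the completed response shown earlier.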
docker compose up -d

This starts all services:
- PostgreSQL
- RabbitMQ
- API (available at http://localhost:8080)
- Worker
Database migrations are applied automatically on API startup.
curl -X POST http://localhost:8080/api/jobs \
  -H "Content-Type: application/json" \
  -d '{"inputText": "Hello from the fully containerised stack!"}'

curl http://localhost:8080/api/jobs/{id}

The RabbitMQ management UI is available at http://localhost:15672 using the credentials configured in docker-compose.yml.
Interactive API documentation is available via Scalar at http://localhost:8080/scalar/v1 when running locally.
If you prefer to run the API and worker directly for faster iteration:
1. Start infrastructure only
docker compose up -d postgres rabbitmq

2. Apply database migrations

dotnet ef database update \
  --project src/DocumentProcessing.Infrastructure \
  --startup-project src/DocumentProcessing.Api

3. Run the API and worker

dotnet run --project src/DocumentProcessing.Api
dotnet run --project src/DocumentProcessing.Worker

Domain tests:
- DocumentJob creation with valid input
- valid transitions
- invalid transitions
- guard clauses
- completion result mapping
- OutboxMessage creation and validation rules
- outbox publication/error rules

API integration tests:
- create job returns an accepted response
- get job by id returns the persisted job
- get job by id returns 404 when missing
- list jobs returns jobs ordered by submission time
API integration tests require a running PostgreSQL instance.
Start infrastructure with docker compose up -d postgres rabbitmq before running the full test suite locally.
Domain and worker unit tests run without any infrastructure.

Worker unit tests:
- document analysis returns expected counts for single-line input
- document analysis returns expected counts for multiline input
- trailing newline does not create an extra counted line
- long input truncates summary correctly
End-to-end tests:
- full job lifecycle from submission to completion
- full job lifecycle from submission to failure (skipped; see Known Limitation: Consumer Retry Queue Pattern)

E2E tests require the full stack to be running via docker compose up.
# Unit and integration tests
dotnet test --filter "Category!=E2E"
# E2E tests (requires docker compose up)
dotnet test --filter "Category=E2E"

GitHub Actions CI is configured for this repository and runs on every push.
The workflow:
- restores dependencies
- builds the solution
- provisions PostgreSQL and RabbitMQ service containers
- applies EF Core migrations
- runs unit and integration tests
E2E tests are excluded from CI and are intended to be run against a locally running stack.
Without the outbox pattern, a failure between saving the job and publishing the message could leave a job stranded in Queued state indefinitely.
With the outbox pattern, implemented here with a unit of work, the job and its outbox message are persisted in one atomic database transaction. The outbox publisher then periodically polls the outbox for unpublished messages and processes them in batches.
A partial index on outbox_messages covering only unpublished messages ensures the publisher query stays fast as the table grows.
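One polling pass of such a publisher might look roughly like the following. This is an in-memory sketch under assumptions, not the project's code: the OutboxMessage shape, the OutboxPublisher class, and the batchSize and maxRetries parameters are hypothetical, and the publish delegate stands in for the RabbitMQ publish call. In the real system the batch would be loaded with a database query over unpublished rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified outbox row.
public sealed class OutboxMessage
{
    public string Payload { get; }
    public DateTime? PublishedAtUtc { get; private set; }
    public int RetryCount { get; private set; }

    public OutboxMessage(string payload) => Payload = payload;

    public void MarkPublished(DateTime utcNow) => PublishedAtUtc = utcNow;
    public void RecordFailure() => RetryCount++;
}

public static class OutboxPublisher
{
    // Runs one polling pass and returns the messages that have just used up their
    // retries, so the caller can move the matching jobs from Queued to Failed.
    public static List<OutboxMessage> PublishBatch(
        IEnumerable<OutboxMessage> outbox,
        Action<OutboxMessage> publish, // stand-in for the broker publish call
        int batchSize,
        int maxRetries)
    {
        var exhausted = new List<OutboxMessage>();

        // Only unpublished messages with retries left are picked up.
        var batch = outbox
            .Where(m => m.PublishedAtUtc is null && m.RetryCount < maxRetries)
            .Take(batchSize)
            .ToList();

        foreach (var message in batch)
        {
            try
            {
                publish(message);
                message.MarkPublished(DateTime.UtcNow);
            }
            catch (Exception)
            {
                message.RecordFailure();
                if (message.RetryCount >= maxRetries)
                    exhausted.Add(message);
            }
        }

        return exhausted;
    }
}
```

A message is marked as published only after the publish call returns, so a crash between the two steps leads to a repeat publish rather than a lost message. That is the at-least-once trade-off.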
This project uses the outbox pattern to provide at-least-once message delivery between the API and worker process. When a document job is created, the job state change and the corresponding outbox message are persisted in the same database transaction. A separate publisher then reads unpublished outbox messages and publishes them to RabbitMQ.
Because at-least-once delivery can result in the same message being delivered more than once, the worker is designed as an idempotent consumer. Before processing a message, the worker loads the current job state and only performs work when the job is in a valid processable state. Messages for jobs that are already Completed or Failed are acknowledged without reprocessing the job.
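The idempotency check can be pictured as a pure decision over the loaded job state. The sketch below is illustrative only: JobStatus, ConsumerAction, and ConsumerDecision are hypothetical names, and treating Queued as the only processable state (with Pending and Processing dead-lettered) is an assumption, since this README does not spell out which states count as processable.

```csharp
public enum JobStatus { Pending, Queued, Processing, Completed, Failed }

public enum ConsumerAction { Process, AckAsDuplicate, DeadLetter }

public static class ConsumerDecision
{
    // status is null when no job exists for the id carried by the message.
    public static ConsumerAction Decide(JobStatus? status) => status switch
    {
        // Missing job: nothing to process, so the message is not retried.
        null => ConsumerAction.DeadLetter,

        // Assumption: Queued is the state a freshly dispatched job is in.
        JobStatus.Queued => ConsumerAction.Process,

        // Terminal states mean the work was already done: acknowledge and move on.
        JobStatus.Completed or JobStatus.Failed => ConsumerAction.AckAsDuplicate,

        // Assumption: any other state is neither processable nor a safe duplicate.
        _ => ConsumerAction.DeadLetter,
    };
}
```

Keeping the decision separate from the broker calls makes the duplicate-handling rule easy to unit test without RabbitMQ.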
This project intentionally does not attempt to provide at-most-once or exactly-once delivery. Implementing them would require additional distributed coordination that is outside the scope of this portfolio project.
A Dead Letter Exchange (DLX) and Dead Letter Queue (DLQ) are implemented for handling messages that cannot be processed safely.
The main queue, document-processing.jobs, is configured with a dead-letter exchange:
- DLX: document-processing.dlx
- DLQ: document-processing.jobs.dlq
- Dead-letter routing key: document-processing-key
When the worker determines that a message should not be retried, it negatively acknowledges the message with requeue: false. RabbitMQ then dead-letters the message to document-processing.dlx, which routes it to document-processing.jobs.dlq using the configured binding.
The DLX, DLQ, and bindings are declared in rabbitmq/definitions.json.
Non-retryable cases that are dead-lettered:
- Invalid or empty ProcessDocumentJobMessage
- Malformed x-death header
- Non-existent DocumentJob
- DocumentJob is in a state that cannot be safely processed or acknowledged as a duplicate
Producer / API
      │
      │ publish job message
      ▼
document-processing.jobs
      │
      │ consumed by Worker
      ▼
DocumentJobConsumer
      │
      ├── success
      │     └── message is acknowledged and removed from queue
      │         BasicAckAsync
      │
      ├── transient failure
      │     └── message is negatively acknowledged and requeued
      │         BasicNackAsync(requeue: true)
      │
      └── non-retryable failure / retry limit exceeded
            │
            │ message is negatively acknowledged without requeue
            │ BasicNackAsync(requeue: false)
            │
            │ RabbitMQ dead-letters the message using:
            │   exchange: document-processing.dlx
            │   routing key: document-processing-key
            ▼
      document-processing.dlx
            │
            │ direct exchange binding:
            │   routing key: document-processing-key
            ▼
      document-processing.jobs.dlq
Currently, transient processing failures are handled with BasicNackAsync(requeue: true), which returns the message immediately to the main queue.
This is simple, but it has two drawbacks:
- a repeatedly failing message may be retried immediately and consume worker capacity
- because the message is requeued rather than dead-lettered, RabbitMQ’s x-death header is not incremented for those retry attempts.
The worker already contains early support for reading RabbitMQ’s x-death header, but the current retry behaviour does not yet make full use of it because failed messages are requeued directly.
A more robust solution would use a dedicated retry exchange and retry queue with a TTL. Failed messages would be dead-lettered into the retry queue, wait for the TTL to expire, and then be routed back to the main queue. This would provide delayed retries and better broker-level retry tracking. This is planned as a future improvement.
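Once retries go through a dead-letter hop, the attempt count can be derived from the x-death header. The helper below is a sketch of how that might be read, not the worker's existing code: DeathHeader is a hypothetical name, and it assumes the client library exposes x-death as a list of string-keyed tables whose count entry is a 64-bit integer.

```csharp
using System;
using System.Collections.Generic;

public static class DeathHeader
{
    // Returns the total dead-letter count recorded in x-death, or 0 when the
    // header is absent. Throws FormatException when the header is malformed,
    // which the consumer could treat as a non-retryable failure.
    public static long GetDeathCount(IDictionary<string, object> headers)
    {
        if (headers == null || !headers.TryGetValue("x-death", out var raw) || raw == null)
            return 0;

        if (raw is not IEnumerable<object> entries)
            throw new FormatException("x-death is not a list.");

        long total = 0;
        foreach (var entry in entries)
        {
            // Each entry describes one queue/reason pair with its own counter.
            if (entry is not IDictionary<string, object> table ||
                !table.TryGetValue("count", out var count) ||
                count is not long value)
            {
                throw new FormatException("x-death entry has no numeric count.");
            }

            total += value;
        }

        return total;
    }
}
```

A consumer could compare this count against a configured maximum and dead-letter the message to the DLQ once the limit is reached, instead of requeueing it immediately.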
The API automatically applies pending EF Core migrations on startup. This ensures the database schema is always up to date when running via Docker Compose without any manual steps.
This version intentionally keeps scope tight:
- text input only
- no real file upload yet
- no retry endpoint yet
- no authentication/authorization yet
- no pagination/filtering for job listing yet
- no advanced text analytics yet
Possible next steps:
- JWT authentication / authorization
- retry support for failed jobs
- real file upload
- richer keyword analysis and categorisation
- pagination and filtering for job queries
- React frontend for job submission and status tracking
- dedicated retry exchange and queue with a TTL
- dead-lettering messages that exceed the maximum retry count
I'm transitioning into .NET backend development from a senior integration / telecom engineering background. This project is intended to demonstrate backend skills that map to real production systems: asynchronous workflows, messaging, persistence, background processing, lifecycle management, and operational thinking.
MIT