A production-oriented backend service demonstrating webhook ingestion, durable event persistence, asynchronous processing, retries, idempotency, observability, and failure recovery.
Portuguese version: README.pt-BR.md.
Event-Driven Webhook Processor receives external webhook events, stores them durably, places them on a queue, and processes them outside the request cycle. The project exists to demonstrate backend architecture patterns that show up in distributed systems: at-least-once delivery, idempotency keys, retryable jobs, dead-letter-style failure handling, structured logs, and operational reprocessing.
The backend is the main focus. A small operations dashboard is included to inspect received events, processing status, attempts, failures, and manual reprocessing.
Request and processing flow:
- A webhook producer sends POST /webhooks/events with x-api-key and an idempotency header.
- The API validates the request, checks the idempotency key, and persists the event in PostgreSQL.
- The API enqueues a BullMQ job in Redis and returns 202 Accepted.
- A separate worker process consumes jobs, marks events as processing, runs the processing handler, and updates the final status.
- Failed events keep error details and can be requeued with POST /events/:id/reprocess.
- Structured logs are emitted by both API and worker processes.
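The ingestion steps in the flow above can be sketched with in-memory stand-ins for PostgreSQL and the queue. This is a minimal illustration, not the project's actual code: StoredEvent, InMemoryEventStore, InMemoryQueue, and ingestWebhook are hypothetical names, and the created flag is an assumed way to tell a new event from a replay.

```typescript
// Hypothetical in-memory sketch of "check idempotency key, persist, enqueue".
// None of these names come from the project's source.

type EventStatus = "received" | "queued";

interface StoredEvent {
  id: string;
  idempotencyKey: string;
  payload: unknown;
  status: EventStatus;
}

class InMemoryEventStore {
  private readonly byKey = new Map<string, StoredEvent>();
  private nextId = 1;

  findByIdempotencyKey(key: string): StoredEvent | undefined {
    return this.byKey.get(key);
  }

  create(idempotencyKey: string, payload: unknown): StoredEvent {
    const event: StoredEvent = {
      id: `evt_${this.nextId++}`,
      idempotencyKey,
      payload,
      status: "received",
    };
    this.byKey.set(idempotencyKey, event);
    return event;
  }
}

class InMemoryQueue {
  readonly jobs: string[] = [];

  enqueue(eventId: string): void {
    this.jobs.push(eventId);
  }
}

interface IngestResult {
  event: StoredEvent;
  created: boolean; // false when the idempotency key was already seen
}

function ingestWebhook(
  store: InMemoryEventStore,
  queue: InMemoryQueue,
  idempotencyKey: string,
  payload: unknown,
): IngestResult {
  // A replay with a known key returns the existing event and enqueues nothing.
  const existing = store.findByIdempotencyKey(idempotencyKey);
  if (existing) {
    return { event: existing, created: false };
  }

  // First delivery: persist before enqueueing so the event survives a queue failure.
  const event = store.create(idempotencyKey, payload);
  queue.enqueue(event.id);
  event.status = "queued";
  return { event, created: true };
}
```

With a real database, the lookup and insert can race under concurrent retries, so a unique constraint on the idempotency key would likely be needed as well.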
The API and worker are intentionally separate processes, following the 12-Factor process model. Configuration is provided through environment variables, backing services are attached resources, and logs are written to stdout as event streams.
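Reading configuration from the environment can be sketched as a small loader that fails fast when a variable is missing. The variable names are the ones this README mentions; AppConfig, requireEnv, and loadConfig are hypothetical names used only for illustration.

```typescript
// Hypothetical config loader; the project's real module may look different.

interface AppConfig {
  webhookApiKey: string;
  databaseUrl: string;
  redisUrl: string;
}

type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfig(env: Env): AppConfig {
  return {
    webhookApiKey: requireEnv(env, "WEBHOOK_API_KEY"),
    databaseUrl: requireEnv(env, "DATABASE_URL"),
    redisUrl: requireEnv(env, "REDIS_URL"),
  };
}
```

Both processes could call loadConfig(process.env) once at startup, so a misconfigured deployment stops immediately instead of failing on the first request or job.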
- Node.js
- TypeScript
- Fastify
- PostgreSQL
- Prisma
- Redis
- BullMQ
- Pino
- Docker Compose
- Vitest
- Webhook ingestion endpoint with API key protection.
- Durable event storage with PostgreSQL.
- Idempotency key support for safe webhook retries.
- Redis-backed BullMQ queue for asynchronous processing.
- Worker retries with exponential backoff.
- Processing status tracking: received, queued, processing, processed, failed, requeued.
- Failure capture with attempt count and error message.
- Manual reprocessing for failed events.
- Structured JSON logs.
- Minimal operations dashboard at /dashboard/.
- Unit and integration-style tests using fast fakes.
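To make "retries with exponential backoff" concrete, the sketch below computes retry delays with a plain doubling rule. The base delay and retry count are illustrative assumptions, not the project's configured values, and backoffDelayMs and backoffSchedule are hypothetical helpers rather than BullMQ APIs.

```typescript
// Hypothetical helpers illustrating a doubling backoff schedule.

/** Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ... */
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  return baseDelayMs * Math.pow(2, attempt - 1);
}

/** Delays for the first `maxRetries` retries. */
function backoffSchedule(maxRetries: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    delays.push(backoffDelayMs(attempt, baseDelayMs));
  }
  return delays;
}

// backoffSchedule(3, 1000) returns [1000, 2000, 4000]
```

Spacing retries this way gives a struggling downstream dependency progressively more time to recover before the event is finally marked as failed.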
Install dependencies:
npm install
Create a local environment file:
cp .env.example .env
Start PostgreSQL and Redis:
docker compose up -d
Generate the Prisma client and apply migrations:
npm run db:generate
npm run db:migrate
Start the API:
npm run dev:api
Start the worker in another terminal:
npm run dev:worker
Open the dashboard:
http://localhost:3000/dashboard/
Send a webhook event:
curl -X POST http://localhost:3000/webhooks/events \
-H "content-type: application/json" \
-H "x-api-key: change-me" \
-H "idempotency-key: order-123-created" \
-d '{"type":"order.created","source":"checkout","data":{"orderId":"ord_123"}}'
Replay the same request with the same idempotency key to receive the existing event instead of creating a duplicate.
List events:
curl http://localhost:3000/events \
-H "x-api-key: change-me"
Filter failed events:
curl "http://localhost:3000/events?status=failed" \
-H "x-api-key: change-me"
Simulate a processing failure:
curl -X POST http://localhost:3000/webhooks/events \
-H "content-type: application/json" \
-H "x-api-key: change-me" \
-H "idempotency-key: failing-event-1" \
-d '{"type":"demo.failed","simulateFailure":true}'
Reprocess a failed event:
curl -X POST http://localhost:3000/events/<event-id>/reprocess \
-H "x-api-key: change-me"
Run the test suite:
npm test
Validate TypeScript:
npm run build
The tests use in-memory fakes for persistence and queueing, keeping the feedback loop fast while covering idempotency, enqueueing, querying, reprocessing, and worker behavior.
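As an illustration of the fake-based approach, the sketch below exercises a reprocessing rule against an in-memory repository and queue. Every name is hypothetical (FakeEventRepository, FakeQueue, reprocessEvent), and rejecting events that are not in the failed state is an assumption about the intended behavior rather than something confirmed by the project.

```typescript
// Hypothetical fakes and reprocessing logic, for illustration only.

type EventStatus =
  | "received" | "queued" | "processing" | "processed" | "failed" | "requeued";

interface EventRecord {
  id: string;
  status: EventStatus;
  attempts: number;
  lastError: string | null;
}

class FakeEventRepository {
  private readonly events = new Map<string, EventRecord>();

  save(event: EventRecord): void {
    this.events.set(event.id, { ...event }); // store a copy, like a real database would
  }

  get(id: string): EventRecord | undefined {
    const found = this.events.get(id);
    return found ? { ...found } : undefined;
  }
}

class FakeQueue {
  readonly enqueued: string[] = [];

  add(eventId: string): void {
    this.enqueued.push(eventId);
  }
}

type ReprocessResult =
  | { ok: true; event: EventRecord }
  | { ok: false; reason: "not_found" | "not_failed" };

function reprocessEvent(
  repo: FakeEventRepository,
  queue: FakeQueue,
  id: string,
): ReprocessResult {
  const event = repo.get(id);
  if (!event) return { ok: false, reason: "not_found" };
  // Assumption: only failed events may be sent back to the queue.
  if (event.status !== "failed") return { ok: false, reason: "not_failed" };

  // Error details are kept so the failure history stays visible.
  const updated: EventRecord = { ...event, status: "requeued" };
  repo.save(updated);
  queue.add(id);
  return { ok: true, event: updated };
}
```

Because the fakes need no containers or network, a test can seed a failed event, call reprocessEvent, and assert on the stored status and the queue contents in a few milliseconds.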
- Scale API and worker processes independently based on ingress and queue depth.
- Use managed PostgreSQL and Redis with backups, monitoring, and high availability.
- Treat WEBHOOK_API_KEY, DATABASE_URL, and REDIS_URL as environment-specific secrets.
- Add request signing verification when integrating with providers that support HMAC signatures.
- Add metrics for queue depth, processing latency, retries, failures, and reprocessing rate.
- Configure log aggregation so Pino JSON logs can be searched by eventId, jobId, and idempotencyKey.
- Add graceful deployment procedures so workers finish or safely release active jobs during shutdown.
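The request signing recommendation above could look roughly like the following, using Node's built-in crypto module. This is a sketch under assumptions: HMAC-SHA256 over the raw request body with a hex-encoded signature is one common scheme, but each provider defines its own algorithm, encoding, and header, and signPayload and verifySignature are hypothetical helper names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helpers; the scheme (HMAC-SHA256, hex) is an assumption.

function signPayload(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

function verifySignature(
  secret: string,
  rawBody: string,
  signatureHex: string,
): boolean {
  const expected = Buffer.from(signPayload(secret, rawBody), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (received.length !== expected.length) {
    return false;
  }
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(expected, received);
}
```

Verification has to run against the exact bytes received, before JSON parsing, because re-serializing the body can change whitespace or key order and invalidate the signature.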
- PostgreSQL is used for durable event history instead of SQLite because the project emphasizes production backend architecture.
- BullMQ and Redis provide simple retry and queue semantics without the operational weight of RabbitMQ or Kafka.
- The processing handler is intentionally small and deterministic; real systems would call domain services or external APIs.
- The dashboard avoids a frontend framework to keep the backend as the center of the project.
- API key authentication keeps the demo clear while still showing a realistic security boundary.
Relevant architectural decisions are documented in /docs/adr.
Portuguese ADR versions are available in /docs/adr/pt-BR.
- Add HMAC signature verification per webhook provider.
- Add OpenTelemetry traces and Prometheus-style metrics.
- Add dead-letter queue inspection and bulk reprocessing.
- Add pagination and full-text search for events.
- Add a Dockerfile for running API and worker images in containerized environments.
MIT License. See LICENSE.