Message Queues

A message queue is a durable buffer between a producer (the service that generates work) and a consumer (the service that processes it). The producer writes a message to the queue and moves on immediately — it does not wait for the consumer to finish. This decoupling is one of the most important patterns in distributed systems.

Why Queues

Absorb traffic spikes — if 10,000 orders arrive in one second but the processing service can only handle 1,000/s, a queue buffers the excess. Without it, the processing service would be overwhelmed or requests would be dropped.
Decouple services — the producer doesn't need to know about the consumer. Services can be deployed, restarted, or scaled independently without coordinating with each other.
Enable retry logic — if a consumer crashes before acknowledging a message, the broker redelivers it rather than losing it. Dead-letter queues capture messages that fail repeatedly so they can be inspected manually.
Async processing — tasks that don't need to complete synchronously (sending emails, resizing images, generating reports) are pushed to a queue and processed in the background, keeping API response times fast.
Fan-out — a single message can be consumed by multiple independent services simultaneously (notifications, analytics, audit logging) without the producer knowing about any of them.
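The retry and dead-letter behaviour described above can be sketched with Python's standard-library queue. This is a minimal in-process illustration, not how any particular broker implements it; MAX_ATTEMPTS, drain, and run_demo are hypothetical names chosen for the example.

```python
import queue

MAX_ATTEMPTS = 3  # hypothetical retry limit, chosen for illustration


def drain(work, dead_letters, handler):
    """Process messages until the work queue is empty, retrying failures."""
    while True:
        try:
            body, attempts = work.get_nowait()
        except queue.Empty:
            return
        try:
            handler(body)
        except Exception:
            attempts += 1
            if attempts >= MAX_ATTEMPTS:
                dead_letters.put(body)      # park it for manual inspection
            else:
                work.put((body, attempts))  # re-queue for another attempt


def run_demo():
    work = queue.Queue()
    dead_letters = queue.Queue()
    processed = []

    def handler(body):
        if body == "bad":
            raise ValueError("cannot process")
        processed.append(body)

    # Producer side: enqueue and move on without waiting for the consumer.
    for body in ["a", "bad", "b"]:
        work.put((body, 0))

    drain(work, dead_letters, handler)
    return processed, list(dead_letters.queue)
```

Calling run_demo() processes "a" and "b" and leaves "bad" in the dead-letter queue after three failed attempts. A real broker would also delay retries and persist the attempt count; both are omitted here to keep the sketch short.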

Delivery Guarantees

At-most-once — messages are delivered zero or one times. Fast but may lose messages on failure. Acceptable for metrics or logs where occasional loss is tolerable.
At-least-once — messages are delivered one or more times. A message is not lost once the broker has accepted it, but duplicates are possible when a delivery is retried. Consumers must therefore be idempotent: processing the same message twice must produce the same result as processing it once.
Exactly-once — each message takes effect exactly one time. This is the hardest guarantee to achieve; it requires transactional semantics between the broker and the consumer's storage. Kafka supports it through idempotent producers and transactions, for pipelines that read from and write back to Kafka; for other systems it is typically approximated with at-least-once delivery plus idempotent consumer logic (deduplication keys).
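In practice the difference between at-most-once and at-least-once often comes down to whether the consumer acknowledges a message before or after processing it. The sketch below simulates a consumer that crashes once on the first message; ToyBroker, Crash, consume, and simulate are hypothetical names, and the broker is an in-memory stand-in rather than a real client library.

```python
class ToyBroker:
    """Hypothetical in-memory broker: unacknowledged messages are redelivered."""

    def __init__(self, messages):
        self.pending = list(messages)

    def receive(self):
        return self.pending[0] if self.pending else None

    def ack(self, message):
        self.pending.remove(message)


class Crash(Exception):
    """Stands in for the consumer process dying mid-message."""


def consume(broker, handle, ack_first):
    """Run one consumer process until the queue is empty or it crashes."""
    while True:
        message = broker.receive()
        if message is None:
            return
        if ack_first:
            broker.ack(message)  # at-most-once: acknowledge, then process
            handle(message)
        else:
            handle(message)      # at-least-once: process, then acknowledge
            broker.ack(message)


def simulate(ack_first):
    broker = ToyBroker(["m1", "m2"])
    effects = []                 # side effects the consumer actually applied
    state = {"crashed": False}

    def handle(message):
        first_m1 = message == "m1" and not state["crashed"]
        if first_m1 and ack_first:
            state["crashed"] = True
            raise Crash()        # dies before applying the side effect
        effects.append(message)
        if first_m1:
            state["crashed"] = True
            raise Crash()        # dies after the side effect, before the ack

    for _ in range(2):           # the first run crashes, the second is the restart
        try:
            consume(broker, handle, ack_first)
        except Crash:
            continue
    return effects
```

With ack_first=True the result is ["m2"]: the first message was acknowledged and then lost. With ack_first=False the result is ["m1", "m1", "m2"]: nothing is lost, but the first message is applied twice, which is why at-least-once consumers need to be idempotent.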

Idempotency in practice

Because at-least-once is the practical default for most systems, consumers should be designed to be idempotent from the start. Common patterns:

Idempotency key — each message carries a unique ID; the consumer checks a processed-IDs store before acting and skips duplicates.
Upsert operations — write operations are designed so re-running them produces the same outcome (e.g. INSERT ... ON CONFLICT DO UPDATE rather than a plain INSERT).
Conditional updates — apply changes only if the current state matches the expected pre-condition (optimistic locking).
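The three patterns above can be sketched against an in-memory SQLite database. This is an illustrative sketch under assumptions: the table names, column names, and function names are all hypothetical, and the upsert syntax assumes SQLite 3.24 or newer.

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE processed (message_id TEXT PRIMARY KEY);
        CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL);
        CREATE TABLE profiles (user_id TEXT PRIMARY KEY, email TEXT NOT NULL);
        CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT NOT NULL,
                             version INTEGER NOT NULL);
    """)
    return db


def handle_deposit(db, message_id, account, amount):
    """Idempotency key: record the ID and apply the effect in one transaction."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            db.execute("INSERT INTO processed (message_id) VALUES (?)", (message_id,))
            db.execute("UPDATE balances SET amount = amount + ? WHERE account = ?",
                       (amount, account))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: the ID was already recorded


def upsert_email(db, user_id, email):
    """Upsert: re-running the same write leaves the same final row."""
    with db:
        db.execute(
            "INSERT INTO profiles (user_id, email) VALUES (?, ?) "
            "ON CONFLICT(user_id) DO UPDATE SET email = excluded.email",
            (user_id, email),
        )


def mark_shipped(db, order_id, expected_version):
    """Conditional update: apply only if the row is still at the expected version."""
    with db:
        cur = db.execute(
            "UPDATE orders SET status = 'shipped', version = version + 1 "
            "WHERE order_id = ? AND version = ?",
            (order_id, expected_version),
        )
    return cur.rowcount == 1


def run_demo():
    db = make_db()
    db.execute("INSERT INTO balances VALUES ('acct-1', 100)")
    db.execute("INSERT INTO orders VALUES ('order-1', 'paid', 1)")
    db.commit()

    deposits = [handle_deposit(db, "msg-1", "acct-1", 50) for _ in range(2)]
    balance = db.execute("SELECT amount FROM balances").fetchone()[0]

    for _ in range(2):
        upsert_email(db, "user-1", "a@example.com")
    profiles = db.execute("SELECT COUNT(*) FROM profiles").fetchone()[0]

    shipped = [mark_shipped(db, "order-1", 1) for _ in range(2)]
    return deposits, balance, profiles, shipped
```

In each case the second, duplicate call is harmless: the deposit is applied once, the profile table holds a single row, and the second mark_shipped call matches no row because the version has already moved on. Recording the message ID in the same transaction as the side effect matters; if the two were committed separately, a crash between them could leave the ID recorded without the effect, or the effect applied without the ID.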

Kafka vs RabbitMQ vs SQS

Feature	Kafka	RabbitMQ	SQS
Model	Distributed log; consumers read from offsets	Push-based broker; messages deleted on ACK	Managed pull queue
Retention	Configurable (days/weeks); replay possible	Until consumed or TTL	Up to 14 days
Throughput	Millions of messages/sec	Thousands–hundreds of thousands/sec	Scales automatically
Ordering	Per-partition ordering guaranteed	Per-queue FIFO (with single consumer)	Standard: best-effort only; FIFO queue: ordered within a message group
Replay	Yes — consumers can re-read from any offset	No — messages are deleted on ACK	No
Best for	Event streaming, audit logs, analytics pipelines	Task queues, complex routing, RPC patterns	Simple async decoupling on AWS
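The Model and Replay rows come down to one structural difference: a log keeps its records and lets each consumer track its own position, while a queue removes a message once it is acknowledged. The toy classes below illustrate that difference only; ToyLog and ToyQueue are hypothetical names and do not reflect the real Kafka, RabbitMQ, or SQS client APIs.

```python
from collections import deque


class ToyLog:
    """Log model (Kafka-like): records stay put; consumers read from an offset."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def read(self, offset):
        return self.records[offset:]


class ToyQueue:
    """Queue model (RabbitMQ/SQS-like): a message is gone once acknowledged."""

    def __init__(self):
        self.messages = deque()

    def publish(self, message):
        self.messages.append(message)

    def receive_and_ack(self):
        return self.messages.popleft() if self.messages else None


def run_demo():
    log = ToyLog()
    q = ToyQueue()
    for event in ["created", "paid", "shipped"]:
        log.append(event)
        q.publish(event)

    analytics = log.read(0)         # one consumer reads everything
    audit = log.read(0)             # a second consumer reads independently
    replay_from_paid = log.read(1)  # replay from a chosen offset

    drained = [q.receive_and_ack() for _ in range(3)]
    after = q.receive_and_ack()     # nothing left to re-read
    return analytics, audit, replay_from_paid, drained, after
```

Both log consumers see all three events and either can re-read from any offset, whereas the queue returns None once its messages have been acknowledged.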

When to use each

Kafka — when you need to replay events, build event-sourced systems, fan out to many independent consumers, or process high-volume streams. The log model makes it the foundation for event-driven architectures.
RabbitMQ — when you need flexible routing (topic exchanges, header routing), priority queues, or a simpler operational model than Kafka for moderate-throughput task queues.
SQS — when you are already in AWS and want a zero-ops managed queue. Use SQS Standard for maximum throughput; use SQS FIFO when ordering within a message group and deduplication of repeated sends matter, at the cost of lower throughput.