DB Was Committed but the Kafka Event Vanished — Solving the Dual-Write Problem with PostgreSQL + Kafka Using the Transactional Outbox Pattern
A PagerDuty alert going off at 2:30 AM leaves a peculiar kind of memory. I experienced exactly that situation. When I first designed a microservice, I saved orders to the DB and immediately published OrderCreated events to Kafka. It worked fine most of the time — but one night, the Kafka broker went down for about 30 seconds and came back up. When I checked the dashboard, 340 orders had made it into the DB, yet the inventory service and shipping service had no idea those orders existed. Inventory was never decremented, and shipping requests were never created. This is the Dual-Write problem — a structural issue where attempting to write to two different resources simultaneously (DB persistence and event publishing) causes data inconsistency when either one fails.
This post covers how to implement the Transactional Outbox pattern, which prevents that problem at the root within a PostgreSQL ACID transaction. From Polling Publisher TypeScript code to Debezium CDC integration, FOR UPDATE SKIP LOCKED strategy, and WAL monitoring — I've compiled the pitfalls I actually ran into in production. By the end of this post, you'll be able to decide between the Polling Publisher and CDC approaches for your situation, and attach an outbox to an existing PostgreSQL schema today.
Core Concepts
Why the Dual-Write Problem Is Unavoidable
In a distributed system, "DB persistence" and "event publishing" are fundamentally writes to two different resources. There is no standard way to bundle a PostgreSQL transaction and a Kafka publish into a single atomic unit.
XA transactions (2PC) theoretically make it possible, but Kafka doesn't support XA coordinators in the JTA style. Kafka's own transaction API only guarantees atomicity within Kafka itself (producer → broker) and cannot be bundled with a PostgreSQL transaction. The end result is increased performance overhead and operational complexity with no meaningful guarantee.
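To make the failure window concrete, here is a minimal sketch of the naive dual-write in TypeScript, assuming a pg Pool and a kafkajs Producer are already wired up (the function name createOrderNaive is illustrative):

```typescript
import { Pool } from 'pg';
import { Producer } from 'kafkajs';

// Anti-pattern: two independent writes with no shared atomicity
async function createOrderNaive(
  pool: Pool,
  producer: Producer,
  userId: string,
  totalAmount: number
): Promise<void> {
  // Write 1: the DB commit can succeed on its own...
  await pool.query(
    `INSERT INTO orders (user_id, total_amount, status)
     VALUES ($1, $2, 'PENDING')`,
    [userId, totalAmount]
  );
  // Write 2: ...and if the broker is unreachable right here (or the
  // process dies between the two writes), the event is silently lost
  await producer.send({
    topic: 'order.events',
    messages: [{ value: JSON.stringify({ userId, totalAmount }) }],
  });
}
```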
The possible scenarios then look like this:
| Scenario | Outcome |
|---|---|
| DB commit succeeds → Kafka publish fails | Event is lost; consumer services are unaware |
| Kafka publish succeeds → DB rollback | Event was sent but data doesn't exist (ghost event) |
| Network timeout | It's unclear whether it succeeded or failed |
Trendyol experienced exactly this problem. They were sending messages to RabbitMQ in a "fire and forget" manner, and thousands of events vanished during a network outage — that incident prompted them to adopt the Outbox pattern.
The Full System Architecture at a Glance
Introducing the Outbox pattern changes the architecture like this:
┌──────────────────────────────────────────────────┐
│ Application │
│ createOrder(), createPayment() │
└──────────────────────┬───────────────────────────┘
│ Single ACID Transaction
▼
┌──────────────────────────────────────────────────┐
│ PostgreSQL │
│ ┌─────────────┐ ┌────────────────────┐ │
│ │ orders │ + │ outbox_events │ │
│ └─────────────┘ └──────────┬─────────┘ │
└─────────────────────────────────────┼────────────┘
│
┌───────────────────────┤
│ │
▼ ▼
┌───────────────────┐ ┌─────────────────────────┐
│ Polling Publisher │ │ Debezium CDC │
│ (polls every 500ms)│ │ (real-time WAL stream) │
└─────────┬─────────┘ └────────────┬────────────┘
└──────────────┬──────────┘
▼
┌───────────────────────┐
│ Kafka │
│ order.events │
│ payment.events │
└───────────┬───────────┘
│
┌─────────────┴────────────┐
▼ ▼
┌──────────────────┐ ┌───────────────────────┐
│ Inventory Service│ │ Shipping Service │
└──────────────────┘  └───────────────────────┘

Core principle: Instead of sending events directly to Kafka, you first write them to the outbox table within the same DB transaction. Since the business data and the event record are committed or rolled back together, PostgreSQL ACID guarantees atomicity. A separate relay process handles the Kafka publishing.
Designing the outbox Table
CREATE TABLE outbox_events (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
aggregate_type VARCHAR(255) NOT NULL, -- Business unit type: "Order", "Payment"
aggregate_id VARCHAR(255) NOT NULL, -- Unique ID of that unit (used as Kafka partition key)
event_type VARCHAR(255) NOT NULL, -- Event kind: "OrderCreated", "PaymentCompleted"
payload JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
published_at TIMESTAMPTZ, -- NULL = unpublished, NOT NULL = published
retry_count INT NOT NULL DEFAULT 0,
failed_at TIMESTAMPTZ -- Timestamp of the most recent publish failure
);
-- Partial index for performance when querying unpublished events
CREATE INDEX idx_outbox_unpublished
ON outbox_events (created_at ASC)
  WHERE published_at IS NULL;

`aggregate_type` and `aggregate_id` are terms borrowed from DDD (Domain-Driven Design) — simply put, they represent "which business unit (order, payment)" and "the ID of that unit." `aggregate_id` is later used as the Kafka message key, ensuring events for the same order always route to the same partition.
In practice, having only two states — published_at set or NULL — eventually proves insufficient. For situations where a process dies mid-publish attempt or you need to give up after repeated retries, including retry_count and failed_at in the initial design makes it much easier to build dead-letter handling and alerting thresholds later.
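For example, once those columns exist, a query along these lines can back a dead-letter report or an alert (the retry ceiling of 5 is an assumption to tune):

```sql
-- Unpublished events that have exhausted their retries (ceiling of 5 is illustrative)
SELECT id, aggregate_type, event_type, retry_count, failed_at
FROM outbox_events
WHERE published_at IS NULL
  AND retry_count >= 5
ORDER BY failed_at ASC NULLS LAST;
```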
Two Directions for the Message Relay
| Approach | Mechanism | Latency | Complexity | When to switch |
|---|---|---|---|---|
| Polling Publisher | Scheduler periodically queries the outbox table | 100ms – several seconds | Low | When the team is small and fast adoption is needed |
| CDC (WAL Tailing) | Debezium reads the PostgreSQL WAL in real time | Tens of ms | High | When events exceed ~500/sec or latency must be under 500ms |
Practical Implementation
Implementing the Polling Publisher
The Polling Publisher can be implemented at the application level without additional infrastructure, making it an ideal first choice for most teams. The key is to acquire row-level locks so multiple instances don't process the same event simultaneously.
Step 1: Put the business logic and outbox insert in the same transaction
// TypeScript + pg library
import { Pool } from 'pg';
interface CreateOrderDto {
userId: string;
totalAmount: number;
}
async function createOrder(pool: Pool, orderData: CreateOrderDto): Promise<string> {
const client = await pool.connect();
try {
await client.query('BEGIN');
const orderResult = await client.query(
`INSERT INTO orders (user_id, total_amount, status)
VALUES ($1, $2, 'PENDING')
RETURNING id`,
[orderData.userId, orderData.totalAmount]
);
const orderId = orderResult.rows[0].id;
// Insert the event into the outbox within the same transaction
await client.query(
`INSERT INTO outbox_events
(aggregate_type, aggregate_id, event_type, payload)
VALUES ($1, $2, $3, $4)`,
[
'Order',
orderId,
'OrderCreated',
JSON.stringify({
orderId,
userId: orderData.userId,
totalAmount: orderData.totalAmount,
}),
]
);
await client.query('COMMIT');
return orderId;
} catch (err) {
await client.query('ROLLBACK');
console.error('[createOrder] Order creation failed, rolling back:', err);
throw err;
} finally {
client.release();
}
}

Step 2: Polling without contention using FOR UPDATE SKIP LOCKED
When I first wrote this, I underestimated the importance of this clause — it wasn't until I reproduced a scenario where two instances simultaneously grabbed the same event that I understood it was essential. Without this clause, multiple instances trying to process the same row concurrently can cause conflicts or deadlocks.
import { Pool } from 'pg';
import { Producer } from 'kafkajs';
async function runPollingPublisher(pool: Pool, producer: Producer): Promise<void> {
const client = await pool.connect();
try {
await client.query('BEGIN');
const result = await client.query(`
SELECT id, aggregate_type, aggregate_id, event_type, payload
FROM outbox_events
WHERE published_at IS NULL
ORDER BY created_at ASC
LIMIT 100
FOR UPDATE SKIP LOCKED
`);
if (result.rows.length === 0) {
await client.query('ROLLBACK');
return;
}
// kafkajs sendBatch API
await producer.sendBatch({
topicMessages: result.rows.map(row => ({
topic: `${row.aggregate_type.toLowerCase()}.events`,
messages: [
{
key: row.aggregate_id, // Partition key — events for the same order always go to the same partition
value: JSON.stringify(row.payload),
},
],
})),
});
const ids = result.rows.map(r => r.id);
await client.query(
`UPDATE outbox_events
SET published_at = NOW()
WHERE id = ANY($1::uuid[])`,
[ids]
);
await client.query('COMMIT');
} catch (err) {
await client.query('ROLLBACK');
// published_at is not updated, so the next poll automatically retries
console.error('[PollingPublisher] Publish failed, will retry next cycle:', err);
} finally {
client.release();
}
}
// Recursive setTimeout instead of setInterval — the next run only starts after the previous one finishes
async function schedulePolling(pool: Pool, producer: Producer): Promise<void> {
await runPollingPublisher(pool, producer).catch(err => {
console.error('[PollingPublisher] Unhandled error:', err);
});
setTimeout(() => schedulePolling(pool, producer), 500);
}
schedulePolling(pool, producer);

| Code point | Reason |
|---|---|
| `FOR UPDATE SKIP LOCKED` | Skips already-locked rows — fundamentally prevents multiple instances from processing the same event |
| `aggregate_id` as message key | Events for the same order always go to the same partition → ordering within an aggregate is guaranteed |
| ROLLBACK on Kafka publish failure | published_at is not updated, so the event is automatically reprocessed on the next poll |
| Recursive `setTimeout` | `setInterval` starts the next execution even if the previous one hasn't finished — use the recursive pattern to prevent concurrent runs |
| `LIMIT 100` | Grabbing too many rows at once lengthens the DB transaction time and increases the risk of Kafka publish timeouts |
Scope of ordering guarantees: Partitioning by `aggregate_id` guarantees event order within the same order, but not global ordering across different orders. If you need to guarantee that `OrderCreated` always comes before `PaymentCompleted` globally, a separate design is required.
Real-Time WAL Integration with Debezium CDC
Once a Polling Publisher is running smoothly and event throughput exceeds a few hundred per second, the 500ms polling interval starts to become a bottleneck. At that point, CDC (Change Data Capture) becomes an attractive option. Debezium reads PostgreSQL's WAL (Write-Ahead Log) directly and streams events to Kafka within tens of milliseconds.
Understanding logical replication first
Logical Replication: A PostgreSQL feature that records data changes at the SQL level. Unlike binary replication, it allows selecting specific tables, enabling Debezium to watch only the outbox table.
A Replication Slot is a PostgreSQL internal cursor that tracks how far Debezium has read into the WAL. WAL prior to this cursor is not deleted, so if Debezium is stopped for a long time, WAL can accumulate and fill up the disk.
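To see a slot in action before involving Debezium, you can create a throwaway one by hand. This is purely a learning-aid sketch, assuming the built-in test_decoding plugin is available and wal_level is already 'logical' (configured in the next step); Debezium creates and manages its own pgoutput slot:

```sql
-- Create a temporary slot and peek at decoded changes without consuming them
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');
-- Run a few INSERTs in another session, then:
SELECT * FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);
-- Always drop throwaway slots: a forgotten slot pins WAL on disk
SELECT pg_drop_replication_slot('demo_slot');
```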
Enabling PostgreSQL logical replication
-- postgresql.conf settings (or change online with ALTER SYSTEM)
ALTER SYSTEM SET wal_level = 'logical';
ALTER SYSTEM SET max_replication_slots = 5;
ALTER SYSTEM SET max_wal_senders = 5;
-- PostgreSQL restart required after these changes
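After the restart, verify the setting took effect:

```sql
SHOW wal_level;  -- should return 'logical'
```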
Debezium Connector configuration

{
"name": "outbox-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "postgres",
"database.port": "5432",
"database.user": "debezium",
"database.password": "secret",
"database.dbname": "mydb",
"topic.prefix": "mydb",
"table.include.list": "public.outbox_events",
"plugin.name": "pgoutput",
"slot.name": "debezium_slot",
"transforms": "outbox",
"transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
"transforms.outbox.table.field.event.id": "id",
"transforms.outbox.table.field.event.key": "aggregate_id",
"transforms.outbox.table.field.event.type": "event_type",
"transforms.outbox.table.field.event.payload": "payload",
"transforms.outbox.route.by.field": "aggregate_type",
"transforms.outbox.route.topic.replacement": "${routedByValue}.events",
"tombstones.on.delete": "false"
}
}

Outbox Event Router SMT (Single Message Transform): When Debezium detects an INSERT event on the outbox table, this transformer routes it to the appropriate Kafka topic based on the `aggregate_type` value. `Order` goes to `order.events`, `Payment` goes to `payment.events` — automatically.
How is published_at managed with CDC?
Unlike the Polling Publisher approach, with CDC you don't need to update published_at directly. Debezium tracks the WAL's LSN (Log Sequence Number) in the replication slot, so it knows how far it has processed even after a restart. Not having to manage publish state at the application level is one of the practical advantages of CDC.
The full CDC flow
[PostgreSQL WAL]
└─ logical replication slot (pgoutput)
└─ Debezium PostgreSQL Connector (Kafka Connect)
└─ Outbox Event Router SMT
├─ order.events
├─ payment.events
  └─ delivery.events

The trade-off is that you now have the burden of operating a Kafka Connect cluster and Debezium connectors. It's worth honestly weighing your team's size and infrastructure capacity before making a choice.
Operational Automation — Cleanup Scheduler and WAL Monitoring
Honestly, this is the part I glossed over at first and later regretted. The outbox table keeps accumulating records even after events are published — after a few weeks, the table had grown to several GB. I only added a cleanup job after I started noticing query performance degrading.
Cleaning up published events
-- Clean up published events older than 7 days (via pg_cron or an external cron job)
DELETE FROM outbox_events
WHERE published_at IS NOT NULL
AND published_at < NOW() - INTERVAL '7 days';
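If pg_cron is available, the job can live inside the database itself. A sketch under that assumption (any external scheduler works just as well), with the job name and 03:00 schedule picked arbitrarily:

```sql
-- Hypothetical pg_cron job: nightly 7-day cleanup at 03:00
SELECT cron.schedule(
  'outbox-cleanup',
  '0 3 * * *',
  $$DELETE FROM outbox_events
    WHERE published_at IS NOT NULL
      AND published_at < NOW() - INTERVAL '7 days'$$
);
```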
Monitoring WAL retention (essential when using CDC)

If Debezium unexpectedly stops, the replication slot will prevent WAL from being deleted, potentially filling the disk.
-- Check WAL retention
SELECT
slot_name,
active,
pg_size_pretty(
pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
) AS wal_behind
FROM pg_replication_slots;

It's recommended to set up an alert when wal_behind exceeds a few GB. If Debezium is expected to be down for an extended period, documenting an operational procedure to explicitly drop or pause the slot in advance will reduce middle-of-the-night alerts.
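A sketch of such an alert query; the 5 GB threshold here is an arbitrary assumption to adjust:

```sql
-- Slots holding back more WAL than the threshold (tune '5 GB' to your disk budget)
SELECT slot_name, active
FROM pg_replication_slots
WHERE pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) > pg_size_bytes('5 GB');
```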
Pros and Cons
Advantages
| Item | Description |
|---|---|
| Atomicity guarantee | Leverages DB ACID directly — events and data always commit or rollback together |
| Automated retries | If the relay fails, the outbox record remains and is automatically reprocessed on the next run |
| Broker independence | Works identically with various brokers beyond Kafka, such as RabbitMQ and SQS |
| Leverages existing infrastructure | The Polling approach can start with just the current DB, no additional infrastructure needed |
| Ordering guarantee | Using aggregate_id as the partition key maintains event order for the same entity |
Disadvantages and Caveats
| Item | Description | Mitigation |
|---|---|---|
| At-least-once delivery | Not exactly-once — duplicate publishing is possible | Consumers must implement idempotency (idempotency key) |
| Ordering only within aggregate | Order is guaranteed only among the same aggregate_id — global ordering is not possible | Separate design required if global ordering is needed |
| DB write amplification | Every event incurs an additional write to the outbox | Index optimization, run cleanup scheduler |
| outbox table growth | Published records accumulate continuously | Periodic deletion job is essential |
| WAL disk risk (CDC) | Stopping Debezium causes the replication slot to block WAL deletion | Monitor pg_replication_slots, establish slot deactivation procedures |
| Infrastructure complexity (CDC) | Operational overhead of Kafka Connect + Debezium | Choose the approach that fits your team size and requirements |
Idempotency: The property where performing the same operation multiple times yields the same result. Since events can be published more than once, consumer services must record already-processed event IDs (in DB or Redis) to prevent duplicate processing.
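A minimal consumer-side sketch of that guard, assuming a hypothetical processed_events table keyed by event_id, and that the relay propagates the outbox id (for example as a Kafka message header):

```typescript
import { Pool } from 'pg';

// Hypothetical guard table: CREATE TABLE processed_events (event_id UUID PRIMARY KEY);
async function handleOrderCreated(pool: Pool, eventId: string, payload: unknown): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    // A duplicate event id hits the primary key and inserts zero rows
    const res = await client.query(
      `INSERT INTO processed_events (event_id) VALUES ($1)
       ON CONFLICT (event_id) DO NOTHING`,
      [eventId]
    );
    if (res.rowCount === 0) {
      await client.query('COMMIT'); // already processed: commit the no-op and return early
      return;
    }
    // ...business logic (e.g., decrement inventory based on payload) in the same transaction...
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}
```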
The Most Common Mistakes in Practice
- Omitting `FOR UPDATE SKIP LOCKED`: When the Polling Publisher runs as multiple instances without this clause, they end up processing the same events simultaneously, dramatically increasing duplicate publishing. It typically works fine on a single instance and the problem only surfaces the moment you scale out — that's the trap.
- Scheduling polls with `setInterval`: The next execution starts even if the previous one hasn't finished, causing concurrent execution issues. The recursive `setTimeout` pattern is safer, ensuring the next execution only begins after the previous one has fully completed.
- Neglecting the replication slot: In a CDC environment, if Debezium is stopped for a long time without dropping the slot, WAL accumulates without bound. Including a step in your operational runbook — "drop or pause slot when disabling Debezium" — makes incident response much easier.
- Not running an outbox cleanup job: There's no reason to keep published events indefinitely, yet without a cleanup job, after a few months the table size starts affecting query performance. It's recommended to include the `pg_cron` setup from the very beginning.
- Applying Outbox to intra-service communication: The Outbox pattern is suited for events crossing service boundaries. Applying it to communication between modules within the same service only adds unnecessary complexity. Apply it only at boundaries where it's truly needed.
Closing Thoughts
The Transactional Outbox pattern is a simple yet powerful approach that solves the Dual-Write problem by borrowing the atomicity of DB transactions.
A realistic path is to start with the Polling Publisher and switch to CDC when you need to handle more than ~500 events per second or meet a sub-500ms latency requirement. Both approaches operate on an at-least-once guarantee, so consumer-side idempotency design should always be considered alongside them.
Three steps you can start with right now:
- Add the outbox table: Add the `outbox_events` table to your existing PostgreSQL schema, and create the partial index (`WHERE published_at IS NULL`) at the same time — you can start without worrying about query performance.
- Wire up the Polling Publisher: Run the polling query using `FOR UPDATE SKIP LOCKED` via the recursive `setTimeout` pattern. Trendyol's open-source PollingOutboxPublisher is a great reference for a battle-tested implementation.
- Add consumer idempotency: It's recommended to add logic to consumer services that records the event `id` in a processed-events table (or Redis) and returns early on duplicate receipt.
Once you've completed these three steps, even if the Kafka broker dies for 30 seconds and comes back, every order that made it into the DB will be published as an event. Those 2:30 AM alerts might become a little less frequent. Before deploying to production, it's worth running a scenario in an integration test environment where you forcibly shut down Kafka. The confidence that this pattern actually works comes not from the code, but from failure simulation.
References
- Transactional Outbox Pattern | microservices.io
- Outbox Event Router | Debezium Official Docs
- PostgreSQL Connector | Debezium Official Docs
- Transactional Outbox Pattern | AWS Prescriptive Guidance
- The Transactional Outbox Pattern | Confluent Developer
- Transactional Outbox: Database-Kafka Consistency | Conduktor
- Transactional Outbox Pattern in Practice | SeatGeek Engineering
- Outbox Pattern Story at Trendyol | Trendyol Tech
- Outbox Performance Boost: Faster & More Reliable | Trendyol Tech
- PollingOutboxPublisher | GitHub
- Outbox, Inbox patterns and delivery guarantees explained | event-driven.io
- Revisiting the Outbox Pattern | Decodable Blog
- Stop Overusing the Outbox Pattern | Squer
- A Use Case for Transactions: Adapting to Transactional Outbox Pattern | Spring Blog