System Design

Complete Guide to Event-Driven Architecture with Go

A comprehensive guide to implementing Event-Driven Architecture using Go, covering architecture, code examples, and production-ready patterns.

Muneer Puthiya Purayil · 17 min read

Event-driven architecture fundamentally changes how Go services communicate. Instead of synchronous HTTP calls creating tight coupling between services, events flow through message brokers, allowing each service to evolve independently. This guide covers everything from foundational patterns to production-hardened implementations.

Core Concepts

Event-driven architecture (EDA) decomposes systems into producers that emit events and consumers that react to them. In Go, this maps naturally to goroutines and channels at the local level, and to Kafka or NATS at the distributed level.

Three primary patterns emerge:

  1. Event Notification — a service emits a thin event signaling something happened, and consumers fetch additional data as needed
  2. Event-Carried State Transfer — events contain the full state change, eliminating the need for consumers to call back
  3. Event Sourcing — the event log is the source of truth, with current state derived by replaying events
```go
// Event Notification — thin event
type OrderCreated struct {
	EventID   string    `json:"event_id"`
	OrderID   string    `json:"order_id"`
	Timestamp time.Time `json:"timestamp"`
}

// Event-Carried State Transfer — full payload
type OrderCreatedFull struct {
	EventID    string          `json:"event_id"`
	OrderID    string          `json:"order_id"`
	CustomerID string          `json:"customer_id"`
	Items      []OrderItem     `json:"items"`
	Total      decimal.Decimal `json:"total"`
	Currency   string          `json:"currency"`
	Timestamp  time.Time       `json:"timestamp"`
}
```
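The first two patterns are illustrated above; event sourcing has no snippet. Here is a minimal, dependency-free sketch of deriving current state by replaying an event log. All type and field names in it are illustrative, not part of the code elsewhere in this guide:

```go
package main

import "fmt"

// Event is the marker interface for everything in the log.
type Event interface{ isEvent() }

type OrderPlaced struct {
	OrderID string
	Total   int
}
type OrderPaid struct{ OrderID string }
type OrderCancelled struct{ OrderID string }

func (OrderPlaced) isEvent()    {}
func (OrderPaid) isEvent()      {}
func (OrderCancelled) isEvent() {}

// OrderState is never stored directly; it is derived on demand.
type OrderState struct {
	OrderID string
	Total   int
	Status  string
}

// Replay folds over the ordered event log to rebuild current state.
// The log, not the state, is the source of truth.
func Replay(events []Event) OrderState {
	var s OrderState
	for _, e := range events {
		switch ev := e.(type) {
		case OrderPlaced:
			s.OrderID, s.Total, s.Status = ev.OrderID, ev.Total, "placed"
		case OrderPaid:
			s.Status = "paid"
		case OrderCancelled:
			s.Status = "cancelled"
		}
	}
	return s
}

func main() {
	log := []Event{OrderPlaced{"o-1", 4200}, OrderPaid{"o-1"}}
	fmt.Println(Replay(log).Status) // paid
}
```

In a real system the log would live in an append-only store (a Kafka compacted topic or an events table), and snapshots would bound replay time.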

Setting Up Kafka with Go

The segmentio/kafka-go library provides the most idiomatic Go experience. Unlike confluent-kafka-go which wraps librdkafka via CGO, kafka-go is a pure Go implementation — no C dependencies, simpler cross-compilation, and easier debugging.

```go
package kafka

import (
	"context"
	"encoding/json"
	"fmt"
	"strings"
	"time"

	"github.com/segmentio/kafka-go"
)

type Producer struct {
	writer *kafka.Writer
}

func NewProducer(brokers []string) *Producer {
	return &Producer{
		writer: &kafka.Writer{
			Addr:         kafka.TCP(brokers...),
			Balancer:     &kafka.LeastBytes{},
			BatchSize:    100,
			BatchTimeout: 10 * time.Millisecond,
			RequiredAcks: kafka.RequireAll,
			Async:        false,
		},
	}
}

func (p *Producer) Publish(ctx context.Context, topic string, key string, event interface{}) error {
	payload, err := json.Marshal(event)
	if err != nil {
		return fmt.Errorf("marshal event: %w", err)
	}

	// %T yields a package-qualified name like "events.OrderCreated";
	// strip the qualifier so the header matches the bare names that
	// consumers register handlers under.
	eventType := fmt.Sprintf("%T", event)
	if i := strings.LastIndex(eventType, "."); i >= 0 {
		eventType = eventType[i+1:]
	}

	return p.writer.WriteMessages(ctx, kafka.Message{
		Topic: topic,
		Key:   []byte(key),
		Value: payload,
		Headers: []kafka.Header{
			{Key: "event_type", Value: []byte(eventType)},
			{Key: "produced_at", Value: []byte(time.Now().UTC().Format(time.RFC3339Nano))},
		},
	})
}

func (p *Producer) Close() error {
	return p.writer.Close()
}
```

Building a Consumer Group

Consumer groups distribute partitions across instances for horizontal scaling. Here's a production-ready consumer implementation:

```go
package kafka

import (
	"context"
	"log/slog"
	"time"

	"github.com/segmentio/kafka-go"
)

type Handler func(ctx context.Context, msg kafka.Message) error

type Consumer struct {
	reader  *kafka.Reader
	handler Handler
	logger  *slog.Logger
}

func NewConsumer(brokers []string, groupID, topic string, handler Handler, logger *slog.Logger) *Consumer {
	return &Consumer{
		reader: kafka.NewReader(kafka.ReaderConfig{
			Brokers:        brokers,
			GroupID:        groupID,
			Topic:          topic,
			MinBytes:       1e3,  // 1 KB
			MaxBytes:       10e6, // 10 MB
			MaxWait:        250 * time.Millisecond,
			CommitInterval: 0, // synchronous, explicit commits
			StartOffset:    kafka.LastOffset,
		}),
		handler: handler,
		logger:  logger,
	}
}

func (c *Consumer) Run(ctx context.Context) error {
	c.logger.Info("consumer started", "topic", c.reader.Config().Topic, "group", c.reader.Config().GroupID)

	for {
		select {
		case <-ctx.Done():
			c.logger.Info("consumer shutting down")
			return c.reader.Close()
		default:
		}

		msg, err := c.reader.FetchMessage(ctx)
		if err != nil {
			if ctx.Err() != nil {
				return nil
			}
			c.logger.Error("fetch failed", "error", err)
			continue
		}

		if err := c.handler(ctx, msg); err != nil {
			c.logger.Error("handler failed",
				"topic", msg.Topic,
				"partition", msg.Partition,
				"offset", msg.Offset,
				"error", err,
			)
			// Push the failed message to a dead-letter topic so the
			// partition is not blocked; sendToDLQ is not shown here.
			if dlqErr := c.sendToDLQ(ctx, msg, err); dlqErr != nil {
				c.logger.Error("DLQ publish failed", "error", dlqErr)
			}
		}

		if err := c.reader.CommitMessages(ctx, msg); err != nil {
			c.logger.Error("commit failed", "error", err)
		}
	}
}
```

Event Routing and Dispatch

A clean event router decouples message consumption from business logic:

```go
type EventRouter struct {
	handlers map[string]Handler
	fallback Handler
	logger   *slog.Logger
}

func NewEventRouter(logger *slog.Logger) *EventRouter {
	return &EventRouter{
		handlers: make(map[string]Handler),
		logger:   logger,
	}
}

func (r *EventRouter) Register(eventType string, handler Handler) {
	r.handlers[eventType] = handler
}

func (r *EventRouter) SetFallback(handler Handler) {
	r.fallback = handler
}

func (r *EventRouter) Dispatch(ctx context.Context, msg kafka.Message) error {
	eventType := ""
	for _, h := range msg.Headers {
		if h.Key == "event_type" {
			eventType = string(h.Value)
			break
		}
	}

	handler, ok := r.handlers[eventType]
	if !ok {
		if r.fallback != nil {
			return r.fallback(ctx, msg)
		}
		r.logger.Warn("no handler registered", "event_type", eventType)
		return nil
	}

	return handler(ctx, msg)
}
```

Usage ties everything together:

```go
package main

import (
	"context"
	"log"
	"log/slog"
	"os/signal"
	"syscall"
)

func main() {
	logger := slog.Default()
	router := NewEventRouter(logger)

	router.Register("OrderCreated", handleOrderCreated)
	router.Register("OrderShipped", handleOrderShipped)
	router.Register("OrderCancelled", handleOrderCancelled)

	consumer := NewConsumer(
		[]string{"kafka-1:9092", "kafka-2:9092"},
		"order-processor",
		"order-events",
		router.Dispatch,
		logger,
	)

	ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer cancel()

	if err := consumer.Run(ctx); err != nil {
		log.Fatal(err)
	}
}
```


Exactly-Once Processing with the Outbox Pattern

Dual writes — updating a database and publishing an event — create consistency risks. The transactional outbox pattern solves this by writing events to a database table within the same transaction as the state change, then polling that table (or using CDC) to publish them to Kafka. Strictly speaking, this yields at-least-once delivery; "exactly-once" processing additionally requires consumers to deduplicate by event ID.

```go
type OutboxEntry struct {
	ID          string     `db:"id"`
	AggregateID string     `db:"aggregate_id"`
	EventType   string     `db:"event_type"`
	Payload     []byte     `db:"payload"`
	CreatedAt   time.Time  `db:"created_at"`
	PublishedAt *time.Time `db:"published_at"`
}

func (s *OrderService) CreateOrder(ctx context.Context, cmd CreateOrderCommand) error {
	tx, err := s.db.BeginTx(ctx, nil)
	if err != nil {
		return fmt.Errorf("begin tx: %w", err)
	}
	defer tx.Rollback()

	order := Order{
		ID:         uuid.New().String(),
		CustomerID: cmd.CustomerID,
		Items:      cmd.Items,
		Status:     "created",
	}

	if _, err := tx.ExecContext(ctx,
		"INSERT INTO orders (id, customer_id, items, status) VALUES ($1, $2, $3, $4)",
		order.ID, order.CustomerID, order.Items, order.Status,
	); err != nil {
		return fmt.Errorf("insert order: %w", err)
	}

	event := OrderCreatedFull{
		EventID:    uuid.New().String(),
		OrderID:    order.ID,
		CustomerID: order.CustomerID,
		Items:      order.Items,
		Timestamp:  time.Now().UTC(),
	}
	payload, err := json.Marshal(event)
	if err != nil {
		return fmt.Errorf("marshal event: %w", err)
	}

	if _, err := tx.ExecContext(ctx,
		"INSERT INTO outbox (id, aggregate_id, event_type, payload) VALUES ($1, $2, $3, $4)",
		event.EventID, order.ID, "OrderCreated", payload,
	); err != nil {
		return fmt.Errorf("insert outbox: %w", err)
	}

	return tx.Commit()
}
```

The outbox poller runs as a separate goroutine:

```go
func (p *OutboxPoller) Poll(ctx context.Context) error {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return nil
		case <-ticker.C:
			entries, err := p.fetchUnpublished(ctx, 100)
			if err != nil {
				p.logger.Error("fetch outbox entries failed", "error", err)
				continue
			}

			for _, entry := range entries {
				// Wrap the stored JSON in json.RawMessage so Publish
				// re-emits it verbatim; a bare []byte would be
				// base64-encoded by json.Marshal. Note entry.EventType
				// is not propagated here; a raw-publish variant that
				// sets the event_type header explicitly is a natural
				// extension (not shown).
				if err := p.producer.Publish(ctx, p.topic, entry.AggregateID, json.RawMessage(entry.Payload)); err != nil {
					p.logger.Error("publish failed", "entry_id", entry.ID, "error", err)
					break // preserve order: retry from this entry next tick
				}
				if err := p.markPublished(ctx, entry.ID); err != nil {
					p.logger.Error("mark published failed", "entry_id", entry.ID, "error", err)
				}
			}
		}
	}
}
```

Observability and Metrics

Production event systems need deep observability. Use OpenTelemetry for traces that span producer-to-consumer:

```go
import (
	"context"
	"fmt"

	"github.com/segmentio/kafka-go"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

var tracer = otel.Tracer("event-processor")

func (c *Consumer) tracedHandler(ctx context.Context, msg kafka.Message) error {
	// Extract trace context from message headers.
	// NewKafkaHeaderCarrier (not shown) adapts []kafka.Header to
	// OpenTelemetry's propagation.TextMapCarrier interface.
	carrier := NewKafkaHeaderCarrier(msg.Headers)
	ctx = otel.GetTextMapPropagator().Extract(ctx, carrier)

	ctx, span := tracer.Start(ctx, fmt.Sprintf("process %s", msg.Topic),
		trace.WithAttributes(
			attribute.String("messaging.system", "kafka"),
			attribute.String("messaging.destination", msg.Topic),
			attribute.Int("messaging.kafka.partition", msg.Partition),
			attribute.Int64("messaging.kafka.offset", msg.Offset),
		),
	)
	defer span.End()

	return c.handler(ctx, msg)
}
```

Combine tracing with Prometheus metrics for consumer lag, processing latency, and error rates:

```go
import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	eventsProcessed = promauto.NewCounterVec(prometheus.CounterOpts{
		Name: "events_processed_total",
		Help: "Total events processed by type and status",
	}, []string{"event_type", "status"})

	processingDuration = promauto.NewHistogramVec(prometheus.HistogramOpts{
		Name:    "event_processing_duration_seconds",
		Help:    "Event processing duration",
		Buckets: prometheus.ExponentialBuckets(0.001, 2, 12),
	}, []string{"event_type"})
)
```

Graceful Shutdown and Backpressure

Graceful shutdown prevents message loss during deployments. Backpressure prevents a slow consumer from unbounded memory growth:

```go
func (c *Consumer) RunWithBackpressure(ctx context.Context, maxInFlight int) error {
	sem := make(chan struct{}, maxInFlight)

	for {
		select {
		case <-ctx.Done():
			// Drain the semaphore: once we hold every slot, all
			// in-flight handlers have finished.
			for i := 0; i < maxInFlight; i++ {
				sem <- struct{}{}
			}
			return c.reader.Close()
		case sem <- struct{}{}:
		}

		msg, err := c.reader.FetchMessage(ctx)
		if err != nil {
			<-sem
			if ctx.Err() != nil {
				return nil
			}
			continue
		}

		go func(m kafka.Message) {
			defer func() { <-sem }()
			if err := c.handler(ctx, m); err != nil {
				c.logger.Error("handler error", "error", err)
			}
			// Concurrent handlers can commit offsets out of order,
			// trading strict ordering for throughput; after a crash,
			// some messages may be redelivered (at-least-once).
			if err := c.reader.CommitMessages(ctx, m); err != nil {
				c.logger.Error("commit failed", "error", err)
			}
		}(msg)
	}
}
```

Conclusion

Building event-driven architecture in Go rewards you with a system that's straightforward to reason about, easy to operate, and fast enough for all but the most extreme throughput requirements. The combination of goroutines for concurrency, channels for internal coordination, and libraries like kafka-go for external messaging creates an ecosystem where the hard parts of distributed systems — exactly-once processing, consumer group management, graceful shutdown — can be handled with clear, idiomatic code.

The patterns covered here — outbox for consistency, event routing for clean dispatch, OpenTelemetry for observability — compose together naturally. Start with the simplest implementation that meets your requirements, instrument everything from day one, and evolve toward more sophisticated patterns like event sourcing only when the business domain demands it.
