Back to Journal
System Design

Complete Guide to Database Sharding with Go

A comprehensive guide to implementing Database Sharding using Go, covering architecture, code examples, and production-ready patterns.

Muneer Puthiya Purayil 20 min read

Go's explicit error handling, connection pool management via database/sql, and goroutine-based concurrency make it particularly well-suited for building database sharding infrastructure. This guide covers implementing a complete sharding layer in Go, from shard routing and connection management through cross-shard queries and online data migration.

Core Architecture

A Go sharding layer consists of four components: the shard router (maps keys to shards), the pool manager (maintains per-shard connection pools), the query executor (handles single-shard and cross-shard queries), and the migration engine (moves data between shards). The code samples below implement the first three; the migration engine is beyond the scope of this guide's examples.

go
1package sharding
2 
3import (
4 "context"
5 "database/sql"
6 "fmt"
7 "hash/fnv"
8 "sync"
9 
10 _ "github.com/lib/pq"
11)
12 
// ShardConfig holds the connection parameters for one physical shard.
type ShardConfig struct {
	ID       string // logical shard identifier used by the router
	Host     string
	Port     int
	Database string
	User     string
	Password string
	MaxConns int // upper bound for this shard's connection pool
}

// DSN renders the config as a libpq-style connection string with TLS
// required.
func (cfg ShardConfig) DSN() string {
	return fmt.Sprintf(
		"host=%s port=%d dbname=%s user=%s password=%s sslmode=require",
		cfg.Host, cfg.Port, cfg.Database, cfg.User, cfg.Password,
	)
}
30 

Shard Router with Consistent Hashing

go
1package sharding
2 
3import (
4 "hash/crc32"
5 "sort"
6 "strconv"
7 "sync"
8)
9 
// ConsistentHashRouter maps shard keys to shard IDs via a consistent
// hash ring with virtual nodes, so adding or removing one shard only
// remaps roughly 1/N of the key space.
type ConsistentHashRouter struct {
	ring         map[uint32]string // vnode hash -> owning shard ID
	sortedHashes []uint32          // vnode hashes, ascending, for binary search
	replicas     int               // virtual nodes placed per shard
	mu           sync.RWMutex      // guards ring and sortedHashes
}

// NewConsistentHashRouter returns an empty router that will place
// `replicas` virtual nodes on the ring for every shard added.
func NewConsistentHashRouter(replicas int) *ConsistentHashRouter {
	return &ConsistentHashRouter{
		ring:     make(map[uint32]string),
		replicas: replicas,
	}
}

// AddShard places the shard's virtual nodes on the ring.
//
// Fix over the original: a hash already present on the ring is skipped,
// so re-adding a shard no longer appends duplicate entries to
// sortedHashes, and a (rare) CRC32 collision with another shard's vnode
// keeps its first owner instead of being silently stolen.
func (r *ConsistentHashRouter) AddShard(shardID string) {
	r.mu.Lock()
	defer r.mu.Unlock()

	for i := 0; i < r.replicas; i++ {
		hash := crc32.ChecksumIEEE([]byte(shardID + ":" + strconv.Itoa(i)))
		if _, exists := r.ring[hash]; exists {
			continue
		}
		r.ring[hash] = shardID
		r.sortedHashes = append(r.sortedHashes, hash)
	}
	sort.Slice(r.sortedHashes, func(i, j int) bool {
		return r.sortedHashes[i] < r.sortedHashes[j]
	})
}

// RemoveShard deletes the shard's virtual nodes from the ring and
// compacts sortedHashes. Only hashes actually owned by shardID are
// removed, so a colliding vnode belonging to another shard survives.
func (r *ConsistentHashRouter) RemoveShard(shardID string) {
	r.mu.Lock()
	defer r.mu.Unlock()

	for i := 0; i < r.replicas; i++ {
		hash := crc32.ChecksumIEEE([]byte(shardID + ":" + strconv.Itoa(i)))
		if owner, ok := r.ring[hash]; ok && owner == shardID {
			delete(r.ring, hash)
		}
	}

	// Rebuild the sorted slice from the surviving ring entries.
	newHashes := make([]uint32, 0, len(r.sortedHashes))
	for _, h := range r.sortedHashes {
		if _, ok := r.ring[h]; ok {
			newHashes = append(newHashes, h)
		}
	}
	r.sortedHashes = newHashes
}

// Route returns the shard ID owning key, or "" when the ring is empty.
// It binary-searches for the first vnode hash >= hash(key), wrapping to
// the start of the ring when the key hashes past the last vnode.
func (r *ConsistentHashRouter) Route(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()

	if len(r.sortedHashes) == 0 {
		return ""
	}

	hash := crc32.ChecksumIEEE([]byte(key))
	idx := sort.Search(len(r.sortedHashes), func(i int) bool {
		return r.sortedHashes[i] >= hash
	})
	if idx >= len(r.sortedHashes) {
		idx = 0 // wrap around the ring
	}
	return r.ring[r.sortedHashes[idx]]
}
75 

Connection Pool Manager

go
1package sharding
2 
3import (
4 "context"
5 "database/sql"
6 "fmt"
7 "sync"
8 "time"
9)
10 
11type PoolManager struct {
12 pools map[string]*sql.DB
13 mu sync.RWMutex
14}
15 
16func NewPoolManager(configs []ShardConfig) (*PoolManager, error) {
17 pm := &PoolManager{
18 pools: make(map[string]*sql.DB),
19 }
20 
21 for _, config := range configs {
22 db, err := sql.Open("postgres", config.DSN())
23 if err != nil {
24 return nil, fmt.Errorf("open %s: %w", config.ID, err)
25 }
26 
27 db.SetMaxOpenConns(config.MaxConns)
28 db.SetMaxIdleConns(config.MaxConns / 2)
29 db.SetConnMaxLifetime(30 * time.Minute)
30 db.SetConnMaxIdleTime(5 * time.Minute)
31 
32 if err := db.PingContext(context.Background()); err != nil {
33 return nil, fmt.Errorf("ping %s: %w", config.ID, err)
34 }
35 
36 pm.pools[config.ID] = db
37 }
38 
39 return pm, nil
40}
41 
42func (pm *PoolManager) GetDB(shardID string) (*sql.DB, error) {
43 pm.mu.RLock()
44 defer pm.mu.RUnlock()
45 
46 db, ok := pm.pools[shardID]
47 if !ok {
48 return nil, fmt.Errorf("unknown shard: %s", shardID)
49 }
50 return db, nil
51}
52 
53func (pm *PoolManager) AllShardIDs() []string {
54 pm.mu.RLock()
55 defer pm.mu.RUnlock()
56 
57 ids := make([]string, 0, len(pm.pools))
58 for id := range pm.pools {
59 ids = append(ids, id)
60 }
61 return ids
62}
63 
64func (pm *PoolManager) Close() error {
65 pm.mu.Lock()
66 defer pm.mu.Unlock()
67 
68 var firstErr error
69 for id, db := range pm.pools {
70 if err := db.Close(); err != nil && firstErr == nil {
71 firstErr = fmt.Errorf("close %s: %w", id, err)
72 }
73 }
74 return firstErr
75}
76 
77func (pm *PoolManager) Stats() map[string]sql.DBStats {
78 pm.mu.RLock()
79 defer pm.mu.RUnlock()
80 
81 stats := make(map[string]sql.DBStats, len(pm.pools))
82 for id, db := range pm.pools {
83 stats[id] = db.Stats()
84 }
85 return stats
86}
87 

Sharded Query Executor

go
1package sharding
2 
3import (
4 "context"
5 "database/sql"
6 "fmt"
7 "sync"
8)
9 
10type ShardedDB struct {
11 router *ConsistentHashRouter
12 pools *PoolManager
13}
14 
15func NewShardedDB(configs []ShardConfig, replicas int) (*ShardedDB, error) {
16 router := NewConsistentHashRouter(replicas)
17 for _, config := range configs {
18 router.AddShard(config.ID)
19 }
20 
21 pools, err := NewPoolManager(configs)
22 if err != nil {
23 return nil, err
24 }
25 
26 return &ShardedDB{router: router, pools: pools}, nil
27}
28 
29// QueryShard executes a query on the shard determined by the shard key
30func (s *ShardedDB) QueryShard(
31 ctx context.Context,
32 shardKey string,
33 query string,
34 args ...interface{},
35) (*sql.Rows, error) {
36 shardID := s.router.Route(shardKey)
37 db, err := s.pools.GetDB(shardID)
38 if err != nil {
39 return nil, fmt.Errorf("get shard %s: %w", shardID, err)
40 }
41 return db.QueryContext(ctx, query, args...)
42}
43 
44// ExecShard executes a statement on the shard determined by the shard key
45func (s *ShardedDB) ExecShard(
46 ctx context.Context,
47 shardKey string,
48 query string,
49 args ...interface{},
50) (sql.Result, error) {
51 shardID := s.router.Route(shardKey)
52 db, err := s.pools.GetDB(shardID)
53 if err != nil {
54 return nil, fmt.Errorf("get shard %s: %w", shardID, err)
55 }
56 return db.ExecContext(ctx, query, args...)
57}
58 
59// ScatterGather executes a query across all shards and collects results
60func (s *ShardedDB) ScatterGather(
61 ctx context.Context,
62 query string,
63 args ...interface{},
64) ([]ShardResult, error) {
65 shardIDs := s.pools.AllShardIDs()
66 results := make([]ShardResult, len(shardIDs))
67 var wg sync.WaitGroup
68 var mu sync.Mutex
69 var firstErr error
70 
71 for i, shardID := range shardIDs {
72 wg.Add(1)
73 go func(idx int, id string) {
74 defer wg.Done()
75 
76 db, err := s.pools.GetDB(id)
77 if err != nil {
78 mu.Lock()
79 if firstErr == nil {
80 firstErr = err
81 }
82 mu.Unlock()
83 return
84 }
85 
86 rows, err := db.QueryContext(ctx, query, args...)
87 if err != nil {
88 results[idx] = ShardResult{ShardID: id, Error: err}
89 return
90 }
91 defer rows.Close()
92 
93 columns, _ := rows.Columns()
94 var data []map[string]interface{}
95 
96 for rows.Next() {
97 values := make([]interface{}, len(columns))
98 valuePtrs := make([]interface{}, len(columns))
99 for i := range values {
100 valuePtrs[i] = &values[i]
101 }
102 
103 if err := rows.Scan(valuePtrs...); err != nil {
104 results[idx] = ShardResult{ShardID: id, Error: err}
105 return
106 }
107 
108 row := make(map[string]interface{})
109 for i, col := range columns {
110 row[col] = values[i]
111 }
112 data = append(data, row)
113 }
114 
115 results[idx] = ShardResult{ShardID: id, Data: data}
116 }(i, shardID)
117 }
118 
119 wg.Wait()
120 return results, firstErr
121}
122 
// ShardResult carries one shard's contribution to a scatter-gather
// query: either its rows (as column-name -> value maps) or the error
// that prevented collecting them.
type ShardResult struct {
	ShardID string
	Data    []map[string]interface{}
	Error   error
}
128 

Need a second opinion on your system design architecture?

I run free 30-minute strategy calls for engineering teams tackling this exact problem.

Book a Free Call

Repository Pattern with Sharding

go
1package repository
2 
3import (
4 "context"
5 "fmt"
6 "time"
7 
8 "myapp/sharding"
9)
10 
// Order is one row of the sharded orders table. TenantID doubles as
// the shard key, so all of a tenant's orders live on one shard.
type Order struct {
	TenantID   string
	OrderID    string
	CustomerID string
	Status     string
	TotalCents int64 // monetary amount in minor units
	Currency   string
	CreatedAt  time.Time
}
20 
21type OrderRepository struct {
22 db *sharding.ShardedDB
23}
24 
25func NewOrderRepository(db *sharding.ShardedDB) *OrderRepository {
26 return &OrderRepository{db: db}
27}
28 
29func (r *OrderRepository) Create(ctx context.Context, order Order) error {
30 _, err := r.db.ExecShard(ctx, order.TenantID,
31 `INSERT INTO orders (tenant_id, order_id, customer_id, status, total_cents, currency, created_at)
32 VALUES ($1, $2, $3, $4, $5, $6, $7)`,
33 order.TenantID, order.OrderID, order.CustomerID,
34 order.Status, order.TotalCents, order.Currency, order.CreatedAt,
35 )
36 return err
37}
38 
39func (r *OrderRepository) GetByID(ctx context.Context, tenantID, orderID string) (*Order, error) {
40 rows, err := r.db.QueryShard(ctx, tenantID,
41 `SELECT tenant_id, order_id, customer_id, status, total_cents, currency, created_at
42 FROM orders WHERE tenant_id = $1 AND order_id = $2`,
43 tenantID, orderID,
44 )
45 if err != nil {
46 return nil, err
47 }
48 defer rows.Close()
49 
50 if !rows.Next() {
51 return nil, fmt.Errorf("order not found: %s/%s", tenantID, orderID)
52 }
53 
54 var o Order
55 err = rows.Scan(&o.TenantID, &o.OrderID, &o.CustomerID,
56 &o.Status, &o.TotalCents, &o.Currency, &o.CreatedAt)
57 return &o, err
58}
59 
60func (r *OrderRepository) ListByTenant(ctx context.Context, tenantID string, limit, offset int) ([]Order, error) {
61 rows, err := r.db.QueryShard(ctx, tenantID,
62 `SELECT tenant_id, order_id, customer_id, status, total_cents, currency, created_at
63 FROM orders WHERE tenant_id = $1
64 ORDER BY created_at DESC LIMIT $2 OFFSET $3`,
65 tenantID, limit, offset,
66 )
67 if err != nil {
68 return nil, err
69 }
70 defer rows.Close()
71 
72 var orders []Order
73 for rows.Next() {
74 var o Order
75 if err := rows.Scan(&o.TenantID, &o.OrderID, &o.CustomerID,
76 &o.Status, &o.TotalCents, &o.Currency, &o.CreatedAt); err != nil {
77 return nil, err
78 }
79 orders = append(orders, o)
80 }
81 return orders, rows.Err()
82}
83 
84// Cross-shard aggregation
85func (r *OrderRepository) GetGlobalStats(ctx context.Context) (*OrderStats, error) {
86 results, err := r.db.ScatterGather(ctx,
87 `SELECT COUNT(*) as count, COALESCE(SUM(total_cents), 0) as revenue
88 FROM orders WHERE created_at > NOW() - INTERVAL '30 days'`,
89 )
90 if err != nil {
91 return nil, err
92 }
93 
94 stats := &OrderStats{}
95 for _, result := range results {
96 if result.Error != nil {
97 continue // Log and skip failed shards
98 }
99 for _, row := range result.Data {
100 stats.TotalOrders += row["count"].(int64)
101 stats.TotalRevenue += row["revenue"].(int64)
102 }
103 }
104 return stats, nil
105}
106 
// OrderStats aggregates order counts and revenue across all shards.
type OrderStats struct {
	TotalOrders  int64
	TotalRevenue int64 // in minor currency units, summed across shards
}
111 

Shard Health Monitoring

go
1package sharding
2 
import (
	"context"
	"database/sql"
	"sync"
	"time"
)
8 
// ShardHealth is a point-in-time health snapshot for one shard:
// reachability, ping latency, connection-pool gauges, and the reported
// on-disk database size.
type ShardHealth struct {
	ShardID          string
	Healthy          bool
	PingLatency      time.Duration
	OpenConnections  int
	InUseConnections int
	IdleConnections  int
	WaitCount        int64
	DataSizeBytes    int64
}
19 
20func (pm *PoolManager) HealthCheck(ctx context.Context) []ShardHealth {
21 shardIDs := pm.AllShardIDs()
22 results := make([]ShardHealth, len(shardIDs))
23 
24 var wg sync.WaitGroup
25 for i, id := range shardIDs {
26 wg.Add(1)
27 go func(idx int, shardID string) {
28 defer wg.Done()
29 
30 db, err := pm.GetDB(shardID)
31 if err != nil {
32 results[idx] = ShardHealth{ShardID: shardID, Healthy: false}
33 return
34 }
35 
36 start := time.Now()
37 err = db.PingContext(ctx)
38 latency := time.Since(start)
39 
40 stats := db.Stats()
41 health := ShardHealth{
42 ShardID: shardID,
43 Healthy: err == nil,
44 PingLatency: latency,
45 OpenConnections: stats.OpenConnections,
46 InUseConnections: stats.InUse,
47 IdleConnections: stats.Idle,
48 WaitCount: stats.WaitCount,
49 }
50 
51 if err == nil {
52 var size int64
53 db.QueryRowContext(ctx, "SELECT pg_database_size(current_database())").Scan(&size)
54 health.DataSizeBytes = size
55 }
56 
57 results[idx] = health
58 }(i, id)
59 }
60 
61 wg.Wait()
62 return results
63}
64 

Testing

go
1package sharding_test
2 
import (
	"fmt"
	"testing"

	"myapp/sharding"
)
7 
8func TestConsistentHashRouter(t *testing.T) {
9 router := sharding.NewConsistentHashRouter(150)
10 router.AddShard("shard-1")
11 router.AddShard("shard-2")
12 router.AddShard("shard-3")
13 
14 // Same key always routes to same shard
15 shard1 := router.Route("tenant-abc")
16 shard2 := router.Route("tenant-abc")
17 if shard1 != shard2 {
18 t.Errorf("inconsistent routing: %s != %s", shard1, shard2)
19 }
20 
21 // Distribution should be roughly even
22 counts := map[string]int{}
23 for i := 0; i < 10000; i++ {
24 key := fmt.Sprintf("tenant-%d", i)
25 shard := router.Route(key)
26 counts[shard]++
27 }
28 
29 for shard, count := range counts {
30 ratio := float64(count) / 10000.0
31 if ratio < 0.2 || ratio > 0.5 {
32 t.Errorf("shard %s has uneven distribution: %.2f", shard, ratio)
33 }
34 }
35}
36 
37func TestRouterStabilityOnShardAddition(t *testing.T) {
38 router := sharding.NewConsistentHashRouter(150)
39 router.AddShard("shard-1")
40 router.AddShard("shard-2")
41 
42 // Record initial routing
43 initial := make(map[string]string)
44 for i := 0; i < 1000; i++ {
45 key := fmt.Sprintf("key-%d", i)
46 initial[key] = router.Route(key)
47 }
48 
49 // Add a third shard
50 router.AddShard("shard-3")
51 
52 // Count how many keys moved
53 moved := 0
54 for key, originalShard := range initial {
55 if router.Route(key) != originalShard {
56 moved++
57 }
58 }
59 
60 // With consistent hashing, roughly 1/3 of keys should move
61 moveRatio := float64(moved) / 1000.0
62 if moveRatio > 0.5 {
63 t.Errorf("too many keys moved: %.2f (expected ~0.33)", moveRatio)
64 }
65}
66 

Conclusion

Go's standard library provides the core building blocks for database sharding: database/sql for connection pooling, goroutines for parallel shard queries, and sync primitives for thread-safe router management. The consistent hash router ensures predictable key distribution, the pool manager handles per-shard connection lifecycle, and the scatter-gather executor enables cross-shard queries by fanning out one goroutine per shard (for very large clusters, consider bounding that fan-out with a worker pool). This foundation scales from a handful of shards to hundreds with minimal changes to the application code.

FAQ

Need expert help?

Building with system design?

I help teams ship production-grade systems. From architecture review to hands-on builds.

Muneer Puthiya Purayil

SaaS Architect & AI Systems Engineer. 10+ years shipping production infrastructure across fintech, automotive, e-commerce, and healthcare.

Engage

Start a
Conversation.

For teams building at scale: SaaS platforms, agentic AI systems, and enterprise mobile infrastructure. Scope and fit are evaluated before any engagement begins.

Limited availability · Q3 / Q4 2026