DevOps

Complete Guide to Kubernetes Production Setup with Go

A comprehensive guide to running Go services on Kubernetes in production, covering architecture, code examples, and production-ready patterns.

Muneer Puthiya Purayil · 17 min read

Go's compiled binaries, minimal runtime overhead, and first-class Kubernetes client libraries make it the natural choice for building Kubernetes tooling and deploying Go services to production clusters. This guide covers the complete lifecycle: containerization, deployment manifests, observability, and operational patterns specific to Go workloads.

Building Production Container Images

Go's static binary compilation is its biggest advantage for containerization. A properly built Go binary has zero runtime dependencies, enabling scratch or distroless base images.

dockerfile
FROM golang:1.22-alpine AS builder
# git is not included in the alpine image but is needed for git describe below
RUN apk add --no-cache git
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build \
    -ldflags="-w -s -X main.version=$(git describe --tags --always)" \
    -o /app/server ./cmd/server

FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]

Key decisions in this Dockerfile:

  • CGO_ENABLED=0 ensures a fully static binary with no C library dependencies.
  • -ldflags="-w -s" strips debug information and symbol tables, reducing binary size by 25-30%.
  • distroless/static provides a minimal base (2MB) with no shell, no package manager, and a reduced attack surface.
  • The nonroot tag runs as UID 65534, satisfying Kubernetes Pod Security Standards.

The resulting image is typically 15-25MB, compared to 300-800MB for a typical Python or Node.js image. This translates to faster pulls during scaling events — a 20MB image pulls in under 2 seconds vs 30+ seconds for a 500MB image.

Application Structure for Kubernetes

Graceful Shutdown

Go's context package and signal handling make graceful shutdown straightforward:

go
package main

import (
	"context"
	"log/slog"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	mux := http.NewServeMux()
	mux.HandleFunc("GET /healthz", healthHandler)
	mux.HandleFunc("GET /readyz", readyHandler)
	mux.HandleFunc("GET /api/v1/orders", ordersHandler)

	srv := &http.Server{
		Addr:         ":8080",
		Handler:      mux,
		ReadTimeout:  5 * time.Second,
		WriteTimeout: 10 * time.Second,
		IdleTimeout:  120 * time.Second,
	}

	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	go func() {
		logger.Info("server starting", "addr", srv.Addr)
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			logger.Error("server failed", "error", err)
			os.Exit(1)
		}
	}()

	<-ctx.Done()
	logger.Info("shutdown signal received")

	shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	if err := srv.Shutdown(shutdownCtx); err != nil {
		logger.Error("forced shutdown", "error", err)
	}
	logger.Info("server stopped")
}

The 30-second shutdown timeout should match your Kubernetes terminationGracePeriodSeconds. When Kubernetes sends SIGTERM, this code stops accepting new connections, finishes in-flight requests, and exits cleanly.

Health Check Endpoints

go
import (
	"context"
	"net/http"
	"sync/atomic"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

var (
	ready  atomic.Bool
	dbPool *pgxpool.Pool
)

// healthHandler is the liveness probe: process responsiveness only.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
	w.Write([]byte("ok"))
}

// readyHandler is the readiness probe: gate on startup state and dependencies.
func readyHandler(w http.ResponseWriter, r *http.Request) {
	if !ready.Load() {
		http.Error(w, "not ready", http.StatusServiceUnavailable)
		return
	}

	ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
	defer cancel()

	if err := dbPool.Ping(ctx); err != nil {
		http.Error(w, "db unavailable", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
	w.Write([]byte("ready"))
}

Separate liveness (/healthz) from readiness (/readyz). The liveness probe should only check if the process is responsive — never include dependency checks. A database outage should not trigger pod restarts (which would make things worse); it should only remove the pod from service rotation via the readiness probe.
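It is worth making the lifecycle of the `ready` flag explicit: start false, flip it true only once dependencies are initialized, and flip it back to false on SIGTERM so the pod is removed from Service rotation before `Shutdown` runs. A minimal sketch, with the handler trimmed to the flag check only (the DB ping from above is omitted, and `probe` is a hypothetical test helper):

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready mirrors the atomic flag used by readyHandler in the section above.
var ready atomic.Bool

// readyzHandler is a trimmed sketch of the readiness handler: flag check
// only, dependency checks omitted for brevity.
func readyzHandler(w http.ResponseWriter, r *http.Request) {
	if !ready.Load() {
		http.Error(w, "not ready", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// probe simulates a kubelet readiness check against the handler.
func probe() int {
	rec := httptest.NewRecorder()
	readyzHandler(rec, httptest.NewRequest("GET", "/readyz", nil))
	return rec.Code
}

func main() {
	// During startup the pod stays out of Service rotation.
	fmt.Println(probe()) // 503

	// After pools and caches are initialized, mark the pod ready.
	ready.Store(true)
	fmt.Println(probe()) // 200

	// On SIGTERM, flip the flag back before calling srv.Shutdown so the
	// endpoints controller drops the pod while in-flight work drains.
	ready.Store(false)
	fmt.Println(probe()) // 503
}
```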

Kubernetes Deployment Manifests

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
    version: v1.4.2
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
        version: v1.4.2
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
        prometheus.io/path: "/metrics"
    spec:
      terminationGracePeriodSeconds: 30
      serviceAccountName: order-service
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        fsGroup: 65534
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: order-service
          image: registry.example.com/order-service:v1.4.2
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 9090
              name: metrics
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: order-service-secrets
                  key: database-url
            - name: GOMAXPROCS
              valueFrom:
                resourceFieldRef:
                  resource: limits.cpu
          resources:
            requests:
              cpu: 250m
              memory: 128Mi
            limits:
              # A CPU limit is required for the limits.cpu resourceFieldRef
              # above; without one, the field falls back to node allocatable.
              cpu: "1"
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /readyz
              port: http
            initialDelaySeconds: 2
            periodSeconds: 10
            timeoutSeconds: 3
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 5
            periodSeconds: 15
            timeoutSeconds: 3
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]

Notable configuration choices:

  • GOMAXPROCS from resource limits. Go defaults to one OS thread per visible CPU core, and inside a container it sees all of the host's cores rather than the CPU limit. Setting GOMAXPROCS from the CPU limit avoids CFS throttling and wasted scheduler overhead. Alternatively, import Uber's automaxprocs package to set it automatically.
  • Low memory requests. Go services are memory-efficient. A typical API server with moderate traffic needs 64-128Mi. Set requests based on actual usage from VPA recommendations.
  • Short initialDelaySeconds. Go binaries start in milliseconds, unlike JVM or interpreted languages that need 10-30 seconds for warm-up. A 2-second readiness delay is conservative.
  • Read-only root filesystem. Go's static binary needs no writable filesystem unless the application explicitly writes to disk.
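If you would rather derive the value in code than inject it through the downward API, the rounding rule is simple: ceiling-divide the milli-CPU limit by 1000, with a floor of one. A sketch (the `maxProcsFromMilli` helper is hypothetical, not part of any library):

```go
package main

import (
	"fmt"
	"runtime"
)

// maxProcsFromMilli converts a Kubernetes milli-CPU limit (e.g. 2500 for
// "2500m") into a GOMAXPROCS value, rounding up so a fractional core still
// gets one OS thread.
func maxProcsFromMilli(milliCPU int) int {
	procs := (milliCPU + 999) / 1000 // ceiling division
	if procs < 1 {
		procs = 1
	}
	return procs
}

func main() {
	// A 2500m CPU limit maps to GOMAXPROCS=3.
	runtime.GOMAXPROCS(maxProcsFromMilli(2500))
	fmt.Println(runtime.GOMAXPROCS(0))
}
```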


Observability with OpenTelemetry

go
package observability

import (
	"context"
	"net/http"
	"strconv"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.24.0"
)

var (
	httpRequestDuration = promauto.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "http_request_duration_seconds",
			Help:    "Duration of HTTP requests",
			Buckets: []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5},
		},
		[]string{"method", "path", "status"},
	)

	httpRequestsTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_requests_total",
			Help: "Total HTTP requests",
		},
		[]string{"method", "path", "status"},
	)
)

func InitTracer(ctx context.Context, serviceName string) (*sdktrace.TracerProvider, error) {
	exporter, err := otlptracegrpc.New(ctx)
	if err != nil {
		return nil, err
	}

	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceName(serviceName),
		)),
		// Sample 10% of new traces; always follow the parent's decision.
		sdktrace.WithSampler(sdktrace.ParentBased(
			sdktrace.TraceIDRatioBased(0.1),
		)),
	)
	otel.SetTracerProvider(tp)
	return tp, nil
}

func MetricsMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		wrapped := &responseWriter{ResponseWriter: w, statusCode: http.StatusOK}
		next.ServeHTTP(wrapped, r)
		duration := time.Since(start).Seconds()
		// Label with the numeric status code. Note that raw r.URL.Path can
		// explode label cardinality; prefer the matched route pattern.
		status := strconv.Itoa(wrapped.statusCode)
		httpRequestDuration.WithLabelValues(r.Method, r.URL.Path, status).Observe(duration)
		httpRequestsTotal.WithLabelValues(r.Method, r.URL.Path, status).Inc()
	})
}

func MetricsServer(addr string) *http.Server {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.Handler())
	return &http.Server{Addr: addr, Handler: mux}
}

// responseWriter captures the status code written by downstream handlers.
type responseWriter struct {
	http.ResponseWriter
	statusCode int
}

func (rw *responseWriter) WriteHeader(code int) {
	rw.statusCode = code
	rw.ResponseWriter.WriteHeader(code)
}

Run the metrics server on a separate port (9090) from the application (8080). This allows Prometheus to scrape metrics without the scrape requests appearing in application logs or metrics.

Building Kubernetes Operators in Go

Go is the standard language for Kubernetes operators via the controller-runtime library:

go
package controller

import (
	"context"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"

	appv1alpha1 "example.com/operator/api/v1alpha1"
)

type AppReconciler struct {
	client.Client
}

func (r *AppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	var app appv1alpha1.App
	if err := r.Get(ctx, req.NamespacedName, &app); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	deployment := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      app.Name,
			Namespace: app.Namespace,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &app.Spec.Replicas,
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{"app": app.Name},
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: map[string]string{"app": app.Name},
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  app.Name,
						Image: fmt.Sprintf("%s:%s", app.Spec.Image, app.Spec.Version),
						Resources: corev1.ResourceRequirements{
							Requests: corev1.ResourceList{
								corev1.ResourceCPU:    resource.MustParse("100m"),
								corev1.ResourceMemory: resource.MustParse("128Mi"),
							},
						},
					}},
				},
			},
		},
	}

	if err := ctrl.SetControllerReference(&app, deployment, r.Scheme()); err != nil {
		return ctrl.Result{}, err
	}

	logger.Info("reconciling app", "name", app.Name)
	// CreateOrUpdate logic here
	return ctrl.Result{}, nil
}

func (r *AppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&appv1alpha1.App{}).
		Owns(&appsv1.Deployment{}).
		Complete(r)
}

The controller-runtime framework handles watch events, work queues, leader election, and metrics exposition. Building a custom operator for your domain-specific resources (database provisioning, certificate management, application deployment patterns) is one of Go's strongest use cases in the Kubernetes ecosystem.
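The create-or-update step left as a comment in the reconciler above is commonly implemented with controller-runtime's `controllerutil.CreateOrUpdate` helper. A hedged sketch of how it could slot into that `Reconcile` method (this is a fragment, not a standalone program: `ctx`, `r`, `app`, `deployment`, and `logger` come from the surrounding method, and the exact mutate logic depends on your CRD):

```go
import "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

// Inside Reconcile: desired state is applied inside the mutate callback so
// the same code path handles both the initial create and later updates.
op, err := controllerutil.CreateOrUpdate(ctx, r.Client, deployment, func() error {
	deployment.Spec.Replicas = &app.Spec.Replicas
	// Re-apply any other desired fields here, then re-assert ownership.
	return ctrl.SetControllerReference(&app, deployment, r.Scheme())
})
if err != nil {
	return ctrl.Result{}, err
}
logger.Info("reconciled deployment", "operation", op)
```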

Performance Tuning for Go on Kubernetes

Connection Pool Management

go
import (
	"context"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

func NewDBPool(ctx context.Context, databaseURL string) (*pgxpool.Pool, error) {
	config, err := pgxpool.ParseConfig(databaseURL)
	if err != nil {
		return nil, err
	}

	config.MaxConns = 20
	config.MinConns = 5
	config.MaxConnLifetime = 30 * time.Minute
	config.MaxConnIdleTime = 5 * time.Minute
	config.HealthCheckPeriod = 30 * time.Second

	pool, err := pgxpool.NewWithConfig(ctx, config)
	if err != nil {
		return nil, err
	}

	// Fail fast at startup if the database is unreachable.
	if err := pool.Ping(ctx); err != nil {
		pool.Close()
		return nil, err
	}
	return pool, nil
}

Size connection pools based on pod count. If you run 10 pods with 20 connections each, the database must accept at least 200 connections, plus headroom for migrations, cron jobs, and ad-hoc sessions. Managed databases such as RDS cap max_connections based on instance memory, so plan capacity accordingly.
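The sizing arithmetic is worth encoding once rather than recomputing by hand at every scale-up. A sketch (the `poolSizePerPod` helper is hypothetical, for illustration only):

```go
package main

import "fmt"

// poolSizePerPod divides a database's connection budget across replicas,
// reserving headroomPct percent for migrations, cron jobs, and ad-hoc
// sessions, with a floor of one connection per pod.
func poolSizePerPod(dbMaxConns, replicas, headroomPct int) int {
	budget := dbMaxConns * (100 - headroomPct) / 100
	size := budget / replicas
	if size < 1 {
		size = 1
	}
	return size
}

func main() {
	// 10 pods against a database allowing 250 connections, 20% headroom:
	// budget = 200, so each pod gets MaxConns = 20 (matching the config above).
	fmt.Println(poolSizePerPod(250, 10, 20)) // 20
}
```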

Memory Optimization with GOMEMLIMIT

yaml
env:
  - name: GOMEMLIMIT
    value: "230MiB" # ~90% of the 256Mi memory limit

GOMEMLIMIT (Go 1.19+) tells the garbage collector the maximum memory target. Set it to 90% of your container memory limit to prevent OOM kills while allowing the GC to use available memory efficiently. Without this, Go's default GC target (doubling live heap) can exceed container limits during traffic spikes.

Conclusion

Go and Kubernetes are a natural pairing. Go's fast startup times (milliseconds, not seconds), small memory footprint (64-128Mi for typical API servers), and static compilation (15MB images) make it operationally efficient in container orchestration environments. The language's first-class Kubernetes client libraries and the controller-runtime framework make it the standard choice for building platform tooling.

The key Go-specific optimizations for Kubernetes are setting GOMAXPROCS from CPU limits, using GOMEMLIMIT at 90% of memory limits, sizing connection pools relative to replica count, and leveraging the static binary for distroless images. These details separate a Go deployment that merely runs from one that runs efficiently at scale.
