
Complete Guide to Kubernetes Production Setup with Java

A comprehensive guide to running Java workloads on Kubernetes in production, covering container-aware JVM configuration, code examples, and production-ready patterns.

Muneer Puthiya Purayil · 19 min read

Java's mature ecosystem, battle-tested libraries, and the JVM's runtime optimizations make it a strong choice for Kubernetes workloads — if you account for the JVM's unique operational characteristics. Container-aware JVM tuning, proper memory configuration, and startup optimization are essential for running Java effectively in orchestrated environments.

Container-Aware JVM Configuration

The JVM historically had poor container support, defaulting to host-level CPU and memory detection. Modern JVMs (17+) handle containers correctly, but explicit configuration ensures predictable behavior.
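To see what the runtime actually detected inside a container, a minimal check can print the CPU count and maximum heap the JVM derived from cgroup limits. This is an illustrative sketch (the `ContainerCheck` class name is hypothetical):

```java
// ContainerCheck.java — print what the JVM detected for CPU and memory.
// Run inside the container to verify container-aware detection.
public class ContainerCheck {
    public static int detectedCpus() {
        // Reflects the container CPU limit when container support is active (default in 17+)
        return Runtime.getRuntime().availableProcessors();
    }

    public static long detectedMaxHeapBytes() {
        // Derived from the container memory limit and MaxRAMPercentage
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.printf("CPUs: %d, max heap: %d MiB%n",
                detectedCpus(), detectedMaxHeapBytes() / (1024 * 1024));
    }
}
```

Running this in a pod with a 1Gi limit and `-XX:MaxRAMPercentage=75.0` should report a max heap close to 768 MiB; a host-sized number instead means the container limits are not being picked up.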

dockerfile
FROM eclipse-temurin:21-jdk-alpine AS builder
WORKDIR /app
COPY gradle/ gradle/
COPY gradlew build.gradle.kts settings.gradle.kts ./
RUN ./gradlew dependencies --no-daemon
COPY src/ src/
RUN ./gradlew bootJar --no-daemon -x test

FROM eclipse-temurin:21-jre-alpine
RUN addgroup -g 1001 -S appuser && adduser -S appuser -u 1001
WORKDIR /app
COPY --from=builder /app/build/libs/app.jar ./app.jar
USER appuser
EXPOSE 8080
ENTRYPOINT ["java", \
  "-XX:MaxRAMPercentage=75.0", \
  "-XX:InitialRAMPercentage=50.0", \
  "-XX:+UseG1GC", \
  "-XX:MaxGCPauseMillis=200", \
  "-XX:+UseStringDeduplication", \
  "-Djava.security.egd=file:/dev/urandom", \
  "-jar", "app.jar"]

Critical JVM flags for Kubernetes:

  • MaxRAMPercentage=75.0 — The JVM uses 75% of the container memory limit for heap, leaving 25% for metaspace, thread stacks, native memory, and the OS page cache. Setting this too high (>80%) leads to OOM kills because the JVM's non-heap memory needs are significant.
  • UseG1GC — G1 is the default in JDK 17+ and provides the best balance of throughput and latency for typical web services.
  • UseStringDeduplication — In microservices with heavy JSON processing, duplicate strings consume 10-25% of heap. This flag deduplicates them during GC with minimal overhead.
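The `MaxRAMPercentage` arithmetic is easy to sanity-check. A small helper (illustrative, with hypothetical names) computes the heap the JVM will size for a given container limit:

```java
// HeapSizing.java — back-of-envelope heap sizing under MaxRAMPercentage.
public class HeapSizing {
    /** Heap the JVM will allocate for a given container limit and MaxRAMPercentage. */
    public static long maxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long oneGi = 1024L * 1024 * 1024;
        // A 1Gi limit at 75% yields a 768 MiB heap, leaving 256 MiB
        // for metaspace, thread stacks, and native memory.
        System.out.println(maxHeapBytes(oneGi, 75.0) / (1024 * 1024) + " MiB");
    }
}
```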

Spring Boot on Kubernetes

Application Configuration

yaml
# application.yaml
server:
  port: 8080
  shutdown: graceful
  tomcat:
    threads:
      max: 200
    accept-count: 100
    connection-timeout: 5s

spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
  datasource:
    url: ${DATABASE_URL}
    hikari:
      maximum-pool-size: 20
      minimum-idle: 5
      idle-timeout: 300000
      max-lifetime: 1800000
      connection-timeout: 5000

management:
  endpoints:
    web:
      exposure:
        include: health,prometheus,info
  endpoint:
    health:
      probes:
        enabled: true
      group:
        readiness:
          include: db,diskSpace
        liveness:
          include: ping
  metrics:
    tags:
      application: ${spring.application.name}

Spring Boot 3.x includes built-in Kubernetes probe support. Setting management.endpoint.health.probes.enabled=true exposes /actuator/health/liveness and /actuator/health/readiness automatically. The readiness probe checks database connectivity; the liveness probe only checks process health.

Kubernetes Deployment

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/actuator/prometheus"
    spec:
      terminationGracePeriodSeconds: 35
      containers:
        - name: order-service
          image: registry.example.com/order-service:v2.1.0
          ports:
            - containerPort: 8080
              name: http
          env:
            - name: JAVA_TOOL_OPTIONS
              value: "-XX:MaxRAMPercentage=75.0 -XX:+UseG1GC"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: order-service-secrets
                  key: database-url
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              memory: 1Gi
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: http
            initialDelaySeconds: 20
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: http
            initialDelaySeconds: 30
            periodSeconds: 15
            timeoutSeconds: 5
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 30
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 5"]
Java-specific deployment considerations:

  • Higher initialDelaySeconds. Spring Boot applications take 10-30 seconds to start, unlike Go (milliseconds) or Node.js (1-3 seconds). Use a startupProbe with generous failure thresholds instead of delaying the readiness probe excessively.
  • preStop sleep. The 5-second sleep before SIGTERM gives the Kubernetes endpoints controller time to remove the pod from service routing. Without this, in-flight requests hit terminating pods during the brief propagation delay.
  • Memory requests at 512Mi minimum. The JVM's baseline memory consumption (heap + metaspace + thread stacks + GC overhead) rarely goes below 300Mi for a Spring Boot application. Requesting 256Mi will cause OOM kills.
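Spring Boot's graceful shutdown (enabled by `server.shutdown: graceful`) ultimately rests on a JVM shutdown hook that runs when the kubelet sends SIGTERM after the preStop hook completes. A minimal plain-Java sketch of that mechanism, with a hypothetical drain callback:

```java
// GracefulShutdown.java — sketch of the shutdown-hook mechanism behind graceful shutdown.
public class GracefulShutdown {
    /**
     * Registers a drain task that runs on SIGTERM. In Kubernetes,
     * terminationGracePeriodSeconds bounds how long hooks may run before SIGKILL.
     */
    public static Thread registerHook(Runnable drain) {
        Thread hook = new Thread(drain, "graceful-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        registerHook(() -> System.out.println("draining in-flight requests..."));
    }
}
```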

Startup Optimization

JVM startup time is the primary operational challenge in Kubernetes. A slow-starting application delays rolling updates and scaling events.

Class Data Sharing (CDS)

dockerfile
FROM eclipse-temurin:21-jre-alpine

COPY app.jar /app/app.jar
WORKDIR /app

# Generate CDS archive during build: start the app, let the Spring
# context refresh, then exit and dump loaded-class metadata
RUN java -XX:ArchiveClassesAtExit=app-cds.jsa \
    -Dspring.context.exit=onRefresh \
    -jar app.jar || true

ENTRYPOINT ["java", \
  "-XX:SharedArchiveFile=app-cds.jsa", \
  "-XX:MaxRAMPercentage=75.0", \
  "-jar", "app.jar"]

CDS pre-processes class metadata and stores it in a shared archive. This reduces startup time by 20-40% for Spring Boot applications by avoiding redundant class loading and verification work at startup.

GraalVM Native Image

dockerfile
FROM ghcr.io/graalvm/native-image-community:21 AS builder
WORKDIR /app
COPY gradle/ gradle/
COPY gradlew build.gradle.kts settings.gradle.kts ./
RUN ./gradlew dependencies --no-daemon
COPY src/ src/
RUN ./gradlew nativeCompile --no-daemon

FROM gcr.io/distroless/base-debian12:nonroot
COPY --from=builder /app/build/native/nativeCompile/app /app
EXPOSE 8080
ENTRYPOINT ["/app"]

GraalVM native images start in 100-300ms (vs 10-30 seconds for JVM) and use 50-70% less memory. The tradeoff is longer build times (5-15 minutes), potential reflection and serialization compatibility issues, and reduced peak throughput. Native images are ideal for serverless-style workloads with frequent scale-to-zero events; traditional JVM is better for long-running services where throughput matters more than startup.

Micrometer Metrics and Prometheus

java
@Configuration
public class MetricsConfig {

    @Bean
    public TimedAspect timedAspect(MeterRegistry registry) {
        return new TimedAspect(registry);
    }

    @Bean
    public MeterRegistryCustomizer<MeterRegistry> commonTags(
            @Value("${spring.application.name}") String appName) {
        return registry -> registry.config()
            .commonTags(
                "application", appName,
                "region", System.getenv().getOrDefault("AWS_REGION", "unknown")
            );
    }
}

@RestController
@RequestMapping("/api/v1/orders")
public class OrderController {

    private final OrderService orderService;
    private final Counter orderCounter;
    private final Timer orderTimer;

    public OrderController(OrderService orderService, MeterRegistry registry) {
        this.orderService = orderService;
        this.orderCounter = Counter.builder("orders.created")
            .description("Total orders created")
            .register(registry);
        this.orderTimer = Timer.builder("orders.processing.duration")
            .description("Order processing duration")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(registry);
    }

    @PostMapping
    public ResponseEntity<Order> createOrder(@RequestBody @Valid CreateOrderRequest request) {
        return orderTimer.record(() -> {
            Order order = orderService.create(request);
            orderCounter.increment();
            return ResponseEntity.status(HttpStatus.CREATED).body(order);
        });
    }
}

Micrometer integrates with Spring Boot's actuator to expose JVM-specific metrics (heap usage, GC pause times, thread counts) alongside application metrics. Key JVM metrics to monitor in Kubernetes:

  • jvm.memory.used — Watch for memory creep toward the container limit
  • jvm.gc.pause — G1 pauses above 500ms indicate heap sizing issues
  • jvm.threads.live — Thread count growth without corresponding load suggests thread leaks
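The figures Micrometer exports are backed by the platform MXBeans, so the same signals can be sanity-checked in-process. An illustrative helper (hypothetical class name):

```java
// JvmStats.java — read the JVM signals that back Micrometer's jvm.* metrics.
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class JvmStats {
    /** Backs jvm.memory.used for the heap area. */
    public static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Backs jvm.threads.live. */
    public static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Total GC cycles across collectors; pause times back jvm.gc.pause. */
    public static long totalGcCount() {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount()); // -1 means unsupported
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %d MiB, live threads: %d, GC cycles: %d%n",
                heapUsedBytes() / (1024 * 1024), liveThreads(), totalGcCount());
    }
}
```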


Connection Pool Tuning

java
@Configuration
public class DataSourceConfig {

    @Bean
    @ConfigurationProperties("spring.datasource.hikari")
    public HikariConfig hikariConfig(MeterRegistry meterRegistry) {
        HikariConfig config = new HikariConfig();
        // Publish pool metrics (active, idle, pending) through Micrometer
        config.setMetricsTrackerFactory(new MicrometerMetricsTrackerFactory(meterRegistry));
        return config;
    }
}

HikariCP sizing formula for Kubernetes: pool_size = (pod_replicas * max_pool_per_pod) ≤ database_max_connections * 0.8. For 3 replicas with 20 connections each, you need a database supporting at least 75 connections (60 active + 20% headroom). During HPA scaling events, connection counts spike — configure the database for your maximum pod count, not your current count.
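The sizing rule can be turned into a small calculator. An illustrative helper (hypothetical names) that implements the formula as stated:

```java
// PoolSizing.java — minimum database max_connections for a given pod fleet,
// per the rule: replicas * pool_per_pod <= db_max_connections * 0.8.
public class PoolSizing {
    public static int requiredDbConnections(int replicas, int poolPerPod) {
        // Solve the inequality for db_max_connections and round up
        return (int) Math.ceil(replicas * poolPerPod / 0.8);
    }

    public static void main(String[] args) {
        // 3 replicas x 20 connections each -> the database must allow at least 75.
        // Size for the HPA maximum, not the current replica count.
        System.out.println(requiredDbConnections(3, 20));
    }
}
```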

JVM Warm-up and Readiness

The JVM's JIT compiler optimizes frequently-executed code paths after observing execution patterns. A freshly started pod has higher latency until the JIT warms up.

java
@Component
public class WarmupRunner implements ApplicationRunner {

    private final RestTemplate restTemplate;

    public WarmupRunner(RestTemplateBuilder builder) {
        this.restTemplate = builder.build();
    }

    @Override
    public void run(ApplicationArguments args) {
        // Hit critical endpoints repeatedly to trigger JIT compilation
        for (int i = 0; i < 1000; i++) {
            try {
                restTemplate.getForEntity("http://localhost:8080/api/v1/health", String.class);
            } catch (Exception ignored) {
                // Warm-up failures are harmless; the pod is not serving traffic yet
            }
        }
    }
}

A lightweight warm-up routine that exercises hot paths reduces p99 latency for the first few minutes after deployment by 40-60%. Combine this with a readiness probe that includes a latency check to ensure traffic only reaches warmed-up pods.
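One way to implement that latency check is a gate that tracks recent request timings and reports ready only once a full window averages below a threshold. This is a sketch (the `LatencyGate` class is hypothetical) that could back a custom HealthIndicator in the readiness group:

```java
// LatencyGate.java — readiness gate that opens once recent latency settles.
import java.util.ArrayDeque;
import java.util.Deque;

public class LatencyGate {
    private final Deque<Long> recentMillis = new ArrayDeque<>();
    private final int window;
    private final long thresholdMillis;

    public LatencyGate(int window, long thresholdMillis) {
        this.window = window;
        this.thresholdMillis = thresholdMillis;
    }

    /** Record one request's elapsed time, keeping only the latest window. */
    public synchronized void record(long elapsedMillis) {
        recentMillis.addLast(elapsedMillis);
        if (recentMillis.size() > window) {
            recentMillis.removeFirst();
        }
    }

    /** Ready once the window is full and its average is under the threshold. */
    public synchronized boolean ready() {
        if (recentMillis.size() < window) {
            return false;
        }
        long sum = 0;
        for (long m : recentMillis) {
            sum += m;
        }
        return (sum / window) < thresholdMillis;
    }
}
```

Slow cold-start samples age out of the window as the JIT warms up, so the gate opens on its own without a fixed delay.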

Anti-Patterns to Avoid

Using -Xmx instead of MaxRAMPercentage. Hard-coded heap sizes don't adapt to different container memory limits across environments. A service configured with -Xmx512m running in a 2Gi container wastes 75% of available memory.

Ignoring non-heap memory. Metaspace, thread stacks (1MB per thread by default), direct byte buffers, and JNI allocations consume memory outside the heap. A common failure mode: -Xmx900m in a 1Gi container with 200 threads uses 900MB heap + 200MB stacks + 100MB metaspace = OOM kill.
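That failure arithmetic can be captured in a quick budget check. An illustrative helper (hypothetical names):

```java
// MemoryBudget.java — rough per-pod JVM footprint vs the container limit.
public class MemoryBudget {
    /** Heap + thread stacks + metaspace, in MiB. Ignores smaller native pools. */
    public static int footprintMiB(int heapMiB, int threads,
                                   int stackMiBPerThread, int metaspaceMiB) {
        return heapMiB + threads * stackMiBPerThread + metaspaceMiB;
    }

    public static boolean fitsLimit(int footprintMiB, int containerLimitMiB) {
        return footprintMiB < containerLimitMiB;
    }

    public static void main(String[] args) {
        // The failure mode above: 900 MiB heap, 200 threads, 100 MiB metaspace
        int footprint = footprintMiB(900, 200, 1, 100);
        System.out.println(footprint + " MiB vs 1024 MiB limit: fits="
                + fitsLimit(footprint, 1024));
    }
}
```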

Disabling JVM ergonomics. Flags like -XX:-UseContainerSupport or fixed -XX:ParallelGCThreads override the JVM's automatic container detection. Unless you have measured a specific problem, let the JVM auto-tune based on container limits.

Fat JARs with embedded resources. Spring Boot fat JARs that bundle static assets, test dependencies, or documentation inflate image size. Use Maven or Gradle profiles to exclude non-production dependencies and use a separate CDN for static assets.

Conclusion

Running Java effectively on Kubernetes requires understanding the JVM's resource model. The JVM is not a lightweight runtime — it needs adequate memory for heap, metaspace, and thread stacks, and it needs time to warm up the JIT compiler. Kubernetes operators who account for these characteristics with proper memory configuration (MaxRAMPercentage at 75%), startup probes for slow initialization, and warm-up routines for JIT compilation build Java services that perform predictably under orchestration.

The Spring Boot ecosystem's Kubernetes integration — actuator health probes, Micrometer metrics, graceful shutdown — has matured significantly. Combined with GraalVM native images for startup-critical workloads and CDS for traditional JVM deployments, Java remains a competitive choice for Kubernetes-native applications where the ecosystem's maturity and library breadth outweigh the operational complexity of JVM tuning.

Muneer Puthiya Purayil

SaaS Architect & AI Systems Engineer. 10+ years shipping production infrastructure across fintech, automotive, e-commerce, and healthcare.
