
Complete Guide to Monitoring & Observability with Rust

A comprehensive guide to implementing Monitoring & Observability using Rust, covering architecture, code examples, and production-ready patterns.

Muneer Puthiya Purayil

Rust's zero-cost abstractions, lack of garbage collection, and memory safety guarantees make it the ideal choice for building monitoring infrastructure that must handle extreme throughput with predictable latency. This guide covers instrumenting Rust services and building monitoring components.

Prometheus Metrics with the metrics Crate

```rust
use axum::{middleware, routing::get, Router};
use metrics::{counter, gauge, histogram};
use metrics_exporter_prometheus::PrometheusBuilder;
use std::{net::SocketAddr, time::Instant};

#[tokio::main]
async fn main() {
    // Expose a metrics endpoint on port 9090 for Prometheus to scrape.
    PrometheusBuilder::new()
        .with_http_listener(([0, 0, 0, 0], 9090))
        .install()
        .expect("failed to install Prometheus exporter");

    let app = Router::new()
        .route("/api/orders", get(list_orders).post(create_order))
        .layer(middleware::from_fn(metrics_middleware));

    let addr = SocketAddr::from(([0, 0, 0, 0], 8080));
    axum::serve(tokio::net::TcpListener::bind(addr).await.unwrap(), app)
        .await
        .unwrap();
}

// Minimal stub handlers so the example compiles end to end.
async fn list_orders() -> &'static str {
    "[]"
}

async fn create_order() -> &'static str {
    "created"
}

async fn metrics_middleware(
    req: axum::extract::Request,
    next: axum::middleware::Next,
) -> axum::response::Response {
    let method = req.method().to_string();
    let path = req.uri().path().to_string();
    let start = Instant::now();
    gauge!("http_requests_in_flight").increment(1.0);

    let response = next.run(req).await;

    gauge!("http_requests_in_flight").decrement(1.0);
    let duration = start.elapsed().as_secs_f64();
    let status = response.status().as_u16().to_string();
    counter!("http_requests_total", "method" => method.clone(), "path" => path.clone(), "status" => status)
        .increment(1);
    histogram!("http_request_duration_seconds", "method" => method, "path" => path)
        .record(duration);

    response
}
```
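As a reference point for what the exporter on port 9090 serves, here is a minimal sketch of the Prometheus text exposition format these macros ultimately produce. The label values are illustrative, and a real exporter also emits `# HELP` and `# TYPE` metadata lines:

```rust
// Render a single counter sample in the Prometheus text exposition format,
// i.e. name{label="value",...} value — the wire format served on :9090.
fn render_counter(name: &str, labels: &[(&str, &str)], value: u64) -> String {
    let label_str = labels
        .iter()
        .map(|(k, v)| format!("{k}=\"{v}\""))
        .collect::<Vec<_>>()
        .join(",");
    format!("{name}{{{label_str}}} {value}")
}

fn main() {
    let line = render_counter(
        "http_requests_total",
        &[("method", "GET"), ("path", "/api/orders"), ("status", "200")],
        42,
    );
    println!("{line}");
    // http_requests_total{method="GET",path="/api/orders",status="200"} 42
}
```

Keeping label cardinality low matters here: every distinct label combination becomes its own time series in Prometheus.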

Distributed Tracing

```rust
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_sdk::{
    trace::{self, RandomIdGenerator, Sampler},
    Resource,
};
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

fn init_tracing(service_name: &str) {
    // Export spans over OTLP/gRPC (4317 is the default OTLP gRPC port),
    // sampling 10% of root traces while honoring any upstream parent's
    // sampling decision.
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint("http://localhost:4317"),
        )
        .with_trace_config(
            trace::Config::default()
                .with_sampler(Sampler::ParentBased(Box::new(Sampler::TraceIdRatioBased(0.1))))
                .with_id_generator(RandomIdGenerator::default())
                .with_resource(Resource::new(vec![opentelemetry::KeyValue::new(
                    "service.name",
                    service_name.to_string(),
                )])),
        )
        .install_batch(opentelemetry_sdk::runtime::Tokio)
        .expect("failed to install OTLP tracer");

    // Bridge `tracing` spans into OpenTelemetry and emit JSON logs alongside.
    let otel_layer = tracing_opentelemetry::layer().with_tracer(tracer);
    let fmt_layer = tracing_subscriber::fmt::layer().json();

    tracing_subscriber::registry()
        .with(otel_layer)
        .with(fmt_layer)
        .init();
}
```
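The `TraceIdRatioBased(0.1)` sampler keeps roughly one trace in ten. Conceptually it treats bits of the randomly generated trace ID as a uniform value and compares them against a threshold; the sketch below simplifies the actual SDK internals but captures the decision:

```rust
// Simplified ratio-based sampling: treat the low 64 bits of the 128-bit
// trace ID as uniform random and keep the trace when they fall below
// ratio * u64::MAX. ParentBased then wraps this so child spans inherit the
// parent's decision instead of re-rolling per service.
fn should_sample(trace_id: u128, ratio: f64) -> bool {
    let low = trace_id as u64;
    (low as f64) < ratio * (u64::MAX as f64)
}

fn main() {
    // A trace ID with small low bits is kept at a 10% ratio...
    println!("{}", should_sample(42, 0.1));
    // ...while one with large low bits is dropped.
    println!("{}", should_sample(u64::MAX as u128, 0.1));
}
```

Because the decision is a pure function of the trace ID, every service that sees the same trace makes the same choice, which is what keeps sampled traces complete across service boundaries.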


Custom Metrics Collector

```rust
use metrics::{describe_gauge, gauge};

/// Snapshot of host-level metrics. The `disk_io_*` fields are collected the
/// same way as CPU and memory but omitted from the loop below for brevity.
#[allow(dead_code)]
struct SystemMetrics {
    cpu_usage: f64,
    memory_used_bytes: u64,
    disk_io_read_bytes: u64,
    disk_io_write_bytes: u64,
}

async fn collect_system_metrics() {
    describe_gauge!("system_cpu_usage_percent", "CPU usage percentage");
    describe_gauge!("system_memory_used_bytes", "Memory used in bytes");

    loop {
        let cpu = read_cpu_usage().await;
        let mem = read_memory_usage().await;

        gauge!("system_cpu_usage_percent").set(cpu);
        gauge!("system_memory_used_bytes").set(mem as f64);

        // Match the Prometheus scrape interval to avoid wasted samples.
        tokio::time::sleep(std::time::Duration::from_secs(15)).await;
    }
}

// Placeholder readers: in production these would parse /proc/stat and
// /proc/meminfo on Linux, or use a crate such as sysinfo.
async fn read_cpu_usage() -> f64 {
    0.0
}

async fn read_memory_usage() -> u64 {
    0
}
```
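The readers above are left as placeholders. On Linux, one common approach for memory is parsing /proc/meminfo; here is a sketch of that parsing, run against an in-memory string so it works anywhere (the field names match the kernel's actual output):

```rust
// Compute used memory in bytes from /proc/meminfo-style text by subtracting
// MemAvailable from MemTotal (both reported in kB). In the collector above,
// read_memory_usage would read the real file and call this.
fn used_memory_bytes(meminfo: &str) -> Option<u64> {
    let mut total = None;
    let mut available = None;
    for line in meminfo.lines() {
        let mut parts = line.split_whitespace();
        match parts.next() {
            Some("MemTotal:") => total = parts.next()?.parse::<u64>().ok(),
            Some("MemAvailable:") => available = parts.next()?.parse::<u64>().ok(),
            _ => {}
        }
    }
    Some((total? - available?) * 1024) // kB -> bytes
}

fn main() {
    let sample = "MemTotal:       16384000 kB\nMemAvailable:    8192000 kB\n";
    println!("{:?}", used_memory_bytes(sample));
}
```

Returning `Option` lets the caller distinguish "file format changed" from a real zero reading instead of silently exporting a bogus gauge value.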

High-Performance Log Processing

```rust
use metrics::counter;
use serde::Deserialize;
use tokio::fs::File;
use tokio::io::{AsyncBufReadExt, BufReader};

#[derive(Deserialize)]
struct LogEntry {
    timestamp: String,
    level: String,
    service: String,
    message: String,
    trace_id: Option<String>,
}

async fn process_log_stream(path: &str) -> anyhow::Result<()> {
    let file = File::open(path).await?;
    let reader = BufReader::new(file);
    let mut lines = reader.lines();

    // Bounded channel: if the consumer falls behind, the producer blocks
    // instead of growing memory without limit.
    let (tx, mut rx) = tokio::sync::mpsc::channel::<LogEntry>(10_000);

    // Producer: parse log lines
    tokio::spawn(async move {
        while let Ok(Some(line)) = lines.next_line().await {
            if let Ok(entry) = serde_json::from_str::<LogEntry>(&line) {
                if entry.level == "ERROR" {
                    counter!("log_errors_total", "service" => entry.service.clone()).increment(1);
                }
                let _ = tx.send(entry).await;
            }
        }
    });

    // Consumer: batch and forward
    let mut batch = Vec::with_capacity(1000);
    while let Some(entry) = rx.recv().await {
        batch.push(entry);
        if batch.len() >= 1000 {
            forward_batch(&batch).await;
            batch.clear();
        }
    }
    // Flush the final partial batch once the producer closes the channel.
    if !batch.is_empty() {
        forward_batch(&batch).await;
    }
    Ok(())
}

async fn forward_batch(batch: &[LogEntry]) {
    // Ship the batch to downstream storage (e.g. Loki or Elasticsearch).
    println!("forwarding {} entries", batch.len());
}
```

Rust processes log streams at 500K-1M lines/second per core — 5-10x faster than Python and 2-3x faster than Go for parsing-heavy workloads.
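One technique behind throughput numbers like these is skipping full JSON deserialization for lines the pipeline can classify cheaply. A std-only sketch of such a pre-filter (it assumes compact serialization with no space after the colon, as most structured loggers emit; a production version would fall back to real parsing on a miss):

```rust
// Cheap pre-filter: scan for the serialized ERROR level before paying for
// full deserialization into LogEntry. A substring check is allocation-free,
// which is where much of the advantage in parsing-heavy workloads lives.
fn is_error_line(line: &str) -> bool {
    line.contains("\"level\":\"ERROR\"")
}

fn main() {
    println!("{}", is_error_line(r#"{"level":"ERROR","service":"api","message":"boom"}"#));
    println!("{}", is_error_line(r#"{"level":"INFO","service":"api","message":"ok"}"#));
}
```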

Conclusion

Rust monitoring infrastructure operates at the efficiency frontier — minimum memory, maximum throughput, zero GC pauses. The metrics crate provides ergonomic instrumentation with zero-allocation recording on hot paths. OpenTelemetry's Rust SDK enables distributed tracing with the same performance characteristics. For building monitoring agents, log processors, and data pipelines where every byte of memory and microsecond of latency matters, Rust is unmatched.

The practical trade-off is development velocity. Rust monitoring tools take 2-3x longer to build than Go equivalents. This investment is justified for infrastructure that runs at the scale of millions of events per second or on resource-constrained edge devices. For standard service instrumentation, the Rust ecosystem is functional but less ergonomic than Go or Java alternatives.



Muneer Puthiya Purayil

SaaS Architect & AI Systems Engineer. 10+ years shipping production infrastructure across fintech, automotive, e-commerce, and healthcare.
