
Complete Guide to Monitoring & Observability with Rust

A comprehensive guide to implementing Monitoring & Observability using Rust, covering architecture, code examples, and production-ready patterns.

Muneer Puthiya Purayil

Rust's zero-cost abstractions, lack of garbage collection, and memory safety guarantees make it the ideal choice for building monitoring infrastructure that must handle extreme throughput with predictable latency. This guide covers instrumenting Rust services and building monitoring components.

Prometheus Metrics with the metrics Crate

```rust
use axum::{middleware, routing::get, Router};
use metrics::{counter, gauge, histogram};
use metrics_exporter_prometheus::PrometheusBuilder;
use std::{net::SocketAddr, time::Instant};

#[tokio::main]
async fn main() {
    // Expose a metrics endpoint on port 9090 for Prometheus to scrape.
    PrometheusBuilder::new()
        .with_http_listener(([0, 0, 0, 0], 9090))
        .install()
        .expect("failed to install Prometheus exporter");

    let app = Router::new()
        .route("/api/orders", get(list_orders).post(create_order))
        .layer(middleware::from_fn(metrics_middleware));

    let addr = SocketAddr::from(([0, 0, 0, 0], 8080));
    axum::serve(tokio::net::TcpListener::bind(addr).await.unwrap(), app)
        .await
        .unwrap();
}

// Minimal stub handlers so the example compiles end to end.
async fn list_orders() -> &'static str {
    "[]"
}

async fn create_order() -> &'static str {
    "created"
}

async fn metrics_middleware(
    req: axum::extract::Request,
    next: axum::middleware::Next,
) -> axum::response::Response {
    let method = req.method().to_string();
    let path = req.uri().path().to_string();
    let start = Instant::now();
    gauge!("http_requests_in_flight").increment(1.0);

    let response = next.run(req).await;

    gauge!("http_requests_in_flight").decrement(1.0);
    let duration = start.elapsed().as_secs_f64();
    let status = response.status().as_u16().to_string();
    counter!("http_requests_total", "method" => method.clone(), "path" => path.clone(), "status" => status)
        .increment(1);
    histogram!("http_request_duration_seconds", "method" => method, "path" => path)
        .record(duration);

    response
}
```
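As a reference point for what the exporter on port 9090 serves, here is a minimal sketch of the Prometheus text exposition format these macros ultimately produce. The label values are illustrative, and a real exporter also emits `# HELP` and `# TYPE` metadata lines:

```rust
// Render a single counter sample in the Prometheus text exposition format,
// i.e. name{label="value",...} value — the wire format served on :9090.
fn render_counter(name: &str, labels: &[(&str, &str)], value: u64) -> String {
    let label_str = labels
        .iter()
        .map(|(k, v)| format!("{k}=\"{v}\""))
        .collect::<Vec<_>>()
        .join(",");
    format!("{name}{{{label_str}}} {value}")
}

fn main() {
    let line = render_counter(
        "http_requests_total",
        &[("method", "GET"), ("path", "/api/orders"), ("status", "200")],
        42,
    );
    println!("{line}");
    // http_requests_total{method="GET",path="/api/orders",status="200"} 42
}
```

Keeping label cardinality low matters here: every distinct label combination becomes its own time series in Prometheus.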

Distributed Tracing

```rust
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_sdk::{
    trace::{self, RandomIdGenerator, Sampler},
    Resource,
};
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

fn init_tracing(service_name: &str) {
    // Export spans over OTLP/gRPC (4317 is the default OTLP gRPC port),
    // sampling 10% of root traces while honoring any upstream parent's
    // sampling decision.
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint("http://localhost:4317"),
        )
        .with_trace_config(
            trace::Config::default()
                .with_sampler(Sampler::ParentBased(Box::new(Sampler::TraceIdRatioBased(0.1))))
                .with_id_generator(RandomIdGenerator::default())
                .with_resource(Resource::new(vec![opentelemetry::KeyValue::new(
                    "service.name",
                    service_name.to_string(),
                )])),
        )
        .install_batch(opentelemetry_sdk::runtime::Tokio)
        .expect("failed to install OTLP tracer");

    // Bridge `tracing` spans into OpenTelemetry and emit JSON logs alongside.
    let otel_layer = tracing_opentelemetry::layer().with_tracer(tracer);
    let fmt_layer = tracing_subscriber::fmt::layer().json();

    tracing_subscriber::registry()
        .with(otel_layer)
        .with(fmt_layer)
        .init();
}
```
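The `TraceIdRatioBased(0.1)` sampler keeps roughly one trace in ten. Conceptually it treats bits of the randomly generated trace ID as a uniform value and compares them against a threshold; the sketch below simplifies the actual SDK internals but captures the decision:

```rust
// Simplified ratio-based sampling: treat the low 64 bits of the 128-bit
// trace ID as uniform random and keep the trace when they fall below
// ratio * u64::MAX. ParentBased then wraps this so child spans inherit the
// parent's decision instead of re-rolling per service.
fn should_sample(trace_id: u128, ratio: f64) -> bool {
    let low = trace_id as u64;
    (low as f64) < ratio * (u64::MAX as f64)
}

fn main() {
    // A trace ID with small low bits is kept at a 10% ratio...
    println!("{}", should_sample(42, 0.1));
    // ...while one with large low bits is dropped.
    println!("{}", should_sample(u64::MAX as u128, 0.1));
}
```

Because the decision is a pure function of the trace ID, every service that sees the same trace makes the same choice, which is what keeps sampled traces complete across service boundaries.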


Custom Metrics Collector

```rust
use metrics::{describe_gauge, gauge};

/// Snapshot of host-level metrics. The `disk_io_*` fields are collected the
/// same way as CPU and memory but omitted from the loop below for brevity.
#[allow(dead_code)]
struct SystemMetrics {
    cpu_usage: f64,
    memory_used_bytes: u64,
    disk_io_read_bytes: u64,
    disk_io_write_bytes: u64,
}

async fn collect_system_metrics() {
    describe_gauge!("system_cpu_usage_percent", "CPU usage percentage");
    describe_gauge!("system_memory_used_bytes", "Memory used in bytes");

    loop {
        let cpu = read_cpu_usage().await;
        let mem = read_memory_usage().await;

        gauge!("system_cpu_usage_percent").set(cpu);
        gauge!("system_memory_used_bytes").set(mem as f64);

        // Match the Prometheus scrape interval to avoid wasted samples.
        tokio::time::sleep(std::time::Duration::from_secs(15)).await;
    }
}

// Placeholder readers: in production these would parse /proc/stat and
// /proc/meminfo on Linux, or use a crate such as sysinfo.
async fn read_cpu_usage() -> f64 {
    0.0
}

async fn read_memory_usage() -> u64 {
    0
}
```
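The readers above are left as placeholders. On Linux, one common approach for memory is parsing /proc/meminfo; here is a sketch of that parsing, run against an in-memory string so it works anywhere (the field names match the kernel's actual output):

```rust
// Compute used memory in bytes from /proc/meminfo-style text by subtracting
// MemAvailable from MemTotal (both reported in kB). In the collector above,
// read_memory_usage would read the real file and call this.
fn used_memory_bytes(meminfo: &str) -> Option<u64> {
    let mut total = None;
    let mut available = None;
    for line in meminfo.lines() {
        let mut parts = line.split_whitespace();
        match parts.next() {
            Some("MemTotal:") => total = parts.next()?.parse::<u64>().ok(),
            Some("MemAvailable:") => available = parts.next()?.parse::<u64>().ok(),
            _ => {}
        }
    }
    Some((total? - available?) * 1024) // kB -> bytes
}

fn main() {
    let sample = "MemTotal:       16384000 kB\nMemAvailable:    8192000 kB\n";
    println!("{:?}", used_memory_bytes(sample));
}
```

Returning `Option` lets the caller distinguish "file format changed" from a real zero reading instead of silently exporting a bogus gauge value.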

High-Performance Log Processing

```rust
use metrics::counter;
use serde::Deserialize;
use tokio::fs::File;
use tokio::io::{AsyncBufReadExt, BufReader};

#[derive(Deserialize)]
struct LogEntry {
    timestamp: String,
    level: String,
    service: String,
    message: String,
    trace_id: Option<String>,
}

async fn process_log_stream(path: &str) -> anyhow::Result<()> {
    let file = File::open(path).await?;
    let reader = BufReader::new(file);
    let mut lines = reader.lines();

    // Bounded channel: if the consumer falls behind, the producer blocks
    // instead of growing memory without limit.
    let (tx, mut rx) = tokio::sync::mpsc::channel::<LogEntry>(10_000);

    // Producer: parse log lines
    tokio::spawn(async move {
        while let Ok(Some(line)) = lines.next_line().await {
            if let Ok(entry) = serde_json::from_str::<LogEntry>(&line) {
                if entry.level == "ERROR" {
                    counter!("log_errors_total", "service" => entry.service.clone()).increment(1);
                }
                let _ = tx.send(entry).await;
            }
        }
    });

    // Consumer: batch and forward
    let mut batch = Vec::with_capacity(1000);
    while let Some(entry) = rx.recv().await {
        batch.push(entry);
        if batch.len() >= 1000 {
            forward_batch(&batch).await;
            batch.clear();
        }
    }
    // Flush the final partial batch once the producer closes the channel.
    if !batch.is_empty() {
        forward_batch(&batch).await;
    }
    Ok(())
}

async fn forward_batch(batch: &[LogEntry]) {
    // Ship the batch to downstream storage (e.g. Loki or Elasticsearch).
    println!("forwarding {} entries", batch.len());
}
```

Rust processes log streams at 500K-1M lines/second per core — 5-10x faster than Python and 2-3x faster than Go for parsing-heavy workloads.
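One technique behind throughput numbers like these is skipping full JSON deserialization for lines the pipeline can classify cheaply. A std-only sketch of such a pre-filter (it assumes compact serialization with no space after the colon, as most structured loggers emit; a production version would fall back to real parsing on a miss):

```rust
// Cheap pre-filter: scan for the serialized ERROR level before paying for
// full deserialization into LogEntry. A substring check is allocation-free,
// which is where much of the advantage in parsing-heavy workloads lives.
fn is_error_line(line: &str) -> bool {
    line.contains("\"level\":\"ERROR\"")
}

fn main() {
    println!("{}", is_error_line(r#"{"level":"ERROR","service":"api","message":"boom"}"#));
    println!("{}", is_error_line(r#"{"level":"INFO","service":"api","message":"ok"}"#));
}
```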

Conclusion

Rust monitoring infrastructure operates at the efficiency frontier — minimum memory, maximum throughput, zero GC pauses. The metrics crate provides ergonomic instrumentation with zero-allocation recording on hot paths. OpenTelemetry's Rust SDK enables distributed tracing with the same performance characteristics. For building monitoring agents, log processors, and data pipelines where every byte of memory and microsecond of latency matters, Rust is unmatched.

The practical trade-off is development velocity. Rust monitoring tools take 2-3x longer to build than Go equivalents. This investment is justified for infrastructure that runs at the scale of millions of events per second or on resource-constrained edge devices. For standard service instrumentation, the Rust ecosystem is functional but less ergonomic than Go or Java alternatives.



Muneer Puthiya Purayil

SaaS Architect & AI Systems Engineer. 10+ years shipping production infrastructure across fintech, automotive, e-commerce, and healthcare.
