Introduction
For years, observability has been built on the foundation of three core telemetry pillars: metrics, logs, and traces. These pillars have served as the cornerstone for understanding and managing the behavior of complex distributed systems:
- Metrics: Show what changed over time with numeric system data.
- Logs: Show what happened with detailed event records.
- Traces: Show where time was spent across services and requests.
While the three observability pillars are vital for monitoring modern applications, they miss a key insight: why resources are being used inefficiently at the code level, and how that changes over time.
Continuous profiling, now emerging as the fourth pillar, fills this gap. It offers detailed, code-level visibility into how applications consume resources in production. Unlike metrics, logs, and traces, which focus on system-wide or service-level insights, continuous profiling drills down to function- and even line-level performance.
Continuous Profiling
Continuous profiling collects high-resolution CPU, memory, and other resource usage data from production applications on an ongoing basis rather than ad hoc. It provides a systematic method of collecting and analyzing performance data from production systems with minimal overhead. (Profiling, in simple terms, measures how your program uses resources like CPU time and memory. Think of it as a way to see inside your application and understand what’s going on.) A small code sketch after the list below shows the sampling idea in practice.
Unlike traditional profiling methods that are run manually in test environments or during specific troubleshooting sessions, continuous profiling:
- Runs constantly in production with very low overhead (typically 1-5% CPU overhead)
- Collects lightweight samples across distributed systems automatically
- Provides real-time insights into how code behaves in real-world environments
- Stores profiling data in a time-series database for historical analysis and trend detection
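To make the sampling idea concrete, here is a toy sketch in Go using only the standard library’s runtime/pprof package. It is purely illustrative of periodically capturing short CPU profiles, not how any particular profiling agent works; the one-second window and ten-second interval are arbitrary choices.

```go
package main

import (
	"bytes"
	"log"
	"runtime/pprof"
	"time"
)

func main() {
	for {
		// Capture a short CPU profile into an in-memory buffer.
		var buf bytes.Buffer
		if err := pprof.StartCPUProfile(&buf); err != nil {
			log.Fatal(err)
		}
		time.Sleep(1 * time.Second) // sampling window
		pprof.StopCPUProfile()

		// A real agent would ship this sample to a profiling backend;
		// here we only log its size.
		log.Printf("collected CPU profile sample: %d bytes", buf.Len())

		time.Sleep(10 * time.Second) // idle time between samples keeps overhead low
	}
}
```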
Benefits of Continuous Profiling
- Code-Level Visibility: Complements traditional monitoring (metrics, logs, traces) by providing granular, line-level insights into how application code consumes resources like CPU and memory.
- Faster Root Cause Analysis: Reduces Mean Time to Resolution (MTTR) by allowing teams to jump directly to the problematic code when incidents occur, without needing to reproduce issues locally.
- Proactive Performance Optimization: Identifies inefficient code paths and performance regressions early, before they impact users or trigger alerts, enabling teams to optimize proactively.
- Bridging Development and Operations: Gives developers visibility into production behavior and provides SREs with actionable data to optimize system performance, fostering better collaboration.
- Cost Efficiency: Helps reduce cloud and infrastructure costs by revealing areas of high resource usage (e.g., inefficient algorithms, memory leaks), enabling teams to optimize compute usage.
- Improved System Reliability: Enhances incident response and overall stability by providing real-time, actionable insights into application performance in production.
Differences Between Traditional Profiling and Continuous Profiling
- When it runs: traditional profiling is triggered manually in test environments or during specific troubleshooting sessions; continuous profiling runs constantly in production.
- Overhead: continuous profiling relies on lightweight sampling to keep overhead low (typically 1-5% CPU), so it can stay enabled at all times.
- Scope: traditional profiling targets a single process or session; continuous profiling collects samples automatically across distributed systems.
- Data: traditional profiling produces one-off snapshots; continuous profiling stores data in a time-series database for historical analysis and trend detection.
Requirements for Continuous Profiling
Supported Languages
Continuous profiling tools support a wide range of programming languages, with different approaches for different language types:
Natively Compiled Languages
- Go: Excellent support with built-in pprof endpoints (see the sketch below)
- C/C++: Supported via eBPF profiling (requires frame pointers)
- Rust: Supported via eBPF profiling (requires frame pointers)
Runtime-Based Languages
- Java: Comprehensive support via JFR (Java Flight Recorder) and language-specific agents
- Python: Supported via both SDK instrumentation and eBPF (with python_enabled=true)
- Ruby: SDK-based instrumentation available
- Node.js: SDK and auto-instrumentation support
- .NET: Language-specific SDK support
- Scala, Clojure, Kotlin: Supported via JVM-based profiling
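For Go specifically, the built-in pprof endpoints mentioned above come from the standard library; a minimal sketch looks like the following (port 6060 is just a common convention, and any profiler or `go tool pprof` can scrape these endpoints):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// Expose profiling endpoints on a local port; CPU, heap, and goroutine
	// profiles can then be pulled from /debug/pprof/.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// Application logic would run here; block forever for this sketch.
	select {}
}
```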
Instrumentation Methods
Auto-Instrumentation
Pros:
- No code changes required
- Works with existing applications
- Easy to deploy and manage
- Can profile multiple applications simultaneously
Cons:
- Less granular control
- May not capture application-specific context
- Limited to certain profile types (mainly CPU)
Implementation Options:
- eBPF-based: Kernel-level profiling with minimal overhead
- Agent-based: Language-specific agents that attach to running processes
SDK-Based Instrumentation
Pros:
- Fine-grained control over profiling
- Can add custom tags and context
- Supports multiple profile types (CPU, memory, allocations)
- Better integration with application logic
Cons:
- Requires code modifications
- Need to redeploy applications
- More complex setup and maintenance
Implementation:
- Language-specific SDKs provided by profiling platforms (a Go sketch follows this list)
- Direct integration with application code
- Configurable sampling rates and profile types
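To show what SDK-based instrumentation looks like in practice, here is a minimal sketch using the Pyroscope Go client. The package path and Config fields follow the pyroscope-io client and may differ between versions, and the application name, tags, and server address are placeholders:

```go
package main

import (
	"log"

	"github.com/pyroscope-io/client/pyroscope"
)

func main() {
	// Start continuous profiling and stream samples to the Pyroscope server.
	// ApplicationName and Tags are placeholders; ServerAddress should point at
	// the Pyroscope service deployed later in this post.
	_, err := pyroscope.Start(pyroscope.Config{
		ApplicationName: "checkout-service",
		ServerAddress:   "http://pyroscope:4040",
		Tags:            map[string]string{"env": "production"},
		ProfileTypes: []pyroscope.ProfileType{
			pyroscope.ProfileCPU,
			pyroscope.ProfileAllocSpace,
			pyroscope.ProfileInuseSpace,
		},
	})
	if err != nil {
		log.Fatalf("failed to start profiler: %v", err)
	}

	// Application logic runs here while profiles are collected in the background.
	select {}
}
```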
Deploying Continuous Profiling
Prerequisites
- Kubernetes cluster (v1.32 or later)
- Linux kernel 4.9+ (for eBPF support)
- Helm 3+
- kubectl access
Grafana Installation:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install <release-name> grafana/grafana -n <namespace>
Pyroscope Installation (main tool):
helm repo add pyroscope-io https://pyroscope-io.github.io/helm-chart
helm install <release-name> pyroscope-io/pyroscope -n <namespace>
Installing Pyroscope-ebpf:
Pyroscope-ebpf is an auto-instrumentation agent that runs as a DaemonSet and performs kernel-level profiling, collecting CPU profiles and other performance data by leveraging the Linux kernel’s eBPF capabilities.
Steps:
- Before installing, download the pyroscope-ebpf values.yaml (https://github.com/pyroscope-io/helm-chart/blob/main/chart/pyroscope-ebpf/values.yaml).
- In values.yaml, set the endpoint under args to “http://<Pyroscope-release-name>.<namespace>:4040”. The <Pyroscope-release-name> and <namespace> must match the main Pyroscope installation above.
helm repo add pyroscope-io https://pyroscope-io.github.io/helm-chart
If you have already used the above command, you can skip it.
helm install <release-name> pyroscope-io/pyroscope-ebpf -f values.yaml -n <namespace>
Grafana UI:
- Open Grafana in a web browser and log in as admin (retrieve the admin password from the Grafana Kubernetes Secret and decode it with base64).
- Go to Data sources, add a new data source, and choose Pyroscope.
- Set the connection URL to “http://<Pyroscope-svc-name>.<namespace>:4040” and click Save & Test.
- At the top right of the data source page, click “Explore data”.
- Build or edit the query (labeled A) in the query editor. If no data appears, adjust the absolute time range at the top right to the last 30 minutes.
Reading and Understanding Profiles
- Each rectangle represents a function in the call stack.
- Width = resource usage (e.g., time or CPU); the wider the block, the more time is spent in that function and its descendants.
- Vertical stacking shows the call hierarchy: top boxes are callers of boxes below; deeper levels represent deeper call stacks.
- The X-axis is not chronological; blocks are arranged for readability, not time sequence.
- Spot hotspots quickly: the widest blocks are your busiest functions and the natural starting points for optimization.
A flame graph helps you see where your application spends time or resources. Each block is a function call:
- Width → how much time/resources it used (wider = heavier).
- Stacking → shows the call hierarchy (in Grafana’s flame graph view, callers sit directly above the functions they call).
- Self Time → time spent in the function itself, excluding children.
- Total Time → time spent in the function plus everything it called.
- Samples → the number of times the profiler observed this function running, which translates into time spent.
To spot bottlenecks (a small example follows this list):
- Look for the widest blocks → these are your hotspots.
- If Self is high, the function itself is slow.
- If Total is high but Self is low, its children are slow.
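As a small illustration with hypothetical function names, the program below would produce a flame graph in which handleRequest shows high Total time but low Self time (it merely delegates), while hashPasswords shows high Self time because the CPU-heavy loops live inside it:

```go
package main

import "fmt"

// hashPasswords does the actual CPU-heavy work, so its Self time is high:
// profiler samples land inside this function's own loops.
func hashPasswords(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		for j := 0; j < 10000; j++ {
			sum += i * j
		}
	}
	return sum
}

// handleRequest delegates almost all of its work to hashPasswords, so it is
// on the stack for every sample (high Total) while little time is attributed
// to its own code (low Self).
func handleRequest() int {
	return hashPasswords(5000)
}

func main() {
	fmt.Println(handleRequest())
}
```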
Conclusion
Continuous profiling is the next step in observability, the fourth pillar that fills the blind spot left by metrics, logs, and traces by revealing code-level behavior in production.
Drop a query if you have any questions regarding Continuous profiling and we will get back to you quickly.
About CloudThat
CloudThat is an award-winning company and the first in India to offer cloud training and consulting services worldwide. As a Microsoft Solutions Partner, AWS Advanced Tier Training Partner, and Google Cloud Platform Partner, CloudThat has empowered over 850,000 professionals through 600+ cloud certifications, winning global recognition for its training excellence, including 20 MCT Trainers in Microsoft’s Global Top 100 and an impressive 12 awards in the last 8 years. CloudThat specializes in Cloud Migration, Data Platforms, DevOps, IoT, and cutting-edge technologies like Gen AI & AI/ML. It has delivered over 500 consulting projects for 250+ organizations in 30+ countries as it continues to empower professionals and enterprises to thrive in the digital-first world.
FAQs
1. Does continuous profiling slow down applications?
ANS: – No, modern profilers (like eBPF-based ones) add very low overhead, typically 1–5% CPU. They use sampling instead of tracing every call, so the performance impact in production is negligible.
2. How can I quickly start with Pyroscope on Kubernetes?
ANS: – Install Pyroscope with Helm, then deploy the pyroscope-ebpf DaemonSet pointing to the Pyroscope service. Finally, add Pyroscope as a data source in Grafana to explore flame graphs and identify hotspots.

WRITTEN BY Nallagondla Nikhil
Nallagondla Nikhil works as a Research Associate at CloudThat. He is passionate about continuously expanding his skill set and knowledge by actively seeking opportunities to learn new skills. Nikhil regularly explores blogs and articles on various technologies and industry trends to stay up to date with the latest developments in the field.