Overview
Linux performance monitoring involves measuring and analyzing system metrics to identify bottlenecks, optimize resource utilization, and ensure system stability. This cheat sheet covers:
- CPU performance: utilization, context switches, run queues
- Memory performance: usage, swapping, page faults
- Disk I/O: throughput, latency, utilization
- Network performance: connections, throughput, errors
- System-wide metrics: load average, process accounting
- Advanced tracing: eBPF, bpftrace, perf, BCC tools for deep dive analysis
Key performance concepts:
- Utilization: % of time resource is busy
- Saturation: amount of queued work (backlog)
- Errors: count of error events
- Capacity: maximum throughput of resource
Quick Start
Essential One-Liners
# System overview (CPU, memory, load)
top -b -n 1 | head -20
# Real-time CPU and memory usage
htop
# Disk I/O statistics (all devices)
iostat -xz 1
# Network connections and listening ports
ss -tuln
# Memory usage summary
free -h
# System load and logged-in user count
uptime
# CPU performance counters (basic)
perf stat -a sleep 1
# Summarize system calls made by one process (Ctrl-C prints the table)
strace -c -p <PID>
CPU Performance
| Command | Description | Example |
|---|---|---|
| top | Interactive process viewer, CPU usage | top -u <user> |
| htop | Enhanced top with colors, tree view | htop -p <PID> |
| mpstat | Per-CPU statistics | mpstat -P ALL 1 |
| vmstat | Virtual memory stats (includes CPU) | vmstat 1 |
| sar | System activity reporter (historical) | sar -u 1 5 |
| dstat | Versatile resource statistics | dstat -c -d -n 1 |
| nmon | Interactive system monitor | nmon |
| uptime | Load average and uptime | uptime |
| cat /proc/loadavg | Raw load average data | cat /proc/loadavg |
CPU Metrics Explained
- %us: User space CPU time
- %sy: System (kernel) CPU time
- %ni: User-space CPU time spent on niced (priority-adjusted) processes
- %id: Idle CPU time
- %wa: I/O wait (CPU idle waiting for I/O)
- %hi: Hardware interrupts
- %si: Software interrupts
- %st: Steal time (virtualized environments)
- Load Average: Running and runnable tasks plus tasks in uninterruptible sleep (usually waiting on I/O), averaged over 1, 5, and 15 minutes
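The percentages above are derived from cumulative tick counters in /proc/stat. The following is a minimal sketch, assuming the standard field order on the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal); the function name cpu_breakdown is made up for illustration.

```shell
# cpu_breakdown: percentage of CPU time per state between two samples of the
# aggregate "cpu" line in /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal
cpu_breakdown() {
    # $1 = earlier "cpu ..." line, $2 = later "cpu ..." line
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i }
        NR == 2 {
            # Guest time is already included in user/nice, so only fields 2-9 are summed
            for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
            if (total <= 0) { print "no CPU time elapsed between samples"; exit 1 }
            printf "us=%.1f ni=%.1f sy=%.1f id=%.1f ",
                100 * d[2] / total, 100 * d[3] / total,
                100 * d[4] / total, 100 * d[5] / total
            printf "wa=%.1f hi=%.1f si=%.1f st=%.1f\n",
                100 * d[6] / total, 100 * d[7] / total,
                100 * d[8] / total, 100 * d[9] / total
        }'
}

# Example on a live system (one-second interval):
#   a=$(grep "^cpu " /proc/stat); sleep 1; b=$(grep "^cpu " /proc/stat)
#   cpu_breakdown "$a" "$b"
```

Tools such as top and mpstat do the same subtraction between refreshes, which is why their first sample often reflects averages since boot.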
Advanced CPU Analysis with perf
# Sample on-CPU stacks at 99 Hz for 5 seconds
perf record -F 99 -g -a sleep 5
# Show annotated disassembly
perf annotate
# Generate flame graph (requires FlameGraph scripts)
perf record -F 99 -g -a sleep 5
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > flamegraph.svg
# Top functions by CPU cycles
perf top
# Count specific events
perf stat -e cycles,instructions,cache-misses -a sleep 1
# Dump scheduler state (location depends on kernel version)
cat /proc/sched_debug                # kernels before 5.13
cat /sys/kernel/debug/sched/debug    # 5.13 and later; needs debugfs and root
Memory Performance
| Command | Description | Example |
|---|---|---|
| free | Memory usage summary | free -h |
| vmstat | Includes swap and memory stats | vmstat 1 |
| smem | Proportional set size (PSS) | smem -t -p |
| ps | Process memory usage | ps aux --sort=-%mem |
| top | Interactive memory view | top (press M) |
| htop | Memory columns, tree view | htop (F5 for tree) |
| sar | Historical memory stats | sar -r 1 5 |
| numastat | NUMA memory allocation | numastat |
Memory Metrics
- total: Total usable RAM
- used: Memory in use by applications and the kernel (reclaimable buffers and cache are not counted)
- free: Unused memory
- shared: Shared memory, mostly tmpfs
- buff/cache: Buffer and cache memory
- available: Memory available for new apps (without swapping)
- Swap: Virtual memory on disk
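- PSS: Proportional set size (private memory plus each shared page divided by the number of processes sharing it)
- RSS: Resident set size (physical memory mapped by the process, with shared pages counted in full)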
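To see where these numbers come from, the sketch below derives a free-style summary from /proc/meminfo. It assumes buff/cache is the sum of Buffers, Cached, and SReclaimable, which matches how recent procps versions report it; mem_summary is an illustrative name, not an existing tool.

```shell
# mem_summary: read /proc/meminfo-style text on stdin and print a short summary.
# Values in /proc/meminfo are in kB; output is in MiB.
mem_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "Buffers:"      { buffers = $2 }
        $1 == "Cached:"       { cached = $2 }
        $1 == "SReclaimable:" { srecl = $2 }
        END {
            if (total == 0) { print "MemTotal not found"; exit 1 }
            cache = buffers + cached + srecl
            printf "total=%dMiB free=%dMiB buff/cache=%dMiB available=%dMiB (%.1f%% of total)\n",
                total / 1024, free / 1024, cache / 1024, avail / 1024, 100 * avail / total
        }'
}

# Example on a live system:
#   mem_summary < /proc/meminfo
```

The available figure, not free, is the one to watch: a system with little free memory but plenty available is simply using spare RAM as cache.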
Advanced Memory Analysis
# Memory leak detection with valgrind
valgrind --leak-check=full ./program
# Page fault statistics (fault/s and majflt/s columns)
sar -B 1 10
# NUMA balancing: current setting (0 = off) and kernel counters
cat /proc/sys/kernel/numa_balancing
grep numa_ /proc/vmstat
# Slab allocator usage (needs root)
cat /proc/slabinfo
# Free blocks per order; few high-order blocks suggests fragmentation
cat /proc/buddyinfo
Disk I/O
| Command | Description | Example |
|---|---|---|
| iostat | Device I/O statistics | iostat -xz 1 |
| iotop | Process-level I/O | iotop |
| df | Disk space usage | df -h |
| du | Directory disk usage | du -sh /* |
| lsblk | Block device tree | lsblk |
| blkid | Block device attributes | blkid |
| fio | Flexible I/O tester | fio --name=test --rw=randread --bs=4k --ioengine=libaio --iodepth=32 --size=1G --numjobs=4 --runtime=60 --group_reporting |
I/O Metrics
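- %util: Percentage of time the device had at least one request in flight (0-100%); devices that serve requests in parallel (SSDs, RAID arrays) can still have headroom at 100%
- await: Average time per request, including time spent queued (ms); newer sysstat versions report r_await and w_await separately
- svctm: Service time (ms), unreliable and removed from recent sysstat versions
- r/s, w/s: Read/write operations per second
- rkB/s, wkB/s: Read/write throughput (KB/s)
- avgqu-sz: Average queue length (named aqu-sz in newer sysstat versions)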
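These metrics can be reproduced from two samples of a device's line in /proc/diskstats. The sketch below assumes the documented column layout (reads completed, ms reading, writes completed, ms writing, and ms spent doing I/O in columns 4, 7, 8, 11, and 13); disk_delta is a hypothetical helper name.

```shell
# disk_delta: derive IOPS, await, and %util from two /proc/diskstats lines.
# $1 = earlier line, $2 = later line, $3 = seconds between the samples
disk_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk -v secs="$3" '
        NR == 1 { r = $4; rms = $7; w = $8; wms = $11; busy = $13 }
        NR == 2 {
            ios = ($4 - r) + ($8 - w)            # completed reads + writes
            ms  = ($7 - rms) + ($11 - wms)       # total time requests spent waiting
            await = (ios > 0) ? ms / ios : 0     # average ms per request
            util  = 100 * ($13 - busy) / (secs * 1000)
            printf "%s iops=%.1f await=%.2fms util=%.1f%%\n", $3, ios / secs, await, util
        }'
}

# Example on a live system (device name sda is an assumption):
#   a=$(grep " sda " /proc/diskstats); sleep 5; b=$(grep " sda " /proc/diskstats)
#   disk_delta "$a" "$b" 5
```

iostat performs the same calculation per interval, so this is mainly useful on minimal systems where sysstat is not installed.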
Advanced Disk I/O Analysis
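# Block layer tracing with bpftrace (requests issued, by process)
bpftrace -e 'tracepoint:block:block_rq_issue { @[comm] = count(); }'
# I/O latency distribution (us), from request issue to completion
bpftrace -e 'tracepoint:block:block_rq_issue { @start[args->dev, args->sector] = nsecs; } tracepoint:block:block_rq_complete /@start[args->dev, args->sector]/ { @usecs = hist((nsecs - @start[args->dev, args->sector]) / 1000); delete(@start[args->dev, args->sector]); }'
# Per-request block I/O with process name and latency
sudo /usr/share/bcc/tools/biosnoop
# Disk I/O flame graph (-g records the stack that issued each request)
perf record -e block:block_rq_issue -g -a sleep 10
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > io-flamegraph.svg
# File system latency (ext4 operations slower than 1 ms)
sudo /usr/share/bcc/tools/ext4slower 1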
Network Performance
| Command | Description | Example |
|---|---|---|
| ss | Socket statistics (replaces netstat) | ss -tuln |
| netstat | Legacy network stats | netstat -tuln |
| ip | IP configuration and stats | ip -s link |
| ping | Connectivity and latency | ping -c 10 host |
| traceroute | Path and latency | traceroute host |
| mtr | Combined ping/traceroute | mtr host |
| nethogs | Per-process network usage | nethogs |
| iftop | Bandwidth usage by connection | iftop -nP |
| bmon | Bandwidth monitor | bmon |
| sar | Network interface stats | sar -n DEV 1 5 |
Network Metrics
- rx/tx: Received/transmitted packets/bytes
- errs/drop: Errors and drops
- fifo: FIFO buffer overrun
- coll: Collisions (Ethernet)
- rx/tx-queue: Network driver queue lengths
- TCP metrics: retransmits, timewait, established, listen
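The packet, error, and drop counters listed above are exposed per interface in /proc/net/dev. As a rough sketch, the function below turns them into drop percentages; it assumes the usual two header lines and column order of that file, and net_drop_pct is an illustrative name.

```shell
# net_drop_pct: read /proc/net/dev-style text on stdin and print, per interface,
# dropped packets as a percentage of packets counted, plus the total error count.
net_drop_pct() {
    awk '
        NR <= 2 { next }                  # skip the two header lines
        {
            sub(/:/, " ")                 # some kernels print "eth0:123" with no space
            # After the split: $3 rx packets, $4 rx errs, $5 rx drop,
            #                  $11 tx packets, $12 tx errs, $13 tx drop
            rx = ($3 > 0) ? 100 * $5 / $3 : 0
            tx = ($11 > 0) ? 100 * $13 / $11 : 0
            printf "%s rx_drop=%.2f%% tx_drop=%.2f%% errs=%d\n", $1, rx, tx, $4 + $12
        }'
}

# Example on a live system:
#   net_drop_pct < /proc/net/dev
```

The counters are cumulative since the interface came up, so for a current rate take two readings and compare them.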
Advanced Network Analysis
# TCP connection details (RTT, cwnd, retransmits) for established sockets
ss -t -i state established
# Network packet capture (basic)
tcpdump -i eth0 -c 100 -w capture.pcap
# Packet capture with filters
tcpdump -i eth0 'tcp port 80'
# Network latency with hping3
hping3 -S -p 80 -c 10 host
# TCP retransmission analysis (the retrans field is shown by -i, not -e)
ss -t -i state established | grep retrans
# BPF-based packet tracing: received packets by on-CPU process
bpftrace -e 'tracepoint:net:netif_receive_skb { @[comm] = count(); }'
# Socket buffer limit events by process
bpftrace -e 'tracepoint:sock:sock_exceed_buf_limit { @[comm] = count(); }'
System-Wide Metrics
| Command | Description | Example |
|---|---|---|
| uptime | Load average and uptime | uptime |
| w | Who is logged in and load | w |
| who | Logged in users | who |
| last | Login history | last -a |
| dmesg | Kernel ring buffer | dmesg \| tail -50 |
| journalctl | Systemd journal logs | journalctl -u nginx --since "1 hour ago" |
| sar | Historical system data | sar -q 1 10 |
| collectl | Comprehensive system monitor | collectl -scdns |
Load Average Interpretation
- 1-minute: Current load
- 5-minute: Medium-term load
- 15-minute: Long-term load
Rule of thumb: Load average should not exceed number of CPU cores for sustained periods.
# Check CPU count
nproc
# Show the 1-, 5-, and 15-minute load averages to compare against it
awk '{print $1, $2, $3}' /proc/loadavg
# Historical load from sar
sar -q 1 10
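The rule of thumb can be turned into a quick check by dividing the 1-minute load average by the CPU count. This is a minimal sketch; load_per_core is a made-up helper, and a ratio above 1 is only a hint, since Linux load also counts tasks blocked on I/O.

```shell
# load_per_core: divide the 1-minute load average by the number of CPUs.
# $1 = contents of /proc/loadavg, $2 = CPU count (for example from nproc)
load_per_core() {
    printf '%s\n' "$1" | awk -v cpus="$2" '{
        if (cpus <= 0) { print "invalid CPU count"; exit 1 }
        ratio = $1 / cpus
        verdict = (ratio > 1) ? "above" : "at or below"
        printf "load1=%.2f cpus=%d per_core=%.2f (%s core count)\n", $1, cpus, ratio, verdict
    }'
}

# Example on a live system:
#   load_per_core "$(cat /proc/loadavg)" "$(nproc)"
```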
Process-Level Monitoring
| Command | Description | Example |
|---|---|---|
| ps | Process status | ps aux --sort=-%cpu |
| top | Interactive process viewer | top -p <PID> |
| htop | Enhanced process viewer | htop |
| pgrep | Find process by name | pgrep -f nginx |
| pkill | Kill process by name | pkill -f nginx |
| kill | Send signal to process | kill -9 <PID> |
| strace | Trace system calls | strace -p <PID> |
| ltrace | Trace library calls | ltrace -p <PID> |
| lsof | List open files | lsof -p <PID> |
| nice | Start process with priority | nice -n 19 command |
| renice | Change process priority | renice -n 10 -p <PID> |
| cgroups | Control resource usage | systemd-cgls |
Advanced Process Tracing with bpftrace
# Trace process execve (new processes)
bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("%s %s\n", comm, str(args->filename)); }'
# Count process creations by executable
bpftrace -e 'tracepoint:syscalls:sys_enter_execve { @[str(args->filename)] = count(); }'
# Trace process exits
bpftrace -e 'tracepoint:sched:sched_process_exit { @[comm] = count(); }'
# On-CPU time distribution per scheduling interval (using BCC)
sudo /usr/share/bcc/tools/cpudist
# Process I/O
sudo /usr/share/bcc/tools/biosnoop
# Process memory allocation (outstanding allocations; add -p <PID> for one process)
sudo /usr/share/bcc/tools/memleak
What is eBPF?
eBPF (extended Berkeley Packet Filter) is a Linux kernel technology that runs verified, sandboxed programs inside the kernel, which allows safe, efficient, and programmable tracing without kernel modules. It’s used for:
- Performance profiling: CPU, memory, I/O, network
- Security monitoring: Syscall filtering, file access
- Networking: XDP, TC, socket filters
- Observability: Dynamic tracing with low overhead
Key eBPF concepts:
- CO-RE (Compile Once – Run Everywhere): Single BPF bytecode runs on multiple kernel versions
- BPF tokens: Fine-grained permissions for BPF operations
- BPF arena: New memory model for BPF programs
- Tracepoints: Stable kernel instrumentation points
- Kprobes/Uprobes: Dynamic function entry/exit tracing
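BCC (BPF Compiler Collection) provides ready-to-use eBPF tools. Install: apt install bpfcc-tools or yum install bcc-tools. On Debian and Ubuntu the tools are installed with a -bpfcc suffix (for example, execsnoop-bpfcc); the bcc-tools package on RHEL-family systems places them in /usr/share/bcc/tools.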
| Tool | Purpose | Example |
|---|---|---|
| execsnoop | Trace process execution | execsnoop |
| opensnoop | Trace file opens | opensnoop -n <process> |
| ext4slower | Slow ext4 operations | ext4slower 1 |
| biolatency | Block I/O latency | biolatency |
| biosnoop | Block I/O requests | biosnoop |
| cachestat | Cache hit/miss stats | cachestat 1 |
| tcpconnect | TCP connection attempts | tcpconnect |
| tcpretrans | TCP retransmissions | tcpretrans |
| funccount | Count function calls | funccount -p <PID> <function> |
| profile | CPU profiling (sampling) | profile -F 99 10 |
| offcputime | Off-CPU time analysis | offcputime -p <PID> |
| runqlat | Run queue latency | runqlat |
| wakeuptime | Process wakeup latency | wakeuptime |
| bpflist | Processes with active BPF programs and maps | bpflist |
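Because the install location of these tools differs between distributions, a small lookup helper can make scripts portable. The following is only a sketch: find_bcc_tool and the BCC_TOOLS_DIR override are hypothetical names introduced here, and the search order is an assumption.

```shell
# find_bcc_tool: print the path of a BCC tool, trying common install layouts.
# BCC_TOOLS_DIR is a hypothetical override; it defaults to /usr/share/bcc/tools.
find_bcc_tool() {
    name="$1"
    dir="${BCC_TOOLS_DIR:-/usr/share/bcc/tools}"
    if [ -x "$dir/$name" ]; then
        printf '%s\n' "$dir/$name"              # tools directory layout
    elif command -v "$name-bpfcc" >/dev/null 2>&1; then
        command -v "$name-bpfcc"                # Debian/Ubuntu naming
    elif command -v "$name" >/dev/null 2>&1; then
        command -v "$name"                      # already on PATH under its plain name
    else
        printf 'BCC tool not found: %s\n' "$name" >&2
        return 1
    fi
}

# Example:
#   sudo "$(find_bcc_tool biolatency)"
```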
bpftrace One-Liners
bpftrace is a high-level tracing language for eBPF. Install: apt install bpftrace
# System call rate every second
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @ = count(); } interval:s:1 { print(@); clear(@); }'
# Top system calls by count
bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); } interval:s:5 { print(@); clear(@); }'
# Trace file opens by specific process
bpftrace -e 'tracepoint:syscalls:sys_enter_openat /comm == "nginx"/ { printf("%s %s\n", comm, str(args->filename)); }'
# Bytes read and written per process (successful calls only)
bpftrace -e 'tracepoint:syscalls:sys_exit_read /args->ret > 0/ { @read[comm] = sum(args->ret); } tracepoint:syscalls:sys_exit_write /args->ret > 0/ { @write[comm] = sum(args->ret); } interval:s:10 { print(@read); print(@write); clear(@read); clear(@write); }'
# Off-CPU time per process (us), from switch-out to the next switch-in
bpftrace -e 'tracepoint:sched:sched_switch { if (args->prev_pid != 0) { @off[args->prev_pid] = nsecs; } $t = @off[args->next_pid]; if ($t != 0) { @usecs[args->next_comm] = hist((nsecs - $t) / 1000); delete(@off[args->next_pid]); } }'
# Network packet drops
bpftrace -e 'tracepoint:skb:kfree_skb { @[comm] = count(); }'
# CPU migration events
bpftrace -e 'tracepoint:sched:sched_migrate_task { printf("%s migrated from %d to %d\n", args->comm, args->orig_cpu, args->dest_cpu); }'
# Kernel function latency (vfs_read, in ns) using entry and return probes
bpftrace -e 'kprobe:vfs_read { @start[tid] = nsecs; } kretprobe:vfs_read /@start[tid]/ { @latency = hist(nsecs - @start[tid]); delete(@start[tid]); }'
# Kernel memory allocations (kmalloc) by process
bpftrace -e 'tracepoint:kmem:kmalloc { @[comm] = count(); } interval:s:5 { print(@); clear(@); }'
Advanced bpftrace Scripts
Save these as .bt files and run with bpftrace script.bt
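#!/usr/bin/env bpftrace
// offcpu.bt - Off-CPU time (ns) summed by kernel stack
// On some kernels the probed symbol is named finish_task_switch.isra.0
#include <linux/sched.h>
kprobe:finish_task_switch
{
  // arg0 is the task that just left the CPU: record when it went off-CPU
  $prev = (struct task_struct *)arg0;
  @start[$prev->pid] = nsecs;
  // The current task is back on-CPU: charge its blocked time to its stack
  $last = @start[tid];
  if ($last != 0) {
    @[kstack] = sum(nsecs - $last);
    delete(@start[tid]);
  }
}
END
{
  // Drop bookkeeping so only the stack totals are printed on exit
  clear(@start);
}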
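#!/usr/bin/env bpftrace
// iolatency.bt - Block I/O latency (us) by device number
tracepoint:block:block_rq_issue
{
  // Key on device and sector so concurrent requests do not overwrite each other
  @start[args->dev, args->sector] = nsecs;
}
tracepoint:block:block_rq_complete
/@start[args->dev, args->sector]/
{
  @latency_us[args->dev] = hist((nsecs - @start[args->dev, args->sector]) / 1000);
  delete(@start[args->dev, args->sector]);
}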
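#!/usr/bin/env bpftrace
// tcplife.bt - TCP connection lifetime (ms), keyed by socket address
// protocol 6 = IPPROTO_TCP, newstate 1 = TCP_ESTABLISHED, newstate 7 = TCP_CLOSE
// comm is whatever process was on-CPU at close, which may not own the socket
tracepoint:sock:inet_sock_set_state
/args->protocol == 6 && args->newstate == 1/
{
  @start[args->skaddr] = nsecs;
}
tracepoint:sock:inet_sock_set_state
/args->protocol == 6 && args->newstate == 7 && @start[args->skaddr]/
{
  @lifetime_ms[comm] = hist((nsecs - @start[args->skaddr]) / 1000000);
  delete(@start[args->skaddr]);
}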
Flame Graphs
Flame graphs visualize profiled software, showing which code paths are hottest (most frequently on-CPU).
Generating CPU Flame Graphs
# Using perf (Linux)
perf record -F 99 -g -a sleep 10
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > cpu-flamegraph.svg
# Using bpftrace (simpler); the map is printed when the program exits
bpftrace -e 'profile:hz:99 { @[ustack] = count(); } interval:s:10 { exit(); }' | ./stackcollapse-bpftrace.pl | ./flamegraph.pl > bpftrace-flamegraph.svg
# Using BCC (-f emits folded stacks, so no stackcollapse step is needed)
/usr/share/bcc/tools/profile -F 99 -f 10 | ./flamegraph.pl > bcc-flamegraph.svg
Generating Off-CPU Flame Graphs
Off-CPU time shows where threads block (I/O, locks, etc.).
# Using bpftrace offcpu.bt script
bpftrace offcpu.bt | ./stackcollapse-bpftrace.pl | ./flamegraph.pl > offcpu-flamegraph.svg
# Using BCC offcputime (-f emits folded stacks, so no stackcollapse step is needed)
/usr/share/bcc/tools/offcputime -p <PID> -f 10 | ./flamegraph.pl > offcpu-bcc.svg
Flame Graph Resources
- Download scripts: https://github.com/brendangregg/FlameGraph
- Interactive zoom: Click any box to zoom in
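- Colors: By default, warm hues are chosen at random to tell adjacent frames apart and carry no meaning
- Width: Proportional to the number of samples in which the function was on the stack (itself plus its callees)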
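The stackcollapse scripts emit one line per unique stack in "folded" form: frames joined by semicolons, followed by a sample count. As an illustration of what frame width represents, the sketch below computes the share of samples in which a given function appears; frame_share is a hypothetical helper, not part of the FlameGraph repository.

```shell
# frame_share: read folded stacks ("main;foo;bar 42") on stdin and print the
# percentage of samples in which function $1 appears anywhere on the stack.
frame_share() {
    awk -v fn="$1" '
        {
            count = $NF                       # the sample count is the last field
            stack = $0
            sub(/ [0-9]+$/, "", stack)        # strip the count, keep the frames
            total += count
            n = split(stack, frames, ";")
            for (i = 1; i <= n; i++)
                if (frames[i] == fn) { hits += count; break }   # count each stack once
        }
        END {
            if (total > 0) printf "%s: %.1f%% of %d samples\n", fn, 100 * hits / total, total
        }'
}

# Example:
#   perf script | ./stackcollapse-perf.pl | frame_share tcp_sendmsg
```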
CPU Bottleneck Diagnosis
- Check overall CPU utilization: top or htop
- Identify high CPU processes: ps aux --sort=-%cpu
- Profile with perf top or perf record + flame graph
- Analyze kernel vs user time: vmstat 1 (us vs sy)
- Check run queue: uptime (load average vs CPU count) and vmstat 1 (r column)
- Investigate interrupts: mpstat -P ALL 1 (%irq, %soft) or vmstat 1 (in column)
Memory Bottleneck Diagnosis
- Check memory usage: free -h
- Identify memory-hog processes: ps aux --sort=-%mem
- Check for swapping: vmstat 1 (si/so columns)
- Analyze page faults: sar -B 1 (majflt/s column)
- NUMA issues: numastat
- Slab usage: cat /proc/slabinfo
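Swap and major-fault activity can also be read directly from the cumulative counters in /proc/vmstat. The sketch below compares two saved snapshots; vm_delta is an illustrative name, and the counter names (pswpin, pswpout, pgfault, pgmajfault) are assumed to be present, as they are on mainstream kernels.

```shell
# vm_delta: per-second rates of swap and page-fault counters between two
# saved copies of /proc/vmstat.
# $1 = earlier snapshot file, $2 = later snapshot file, $3 = seconds between them
vm_delta() {
    awk -v secs="$3" '
        BEGIN { n = split("pswpin pswpout pgfault pgmajfault", keys, " ") }
        NR == FNR { before[$1] = $2; next }    # first file: earlier snapshot
        { after[$1] = $2 }                     # second file: later snapshot
        END {
            for (i = 1; i <= n; i++) {
                k = keys[i]
                printf "%s/s=%.1f\n", k, (after[k] - before[k]) / secs
            }
        }' "$1" "$2"
}

# Example on a live system:
#   cat /proc/vmstat > /tmp/vm.1; sleep 10; cat /proc/vmstat > /tmp/vm.2
#   vm_delta /tmp/vm.1 /tmp/vm.2 10
```

Sustained non-zero pswpout together with rising pgmajfault is a stronger sign of memory pressure than low free memory alone.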
Disk I/O Bottleneck Diagnosis
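- Check device utilization: iostat -xz 1 (%util above 80% suggests saturation on single-queue devices; SSDs and arrays that serve requests in parallel may still have headroom)
- Identify slow devices: high await, avgqu-sz (aqu-sz in newer sysstat versions)
- Process-level I/O: iotop
- Trace I/O with BCC tools: biolatency, biosnoop
- File system level: ext4slower, xfsslower
- Generate I/O flame graphs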
Network Bottleneck Diagnosis
- Check interface stats: ip -s link (errors, drops)
- Connection count: ss -s
- Listening ports: ss -tuln
- Packet drops: netstat -s | grep -i drop
- TCP retransmits: ss -ti (retrans field)
- Trace with tcpconnect, tcpretrans
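A system-wide retransmit ratio can be computed from the Tcp lines of /proc/net/snmp. This is a rough sketch that looks up the OutSegs and RetransSegs columns by name from the header line; tcp_retrans_pct is a hypothetical helper, and the counters are cumulative since boot.

```shell
# tcp_retrans_pct: read /proc/net/snmp-style text on stdin and print
# retransmitted segments as a percentage of segments sent.
tcp_retrans_pct() {
    awk '
        $1 == "Tcp:" && !seen {               # first Tcp: line holds the column names
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        $1 == "Tcp:" {                        # second Tcp: line holds the values
            out = $(col["OutSegs"])
            re  = $(col["RetransSegs"])
            if (out > 0)
                printf "OutSegs=%s RetransSegs=%s retrans=%.2f%%\n", out, re, 100 * re / out
        }'
}

# Example on a live system:
#   tcp_retrans_pct < /proc/net/snmp
```

For a current rate rather than a since-boot average, take two readings and compare the differences.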
Application Slowdown Diagnosis
- Check system metrics (CPU, memory, I/O)
- Identify slow processes: top, ps
- Trace system calls: strace -p <PID>
- Profile CPU: perf record -p <PID>
- Analyze off-CPU time: offcputime -p <PID>
- Generate flame graphs for both on-CPU and off-CPU
Reference Tables
Metric Thresholds
| Metric | Warning | Critical |
|---|---|---|
| CPU Utilization | >80% sustained | >95% sustained |
| Load Average | >cores * 2 | >cores * 4 |
| Memory Usage | >90% | >95% + swapping |
| Disk %util | >70% | >90% |
| Disk await | >20ms | >50ms |
| Network drops | >0.1% | >1% |
| TCP retransmits | >1% | >5% |
| Task | Tool | Command |
|---|---|---|
| System overview | top | top |
| CPU profiling | perf | perf record -F 99 -g -a sleep 5 |
| Memory usage | free | free -h |
| Disk I/O | iostat | iostat -xz 1 |
| Network stats | ss | ss -tuln |
| Process tree | pstree | pstree -p |
| System calls | strace | strace -p <PID> |
| Advanced tracing | bpftrace | bpftrace -e '...' |
| BCC tools | Various | /usr/share/bcc/tools/<tool> |
| Flame graphs | perf + scripts | See Flame Graphs section |
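The warning and critical values in the Metric Thresholds table can be encoded as a small helper for scripts and alerts. This is a sketch only: the function names are made up, and the thresholds are the rule-of-thumb values from that table rather than universal limits.

```shell
# classify: compare a value against warning and critical thresholds.
# $1 = value, $2 = warning threshold, $3 = critical threshold
classify() {
    awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
        if (v + 0 > c + 0)      print "critical"
        else if (v + 0 > w + 0) print "warning"
        else                    print "ok"
    }'
}

# Wrappers using values from the Metric Thresholds table
classify_disk_util()  { classify "$1" 70 90; }      # $1 = %util
classify_disk_await() { classify "$1" 20 50; }      # $1 = await in ms
classify_load()       { classify "$1" "$(( $2 * 2 ))" "$(( $2 * 4 ))"; }   # $1 = load, $2 = cores

# Example on a live system:
#   classify_load "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```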
Additional Resources
This cheat sheet provides both essential daily monitoring commands and advanced tracing techniques for comprehensive Linux performance analysis.