# biolatency.bt
Attaching 3 probes...
Tracing block device I/O... Hit Ctrl-C to end.
^C
@usecs:
[256, 512) 2 | |
[512, 1K) 10 |@ |
[1K, 2K) 426 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2K, 4K) 230 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[4K, 8K) 9 |@ |
[8K, 16K) 128 |@@@@@@@@@@@@@@@ |
[16K, 32K) 68 |@@@@@@@@ |
[32K, 64K) 0 | |
[64K, 128K) 0 | |
[128K, 256K) 10 |@ |
eBPF Perf Tools 2019
Brendan Gregg
SCaLE
Mar 2019
LIVE DEMO
eBPF Minecraft Analysis
Enhanced BPF
also known as just "BPF"
Linux 4.*
[Diagram: User-Defined BPF Programs (SDN Configuration, DDoS Mitigation, Intrusion Detection, Container Security, Observability, Firewalls (bpfilter), Device Drivers, …) alongside a Kernel box containing the Runtime (BPF verifier, BPF, BPF actions) and the Event Targets (sockets, kprobes, uprobes, tracepoints, perf_events).]
eBPF bcc Linux 4.4+
https://github.com/iovisor/bcc
eBPF bpftrace (aka BPFtrace) Linux 4.9+
https://github.com/iovisor/bpftrace
# Files opened by process
bpftrace -e 't:syscalls:sys_enter_open { printf("%s %s\n", comm,
    str(args->filename)) }'
# Read size distribution by process
bpftrace -e 't:syscalls:sys_exit_read { @[comm] = hist(args->ret) }'
# Count VFS calls
bpftrace -e 'kprobe:vfs_* { @[func]++ }'
# Show vfs_read latency as a histogram
bpftrace -e 'k:vfs_read { @[tid] = nsecs }
    kr:vfs_read /@[tid]/ { @ns = hist(nsecs - @[tid]); delete(@[tid]) }'
# Trace user-level function
bpftrace -e 'uretprobe:/bin/bash:readline { printf("%s\n", str(retval)) }'
…
eBPF is solving new things: off-CPU + wakeup analysis
Raw BPF
samples/bpf/sock_example.c
87 lines truncated
C/BPF
samples/bpf/tracex1_kern.c
58 lines truncated
bcc/BPF (C & Python)
bcc examples/tracing/bitehist.py
entire program
bpftrace
https://github.com/iovisor/bpftrace
entire program
bpftrace -e 'kr:vfs_read { @ = hist(retval); }'
The Tracing Landscape, Mar 2019
(my opinion)
[Figure: tracers plotted by Scope & Capability against Ease of use, running from (brutal) to (less brutal), with Stage of Development marked from (alpha) to (mature). Tracers shown: sysdig, perf, ftrace (hist triggers), C/BPF, stap, bcc/BPF, ply/BPF, Raw BPF, LTTng, bpftrace (eBPF) (0.9). Legend: recent changes (many).]
e.g., identify multimodal disk I/O latency and outliers
with bcc/eBPF biolatency
# biolatency -mT 10
Tracing block device I/O... Hit Ctrl-C to end.
19:19:04
msecs : count distribution
0 -> 1 : 238 |********* |
2 -> 3 : 424 |***************** |
4 -> 7 : 834 |********************************* |
8 -> 15 : 506 |******************** |
16 -> 31 : 986 |****************************************|
32 -> 63 : 97 |*** |
64 -> 127 : 7 | |
128 -> 255 : 27 |* |
19:19:14
msecs : count distribution
0 -> 1 : 427 |******************* |
2 -> 3 : 424 |****************** |
[…]
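The row labels in the output above (0 -> 1, 2 -> 3, 4 -> 7, ...) are power-of-two buckets. As a rough illustration of how a latency value lands in one of those rows, here is a minimal Python sketch. The names `log2_bucket` and `histogram` are hypothetical, the sample latencies are made up, and this is not bcc's implementation, only the same bucketing idea.

```python
def log2_bucket(value):
    """Return the inclusive (low, high) power-of-two range that holds value."""
    if value < 0:
        raise ValueError("latency cannot be negative")
    if value < 2:
        return (0, 1)
    k = value.bit_length() - 1          # floor(log2(value))
    return (1 << k, (1 << (k + 1)) - 1)


def histogram(values):
    """Count values per bucket, ordered by bucket start."""
    counts = {}
    for v in values:
        bucket = log2_bucket(v)
        counts[bucket] = counts.get(bucket, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample_msecs = [0, 1, 3, 5, 6, 9, 20, 21, 130]   # made-up latencies
    for (low, high), count in histogram(sample_msecs).items():
        print("%8d -> %-8d : %d" % (low, high, count))
```

Two separate clusters of populated buckets, as in the 4 -> 7 and 16 -> 31 rows above, are what a multimodal latency distribution looks like in this format.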
bcc/eBPF programs can be laborious: biolatency
# define BPF program
bpf_text = """
#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>

typedef struct disk_key {
    char disk[DISK_NAME_LEN];
    u64 slot;
} disk_key_t;
BPF_HASH(start, struct request *);
STORAGE

// time block I/O
int trace_req_start(struct pt_regs *ctx, struct request *req)
{
    u64 ts = bpf_ktime_get_ns();
    start.update(&req, &ts);
    return 0;
}

// output
int trace_req_completion(struct pt_regs *ctx, struct request *req)
{
    u64 *tsp, delta;

    // fetch timestamp and calculate delta
    tsp = start.lookup(&req);
    if (tsp == 0) {
        return 0;   // missed issue
    }
    delta = bpf_ktime_get_ns() - *tsp;
    FACTOR

    // store as histogram
    STORE

    start.delete(&req);
    return 0;
}
"""

# code substitutions
if args.milliseconds:
    bpf_text = bpf_text.replace('FACTOR', 'delta /= 1000000;')
    label = "msecs"
else:
    bpf_text = bpf_text.replace('FACTOR', 'delta /= 1000;')
    label = "usecs"
if args.disks:
    bpf_text = bpf_text.replace('STORAGE',
        'BPF_HISTOGRAM(dist, disk_key_t);')
    bpf_text = bpf_text.replace('STORE',
        'disk_key_t key = {.slot = bpf_log2l(delta)}; ' +
        'void *__tmp = (void *)req->rq_disk->disk_name; ' +
        'bpf_probe_read(&key.disk, sizeof(key.disk), __tmp); ' +
        'dist.increment(key);')
else:
    bpf_text = bpf_text.replace('STORAGE', 'BPF_HISTOGRAM(dist);')
    bpf_text = bpf_text.replace('STORE',
        'dist.increment(bpf_log2l(delta));')
if debug or args.ebpf:
    print(bpf_text)
    if args.ebpf:
        exit()

# load BPF program
b = BPF(text=bpf_text)
if args.queued:
    b.attach_kprobe(event="blk_account_io_start", fn_name="trace_req_start")
else:
    b.attach_kprobe(event="blk_start_request", fn_name="trace_req_start")
    b.attach_kprobe(event="blk_mq_start_request", fn_name="trace_req_start")
b.attach_kprobe(event="blk_account_io_completion",
    fn_name="trace_req_completion")

print("Tracing block device I/O... Hit Ctrl-C to end.")

# output
exiting = 0 if args.interval else 1
dist = b.get_table("dist")
while (1):
    try:
        sleep(int(args.interval))
    except KeyboardInterrupt:
        exiting = 1

    print()
    if args.timestamp:
        print("%-8s\n" % strftime("%H:%M:%S"), end="")

    dist.print_log2_hist(label, "disk")
    dist.clear()

    countdown -= 1
    if exiting or countdown == 0:
        exit()
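Much of the length of the bcc version comes from building the C program as text: placeholders such as FACTOR, STORAGE and STORE are swapped for C fragments depending on the command-line options. The stand-alone sketch below isolates just that substitution step so it can be run without bcc. The `TEMPLATE` string is an abbreviated stand-in (not compilable C) and `build_program` is a hypothetical helper; only the replacement strings are taken from the slide.

```python
# Abbreviated stand-in for the C program text; not meant to compile.
TEMPLATE = """
STORAGE
int trace_req_completion(struct pt_regs *ctx, struct request *req)
{
    delta = bpf_ktime_get_ns() - *tsp;
    FACTOR
    STORE
    return 0;
}
"""


def build_program(milliseconds=False, per_disk=False):
    """Return (program_text, unit_label) with every placeholder filled in."""
    text = TEMPLATE
    if milliseconds:
        text = text.replace("FACTOR", "delta /= 1000000;")
        label = "msecs"
    else:
        text = text.replace("FACTOR", "delta /= 1000;")
        label = "usecs"
    if per_disk:
        text = text.replace("STORAGE", "BPF_HISTOGRAM(dist, disk_key_t);")
        text = text.replace(
            "STORE",
            "disk_key_t key = {.slot = bpf_log2l(delta)}; "
            "void *__tmp = (void *)req->rq_disk->disk_name; "
            "bpf_probe_read(&key.disk, sizeof(key.disk), __tmp); "
            "dist.increment(key);",
        )
    else:
        text = text.replace("STORAGE", "BPF_HISTOGRAM(dist);")
        text = text.replace("STORE", "dist.increment(bpf_log2l(delta));")
    return text, label


if __name__ == "__main__":
    program, unit = build_program(milliseconds=True)
    print(unit)
    print(program)
```

Printing the generated text before loading it mirrors what the `args.ebpf` branch of the tool does, and is a quick way to check that no placeholder was left unreplaced.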
… rewritten in bpftrace (launched Oct 2018)!
#!/usr/local/bin/bpftrace

BEGIN
{
    printf("Tracing block device I/O... Hit Ctrl-C to end.\n");
}

kprobe:blk_account_io_start
{
    @start[arg0] = nsecs;
}

kprobe:blk_account_io_completion
/@start[arg0]/
{
    @usecs = hist((nsecs - @start[arg0]) / 1000);
    delete(@start[arg0]);
}
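The script above pairs each I/O start with its completion through a map keyed on the request pointer (arg0). The following Python sketch replays that logic over a list of synthetic events, to make the role of the map and the filter explicit. The event tuples and the `latency_usecs` name are hypothetical; a dict membership test stands in for the `/@start[arg0]/` filter, which is an approximation since the filter checks for a non-zero value.

```python
def latency_usecs(events):
    """Pair start/done events by request id and return latencies in microseconds.

    events: iterable of (kind, request_id, timestamp_ns) tuples, where kind is
    "start" or "done". Completions with no recorded start are skipped.
    """
    start = {}        # plays the role of @start[arg0]
    latencies = []    # plays the role of the values fed to hist()
    for kind, req, ts in events:
        if kind == "start":
            start[req] = ts
        elif kind == "done" and req in start:     # stand-in for /@start[arg0]/
            latencies.append((ts - start[req]) // 1000)
            del start[req]                        # delete(@start[arg0])
    return latencies


if __name__ == "__main__":
    synthetic = [
        ("start", 1, 1_000),
        ("done", 2, 5_000),      # completion seen without a start: ignored
        ("start", 2, 2_000),
        ("done", 1, 501_000),
        ("done", 2, 4_000),
    ]
    print(latency_usecs(synthetic))
```

Deleting the entry on completion keeps the map from growing without bound, and the filter avoids computing a bogus latency for I/O that was already in flight when tracing began.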
… rewritten in bpftrace
# biolatency.bt
Attaching 3 probes...
Tracing block device I/O... Hit Ctrl-C to end.
^C
@usecs:
[256, 512) 2 | |
[512, 1K) 10 |@ |
[1K, 2K) 426 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2K, 4K) 230 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[4K, 8K) 9 |@ |
[8K, 16K) 128 |@@@@@@@@@@@@@@@ |
[16K, 32K) 68 |@@@@@@@@ |
[32K, 64K) 0 | |
[64K, 128K) 0 | |
[128K, 256K) 10 |@ |
bcc
canned complex tools, agents
bpftrace
one-liners, custom scripts
bcc
eBPF bcc Linux 4.4+
https://github.com/iovisor/bcc
bpftrace
eBPF bpftrace Linux 4.9+
https://github.com/iovisor/bpftrace
bpftrace Development
[Timeline figure. Milestones: Dec 2016; Oct 2018; v0.80, Jan 2019; v0.90, Mar? 2019; v1.0, ? 2019. Work items along the timeline: Major Features (v1), Minor Features (v1), Known Bug Fixes, Packaging, API Stability, Stable Docs, More Bug Fixes, ...]
bpftrace Syntax
bpftrace -e 'k:do_nanosleep /pid > 100/ { @[comm]++ }'
Probe
Filter
(optional)
Action
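To make the three parts of the one-liner concrete, here is a small Python sketch that splits a single-probe program into probe, optional filter, and action. It is a toy illustration only, not how bpftrace parses programs: the `split_one_liner` helper is hypothetical, and the regular expression assumes one probe, no "/" inside the probe name (so uprobe paths are out of scope), and no "/" inside the filter.

```python
import re

ONE_LINER = re.compile(
    r"^\s*(?P<probe>[^/{]+?)\s*"       # probe: text before the filter or action
    r"(?:/(?P<filter>[^/]*)/\s*)?"     # optional /filter/
    r"\{(?P<action>.*)\}\s*$",         # { action }
    re.DOTALL,
)


def split_one_liner(program):
    """Return (probe, filter_or_None, action) for a simple one-probe program."""
    match = ONE_LINER.match(program)
    if match is None:
        raise ValueError("not a simple probe /filter/ { action } program")
    filt = match.group("filter")
    return (
        match.group("probe"),
        filt.strip() if filt is not None else None,
        match.group("action").strip(),
    )


if __name__ == "__main__":
    print(split_one_liner("k:do_nanosleep /pid > 100/ { @[comm]++ }"))
    print(split_one_liner("kr:vfs_read { @ = hist(retval); }"))
```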
Probes
Probe Type Shortcuts
tracepoint t Kernel static tracepoints
usdt U User-level statically defined tracing
kprobe k Kernel function tracing
kretprobe kr Kernel function returns
uprobe u User-level function tracing
uretprobe ur User-level function returns
profile p Timed sampling across all CPUs
interval i Interval output
software s Kernel software events
hardware h Processor hardware events
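The shortcuts in this table are simple aliases for the long probe type names. As a reading aid for terse one-liners, the Python sketch below expands the short form back to the long one; the mapping is copied from the table above, while the `expand_probe` helper is hypothetical and not part of bpftrace.

```python
# Shortcut -> probe type, as listed in the table above.
SHORTCUTS = {
    "t": "tracepoint",
    "U": "usdt",
    "k": "kprobe",
    "kr": "kretprobe",
    "u": "uprobe",
    "ur": "uretprobe",
    "p": "profile",
    "i": "interval",
    "s": "software",
    "h": "hardware",
}


def expand_probe(probe):
    """Replace a leading shortcut (text before the first ':') with its long name."""
    kind, sep, rest = probe.partition(":")
    return SHORTCUTS.get(kind, kind) + sep + rest


if __name__ == "__main__":
    for spec in ("kr:vfs_read", "t:syscalls:sys_enter_open", "kprobe:vfs_*"):
        print(spec, "=>", expand_probe(spec))
```

Note that the lookup is case-sensitive: `U` (usdt) and `u` (uprobe) are different probe types.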
Filters
● /pid == 181/
● /comm != "sshd"/
● /@ts[tid]/
Actions
● Per-event output
– printf()
– system()
– join()
– time()
● Map Summaries
– @ = count() or @++
– @ = hist()
– …
The following is in the https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md
Functions
● hist(n) Log2 histogram
● lhist(n, min, max, step) Linear hist.
● count() Count events
● sum(n) Sum value
● min(n) Minimum value
● max(n) Maximum value
● avg(n) Average value
● stats(n) Statistics
● str(s) String
● sym(p) Resolve kernel addr
● usym(p) Resolve user addr
● kaddr(n) Resolve kernel symbol
● uaddr(n) Resolve user symbol
● printf(fmt, ...) Print formatted
● print(@x[, top[, div]]) Print map
● delete(@x) Delete map element
● clear(@x) Delete all keys/values
● reg(n) Register lookup
● join(a) Join string array
● time(fmt) Print formatted time
● system(fmt) Run shell command
● exit() Quit bpftrace
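hist(n) groups values into power-of-two ranges, while lhist(n, min, max, step) uses fixed-width ranges. The Python sketch below shows one way the linear bucketing can be computed. The `lhist_bucket` helper is hypothetical, and the handling of out-of-range values (one underflow bucket below min, one overflow bucket at or above max) is an assumption made for this sketch rather than a statement of bpftrace's exact behavior.

```python
def lhist_bucket(n, lo, hi, step):
    """Return the half-open [start, end) range holding n.

    None marks an open end: (None, lo) collects values below lo and
    (hi, None) collects values at or above hi (assumed behavior).
    """
    if step <= 0 or hi <= lo:
        raise ValueError("need step > 0 and hi > lo")
    if n < lo:
        return (None, lo)
    if n >= hi:
        return (hi, None)
    start = lo + ((n - lo) // step) * step
    return (start, min(start + step, hi))


if __name__ == "__main__":
    for value in (-1, 0, 25, 95, 100):
        print(value, lhist_bucket(value, 0, 100, 10))
```

A linear histogram is the better fit when the interesting values sit in a known, narrow range; the log2 form covers many orders of magnitude in a handful of rows.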
Variable Types
● Basic Variables
– @global
– @thread_local[tid]
– $scratch
● Associative Arrays
– @array[key] = value
● Builtins
– pid
– ...
Builtin Variables
● pid Process ID (kernel tgid)
● tid Thread ID (kernel pid)
● cgroup Current Cgroup ID
● uid User ID
● gid Group ID
● nsecs Nanosecond timestamp
● cpu Processor ID
● comm Process name
● stack Kernel stack trace
● ustack User stack trace
● arg0, arg1, ... Function arguments
● retval Return value
● func Function name
● probe Full name of the probe
● curtask Current task_struct (u64)
● rand Random number (u32)
biolatency (again)
#!/usr/local/bin/bpftrace

BEGIN
{
    printf("Tracing block device I/O... Hit Ctrl-C to end.\n");
}

kprobe:blk_account_io_start
{
    @start[arg0] = nsecs;
}

kprobe:blk_account_io_completion
/@start[arg0]/
{
    @usecs = hist((nsecs - @start[arg0]) / 1000);
    delete(@start[arg0]);
}
bpftrace Internals
Issues
● All major capabilities exist
● Many minor things
● https://github.com/iovisor/bpftrace/issues
Other Tools
Netflix Vector: BPF heat maps
https://medium.com/netflix-techblog/extending-vector-with-ebpf-to-inspect-host-and-container-performance-5da3af4c584b
Anticipated Worldwide Audience
● BPF Tool Developers:
– Raw BPF: <20
– C (or C++) BPF: ~20
– bcc: >200
– bpftrace: >5,000
● BPF Tool Users:
– CLI tools (of any type): >20,000
– GUIs (fronting any type): >200,000
Other Tools
● kubectl-trace
● sysdig eBPF support
Take Aways
Easily explore systems with bcc/bpftrace
Contribute: see bcc/bpftrace issue list
Share: posts, talks
URLs
- https://github.com/iovisor/bcc
- https://github.com/iovisor/bcc/blob/master/docs/tutorial.md
- https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md
- https://github.com/iovisor/bpftrace
- https://github.com/iovisor/bpftrace/blob/master/docs/tutorial_one_liners.md
- https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md
Thanks
● bpftrace
– Alastair Robertson (creator)
– Netflix: myself so far
– Sthima: Matheus Marchini, Willian Gaspar
– Facebook: Jon Haslam, Dan Xu
– Augusto Mecking Caringi, Dale Hamel, ...
● eBPF/bcc
– Facebook: Alexei Starovoitov, Teng Qin, Yonghong Song, Martin Lau, Mark Drayton, …
– Netflix: myself
– VMware: Brenden Blanco
– Sasha Goldshtein, Paul Chaignon, ...
