eBPF
Daniel Borkmann
<[email protected]>
Noiro Networks / Cisco Systems
Daniel Borkmann tc and cls bpf with eBPF January 31, 2016 1 / 16
Background, history.
# tcpdump -i any -d ip
(000) ldh [14]
(001) jeq #0x800 jt 2 jf 3
(002) ret #65535
(003) ret #0
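The listing above can also be written down as the `struct sock_filter` array that user space would hand to the kernel. The sketch below is only illustrative: `ip_filter` and `run_filter` are hypothetical names, and `run_filter` is a toy evaluator that understands just the three opcodes used here, not the kernel's interpreter. Offset 14 matches the protocol field of the cooked header that `-i any` captures with; on plain Ethernet the EtherType sits at offset 12.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, BPF_JUMP */

/* The four instructions printed by `tcpdump -i any -d ip`.
 * jt/jf are offsets relative to the next instruction. */
static const struct sock_filter ip_filter[] = {
	BPF_STMT(BPF_LD  | BPF_H   | BPF_ABS, 14),          /* (000) ldh [14]             */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 0x800, 0, 1),   /* (001) jeq #0x800 jt 2 jf 3 */
	BPF_STMT(BPF_RET | BPF_K, 65535),                   /* (002) ret #65535           */
	BPF_STMT(BPF_RET | BPF_K, 0),                       /* (003) ret #0               */
};

/* Toy evaluator (hypothetical helper): supports only ldh-abs, jeq-k and ret-k.
 * Returns the filter verdict, or 0 on an out-of-bounds load or unknown opcode. */
static uint32_t run_filter(const struct sock_filter *prog, size_t n,
			   const uint8_t *pkt, size_t len)
{
	uint32_t acc = 0;

	assert(prog && pkt);
	for (size_t pc = 0; pc < n; pc++) {
		const struct sock_filter *insn = &prog[pc];

		switch (insn->code) {
		case BPF_LD | BPF_H | BPF_ABS:
			if (insn->k > len || len - insn->k < 2)
				return 0;
			/* Half-word load in network byte order. */
			acc = (uint32_t)pkt[insn->k] << 8 | pkt[insn->k + 1];
			break;
		case BPF_JMP | BPF_JEQ | BPF_K:
			pc += (acc == insn->k) ? insn->jt : insn->jf;
			break;
		case BPF_RET | BPF_K:
			return insn->k;
		default:
			return 0;
		}
	}
	return 0;
}
```

A packet whose bytes 14 and 15 are 0x08 0x00 is accepted with a snap length of 65535; anything else is dropped with a return value of 0.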
2012, No longer networking only!
BPF engine used for seccomp (syscall filtering)
Used inside Chrome as a sandbox, minimal example in bpf asm:
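The slide's own bpf_asm listing is not reproduced here. As a stand-in, the following C sketch shows what a minimal seccomp filter can look like when built with the classic BPF macros. It is not the original example: the allowed syscalls (`write`, `exit_group`), the x86_64 architecture check and the names `seccomp_prog` and `install_filter` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/audit.h>    /* AUDIT_ARCH_X86_64 */
#include <linux/filter.h>   /* struct sock_filter, struct sock_fprog */
#include <linux/seccomp.h>  /* struct seccomp_data, SECCOMP_RET_* */
#include <sys/prctl.h>      /* prctl(), PR_SET_SECCOMP */
#include <sys/syscall.h>    /* __NR_* */

/* Assumed policy: on x86_64, allow write and exit_group, kill otherwise. */
static struct sock_filter seccomp_prog[] = {
	/* [0] A = seccomp_data.arch */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
	/* [1] wrong architecture -> jump to kill at [5] */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 0, 3),
	/* [2] A = seccomp_data.nr (syscall number) */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
	/* [3] write -> allow at [6] */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_write, 2, 0),
	/* [4] exit_group -> allow at [6], else fall through to kill */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit_group, 1, 0),
	/* [5] */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
	/* [6] */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Hypothetical helper: installs the filter on the calling thread.
 * Returns 0 on success, -1 on error (errno set by prctl). */
static int install_filter(void)
{
	struct sock_fprog prog = {
		.len = sizeof(seccomp_prog) / sizeof(seccomp_prog[0]),
		.filter = seccomp_prog,
	};

	/* Required for unprivileged callers before attaching a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

Once `install_filter()` succeeds, every syscall other than the two allowed ones terminates the thread, so it should be called only after the program has finished its setup work.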
BPF (any flavour) used in the kernel today.
Networking:
  Socket filtering for most protocols
  AF_PACKET fanout demuxing
  SO_REUSEPORT socket demuxing
  tc classifier (cls_bpf) and actions (act_bpf)
  team driver load balancing
  netfilter xtables (xt_bpf)
  Some misc ones: PTP classifier, PPP and ISDN
Tracing:
  BPF as kprobes-based extensions
Sandboxing:
  Syscall filtering with seccomp
Classic BPF (cBPF) in a nutshell.
Extended BPF (eBPF) as next step.
eBPF, General remarks.
Stable ABI for user space, like the case with cBPF
Management via bpf(2) syscall through file descriptors
Points to kernel resource → eBPF map / program
No cBPF interpreter in kernel anymore, all eBPF!
Kernel performs internal cBPF to eBPF migration for cBPF users
JITs for eBPF: x86_64, s390, arm64 (remaining ones are still cBPF)
Various stages for in-kernel cBPF loader
Security (verifier, JIT spraying mitigations, read-only images, restrictions for unprivileged users)
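To make the "management via bpf(2) through file descriptors" point concrete, here is a minimal sketch of creating a hash map from user space. The helper names `fill_hash_map_attr` and `create_hash_map` are hypothetical; the call needs a kernel with bpf(2) support and, depending on configuration, sufficient privileges.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>   /* __NR_bpf */
#include <linux/bpf.h>     /* union bpf_attr, BPF_MAP_CREATE, BPF_MAP_TYPE_HASH */

/* Hypothetical helper: prepare the attribute block for BPF_MAP_CREATE.
 * Unused fields of union bpf_attr must be zero, hence the memset. */
static void fill_hash_map_attr(union bpf_attr *attr, unsigned int key_size,
			       unsigned int value_size, unsigned int max_entries)
{
	assert(attr);
	memset(attr, 0, sizeof(*attr));
	attr->map_type = BPF_MAP_TYPE_HASH;
	attr->key_size = key_size;
	attr->value_size = value_size;
	attr->max_entries = max_entries;
}

/* Hypothetical helper: returns a file descriptor referring to the new map,
 * or -1 with errno set. Closing the last reference releases the map. */
static int create_hash_map(unsigned int key_size, unsigned int value_size,
			   unsigned int max_entries)
{
	union bpf_attr attr;

	fill_hash_map_attr(&attr, key_size, value_size, max_entries);
	return (int)syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}
```

The returned descriptor is the handle for all further operations on the map (lookup, update, delete), which is what "points to kernel resource" refers to.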
eBPF and cls_bpf.
skb metadata:
Read/write: mark, priority, tc_index, cb[5], tc_classid
Read: len, pkt_type, queue_mapping, protocol, vlan_*, ifindex, hash
Tunnel metadata:
Read/write: tunnel key for IPv4/IPv6 (dst-meta by vxlan, geneve, gre)
Helpers:
eBPF map access (lookup/update/delete)
Tail call support
Store/load payload (multi-)bytes
L3/L4 csum fixups
skb redirection (ingress/egress)
VLAN push/pop and tunnel key
trace_printk debugging
net_cls cgroup classid
Routing realms (dst->tclassid)
Get random number/cpu/ktime
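As a small illustration of the skb metadata access listed above, the sketch below reads `protocol` and `len` and writes `mark` and `cb[0]`. It deliberately uses only kernel UAPI headers and no helper calls, so the same function can be compiled for the BPF target or exercised on the host for testing. The function name `cls_mark_ipv4` and the mark value are hypothetical, and the program is not taken from the talk.

```c
#include <assert.h>
#include <asm/byteorder.h>   /* __constant_htons */
#include <linux/bpf.h>       /* struct __sk_buff */
#include <linux/if_ether.h>  /* ETH_P_IP */
#include <linux/pkt_cls.h>   /* TC_ACT_OK */

#define IPV4_MARK 0x1u       /* hypothetical mark value for IPv4 traffic */

/* Placed in the "classifier" ELF section, where tc looks by default. */
__attribute__((section("classifier"), used))
int cls_mark_ipv4(struct __sk_buff *skb)
{
	/* protocol is read-only and in network byte order. */
	if (skb->protocol == __constant_htons(ETH_P_IP)) {
		skb->mark = IPV4_MARK;   /* read/write field */
		skb->cb[0] = skb->len;   /* scratch space, read/write */
	}
	return TC_ACT_OK;
}
```

Keeping the classification logic free of helper calls is just a convenience for this sketch; a real program would typically combine such metadata access with map lookups or the other helpers from the list.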
cls_bpf, Invocation points.
RX path: __netif_receive_skb_core() → sch_handle_ingress()
TX path: __dev_queue_xmit() → sch_handle_egress()
cls_bpf, Example setup in 1 slide.
$ clang -O2 -target bpf -c foo.c -o foo.o
tc eBPF examples, minimal module.
$ cat >foo.c <<EOF
#include "bpf_api.h"

/* Entry point, placed in the ELF section that tc loads as classifier. */
__section_cls_entry
int cls_entry(struct __sk_buff *skb)
{
	/* char fmt[] = "hello prio%u world!\n"; */
	/* Use the sender's net_cls cgroup classid as skb priority. */
	skb->priority = get_cgroup_classid(skb);
	/* trace_printk(fmt, sizeof(fmt), skb->priority); */
	return TC_ACT_OK;
}

BPF_LICENSE("GPL");
EOF
# cgcreate -g net_cls:/foo
# echo 6 > foo/net_cls.classid
# cgexec -g net_cls:foo ./bar # -> app ./bar xmits with priority of 6
tc eBPF examples, map sharing.
#include "bpf_api.h"
Code and further information.
Take-aways:
No, development on tc is not in deep hibernation mode ;)
eBPF implementation details may be complex, BUT the workflow and
writing eBPF programs are really easy (perhaps easiest in tc?)
Low overhead, fully programmable for your specific use-case
Native performance when JITed!
Code:
Everything upstream in kernel, iproute2 and llvm!
Available from usual places, e.g. https://git.kernel.org/
Some further information:
Man pages bpf(2), tc-bpf(8)
Examples in iproute2’s examples/bpf/
Documentation/networking/filter.txt