eBPF - Unleash the Linux kernel
eBPF opens the pandora box of observability, tracing, networking and security applications
User Space vs Kernel Space
In order to understand where eBPF comes into the picture, first we need to understand the basic difference of User Space vs Kernel Space
eBPF fits into the kernel space; but giving a user-space analogy of the significance of eBPF to the Operating System would be what WASM was for the browser. WASM allowed applications to be written in non-javascript to be interpreted in the browser. It allowed products like Figma to be developed for the web. Sketch and Invision would not be able keep up with the power of Figma's powerful editor. Good news is that eBPF for its revolutionary and novel will aid in observing the kernel space and not bring any technologies on its knees.
Why eBPF
Say you want to look into what is happening inside the Linux; I mean really look into the nuts and bolts; you might use an interface from User Space that can talk to Kernel Space which would give information such as File IO
, Network Traces
etc. but this would cause 2 issues
Any unexpected crash at the Kernel Space level can lead to lack of observability on root cause of failure
As it is an interface, the hardware or other Kernel Level details could not be visible from User Space
That is where eBPF comes in and shines. It is a sort of Virtual Machine that hooks up inside the Kernel which allows visibility to the User Space from the Kernel. This opens up the pandora's box of possibilities of logging, instrumentation and security applications that could leverage eBPF.
History of BPF and eBPF
eBPF's name has been derived from a old technology called as BPF (Berkely Packet Filter).
BPF
BPF was first conceived in early 1990s (before Linux became mainstream) as a way to intercept network traffic from the kernel itself instead of relying on user-level processes. It was a network tap (monitor network traffic) and packet filter (filtering out unwanted packets to reduce noise in netowrk monitoring)
eBPF
Video Recommendation : eBPF: Unlocking the Kernel
In early 2010s, there was a need for having better observability tools that could be defined in a software instead of hardware. Also, all the things that things that went from User Space to Kernel Space could crash unexpectedly without actually knowing what caused the crash at the Kernel space layer. Alexie wanted something that an application at User Space could call which would be a hook/probe-point to the Kernel which would further make a decision to send information to the User space. This is where eBPF was born.
eBPF Architecture
eBPF programs are called by hooks/probes either by Kernel or the user-land application. These hooks are pre-defined are include
System calls
Function entry/exit: Custom Programs can attach to entry/exit functions so that they can run at these scenarios.
Kernel Tracepoints: Tracepoints are lightweight hooks to call a function at runtime. It is used for tracing and perf analysis at the kernel.
Network interfaces via XDP: eXpress Data Path (XDP) allows custom programs to attach to eBPF which can execute these custom program when the network packets are received.
LSI Module interface etc.
If the hook/probe are not available for particular use-case; it is possible to attach probes to eBPF at both user and kernel levels. These are called uprobe
and kprobe
respectively.
Loading
Before a BPF can be run on the kernel; it requires to be Loaded with the help of some eBPF loader libraries:
eBPF by Cilium: Go-based eBPF loading library
Verification
Need to ensure that the eBPF program is safe to run It make sure that it follows various conditions such as:
Program is run by priveleged eBPF program (unless stated otherwise)
Program does not bring the system down
Program always run to completion
JIT Compilation
Translating the eBPF bytecode (generated at User Space) into machine-readable code.
eBPF Maps
These are required to hold and retrieve data. These hold wide variety of data such as
Hash Tables
Arrays
Stack traces
etc...
Helper calls
eBPF do not call kernel functions directly in order to make eBPF loosely couple with Kernel versions. eBPF call function calls to pre-defined helper functions defined by the kernel. These helper functions include:
Current time and date
Process/cgroup context
- cgroups are basically isolation of processes, CPU, N/W for separate containers
Network packet manipulation and forwarding logic
Random number generator
Tail and Function Calls
eBPF allows tail call of other eBPF functions which allows for compatibility and extendability of eBPF programs
Ensuring eBPF is safe
As eBPF is a very powerful concept, which allows user-level programs to hook into kernel level details; it is imperative that the architecture of eBPF ensures that such a technology does not break the system when used.
Writing safe eBPF programs
This includes
eBPF programs should be non-blocking. eBPF program can contain a loop if and only if, the Verifier can ensure that loop exits!
eBPF programs do not use out-of-bounds memory
eBPF programs should be small
eBPF
Hardening eBPF
Kernel memory inside a eBPF program is read-only.
Spectre migration: Spectre was a big vulnerability in CPU architecture which allowed for out-of-order branch execution which lead to sensitive data access to attackers. eBPF prevents Spectre-type attacks at the Verifier level
Constant Binding
eBPF Applications
Networking
Networking use-cases for eBPF include doing Traffic Control, controlling network policy (via XDP) etc Tools:
Observability
Send Kernel Level details to observability platform
Grafana Beyla allows instrumentation of HTTP and HTTPS services from the Linux kernel to Grafana directly.
Tracing
Want to look into production tracing and troubleshooting. BPFTrace is the tool for you
If wanted to know how the VACUUM
process is happening in Postgres under-the-hood. BPFTrace can help with that! Check this article to monitor Postgres's VACUUM process
List of eBPF based tracing tools
Security
Traditionally, auditd was used for auditing things happening in the Linux OS; but as it is user-space component; there is some performance penalty.
Low-level observability through eBPF can aid in finding alerts for kernel level changes by attackers. For example, when application changes privileges are changed it can trigger an alert to a listening eBPF custom program
Security Libraries that are based on eBPF
KubeArmour: K80based Security engine which used eBPF and Linux Security Modules (LSM)
Falco