Observing container traffic with eBPF: where to attach and what to expect
April 26, 2026 · kernel notes · ebpf · containers · networking · observability
When a container starts on Linux, the kernel creates a virtual Ethernet pair (veth): two interfaces wired together like a pipe. One end goes inside the container’s network namespace; the other stays on the host, usually attached to a bridge or routed via iptables.
If you want to observe what the container is doing on the network, you have to pick where to attach your eBPF program. The choice determines what you see and how reliable it is.
The four obvious places
For a packet leaving a container, the journey is:
- Application writes to a socket inside the container’s namespace.
- Kernel processes it, builds an sk_buff, and sends it out the container-side veth.
- Packet appears on the host-side veth.
- Linux bridge or iptables routes it to the host’s external interface (eth0).
- Packet leaves the host.
Each step is a place you can attach. They give different views:
A. Container-side veth, ingress/egress (tc-bpf): sees raw packets exactly as the container does. The most intuitive view, but you have to enter the container’s namespace to attach, and the attachment is fragile across container restarts.
B. Host-side veth, ingress/egress (tc-bpf): sees the same packets from the host’s perspective. Easier to attach (no namespace dancing), and it survives container restarts as long as you re-attach when the veth reappears.
C. Host’s external interface (eth0), ingress/egress (XDP or tc-bpf): sees aggregated traffic from all containers, but already with whatever NAT/SNAT the host applies. Source addresses may be the host’s IP, not the pod IP.
D. Kernel hooks via kprobes: trace specific kernel functions like tcp_sendmsg or inet_csk_accept. See connection lifecycle, not raw packets.
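To make option B concrete, here is a sketch of attaching at the host-side veth with the tc CLI. The object file name (`bpf_prog.o`) and section name (`tc`) are placeholders for whatever your build produces; the function emits the commands as text so you can inspect them before piping to `sh` as root.

```shell
# Sketch: attach a tc-bpf classifier to a host-side veth (option B).
# bpf_prog.o / section "tc" are placeholder names for your compiled object.
# Emits the commands (dry-run); pipe to `sh` as root to actually apply.
attach_tc() {
  local dev="$1" obj="$2"
  # clsact provides ingress and egress hook points without a real qdisc
  echo "tc qdisc add dev $dev clsact"
  # "da" (direct-action): the program's return code is the packet verdict
  echo "tc filter add dev $dev ingress bpf da obj $obj sec tc"
  echo "tc filter add dev $dev egress bpf da obj $obj sec tc"
}

attach_tc veth1a2b3c4d bpf_prog.o
```

Afterwards, `tc filter show dev veth1a2b3c4d ingress` confirms what is attached.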
What changes between attach points
This matters more than people expect.
Source IP visibility. If you attach at the host’s eth0 (option C), the source IP you see depends on whether SNAT has already happened. With common Kubernetes setups, pod traffic leaving the cluster is MASQUERADEd to the node’s IP, and Service traffic under the default externalTrafficPolicy: Cluster gets SNATed as well. The pod IP is gone. If you want pod-level identity, attach inside the pod or at the host-side veth (B), before SNAT.
Encryption. TLS-encrypted traffic looks like opaque bytes at any L3/L4 attach point. To see HTTP/HTTPS payloads, you’d need to attach at the application or socket layer with kprobes/uprobes — invasive and brittle.
Performance. XDP at eth0 is fastest. tc-bpf at host-side veth is fast enough for almost anything. tc-bpf inside the container namespace is similar but adds attach complexity.
Visibility under failure. eBPF programs run in the host kernel regardless of which namespace’s interface they attach to. The verifier rules out outright crashes, but a misbehaving program can still drop packets, burn CPU, or fill maps, and that affects the whole host. There’s no isolation. Attach in less-trusted contexts only when you’ve validated thoroughly.
A practical pattern
For a node-level observability agent (à la Cilium, Pixie, or AgentGate-shaped projects), the typical pattern is:
- Attach tc-bpf programs to the host-side veth of each pod, both ingress and egress.
- Use a kernel watcher (netlink) to detect new pod veths appearing and re-attach.
- Aggregate metrics per pod via a map keyed by veth name or pod identity.
- Optionally also attach at the host’s eth0 to compare aggregates against external view.
Cilium does roughly this. So does Pixie. The host-side veth gives you per-pod traffic with original IPs and reasonable performance.
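At startup the agent also needs to pick up veths that already exist. A minimal enumeration sketch, assuming host-side pod veths follow the usual `veth` naming prefix:

```shell
# Extract device names of existing veths from `ip -o link` output.
# Host-side pod veths conventionally look like "N: vethXXXXXXXX@ifM: <...>".
veth_names() {
  awk -F': ' '$2 ~ /^veth/ {split($2, a, "@"); print a[1]}'
}

# Enumerate once at startup, then attach to each (requires root):
# ip -o link show type veth | veth_names | while read -r dev; do ... done
```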
Watching for new veths
```shell
ip monitor link
```
This streams netlink events as interfaces appear and disappear. A programmatic version subscribes to link events over a netlink socket, for example with golang.org/x/sys/unix or the vishvananda/netlink library’s LinkSubscribe. When a new veth shows up matching your pattern (vethXXXXXXXX), attach your tc-bpf program.
Don’t try to enumerate at startup and forget. Pods restart constantly. Your agent has to re-attach as veths appear.
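The re-attach loop can be driven directly off `ip monitor link` output. The event-line shapes below match iproute2’s usual formatting, but treat the parsing as a sketch, not a contract:

```shell
# Decide whether a netlink event line describes a newly appeared pod veth.
is_new_pod_veth() {
  case "$1" in
    Deleted*)   return 1 ;;  # interface going away: nothing to attach to
    *": veth"*) return 0 ;;  # "N: vethXXXX@ifM: <...>" style lines
    *)          return 1 ;;
  esac
}

# Main loop (run as root); attach_prog stands in for your attach routine:
# ip monitor link | while read -r line; do
#   is_new_pod_veth "$line" && attach_prog "$line"
# done
```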
Container runtime quirks
Different runtimes do slightly different things:
- Docker with default bridge mode: veth pair, bridge docker0, iptables masquerade.
- containerd with CRI: similar, but the bridge name varies (cni0 is common with Flannel).
- Kubernetes with Cilium: no bridge; eBPF handles routing directly. The veth still exists, but iptables is mostly empty.
- podman rootless: uses slirp4netns or pasta — entirely different stack, no veth at all on the host.
For rootless or specialty container modes, your eBPF observability won’t work the way it does in standard runtimes. Test specifically; don’t assume.
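One cheap sanity check before deploying: fingerprint the node’s setup from its link list. The interface names below are typical defaults, not guarantees:

```shell
# Heuristic classification of `ip -o link` output by interface name.
classify_links() {
  awk -F': ' '
    $2 ~ /^docker0/ { print "docker bridge" }
    $2 ~ /^cni0/    { print "CNI bridge" }
    $2 ~ /^veth/    { print "veth (standard container networking)" }
  '
}

# No veths at all hints at slirp4netns/pasta or hostNetwork-only nodes:
# ip -o link | classify_links
```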
What about hostNetwork pods?
A hostNetwork pod doesn’t have its own network namespace. There’s no veth. The pod’s containers just run with the host’s networking. eBPF attached at the host’s eth0 sees everything. eBPF attached at “the pod’s veth” finds nothing because there’s no veth.
This means most observability solutions either treat hostNetwork pods specially or report them as “unobserved” until you handle them as host-level traffic. Plan for it.
Practical advice
Start with tc-bpf attached to the host’s external interface, both directions. You’ll see aggregate traffic but lose per-pod identity. This is the simplest and most reliable starting point.
When you need per-pod resolution, attach at the host-side veth and watch netlink for veth events. This is the design used by every serious K8s observability eBPF project.
Don’t reach for in-container attachment unless you specifically need the pre-SNAT, pre-routing view. The complexity isn’t worth it for most use cases.
The verifier is going to make you suffer the first few times. The patterns settle out within a project or two.