cilium/ebpf: writing the Go side of an eBPF program
April 19, 2026 · kernel notes · ebpf · go · cilium
Most production eBPF code is split into two parts: a C program that runs in the kernel, and a userspace controller that loads it, configures it, and reads its data. The Go ecosystem has converged on github.com/cilium/ebpf as the library for the controller side. It is mature, well-maintained, and the closest thing to a standard.
This is a practical tour of how to use it without the surprises.
The big picture
Your workflow looks like:
- Write a C program in bpf/your_program.c.
- Generate Go bindings via bpf2go (a code generator that ships with cilium/ebpf).
- In Go, load the compiled object and get back struct fields for each map and program.
- Attach the program to an interface or hook.
- Read/write maps, consume ringbuf events, etc.
bpf2go is the integration point that makes this ergonomic. It compiles the C with clang, embeds the resulting object file as a Go variable, and generates typed accessors for every map and program defined in the C source.
Setting up bpf2go
In your Go package:
package bpf
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go bpf ../../bpf/xdp_counter.c
Pin bpf2go as a module dependency (go get github.com/cilium/ebpf/cmd/bpf2go) rather than using @latest, so the generator version stays in sync with the library version in go.mod. Run go generate ./... and you get bpf_bpfel.go and bpf_bpfeb.go (little- and big-endian), plus the compiled objects bpf_bpfel.o and bpf_bpfeb.o that they embed. Build tags in the generated files select the right pair automatically. Don’t edit these — they regenerate on every go generate.
The generated code gives you bpfObjects, bpfPrograms, and bpfMaps structs. Each map and program in your C is a typed field.
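For the xdp_counter example, the generated code looks roughly like this — a sketch, since the exact field names are derived from your C identifiers (here a program count_ipv4 and a map pkt_count are assumed):

```go
package bpf

import "github.com/cilium/ebpf"

// Sketch of bpf2go output. The struct tags tie each field to the
// corresponding object in the embedded ELF by its C name.
type bpfObjects struct {
	bpfPrograms
	bpfMaps
}

type bpfPrograms struct {
	CountIpv4 *ebpf.Program `ebpf:"count_ipv4"`
}

type bpfMaps struct {
	PktCount *ebpf.Map `ebpf:"pkt_count"`
}
```

Because the fields are typed, a renamed or deleted map in the C source becomes a compile error in Go rather than a runtime surprise.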
Loading and attaching
import (
"github.com/cilium/ebpf"
"github.com/cilium/ebpf/link"
"github.com/cilium/ebpf/rlimit"
)
// Needed on kernels before 5.11, which account BPF memory against RLIMIT_MEMLOCK
if err := rlimit.RemoveMemlock(); err != nil {
log.Fatal("removing memlock:", err)
}
var objs bpfObjects
if err := loadBpfObjects(&objs, nil); err != nil {
log.Fatal("loading objects:", err)
}
defer objs.Close()
iface, err := net.InterfaceByName("eth0")
if err != nil {
log.Fatal("interface lookup:", err)
}
xdpLink, err := link.AttachXDP(link.XDPOptions{
Program: objs.CountIpv4,
Interface: iface.Index,
Flags: link.XDPGenericMode,
})
if err != nil {
log.Fatal("attaching XDP:", err)
}
defer xdpLink.Close()
XDPGenericMode runs in generic/SKB mode: it works on any interface but is slow, because packets are processed after the kernel has already built an skb. XDPDriverMode runs inside the NIC driver before skb allocation and requires native XDP support in the driver. XDPOffloadMode runs the program on the NIC itself and requires programmable hardware.
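In practice many agents try the fast path first and fall back. A sketch, reusing objs and iface from the snippet above:

```go
// Try native driver mode first; fall back to generic mode when the
// driver has no XDP support (common in VMs and on older NICs).
xdpLink, err := link.AttachXDP(link.XDPOptions{
	Program:   objs.CountIpv4,
	Interface: iface.Index,
	Flags:     link.XDPDriverMode,
})
if err != nil {
	log.Printf("driver-mode attach failed (%v), falling back to generic", err)
	xdpLink, err = link.AttachXDP(link.XDPOptions{
		Program:   objs.CountIpv4,
		Interface: iface.Index,
		Flags:     link.XDPGenericMode,
	})
	if err != nil {
		log.Fatal("attaching XDP:", err)
	}
}
defer xdpLink.Close()
```

Log which mode you ended up in — it matters when you compare throughput numbers later.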
Reading maps from userspace
key := uint32(0)
var values []uint64
if err := objs.PktCount.Lookup(key, &values); err != nil {
log.Printf("map lookup: %v", err)
}
possibleCPUs, _ := ebpf.PossibleCPU()
if len(values) != possibleCPUs {
log.Printf("unexpected per-cpu length: got %d, expected %d", len(values), possibleCPUs)
}
var total uint64
for _, v := range values {
total += v
}
For per-CPU maps, the cilium/ebpf library handles the per-CPU array marshaling. You pass a slice of the value type; it returns one entry per CPU. Forgetting this and reading a scalar gives you a confusing error.
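Writes work the same way in reverse — a sketch of resetting the counter, assuming the same per-CPU PktCount map:

```go
// Put on a per-CPU map takes one value per possible CPU.
possible, err := ebpf.PossibleCPU()
if err != nil {
	log.Fatal("possible CPUs:", err)
}
zero := make([]uint64, possible) // zero value for every CPU
if err := objs.PktCount.Put(uint32(0), zero); err != nil {
	log.Printf("resetting counter: %v", err)
}
```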
Iterating maps
var key uint32
var value uint64
iter := objs.SeenIps.Iterate()
for iter.Next(&key, &value) {
fmt.Printf("ip=%s seen=%d\n", netip.AddrFrom4(*(*[4]byte)(unsafe.Pointer(&key))), value)
}
if err := iter.Err(); err != nil {
log.Printf("iteration error: %v", err)
}
Iteration takes a snapshot conceptually — entries added during iteration may or may not appear, deletions may show as zero values. Don’t rely on iteration order.
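If you want to expire entries, collect keys during iteration and delete in a second pass — deleting from a hash map while iterating it can make the iterator skip or repeat entries. A sketch, assuming a hypothetical staleness criterion on the value:

```go
// Two-pass expiry: snapshot candidate keys, then delete them.
var (
	key   uint32
	value uint64
	stale []uint32
)
iter := objs.SeenIps.Iterate()
for iter.Next(&key, &value) {
	if value == 0 { // hypothetical staleness criterion
		stale = append(stale, key)
	}
}
if err := iter.Err(); err != nil {
	log.Printf("iteration error: %v", err)
}
for _, k := range stale {
	if err := objs.SeenIps.Delete(&k); err != nil && !errors.Is(err, ebpf.ErrKeyNotExist) {
		log.Printf("delete: %v", err)
	}
}
```

ErrKeyNotExist is expected here: the kernel side may have deleted the entry between your two passes.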
Consuming a ringbuf
import "github.com/cilium/ebpf/ringbuf"
reader, err := ringbuf.NewReader(objs.NewFlowEvents)
if err != nil {
log.Fatal("ringbuf reader:", err)
}
defer reader.Close()
for {
record, err := reader.Read()
if err != nil {
if errors.Is(err, ringbuf.ErrClosed) {
return
}
log.Printf("ringbuf read: %v", err)
continue
}
if len(record.RawSample) != 4 {
log.Printf("unexpected event size: %d", len(record.RawSample))
continue
}
// record.RawSample is a copy, safe to use after Read returns
srcIP := netip.AddrFrom4([4]byte{
record.RawSample[0], record.RawSample[1],
record.RawSample[2], record.RawSample[3],
})
fmt.Println("new flow from", srcIP)
}
Read() blocks until an event arrives or the reader is closed. Close it from your shutdown handler — that’s how you escape the loop on signal.
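A minimal shutdown wiring, sketched:

```go
// Close the reader when SIGINT/SIGTERM arrives; the blocked Read in the
// consume loop then returns ringbuf.ErrClosed and the loop exits.
sig := make(chan os.Signal, 1)
signal.Notify(sig, os.Interrupt, syscall.SIGTERM)
go func() {
	<-sig
	reader.Close()
}()
```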
Common gotchas
Struct layout mismatch. If your C struct is struct { __u8 a; __u32 b; } __attribute__((packed)), it is 5 bytes, while the natural Go struct { A uint8; B uint32 } is 8 bytes (3 bytes of padding before B) — every field after the mismatch reads garbage. Prefer the bpf2go-generated types, which are derived from BTF and always match. If you hand-write the Go side, keep both structs unpacked with identical field order and pin the layout with an embedded _ structs.HostLayout field (Go 1.23+). This is the single biggest source of “the values are wrong” bugs.
Forgetting Close(). Maps and programs hold kernel resources through file descriptors. Unpinned objects are released when the process exits, but anything pinned to bpffs outlives it, and you’ll see EEXIST errors on the next start while the old pins are still around. Always defer Close, and remove pins on shutdown.
Map full silently. Update returns unix.E2BIG when the map is full. If you don’t check, updates silently fail. Always check returns from Update and Put.
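Checking the full-map case explicitly — a sketch using the SeenIps map from earlier:

```go
// E2BIG from the kernel means the hash map hit max_entries.
key, value := uint32(0x0a000001), uint64(1) // hypothetical entry
if err := objs.SeenIps.Put(&key, &value); err != nil {
	if errors.Is(err, syscall.E2BIG) {
		log.Printf("map full, dropping entry for key %#x", key)
	} else {
		log.Printf("map update: %v", err)
	}
}
```

Whether you drop, evict, or use an LRU map variant instead is a design decision — the point is that a full map must be a visible event, not a silent one.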
XDP works in lab, fails in prod. Generic XDP runs anywhere, including in containers and VMs. Native XDP requires driver support and the right NIC. Test what you’ll deploy on; don’t assume.
Project layout
A typical layout that works:
project/
├── bpf/
│ └── xdp_counter.c
├── cmd/
│ └── agent-node/
│ └── main.go
├── internal/
│ ├── bpf/
│ │ ├── gen.go // go:generate directive
│ │ ├── bpf_bpfeb.go // generated
│ │ ├── bpf_bpfel.go // generated
│ │ └── loader.go // your wrapper
│ └── ...
├── go.mod
└── Dockerfile
The internal/bpf package owns loading, attaching, and exposing typed accessors. Your business logic in cmd/agent-node calls into that package. Keep the C close to the Go that consumes it.
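A sketch of what loader.go might expose — the Counter name and Attach signature are hypothetical, but the shape is the point: one exported type that owns the objects and the link, with Close tearing both down:

```go
package bpf

import (
	"fmt"
	"net"

	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/rlimit"
)

// Counter hides the generated bpfObjects behind a small API.
type Counter struct {
	objs bpfObjects
	link link.Link
}

// Attach loads the objects and attaches the XDP program to ifaceName.
func Attach(ifaceName string) (*Counter, error) {
	if err := rlimit.RemoveMemlock(); err != nil {
		return nil, fmt.Errorf("removing memlock: %w", err)
	}
	var objs bpfObjects
	if err := loadBpfObjects(&objs, nil); err != nil {
		return nil, fmt.Errorf("loading objects: %w", err)
	}
	iface, err := net.InterfaceByName(ifaceName)
	if err != nil {
		objs.Close()
		return nil, fmt.Errorf("interface %s: %w", ifaceName, err)
	}
	l, err := link.AttachXDP(link.XDPOptions{
		Program:   objs.CountIpv4,
		Interface: iface.Index,
		Flags:     link.XDPGenericMode,
	})
	if err != nil {
		objs.Close()
		return nil, fmt.Errorf("attaching XDP: %w", err)
	}
	return &Counter{objs: objs, link: l}, nil
}

// Close detaches the program and releases all kernel resources.
func (c *Counter) Close() error {
	c.link.Close()
	return c.objs.Close()
}
```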
cilium/ebpf has come a long way. The library is genuinely good now, the verifier errors are getting better, and bpf2go solves real problems. Most of the pain you’ll meet is C verifier complaints, not Go library issues.