eBPF × AI/LLMs: The Convergence of System Observability and Artificial Intelligence

Yusheng Zheng

The convergence of Artificial Intelligence and eBPF is rapidly defining the next frontier in system software, changing both how we build and how we manage complex applications. As Large Language Models (LLMs) evolve from standalone applications into AI agents that take an active part in the software development lifecycle, they are increasingly used to generate, optimize, and verify low-level systems code, including kernel extensions. At the same time, these AI workloads and agents need rich runtime context to be operated efficiently, securely, and reliably. This is where eBPF excels, offering a safe and performant mechanism to program the kernel and provide the high-fidelity telemetry that modern systems require.

This powerful relationship is a symbiotic loop, where each technology enhances the capabilities of the other. The synergy flows in two primary directions:

eBPF for AI: Enhancing AI/ML Workloads. In this direction, eBPF acts as an advanced sensor suite and extension runtime for AI systems. By providing deep, real-time visibility into the kernel, it allows developers to diagnose performance bottlenecks in complex AI/ML pipelines, such as GPU stalls, network I/O latency, and inefficient data access patterns. This telemetry is crucial for optimizing resource utilization, ensuring security compliance for AI agents, and enabling closed-loop feedback systems that can dynamically tune workloads for maximum performance and efficiency.
AI for eBPF: Optimizing the Operating System. In the reverse direction, AI and LLMs serve as a force multiplier for kernel development. They are used to automatically generate, verify, and optimize eBPF programs from high-level natural language prompts, dramatically lowering the barrier for creating sophisticated system logic. This unlocks the potential to build dynamic, intelligent, and self-tuning OS policies, including learned CPU schedulers, adaptive network traffic management, and proactive security enforcement, fundamentally enhancing the behavior of the underlying operating system.

The practical impact of this synergy is already evident across a range of cutting-edge applications from 2024–2025. Researchers and practitioners are building systems for comprehensive AI agent observability by tracing LLM Agent prompts and operations, using machine learning to assist in the generation of correct and efficient eBPF programs, and creating intelligent in-kernel data paths with XDP to pre-filter and steer traffic for ML services. Furthermore, this combination enables zero-instrumentation tracing of GPU and LLM workloads, providing critical performance data without requiring any application code changes.

As an open-source community, we are also working on projects combining eBPF and AI, such as:

eGPU: Offloads eBPF bytecode onto GPUs via PTX/SPIR-V injection. It is merged into the main branch of our eBPF runtime, bpftime.
AgentSight: Zero-instrumentation observability for LLMs and AI agents (e.g. Claude Code, gemini-cli), built on eBPF.
GPTtrace and MCPtrace: Use LLMs to help you trace your kernel, with the accompanying paper Kgent, an LLM-powered eBPF synthesis tool that incorporates Z3-based symbolic checks and tests to produce more reliable code, achieving ~80% semantic correctness on its test sets.

Awesome Lists of eBPF×AI Use Cases (contributions welcome!)

Part 1: eBPF for AI — Observability, Security, and Performance

This section covers how eBPF provides the essential data for monitoring, securing, and optimizing complex AI and LLM workloads.

A. Tracing and Securing LLM Applications

To effectively manage and secure AI agents, it's crucial to stitch high-level prompts to their low-level system effects using a combination of library uprobes and syscall/network hooks. While LLMs can be used to summarize incidents, hard security policies should be enforced directly in the kernel with eBPF and LSM.
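
The stitching step can be sketched in user space as a join between two event streams. This is a minimal illustration, not how any tool listed below works: the `PromptEvent` and `SyscallEvent` shapes, the `stitch` function, and the 5-second window are all assumptions made for the example. It attributes each low-level event to the most recent prompt from the same PID.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PromptEvent:
    """Plaintext captured by a uprobe on a TLS library write (hypothetical shape)."""
    pid: int
    ts_ns: int
    prompt: str


@dataclass(frozen=True)
class SyscallEvent:
    """An event from a syscall or network hook (hypothetical shape)."""
    pid: int
    ts_ns: int
    name: str
    detail: str


def stitch(prompts: List[PromptEvent], syscalls: List[SyscallEvent],
           window_ns: int = 5_000_000_000) -> Dict[str, List[SyscallEvent]]:
    """Attribute each syscall to the latest prompt from the same PID that
    precedes it by at most window_ns. Results are keyed by prompt text,
    so identical prompts share one entry in this sketch."""
    by_prompt: Dict[str, List[SyscallEvent]] = {p.prompt: [] for p in prompts}
    ordered = sorted(prompts, key=lambda p: p.ts_ns)
    for ev in sorted(syscalls, key=lambda e: e.ts_ns):
        owner = None
        for p in ordered:
            if p.pid == ev.pid and 0 <= ev.ts_ns - p.ts_ns <= window_ns:
                owner = p  # later prompts overwrite earlier ones
        if owner is not None:
            by_prompt[owner.prompt].append(ev)
    return by_prompt
```

Matching on PID alone is a simplification: agents commonly spawn child processes, so a practical correlator would also need to follow the process tree.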

AgentSight: An open-source solution that combines eBPF TLS interception with kernel signals and secondary LLM analysis to trace agent activity with less than 3% overhead (arXiv).
Groundcover LLM Observability: Provides enterprise-grade, eBPF-based visibility into LLM API calls and their content (groundcover.com).
Protect AI: An eBPF agent designed to monitor LLM provider traffic within Kubernetes environments for security and compliance (protectai.com).
Prompt Security: Uses eBPF for real-time tracing of the model stack and vector database interactions to prevent threats (Prompt Security).
eInfer: An eBPF-based, transparent tracer for distributed LLM inference that correlates per-request performance across CPU/GPU nodes with low overhead (ACM Digital Library).
Runtime Anomaly Detection: Research demonstrates using eBPF to feed kernel-level signals into ML models for detecting anomalous behavior, such as in ransomware (arXiv) and general process activity via autoencoders on syscall sequences (evilsocket).
ActPlane: An eBPF-based policy engine that enforces labeled information-flow control rules across process, file, and network boundaries for AI agents, with corrective semantic feedback (GitHub).
ProfInfer: An eBPF-based fine-grained LLM inference profiler that attaches probes across runtime layers without recompilation, with less than 4% overhead (arXiv).
ARMO / Kubescape 4.0: eBPF-based AI agent sandboxing and progressive enforcement for Kubernetes; monitors tool invocations, agent execution chains, and HTTP content at kernel level (ARMO blog).
Backend.AI GPU Cluster Audit: eBPF-based security auditing for GPU training clusters, solving auditd buffer overflow at high syscall rates (backend.ai blog).
Busted: A Rust+eBPF tool that intercepts decrypted LLM API traffic (OpenAI, Anthropic, Google, MCP JSON-RPC) via uprobes on OpenSSL, with in-kernel blocking via LSM hooks and Rego policies (GitHub).
Causely: A causal reasoning platform using eBPF-based auto-instrumentation to build live service dependency models, augmenting LLMs with structural knowledge to address hallucination in observability AI (causely.ai).
Splunk + Isovalent Network Explorer: Feeds Cilium Tetragon eBPF flow data (enriched with K8s metadata) into Splunk for AI-assisted Kubernetes attack simulation and detection (splunk.com).
DeepFlow: An eBPF-based zero-instrumentation observability platform for AI training/inference (GPU, RDMA, LLM serving) with an LLM-powered diagnostic agent; used by Tencent BlueKing (deepflow.io, GitHub).
BpfJailer (Meta): eBPF-based mandatory access control used at Meta to jail untrusted AI training/inference workloads; presented at Linux Plumbers 2025, open-source planned for 2026 (LPC).
OpenTelemetry eBPF Instrumentation (OBI): Successor to Grafana Beyla, includes GenAI instrumentation for OpenAI, Anthropic, Gemini, Bedrock, Qwen, and MCP tracing via eBPF (Docs).
Datadog GPU Monitoring: Uses eBPF system-probe for per-pod GPU telemetry across AI training/inference workloads, integrated with LLM Observability product (Blog).
eBPF-PATROL: An adaptive eBPF-LSM runtime security agent for containerized AI environments with Deny/Allow/Kill enforcement (arXiv).
CryptoGuard: Deep learning on syscall traces collected via eBPF to detect cryptojacking in Linux cloud environments; tested on 123 real-world malware samples (arXiv, GitHub).
ebpfangel: Ransomware detection using ML (decision tree + MLP) implemented directly in eBPF kernel programs (GitHub).
Tetragon + ML Cryptojacking Detection: Framework using Cilium Tetragon eBPF traces as ML features to classify cryptomining containers with 99.75% accuracy (MDPI Electronics 2025).

B. Zero-Instrumentation GPU Performance Analysis

A practical approach to GPU performance monitoring is to start with eBPF uprobes on user-space libraries like CUDA, supplement this data with NVML/driver metrics, and only then consider more complex device-resident mechanisms.
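
As a rough sketch of the first step, the snippet below pairs entry and return probe events to compute host-side latency per CUDA API call. The event tuple layout and the `call_latencies` helper are assumptions for illustration; a real tracer would receive these records from a ring buffer or perf buffer rather than a Python list.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (kind, tid, symbol, ts_ns): kind is "enter" (uprobe) or "exit" (uretprobe).
# This layout is hypothetical, chosen only for the example.
Event = Tuple[str, int, str, int]


def call_latencies(events: Iterable[Event]) -> Dict[str, List[int]]:
    """Match each exit to the pending entry on the same thread and symbol."""
    pending: Dict[Tuple[int, str], int] = {}
    latencies: Dict[str, List[int]] = defaultdict(list)
    for kind, tid, symbol, ts_ns in events:
        key = (tid, symbol)
        if kind == "enter":
            pending[key] = ts_ns
        elif kind == "exit" and key in pending:
            latencies[symbol].append(ts_ns - pending.pop(key))
        # Exits with no recorded entry (probe attached mid-call) are skipped.
    return dict(latencies)
```

Because kernel launches are asynchronous, the latency measured for a call such as `cudaLaunchKernel` is the time spent in the host API, not the time the kernel runs on the device. That gap is one reason to supplement uprobe data with NVML or driver metrics.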

eGPU: Research prototype that offloads eBPF bytecode onto GPUs via PTX injection—a device-resident path that aligns with AI/GPU workflows (ACM HCDS'25). It's merged into the main branch of bpftime.
CUDA Events Tutorial: A comprehensive guide for tracing specific CUDA GPU operations using eBPF (eunomia.dev).
eACGM: A system that merges eBPF kernel events with NVML device metrics to enable end-to-end performance analysis and fault diagnosis for GPU training (arXiv).
GPUprobe Tutorials: A collection of guides on using eBPF uprobes for zero-instrumentation CUDA API tracing, memory tracking, and kernel launch profiling (DEV Community, Medium).
SysOM-AI (Alibaba): Cross-layer performance diagnosis for production AI training; deployed on 80,000+ GPUs, reduces diagnosis time from days to ~10 minutes using eBPF tracing (arXiv).
gpu_ext: Extends eBPF into GPU drivers/devices as a programmable OS subsystem; up to 4.8x throughput improvement for inference/training/vector-search workloads (arXiv).
Grafana Beyla GPU Support: eBPF auto-instrumentation extended to capture CUDA calls (kernel launches, memory allocations) for AI/ML workload profiling; donated to OpenTelemetry (FOSDEM 2025).
Ingero: An open-source eBPF agent + MCP server for GPU causal observability; traces CUDA/ROCm APIs via uprobes, stores events in SQLite, and exposes them to AI agents through MCP tools (GitHub).

Part 2: eBPF for AI — In-Kernel Data Path Acceleration

This section explores how eBPF's position in the kernel's data path can be leveraged to pre-process and accelerate data flows for ML services.

A. Intelligent Traffic Processing with XDP/TC

The most effective pattern is to use eBPF as a high-performance sensing and pre-filtering layer in the kernel, while offloading heavy ML and LLM inference to user space or dedicated hardware.
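
The split can be modeled in plain Python to show where the boundary sits. In this sketch a dict stands in for a BPF hash map, `on_packet` plays the role of the in-kernel fast path, and `export_features` is what a user-space model would read. The class name, the flow key, and the threshold rule are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

FlowKey = Tuple[str, int]  # (source IP, destination port)


class PreFilter:
    """User-space model of an XDP-style pre-filter (illustrative only)."""

    def __init__(self, max_pkts_per_window: int) -> None:
        self.max_pkts = max_pkts_per_window
        # Per-flow [packets, bytes]; a BPF hash map would hold this in-kernel.
        self.stats: Dict[FlowKey, List[int]] = defaultdict(lambda: [0, 0])

    def on_packet(self, src_ip: str, dst_port: int, length: int) -> str:
        """Cheap per-packet work: update counters, apply a threshold rule."""
        counters = self.stats[(src_ip, dst_port)]
        counters[0] += 1
        counters[1] += length
        return "DROP" if counters[0] > self.max_pkts else "PASS"

    def export_features(self) -> Dict[FlowKey, Tuple[int, float]]:
        """Drain the window as (packet count, mean packet size) per flow."""
        features = {k: (p, b / p) for k, (p, b) in self.stats.items()}
        self.stats.clear()
        return features
```

The per-packet path only increments counters and compares against a constant, while anything that needs floating point or a large model happens on the compact feature vectors exported once per window.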

ML-Powered Traffic Processing: Research demonstrates pipelines that integrate eBPF with ML on commodity hardware for intelligent traffic processing (ACM Digital Library).
SmartX Intelligent Security: A framework that uses a BiLSTM model with eBPF/XDP for high-speed threat detection and real-time packet dropping (arXiv).
In-Kernel vs. User-Space Trade-offs: Studies have analyzed the latency and throughput trade-offs between in-kernel eBPF/XDP and user-space pipelines for packet classification (ScienceDirect).
Intrusion Prevention: Work from DSN'24 shows how eBPF, XDP, and TC can be used to implement real-time intrusion detection and prevention with in-kernel neural networks (DSN 2024).
Kernel-Time Pre-processing: While promising, using eBPF to aggregate events for ML services has shown mixed results, highlighting important performance trade-offs that must be measured carefully (The New Stack).

B. In-Kernel ML Decision Support

Recent advances enable embedding pre-verified ML models directly in the kernel via eBPF, allowing for intelligent kernel-time decisions.

eBPF^ML: A proposal to attach pre-verified ML models via eBPF objects, including matrix-multiply helpers leveraging CPU matrix engines, for kernel-time decisions (ACM Digital Library).
O2C: Demonstrates embedding a decision-tree model inside eBPF to enforce kernel compartmentalization on-the-fly, showing what "tiny ML in eBPF" looks like when verifiable (arXiv).
Flow-based IDS: Baseline decision-tree-in-eBPF implementation for flow classification; useful foil for "sketch-in-kernel, model-in-user-space" approaches (GitHub).
In-Kernel CNN Inference: eBPF-based CNN inference framework for edge devices achieving 86.8% latency reduction vs user-space; for IoT/edge classification (ICA3PP 2025) (Springer).
KerDqn: Deep reinforcement learning congestion control running in-kernel via eBPF struct_ops (IEEE ICNP 2024) (IEEE).
Dynamic Fixed-Point eBPF: Enables ML algorithms in eBPF by introducing dynamic fixed-point arithmetic to overcome floating-point restrictions (AINTEC 2024) (ACM).
Mixture-of-Schedulers (ASA): ML-trained adaptive scheduling agent on sched_ext/eBPF that routes workloads to optimal scheduler policies; outperforms default Linux scheduler in 86% of scenarios (arXiv).
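
Since eBPF programs cannot use floating-point arithmetic, in-kernel models have to run on integers. The sketch below shows plain static Q16.16 fixed-point inference for a small linear classifier. It is not the dynamic fixed-point scheme from the paper listed above, and the helper names and the 16-bit fractional width are assumptions chosen for the example.

```python
from typing import Sequence

FRAC_BITS = 16          # Q16.16: 16 fractional bits (an assumed format)
ONE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Quantize offline, in user space, before loading weights into a map."""
    return int(round(x * ONE))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values. In C this needs a 64-bit intermediate;
    Python integers do not overflow, so overflow is not modeled here."""
    return (a * b) >> FRAC_BITS


def fixed_dot(weights: Sequence[int], features: Sequence[int]) -> int:
    acc = 0
    for w, f in zip(weights, features):
        acc += fixed_mul(w, f)
    return acc


def classify(weights: Sequence[int], bias: int, features: Sequence[int]) -> int:
    """Integer-only decision: 1 if w.x + b > 0, else 0."""
    return 1 if fixed_dot(weights, features) + bias > 0 else 0
```

For example, with weights `[0.5, -1.25]` and bias `0.1` converted through `to_fixed`, the input `[4.0, 1.0]` is classified as 1 and `[2.0, 1.0]` as 0, matching the floating-point result. Inputs close to the decision boundary can flip under quantization, which is the kind of error the choice of fractional width has to account for.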

Part 3: AI for eBPF — Synthesizing and Verifying Kernel Extensions

This section details how AI and LLMs are being used to automate the creation and validation of eBPF programs, making kernel programming more accessible and reliable.

Kgent (KEN): The first LLM-powered eBPF synthesis tool. It incorporates Z3-based symbolic checks and tests to produce more reliable code, achieving ~80% semantic correctness on its test sets (eBPF'24, arXiv).
SimpleBPF: A framework that couples an eBPF DSL with an LLM generator, a semantic checker, and an LLM-based optimizer to consistently emit verifier-friendly programs (ratul.org).
DiffSpec: An LLM-powered differential testing framework that generates tests for eBPF runtimes using natural language specifications; found 4 confirmed eBPF bugs including a kernel memory leak (arXiv).
Agentic OS / LLM Scheduler Agents: An LLM agent framework that synthesizes eBPF-based Linux schedulers via automatic workload characterization (arXiv).
Inspektor Gadget MCP Server: Official MCP server letting AI agents invoke eBPF-based Inspektor Gadget tools via natural language for Kubernetes debugging; the LLM autonomously selects which telemetry to collect (GitHub).
ebpf-mcp: A standalone MCP server exposing structured JSON Schema tools for loading, attaching, and streaming eBPF programs; designed for safe AI agent control without shell escapes (lobehub.com).
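
Several of the synthesis tools above pair a generator with some form of automated checking. A generic generate-check-retry loop can be sketched as follows. This is not the algorithm of any specific tool listed here: `generate` and `check` are hypothetical callables standing in for an LLM call and for a checker such as the kernel verifier or a symbolic test harness.

```python
from typing import Callable, List, Optional, Tuple

# (task description, diagnostics so far) -> candidate program source
Generator = Callable[[str, List[str]], str]
# candidate program source -> (accepted?, diagnostic message)
Checker = Callable[[str], Tuple[bool, str]]


def synthesize(task: str, generate: Generator, check: Checker,
               max_rounds: int = 3) -> Optional[str]:
    """Ask for a candidate, check it, and feed rejections back as context.
    Returns the first accepted candidate, or None if every round fails."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = generate(task, feedback)
        accepted, diagnostic = check(candidate)
        if accepted:
            return candidate
        feedback.append(diagnostic)
    return None
```

Bounding the number of rounds keeps the loop from running indefinitely when the checker's diagnostics do not help the generator converge.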

Community & Venues

AgenticOS Workshop @ ASPLOS 2026: First workshop on OS design for AI agents, covering isolation, scheduling, and observability via eBPF. March 2026, Pittsburgh (os-for-agent.github.io).
3rd Workshop on eBPF and Kernel Extensions @ SIGCOMM 2025: Featured eInfer and other AI+eBPF work (SIGCOMM 2025).


Last updated: Jun 1, 2026
First published: Feb 10, 2025
Contributors: LinuxDev9002, yunwei37, github-actions[bot], 云微, +1 more
