eBPF Practice: Tracing User Space Rust Applications with Uprobe

eBPF, or Extended Berkeley Packet Filter, is a revolutionary technology in the Linux kernel that allows developers to run custom "micro-programs" in kernel mode, thus changing system behavior or collecting granular performance data without modifying the kernel code.

This article discusses how to trace user space Rust applications with Uprobe and eBPF, including how to find symbol names and attach to them, how to get function parameters, and how to get return values. It is part of the eBPF developer tutorial; more detailed content can be found here: https://eunomia.dev/tutorials/

The complete source code: https://github.com/eunomia-bpf/bpf-developer-tutorial/tree/main/src/37-uprobe-rust

Uprobe

Uprobe is a user space probe. It allows dynamic instrumentation of user space programs at function entry points, specific offsets, and function return points. When we define a Uprobe, the kernel places a breakpoint instruction (the int3 instruction on x86 machines) at the attached instruction. When the program executes this instruction, it traps into the kernel, and the kernel invokes the probe function as a callback. After the probe function finishes, execution returns to user mode and continues with the subsequent instructions.

Uprobe is useful for parsing traffic in user space that cannot be parsed by kernel probes, such as HTTP/2 and HTTPS traffic, and for analyzing a program's runtime behavior and business logic. For more information about Uprobe, you can refer to:

Running Uprobe in kernel mode eBPF can also introduce significant performance overhead, in which case you might consider a user space eBPF runtime such as bpftime. bpftime is a user space eBPF runtime based on LLVM JIT/AOT. It can run eBPF Uprobe programs in user mode and is compatible with kernel mode eBPF. Because it avoids context switching between user and kernel modes, bpftime's Uprobe overhead is about 10 times lower than the kernel's, and it is also easier to extend.

Rust

Rust is an open-source systems programming language that focuses on safety, speed, and concurrency. Graydon Hoare started it as a personal project in 2006, Mozilla began sponsoring it in 2009 and announced it in 2010, and the first stable version was released in 2015. The design philosophy of Rust is to provide the performance advantages of C++ while greatly reducing memory safety vulnerabilities. Rust is increasingly popular in systems programming, especially in applications that require high performance, security, and reliability, such as operating systems, file systems, game engines, and network services. Many large technology companies, including Mozilla, Google, Microsoft, and Amazon, use or support Rust.

You can refer to the official Rust website to learn more about the language and to install the Rust toolchain.

Simplest example: Symbol name mangling

Let's start with a simple example, tracing the main function of a Rust program with Uprobe, with the code as follows:

pub fn hello() -> i32 {
    println!("Hello, world!");
    0
}

fn main() {
    hello();
}

Build and try to get the symbol:

$ (cd helloworld && cargo build --release)
$ nm helloworld/target/release/helloworld | grep hello
0000000000008940 t _ZN10helloworld4main17h2dce92cb81426b91E

We find that the symbol is _ZN10helloworld4main17h2dce92cb81426b91E rather than a plain main. This is because rustc uses symbol name mangling to encode a unique name for each symbol during code generation. The linker uses the encoded name to associate it with the content it points to. The -C symbol-mangling-version option controls which mangling scheme is used. Note that hello has no symbol of its own here: in a release build it is inlined into main, and grep matches main only because the crate is named helloworld.
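To make the encoding concrete, here is a minimal sketch of a decoder for the simple case seen above. It assumes the legacy layout `_ZN`, then a sequence of length-prefixed identifiers, then `E`, with a trailing `h` plus 16 hex digits as the hash. The function name `demangle_legacy` is hypothetical, and the sketch does not handle escape sequences such as `$LT$` or the newer v0 scheme.

```rust
/// Decode a simple legacy-mangled Rust symbol, for example
/// `_ZN10helloworld4main17h2dce92cb81426b91E` -> `helloworld::main`.
/// Returns `None` when the input does not follow the assumed
/// `_ZN (<len><ident>)+ E` layout.
fn demangle_legacy(symbol: &str) -> Option<String> {
    let body = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut rest = body;
    let mut parts: Vec<&str> = Vec::new();

    while !rest.is_empty() {
        // Each component starts with its length written in decimal digits.
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        if digits == 0 {
            return None;
        }
        let len: usize = rest[..digits].parse().ok()?;
        let end = digits.checked_add(len)?;
        // `get` returns None if the declared length runs past the input.
        let ident = rest.get(digits..end)?;
        parts.push(ident);
        rest = &rest[end..];
    }

    // The last component is usually the hash: 'h' followed by 16 hex digits.
    let ends_with_hash = parts.last().map_or(false, |last| {
        last.len() == 17
            && last.starts_with('h')
            && last.chars().skip(1).all(|c| c.is_ascii_hexdigit())
    });
    if ends_with_hash {
        parts.pop();
    }

    if parts.is_empty() {
        None
    } else {
        Some(parts.join("::"))
    }
}

fn main() {
    let symbol = "_ZN10helloworld4main17h2dce92cb81426b91E";
    match demangle_legacy(symbol) {
        Some(name) => println!("{} -> {}", symbol, name),
        None => println!("{} is not a simple legacy-mangled symbol", symbol),
    }
}
```

For real work a maintained demangler is the better choice; the sketch only illustrates why the crate name and function name are still readable inside the mangled symbol.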

We can use the rustfilt tool to demangle the symbol and recover the original name. This tool can be installed with cargo install rustfilt:

$ cargo install rustfilt
$ nm helloworld/target/release/helloworld > name.txt
$ rustfilt _ZN10helloworld4main17h2dce92cb81426b91E
helloworld::main
$ rustfilt -i name.txt | grep hello
0000000000008b60 t helloworld::main

Next we can try to use bpftrace to trace the corresponding function:

$ sudo bpftrace -e 'uprobe:helloworld/target/release/helloworld:_ZN10helloworld4main17h2dce92cb81426b91E { printf("Function hello-world called\n"); }'
Attaching 1 probe...
Function hello-world called

Tracing function calls with multiple invocations and getting return values

Next, consider a more complex example that involves multiple calls and retrieving return values:

use std::env;

pub fn hello(i: i32, len: usize) -> i32 {
    println!("Hello, world! {} in {}", i, len);
    i + len as i32
}

fn main() {
    let args: Vec<String> = env::args().collect();

    // Skip the first argument, which is the path to the binary, and iterate over the rest
    for arg in args.iter().skip(1) {
        match arg.parse::<i32>() {
            Ok(i) => {
                let ret = hello(i, args.len());
                println!("return value: {}", ret);
            }
            Err(_) => {
                eprintln!("Error: Argument '{}' is not a valid integer", arg);
            }
        }
    }
}

First, we need to build in debug mode because the hello function gets inlined in release mode and won't have its own symbol:

$ cd args
$ cargo build
$ nm target/debug/helloworld | grep hello
0000000000016250 t _ZN10helloworld4main17ha3594bca2af541f6E
0000000000016540 t _ZN10helloworld5hello17h5f3a03dda56661e1E

Note that in release mode (cargo build --release), only the main function symbol appears because hello gets inlined during optimization.
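If you want hello to keep its own symbol in an optimized build, one option is to ask the compiler not to inline it. The following is a minimal sketch that reuses the hello function from above with a simplified main. `#[inline(never)]` is a standard Rust attribute, but it is a hint to the optimizer rather than a guarantee, and symbol stripping settings also matter, so it is worth confirming the result with nm after building.

```rust
// Ask the optimizer not to inline `hello`, so that a release build is
// more likely to keep a separate symbol that a uprobe can attach to.
#[inline(never)]
pub fn hello(i: i32, len: usize) -> i32 {
    println!("Hello, world! {} in {}", i, len);
    i + len as i32
}

fn main() {
    // Simplified driver: the original example parses its arguments instead.
    let ret = hello(1, 5);
    println!("return value: {}", ret);
}
```

The attribute does not change the mangled name, so the same bpftrace patterns apply; it only affects whether the function body exists as a separate symbol.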

Now we can trace the hello function using its symbol. Since Rust mangles symbol names and appends a hash that can change between builds (for example, when the compiler version or crate metadata changes), we use a wildcard pattern to match any version of the hello function:

$ sudo bpftrace -e 'uprobe:target/debug/helloworld:_ZN10helloworld5hello* { printf("Function hello called\n"); }'
Attaching 1 probe...
Function hello called
Function hello called
Function hello called
Function hello called

When we run the program with multiple arguments in another terminal, bpftrace reports every call to the hello function:

$ ./target/debug/helloworld 1 2 3 4
Hello, world! 1 in 5
return value: 6
Hello, world! 2 in 5
return value: 7
Hello, world! 3 in 5
return value: 8
Hello, world! 4 in 5
return value: 9
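The `5` in each line is `args.len()`, which counts the path to the binary as well as the four numbers, so each return value is the argument plus 5. A small sketch of that arithmetic, using a hypothetical helper `hello_returns` that mirrors the loop in main without printing:

```rust
/// Mirror the loop in `main`: skip the binary path, parse each remaining
/// argument as an i32, and compute what `hello(i, args.len())` returns.
/// Arguments that are not valid integers are skipped, as in the original.
fn hello_returns(args: &[&str]) -> Vec<i32> {
    args.iter()
        .skip(1)
        .filter_map(|arg| arg.parse::<i32>().ok())
        .map(|i| i + args.len() as i32)
        .collect()
}

fn main() {
    let args = ["./target/debug/helloworld", "1", "2", "3", "4"];
    // args.len() is 5 here: the binary path plus four numbers.
    println!("{:?}", hello_returns(&args));
}
```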

We can also get the return value using Uretprobe. Again, we use a wildcard to match the symbol:

$ sudo bpftrace -e 'uretprobe:target/debug/helloworld:_ZN10helloworld5hello* { printf("Function hello returned: %d\n", retval); }'
Attaching 1 probe...
Function hello returned: 6
Function hello returned: 7
Function hello returned: 8
Function hello returned: 9

Note: The wildcard pattern _ZN10helloworld5hello* matches the hello function symbol regardless of the hash suffix that Rust adds during compilation. You can use nm target/debug/helloworld | grep hello to see the exact symbol names if needed.
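The stable part of the wildcard pattern can be derived from the function's path. The sketch below assumes the legacy mangling layout (`_ZN` followed by length-prefixed identifiers); the helper names `legacy_prefix` and `matching_symbols` are hypothetical. It builds the prefix and filters a symbol list by it, which is roughly what the trailing `*` does.

```rust
/// Build the hash-independent prefix of a legacy-mangled symbol,
/// for example ["helloworld", "hello"] -> "_ZN10helloworld5hello".
fn legacy_prefix(path: &[&str]) -> String {
    let mut out = String::from("_ZN");
    for part in path {
        // Each identifier is preceded by its length in decimal.
        out.push_str(&part.len().to_string());
        out.push_str(part);
    }
    out
}

/// Keep only the symbols that start with the prefix for `path`,
/// similar to appending `*` to the prefix in a bpftrace probe.
fn matching_symbols<'a>(symbols: &[&'a str], path: &[&str]) -> Vec<&'a str> {
    let prefix = legacy_prefix(path);
    symbols
        .iter()
        .copied()
        .filter(|symbol| symbol.starts_with(prefix.as_str()))
        .collect()
}

fn main() {
    // Symbol names taken from the nm output shown earlier.
    let symbols = [
        "_ZN10helloworld4main17ha3594bca2af541f6E",
        "_ZN10helloworld5hello17h5f3a03dda56661e1E",
    ];
    let path = ["helloworld", "hello"];
    println!("pattern: {}*", legacy_prefix(&path));
    for symbol in matching_symbols(&symbols, &path) {
        println!("match: {}", symbol);
    }
}
```

Because the match is by prefix, the pattern can also match items nested inside hello, such as closures, if the binary contains any. Checking the nm output shows exactly which symbols a given pattern covers.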
