Rust Contexts

… in the context of logging

21 January 2022 — 4 min

I realize I might be a bit late to the party, considering tmandry's blog post is already a month old, and there is now even an initiative focused on contexts.

Still, I want to explore how implicit contexts could solve some very real problems that we have with today's methods of modeling context. As practical examples, I want to highlight these issues using the sentry-rust and tracing APIs.

# Footguns

IMO the currently available APIs are a bit too easy to misuse. An example of this is mentioned directly in the tracing docs:

In asynchronous code that uses async/await syntax, Span::enter may produce incorrect traces if the returned drop guard is held across an await point.

From a sentry point of view, the configure_scope and with_scope/push_scope APIs have a similar potential for misuse. This can cause anything from minor problems, like tags not being applied correctly, to bigger ones, such as a panic if scope manipulation is unbalanced.

# Tradeoffs

In essence, I think these problems stem from the tradeoff of favoring convenience over correctness.

The problem here is that both tracing and sentry-rust use a mixture of static mut (which is almost impossible to use correctly), lazy_static and thread_local.

In both cases, there is one global Hub that is automatically inherited by all newly spawned threads. Each of these threads keeps a mutable current state around. The global hub also needs to be mutable, since you have to initialize it at some point.

This mutability in turn requires the use of way too many Arcs and Mutexes, a fact of sentry-rust internals that I recently discussed on Discord as well.

The convenience we get out of this is that users can just call sentry::capture_event anywhere in the code. Similarly, any library can annotate a function with #[tracing::instrument] and things just work. Well, unless they don’t.

The problems happen when you write something to the current state, but that mutable state is being shared among multiple concurrent async tasks. How can we avoid these footguns?
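To make the failure mode concrete, here is a minimal std-only sketch. It does not use tracing or sentry-rust: `CURRENT`, `enter`, `YieldOnce` and `run_interleaved` are hypothetical stand-ins for a thread-local "current span" stack, an entered-span guard, an await point, and a tiny single-threaded executor. Two tasks hold their guard across an await, and each ends up seeing the other task's span.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    // Hypothetical stand-in for the per-thread "current span" stack.
    static CURRENT: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

/// Guard that pops the top of the stack on drop, like an entered span.
struct Entered;

fn enter(name: &'static str) -> Entered {
    CURRENT.with(|c| c.borrow_mut().push(name));
    Entered
}

impl Drop for Entered {
    fn drop(&mut self) {
        CURRENT.with(|c| {
            c.borrow_mut().pop();
        });
    }
}

fn current() -> Option<&'static str> {
    CURRENT.with(|c| c.borrow().last().copied())
}

/// Future that returns `Pending` exactly once: a single await point.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Returns its own name and whatever it observes as "current" after the await.
async fn task(name: &'static str) -> (&'static str, Option<&'static str>) {
    let _guard = enter(name); // held across the await below: the footgun
    YieldOnce(false).await;
    (name, current())
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls all futures round-robin on the current thread until each is done.
fn run_interleaved<F: Future>(mut futures: Vec<Pin<Box<F>>>) -> Vec<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut results: Vec<Option<F::Output>> = futures.iter().map(|_| None).collect();
    while results.iter().any(Option::is_none) {
        for (i, fut) in futures.iter_mut().enumerate() {
            if results[i].is_none() {
                if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
                    results[i] = Some(out);
                }
            }
        }
    }
    results.into_iter().map(Option::unwrap).collect()
}

fn interleaved_demo() -> Vec<(&'static str, Option<&'static str>)> {
    run_interleaved(vec![Box::pin(task("a")), Box::pin(task("b"))])
}
```

Calling `interleaved_demo()` yields `("a", Some("b"))` and `("b", Some("a"))`: task `a` observes `b` as current and vice versa, and each guard pops the other task's entry on drop. Real executors interleave less predictably, but the underlying problem is the same.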

In the case of tracing, you have to manually instrument() a future. Similarly, in sentry-rust, you have to bind a Hub to the future via bind_hub(). Unfortunately, that is also prone to misuse when dealing with join_all concurrency. The right thing to use here is .bind_hub(Hub::new_from_top(Hub::current())). Well, that is a mouthful, and extremely easy to get wrong.
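The idea behind instrumenting a future can be sketched with std alone: wrap the future so that its context is made current only for the duration of each poll. As far as I understand it, tracing's instrument() works along these lines, but the code below is a toy model, not the library's implementation. `CURRENT`, `Scoped`, `YieldOnce` and `run_interleaved` are hypothetical names.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    // Hypothetical stand-in for the per-thread "current span" stack.
    static CURRENT: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn current() -> Option<&'static str> {
    CURRENT.with(|c| c.borrow().last().copied())
}

/// Wrapper that makes `name` current only while the inner future is polled.
struct Scoped<F> {
    name: &'static str,
    inner: Pin<Box<F>>,
}

impl<F: Future> Future for Scoped<F> {
    type Output = F::Output;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        let name = self.name;
        CURRENT.with(|c| c.borrow_mut().push(name)); // enter on every poll
        let result = self.inner.as_mut().poll(cx);
        CURRENT.with(|c| {
            c.borrow_mut().pop(); // exit before yielding to other tasks
        });
        result
    }
}

/// Future that returns `Pending` exactly once: a single await point.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Reports whatever is "current" after an await point.
async fn task() -> Option<&'static str> {
    YieldOnce(false).await;
    current()
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls all futures round-robin on the current thread until each is done.
fn run_interleaved<F: Future>(mut futures: Vec<Pin<Box<F>>>) -> Vec<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut results: Vec<Option<F::Output>> = futures.iter().map(|_| None).collect();
    while results.iter().any(Option::is_none) {
        for (i, fut) in futures.iter_mut().enumerate() {
            if results[i].is_none() {
                if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
                    results[i] = Some(out);
                }
            }
        }
    }
    results.into_iter().map(Option::unwrap).collect()
}

fn scoped_demo() -> Vec<Option<&'static str>> {
    run_interleaved(vec![
        Box::pin(Scoped { name: "a", inner: Box::pin(task()) }),
        Box::pin(Scoped { name: "b", inner: Box::pin(task()) }),
    ])
}
```

Here each task sees its own name even though both are interleaved on one thread. The catch is that correctness now depends on every future being wrapped, which is exactly the step that is easy to forget.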

Essentially, these issues boil down to shared mutable state, something that the Rust compiler and its borrow checker promise to solve.

# Mutability

Which brings me to the next topic. I believe that just following Rust's normal ownership and borrowing semantics would solve most or all of the outlined problems. We have mutable data, so we should declare it as &mut, and the compiler will tell us where we are tripping up.

struct Context;

async fn uses_mutable_ctx(_ctx: &mut Context) {}

#[tokio::main]
async fn main() {
    let mut ctx = Context;

    // normal calls work just fine
    uses_mutable_ctx(&mut ctx).await;
    uses_mutable_ctx(&mut ctx).await;

    // futures concurrency: nope
    let futures = (0..2).map(|_i| uses_mutable_ctx(&mut ctx));
    futures::future::join_all(futures).await;

    // concurrent tasks: nope
    let _ = tokio::task::spawn(uses_mutable_ctx(&mut ctx)).await;

    // threads: nope
    let _ = std::thread::spawn(|| {
        tokio::runtime::Runtime::new()
            .unwrap()
            .block_on(uses_mutable_ctx(&mut ctx))
    })
    .join();
}

The above example fails to compile for all cases that involve concurrency:

error: captured variable cannot escape `FnMut` closure body
  --> playground\contexts\src\main.rs:15:35
   |
8  |     let mut ctx = Context;
   |         ------- variable defined here
...
15 |     let futures = (0..2).map(|_i| uses_mutable_ctx(&mut ctx));
   |                                 - ^^^^^^^^^^^^^^^^^^^^^^---^
   |                                 | |                     |
   |                                 | |                     variable captured here
   |                                 | returns a reference to a captured variable which escapes the closure body
   |                                 inferred to be a `FnMut` closure
   |
   = note: `FnMut` closures only have access to their captured variables while they are executing...
   = note: ...therefore, they cannot allow references to captured variables to escape

error[E0597]: `ctx` does not live long enough
  --> playground\contexts\src\main.rs:23:49
   |
23 |     let _ = tokio::task::spawn(uses_mutable_ctx(&mut ctx)).await;
   |                                -----------------^^^^^^^^-
   |                                |                |
   |                                |                borrowed value does not live long enough
   |                                argument requires that `ctx` is borrowed for `'static`
...
41 | }
   | - `ctx` dropped here while still borrowed

error[E0499]: cannot borrow `ctx` as mutable more than once at a time
  --> playground\contexts\src\main.rs:28:32
   |
23 |     let _ = tokio::task::spawn(uses_mutable_ctx(&mut ctx)).await;
   |                                --------------------------
   |                                |                |
   |                                |                first mutable borrow occurs here
   |                                argument requires that `ctx` is borrowed for `'static`
...
28 |     let _ = std::thread::spawn(|| {
   |                                ^^ second mutable borrow occurs here
...
31 |             .block_on(uses_mutable_ctx(&mut ctx))
   |                                             --- second borrow occurs due to use of `ctx` in closure

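In all of these cases, the compiler forces us to give each concurrent branch its own context, here by creating an explicit clone (which in turn requires Context to implement Clone).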

// futures concurrency
let futures = (0..2).map(|_i| {
    let mut ctx = ctx.clone();
    async move { uses_mutable_ctx(&mut ctx).await }
});
futures::future::join_all(futures).await;

// concurrent tasks
let mut spawn_ctx = ctx.clone();
let _ = tokio::task::spawn(async move { uses_mutable_ctx(&mut spawn_ctx).await }).await;

// threads
let mut spawn_ctx = ctx.clone();
let _ = std::thread::spawn(move || {
    tokio::runtime::Runtime::new()
        .unwrap()
        .block_on(uses_mutable_ctx(&mut spawn_ctx))
})
.join();

Unfortunately for us, thinking of contexts as implicit function arguments would only solve half of the problem. We would no longer have to thread the context down our whole call chain, but that does not answer the question of what happens when there are forks in the road.

From the compiler's perspective, the rules of shared mutability are quite clear. However, the compiler does not automatically clone for you. And from a user's perspective, what should "cloning" even mean in these cases?

Take a tracing span hierarchy, for example: are you await-ing or join-ing all the spawned tasks/threads? Or are they more “fire and forget”?
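One way to picture the two answers is to give the context two distinct fork operations instead of a single clone. This is purely illustrative: `SpanCtx`, `fork_child` and `fork_detached` are hypothetical names and do not correspond to any tracing or sentry-rust API.

```rust
use std::thread;

/// Hypothetical context: a span id, an optional parent link, and some tags.
#[derive(Debug, Clone, PartialEq)]
struct SpanCtx {
    id: u32,
    parent: Option<u32>,
    tags: Vec<(&'static str, &'static str)>,
}

impl SpanCtx {
    fn root(id: u32) -> Self {
        SpanCtx { id, parent: None, tags: Vec::new() }
    }

    /// For work that is awaited or joined: stays in the parent's hierarchy.
    fn fork_child(&self, id: u32) -> Self {
        SpanCtx { id, parent: Some(self.id), tags: self.tags.clone() }
    }

    /// For "fire and forget" work: inherits tags but starts a new hierarchy.
    fn fork_detached(&self, id: u32) -> Self {
        SpanCtx { id, parent: None, tags: self.tags.clone() }
    }
}

fn fork_demo() -> (SpanCtx, SpanCtx, SpanCtx) {
    let mut root = SpanCtx::root(1);
    root.tags.push(("request", "abc"));

    // Joined work: the child owns its context and links back to the parent.
    let mut child = root.fork_child(2);
    let child = thread::spawn(move || {
        child.tags.push(("worker", "joined"));
        child
    })
    .join()
    .unwrap();

    // Detached work: same starting tags, but no parent link.
    let detached = root.fork_detached(3);

    (root, child, detached)
}
```

In both cases the fork owns its data, so writes in the child never leak back into the parent. What differs is only how the fork relates to the hierarchy, and that is a decision the compiler cannot make for us.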

# Back to Tradeoffs

And here we are, back where we started. We have to choose the right tradeoffs between ease of use and possibility of misuse. Contexts in the sense of implicit function arguments get us quite far in ease of use, though.