How a GraphQL Dataloader works
It has been a few years since I last touched GraphQL, but I have been digging into it again these recent weeks.
For those who have never heard of GraphQL, it is a standard for querying arbitrary data from a server. In contrast to REST, and its OpenAPI specification, it allows more freedom in choosing which data to query.
In REST/OpenAPI, you have a fixed set of routes to request specific data. You usually get back a predetermined set of fields, some of which you might not even be interested in. And for related objects, you would have to send more requests.
This is typically called the `1 + N` query problem.
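To make the cost concrete, here is a minimal sketch of the REST-style access pattern (the function names and data are made up for illustration): one request fetches the list, and then each item costs one more request.

```rust
// Hypothetical stand-ins for HTTP calls; each one bumps a round-trip counter.
fn fetch_authors(request_count: &mut u32) -> Vec<&'static str> {
    *request_count += 1; // one request for the author list
    vec!["Hermann Hesse", "Thomas Mann"]
}

fn fetch_books_for(author: &str, request_count: &mut u32) -> Vec<&'static str> {
    *request_count += 1; // one additional request *per author*
    match author {
        "Hermann Hesse" => vec!["Siddhartha", "Das Glasperlenspiel"],
        "Thomas Mann" => vec!["Zauberberg"],
        _ => vec![],
    }
}

fn fetch_library(request_count: &mut u32) -> Vec<(&'static str, Vec<&'static str>)> {
    let mut library = Vec::new();
    for author in fetch_authors(request_count) {
        let books = fetch_books_for(author, request_count);
        library.push((author, books));
    }
    library
}

fn main() {
    let mut requests = 0;
    let library = fetch_library(&mut requests);
    // 1 request for the list + N requests for the related objects
    assert_eq!(requests, 1 + library.len() as u32);
}
```

With two authors this makes three round-trips; with a thousand authors it makes a thousand and one.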
GraphQL, on the other hand, does not give you fixed routes, but rather a data model/schema, and allows you to request related objects in one query.
It is thus possible to avoid the `1 + N` query problem, if done right.
However, it is often quite cumbersome to get right. In the worst case, you might only manage to move the `1 + N` problem from the client to the server.
Aside: you can actually make it worse still, if the client itself is doing way more requests than necessary.
GraphQL on its own makes it possible to request a ton of data, related or unrelated, in a single query/request.
But the client itself must be sophisticated enough to actually take advantage of that.
If, for example, each widget on a website requests its own (overlapping) subset of data, you end up with an `N * M` problem instead.
But I digress.
Now, the Dataloader pattern is one way to solve the server-side `1 + N` problem.
I am convinced that the best way to learn how something works, and to understand it, is to build it from scratch from first principles.
And that is exactly what we will be doing today.
# Setting the stage with an example
To make this whole problem more approachable, let's use a somewhat real-world example.
My wife is currently reading her way through a bunch of well-known German authors and books, so let's use that as our example.
I am using the `async-graphql` crate for the server side, and the `cynic` crate for the client.
If you want to follow along, you can find the complete code (and history) over at GitHub.
As a side note, `async-graphql` has a `dataloader` feature itself, and there is a standalone `dataloader` crate as well.
But as the goal for today is to learn how it works, we will implement our own anyway :-)
Here is the simplified Rust code for the server:
```rust
pub struct Library;

#[Object]
impl Library {
    async fn authors(&self, _ctx: &Context<'_>) -> Vec<Author> { … }
}

#[derive(SimpleObject)]
#[graphql(complex)]
pub struct Author {
    name: &'static str,
    born: &'static str,
}

#[ComplexObject]
impl Author {
    async fn books(&self, _ctx: &Context<'_>) -> Vec<Book> { … }
}

#[derive(SimpleObject)]
#[graphql(complex)]
pub struct Book {
    title: &'static str,
}

#[ComplexObject]
impl Book {
    async fn summary(&self, _ctx: &Context<'_>) -> String { … }
}
```
And the above Rust code generates the following GraphQL Schema:
```graphql
schema {
  query: Library
}

type Library {
  authors: [Author!]!
}

type Author {
  name: String!
  born: String!
  books: [Book!]!
}

type Book {
  title: String!
  summary: String!
}
```
So this schema defines all the fields that we can query.
We can also see that not all the fields are present as actual fields in the Rust code.
Some of them are implemented as an `async fn`, which GraphQL calls a resolver.
# The `1 + N` problem

This is where the `1 + N` problem is happening.
Our data model and data look like this:
```rust
pub struct Author {
    pub name: &'static str,
    pub born: &'static str,
}

pub struct Book {
    pub title: &'static str,
    pub author: &'static str,
}

pub static ALL_AUTHORS: &[Author] = &[
    Author {
        name: "Hermann Hesse",
        born: "1877-07-02",
    },
    Author {
        name: "Thomas Mann",
        born: "1875-06-06",
    },
];

pub static ALL_BOOKS: &[Book] = &[
    Book {
        title: "Siddhartha",
        author: "Hermann Hesse",
    },
    Book {
        title: "Das Glasperlenspiel",
        author: "Hermann Hesse",
    },
    Book {
        title: "Zauberberg",
        author: "Thomas Mann",
    },
];
```
In this example, we have a flat list of all the books, similar to how this might look in a relational database. So in order to fetch only the books by Hermann Hesse, we would have to filter this list down.
This is what happens in our `Author.books` resolver:
```rust
async fn books(&self, _ctx: &Context<'_>) -> Vec<Book> {
    println!("starting to resolve Books by `{}`", self.name);
    let books = ALL_BOOKS
        .iter()
        .filter(|b| b.author == self.name)
        .map(|b| Book { title: b.title })
        .collect();
    println!("finished resolving Books by `{}`", self.name);
    books
}
```
Let's assume we want to resolve all the authors in the library, including a list of their books, and a summary for each of those.
This is how such a query would look using the `cynic` client:
```rust
#[derive(cynic::QueryFragment, Debug)]
pub struct Library {
    pub authors: Vec<Author>,
}

#[derive(cynic::QueryFragment, Debug)]
pub struct Book {
    pub title: String,
    pub summary: String,
}

#[derive(cynic::QueryFragment, Debug)]
pub struct Author {
    pub name: String,
    pub born: String,
    pub books: Vec<Book>,
}
```
This is pretty much a mirror of our server-side schema, and results in a very simple GraphQL query:
```graphql
{ authors { name books { title summary } } }
```
You can see how GraphQL makes it very simple to express these queries, and the schema behind them.
But let's get back to the `1 + N` problem.
Running the above query, we can see the following being logged on the server for this request:
```
starting to resolve authors
finished resolving authors
starting to resolve Books by `Hermann Hesse`
finished resolving Books by `Hermann Hesse`
starting to resolve summary for `Siddhartha`
finished resolving summary for `Siddhartha`
starting to resolve summary for `Das Glasperlenspiel`
finished resolving summary for `Das Glasperlenspiel`
starting to resolve Books by `Thomas Mann`
finished resolving Books by `Thomas Mann`
starting to resolve summary for `Zauberberg`
finished resolving summary for `Zauberberg`
```
As we can see from this log output, the `authors` resolver is called once, but the `books` and `summary` resolvers are run for each author and each book respectively, making the whole query effectively `1 + N (authors) + M (books)`.
# Make it async
Well, this is not ideal, so let's work our way towards a solution one step at a time.
The first step here is to make things async. In real-world code, this would be hitting a database or an external API anyway.
To simulate async work, we will simply introduce a `yield_now` at the beginning of the resolver, like so:
```rust
async fn books(&self, _ctx: &Context<'_>) -> Vec<Book> {
    println!("starting to resolve Books by `{}`", self.name);
    yield_now().await;
    println!("actually resolving Books by `{}`", self.name);
    …
}
```
With this in place, the order of our log output changes slightly, and I have grouped the lines a bit to make it clearer:
```
starting to resolve authors
finished resolving authors
starting to resolve Books by `Hermann Hesse`
starting to resolve Books by `Thomas Mann`
actually resolving Books by `Hermann Hesse`
finished resolving Books by `Hermann Hesse`
starting to resolve summary for `Siddhartha`
starting to resolve summary for `Das Glasperlenspiel`
actually resolving Books by `Thomas Mann`
finished resolving Books by `Thomas Mann`
starting to resolve summary for `Zauberberg`
actually resolving summary for `Siddhartha`
finished resolving summary for `Siddhartha`
actually resolving summary for `Das Glasperlenspiel`
finished resolving summary for `Das Glasperlenspiel`
actually resolving summary for `Zauberberg`
finished resolving summary for `Zauberberg`
```
We can now clearly see that just by using `async` and running the resolvers concurrently, things end up naturally being grouped into three distinct phases: one resolving the authors and starting to resolve their books, and then a next phase that starts to resolve the book summaries.
And all of this happened magically through the inner workings of `async`:
- We have a list of `authors`, and want to resolve their `books`.
- The resolver is implemented as an `async fn` and so returns a `Future`.
- The `async-graphql` crate internally uses `join_all` to run all those `Future`s concurrently.
- What this means is that each future is being `poll`ed one after the other.
- And because neither of them resolves immediately (returning `Poll::Pending` instead of `Poll::Ready`), we end up in a situation where the `join_all` concurrency has magically grouped the first calls to `poll`, and will then call `poll` again on all pending futures on a next iteration.
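The steps above can be demonstrated without any framework. Here is a std-only sketch (all names are mine, not from `async-graphql`): each hand-written future returns `Poll::Pending` on its first poll, like `yield_now` does, and a naive `join_all`-style loop polls every still-pending future once per round. All the "starting" lines land in the first round, before any actual resolving happens.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

type Log = Rc<RefCell<Vec<String>>>;

// A resolver stand-in that yields once before producing its result.
struct Resolver {
    name: &'static str,
    yielded: bool,
    log: Log,
}

impl Future for Resolver {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if !self.yielded {
            self.log.borrow_mut().push(format!("starting {}", self.name));
            self.yielded = true;
            Poll::Pending // like `yield_now().await` on the first poll
        } else {
            self.log.borrow_mut().push(format!("resolving {}", self.name));
            Poll::Ready(())
        }
    }
}

// A minimal no-op waker, just enough to drive polls by hand.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// A naive `join_all`: poll every still-pending future once per round.
fn drive(mut futures: Vec<Pin<Box<dyn Future<Output = ()>>>>) {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    while !futures.is_empty() {
        futures.retain_mut(|f| f.as_mut().poll(&mut cx).is_pending());
    }
}

fn run() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let futures: Vec<Pin<Box<dyn Future<Output = ()>>>> = ["Hesse", "Mann"]
        .into_iter()
        .map(|name| {
            Box::pin(Resolver { name, yielded: false, log: Rc::clone(&log) })
                as Pin<Box<dyn Future<Output = ()>>>
        })
        .collect();
    drive(futures);
    Rc::try_unwrap(log).unwrap().into_inner()
}

fn main() {
    // Both "starting" lines come before either "resolving" line:
    // the first poll round groups the starts, the second round does the work.
    println!("{:?}", run());
}
```

The two poll rounds are exactly the "phases" visible in the log output above.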
And this is fundamentally the essence of how a `Dataloader` works.
By not resolving things immediately, but rather delaying the actual resolution a bit, it is able to first group together all the resolve calls, before actually executing them as a batch.
As a side note, the `Dataloader` in `async-graphql` literally uses a `delay` with a default of 1 ms, whereas the `dataloader` crate uses a customizable function that defaults to calling `yield_now` a bunch of times.
# Introducing the `Dataloader` abstraction

Now that we know the fundamentals of how a `Dataloader` works, let's start by actually creating one, and using it within our resolvers.
In our simple example, we already have two kinds of related objects which we want to load: books, and summaries.
For this reason, the `Dataloader` has to be generic over what we want to load, and obviously also over the actual implementation of how to load things in bulk.
So we will define a trait `BatchLoader`, and the actual `DataLoader`, like so (simplified):
```rust
pub trait BatchLoader {
    type K: Hash + Eq + Clone;
    type V: Clone;

    async fn load_batch(&mut self, keys: &[Self::K]) -> HashMap<Self::K, Self::V>;
}

pub struct DataLoader<B> { … }

impl<B> DataLoader<B>
where
    B: BatchLoader,
    B::K: Debug,
{
    pub fn load(&self, key: B::K) -> impl Future<Output = B::V> { … }
}
```
And within our resolvers, we will use this `DataLoader` instead of manually resolving things:
```rust
async fn books(&self, ctx: &Context<'_>) -> Vec<Book> {
    ctx.data_unchecked::<DataLoader<LoadBooks>>()
        .load(self.name)
        .await
}
```
In order for this to work, we also have to attach an actual `DataLoader` instance to the GraphQL `Context`.
We can do that by attaching a fresh `DataLoader` for each GraphQL request, like so:
```rust
type FullSchema = Schema<Library, EmptyMutation, EmptySubscription>;

async fn graphql_handler(State(schema): State<FullSchema>, req: GraphQLRequest) -> GraphQLResponse {
    let req = req
        .into_inner()
        .data(DataLoader::new(LoadBooks))
        .data(DataLoader::new(LoadSummaries));
    schema.execute(req).await.into()
}

pub fn make_app() -> Router {
    let schema = Schema::build(Library, EmptyMutation, EmptySubscription).finish();

    Router::new()
        .route("/", get(graphiql).post(graphql_handler))
        .with_state(schema)
}
```
As you can see, the `State` here is provided by `axum`, which calls our `graphql_handler` for each incoming request.
The code for the `BatchLoader`s themselves is pretty much the previous resolver code, but adapted to work with batches:
```rust
pub struct LoadBooks;

impl BatchLoader for LoadBooks {
    type K = &'static str;
    type V = Vec<Book>;

    async fn load_batch(&mut self, keys: &[Self::K]) -> HashMap<Self::K, Self::V> {
        let debug_keys = keys.join("`, `");
        println!("actually resolving Books by `{debug_keys}`");

        let mut books: HashMap<_, Vec<Book>> = HashMap::with_capacity(keys.len());
        for book in ALL_BOOKS {
            if keys.contains(&book.author) {
                books
                    .entry(book.author)
                    .or_default()
                    .push(Book { title: book.title });
            }
        }

        println!("finished resolving Books by `{debug_keys}`");
        books
    }
}
```
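Stripped of the GraphQL machinery, the batching logic can be restated as a plain function over the same data (a std-only sketch, with `Book` reduced to plain strings): one pass over `ALL_BOOKS` answers every key in the batch at once, instead of one filtering pass per author.

```rust
use std::collections::HashMap;

struct Book {
    title: &'static str,
    author: &'static str,
}

static ALL_BOOKS: &[Book] = &[
    Book { title: "Siddhartha", author: "Hermann Hesse" },
    Book { title: "Das Glasperlenspiel", author: "Hermann Hesse" },
    Book { title: "Zauberberg", author: "Thomas Mann" },
];

// Group all books for the requested authors in a single pass.
fn load_batch(keys: &[&'static str]) -> HashMap<&'static str, Vec<&'static str>> {
    let mut books: HashMap<&'static str, Vec<&'static str>> =
        HashMap::with_capacity(keys.len());
    for book in ALL_BOOKS {
        if keys.contains(&book.author) {
            books.entry(book.author).or_default().push(book.title);
        }
    }
    books
}

fn main() {
    let books = load_batch(&["Hermann Hesse", "Thomas Mann"]);
    // One iteration over the data served both keys.
    assert_eq!(books["Hermann Hesse"], ["Siddhartha", "Das Glasperlenspiel"]);
    assert_eq!(books["Thomas Mann"], ["Zauberberg"]);
}
```

In a real application this single pass would become one database query with a `WHERE author IN (…)` clause, which is exactly the win the batching buys us.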
The calls to `data_unchecked` look a bit out of place, in my opinion.
One option we have is to move that code out and define a nicer method for it using an extension trait:
```rust
pub trait Loaders {
    async fn load_books(&self, name: &'static str) -> Vec<Book>;
    async fn load_summary(&self, title: &'static str) -> String;
}

impl Loaders for Context<'_> {
    async fn load_books(&self, name: &'static str) -> Vec<Book> {
        self.data_unchecked::<DataLoader<LoadBooks>>()
            .load(name)
            .await
    }

    async fn load_summary(&self, title: &'static str) -> String {
        self.data_unchecked::<DataLoader<LoadSummaries>>()
            .load(title)
            .await
    }
}
```
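For readers unfamiliar with extension traits, here is the pattern in isolation (the trait and method here are made up for illustration): Rust lets us add methods to a type we don't own by defining our own trait and implementing it for that foreign type.

```rust
// Our own trait; the methods we wish the foreign type had.
trait Shout {
    fn shout(&self) -> String;
}

// `str` is a foreign (std) type, but the trait is ours, so this impl is allowed.
impl Shout for str {
    fn shout(&self) -> String {
        format!("{}!", self.to_uppercase())
    }
}

fn main() {
    // The method now reads as if `str` always had it.
    println!("{}", "hermann hesse".shout());
}
```

This is exactly what `Loaders` does for `Context<'_>` above: the trait is ours, the type belongs to `async-graphql`.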
That way, we can just call `ctx.load_summary(self.title).await` within our resolver.
This just hides a bit of complexity behind a nicer method.
So far, we have just moved all the resolver code behind the `DataLoader` abstraction, but we haven't started implementing the `DataLoader` itself.
I have just stubbed out the actual logic, and forwarded all the calls directly to the `load_batch` function.
This shim implementation uses a `Mutex<BatchLoader>` in the background, which is why you see some `lock` indirection:
```rust
pub fn load(&self, key: B::K) -> impl Future<Output = B::V> {
    async move {
        println!("starting to resolve related objects for `{key:?}`");
        yield_now().await;

        let mut inner = self.inner.lock().await;
        inner
            .load_batch
            .load_batch(&[key.clone()])
            .await
            .get(&key)
            .unwrap()
            .clone()
    }
}
```
The only reason I did this for now is to make sure that introducing the abstraction itself continues to work just like before, and it does:
```
starting to resolve authors
finished resolving authors
starting to resolve related objects for `"Hermann Hesse"`
starting to resolve related objects for `"Thomas Mann"`
actually resolving Books by `Hermann Hesse`
finished resolving Books by `Hermann Hesse`
starting to resolve related objects for `"Siddhartha"`
starting to resolve related objects for `"Das Glasperlenspiel"`
actually resolving Books by `Thomas Mann`
finished resolving Books by `Thomas Mann`
starting to resolve related objects for `"Zauberberg"`
actually resolving summaries for `Siddhartha`
finished resolving summaries for `Siddhartha`
actually resolving summaries for `Das Glasperlenspiel`
finished resolving summaries for `Das Glasperlenspiel`
actually resolving summaries for `Zauberberg`
finished resolving summaries for `Zauberberg`
```
The log output looks a bit different, but the results are still the same. So far so good.
While we are not yet at the point where we have actually implemented the `DataLoader` ourselves, we have learned that it fundamentally piggybacks on async concurrency to do what it does.
And we also ended up building a small GraphQL server, and learned how to modify it so that it can use some `DataLoader` implementation, like the one from `async-graphql` or the `dataloader` crate that I mentioned previously.
But this blog post is already quite long, and I will defer the actual implementation to a followup post, so stay tuned :-)