Interfacey things in Rust

11 minute read Published: 2023-02-12

Today we are going to look at something that came up a bunch when I finally got to write some thirty hours of Rust at work: how do I decide between different implementations of something in Rust? Big deal, right? It's not, but there can certainly be some syntax involved, and piecing it together from various chapters of books was a pain, so I'm writing it down.

Example

In Rust there is a complexity cliff once you introduce async fn and Futures. The types and annotations get heavier and the compiler errors become much less helpful than the wonderful ones you normally get. So in this example we are going to look at changing out behavior in a component that returns a Future / uses an async fn.

We'll run into the following concepts:

async fns return impl Futures (see here and here). We are going to be focusing on dynamic dispatch. When we return an impl Future we aren't naming a specific type, but promising that the type we return implements this trait, so the caller can use methods from the trait on the object they get back.
- Future is a trait. It is not a type. It has no size and holds no data. It's used as an interface.
Box<dyn Future<Output=SomeType>> means storing a value where all we know is that its type implements this trait. We don't know the size of this object at compile time, but at runtime we can follow a pointer to the actual implementation (think vtables). It's allocated on the heap. The Box is a pointer (of known size), so we can store it on stack frames, where we can only put things of known size.
Pin<Box<dyn Future<Output=SomeType>>> is the same as the above, but we also tell the compiler that the data on the heap can never move. This is out of my wheelhouse but I put some links below for more information. The compiler is making sure you cannot get a plain mutable reference to the stored data (which would let you move it).
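
To make the shapes in the list above concrete, here is a minimal sketch using only the standard library. The names pick, NoopWaker and block_on are made up for illustration; block_on is a tiny stand-in executor so the snippet runs without tokio, and it only makes sense for futures that are ready right away.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Roughly what `async fn identity(i: u32) -> u32` desugars to: the caller only
// learns "some type implementing Future<Output = u32>", never its name.
fn identity(i: u32) -> impl Future<Output = u32> {
    async move { i }
}

async fn square(i: u32) -> u32 {
    i * i
}

// Box erases the concrete future type (dynamic dispatch) and Pin promises the
// boxed future will not move. Both arms now share one type.
fn pick(use_square: bool, i: u32) -> Pin<Box<dyn Future<Output = u32>>> {
    if use_square {
        Box::pin(square(i))
    } else {
        Box::pin(identity(i))
    }
}

// Hypothetical helper (not from the post): polls one future until it is done.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    println!("{}", block_on(identity(10)));
    println!("{}", block_on(pick(true, 10)));
    println!("{}", block_on(pick(false, 10)));
}
```

Without the Box::pin calls, the two arms of the if would be two different anonymous future types and the function would not compile.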

As far as implementing this, in an OOP language this would be Interfaces + subclassing. You could do the same in Scala or you can solve it with: an ADT, passing an anonymous function (the simplest and most general interface in reality), or something with type classes.

We'll use a trivial example for exploring this in Rust

we accept integers
in one implementation we return the integer as is
in another implementation we return the square of the integer
we're going to pretend a bunch of work is happening somewhere else and concurrency is involved so we'll be using async/await and Futures

Fire up a new project and add the following to your Cargo.toml

[package]
name = "async-example"
version = "0.1.0"
edition = "2021"

[dependencies]
async-trait = "0.1.64"
tokio = { version = "1.25.0", features = ["macros", "rt-multi-thread"] }

Option 1 - Enum/ADT

The simplest approach is the Enum/ADT approach followed by some pattern matching. It's not really an option for what we want, but if you can get away with it, it is certainly the lightest on syntax. You have to know all your options at compile time, aka already know all the possible things you are going to construct. It has a number of other drawbacks as well.

enum Printer {
    Identity,
    Square,
}

impl Printer {
    pub async fn do_work(&self, i: u32) -> u32 {
        match &self {
            Printer::Identity => i,
            Printer::Square => i * i,
        }
    }
}

#[tokio::main]
async fn main() {
    let i = 10;
    let case = Printer::Identity;
    println!("one option: {}", case.do_work(i).await);

    let case2 = Printer::Square;
    println!("other option: {}", case2.do_work(i).await);
}

Running this gives us the expected

one option: 10
other option: 100

Pros:

Straightforward, provided you can set up all your cases ahead of time.
Not heavy on the syntax

Cons:

Need to know all your cases ahead of time, not dynamic
Expression Problem. You can add more variants but you have to update every pattern match
Not extensible by third parties. No one else could extend Printer for their own type.
- Even in your own code, what if you wanted a TestPrinter or RecordingPrinter or something, you can't define it just in your tests. It would bleed into your production code.
Not compositional

It works, but not what we need here.
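
To see the "update every pattern match" con in action, here is a small sketch. The Double variant and the name method are made up for illustration, and the methods are synchronous to keep it short.

```rust
// Hypothetical third variant, added after the fact to show the ripple effect.
enum Printer {
    Identity,
    Square,
    Double,
}

impl Printer {
    fn do_work(&self, i: u32) -> u32 {
        match self {
            Printer::Identity => i,
            Printer::Square => i * i,
            // Had to be added here...
            Printer::Double => i * 2,
        }
    }

    fn name(&self) -> &'static str {
        match self {
            Printer::Identity => "identity",
            Printer::Square => "square",
            // ...and here, and in every other match on Printer.
            Printer::Double => "double",
        }
    }
}

fn main() {
    for p in [Printer::Identity, Printer::Square, Printer::Double] {
        println!("{}: {}", p.name(), p.do_work(10));
    }
}
```

The upside is that the compiler rejects any match that forgets the new variant, so at least you can't miss one. The downside is that only code living next to the enum can add a variant at all.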

Option 2 - Trait Objects

Trait objects come up as a (handwavy) way to sort of do interfaces/implementations in Rust. The book above does not do them great justice given this obtuse example text:

However, trait objects are more like objects in other languages in the sense that they combine data and behavior. But trait objects differ from traditional objects in that we can’t add data to a trait object. Trait objects aren’t as generally useful as objects in other languages: their specific purpose is to allow abstraction across common behavior.

TLDR; instead of returning a concrete type you are returning some type that at least implements the trait. This comes at a cost for us:

Traits aren't types, so we can't use a bare trait as the type of a struct field
We want to store the value in a struct rather than just return it from a function, and impl Trait is only allowed in function signatures, so we need dyn Trait, which indicates dynamic dispatch.
dyn Traits are types that we can return, but we don't know their size at compile time so we need to box them and put them on the heap
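
A minimal sketch of the difference, using a synchronous version of the trait to keep the noise down. run_static and Holder are names made up for this illustration.

```rust
trait Printer {
    fn print(&self, i: u32) -> u32;
}

struct IdentityPrinter;
struct SquarePrinter;

impl Printer for IdentityPrinter {
    fn print(&self, i: u32) -> u32 {
        i
    }
}

impl Printer for SquarePrinter {
    fn print(&self, i: u32) -> u32 {
        i * i
    }
}

// Static dispatch: `impl Printer` works in a function signature, and the
// compiler generates one copy of this function per concrete type.
fn run_static(p: impl Printer, i: u32) -> u32 {
    p.print(i)
}

// Dynamic dispatch: one struct type that can hold any implementation.
// `dyn Printer` has no size known at compile time, so it lives behind a Box.
struct Holder {
    printer: Box<dyn Printer>,
}

fn main() {
    let holders = vec![
        Holder { printer: Box::new(IdentityPrinter) },
        Holder { printer: Box::new(SquarePrinter) },
    ];
    for h in &holders {
        println!("{}", h.printer.print(10));
    }
    println!("{}", run_static(SquarePrinter, 3));
}
```

Both Holder values have the exact same type, which is what lets them sit in the same Vec.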

Say I have a struct and I want to embed different implementations of something onto it:

use async_trait::async_trait;

#[async_trait]
pub trait Printer {
    async fn print(&self, i: u32) -> u32;
}

pub struct IdentityPrinter; 
pub struct SquarePrinter; 

#[async_trait]
impl Printer for IdentityPrinter {
    async fn print(&self, i: u32) -> u32 {
        i
    }
}

#[async_trait]
impl Printer for SquarePrinter {
    async fn print(&self, i: u32) -> u32 {
        i * i
    }
}

struct Print<S: Printer> {
    printer: S
}

impl<S: Printer> Print<S> {
    async fn print(&self, i: u32) -> u32 {
        self.printer.print(i).await
    }
}

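You immediately run into the first problem, which is that (at the time of writing) Rust cannot have async functions in traits. You can get around it by using the async-trait crate, which rewrites the functions to return boxed Futures plus an ungodly amount of other annotations. Newer compilers (Rust 1.75+) accept async fn in traits directly, but such traits can't be used as dyn trait objects, so the crate is still what we want here. You can read more on it here if you care.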
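
If you are curious what the macro is doing for you, here is a hand-written approximation using only the standard library. This is a sketch of the general shape as I understand it, not the crate's exact expansion (the real one carries more lifetime bounds). NoopWaker and block_on are made-up helpers so the snippet runs without tokio.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Approximately what `async fn print(&self, i: u32) -> u32` becomes:
// a plain fn that returns a boxed, pinned, dynamically dispatched Future.
trait Printer {
    fn print<'a>(&'a self, i: u32) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>>;
}

struct IdentityPrinter;
struct SquarePrinter;

impl Printer for IdentityPrinter {
    fn print<'a>(&'a self, i: u32) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>> {
        Box::pin(async move { i })
    }
}

impl Printer for SquarePrinter {
    fn print<'a>(&'a self, i: u32) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>> {
        Box::pin(async move { i * i })
    }
}

// Hypothetical helper (not from the post): polls one future until it is done.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let printers: Vec<Box<dyn Printer>> =
        vec![Box::new(IdentityPrinter), Box::new(SquarePrinter)];
    for p in &printers {
        println!("{}", block_on(p.print(10)));
    }
}
```

Because the method returns a boxed Future of known size, the trait can be used behind dyn, at the price of one heap allocation per call.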

This feels identical to what you would do in an OOP language. Does it work?

#[tokio::main]
async fn main() {
    let i = 10;
    let case_one = Print {
        printer: IdentityPrinter
    };

    let case_two = Print {
        printer: SquarePrinter
    };

    println!("{}", case_one.printer.print(i).await);
    println!("{}", case_two.printer.print(i).await);
}

It seems like it does, but it has a dirty little secret: we did not specify the types of case_one and case_two. The compiler has inferred the type of case_one to be Print<IdentityPrinter> and case_two to be Print<SquarePrinter>. While correct, this isn't the whole story. If we wanted to pass this around generically as Print<Printer>, which is what we'll want to do when solving real problems, we are out of luck.

    let case_one: Print<Printer> = Print {
        printer: IdentityPrinter
    };

    let case_two: Print<Printer> = Print {
        printer: SquarePrinter
    };

And we explode with many errors. A shortlist:

error[E0277]: the size for values of type `dyn Printer` cannot be known at compilation time
   --> src/main.rs:132:18
    |
132 |         printer: SquarePrinter
    |                  ^^^^^^^^^^^^^ doesn't have a size known at compile-time
    |
    = help: the trait `Sized` is not implemented for `dyn Printer`
note: required by a bound in `Print`
   --> src/main.rs:111:14
    |
111 | struct Print<S: Printer> {
    |              ^ required by this bound in `Print`
help: you could relax the implicit `Sized` bound on `S` if it were used through indirection like `&S` or `Box<S>`
   --> src/main.rs:111:14
    |
111 | struct Print<S: Printer> {
    |              ^ this could be changed to `S: ?Sized`...
112 |     printer: S
    |              - ...if indirection were used here: `Box<S>`

Doing all the necessary boxing, keywords and syntax, you will end up with the finished implementation

use async_trait::async_trait;

#[async_trait]
pub trait Printer {
    async fn print(&self, i: u32) -> u32;
}

pub struct IdentityPrinter; 
pub struct SquarePrinter; 

#[async_trait]
impl Printer for IdentityPrinter {
    async fn print(&self, i: u32) -> u32 {
        i
    }
}

#[async_trait]
impl Printer for SquarePrinter {
    async fn print(&self, i: u32) -> u32 {
        i * i
    }
}

struct Print<S: Printer + ?Sized> {
    printer: Box<S>
}

impl<S: Printer + ?Sized> Print<S> {
    async fn print(&self, i: u32) -> u32 {
        self.printer.print(i).await
    }
}

#[tokio::main]
async fn main() {
    let i = 10;
    let case_one: Print<dyn Printer> = Print {
        printer: Box::new(IdentityPrinter)
    };

    let case_two: Print<dyn Printer> = Print {
        printer: Box::new(SquarePrinter)
    };

    println!("{}", case_one.printer.print(i).await);
    println!("{}", case_two.printer.print(i).await);
}

Not so bad once you wrap your head around it. We can do typeclasses and pass around a generic version of our typeclass but we need to store it on the heap and deal w/ a bunch of syntax to make it happen. Not the worst.

Pros:

easy to provide new implementations later
- do not need to know them all ahead of time
- can separate out a Test implementation later
- can add new implementations w/o having to update code (no need to change pattern matches, etc.)
other people can extend

Cons:

syntax heavy w/ a bit of indirection (generics + trait bounds, the ?Sized bound, etc.)
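
As a sketch of the "Test implementation" pro: the RecordingPrinter below is a made-up test double that could live entirely in test code, with no change to Print or the production printers. The trait is synchronous here only to keep the example short; the async version has the same structure.

```rust
use std::sync::{Arc, Mutex};

trait Printer {
    fn print(&self, i: u32) -> u32;
}

struct SquarePrinter;

impl Printer for SquarePrinter {
    fn print(&self, i: u32) -> u32 {
        i * i
    }
}

// Same shape as the struct in the post: generic, ?Sized, boxed.
struct Print<S: Printer + ?Sized> {
    printer: Box<S>,
}

impl<S: Printer + ?Sized> Print<S> {
    fn print(&self, i: u32) -> u32 {
        self.printer.print(i)
    }
}

// Hypothetical test double: wraps another Printer and logs every input.
struct RecordingPrinter {
    inner: Box<dyn Printer>,
    seen: Arc<Mutex<Vec<u32>>>,
}

impl Printer for RecordingPrinter {
    fn print(&self, i: u32) -> u32 {
        self.seen.lock().unwrap().push(i);
        self.inner.print(i)
    }
}

fn main() {
    let seen = Arc::new(Mutex::new(Vec::<u32>::new()));
    let print: Print<dyn Printer> = Print {
        printer: Box::new(RecordingPrinter {
            inner: Box::new(SquarePrinter),
            seen: Arc::clone(&seen),
        }),
    };

    println!("{}", print.print(4));
    println!("{:?}", *seen.lock().unwrap());
}
```

Nothing that accepts a Print<dyn Printer> needs to know the recorder exists, which is exactly what the enum approach couldn't give us.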

A more functional approach by passing around a function

My motto from FP work in Scala/Haskell is: when in doubt, pass a function. It is the lightest of interfaces and often a great choice. Rust supports higher order functions so let's give that a whirl.

This is really easy in Scala or Haskell. It looks something like this:

val identity: Int => IO[Int] = i => IO(i)
val square: Int => IO[Int] = i => IO(i*i)

So let's treat our Print struct as a bag of data and throw a field on it that holds onto a function. You see this in Haskell all the time as something called a record of functions. Since we can define functions like any other piece of data, we can pass them around, accept them into methods (vec![1, 2, 3].into_iter().map(|i| i + 1) for instance), put them on structs, etc. Unfortunately, while passing around functions is easy in Rust, passing around functions that return a Future (which is hiding behind every async fn) is anything but.
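
For contrast, here is how light the record of functions is when no Futures are involved. This is a sketch with plain synchronous functions; the third case and the offset variable are made up to show that closures capturing data fit in the same field.

```rust
// A "record of functions": a plain struct whose field is a function value.
struct Print {
    printer: Box<dyn Fn(u32) -> u32>,
}

fn square(i: u32) -> u32 {
    i * i
}

fn main() {
    let offset = 1;
    let cases = vec![
        // A closure...
        Print { printer: Box::new(|i: u32| i) },
        // ...a named function...
        Print { printer: Box::new(square) },
        // ...and a closure that captures some data.
        Print { printer: Box::new(move |i: u32| i + offset) },
    ];

    for case in &cases {
        // Parentheses are needed to call a function stored in a field.
        println!("{}", (case.printer)(10));
    }
}
```

The Box<dyn Fn> is still there, since every closure has its own anonymous type, but that is the whole cost.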

The setup:

struct Print {
    printer: ??? // it's not something simple like Fn(u32) -> u32
}

// alternatively
// let identity = |i: u32| async move { i };
async fn identity(i: u32) -> u32 {
    i
}

// alternatively
// let square = |i: u32| async move { i * i };
async fn square(i: u32) -> u32 {
    i * i
}

#[tokio::main]
async fn main() {
    let i = 10;

    let case_one = Print {
        printer: identity
    };

    let case_two = Print {
        printer: square
    };

    // call the function stored in the field
    println!("{}", (case_one.printer)(i).await);
    println!("{}", (case_two.printer)(i).await);
}

This is an area where the compiler is less nice. You immediately run into problems:

Future is a trait, not a concrete type, so we are back in dyn territory
We'll need to not only Box our Future because we don't know its size at compile time, we'll need to Pin it as well
- I'm still wrapping my head around Pin. Suggested reading: Pin and Suffering
We need to throw the whole thing on the heap

use std::{future::Future, pin::Pin};

// type alias so I can start fitting this on a screen
type PinnedFuture<T> = Pin<Box<dyn Future<Output = T>>>;

struct Print {
    pub print: Box<dyn FnOnce(u32) -> PinnedFuture<u32>>,
}

...

    let case_one = Print {
        print: Box::pin(identity)
    };

    let case_two = Print { 
        print: Box::pin(square)
    };

...

Unfortunately we have three problems here:

There is no implicit sugar to turn an async fn into a FnOnce that returns our PinnedFuture. We need to handle this explicitly.
We need to box the whole thing, including the function.
Rust sees two functions that return the same thing as two different types: every function has its own type, and every async fn returns its own opaque Future type (investigate Futures and opaque types). You run into this if you try to choose between two functions in a match statement or an if statement. So you need an explicit cast to the same type.

Doing that:

    let case_one = Print {
        print: Box::new(|u: u32| Box::pin(identity(u)) as PinnedFuture<u32>)
    };

    let case_two = Print {
        print: Box::new(|u: u32| Box::pin(square(u)) as PinnedFuture<u32>)
    };

And now it works, but wow, at what a cost compared to the Scala solution.
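
The "two different types" problem is easiest to see when you pick a function at runtime. Here is a std-only sketch; choose, NoopWaker and block_on are made-up names, and I use Fn instead of FnOnce so the chosen function can be called more than once.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type PinnedFuture<T> = Pin<Box<dyn Future<Output = T>>>;

async fn identity(i: u32) -> u32 {
    i
}

async fn square(i: u32) -> u32 {
    i * i
}

// Returning `identity` or `square` directly from the two arms does not
// compile: each fn and each returned future is its own anonymous type.
// Boxing the closure and casting the future gives both arms one type.
fn choose(use_square: bool) -> Box<dyn Fn(u32) -> PinnedFuture<u32>> {
    if use_square {
        Box::new(|u: u32| Box::pin(square(u)) as PinnedFuture<u32>)
    } else {
        Box::new(|u: u32| Box::pin(identity(u)) as PinnedFuture<u32>)
    }
}

// Hypothetical helper (not from the post): polls one future until it is done.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let f = choose(true);
    println!("{}", block_on(f(10)));
    let g = choose(false);
    println!("{}", block_on(g(10)));
}
```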

Pros:

passing a function is the lightest of interfaces
easy to construct the functions themselves (con: the wrapping sucks)
opens up possibilities for function composition
easy to extend by third parties

Cons:

syntax heavy with poor error messages from the compiler
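
One way to take some of the sting out of the wrapping, and to cash in on the composition pro, is to do the Box::pin dance once in a helper. This is a sketch under my own assumptions: AsyncFn, boxed, and_then, NoopWaker and block_on are all names I made up, and the futures are not Send, so this would not work as-is with tokio::spawn.

```rust
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type PinnedFuture<T> = Pin<Box<dyn Future<Output = T>>>;
type AsyncFn = Box<dyn Fn(u32) -> PinnedFuture<u32>>;

struct Print {
    print: AsyncFn,
}

async fn identity(i: u32) -> u32 {
    i
}

async fn square(i: u32) -> u32 {
    i * i
}

// Hypothetical helper: boxes any async fn or async closure of the right shape.
fn boxed<F, Fut>(f: F) -> AsyncFn
where
    F: Fn(u32) -> Fut + 'static,
    Fut: Future<Output = u32> + 'static,
{
    Box::new(move |i: u32| Box::pin(f(i)) as PinnedFuture<u32>)
}

// Hypothetical combinator: run `first`, then feed its result to `second`.
fn and_then(first: AsyncFn, second: AsyncFn) -> AsyncFn {
    let second = Rc::new(second);
    Box::new(move |i: u32| {
        let fut = first(i);
        // The returned future must own its own handle to `second`.
        let second = Rc::clone(&second);
        Box::pin(async move { second(fut.await).await }) as PinnedFuture<u32>
    })
}

// Hypothetical helper (not from the post): polls one future until it is done.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let case_one = Print { print: boxed(identity) };
    let case_two = Print { print: and_then(boxed(square), boxed(square)) };

    println!("{}", block_on((case_one.print)(10)));
    println!("{}", block_on((case_two.print)(3)));
}
```

The ugly cast still exists, but it lives in one place instead of at every construction site.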