Polymorphism in Rust Using Arbitrary Self Types

Table of Contents

Back to my homepage

1 Motivating example

Let's say that you had a Rust trait that you want to turn into a trait object. To do that, you need to make sure that the trait is "object safe", (i.e., can be expressed as a data pointer and a pointer to a vtable of concretely typed functions). Here is an example of such a trait, expressing an evaluation node in a hypothetical array-based programming language:

trait Expression {
    fn eval(&self) -> Vec<i32>;
}

Now that we have an object-safe trait, we can create trait objects for it by wrapping implementer structs in a pointer type:

struct Num(i32);

impl Expression for Num {
    fn eval(&self) -> Vec<i32> {
        Vec::new(self.0)
    }
}

fn main() {
    let expr = Box::new(Num(5)) as Box<dyn Expression>;
    assert_eq!(expr.eval(), vec![ 5 ]);
}

This works out nicely for the Num struct because its inner type, i32, implements Copy. Copying an i32 is very efficient.

However, what if we had a struct that was much larger in size?

/// This operator produces a vector of its inputs doubled. 
struct Double(Vec<i32>);

The only way that this operator is going to work is by creating a new Vec to push the doubled elements into:

impl Expression for Double {
    fn eval(&self) -> Vec<i32> {
        self.0.iter().map(|v| v * 2).collect()
        // New Vec allocated here:   ^^^^^^^^^ 
    }
}

And this is basically fine. For most instances, this is going to be okay because you want the trait objects to live for as long as possible. Using this method, an evaluation node can be eval'd as many times as we want:

fn main() {
    let expr = Box::new(Double(vec![ 1, 2, 3, 4, 5 ])) as Box<dyn Expression>;
    assert_eq!(expr.eval(), vec![ 2, 4, 6, 8, 10 ]);
    assert_eq!(expr.eval(), vec![ 2, 4, 6, 8, 10 ]);
    assert_eq!(expr.eval(), vec![ 2, 4, 6, 8, 10 ]);
    assert_eq!(expr.eval(), vec![ 2, 4, 6, 8, 10 ]);
    // etc...
}

This is a pattern you'd typically see with interpreters; code is often stored as functions that are run multiple times.

However, if we only intend to evaluate this once, there is a better solution.

2 An alternate path with arbitrary Self types

As of some time ago - I have no idea when because according to this issue this technique shouldn't even work in stable yet - rust allowed a limited form of arbitrary Self types, presumably so that self: Pin<Self> worked fine.

At another point in time that I think that I vaguely recall seeing the pull request for, the rules on object safety relaxed a bit.

Long story short (I guess), traits with functions that have a signature of type self: Box<Self> are now object safe. The other standard library pointer types work as well. Fundamentally this provides a mechanism to take ownership of data while using run-time polymorphism.

If we re-write our Expression trait as following:

trait Expression {
    fn eval(self: Box<Self>) -> Vec<i32>;
}

Now, any struct that implements Expression takes ownership of its data when eval is run:

impl Expression for Double {
    fn eval(mut self: Box<Self>) -> Vec<i32> {
        for v in self.0.iter_mut() {
            *v *= 2;
        }
        self.0
    }
}

This means that the same heap location will be reused. It is memory efficient.

As a consequence of this, we can only evaluate the expression once. The following:

fn main() {
    let expr = Box::new(Double(vec![ 1, 2, 3, 4, 5 ])) as Box<dyn Expression>;
    assert_eq!(expr.eval(), vec![ 2, 4, 6, 8, 10 ]);
    expr.eval();
}

Results in the obvious compilation error:

error[E0382]: use of moved value: `expr`
  --> src/main.rs:20:5
   |
18 |     let expr = Box::new(Double(vec![ 1, 2, 3, 4, 5 ])) as Box<dyn Expression>;
   |         ---- move occurs because `expr` has type `std::boxed::Box<dyn Expression>`, which does not implement the `Copy` trait
19 |     assert_eq!(expr.eval(), vec![ 2, 4, 6, 8, 10 ]);
   |                ---- value moved here
20 |     expr.eval();
   |     ^^^^ value used here after move

3 Conclusion

I cannot believe that this is available on stable. It's rarely useful, but it is indispensable for ensuring that memory is used most efficiently in certain instances.

There were ways around this but none of them were particularly functional. So this is nice.

Author: Matthew Plant

Email: map@maplant.com

Created: 2020-01-16 Thu 14:30

Validate