Rust for Beginners: Cleaning-up your Code with the “?” Operator
This is the last article in a three-part series where we take a really deep dive into the fundamental Option<T>
and Result<T,E>
type classes and how to idiomatically (cleanly and elegantly) use them to keep your code crisp, concise, and safe. In the first article, we dipped our toes into using pattern matching to work with the values that are tucked inside the classes without constantly testing-and-extracting the values with panic-prone .unwrap()
calls. In the second article we introduced .map(fn)
and .flat_map(fn)
(the latter of which is sometimes oddly called .and_then(fn)
), and we showed the true power of “Happy Path Programming”.
All of these concepts are deeply steeped in Functional Programming (and really they originate from Category Theory in Mathematics), and although FP programmers may find them delightful to work with, their way of programming by chaining lots of .map
and .flat_map
calls can look “less-than-aesthetic” to the traditional imperative programmer.
By the way, if you landed on this article without reading the first two, I highly suggest you start at the beginning, because I’m going to be building on top of the conceptual foundation that we’ve carefully laid down.
What is this “?” Operator Anyway?
As a beginning Rust developer, I have to admit I’d been a bit baffled by this question mark that I would find at the end of certain lines. I read the official documentation as well as this section in Rust by Example and still found myself confused enough as to postpone trying to play with it.
It wasn’t until I started writing this very blog series that the light bulb went on! As I mentioned in the first article, I had written a three-part article on using Option, Try, and Either in Scala that I was very fond of. As I was trying to write the ninth article in my Embedded Rust blog series, I realized I was starting to use some patterns that the Rust beginner would probably find confusing. When I tried to explain what I was doing, I came to the realization that I was cramming all of these concepts in. The article was getting unwieldy. It was time to retell my original tale in a little Rust mini-series!
When I embarked on translating the third and last part—talking about how to use these interesting things called “for-comprehensions” to keep code clean, readable, and concise—I didn’t think I was going to be able to show this compelling and aesthetically pleasing trick in the Rust world. I did some Google searching on Rust and for-comprehensions in the hope that I’d been missing something, and it landed me on an epiphany: this crazy question-mark operator is the equivalent of Scala’s for-comprehensions!
As I worked through some code examples for this last part in the series, I was thrilled to see that I got this concept working.
Equivalence of Rust “let” statements and function mapping
Let me first start by stating that let
statements are often themselves a type of syntactic sugar for the developer. If I have the following lines:
let x: i32 = 47;
let y: i32 = x + 1;
println!("{}",y);
The compiler isn’t necessarily allocating two 4-byte chunks of memory on the stack and filling one with 47 and the other with 48. It’s more likely realizing that it can skip the assignment to x
and go straight to the addition step. It’s almost like you could write it out like this:
println!("{}", Identity(47).map(|x| x + 1));
(Pretend there’s an Identity wrapper that doesn’t do anything but lets you call .map
on a basic type.) My point is that once you define y based on x and never touch x again, x was only ever a temporary placeholder for a value. (Strictly speaking, an i32 is Copy, so x isn’t moved or “consumed” here; we simply don’t need it anymore.) The compiler knows that we don’t need to reuse its state later. Heck, it might even see the constant addition and compute the 48 value during compilation.
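To make the pretend Identity wrapper a little more concrete, here is a minimal sketch of what it could look like. Identity, into_inner, with_lets, and with_map are all hypothetical names invented for this illustration; nothing like this exists in the standard library.

```rust
// Hypothetical do-nothing wrapper; it only exists so we can call .map on a plain value.
struct Identity<T>(T);

impl<T> Identity<T> {
    // Apply a function to the wrapped value and re-wrap the result.
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Identity<U> {
        Identity(f(self.0))
    }

    // Unwrap the value again so it can be printed or compared.
    fn into_inner(self) -> T {
        self.0
    }
}

// The "let" version from the text.
fn with_lets() -> i32 {
    let x: i32 = 47;
    let y: i32 = x + 1;
    y
}

// The same computation written as a mapping over the wrapper.
fn with_map() -> i32 {
    Identity(47).map(|x| x + 1).into_inner()
}
```

Both functions compute the same value; the only difference is whether the intermediate step is named with a let or passed along through a closure.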
My point is this: all those examples in the previous article where I used all the Functional Programming style of function chaining could be broken down equivalently into a series of let
statements.
I’m going to start with an example of chained .map
and .and_then
calls, and then work backwards to show how it can be converted into a bunch of simple let
statements!
Example: defining my “test” function
I want a special function fn test(n: i32) -> Option<i32>
that performs the following logic:
- If you pass it any odd number, it will return None.
- If you pass it any negative number, it will return None.
- For any remaining (non-negative even) values, it will return that value incremented by 1.
Yes, I could simply write this function like this:
fn test(n: i32) -> Option<i32> {
if (n % 2 == 0) && (n >= 0) { Some(n+1) } else { None }
}
But I want a simple code example that’s chaining Options, so let’s say there’s a reason why I need to perform the above three requirements sequentially. Yes, this is going to look really contrived, but here’s how I would write the function purely with .map
and .and_then
calls and some lambdas:
fn test(n: i32) -> Option<i32> {
(if n % 2 == 0 { Some(n) } else { None }) // parenthesized so the method chain can follow
.map(|x| x + 1)
.and_then(|x| if x >= 0 { Some(x) } else { None })
}
In the first line, I used the if-statement to perform the first test and return the Some(n)
value for even numbers and None
for odd numbers. Then I take the content of that Option
and increment it (only applying for even numbers, of course!), and finally I chain the test for non-negativity, returning None if it fails.
Again, I know this is contrived, but watch how I can rewrite this with simple let
statements now, introducing the question-mark operator…
fn test(n: i32) -> Option<i32> {
let x: i32 = if n % 2 == 0 { Some(n) } else { None }?; // flat-map
let y: i32 = x + 1; // map
let z: i32 = if y >= 0 { Some(y) } else { None }?; // flat-map
Some(z) // final wrapping
}
In the first line, when I put the question-mark operator at the end of an Option
type class, the compiler pretends that my x
variable is of type i32
rather than Option<i32>
. When I define the simple y = x + 1
, the compiler again pretends that I’ve simply incremented the x
value. When I define z
, I’m again taking my apparently-simple y
value and conditionally wrapping it in a Some
or None
structure, but by putting that same question-mark at the end, the compiler again pretends it’s a simple i32
value.
Behind the scenes, the effect is the same as calling .map or .and_then (in the comments I’m calling it flat-map) a number of times. (Strictly speaking, the compiler expands each “?” into an early return from the function on None rather than into literal method calls, but the outcome is equivalent.) It’s all syntactic sugar for the same idea, but here’s the beauty of it all: it reads like normal typical code! We are solving for the nested-structures problem that flat-map (and_then
) was designed to fix, and we are focusing on Happy Path programming rather than writing a bunch of meaningless failure-case pass-throughs that we had seen with nested pattern matching!
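For the curious, here is a rough sketch of what each “?” on an Option amounts to when written out by hand. The function names test_with_question_mark and test_expanded are my own, and the expansion is simplified: the real compiler goes through the Try trait machinery, but the observable behavior for Option should be the same.

```rust
// The "?" version, with parentheses added around the if-expressions for readability.
fn test_with_question_mark(n: i32) -> Option<i32> {
    let x: i32 = (if n % 2 == 0 { Some(n) } else { None })?;
    let y: i32 = x + 1;
    let z: i32 = (if y >= 0 { Some(y) } else { None })?;
    Some(z)
}

// Simplified hand-written expansion: every "?" becomes a match that
// either unwraps the value or returns None from the whole function.
fn test_expanded(n: i32) -> Option<i32> {
    let first = if n % 2 == 0 { Some(n) } else { None };
    let x: i32 = match first {
        Some(value) => value,
        None => return None, // early exit, skipping the rest of the function
    };
    let y: i32 = x + 1;
    let second = if y >= 0 { Some(y) } else { None };
    let z: i32 = match second {
        Some(value) => value,
        None => return None,
    };
    Some(z)
}
```

The second function is exactly the pile of failure-case pass-throughs that the question mark lets us skip writing.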
Again, I know the above example was quite contrived, but you may run into a coding situation where you’re dealing with a bunch of Option
wrapped variables, sometimes needing to unpack them sequentially thus leading to that nesting problem. In addition to using the .and_then
flat-mapping trick, it’s just great to know you could also take a better “self-documenting” approach that’s easier-to-read using the “?” operator. I also wanted to show this example because I suspect most people think “?” is specific to Result
values only. (Mwahh hah hah hah!)
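As a slightly less contrived sketch of sequentially unpacking Option values, consider fetching the first word from a slice and then the first character of that word. Either step can come up empty. The helper first_initial is a hypothetical name made up for this example.

```rust
// Hypothetical helper: the upper-cased first character of the first word, if there is one.
fn first_initial(words: &[&str]) -> Option<char> {
    // None if the slice is empty.
    let first_word: &&str = words.first()?;
    // None if the first word is an empty string.
    let first_char: char = first_word.chars().next()?;
    Some(first_char.to_ascii_uppercase())
}
```

Without the question marks, this would be either two nested matches or an .and_then followed by a .map.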
Let’s apply this to our original sensor example!
At the conclusion of the last article, we had arrived at the following solution to handle a potentially misconfigured sensor, take a potentially unsuccessful reading, and finally try to send the telemetry to an IoT edge server, with the last step returning a boolean that indicates whether the delivery was successful.
fn take_temp(sensor: &Result<TempSensor,&'static str>) -> Result<(),&'static str> {
sensor.as_ref().map_err(|e| *e)
.and_then(|s| s.get_reading())
.map(send_telemetry)
.and_then(|s| if s { Ok(()) } else { Err("Could not send telemetry") })
}
Using Rust’s “?” comprehension, we can rewrite things to arrive at the following:
fn take_temp(sensor: &Result<TempSensor,&'static str>) -> Result<(),&'static str> {
let s: &TempSensor = sensor.as_ref().map_err(|e| *e)?;
let reading: Reading = s.get_reading()?;
if send_telemetry(reading) {
Ok(())
} else {
Err("Could not send telemetry")
}
}
In the first line, we are still using the trick we learned in the last article to use .as_ref()
to convert the &Result<TempSensor,&str>
into something where the references go inside the container: Result<&TempSensor,&&str>
, and we still had to use the .map_err()
function to undo the double-reference on the error type. But we put the question-mark at the end, making our Result
look more like a simple &TempSensor
.
Next, we call our .get_reading()
method which would normally create another nested Result
structure. Again we use the question-mark to, behind the scenes, call .and_then()
(flat-map) while making the return appear to be again the happy-path Reading
object.
Finally, we have to return a Result
object in order to match the return type in the function signature, and here we have to convert the boolean success from send_telemetry(reading)
anyway, so we do that with a simple if-else statement!
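If you want to try this version yourself, here is a self-contained sketch. The bodies of TempSensor, Reading, get_reading, and send_telemetry below are hypothetical stand-ins that I made up so the example compiles on its own; only take_temp matches the function discussed above.

```rust
// Hypothetical stand-ins for the types used throughout this series.
struct Reading(f32);

struct TempSensor {
    healthy: bool, // assumed flag that decides whether a reading succeeds
}

impl TempSensor {
    fn get_reading(&self) -> Result<Reading, &'static str> {
        if self.healthy {
            Ok(Reading(21.5))
        } else {
            Err("Could not read sensor")
        }
    }
}

// Assumed rule: delivery "succeeds" for any finite temperature.
fn send_telemetry(reading: Reading) -> bool {
    reading.0.is_finite()
}

fn take_temp(sensor: &Result<TempSensor, &'static str>) -> Result<(), &'static str> {
    // Err(..) from the sensor Result returns early from take_temp here.
    let s: &TempSensor = sensor.as_ref().map_err(|e| *e)?;
    // Err(..) from the reading returns early here.
    let reading: Reading = s.get_reading()?;
    if send_telemetry(reading) {
        Ok(())
    } else {
        Err("Could not send telemetry")
    }
}
```

Calling take_temp with a misconfigured sensor, such as take_temp(&Err("Sensor not configured")), hands that same error string straight back without ever attempting a reading.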
One more thorny patch to look at: Error Types
Up until now, I’ve used static strings to represent the error types for our Result
type classes. The reason I was doing this was to keep things simple for our initial build-out. I do this all the time with Scala’s Either
type class which is the equivalent to Rust’s Result
, but strings are no-worries objects in the Java world where you have a garbage collector and don’t have to worry about borrow checkers and lifetimes. And especially if you’re working in the Embedded Rust world where you don’t have the alloc layer beneath the standard library, then allocating and cloning String
objects isn’t even an option. We’ll want to tackle errors more idiomatically.
Here’s another reason why I was avoiding idiomatic errors in the beginning: when you have nested Result
objects, resolving error types gets thorny as we’re about to see. Let’s consider a situation where our sensor object is contained in a Result
structure that allows a SensorError
if we weren’t able to find, configure, and initialize the sensor. We then want to have .get_reading()
return its own Result
structure that considers the possibility of a ReadingError
if we got some sort of IO error when attempting a reading.
If I try to put this together in a function, what type signature would I return?
fn take_temp(sensor: &Result<TempSensor, SensorError>) -> Result<Reading,???> {
sensor.as_ref() // returns Result<&TempSensor, &SensorError>
.and_then(|s| s.get_reading()) // returns Result<Reading,ReadingError>
}
To be explicit, we have three possible return values:
- Ok<Reading>: a successful reading
- Err<SensorError>: we couldn’t configure the sensor
- Err<ReadingError>: we failed to get a reading from a properly configured sensor
Unfortunately, we have to choose a single error type. What we need is some sort of “union type” such as Result<Reading,SensorError|ReadingError>
.
In the Java/Scala world I come from, the typical approach for this would be to find some higher-level parent class for the return type, potentially all the way up to java.lang.Exception
. The problem with this is that this casting removes a lot of potentially useful information.
Enums as Union Types
Rust actually has a pretty elegant approach to this problem. What we really want to do is to say that our reading
object will have an error type of ReadingError
or SensorError
. The Rust enum
provides this elegant way of representing these optional unions. (And it’s the core machinery that powers Option
being either Some<T>
or None
, or Result
being either an Ok<T>
or an Err<E>
!)
The idiomatic way of handling your application errors in Rust is to create your own error enums, often in a clean and separate Error
module in your application. (At least, that’s what I’ve been seeing in most of the embedded Rust libraries when I dig through their source code.)
To make our example realistic, let’s pretend that we are using a sensor library from a crate that returns a SensorCrateError
that is defined like this:
// imported from the sensor crate
enum SensorCrateError {
ConfigurationError,
ReadingError
}
For our application, we want to retain these two error possibilities, and we want to add to it the error state that our send_telemetry
function wasn’t able to validate delivery.
We would then define our application error as follows:
enum Error {
SensorError,
ReadingError,
NetworkError
}
Next, we have to define some logic that can convert the previous SensorCrateError
values into our own application error, so we’ll define the From
trait for our custom error:
impl From<SensorCrateError> for Error {
fn from(value: SensorCrateError) -> Self {
match value {
SensorCrateError::ConfigurationError => Self::SensorError,
SensorCrateError::ReadingError => Self::ReadingError,
}
}
}
(We have an edge case where this isn’t going to work. I’m going to come back and fix something, but bear with me.)
We can now write out our full logic. This looks very much like the final solution we had in the previous blog article:
fn take_temp(sensor: &Result<TempSensor,SensorCrateError>) -> Result<(),Error> {
sensor.as_ref().map_err(|e| Error::from(*e)) // compiler error here!
.and_then(|s| s.get_reading().map_err(Error::from))
.map(send_telemetry)
.and_then(|success| if success {
Ok(())
} else {
Err(Error::NetworkError)
}
)
}
Unfortunately, I’ve run into a little snag here. When my errors all had the type &'static str
, life was relatively simple because I didn’t have to worry about the borrow checker. All my errors were references to static code-locations in memory. But now I have to think about lifetimes!
I’m passing a reference to a Result<TempSensor,SensorCrateError>
because I know I’m going to probably want to use the sensor multiple times. If I passed it as a value, my take_temp
function would consume it, and it wouldn’t be available for anything else!
When I call sensor.as_ref()
I’m converting my &Result<TempSensor,SensorCrateError>
into a Result<&TempSensor, &SensorCrateError>
. That is, both the sensor and the error (whichever one I get) will be a reference. Well, when I try to call Error::from(*e)
the compiler won’t let me move the error out from behind its shared reference. SensorCrateError isn’t Copy, so moving it would leave the caller’s Result holding a value that is no longer valid.
I found that there were two ways I could solve this:
- I could add a #[derive(Clone)] line before the SensorCrateError enumeration and then write .map_err(|e| Error::from(e.clone())).
- I could define both From<SensorCrateError> and From<&SensorCrateError>, since I need the former for the Result<Reading,SensorCrateError> I get from .get_reading and the latter for the Result<&TempSensor,&SensorCrateError> I pass into my function.
I’m going to pick the second option, because I can’t assume the sensor crate that I’m using has defined .clone()
. Also, you’ll see momentarily some really nice and tidy code that results. I’m going to write my error definition now like this:
enum Error {
SensorError,
ReadingError,
NetworkError
}
impl From<&SensorCrateError> for Error {
fn from(value: &SensorCrateError) -> Self {
match *value {
SensorCrateError::ConfigurationError => Self::SensorError,
SensorCrateError::ReadingError => Self::ReadingError,
}
}
}
impl From<SensorCrateError> for Error {
fn from(value: SensorCrateError) -> Self {
From::from(&value)
}
}
fn take_temp(sensor: &Result<TempSensor,SensorCrateError>) -> Result<(),Error> {
sensor.as_ref().map_err(Error::from)
.and_then(|s| s.get_reading().map_err(Error::from))
.map(send_telemetry)
.and_then(|success| if success {
Ok(())
} else {
Err(Error::NetworkError)
}
)
}
This is okay, but I still don’t like the fact that I’m always calling .map_err
to convert my error types.
One Last Gift from the “?” Comprehension
It turns out that the “?” operator will automatically do the error type mapping for you! In addition to making your Result values look like their simplified wrapped success types and handling mapping of normal operations and flat-mapping of nested Results, it also takes a look at your function’s error return type. If that return error type implements From for the error type being propagated (here, both From<SensorCrateError> and From<&SensorCrateError> for Error), it will automatically call the conversion for you! That means you can leave all those .map_err
bits out! Ergo…
fn take_temp(sensor: &Result<TempSensor,SensorCrateError>) -> Result<(),Error> {
let s = sensor.as_ref()?;
let reading = s.get_reading()?;
let success = send_telemetry(reading);
if success { Ok(()) } else { Err(Error::NetworkError) }
}
Voila!
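Here is everything from this section stitched together into one self-contained sketch that you can paste into a scratch project. As before, the bodies of TempSensor, Reading, get_reading, and send_telemetry are hypothetical stand-ins of my own, and SensorCrateError is written out locally as a stand-in for the imported crate’s type. I also derive Debug and PartialEq so the results can be compared and printed.

```rust
// Stand-in for the error type imported from the (hypothetical) sensor crate.
#[derive(Debug)]
enum SensorCrateError {
    ConfigurationError,
    ReadingError,
}

// Our application error.
#[derive(Debug, PartialEq)]
enum Error {
    SensorError,
    ReadingError,
    NetworkError,
}

// Needed for the borrowed error that comes out of sensor.as_ref().
impl From<&SensorCrateError> for Error {
    fn from(value: &SensorCrateError) -> Self {
        match value {
            SensorCrateError::ConfigurationError => Self::SensorError,
            SensorCrateError::ReadingError => Self::ReadingError,
        }
    }
}

// Needed for the owned error that comes out of get_reading().
impl From<SensorCrateError> for Error {
    fn from(value: SensorCrateError) -> Self {
        Self::from(&value)
    }
}

// Hypothetical stand-ins for the sensor types.
struct Reading(f32);

struct TempSensor {
    healthy: bool, // assumed flag that decides whether a reading succeeds
}

impl TempSensor {
    fn get_reading(&self) -> Result<Reading, SensorCrateError> {
        if self.healthy {
            Ok(Reading(21.5))
        } else {
            Err(SensorCrateError::ReadingError)
        }
    }
}

// Assumed rule: delivery "succeeds" for any finite temperature.
fn send_telemetry(reading: Reading) -> bool {
    reading.0.is_finite()
}

fn take_temp(sensor: &Result<TempSensor, SensorCrateError>) -> Result<(), Error> {
    let s = sensor.as_ref()?; // converts &SensorCrateError via From
    let reading = s.get_reading()?; // converts SensorCrateError via From
    let success = send_telemetry(reading);
    if success { Ok(()) } else { Err(Error::NetworkError) }
}
```

Notice that there is not a single .map_err left in take_temp; both conversions happen inside the question marks.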
Postscript: Embedding Foreign Errors
In some of the library source code that I’ve seen, when there are certain errors that are known to be coming from a certain library, the application Error
enumeration will sometimes wrap them. In other words, instead of this:
enum Error {
SensorError,
ReadingError,
NetworkError
}
impl From<SensorCrateError> for Error {
fn from(value: SensorCrateError) -> Self {
match value {
SensorCrateError::ConfigurationError => Self::SensorError,
SensorCrateError::ReadingError => Self::ReadingError,
}
}
}
…where you’re essentially repeating each of the SensorCrateError
enumerations into its own value in your error type, they do this:
enum Error {
SensorError(SensorCrateError),
NetworkError
}
impl From<SensorCrateError> for Error {
fn from(value: SensorCrateError) -> Self {
Self::SensorError(value)
}
}
Note that if you need to be able to work with a reference to an error return value like we did, you will need to verify that the imported crate’s error type implements the Clone trait, via #[derive(Clone)]
, so that we can write this:
impl From<&SensorCrateError> for Error {
fn from(value: &SensorCrateError) -> Self {
Self::SensorError(value.clone())
}
}
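To see why wrapping can be worth the trouble, here is a small sketch in which the caller can still get at the original crate error after the conversion. SensorCrateError is again a local stand-in that I assume derives Clone, and describe is a hypothetical helper invented for this example.

```rust
// Stand-in for the sensor crate's error type; Clone is assumed to be derived by the crate.
#[derive(Debug, Clone, PartialEq)]
enum SensorCrateError {
    ConfigurationError,
    ReadingError,
}

// Application error that embeds the foreign error instead of re-listing its variants.
#[derive(Debug, PartialEq)]
enum Error {
    SensorError(SensorCrateError),
    NetworkError,
}

impl From<SensorCrateError> for Error {
    fn from(value: SensorCrateError) -> Self {
        Self::SensorError(value)
    }
}

impl From<&SensorCrateError> for Error {
    fn from(value: &SensorCrateError) -> Self {
        Self::SensorError(value.clone())
    }
}

// Hypothetical helper: the wrapped detail is still available through pattern matching.
fn describe(error: &Error) -> &'static str {
    match error {
        Error::SensorError(SensorCrateError::ConfigurationError) => "sensor is misconfigured",
        Error::SensorError(SensorCrateError::ReadingError) => "sensor reading failed",
        Error::NetworkError => "could not send telemetry",
    }
}
```

The trade-off is that your application error now exposes the sensor crate’s type to everyone who matches on it, so a change in that crate can ripple out to your callers.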