Rust Iterators
A Cheat Sheet

01 August 2019

When you start programming, one of the first constructs you learn about is the humble for loop. When I think back on my own experience, the for loop came shortly after learning about the if statement.

When I’m writing code in Rust, I find that my ideas can often be expressed more clearly and more succinctly if I forgo the for loop and rather make heavy use of Rust’s iterator functions.

Generally speaking, iterators are objects that specialize in walking over collections. This is a bit of a simplification (you can iterate over other things too), but walking over the items in a collection is a good enough mental model. You can write your own iterators for your types, but especially early on most of the benefit you’ll get is from knowing how to use the iterators already in the Rust standard library.

My intention in writing this article is to present a bunch of different functionality that iterators give you out of the box, where they’re useful, and examples of how to use them.

I’ll also be providing for each one how you’d get the same effect using a normal for loop and a mutable variable. The reason behind this is that, if your experience is like mine, you learned the imperative for loop patterns first and they’re familiar. Seeing how iterator patterns map to for loop patterns can help to build an intuitive understanding of what the iterator functions do. You’ll also notice that I’ve specified the types in my lambdas more than is necessary. This is to make it clearer what’s happening in the code samples just from reading them.

Where do you get these iterators?

The most common place that you’ll get iterators from is functions on your containers.

There are three common function names used to get iterators: iter(), iter_mut(), and into_iter().

let mut collection = vec![1, 2, 3];

// iter iterates over the collection, giving you a reference to each
// element. for_each consumes the iterator in a for loop.
collection
    .iter()
    .for_each(|item: &i32| {
        dbg!(item);
    });

// iter_mut is like iter, iterating over the collection and giving you
// a reference to each element, but the reference is mutable. Useful
// if you want to change the collection in place. You need to already
// have a mutable collection to call this.
collection
    .iter_mut()
    .for_each(|item: &mut i32| {
        *item += 1;
        dbg!(item);
    });

// into_iter consumes the collection, so it passes you ownership of
// each element. Great for if you want to avoid unnecessary copies,
// and this is the last thing you need to do with your source list.
collection
    .into_iter()
    .for_each(|item: i32| {
        dbg!(item);
    });

You can also create your own iterators. You just need to implement the Iterator trait. This is fairly simple since it only has one function: getting the next element from the iterator.

In the following examples I’ll mostly be using the iter() variant, which iterates over a reference to each item, since my samples use their source lists multiple times.

Mapping

Often, you have a collection of things, and you want a collection of something else. If you’ve done lots of imperative programming before, you’ll recognise this pattern: create a new collection for the results, then loop over the input and append each result to the destination.

You can do this with an iterator using map.

let src = vec![1, 2, 3];

let mut dest: Vec<String> = Vec::new();
for item in &src {
    // The essence of map is that for each item in your source list,
    // you're writing something into your destination list.
    dest.push(format!("{} doubled is {}", item, item * 2));
}

// The same format expression appears here, but it's in a call to
// map. The collect at the end takes the iterator and collects it into
// a Vec.
// (more details on collect later in this article)
let dest_with_map: Vec<String> = src
    .iter()
    .map(|item: &i32| format!("{} doubled is {}", item, item * 2))
    .collect();

dbg!(dest);
dbg!(dest_with_map);
// ["1 doubled is 2", "2 doubled is 4", "3 doubled is 6"]

It’s another common pattern when writing map-like code to discover that you have multiple mapped things to add to your output list for each thing in your input list. Think of a situation where you would use map, and then want to flatten your list of lists into a single flat list. That’s what flat_map is for!

let src = vec![ 1, 2, 3, 4 ];

let mut dest: Vec<i32> = Vec::new();
for item in &src {
    // In the imperative form, flat_map translates to a nested for
    // loop.
    for range_item in 0..*item {
        dest.push(range_item);
    }
}

let dest_with_flatmap: Vec<i32> = src
    .iter()
    .flat_map(|item: &i32| 0..*item)
    .collect();

dbg!(dest);
dbg!(dest_with_flatmap);
// [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]

A final common pattern, especially early in your journey into learning to use iterators for everything, is still wanting to know the index of the item you’re working with. enumerate will put your data in tuples with their index.

let src = vec![5, 6, 7];

let mut dest: Vec<String> = Vec::new();
for i in 0..src.len() {
    // The imperative form forces you to index into your vector. It's
    // simple, but I've found this to be surprisingly error prone,
    // leading to index out of bounds errors or logic errors when I
    // use 'i' when I meant 'j'.
    dest.push(format!("src[{}] = {}", i, src[i]));
}

// Using enumerate you get the same effect, but you don't run the risk
// of using the wrong variable as your index, or accidentally going
// outside the bounds of your vector. Also, this works with data
// structures that you can't index into, like linked lists.
let dest_with_enumerate: Vec<String> = src
    .iter()
    .enumerate()
    .map(|(i, item): (usize, &i32)| format!("src[{}] = {}", i, item))
    .collect();

dbg!(dest);
dbg!(dest_with_enumerate);
// ["src[0] = 5", "src[1] = 6", "src[2] = 7"]

Filtering

Ok, so we can map from one list to another list, but what if we don’t want everything from the original list? That’s where filtering comes in. Pass in a function to determine if you want something or not.

let src = vec![1, 2, 3, 4, 5, 6];

let mut dest: Vec<i32> = Vec::new();
for item in &src {
    // The essence of filter is that you have an if statement, and you
    // write to your destination inside the filter.
    if item % 2 == 0 {
        dest.push(item.clone());
    }
}

let dest_with_filter: Vec<i32> = src
    .iter()
    .filter(|item: &&i32| **item % 2 == 0)
    .cloned()
    .collect();

dbg!(dest);
dbg!(dest_with_filter);
// [2, 4, 6]

Ok, something weird happened with the types there. What’s with the &&i32? Filter takes a reference to each item you’re iterating over. iter() is iterating over a reference to each item in the vector. That’s why filter in this case ends up taking a double reference.

There’s another special case for map and filter that comes up from time to time. Maybe you want to do a map that works on some values but not others, and only keep the values where the map worked. You could do that with a map, followed by a filter, but it’s easier to do it with filter_map!

let src = vec!["one", "1", "3", "not a number"];

let mut dest: Vec<i32> = Vec::new();
for item in &src {
    // This is a lot like a regular map, followed by a filter.
    let maybe_parsed: Option<i32> = item.parse().ok();
    if let Some(parsed) = maybe_parsed {
        dest.push(parsed);
    }
}

let dest_with_filter_map: Vec<i32> = src
    .iter()
    .filter_map(|item: &&str| item.parse().ok())
    .collect();

dbg!(dest);
dbg!(dest_with_filter_map);
// [1, 3]

Folding

If map is used to convert one item in the source list into one item in the destination list, and filter can only include or exclude individual items, how do you aggregate a list into a single result?

The answer is folding. Some other languages I’ve worked with also called this reduce, or aggregate.

The signature is simple, but it can take a while to appreciate how to use this. You can think of folding as combining all of the items in your list by sticking a function between each one.

let src = vec![1, 2, 3, 4, 5, 6];

// When you think "fold", think "accumulator variable"
let mut sum: i32 = 0;
for item in &src {
    sum = sum + item
}

// This is functionally equivalent to
// 0 + 1 + 2 + 3 + 4 + 5 + 6
// (the 0 is important in case the list is empty)
// It's a common convention to call the accumulator `acc`
let sum_with_fold: i32 = src
    .iter()
    .fold(0, |acc: i32, item: &i32| acc + item);

// Summing is a common use case, so there's a fold shorthand just for
// that in the standard library.
let sum_with_sum: i32 = src
    .iter()
    .sum();

// Notice that we don't have a `collect` on these, because `fold` is
// already consuming the iterator into the type that we want to end up
// with.

dbg!(sum);
dbg!(sum_with_fold);
dbg!(sum_with_sum);
// 21

There are a few more convenience functions for common uses of fold. Max and min find the biggest or smallest item in a list.

let src = vec![1, 2, 3, 4, 5, 6];

// Empty lists here give you None.
let max_with_fold: Option<i32> = src
    .iter()
    .fold(None, |acc: Option<i32>, item: &i32| {
        acc
            .map(|acc: i32| if *item > acc { item.clone() } else { acc })
            .or(Some(item.clone()))
    });

let max_with_max: Option<i32> = src
    .iter()
    .max()
    .cloned();

dbg!(max_with_fold);
dbg!(max_with_max);
// Some(6)

If you’re working with booleans, there are also convenience functions for finding if any of them are true, or if all of them are true.

let src = vec![1, 2, 3, 4, 5, 6];

let any_with_fold: bool = src
    .iter()
    .fold(false, |acc: bool, item: &i32| acc || *item > 3);

let any_with_any: bool = src
    .iter()
    .any(|item: &i32| *item > 3);

dbg!(any_with_fold);
dbg!(any_with_any);
// true

let all_with_fold: bool = src
    .iter()
    .fold(false, |acc: bool, item: &i32| acc && *item > 3);

let all_with_all: bool = src
    .iter()
    .all(|item: &i32| *item > 3);

dbg!(all_with_fold);
dbg!(all_with_all);
// false

You can also look through the items in an iterator to find the first item that matches a certain predicate with find. This is a lot like any, except for when you actually want the item that matches your predicate.

let src = vec![1, 2, 3, 4, 5, 6];

let mut found: Option<&i32> = None;
for item in &src {
    // This is a lot like a filter that stops after the first result.
    if *item > 3 {
        found = Some(item);
        break;
    }
}

let found_with_find: Option<&i32> = src
    .iter()
    .find(|item: &&i32| **item > 3);

dbg!(found);
dbg!(found_with_find);
// Some(4)

Find is similar to filter, where the predicate takes a reference to the iterator type. If you used iter() (rather than into_iter()), that iterator type is also a reference, so you get the unintuitive double reference (&&) in the type.

Collecting

Folding is very useful, but after a while you start to notice that you’re writing the same fold repeatedly. Some of them, like sum, min and max, aggregate into a result. Others are used to take an iterator and convert it into a concrete data structure of the items, like a Vec, or a HashMap.

You’ve already seen a few examples of me using the collect function to convert an iterator into a Vec. Under the covers, collect looks for a trait called FromIterator on the thing you’re collecting into.

This is especially useful if you’ve written your own type that you need to build from an iterator.

// This is a very standard linked list implementation for sake of
// example. Each element in the list is either the empty list (Nil),
// or a single item and a pointer (Box) to the rest of the list
// (Cons).
#[derive(Debug)]
enum List {
    Cons(i32, Box<List>),
    Nil
}

// We can then implement FromIterator, specifying the type of thing the
// iterator needs to be going over. In this case, we can take an
// iterator of i32s, and make our List of i32s. This can also be
// generic.
impl std::iter::FromIterator<i32> for List {
    // IntoIterator means it has the into_iter() function. If the
    // thing is already an iterator, like in all of the examples using
    // `collect` above, then into_iter() is a no-op which will be
    // optimized away by the compiler.
    fn from_iter<I: IntoIterator<Item=i32>>(iter: I) -> List {
        iter
            .into_iter()
            .fold(List::Nil, |acc: List, next: i32| List::Cons(next, Box::new(acc)))
    }
}

let src = vec![1, 2, 3];

// Having implemented FromIterator, we can collect into a List. The
// only difference from the examples above, where we collected into a
// Vec, is the type annotation on our variable.
let list: List = src
    .iter()
    .map(|item: &i32| item * 2)
    .collect();

dbg!(list);
// Cons(6, Cons(4, Cons(2, Nil)))

Composing

There are two ways of sticking two iterators together: zipping and chaining.

Zipping lines up two iterators and gives you an item from each iterator in a tuple. It’s good for situations where you have two lists, and items with the same index in their list refer to the same broader object. The enumerate function that we saw before is a type of zip. In fact, some languages call enumerate zip_with_index.

let src_numbers = vec![1, 2, 3];
let src_words = vec!["one", "two", "three"];

let mut dest: Vec<String> = Vec::new();
for i in 0..src_numbers.len().min(src_words.len()) {
    // Indexing into two lists can be particularly error prone. What
    // if one happens to be longer than the other, for example?
    let num_item: i32 = src_numbers[i];
    let word_item: &str = src_words[i];
    dest.push(format!("{}: {}", num_item, word_item));
}

// Notice how we can chain iterators, using zip to combine the two but
// after that it's just a normal map and collect.
let dest_with_zip: Vec<String> = src_numbers
    .iter()
    .zip(src_words.iter())
    .map(|(num_item, word_item): (&i32, &&str)| format!("{}: {}", num_item, word_item))
    .collect();

dbg!(dest);
dbg!(dest_with_zip);
// ["1: one", "2: two", "3: three"]

Chaining sticks an iterator onto the end of another one. It’s useful for when you have two lists, and you want to concatenate them, one after the other.

let src1 = vec![1, 2, 3];
let src2 = vec![4, 5, 6];

let mut dest: Vec<i32> = Vec::new();
// Without chain, you need this function body listed out twice.
for item in &src1 {
    dest.push(item * 2);
}
for item in &src2 {
    dest.push(item * 2);
}

// This version separates the composing of the iterators from the map
// that you wanted to do to them.
let dest_with_chain: Vec<i32> = src1
    .iter()
    .chain(src2.iter())
    .map(|item: &i32| item * 2)
    .collect();

dbg!(dest);
dbg!(dest_with_chain);
// [2, 4, 6, 8, 10, 12]

Some real world code

I’m entering this year’s Entelect challenge again. The game this year is a top down take on the game Worms, where you control your army of worms to find and destroy your opponent’s worms.

As part of that, I need some way of enumerating all of the ways that a single worm could move.

  1. They can move one cell in any direction (including diagonally).
  2. They can only move to a cell which is unoccupied.
  3. They cannot move off the edge of the map.
  4. If there is dirt on the map in their way, they can’t move there, but they can dig it out.

Let’s see how this list of requirements translates into code. Both the imperative and the iterator based code express the same logic and the same set of rules, but I feel the iterator code can be linked back to the rules more easily.

pub fn valid_moves_for_worm_imperative(
    worm: &Worm,
    map: &Map,
    occupied_cells: HashSet<Point2d>
) -> Vec<Action> {
    let mut valid_moves = Vec::new();

    // Direction::ALL is a constant list of all directions a worm
    // could walk, including diagonals.
    for dir in &Direction::ALL {
        // 1. Target is one cell in a direction from the worm
        let target = worm.position + Direction::as_vec(dir);
        // 2. We check that the target is free
        if !occupied_cells.contains(&target) {
            // 3. and 4. Air cells are moved to, Dirt cells are dug,
            // out of bounds cells do nothing.
            match map.at(target) {
                Some(MapCell::Air) => {
                    valid_moves.push(Action::Move(target));
                },
                Some(MapCell::Dirt) => {
                    valid_moves.push(Action::Dig(target));
                },
                _ => {},
            }
        }
    }

    // By the end of the loop, this Vec is the result I want, but
    // you have to read through to the most nested part to see where
    // it's being modified.
    valid_moves
}

pub fn valid_moves_for_worm(
    worm: &Worm,
    map: &Map,
    occupied_cells: HashSet<Point2d>
) -> Vec<Action> {
    Direction::ALL
        .iter()
         // 1. Target is one cell in a direction from the worm
        .map(|dir| worm.position + Direction::as_vec(dir))
         // 2. We check that the target is free
        .filter(|target| !occupied_cells.contains(target))
        .filter_map(|target| match map.at(target) {
            // 3. Air cells are moved to, Dirt cells are dug
            Some(MapCell::Air) => Some(Action::Move(target)),
            Some(MapCell::Dirt) => Some(Action::Dig(target)),
            // 4. Out of bounds cells do nothing
            None => None,
        })
        // This is all bundled up into a Vec here, no need to
        // manually create one and push items to it deep in my list.
        .collect()
}

Where to find out more

Hopefully this has you feeling a bit more comfortable using iterators in your own code. I find that replacing loops with iterators generally makes my code more concise and less nested. I hope that the same is true of your experience with iterators.

If you want to learn more, I find the best reference for any individual iterator function is the official standard library docs. I can also recommend the chapter on iterators in The Rust Programming Language.

Until next time, happy programming!