Rust provides implementations for commonly used collections in its std
library. They come with different guarantees and for different purposes, and are usually applicable for 90% use cases.
For better understanding std::collections
purpose, design, limitations and use cases, read through the following articles:
- Rust Book: 8. Common Collections
- Rust By Example: 19.2. Vectors
- Rust By Example: 19.7. HashMap
- Official
std::collections
docs
Iterators are heavily used in idiomatic Rust code, so it's worth becoming familiar with them.
While collection represents a some complete set of data, an Iterator
is a way of iteration over its elements.
An iterator has a method,
next
, which when called, returns anOption<Item>
.next
will returnSome(Item)
as long as there are elements, and once they've all been exhausted, will returnNone
to indicate that iteration is finished. Individual iterators may choose to resume iteration, and so callingnext
again may or may not eventually start returningSome(Item)
again at some point.Iterators are also composable, and it's common to chain them together to do more complex forms of processing.
There are three forms of iteration over a collection in Rust:
iter()
iterates over borrowed elements (&T
), so used for read-only operations with a collection.iter_mut()
iterates over mutably borrowed elements (&mut T
), so used when in-place elements mutation is required.into_iter()
iterates over owned element (T
), so used when whole collection transformation and/or moving is required.
It's important to remember, that iterators (and their adapters) are lazy. Iterator
does nothing, unless its next()
method is called. This property leads to the next one: iterators do not have to be finite. So, if you need a sort of an infinite collection (like endless fibonacci sequence), an Iterator
implementation is a way to go, as each new element will be evaluated lazily on request.
Iterator
comes with a lot of powerful and useful adapters in std
library, which makes them highly composable and pleasant to use. If std
capabilities are not enough for your needs, consider to use itertools
crate, which provides more non-trivial adapters.
For better understanding Rust iterators purpose, design, limitations and use cases, read through the following articles:
Immutable collections (aka "persistent data structures") are collections which preserve interface and behavior of its mutable analogues, but have a different implementation under-the-hood, which allows each piece of code to work with its own copy of a whole collection without worrying about accidentally changing elements for others. The key feature is in implicit data deduplication. This inevitably comes in a price of performance, so immutable collection has other performance guarantees than mutable ones.
Rust ecosystem has im
and rpds
crates, which provide immutable implementations for some collections.
For better understanding immutable collections' nature, design, and a motivation behind them, read through the following articles:
- Official
im
crate docs - Wikipedia: Persistent data structure
- Jean Niklas L'orange: Understanding Clojure's Persistent Vectors, pt. 1
- Jean Niklas L'orange: Understanding Clojure's Persistent Vectors, pt. 2
- Jean Niklas L'orange: Understanding Clojure's Persistent Vectors, pt. 3
- Immutable Data Structures and why You want them
When you need to operate with the same collection from multiple threads, the most common and obvious way to go is to put it behind some synchronization primitive (like Arc<RwLock<VecDeque<T>>>
, for example). However, this performs too bad for an extensive use of a collection. That's why concurrent collections exist: they allow usage of a collection from multiple threads without explicit synchronization and provide efficient synchronization mechanism under-the-hood (usually, leveraging lock-free algorithms).
Rust ecosystem has crossbeam
and lockfree
crates, providing efficient lock-free implementations for some collections usually used in a concurrent context. Also, consider flurry
and chashmap
crates for a concurrent hash map implementation.
For better understanding concurrent collections' nature, design, and a motivation behind them, read through the following articles:
- Aaron Turon: Lock-freedom without garbage collection
- Stjepan Glavina: Lock-free Rust: Crossbeam in 2019
- Wikipedia: Non-blocking algorithm
- Ibraheem Ahmed: A Lock-Free Vector
Estimated time: 1 day
Write a simple UsersRepository
trait, which supports 3 operations (consider to chose correct collections):
- returns single
User
by its ID; - returns multiple
User
s by their IDs; - return IDs of
User
s whichnickname
contains given string (search function).
Provide an implementation of UsersRepository
trait backed by some immutable collection.
Prove your implementation correctness with tests.
After completing everything above, you should be able to answer (and understand why) the following questions:
- What is a collection? What is an iterator? How do they differ? How are they used? Which limitations does each one have?
- What are immutable collections? How do they work? Why shouldn't we use them all the time? When does it make sense to use them?
- What are concurrent collections? How do they work? Why are they better than explicit synchronization on a normal collection?