22.5 Sharing Data Safely Between Threads
A primary challenge in threaded programming is safely managing access to data shared between threads. Rust’s type system and standard library provide several primitives that guarantee data race freedom in safe code.
22.5.1 Shared Ownership: Arc<T>
When multiple threads need to own or have long-term access to the same piece of data on the heap, Arc<T> (Atomically Reference Counted) is the tool of choice. It’s a thread-safe version of Rc<T>. Arc<T> provides shared ownership of a value of type T by maintaining a reference count that is updated using atomic operations, making it safe to clone and share across threads.
- Arc<T> can be cloned (Arc::clone(&my_arc)). Cloning increments the atomic reference count and returns a new Arc<T> pointer to the same allocation.
- When an Arc<T> pointer is dropped, the reference count is atomically decremented.
- The inner value T is dropped only when the reference count reaches zero.
- For Arc<T> to be sendable between threads (Send) or accessible from multiple threads (Sync), the inner type T must itself be Send + Sync.
Arc<T> provides shared immutable access by default. To allow mutation of the shared data, Arc is typically combined with interior mutability types that provide synchronization, such as Mutex or RwLock.
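The reference-counting behavior described above can be observed with Arc::strong_count. The following is a minimal sketch; the helper name sum_in_threads is hypothetical and exists only for illustration.

```rust
use std::sync::Arc;
use std::thread;

/// Sums a shared vector once per worker thread and returns the combined total.
/// (`sum_in_threads` is a hypothetical helper, not a standard library function.)
fn sum_in_threads(workers: usize) -> i32 {
    let data = Arc::new(vec![1, 2, 3]);
    let mut handles = Vec::new();

    for _ in 0..workers {
        // Each clone increments the atomic reference count; the Vec is not copied.
        let data_clone = Arc::clone(&data);
        handles.push(thread::spawn(move || data_clone.iter().sum::<i32>()));
    }

    let total: i32 = handles.into_iter().map(|h| h.join().unwrap()).sum();

    // Every clone was dropped when its thread finished, so only `data` remains.
    assert_eq!(Arc::strong_count(&data), 1);
    total
}

fn main() {
    println!("Total across threads: {}", sum_in_threads(4));
}
```

Because the data is only read, no Mutex is needed here; Arc alone is enough for shared immutable access.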
22.5.2 Mutual Exclusion: Mutex<T>
A Mutex<T> (Mutual Exclusion) ensures that only one thread can access the data T it protects at any given time. To access the data, a thread must first acquire the mutex’s lock.
- lock(): Attempts to acquire the lock. If the lock is already held by another thread, the current thread blocks until the lock becomes available. It returns a Result<MutexGuard<T>, PoisonError<MutexGuard<T>>>.
  - A Mutex becomes “poisoned” if a thread panics while holding the lock. Subsequent calls to lock() on a poisoned mutex return an Err(PoisonError). Calling unwrap() on that result panics in the calling thread as well, which is often the desired behavior because it avoids operating on potentially inconsistent state. You can also handle the PoisonError explicitly if needed.
- MutexGuard<T>: A smart pointer returned by a successful lock() call. It implements Deref and DerefMut, allowing access to the protected data T. Crucially, it also implements Drop. When the MutexGuard goes out of scope, its Drop implementation automatically releases the lock. This RAII (Resource Acquisition Is Initialization) pattern prevents accidentally forgetting to release the lock, a common bug in C/C++.
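As a rough illustration of handling a PoisonError explicitly, the sketch below poisons a mutex on purpose and then recovers the guard with PoisonError::into_inner. The helper name recover_from_poison is hypothetical, and whether recovery is appropriate depends on whether the protected data can still be trusted.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex deliberately, then reads the value anyway.
/// Returns (was_poisoned, recovered_value).
/// (`recover_from_poison` is a hypothetical helper for illustration.)
fn recover_from_poison() -> (bool, i32) {
    let shared = Arc::new(Mutex::new(5));

    let shared_clone = Arc::clone(&shared);
    // This thread panics while holding the lock, which poisons the mutex.
    let result = thread::spawn(move || {
        let mut guard = shared_clone.lock().unwrap();
        *guard += 1;
        panic!("simulated failure while holding the lock");
    })
    .join();
    assert!(result.is_err()); // join() reports the panic as Err

    let was_poisoned = shared.is_poisoned();

    // lock() now returns Err(PoisonError); into_inner() still yields the guard.
    let guard = match shared.lock() {
        Ok(guard) => guard,
        Err(poisoned) => poisoned.into_inner(),
    };
    (was_poisoned, *guard)
}

fn main() {
    let (was_poisoned, value) = recover_from_poison();
    println!("poisoned: {}, value: {}", was_poisoned, value);
}
```

The spawned thread's panic message is printed to standard error; that is expected in this sketch.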
The standard pattern for sharing mutable state across threads is Arc<Mutex<T>>: Arc handles the shared ownership, and Mutex handles the synchronized exclusive access for mutation.
```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Wrap the counter in Mutex for synchronized access,
    // and Arc for shared ownership across threads.
    let counter = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for i in 0..10 {
        // Clone the Arc pointer. This increases the reference count.
        // The new Arc points to the same Mutex in memory.
        let counter_clone = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            // Acquire the lock. Blocks if another thread holds it.
            // unwrap() panics if the mutex was poisoned.
            let mut num: std::sync::MutexGuard<i32> = counter_clone.lock().unwrap();
            // Access the data via the MutexGuard (dereferences to &mut i32).
            *num += 1;
            println!("Thread {} incremented count to {}", i, *num);
            // The lock is automatically released when 'num' (the MutexGuard)
            // goes out of scope at the end of this block (RAII).
        });
        handles.push(handle);
    }

    // Wait for all threads to complete their work.
    for handle in handles {
        handle.join().unwrap();
    }

    // Lock the mutex in the main thread to read the final value.
    // Need .lock() even for reading, as Mutex provides exclusive access.
    println!("Final count: {}", *counter.lock().unwrap()); // Should be 10
}
```
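Because the lock is held for as long as the MutexGuard is alive, it is usually worth keeping the guard's scope small. The sketch below shows one way to do that with an inner block; the helper name push_and_len is hypothetical.

```rust
use std::sync::Mutex;

/// Appends an entry and returns the new length, holding the lock only briefly.
/// (`push_and_len` is a hypothetical helper for illustration.)
fn push_and_len(log: &Mutex<Vec<String>>, item: &str) -> usize {
    // Do work that does not need the shared data before taking the lock.
    let entry = format!("entry: {}", item);

    let len = {
        let mut guard = log.lock().unwrap();
        guard.push(entry);
        guard.len()
    }; // `guard` is dropped here, which releases the lock.

    // The lock is free again, so try_lock() succeeds in this single-threaded use.
    assert!(log.try_lock().is_ok());
    len
}

fn main() {
    let log = Mutex::new(Vec::new());
    println!("length after first push: {}", push_and_len(&log, "a"));
    println!("length after second push: {}", push_and_len(&log, "b"));
}
```

Calling drop(guard) explicitly has the same effect as ending the block and can be clearer when the guard's scope does not map neatly onto a block.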
22.5.3 Read-Write Locks: RwLock<T>
An RwLock<T> (Read-Write Lock) offers more flexible locking than a Mutex. It allows either multiple threads to hold read locks concurrently or a single thread to hold a write lock exclusively. This can improve performance for data structures that are read much more often than they are written, as readers do not block each other.
- read(): Acquires a read lock. Blocks if a write lock is currently held. Returns Result<RwLockReadGuard<T>, PoisonError<...>>. Multiple threads can hold read locks simultaneously.
- write(): Acquires a write lock. Blocks if any read locks or a write lock are currently held. Returns Result<RwLockWriteGuard<T>, PoisonError<...>>. Only one thread can hold the write lock.
- RwLockReadGuard<T> / RwLockWriteGuard<T>: RAII guards similar to MutexGuard. They provide access (Deref for read, Deref/DerefMut for write) and automatically release the lock when dropped. Poisoning works similarly to Mutex, except that an RwLock is poisoned only when a thread panics while holding the write lock; a panic while holding a read lock does not poison it.
```rust
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

fn main() {
    let config = Arc::new(RwLock::new(String::from("Initial Config")));
    let mut handles = vec![];

    // Spawn reader threads
    for i in 0..3 {
        let config_clone = Arc::clone(&config);
        let handle = thread::spawn(move || {
            // Acquire a read lock (shared access).
            let cfg: std::sync::RwLockReadGuard<String> = config_clone.read().unwrap();
            println!("Reader {}: Config is '{}'", i, *cfg);
            thread::sleep(Duration::from_millis(50)); // Simulate work
            // Read lock released when 'cfg' drops.
        });
        handles.push(handle);
    }

    // Wait briefly to ensure readers likely acquire locks first
    thread::sleep(Duration::from_millis(10));

    // Spawn a writer thread
    let config_clone_w = Arc::clone(&config);
    let writer_handle = thread::spawn(move || {
        println!("Writer: Attempting to acquire write lock...");
        // Acquire a write lock (exclusive access). Blocks until all readers release.
        let mut cfg: std::sync::RwLockWriteGuard<String> = config_clone_w.write().unwrap();
        *cfg = String::from("Updated Config");
        println!("Writer: Config updated.");
        // Write lock released when 'cfg' drops.
    });
    handles.push(writer_handle);

    // Wait for all threads
    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final config: {}", *config.read().unwrap());
}
```
Caution: RwLock can suffer from “writer starvation” on some platforms if there is a continuous stream of readers, potentially preventing a writer from ever acquiring the lock. Behavior can be platform-dependent.
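The sharing rules of RwLock can be probed without blocking by using try_read() and try_write(). The sketch below holds a read lock in one thread and probes from a second thread spawned with std::thread::scope (covered in Section 22.5.6). The helper name probe_while_reading is hypothetical, and the sketch assumes no writer is queued, which holds here because nothing else touches the lock.

```rust
use std::sync::RwLock;
use std::thread;

/// Returns (read_ok_while_reading, write_ok_while_reading, write_ok_after_release).
/// (`probe_while_reading` is a hypothetical helper for illustration.)
fn probe_while_reading() -> (bool, bool, bool) {
    let lock = RwLock::new(String::from("config"));

    // The calling thread holds a read lock for the duration of the probe.
    let reader = lock.read().unwrap();

    let (can_read, can_write) = thread::scope(|s| {
        s.spawn(|| {
            // A second reader is admitted while the first read lock is held.
            let can_read = lock.try_read().is_ok();
            // A writer is refused because a read lock is still outstanding.
            let can_write = lock.try_write().is_ok();
            (can_read, can_write)
        })
        .join()
        .unwrap()
    });

    // Once the read guard is gone, the write lock becomes available.
    drop(reader);
    let can_write_after = lock.try_write().is_ok();

    (can_read, can_write, can_write_after)
}

fn main() {
    println!("{:?}", probe_while_reading());
}
```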
22.5.4 Condition Variables: Condvar
A Condvar (Condition Variable) allows threads to wait efficiently for a specific condition to become true. Condition variables are almost always used together with a Mutex to protect the shared state representing the condition.
The typical pattern is:
1. A waiting thread acquires the Mutex.
2. It checks the condition based on the shared state protected by the Mutex.
3. If the condition is false, it calls condvar.wait(guard), passing the MutexGuard. This atomically releases the mutex lock and puts the thread to sleep.
4. When the thread is woken up (by another thread calling notify_one or notify_all), wait() automatically re-acquires the mutex lock before returning the new MutexGuard.
5. The waiting thread must re-check the condition in a loop (a while loop is idiomatic) because wakeups can be “spurious” (occurring without a notification) or the condition might have changed again between the notification and the lock re-acquisition.
6. A notifying thread acquires the same Mutex.
7. It modifies the shared state, making the condition true.
8. It calls condvar.notify_one() (wakes up one waiting thread) or condvar.notify_all() (wakes up all waiting threads).
9. It releases the Mutex (typically via RAII when its guard goes out of scope).
This pattern closely mirrors the usage of pthread_cond_t and pthread_mutex_t in C, but Rust’s type system ensures the mutex is correctly held and released: wait() takes the MutexGuard by value, so a thread cannot wait without holding the lock.
```rust
use std::sync::{Arc, Mutex, Condvar};
use std::thread;
use std::time::Duration;

fn main() {
    // Shared state: a boolean flag protected by a Mutex, paired with a Condvar.
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair_clone = Arc::clone(&pair);

    // Waiter thread
    let waiter_handle = thread::spawn(move || {
        let (lock, cvar) = &*pair_clone; // Destructure the tuple inside the Arc
        println!("Waiter: Waiting for notification...");

        // 1. Acquire the lock
        let mut started_guard = lock.lock().unwrap();

        // 2. Check condition in a loop & 3. Wait if false
        while !*started_guard {
            println!("Waiter: Condition false, waiting...");
            // wait() atomically releases the lock and waits.
            // Re-acquires lock before returning.
            started_guard = cvar.wait(started_guard).unwrap();
            println!("Waiter: Woken up, re-checking condition...");
        }

        // 5. Condition is now true
        println!("Waiter: Condition met! Proceeding.");
        // Lock automatically released when started_guard drops here.
    });

    // Notifier thread (main thread)
    println!("Notifier: Doing some work...");
    thread::sleep(Duration::from_secs(1)); // Simulate work before notifying

    let (lock, cvar) = &*pair; // Destructure the original pair

    // 6. Acquire the lock
    {
        // Scope for the lock guard
        let mut started_guard = lock.lock().unwrap();

        // 7. Modify shared state
        *started_guard = true;
        println!("Notifier: Set condition to true.");

        // 8. Notify one waiting thread
        cvar.notify_one();
        println!("Notifier: Notified waiter.");
        // 9. Lock released here when started_guard drops.
    } // End of scope for lock guard

    waiter_handle.join().unwrap();
    println!("Notifier: Waiter thread finished.");
}
```
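The explicit while loop can also be delegated to Condvar::wait_while, which keeps waiting for as long as its closure returns true and therefore handles spurious wakeups internally. The following sketch is one possible shape; the helper name wait_for_flag is hypothetical.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Waits until another thread sets a shared flag, then returns the flag's value.
/// (`wait_for_flag` is a hypothetical helper for illustration.)
fn wait_for_flag() -> bool {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair_clone = Arc::clone(&pair);

    let notifier = thread::spawn(move || {
        let (lock, cvar) = &*pair_clone;
        // The temporary guard is dropped at the end of this statement.
        *lock.lock().unwrap() = true;
        cvar.notify_one();
    });

    let (lock, cvar) = &*pair;
    // wait_while re-checks the predicate after every wakeup and returns
    // only once the closure yields false, i.e. once the flag is set.
    let guard = cvar
        .wait_while(lock.lock().unwrap(), |ready| !*ready)
        .unwrap();
    let ready = *guard;
    drop(guard);

    notifier.join().unwrap();
    ready
}

fn main() {
    println!("flag observed as: {}", wait_for_flag());
}
```

If the notifier runs first, the predicate is already false when wait_while checks it, so the call returns without sleeping; no notification is lost because the flag is always read under the lock.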
22.5.5 Atomic Types
For simple primitive types (bool, integers, pointers), Rust provides atomic types in std::sync::atomic (e.g., AtomicBool, AtomicUsize, AtomicIsize, AtomicPtr). These types guarantee that operations performed on them are atomic: they complete indivisibly without interruption from other threads, even without using explicit locks like Mutex.
Atomic operations include:
- load(): Atomically read the value.
- store(): Atomically write the value.
- swap(): Atomically write a new value and return the previous value.
- compare_exchange(current, new, ...): Atomically compare the stored value with current and, if they match, write new. Returns a Result: Ok containing the previous value on success, or Err containing the actual value on failure. Useful for implementing lock-free algorithms.
- fetch_add(), fetch_sub(), fetch_and(), fetch_or(), fetch_xor(): Atomically perform the operation (e.g., add) and return the previous value.
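As a small illustration of compare_exchange, the sketch below lets several threads race to flip a flag from false to true; only the thread whose comparison succeeds counts as the winner. The helper name count_winners is hypothetical, the orderings chosen are one reasonable option rather than the only valid one, and std::thread::scope (covered in Section 22.5.6) is used so the atomics can be borrowed directly.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// Lets `threads` threads race to claim a flag; returns how many succeeded.
/// (`count_winners` is a hypothetical helper for illustration.)
fn count_winners(threads: usize) -> usize {
    let claimed = AtomicBool::new(false);
    let winners = AtomicUsize::new(0);

    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                // Succeeds only for the one thread that still sees `false`
                // and atomically replaces it with `true`.
                if claimed
                    .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
                    .is_ok()
                {
                    winners.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });

    // All scoped threads have been joined, so the count is final.
    winners.load(Ordering::Relaxed)
}

fn main() {
    println!("threads that claimed the flag: {}", count_winners(8));
}
```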
These operations require specifying a memory ordering (Ordering), such as Relaxed, Acquire, Release, AcqRel, or SeqCst (Sequentially Consistent). Memory ordering controls how atomic operations synchronize memory visibility between threads, preventing unexpected behavior due to compiler or CPU reordering of instructions. Understanding memory ordering is complex and crucial for correctness in lock-free programming, similar to std::memory_order in C++. For simple counters or flags, Relaxed (least strict) or SeqCst (most strict, easiest to reason about but potentially slower) are often sufficient starting points. Unlike C++, where sequential consistency is the default, Rust has no default ordering: every atomic operation takes an explicit Ordering argument.
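To make the role of Acquire and Release more concrete, the sketch below publishes a value through a flag: the producer writes the data and then sets the flag with Release, and the consumer reads the flag with Acquire before reading the data. The helper name publish_and_read is hypothetical, and the busy-wait loop is used only to keep the example short.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Publishes a value from one thread and reads it from another.
/// (`publish_and_read` is a hypothetical helper for illustration.)
fn publish_and_read() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (data_w, ready_w) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        data_w.store(42, Ordering::Relaxed);
        // Release: writes made before this store become visible to any
        // thread that later observes `true` with an Acquire load.
        ready_w.store(true, Ordering::Release);
    });

    // Acquire: pairs with the Release store above.
    while !ready.load(Ordering::Acquire) {
        thread::yield_now();
    }
    // The Acquire/Release pair guarantees this load sees the producer's write.
    let value = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    value
}

fn main() {
    println!("published value: {}", publish_and_read());
}
```

If both flag operations used Relaxed instead, the consumer could in principle observe the flag as true and still read the old data value.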
```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Use Arc to share the atomic counter among threads.
    let shared_counter = Arc::new(AtomicUsize::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter_clone = Arc::clone(&shared_counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                // Atomically increment the counter.
                // Ordering::Relaxed is sufficient here because we only care
                // about the final count, not the order of increments relative
                // to other memory operations.
                counter_clone.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    // Atomically load the final value.
    // join() already synchronizes with each finished thread, so every
    // increment is visible here and Ordering::Relaxed would also be correct.
    // Ordering::SeqCst is simply the most conservative choice.
    let final_count = shared_counter.load(Ordering::SeqCst);
    println!("Atomic counter final value: {}", final_count); // Should be 10000
}
```
Atomics are more efficient than mutexes for simple operations but are limited to primitive types and require careful handling of memory ordering for complex interactions.
22.5.6 Scoped Threads for Borrowing (Rust 1.63+)
As mentioned earlier, std::thread::spawn requires its closure to be 'static, preventing it from directly borrowing local data from the parent thread’s stack unless that data is itself 'static. This often forces the use of Arc even when true shared ownership isn’t strictly necessary.
Scoped threads, introduced via std::thread::scope, provide a solution. This function creates a scope, and any threads spawned within that scope using the provided scope object (s in the example below) are automatically joined before the scope function returns. Because the compiler can rely on this guarantee, threads spawned within the scope can safely borrow data from the parent stack frame that outlives the scope.
```rust
use std::thread;

fn main() {
    let mut numbers = vec![1, 2, 3];
    let mut message = String::from("Hello"); // Mutable data

    println!("Before scope: message = '{}'", message);

    // Create a scope for threads that can borrow local data.
    thread::scope(|s| {
        // Spawn a thread that immutably borrows 'numbers'.
        s.spawn(|| {
            // 'numbers' is borrowed here.
            println!("Scoped thread 1 sees numbers: {:?}", numbers);
            // The borrow ends when this thread finishes.
        });

        // Spawn another thread that mutably borrows 'message'.
        s.spawn(|| {
            // 'message' is mutably borrowed here.
            message.push_str(" from scoped thread 2!");
            println!("Scoped thread 2 modified message.");
            // The mutable borrow ends when this thread finishes.
        });

        // Note: Rust's borrowing rules still apply *within* the scope.
        // You couldn't, for example, spawn two threads that both try to
        // mutably borrow 'message' simultaneously. The compiler prevents this.
        println!("Main thread inside scope, after spawning.");

        // The 'scope' function implicitly waits here for all threads
        // spawned via 's' to complete before it returns.
    }); // <- All threads guaranteed joined here.

    // Scoped threads have finished, borrows have ended.
    // We can safely access 'numbers' and 'message' again.
    numbers.push(4);
    println!("After scope: message = '{}'", message); // Shows modification
    println!("After scope: numbers = {:?}", numbers);
}
```
Scoped threads make many common concurrent patterns, especially those involving partitioning work over borrowed data, significantly more ergonomic than using Arc or other complex lifetime management techniques. The compiler statically verifies that the borrowed data will live long enough.
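Partitioning work over borrowed data might look like the following sketch, which splits a mutable slice into disjoint chunks with chunks_mut and hands each chunk to its own scoped thread. The helper name double_in_parallel and its workers parameter are hypothetical.

```rust
use std::thread;

/// Doubles every element in place, using roughly `workers` scoped threads.
/// (`double_in_parallel` is a hypothetical helper for illustration.)
fn double_in_parallel(values: &mut [i32], workers: usize) {
    let workers = workers.max(1);
    // Ceiling division so every element lands in a chunk; max(1) avoids
    // a chunk size of zero, which chunks_mut does not accept.
    let chunk_size = ((values.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // chunks_mut yields non-overlapping &mut slices, so each thread
        // gets exclusive access to its own part without any locking.
        for chunk in values.chunks_mut(chunk_size) {
            s.spawn(move || {
                for v in chunk.iter_mut() {
                    *v *= 2;
                }
            });
        }
    }); // All threads are joined here; `values` is usable again.
}

fn main() {
    let mut values = vec![1, 2, 3, 4, 5];
    double_in_parallel(&mut values, 2);
    println!("{:?}", values);
}
```

No Arc or Mutex is required because the borrow checker can see that the chunks are disjoint and that every thread finishes before the scope ends.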