22.5 Sharing Data Safely Between Threads

A primary challenge in threaded programming is safely managing access to data shared between threads. Rust’s type system and standard library provide several primitives that guarantee data race freedom in safe code.

22.5.1 Shared Ownership: Arc<T>

When multiple threads need to own or have long-term access to the same piece of data on the heap, Arc<T> (Atomically Reference Counted) is the tool of choice. It’s a thread-safe version of Rc<T>. Arc<T> provides shared ownership of a value of type T by maintaining a reference count that is updated using atomic operations, making it safe to clone and share across threads.

  • Arc<T> can be cloned (Arc::clone(&my_arc)). Cloning increments the atomic reference count and returns a new Arc<T> pointer to the same allocation.
  • When an Arc<T> pointer is dropped, the reference count is atomically decremented.
  • The inner value T is dropped only when the reference count reaches zero.
  • For Arc<T> to be sendable between threads (Send) or accessible from multiple threads (Sync), the inner type T must itself be Send + Sync.

Arc<T> provides shared immutable access by default. To allow mutation of the shared data, Arc is typically combined with interior mutability types that provide synchronization, such as Mutex or RwLock.
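As a minimal sketch of the read-only case, the following example shares one heap-allocated vector among several threads, each summing its own slice of it. The helper name parallel_sum and the chunking scheme are assumptions made for illustration; they are not part of the standard library.

```rust
use std::sync::Arc;
use std::thread;

/// Sums a shared, read-only vector using up to `workers` threads.
/// `parallel_sum` is a hypothetical helper used only for this sketch.
fn parallel_sum(data: Arc<Vec<u64>>, workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division, clamped to at least 1 so step_by never receives 0.
    let chunk = ((data.len() + workers - 1) / workers).max(1);

    let mut handles = Vec::new();
    for start in (0..data.len()).step_by(chunk) {
        // Each thread gets its own Arc pointer to the same allocation.
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            let end = (start + chunk).min(data.len());
            data[start..end].iter().sum::<u64>()
        }));
    }

    // Collect the partial sums; joining also drops each thread's Arc clone.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let data = Arc::new((1..=100).collect::<Vec<u64>>());
    println!("strong count before: {}", Arc::strong_count(&data));

    let total = parallel_sum(Arc::clone(&data), 4);
    println!("total = {}", total);

    // The clones handed to the helper and its threads have been dropped.
    println!("strong count after: {}", Arc::strong_count(&data));
}
```

No lock is needed here because no thread mutates the vector; Arc alone is enough for shared immutable access.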

22.5.2 Mutual Exclusion: Mutex<T>

A Mutex<T> (Mutual Exclusion) ensures that only one thread can access the data T it protects at any given time. To access the data, a thread must first acquire the mutex’s lock.

  • lock(): Attempts to acquire the lock. If the lock is already held by another thread, the current thread will block until the lock becomes available. It returns a Result<MutexGuard<T>, PoisonError<MutexGuard<T>>>.
    • A Mutex becomes “poisoned” if a thread panics while holding the lock. Subsequent calls to lock() on a poisoned mutex will return an Err(PoisonError). Using unwrap() on the result will propagate the panic, which is often the desired behavior to avoid operating on potentially inconsistent state. You can also handle the PoisonError explicitly if needed.
  • MutexGuard<T>: A smart pointer returned by a successful lock() call. It implements Deref and DerefMut, allowing access to the protected data T. Crucially, it also implements Drop. When the MutexGuard goes out of scope, its Drop implementation automatically releases the lock. This RAII (Resource Acquisition Is Initialization) pattern prevents accidentally forgetting to release the lock, a common bug in C/C++.

The standard pattern for sharing mutable state across threads is Arc<Mutex<T>>: Arc handles the shared ownership, and Mutex handles the synchronized exclusive access for mutation.

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Wrap the counter in Mutex for synchronized access,
    // and Arc for shared ownership across threads.
    let counter = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for i in 0..10 {
        // Clone the Arc pointer. This increases the reference count.
        // The new Arc points to the same Mutex in memory.
        let counter_clone = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            // Acquire the lock. Blocks if another thread holds it.
            // unwrap() panics if the mutex was poisoned.
            let mut num: std::sync::MutexGuard<i32> = counter_clone.lock().unwrap();

            // Access the data via the MutexGuard (dereferences to &mut i32).
            *num += 1;
            println!("Thread {} incremented count to {}", i, *num);

            // The lock is automatically released when 'num' (the MutexGuard)
            // goes out of scope at the end of this block (RAII).
        });
        handles.push(handle);
    }

    // Wait for all threads to complete their work.
    for handle in handles {
        handle.join().unwrap();
    }

    // Lock the mutex in the main thread to read the final value.
    // Need .lock() even for reading, as Mutex provides exclusive access.
    println!("Final count: {}", *counter.lock().unwrap()); // Should be 10
}
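Poisoning, described in the lock() bullet above, can also be handled explicitly rather than with unwrap(). The sketch below deliberately panics in a thread that holds the lock and then recovers the guard with PoisonError::into_inner. The helper names lock_ignoring_poison and poison_demo are hypothetical, and whether it is sound to keep using the data after a panic depends on the invariants of your own program.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Locks the mutex and returns the guard even if the mutex is poisoned.
/// `lock_ignoring_poison` is a hypothetical helper for this sketch.
fn lock_ignoring_poison<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Returns (was the mutex poisoned, value read after recovery).
fn poison_demo() -> (bool, i32) {
    let data = Arc::new(Mutex::new(0));
    let data_clone = Arc::clone(&data);

    // This thread panics while holding the lock, which poisons the mutex.
    let result = thread::spawn(move || {
        let mut guard = data_clone.lock().unwrap();
        *guard = 42;
        panic!("simulated failure while holding the lock");
    })
    .join();
    assert!(result.is_err()); // join() reports the panic as Err

    let poisoned = data.is_poisoned();
    // The write made before the panic is still there; we choose to accept it.
    let value = *lock_ignoring_poison(&data);
    (poisoned, value)
}

fn main() {
    let (poisoned, value) = poison_demo();
    println!("poisoned = {}, value = {}", poisoned, value);
}
```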

22.5.3 Read-Write Locks: RwLock<T>

An RwLock<T> (Read-Write Lock) offers more flexible locking than a Mutex. It allows multiple threads to hold read locks concurrently or allows a single thread to hold a write lock exclusively. This can improve performance for data structures that are read much more often than they are written, as readers do not block each other.

  • read(): Acquires a read lock. Blocks if a write lock is currently held. Returns Result<RwLockReadGuard<T>, PoisonError<...>>. Multiple threads can hold read locks simultaneously.
  • write(): Acquires a write lock. Blocks if any read locks or a write lock are currently held. Returns Result<RwLockWriteGuard<T>, PoisonError<...>>. Only one thread can hold the write lock.
  • RwLockReadGuard<T> / RwLockWriteGuard<T>: RAII guards similar to MutexGuard. They provide access (Deref for read, Deref/DerefMut for write) and automatically release the lock when dropped. Poisoning works similarly to Mutex.

use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

fn main() {
    let config = Arc::new(RwLock::new(String::from("Initial Config")));
    let mut handles = vec![];

    // Spawn reader threads
    for i in 0..3 {
        let config_clone = Arc::clone(&config);
        let handle = thread::spawn(move || {
            // Acquire a read lock (shared access).
            let cfg: std::sync::RwLockReadGuard<String> = config_clone.read().unwrap();
            println!("Reader {}: Config is '{}'", i, *cfg);
            thread::sleep(Duration::from_millis(50)); // Simulate work
            // Read lock released when 'cfg' drops.
        });
        handles.push(handle);
    }

    // Wait briefly to ensure readers likely acquire locks first
    thread::sleep(Duration::from_millis(10));

    // Spawn a writer thread
    let config_clone_w = Arc::clone(&config);
    let writer_handle = thread::spawn(move || {
        println!("Writer: Attempting to acquire write lock...");
        // Acquire a write lock (exclusive access). Blocks until all readers release.
        let mut cfg: std::sync::RwLockWriteGuard<String> = config_clone_w.write().unwrap();
        *cfg = String::from("Updated Config");
        println!("Writer: Config updated.");
        // Write lock released when 'cfg' drops.
    });
    handles.push(writer_handle);

    // Wait for all threads
    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final config: {}", *config.read().unwrap());
}

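Caution: The standard library's RwLock does not guarantee any particular priority policy between readers and writers; the policy depends on the underlying operating system. On some platforms a continuous stream of readers can therefore cause “writer starvation”, where a writer waits a very long time, or indefinitely, to acquire the lock.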
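The reader/writer rules can be observed without blocking by using the non-blocking variants try_read() and try_write(), which return an error instead of waiting when the lock is unavailable. The following single-threaded sketch (rwlock_demo is a hypothetical name) holds one read guard, checks that a second reader is admitted while a writer is refused, and then shows the writer succeeding once the reader is gone.

```rust
use std::sync::RwLock;

/// Returns three observations:
/// (second reader admitted, writer admitted while a reader is active,
///  writer admitted after the reader was dropped).
fn rwlock_demo() -> (bool, bool, bool) {
    let lock = RwLock::new(5);

    // Hold one read guard for the duration of the checks below.
    let r1 = lock.try_read().expect("no writer holds the lock yet");

    // A second read lock can coexist with the first. The temporary guard
    // is dropped at the end of the statement.
    let second_reader_ok = lock.try_read().is_ok();

    // A write lock needs exclusive access, so it is refused while r1 lives.
    let writer_while_reading = lock.try_write().is_ok();

    drop(r1); // Release the remaining read lock.

    // With no readers left, the write lock can be taken.
    let writer_after = match lock.try_write() {
        Ok(mut w) => {
            *w += 1;
            true
        }
        Err(_) => false,
    };

    (second_reader_ok, writer_while_reading, writer_after)
}

fn main() {
    println!("{:?}", rwlock_demo());
}
```

A writer that must not stall can use try_write() in this way and fall back to other work when the lock is busy, at the cost of having to retry later.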

22.5.4 Condition Variables: Condvar

A Condvar (Condition Variable) allows threads to wait efficiently for a specific condition to become true. Condition variables are almost always used together with a Mutex to protect the shared state representing the condition.

The typical pattern is:

  1. A waiting thread acquires the Mutex.
  2. It checks the condition based on the shared state protected by the Mutex.
  3. If the condition is false, it calls condvar.wait(guard) passing the MutexGuard. This atomically releases the mutex lock and puts the thread to sleep.
  4. When the thread is woken up (by another thread calling notify_one or notify_all), wait() automatically re-acquires the mutex lock before returning the new MutexGuard.
  5. The waiting thread must re-check the condition in a loop (a while loop is idiomatic) because wakeups can be “spurious” (occurring without a notification) or the condition might have changed again between the notification and the lock re-acquisition.
  6. A notifying thread acquires the same Mutex.
  7. It modifies the shared state, making the condition true.
  8. It calls condvar.notify_one() (wakes up one waiting thread) or condvar.notify_all() (wakes up all waiting threads).
  9. It releases the Mutex (typically via RAII when its guard goes out of scope).

This pattern closely mirrors the usage of pthread_cond_t and pthread_mutex_t in C, but Rust’s type system ensures the mutex is correctly held and released.

use std::sync::{Arc, Mutex, Condvar};
use std::thread;
use std::time::Duration;

fn main() {
    // Shared state: a boolean flag protected by a Mutex, paired with a Condvar.
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair_clone = Arc::clone(&pair);

    // Waiter thread
    let waiter_handle = thread::spawn(move || {
        let (lock, cvar) = &*pair_clone; // Destructure the tuple inside the Arc
        println!("Waiter: Waiting for notification...");

        // 1. Acquire the lock
        let mut started_guard = lock.lock().unwrap();

        // 2. Check condition in a loop & 3. Wait if false
        while !*started_guard {
            println!("Waiter: Condition false, waiting...");
            // wait() atomically releases the lock and waits.
            // Re-acquires lock before returning.
            started_guard = cvar.wait(started_guard).unwrap();
            println!("Waiter: Woken up, re-checking condition...");
        }

        // 5. Condition is now true
        println!("Waiter: Condition met! Proceeding.");
        // Lock automatically released when started_guard drops here.
    });

    // Notifier thread (main thread)
    println!("Notifier: Doing some work...");
    thread::sleep(Duration::from_secs(1)); // Simulate work before notifying

    let (lock, cvar) = &*pair; // Destructure the original pair

    // 6. Acquire the lock
    { // Scope for the lock guard
        let mut started_guard = lock.lock().unwrap();
        // 7. Modify shared state
        *started_guard = true;
        println!("Notifier: Set condition to true.");
        // 8. Notify one waiting thread
        cvar.notify_one();
        println!("Notifier: Notified waiter.");
        // 9. Lock released here when started_guard drops.
    } // End of scope for lock guard

    waiter_handle.join().unwrap();
    println!("Notifier: Waiter thread finished.");
}
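The explicit while loop around wait() can also be written with Condvar::wait_while, which takes the guard and a closure and keeps waiting for as long as the closure returns true. The sketch below is a compact variant of the example above with the roles swapped (the spawned thread notifies, the main thread waits); the function name wait_for_flag is hypothetical.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Waits until a spawned thread sets the shared flag, then returns its value.
fn wait_for_flag() -> bool {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair_clone = Arc::clone(&pair);

    // Notifier: set the flag under the lock, then wake one waiter.
    let notifier = thread::spawn(move || {
        let (lock, cvar) = &*pair_clone;
        *lock.lock().unwrap() = true; // guard is dropped at the end of this statement
        cvar.notify_one();
    });

    let (lock, cvar) = &*pair;
    // wait_while re-checks the closure after every wakeup, so spurious
    // wakeups are handled without a hand-written loop. If the flag is
    // already true, it returns immediately without sleeping.
    let guard = cvar
        .wait_while(lock.lock().unwrap(), |ready| !*ready)
        .unwrap();
    let ready = *guard;
    drop(guard); // release the lock before joining

    notifier.join().unwrap();
    ready
}

fn main() {
    println!("flag observed as {}", wait_for_flag());
}
```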

22.5.5 Atomic Types

For simple primitive types (bool, integers, pointers), Rust provides atomic types in std::sync::atomic (e.g., AtomicBool, AtomicUsize, AtomicIsize, AtomicPtr). These types guarantee that operations performed on them are atomic—they complete indivisibly without interruption from other threads, even without using explicit locks like Mutex.

Atomic operations include:

  • load(): Atomically read the value.
  • store(): Atomically write the value.
  • swap(): Atomically write a new value and return the previous value.
  • compare_exchange(current, new, ...): Atomically compare the stored value with current and, if they match, write new. Returns a Result: Ok(previous value) if the exchange happened, Err(actual value) if it did not. Useful for implementing lock-free algorithms.
  • fetch_add(), fetch_sub(), fetch_and(), fetch_or(), fetch_xor(): Atomically perform the operation (e.g., add) and return the previous value.

These operations require specifying a memory ordering (Ordering), such as Relaxed, Acquire, Release, AcqRel, or SeqCst (Sequentially Consistent). Memory ordering controls how atomic operations synchronize memory visibility between threads, preventing unexpected behavior due to compiler or CPU reordering of instructions. Understanding memory ordering is complex and crucial for correctness in lock-free programming, similar to std::memory_order in C++. Unlike C++, where atomic operations default to sequentially consistent ordering, Rust has no default: every call must name an ordering explicitly. For simple counters or flags, Relaxed (least strict) or SeqCst (most strict, easiest to reason about but potentially slower) are often sufficient starting points.
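To illustrate compare_exchange, here is a sketch of the usual retry loop: read the current value, compute the desired new value, and try to install it, starting over with the freshly observed value if another thread got there first. The functions increment_below and run_bounded are hypothetical names for this example, and SeqCst is used for the successful exchange only as a conservative choice.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Increments `counter` only while it is below `limit`.
/// Returns true if this call performed the increment.
fn increment_below(counter: &AtomicUsize, limit: usize) -> bool {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        if current >= limit {
            return false;
        }
        // Succeeds only if the stored value is still `current`.
        match counter.compare_exchange(
            current,
            current + 1,
            Ordering::SeqCst,  // ordering if the exchange succeeds
            Ordering::Relaxed, // ordering if it fails
        ) {
            Ok(_) => return true,
            // Another thread changed the value; retry with what it is now.
            Err(actual) => current = actual,
        }
    }
}

/// Runs `threads` threads that each try `attempts` bounded increments.
/// Returns (final counter value, number of successful increments).
fn run_bounded(threads: usize, attempts: usize, limit: usize) -> (usize, usize) {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                (0..attempts)
                    .filter(|_| increment_below(&counter, limit))
                    .count()
            })
        })
        .collect();
    let successes: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    (counter.load(Ordering::SeqCst), successes)
}

fn main() {
    let (final_value, successes) = run_bounded(8, 100, 250);
    println!("final = {}, successful increments = {}", final_value, successes);
}
```

A plain fetch_add could not express this, because the check against the limit and the increment must happen as one indivisible step.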

use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Use Arc to share the atomic counter among threads.
    let shared_counter = Arc::new(AtomicUsize::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter_clone = Arc::clone(&shared_counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                // Atomically increment the counter.
                // Ordering::Relaxed is sufficient here because we only care
                // about the final count, not the order of increments relative
                // to other memory operations.
                counter_clone.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    // Atomically load the final value.
    // The join() calls above already make every thread's increments visible
    // here, so Relaxed would also suffice; SeqCst is simply the conservative choice.
    let final_count = shared_counter.load(Ordering::SeqCst);
    println!("Atomic counter final value: {}", final_count); // Should be 10000
}

Atomics are more efficient than mutexes for simple operations but are limited to primitive types and require careful handling of memory ordering for complex interactions.
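One of the simplest cases where the ordering matters is publishing data through a flag. In the sketch below (publish_and_read is a hypothetical name), the producer writes a value and then sets a flag with Release; the consumer spins until it reads the flag with Acquire. That Release/Acquire pair is what guarantees the consumer then sees the value written before the flag was set, even though the value itself is stored and loaded with Relaxed.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Publishes a value from one thread and reads it from another.
fn publish_and_read() -> usize {
    let data = Arc::new(AtomicUsize::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            // Write the payload first...
            data.store(42, Ordering::Relaxed);
            // ...then publish it. Release orders the store above before
            // this flag becomes visible to an Acquire load.
            ready.store(true, Ordering::Release);
        })
    };

    // Acquire pairs with the Release store: once we observe `true`,
    // the payload written before it is guaranteed to be visible.
    while !ready.load(Ordering::Acquire) {
        thread::yield_now();
    }
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}

fn main() {
    println!("consumer saw {}", publish_and_read());
}
```

If both flag operations used Relaxed instead, the flag itself would still be updated atomically, but nothing would order the payload write relative to it.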

22.5.6 Scoped Threads for Borrowing (Rust 1.63+)

As mentioned earlier, std::thread::spawn requires closures with a 'static lifetime, preventing them from directly borrowing local data from the parent thread’s stack unless that data is itself 'static. This often forces the use of Arc even when true shared ownership isn’t strictly necessary.

Scoped threads, introduced via std::thread::scope, provide a solution. This function creates a scope, and any threads spawned within that scope using the provided scope object (s in the example below) are automatically joined before the scope function returns. Because of this guarantee, the compiler can allow threads spawned within the scope to safely borrow data from the parent stack frame that outlives the scope.

use std::thread;

fn main() {
    let mut numbers = vec![1, 2, 3];
    let mut message = String::from("Hello"); // Mutable data

    println!("Before scope: message = '{}'", message);

    // Create a scope for threads that can borrow local data.
    thread::scope(|s| {
        // Spawn a thread that immutably borrows 'numbers'.
        s.spawn(|| {
            // 'numbers' is borrowed here.
            println!("Scoped thread 1 sees numbers: {:?}", numbers);
            // The borrow ends when this thread finishes.
        });

        // Spawn another thread that mutably borrows 'message'.
        s.spawn(|| {
            // 'message' is mutably borrowed here.
            message.push_str(" from scoped thread 2!");
            println!("Scoped thread 2 modified message.");
            // The mutable borrow ends when this thread finishes.
        });

        // Note: Rust's borrowing rules still apply *within* the scope.
        // You couldn't, for example, spawn two threads that both try to
        // mutably borrow 'message' simultaneously. The compiler prevents this.

        println!("Main thread inside scope, after spawning.");
        // The 'scope' function implicitly waits here for all threads
        // spawned via 's' to complete before it returns.
    }); // <- All threads guaranteed joined here.

    // Scoped threads have finished, borrows have ended.
    // We can safely access 'numbers' and 'message' again.
    numbers.push(4);
    println!("After scope: message = '{}'", message); // Shows modification
    println!("After scope: numbers = {:?}", numbers);
}

Scoped threads make many common concurrent patterns, especially those involving partitioning work over borrowed data, significantly more ergonomic than using Arc or other complex lifetime management techniques. The compiler statically verifies that the borrowed data will live long enough.
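As a sketch of the partitioning pattern, the following function splits a mutable slice into disjoint chunks with chunks_mut and hands each chunk to its own scoped thread. The name double_in_parallel and the chunk-size calculation are assumptions made for illustration. No Arc or Mutex is needed, because each thread holds an exclusive borrow of a different part of the slice.

```rust
use std::thread;

/// Doubles every element of `values` in place, using up to `workers` threads.
/// `double_in_parallel` is a hypothetical helper used only for this sketch.
fn double_in_parallel(values: &mut [i32], workers: usize) {
    let workers = workers.max(1);
    // Ceiling division, clamped to at least 1 because chunks_mut panics on 0.
    let chunk = ((values.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // chunks_mut yields non-overlapping &mut sub-slices, so the borrow
        // checker accepts one mutable borrow per thread.
        for part in values.chunks_mut(chunk) {
            s.spawn(move || {
                for v in part.iter_mut() {
                    *v *= 2;
                }
            });
        }
    }); // All scoped threads are joined here.
}

fn main() {
    let mut values = vec![1, 2, 3, 4, 5];
    double_in_parallel(&mut values, 2);
    println!("{:?}", values);
}
```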