Crate simple_raft[][src]

Expand description

Raft consensus algorithm implementation.

Raft is a consensus algorithm which replicates a strongly-consistent distributed log of entries with arbitrary data amongst a group of peers. It is also fault-tolerant, allowing replication to continue while a majority of peers can still communicate with each other. This crate provides an implementation of the Raft consensus algorithm with some optional features not implemented, such as pre-voting, membership changes, and snapshots.

The Raft algorithm is implemented as a state machine driven in a few ways:

  • When attempting to append a new entry to the distributed log: append is called.
  • When a message is received from a peer: receive is called.
  • Every time a fixed amount of time has elapsed: timer_tick is called.

Each of these functions modifies the internal state and returns messages to be sent to peers. Once a log entry is “committed”, or guaranteed to be returned at the same index on every functioning peer in the group, it may be retrieved using take_committed. An append to the log may be cancelled before reaching the committed state, however, which is discussed in more detail in “Appending entries to the distributed log”.

The backing storage for the distributed log must be provided as an implementation of the RaftLog trait, with careful attention to following the trait specification. A trivial in-memory implementation is provided by RaftLogMemory.

Example

use simple_raft::log::mem::RaftLogMemory;
use simple_raft::node::{RaftConfig, RaftNode};
use simple_raft::message::{RaftMessageDestination, SendableRaftMessage};
use rand_chacha::ChaChaRng;
use rand_core::SeedableRng;
use std::collections::VecDeque;
use std::str;

// Construct 5 Raft peers
type NodeId = usize;
let mut peers = (0..5).map(|id: NodeId| RaftNode::new(
    id,
    (0..5).collect(),
    RaftLogMemory::new_unbounded(),
    ChaChaRng::seed_from_u64(id as u64),
    RaftConfig {
        election_timeout_ticks: 10,
        heartbeat_interval_ticks: 1,
        replication_chunk_size: usize::max_value(),
    },
)).collect::<Vec<_>>();

// Simulate reliably sending messages instantaneously between peers
let mut inboxes = vec![VecDeque::new(); peers.len()];
let send_message = |src_id: NodeId, sendable: SendableRaftMessage<NodeId>, inboxes: &mut Vec<VecDeque<_>>| {
    match sendable.dest {
        RaftMessageDestination::Broadcast => {
            println!("peer {} -> all: {}", src_id, &sendable.message);
            inboxes.iter_mut().for_each(|inbox| inbox.push_back((src_id, sendable.message.clone())))
        }
        RaftMessageDestination::To(dst_id) => {
            println!("peer {} -> peer {}: {}", src_id, dst_id, &sendable.message);
            inboxes[dst_id].push_back((src_id, sendable.message));
        }
    }
};

// Loop until a log entry is committed on all peers
let mut appended = false;
let mut peers_committed = vec![false; peers.len()];
while !peers_committed.iter().all(|seen| *seen) {
    for (peer_id, peer) in peers.iter_mut().enumerate() {
        // Tick the timer
        let new_messages = peer.timer_tick();
        new_messages.for_each(|message| send_message(peer_id, message, &mut inboxes));

        // Append a log entry on the leader
        if !appended && peer.is_leader() {
            if let Ok(new_messages) = peer.append("Hello world!") {
                new_messages.for_each(|message| send_message(peer_id, message, &mut inboxes));
                appended = true;
            }
        }

        // Process message inbox
        while let Some((src_id, message)) = inboxes[peer_id].pop_front() {
            let new_messages = peer.receive(message, src_id);
            new_messages.for_each(|message| send_message(peer_id, message, &mut inboxes));
        }

        // Check for committed log entries
        for log_entry in peer.take_committed() {
            if !log_entry.data.is_empty() {
                println!("peer {} saw commit {}", peer_id, str::from_utf8(&log_entry.data).unwrap());
                assert!(!peers_committed[peer_id]);
                peers_committed[peer_id] = true;
            }
        }
    }
}

Modules

Unstable, low-level API for the complete state of a Raft node.

Types related to Raft log storage.

Raft message types for sending between nodes.

Higher-level API for a Raft node.