# Databases : etcd
\[ [src](https://github.com/etcd-io/etcd/) | [docs](https://etcd.io/docs/v3.5/) ]
#### Overview
A strongly consistent, distributed, persistent, hierarchical key-value store implemented in Go.
- uses [[#Raft]] for consensus
- the leader handles all client requests that need consensus, but clients can send requests to any node (they’ll be forwarded)
- API is [[#gRPC]] over HTTP/2
- can watch keys or key prefixes for change events (see the client sketch after this list)
- supports auth with users and roles
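A minimal client sketch in Go, assuming the `go.etcd.io/etcd/client/v3` package and an etcd member reachable at `localhost:2379` (the endpoint and keys are illustrative):

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Connect to a (hypothetical) local member; requests can go to any node,
	// and anything that needs consensus is forwarded to the leader.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Put and Get are gRPC calls to the KV service.
	putResp, err := cli.Put(ctx, "/demo/foo", "bar")
	if err != nil {
		panic(err)
	}
	getResp, err := cli.Get(ctx, "/demo/", clientv3.WithPrefix())
	if err != nil {
		panic(err)
	}
	for _, kv := range getResp.Kvs {
		fmt.Printf("%s = %s\n", kv.Key, kv.Value)
	}

	// Watch the prefix from the Put's revision, so its event is replayed
	// even though the watch is created afterwards.
	watchCh := cli.Watch(context.Background(), "/demo/", clientv3.WithPrefix(),
		clientv3.WithRev(putResp.Header.Revision))
	wresp := <-watchCh
	for _, ev := range wresp.Events {
		fmt.Printf("%s %s = %s\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
	}
}
```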
#### Service Discovery
A new cluster is bootstrapped by:
1. Generating a new discovery token (a UUID).
2. Setting a key on an existing etcd cluster that defines the new cluster, including its size:
`curl -X PUT http://example.com/v2/keys/_etcd/registry/${UUID}/_config/size -d value=${cluster_size}`
3. Having all nodes register themselves under that key (aka the Discovery URL).
###### Public Discovery Services
- https://discovery.etcd.io/
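A Go sketch of bootstrapping against the public service, assuming the documented `GET https://discovery.etcd.io/new?size=N` endpoint (which returns a fresh discovery URL as plain text) and the `--discovery` flag on `etcd`:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Ask the public discovery service for a new token sized for a 3-node cluster.
	resp, err := http.Get("https://discovery.etcd.io/new?size=3")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	url, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}

	// The returned discovery URL is passed to every node, e.g.
	//   etcd --name node1 --discovery <url> ...
	// and each node registers itself under that key until `size` members have joined.
	fmt.Printf("discovery URL: %s\n", url)
}
```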
## Raft
\[ [Wikipedia](https://en.wikipedia.org/wiki/Raft_(algorithm)) | [site](https://raft.github.io/) ]
#### Architecture
Client -> Leader -> Follower
###### Nodes
- are either a Leader (chosen by Election) or Follower
- have a State Machine and a Log
- the State Machine takes input Commands from the Log (see the sketch after this list)
- achieve consensus via the elected Leader
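A toy sketch of this model in Go (illustrative only, not etcd's or Raft's actual code): a log of commands drives a deterministic state machine, so any node that applies the same log ends in the same state.

```go
package main

import "fmt"

// Command is an entry in the replicated log.
type Command struct {
	Term  int // term in which the leader created the entry
	Key   string
	Value string
}

// StateMachine applies committed log entries in order.
type StateMachine struct {
	data map[string]string
}

func NewStateMachine() *StateMachine {
	return &StateMachine{data: map[string]string{}}
}

// Apply executes one command; because Apply is deterministic, any node that
// applies the same sequence of commands reaches the same state.
func (s *StateMachine) Apply(c Command) {
	s.data[c.Key] = c.Value
}

func main() {
	log := []Command{
		{Term: 1, Key: "x", Value: "1"},
		{Term: 1, Key: "y", Value: "2"},
		{Term: 2, Key: "x", Value: "3"}, // "set x to 3" as the 3rd command
	}

	sm := NewStateMachine()
	for _, c := range log {
		sm.Apply(c)
	}
	fmt.Println(sm.data) // map[x:3 y:2]
}
```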
###### Leaders
- serve for a Term, which is sequentially numbered and ends when the Leader becomes unavailable
- regularly inform their followers of their existence by sending heartbeat messages
- are responsible for replicating the log to the followers
- once the leader learns that a majority of the servers (itself plus at least half of its followers) have replicated an AppendEntries entry, it applies the entry to its local state machine and the request is considered committed (see the sketch after this list)
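A simplified sketch of that commit rule in Go (names are assumed, not Raft or etcd source): the leader counts itself plus every follower whose acknowledged match index has reached the entry, and commits once that count is a majority of the cluster.

```go
package main

import "fmt"

// countAcks returns how many servers (leader included) hold the entry at index,
// given each follower's highest acknowledged (replicated) log index.
func countAcks(matchIndex map[string]int, index int) int {
	acks := 1 // the leader always has its own entry
	for _, m := range matchIndex {
		if m >= index {
			acks++
		}
	}
	return acks
}

// canCommit reports whether the entry at index is replicated on a majority of
// the cluster and may therefore be applied to the state machine.
// (Real Raft additionally requires the entry to belong to the current term.)
func canCommit(matchIndex map[string]int, clusterSize, index int) bool {
	return countAcks(matchIndex, index) > clusterSize/2
}

func main() {
	// 5-node cluster: leader + 4 followers; two followers have replicated index 7.
	matchIndex := map[string]int{"f1": 7, "f2": 7, "f3": 5, "f4": 4}
	fmt.Println(canCommit(matchIndex, 5, 7)) // true: 3 of 5 servers have the entry
}
```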
###### Consensus
- the problem is decomposed into two independent sub-problems: Leader Election and Log Replication
- the algorithm must ensure that if any state machine applies `set x to 3` as the *n*-th command, no other state machine will ever apply a different *n*-th command
#### Concepts
###### CAP Theorem
A distributed system can provide at most two of the following guarantees simultaneously:
- **Consistency**
- **Availability**
- **Partition Tolerance**
###### Failure Scenarios
- **Fail-stop** — participant leaves and doesn't come back
- **Fail-recover** — participant leaves and comes back
- **Network partition**
- **Byzantine failure** — malicious participant
> [!info]
> Raft is not a Byzantine fault tolerant (BFT) algorithm: The nodes trust the elected leader.
## gRPC
\[ [Wikipedia](https://en.wikipedia.org/wiki/GRPC) | [site](https://grpc.io/) | [grpcio (Py)](https://pypi.org/project/grpcio/) ]
- Developed at Google; the "g" officially stands for something different in each release.
- It uses HTTP/2 for transport, Protocol Buffers as the interface description language, and provides features such as authentication, bidirectional streaming and flow control, blocking or nonblocking bindings, and cancellation and timeouts.
- gRPC relies on HTTP/2 features (notably trailers) that browsers do not expose, so a gRPC client cannot be implemented directly in the browser; gRPC-Web with a translating proxy is used instead.
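Since etcd's API is gRPC, the KV service can also be called directly through the generated stubs instead of the clientv3 wrapper. A sketch assuming the `go.etcd.io/etcd/api/v3/etcdserverpb` package and an etcd member on `localhost:2379`:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"go.etcd.io/etcd/api/v3/etcdserverpb"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Plain gRPC over HTTP/2 to a (hypothetical) local etcd member.
	conn, err := grpc.Dial("localhost:2379",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	kv := etcdserverpb.NewKVClient(conn)
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Put and Range are the raw RPCs that clientv3's Put/Get wrap.
	if _, err := kv.Put(ctx, &etcdserverpb.PutRequest{
		Key: []byte("/demo/foo"), Value: []byte("bar"),
	}); err != nil {
		panic(err)
	}
	resp, err := kv.Range(ctx, &etcdserverpb.RangeRequest{Key: []byte("/demo/foo")})
	if err != nil {
		panic(err)
	}
	for _, item := range resp.Kvs {
		fmt.Printf("%s = %s\n", item.Key, item.Value)
	}
}
```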