Gitaly Transactions
2025-02-04
A transaction is a group of operations that together perform one logical change. Transactions aim for ACID guarantees: either everything is applied or nothing is.
ACID
- Atomicity -- all or nothing
- Consistency -- leave the DB in a good and correct state
- Isolation -- executes as if it's the only operation running
- Durability -- changes will not be lost after being acknowledged
Gitaly
- gRPC wrapper around git
- RPC handlers are implemented as invocations of the git CLI
- repos on disk are normal git repos
Problems in git
- insufficient concurrency control
- insufficient recovery from write interruptions
- partially applied writes
- garbage left in the repo
- no atomicity, isolation or consistency
- Operations fail unnecessarily due to concurrent changes: refs that change while git fsck is running are likely to be reported as missing, resulting in false positives for repository checks
- Residual temp packfiles are wasting terabytes of storage
Enter Transaction Management in Gitaly
- ACID
- Snapshot isolation
- MVCC
- Optimistic concurrency control
- Write-ahead logging, so changes are recorded in a log first instead of going straight to the repository on disk
- Partitions: a repository plus its forks
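To make snapshot isolation, MVCC and optimistic concurrency control concrete, here is a minimal toy sketch. It is not Gitaly's implementation: `RefStore`, `Transaction` and `ConflictError` are hypothetical names, and refs are modelled as a plain in-memory dict. Each transaction reads from the version that existed when it began, and commit fails if a ref it wrote has changed since then.

```python
class ConflictError(Exception):
    """Raised when a concurrent commit changed a ref this transaction wrote."""


class RefStore:
    """Toy multi-version ref store: every commit appends a new ref map."""

    def __init__(self):
        self.versions = [{}]  # versions[i] is the full ref map after commit i

    def begin(self):
        # The transaction is pinned to the latest committed version.
        return Transaction(self, len(self.versions) - 1)


class Transaction:
    def __init__(self, store, snapshot_version):
        self.store = store
        self.snapshot_version = snapshot_version
        self.writes = {}

    def get(self, ref):
        # Reads see this transaction's own writes, then its snapshot.
        if ref in self.writes:
            return self.writes[ref]
        return self.store.versions[self.snapshot_version].get(ref)

    def set(self, ref, oid):
        # Writes are buffered; nothing is visible to others until commit.
        self.writes[ref] = oid

    def commit(self):
        snapshot = self.store.versions[self.snapshot_version]
        latest = self.store.versions[-1]
        # Optimistic check: no locks were taken, so verify at commit time
        # that nobody changed the refs we are about to overwrite.
        for ref in self.writes:
            if latest.get(ref) != snapshot.get(ref):
                raise ConflictError(ref)
        self.store.versions.append({**latest, **self.writes})


store = RefStore()
t1, t2 = store.begin(), store.begin()
t1.set("refs/heads/main", "aaa")
t1.commit()
# t2 still reads its own snapshot, where the ref does not exist yet.
stale_read = t2.get("refs/heads/main")
```

If `t2` now sets the same ref and commits, the commit raises `ConflictError`, while a transaction touching a different ref would commit cleanly. Comparing values rather than version numbers keeps the sketch short; a real system would track versions to avoid missing a change that was later reverted.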
Demo
See staging and snapshots. Contents are hardlinked to the main repo.
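A rough sketch of what a hardlinked snapshot looks like, assuming a POSIX filesystem where the snapshot lives on the same volume as the repo. The function name `snapshot_tree` and the directory layout are hypothetical and only illustrate the idea: the directory tree is recreated, but file contents are shared with the original instead of copied.

```python
import os


def snapshot_tree(src, dst):
    """Recreate src's directory tree under dst, hardlinking every file."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            # os.link creates a second directory entry for the same inode,
            # so no file data is duplicated.
            os.link(os.path.join(root, name), os.path.join(target_dir, name))
```

Because both paths point at the same inode, a snapshot like this is cheap to create, and the snapshot keeps its view of a file even if the original path is later replaced by a rename.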
txclient --relative-path=@hashed/64/4d/asdfasdfasdf.git --rpc=write
No .lock file left behind. See "Lock file exists" from the docs. Read about optimize_repository.
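For context on why a leftover .lock file matters, here is a simplified sketch of the lock-file pattern plain git relies on for ref updates: create `<path>.lock` exclusively, write the new value, then rename it over the target. The helper `locked_update` and its `crash_before_rename` flag are hypothetical and exist only to simulate an interrupted writer.

```python
import os


def locked_update(path, data, crash_before_rename=False):
    """Update path via an exclusive lock file, as a simplified illustration."""
    lock_path = path + ".lock"
    # O_EXCL makes creation fail if another writer, or a stale lock, exists.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    if crash_before_rename:
        return  # simulated interruption: the lock file stays on disk
    os.replace(lock_path, path)  # atomic rename publishes the new value
```

After a simulated interruption, the next call to `locked_update` on the same path raises `FileExistsError`, which is the toy equivalent of a stale lock blocking later writes until someone cleans it up.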
Each partition has a repo and its forks and the WAL is shared.
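As a hedged illustration of one log shared by a whole partition, the sketch below appends entries with increasing log sequence numbers and replays them in order. `PartitionWAL`, the JSON entry format and the file naming are all assumptions made for this example and do not reflect Gitaly's actual on-disk layout.

```python
import json
import os


class PartitionWAL:
    """Toy write-ahead log shared by every repository in one partition."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        self.next_lsn = 1     # log sequence number for the next entry
        self.applied_lsn = 0  # highest entry already applied to the repos

    def append(self, relative_path, operations):
        """Durably record a transaction's changes before touching the repo."""
        lsn = self.next_lsn
        final = os.path.join(self.directory, f"{lsn:012d}.json")
        tmp = final + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"relative_path": relative_path, "operations": operations}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, final)  # the entry becomes visible atomically
        self.next_lsn += 1
        return lsn

    def replay(self, apply):
        """Apply not-yet-applied entries in LSN order, e.g. after a crash."""
        names = sorted(n for n in os.listdir(self.directory) if n.endswith(".json"))
        for name in names:
            lsn = int(name.split(".")[0])
            if lsn <= self.applied_lsn:
                continue
            with open(os.path.join(self.directory, name)) as f:
                entry = json.load(f)
            apply(lsn, entry)
            self.applied_lsn = lsn
```

Because a repository and its forks append to the same log, their changes get a single total order, and replay can apply them in that order regardless of which repository each entry targets.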
dump-manifest -path ../repositories/whatever/WAL/00000/MANIFEST
The source_path numbers indicate order and there's a pointer to the @hashed path.