Getting Started
Installation
Get yourself a Rust toolchain and run
$ cargo install backpak
Better packaging to follow.
Creating a repository
Backpak saves backups in a repository. We can make one in a local folder:
$ backpak --repository ~/myrepo init filesystem
Or, if you'd like to upload to Backblaze B2, the -r/--repository flag just sets the repo's config file:
$ backpak -r ~/myrepo.toml \
init --gpg MY_FAVORITE_GPG_KEY \
backblaze \
--key-id "deadbeef" \
--application-key "SOMEBASE64" \
--bucket "matts-bakpak"
With --gpg, Backpak will run a quick check that it can round-trip data with gpg --encrypt --recipient <KEY>, then encrypt all files in the repo using that command. You can edit the repo config file to use a different, arbitrary command.
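To get a feel for what a round-trip check like that involves, here's a rough shell sketch. It is not Backpak's actual code: roundtrip_check is a made-up helper and the sample string is arbitrary. It encrypts a known string to the given key, decrypts it again, and compares the result.

```shell
# Sketch of a round-trip check (hypothetical helper, not part of Backpak).
# Encrypt a known string, decrypt it, and confirm it comes back unchanged.
roundtrip_check() {
    key="$1"
    sample="backpak round-trip test"
    # With no file argument, gpg reads stdin and writes to stdout.
    result=$(printf '%s' "$sample" \
        | gpg --encrypt --recipient "$key" \
        | gpg --decrypt 2>/dev/null) || return 1
    # Succeed only if the decrypted text matches what we started with.
    [ "$result" = "$sample" ]
}
```

Run it as roundtrip_check MY_FAVORITE_GPG_KEY; a zero exit status means the data survived the trip.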
More backends to follow.
Backing up
Let's make a backup!
$ backpak -r ~/myrepo backup ~/src/backpak/src
Walking {"/home/me/src/backpak/src"} to see what we've got...
/ 297 KB
Opening repository srctest
Building a master index
Finding a parent snapshot
Running backup...
/ P 17 KB + 7 KB | R 281 KB | Z 8 KB | U 9 KB
I 2 packs indexed
D 20 KB downloaded
/home/me/src/backpak/src
Snapshot afe4ajdi done
We print updates as we go:
- How much we Packed into this backup (files + metadata)
- How much we Reused from previous backups
- How much Zstandard ensmallened the data
- How much we Uploaded
If interrupted, the incomplete backup will leave behind a backpak-wip.index and a handful of other files. This allows Backpak to resume where it left off.
You can also:
- Pass multiple paths to backup.
- Specify a backup author with --author (otherwise the machine's hostname is used).
- Annotate your backup with --tag.
- Skip over files and folders (matching regular expressions) with --skip.
- Dereference symbolic links with -L.
- See what you'd back up with --dry-run. (Most commands have this!)
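Several of these options can be combined in one call. The wrapper below is only a sketch: preview_backup is a made-up name, the author, tag, and skip pattern are placeholders, and it assumes --author, --tag, and --skip each take a single value.

```shell
# Hypothetical wrapper: preview a tagged backup of one or more paths.
# Assumes --author, --tag, and --skip each take exactly one value.
preview_backup() {
    # --dry-run shows what would be backed up without writing anything.
    backpak -r ~/myrepo backup \
        --author "my-laptop" \
        --tag "nightly" \
        --skip '\.o$' \
        --dry-run \
        "$@"
}
```

For example, preview_backup ~/src/backpak/src ~/Documents passes both paths through. Drop --dry-run once the preview looks right.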
Your new backup is saved as a snapshot. You can view a list of the repository's snapshots with snapshots:
$ backpak -r ~/myrepo snapshots
...
snapshot afe4ajdifcgfkghmq2tivqlsjnptvri5inb8inn99k0k2
Author: my-desktop
Date: Thu Nov 7 2024 22:55:36 US/Pacific
- /home/me/src/backpak/src
By default, we see the snapshot ID, the author, any tags, the date, and the paths backed up. We can get some additional info by passing more flags:
- --sizes will calculate how much data each snapshot adds to the repo.
- --file-sizes breaks this down further, showing which files added data, sorted largest to smallest.
- --stat shows the changes each backup made compared to the previous one: what was added, removed, etc. (Kinda like git log --stat.) Add --metadata to see metadata changes as well.
Examining snapshots
Each snapshot can be referenced by a few digits of its ID (enough to be unique), or relative to the most recent snapshot: LAST is the latest, followed by LAST~, then LAST~2, LAST~3, and so on.[1]
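The lookup rules can be mimicked with a few lines of shell. This is purely an illustration, not how Backpak resolves references internally: resolve_snapshot is a made-up function that reads snapshot IDs from stdin (oldest first, one per line) and prints the one a reference points to.

```shell
# Sketch of resolving a snapshot reference (hypothetical, for illustration).
# Reads snapshot IDs from stdin, oldest first, one per line.
resolve_snapshot() {
    ref="$1"
    ids=$(cat)
    [ -n "$ids" ] || return 1
    case "$ref" in
        LAST)   back=0 ;;
        LAST~)  back=1 ;;
        LAST~*) back="${ref#LAST~}" ;;
        *)
            # Anything else is an ID prefix; it must match exactly one ID.
            matches=$(printf '%s\n' "$ids" | grep -c -- "^$ref" || true)
            [ "$matches" -eq 1 ] || return 1
            printf '%s\n' "$ids" | grep -- "^$ref"
            return 0 ;;
    esac
    # Reject anything after LAST~ that isn't a plain number.
    case "$back" in ''|*[!0-9]*) return 1 ;; esac
    total=$(( $(printf '%s\n' "$ids" | wc -l) ))
    [ "$back" -lt "$total" ] || return 1
    # Count back from the newest (last) line.
    printf '%s\n' "$ids" | tail -n "$((back + 1))" | head -n 1
}
```

An ambiguous prefix (one that matches two IDs) or a LAST~N that reaches past the oldest snapshot makes the function return nonzero instead of guessing.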
Using these, we can do some routine things, like list the files in the snapshot:
$ backpak -r ~/myrepo ls LAST
src/
src/backend/
src/backend/backblaze.rs
src/backend/cache.rs
...
src/ui/snapshots.rs
src/ui/usage.rs
src/ui.rs
src/upload.rs
Or compare the snapshot to whatever's in the directory currently:
$ backpak -r ~/myrepo diff ra8o
+ src/some-new-thing
+ src/some-other-new-thing
Restoring data
To restore a snapshot,
$ backpak -r ~/myrepo restore LAST
By default, restore doesn't delete anything. If you want it to remove files that aren't in the snapshot:
$ backpak -r ~/myrepo restore --delete LAST
- /home/me/src/backpak/src/some-new-thing
- /home/me/src/backpak/src/some-other-new-thing
Additional flags like --times and --permissions can restore metadata, and --output can restore the snapshot to a different directory than where it came from.
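Putting those flags together, a cautious restore into a scratch directory might look like the sketch below. restore_to_scratch and the default destination are made up for illustration, and it assumes --output takes the destination directory as its value while --times and --permissions take none.

```shell
# Hypothetical helper: restore a snapshot into a scratch directory
# instead of writing over the original paths.
# Assumes --output takes a directory; --times and --permissions take no value.
restore_to_scratch() {
    snapshot="${1:-LAST}"
    dest="${2:-/tmp/backpak-restore}"
    # Create the destination up front so a bad path fails early.
    mkdir -p "$dest" || return 1
    backpak -r ~/myrepo restore \
        --times --permissions \
        --output "$dest" \
        "$snapshot"
}
```

Calling restore_to_scratch LAST~2 ~/scratch would then leave the original files alone and let you inspect the restored copy first.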
If you'd like to dump an individual file from a snapshot, you can do that too:
$ backpak -r ~/myrepo dump LAST src/lib.rs
//! Some big dumb backup system.
//!
//! See the [`backup`] module for an overview and a crappy block diagram.
pub mod backend;
pub mod backup;
pub mod blob;
...
Deleting snapshots
Sometimes you want to remove old snapshots, or you backed up the wrong things. You can remove a snapshot from your repository with
$ backpak -r ~/myrepo forget <ID>
This only deletes the snapshot itself, not the data it points to. (After all, many snapshots can reference the same data!) To run garbage collection on the repo and remove files that aren't referenced by any snapshot anymore, run
$ backpak -r ~/myrepo prune
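If you usually want both steps, a small wrapper can chain them. This is just a convenience sketch (forget_and_prune is a made-up name): it forgets one snapshot and only runs prune if that succeeded.

```shell
# Hypothetical helper: forget one snapshot, then garbage-collect the repo.
forget_and_prune() {
    [ -n "$1" ] || { echo "usage: forget_and_prune <ID>" >&2; return 2; }
    # prune only runs if forget exits successfully.
    backpak -r ~/myrepo forget "$1" && backpak -r ~/myrepo prune
}
```

Keeping the two commands separate, as Backpak does, means you can forget several snapshots and pay for garbage collection once at the end.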
Repository health
If you'd like to know how much space a repository is using, try usage:
$ backpak -r photo-backup.toml usage
2 snapshots, from 2024-08-17T12:39:15 to 2024-08-17T12:57:30
16.48 GB unique data
16.48 GB reused (deduplicated)
2 indexes reference 165 packs
Backblaze usage after zstd compression and gpg:
snapshots: 1 KB
indexes: 448 KB
packs: 16.29 GB
total: 16.29 GB
Like any sane backup system, Backpak tries very hard to make sure data is always left in a consistent state — packs are always uploaded before the index that references them, which is uploaded before its snapshot, etc. But if you're the "trust but verify" type:
$ backpak -r photo-backup.toml check
This reads the indexes and ensures that every pack they mention is present. check --read-packs will go a step further and verify the contents of each pack!
To state the obvious, expect this to take a while since it's reading every byte in the repo.
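Since the two checks cost very different amounts of time, it can be handy to wrap them for scheduled runs, say a quick check nightly and a deep one monthly. The helper below is a sketch with a made-up name (repo_check) and made-up mode names; only check and --read-packs come from Backpak itself.

```shell
# Hypothetical helper for scheduled health checks.
# "quick" verifies that indexed packs exist; "deep" also reads every pack.
repo_check() {
    case "${1:-quick}" in
        quick) backpak -r photo-backup.toml check ;;
        deep)  backpak -r photo-backup.toml check --read-packs ;;
        *)     echo "usage: repo_check [quick|deep]" >&2; return 2 ;;
    esac
}
```

The function passes through Backpak's exit status, so a cron job or script can react to a failed check.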
Read up on the implementation details if you're wondering what the hell an index or a pack is.
Other commands
- backpak copy will copy snapshots between repositories. You can add --skip to leave files you don't want out of the new one.
- backpak filter-snapshot creates a copy of a snapshot in the same repo, but with certain files skipped. (--skip is mandatory!)
- backpak cat will print objects in the repo as JSON. It's mostly meant for debugging.
[1] If your Git habits die hard, HEAD, HEAD~1, HEAD~2, etc. also work.