Thinking more about backups
Almost a year ago I wrote about how I do backups with Restic and Hetzner. That system has been ticking along well ever since, but recently I had some… thoughts. These backups are all well and good if I accidentally delete a file, or a database gets corrupted, or something, but there are two glaring issues:
Firstly, I’m backing up my Hetzner server to Hetzner cloud storage. If something happens to Hetzner — or my Hetzner account — then all my eggs go down with that basket. Obviously Hetzner are a big organisation and aren’t likely to just vanish overnight, but I’m less confident about my account. Could a false abuse report get it suspended? What if the UK passes even more dumb laws and Hetzner decide it’s easier just to not do business with people here? This is the same sort of concern I have about Google accounts: if you have half of your life in Google Drive and Google Mail, what happens if you comment on a YouTube video, get flagged by an AI moderation process, and your account gets suspended? It’s probably not very likely, but these are things my brain likes to dwell on.
Secondly, the credentials to access the backups sit on each machine that is backed up. If someone malicious gained access to the machine, they’d also have access to delete or tamper with all the backups. It feels a little silly that the same attack could take down both the originals and the backups. There’s no way to avoid that with Hetzner’s S3 implementation, as far as I can tell.
Exploring options
I toyed with the idea of making local copies of the backups, but the only way to avoid the same problems would be to keep them offline and do a manual copy every now and then. I didn’t really want to do that, and was concerned that if I did a monthly offline backup then I stood to lose up to a month of data in the worst case.
I then looked around at other S3 providers. Amazon’s Glacier offering is tempting due to its very low storage costs, but you pay for that if you ever want to restore anything. There are also lots of weird pricing edge cases around moving data between storage classes, minimum file sizes, and so on. A much better option is Backblaze’s B2 product. Their pricing is much more straightforward, and they have an interesting feature that’s particularly useful in this case: lifecycle rules. Coupled with the ability to create API keys that can’t delete files (only “hide” them), this allows for what’s effectively an append-only store.
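As a rough illustration of that append-only idea, here is a minimal Python sketch of the two pieces of configuration involved. The capability names and lifecycle rule fields follow B2's API as far as I'm aware; the 30-day window, the constant name and the helper function are assumptions made up for this example, so check everything against Backblaze's documentation before relying on it.

```python
import json

# Capabilities for a restricted application key. "deleteFiles" is left out
# on purpose: without it, a client can hide files but not remove them.
APPEND_ONLY_CAPABILITIES = ["listBuckets", "listFiles", "readFiles", "writeFiles"]


def lifecycle_rule(days_hidden_before_delete: int, prefix: str = "") -> dict:
    """Build a lifecycle rule that deletes files some days after they are hidden."""
    if days_hidden_before_delete < 1:
        raise ValueError("retention must be at least one day")
    return {
        "fileNamePrefix": prefix,            # "" applies the rule to the whole bucket
        "daysFromUploadingToHiding": None,   # never hide files automatically
        "daysFromHidingToDeleting": days_hidden_before_delete,
    }


if __name__ == "__main__":
    # 30 days is an illustrative value, not a recommendation.
    print(json.dumps([lifecycle_rule(30)], indent=2))
```

The retention window is the trade-off to think about: a longer delay gives more time to notice tampering, at the cost of paying to store hidden files for longer.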
This works more or less out of the box with Restic. Joseph Price has a guide that goes into the setup in a bit more depth. Basically, whenever Restic would delete a file (e.g. during a “forget” or “prune” operation), it instead gets hidden and is only deleted when the B2 lifecycle rules decide it should be. I’ve kept the existing Hetzner S3 backups for now, and just added an extra step to the end of my script: a simple restic copy and a restic forget. B2 actually works out cheaper than the Hetzner storage, as they don’t bill you for a minimum of 1TB storage; my current usage is around $3/month. Not a bad price for some extra peace of mind!
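For anyone wanting to try something similar, the extra copy-then-forget step might look roughly like the sketch below. This is not the actual script: the repository locations and retention values are hypothetical placeholders, and it assumes restic 0.14 or newer, where `copy` takes `--from-repo` (older releases used `--repo2`). By default it only prints the commands rather than running them.

```python
import shlex
import subprocess

# Hypothetical repository locations; substitute your own.
SOURCE_REPO = "s3:https://s3.example.com/my-backups"
DEST_REPO = "b2:my-backup-bucket:restic"


def build_commands(source_repo: str, dest_repo: str,
                   keep_daily: int = 7, keep_weekly: int = 4) -> list[list[str]]:
    """Return the restic invocations for the copy-then-forget step."""
    # Copy snapshots from the existing repository into the B2 one.
    copy_cmd = ["restic", "--repo", dest_repo, "copy", "--from-repo", source_repo]
    # Apply a retention policy to the B2 repository (illustrative values).
    forget_cmd = ["restic", "--repo", dest_repo, "forget",
                  "--keep-daily", str(keep_daily),
                  "--keep-weekly", str(keep_weekly)]
    return [copy_cmd, forget_cmd]


def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands(SOURCE_REPO, DEST_REPO))
```

Credentials are deliberately left out of the code. Restic reads them from environment variables such as `RESTIC_PASSWORD`, `RESTIC_FROM_PASSWORD`, `B2_ACCOUNT_ID` and `B2_ACCOUNT_KEY`, with `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` for the S3 side.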