Cumulus backups
Cloud computing is all the rage these days. However, I'm somewhat reluctant to trust some company in a country halfway across the globe with all my personal data, even if they picture it floating in fluffy heaven. So, in a rather old-fashioned way, I still store all my private documents, photos and emails on hard disks that are my own property.
However, backing up my home server to an external USB drive has become a bit inconvenient lately. So I decided to try backing up to Amazon Simple Storage Service (S3), i.e. the cloud.
My server is on a residential ADSL line with relatively poor upstream bandwidth (1 Mb/s), so incremental backups must use the bandwidth efficiently. There's approximately 4 GB of compressed data to back up, which is right at the limit of what I would consider feasible for a setup like this. Theoretically it should take around 12 hours to upload a full snapshot, but I don't plan to do these very often.
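As a back-of-the-envelope check of that estimate, here is a minimal sketch in Python. The function name and the `efficiency` parameter are illustrative, and the 75% figure is only an assumption standing in for protocol overhead and a line that never quite reaches its nominal rate; it is not a measured value.

```python
def upload_hours(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to upload size_gb gigabytes over a link_mbps megabit/s link.

    efficiency is the assumed fraction of the nominal rate that is actually
    usable for payload (1.0 means the full line rate).
    """
    bits = size_gb * 1e9 * 8                       # decimal gigabytes to bits
    effective_bps = link_mbps * 1e6 * efficiency   # usable bits per second
    return bits / effective_bps / 3600             # seconds to hours


if __name__ == "__main__":
    print(f"ideal line rate: {upload_hours(4, 1):.1f} h")
    print(f"75% efficiency:  {upload_hours(4, 1, 0.75):.1f} h")
```

At the full nominal rate 4 GB would take just under 9 hours, and with the assumed 75% efficiency it comes out at a little under 12 hours.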
After some investigation I decided to use Duplicity: first, because it's advertised as efficient in exactly this use case, and second, because I already use it to back up my computer at work. Although the official man page is a little short on details about S3 storage, there are quite a few articles floating around.
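For illustration, a Duplicity invocation for this kind of setup might be assembled roughly as follows. This is only a sketch, not the exact command used here: the helper function, the bucket name, the prefix, the source path and the GPG key ID are all hypothetical placeholders. The `--encrypt-key` and `--full-if-older-than` options are standard Duplicity options, but the `s3+http://` URL form depends on the Duplicity version, so check the man page for yours.

```python
import shlex


def duplicity_command(source, bucket, prefix, gpg_key, full_every="1M"):
    """Build an argument list for an encrypted Duplicity backup to S3.

    All argument values are supplied by the caller; nothing here is
    specific to any real bucket or key.
    """
    return [
        "duplicity",
        "--encrypt-key", gpg_key,             # encrypt archives with this GPG key
        "--full-if-older-than", full_every,   # force a new full backup, e.g. monthly
        source,                               # local directory to back up
        f"s3+http://{bucket}/{prefix}",       # S3 target (URL scheme varies by version)
    ]


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    cmd = duplicity_command("/home", "example-backup-bucket", "server", "ABCD1234")
    print(shlex.join(cmd))
```

With this form Duplicity expects the Amazon credentials in the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. Runs in between full backups are incremental, which is what keeps the daily upload small.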
The cost of S3 storage is pretty minimal: I never plan to store more than around 20 GB worth of backups, and if I include a monthly 4 GB full snapshot, that comes to $3.50 per month. Granted, this is very expensive compared to the price of an external USB drive, but it has the benefit of being off-site and conveniently accessible from anywhere on the internet.
Of course, the tiny paranoid voice in my head made me check all the worst-case scenarios: if Amazon suddenly disappears from the face of the Earth, I would be left without backups. But I judge that the probability of that happening and me needing the backups at the same time is too low to worry about. The data I'm sending over the Atlantic is encrypted with GPG, so it's presumably safe even in the unlikely case that someone in the US would want to browse through my stuff under the pretense of looking for terrorists or some such nonsense.
One problem I do see is that these backups are not safe if someone breaks into my server, since they could be altered or erased by the attacker, but that's the case with most if not all automated backups. In addition, Sysadminman makes an interesting point: if someone gets hold of my Amazon credentials, they can run up a huge bill in my name, since it's not possible to put bandwidth limits on an account. That's not a possibility that would make me lose sleep at night, but I did make a note to occasionally check my account activity.
Finally, how does this work in practice? A full backup takes 14 hours, while an incremental one finishes in a little less than 15 minutes. One thing I still have on the to-do list is to look into Linux QoS settings and make some adjustments, so that I can still comfortably read my email over IMAP and the NTP client doesn't panic once a month when the full backup is made.
So right now, after two weeks of use, it looks like I'll stick to this backup scheme. Still, it's nice to know I can cancel the service at any moment should any serious problems come up.


