I'm Jon Olick. I make shiny things. I simplify.

I presented Sparse Voxel Octrees at Siggraph 2008.

Tuesday, March 20, 2012

Backing up your Data in the Cloud

Prerequisites: An Ubuntu Linux Box

Due to all the lightning around here, my paranoia kicked in and I decided to move my data into the cloud. The question was how to do it at a reasonable cost, while also being reasonably certain that the company hosting the data won't someday go out of business and take the data with it.

There are many options: http://en.wikipedia.org/wiki/Comparison_of_online_backup_services

Which to choose?

Dropbox seemed like the best choice in terms of company reliability.

Dropbox costs $10 per month for 50 GB or $20 per month for 100 GB.

This is not actually horrible, but it has the problem that you pay for gigabytes you don't actually use.

Let's look at raw storage providers, which charge per GB per month:

Rackspace CloudFiles: $0.15/GB * 50 GB = $7.50 per month

Amazon AWS S3 Standard: $0.125/GB * 50 GB = $6.25 per month

Amazon AWS S3 Reduced Redundancy: $0.093/GB * 50 GB = $4.65 per month

Amazon is clearly the winner on cost: S3 Reduced Redundancy comes to less than 50% of what Dropbox charges for the same 50 GB.
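
As a quick check on the arithmetic, here is a small shell sketch that recomputes the monthly figures. The function name monthly_cost is made up for illustration, and the rates are simply the March 2012 prices quoted in this post, so treat the output as a snapshot rather than current pricing.

```shell
#!/bin/sh
# Monthly cost for a given per-GB rate and number of gigabytes stored.
# monthly_cost is a hypothetical helper, not part of any provider's tooling.
monthly_cost() {
    # $1 = price in dollars per GB per month, $2 = gigabytes stored
    awk -v rate="$1" -v gb="$2" 'BEGIN { printf "%.2f\n", rate * gb }'
}

echo "Rackspace CloudFiles:         \$$(monthly_cost 0.15 50)"
echo "Amazon S3 Standard:           \$$(monthly_cost 0.125 50)"
echo "Amazon S3 Reduced Redundancy: \$$(monthly_cost 0.093 50)"

# With per-GB pricing you only pay for what you store,
# for example 20 GB on S3 Standard:
echo "Amazon S3 Standard, 20 GB:    \$$(monthly_cost 0.125 20)"
```

The last line is the point of going with a raw provider: the bill scales with the data actually stored instead of jumping between fixed plan sizes.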

Setting up with CloudFiles:

Install cloudfuse: https://github.com/redbo/cloudfuse

Setting up with Amazon AWS S3:

Install s3fs: http://code.google.com/p/s3fs/

Setting up the common elements:

  1. Set up a network share with Samba. https://help.ubuntu.com/11.04/serverguide/C/samba-fileserver.html
  2. Install rsync: sudo apt-get install rsync
  3. Run "crontab -e" and set up an rsync copy from the Samba share into your fuse mount. This entry runs every night at midnight:
    0 0 * * * rsync -va --size-only /srv/samba/share /media/cloudfiles
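
One thing the bare crontab entry does not guard against is the fuse mount being down, in which case rsync would happily copy everything onto the local disk underneath /media/cloudfiles. A minimal sketch of a wrapper that checks first might look like the following. The function name backup_to_cloud is hypothetical, and it assumes the mountpoint utility (part of util-linux on Ubuntu) is available.

```shell
#!/bin/sh
# Hypothetical wrapper for the nightly cron job: refuse to copy unless
# the destination is a live mount, so files never land on the local
# disk underneath an unmounted fuse directory.
backup_to_cloud() {
    src="$1"    # e.g. /srv/samba/share
    dest="$2"   # e.g. /media/cloudfiles
    if ! mountpoint -q "$dest"; then
        echo "backup skipped: $dest is not mounted" >&2
        return 1
    fi
    # Same flags as the crontab entry above.
    rsync -va --size-only "$src" "$dest"
}
```

To use it, save the function in a script of your choosing, add a final line that calls backup_to_cloud /srv/samba/share /media/cloudfiles, and point the crontab entry at that script instead of calling rsync directly.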


  1. I do something like this with Dreamhost, which advertises "unlimited" storage and bandwidth for $5 or $10 a month.

    WebDAV clients for Win7 and Linux work pretty well. Unless I'm working with really big files, it's fast enough to load and save directly to the cloud drive without syncing to a local drive.

    For textual work, everything goes into a local git repo, which gets pushed out to a git repo in the cloud from time to time. This is a lot nicer than regular syncs because a complete history of changes (including abandoned dead-ends) is preserved.

  2. Rsync does not seem to work with cloudfuse. Any ideas?

    1. you sure? IIRC it works with --size-only (which isn't a very good hash, but works in most cases)
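
To illustrate the caveat about --size-only: with that flag rsync skips any file whose size matches on both sides, so an edit that leaves the length unchanged goes unnoticed. The sketch below is only a rough stand-in for that check (same_size is a made-up helper, not how rsync is implemented), and the file names and contents are invented for the demo.

```shell
#!/bin/sh
# Rough stand-in for the comparison made under --size-only:
# two files count as "unchanged" when their byte counts match.
same_size() {
    [ "$(wc -c < "$1")" -eq "$(wc -c < "$2")" ]
}

# Demo: an edit that keeps the length the same slips through.
demo_dir=$(mktemp -d)
printf 'balance=100\n' > "$demo_dir/local.txt"
printf 'balance=900\n' > "$demo_dir/cloud.txt"

if same_size "$demo_dir/local.txt" "$demo_dir/cloud.txt"; then
    echo "size-only check: looks unchanged, copy would be skipped"
fi
if ! cmp -s "$demo_dir/local.txt" "$demo_dir/cloud.txt"; then
    echo "byte comparison: contents actually differ"
fi
rm -rf "$demo_dir"
```

For photos, archives and most documents a changed file almost always changes size, which is why this works in most cases. Fixed-width data files are the ones to watch.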