Talk Nerdy To Me: Amazon EC2 backup shell script

Update: right, here’s a new version of the script that doesn’t spew when you run it as root.

Update 2: made the path to /sbin/service explicit so that mysqld goes down properly. 

Update 3: here’s a text version without the silly smartquotes. Download it, change the extension to .sh, and go nuts.

Update 4: oh just go look at it on GitHub

As you may have noticed from the intermittent downtime, I’ve been screwing around with the josh.sg server for the last couple of weekends, moving it off a Rimuhosting server and onto an Amazon AWS EC2 micro instance. (Incidentally, if you’re looking for a virtual server host, I can’t recommend Rimuhosting enough: their service is terrific, their prices are eminently reasonable, and their automatic backup and clueful tech staff have saved my arse multiple times.)

Now, EC2 is pretty great. It’s extremely flexible, it’s comically cheap to spin up new servers (especially if you use reserved instances for base-load needs or spot instances for peak-load needs), and it’s usually pretty reliable. But Amazon specifically warns that you shouldn’t expect your instances to stay up forever, and that you should build your applications to be fault-tolerant.

Mercifully, EC2 makes it extremely easy to roll your own backups: the EC2 API command-line tools include an ec2-create-snapshot command that, pointed at the EBS volume your instance runs from, will take a snapshot of the entire system that you can boot from if things go wrong.

Here’s the bash script that I put together to automate my backups on an Amazon Linux-based AMI. Stick it in root’s crontab, set it to run once a week when you can afford two minutes’ downtime, and let it rip:

#!/bin/bash
export EC2_PRIVATE_KEY={path to your AWS private key}
export EC2_CERT={path to your AWS certificate}
export JAVA_HOME=/usr/lib/jvm/jre

export EC2_HOME=/opt/aws/apitools/ec2                

/sbin/service mysqld stop

# Find the volume ID that we're currently attached to
VOLUME_ID=$(/opt/aws/bin/ec2-describe-volume-status | grep vol | awk '{print $2}')

# Take a snapshot of that volume; return the snap ID
SNAP_ID=$(/opt/aws/bin/ec2-create-snapshot "$VOLUME_ID" -d "Weekly backup" | awk '{ print $2 }')

STATUS=pending
while [ "$STATUS" != "completed" ]
do
    # Check whether the snapshot is "completed" or still "pending"
    echo "volume $VOLUME_ID, awaiting snap completion..."
    sleep 3
    STATUS=$(/opt/aws/bin/ec2-describe-snapshots "$SNAP_ID" | grep SNAPSHOT | awk '{ print $4 }')
done

/sbin/service mysqld start
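
One thing the loop in that script can't do is give up: if a snapshot never reaches "completed", mysqld stays down until somebody notices. Here's a minimal sketch of a bounded wait, assuming the same tool paths as the script. The function names wait_for_completed and snapshot_status are made up for illustration, and the retry count and delay are whatever you pass in.

```shell
#!/bin/bash
# Sketch only: poll a status command until it prints "completed",
# or give up after a fixed number of attempts.

# wait_for_completed MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND prints "completed", 1 if it never does.
wait_for_completed() {
    local max_tries=$1 delay=$2 try=1 status
    shift 2
    while [ "$try" -le "$max_tries" ]; do
        status=$("$@")
        if [ "$status" = "completed" ]; then
            return 0
        fi
        sleep "$delay"
        try=$((try + 1))
    done
    return 1
}

# Hypothetical helper: the same status check the script performs,
# wrapped in a function so it can be handed to wait_for_completed.
snapshot_status() {
    /opt/aws/bin/ec2-describe-snapshots "$1" | grep SNAPSHOT | awk '{ print $4 }'
}
```

To use it, you'd swap the while loop for something like `wait_for_completed 200 3 snapshot_status "$SNAP_ID"` (about ten minutes of polling; both numbers are arbitrary) and leave the `/sbin/service mysqld start` line after it, so the database comes back up whether or not the snapshot finished. As for the crontab, a line such as `0 4 * * 0 /root/ec2-backup.sh` (the script path is hypothetical) would run the backup at 04:00 every Sunday.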

A few little wrinkles:

  • The script stops mysqld before taking the snapshot; if you don’t do this, the snapshot can catch the database files mid-write and you’re running the risk of a corrupted database. Yes, this means failed database reads and writes for as long as the snapshot takes, so it’s not a perfect solution.

  • You’ll need to go through your list of snapshots every couple of weeks and prune them, otherwise you’re gonna have a lot of old snapshots sitting around and costing you money. You could script something around ec2-delete-snapshot that would take care of this for you, and that might be my next project…
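
For the pruning wrinkle, here's a rough sketch of what a script around ec2-delete-snapshot might look like. It's not a tested production tool, and it leans on a few assumptions: that ec2-describe-snapshots prints the snapshot ID in column 2 and the status in column 4 (as the backup script already relies on), that the start time is column 5 in ISO format, that `-o self` limits the listing to your own snapshots, and that GNU date is available. The function names old_snapshots and prune_snapshots are made up for illustration.

```shell
#!/bin/bash
# Sketch only: delete "Weekly backup" snapshots older than N days.

# old_snapshots CUTOFF_DATE
# Reads ec2-describe-snapshots output on stdin and prints the IDs of
# completed "Weekly backup" snapshots started before CUTOFF_DATE
# (YYYY-MM-DD). ISO dates sort correctly as plain strings.
# Assumption: column 5 holds the start time.
old_snapshots() {
    awk -v cutoff="$1" '
        $1 == "SNAPSHOT" && $4 == "completed" && /Weekly backup/ &&
        substr($5, 1, 10) < cutoff { print $2 }
    '
}

# prune_snapshots DAYS
# Deletes every matching snapshot older than DAYS days.
prune_snapshots() {
    local cutoff
    cutoff=$(date -d "$1 days ago" +%Y-%m-%d)   # GNU date syntax
    /opt/aws/bin/ec2-describe-snapshots -o self | old_snapshots "$cutoff" |
        while read -r snap_id; do
            echo "deleting $snap_id"
            /opt/aws/bin/ec2-delete-snapshot "$snap_id"
        done
}
```

Calling `prune_snapshots 28` after the same EC2_PRIVATE_KEY, EC2_CERT, JAVA_HOME and EC2_HOME exports as the backup script would keep roughly the last four weekly snapshots. Because the filter matches on the "Weekly backup" description, snapshots you took by hand with a different description are left alone. It's worth piping the listing through old_snapshots on its own first and eyeballing the IDs before letting it delete anything.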