This post is an open deliberation on the best usage strategy for the new EBS offered by Amazon Web Services.
Last Thursday Amazon launched a much-anticipated new service: Elastic Block Store (EBS).
EBS offers persistent storage (something that was missing from EC2) at a price of 10 cents per GB per month. In addition, one has to pay another 10 cents per million I/O requests.
I was quick to jump and test the new offering. As a matter of fact, I had to, because I was about to transfer a customer’s server to EC2 on Friday anyhow. It took me a few hours to accomplish the task, most of them spent in moving data from server to server.
Setting up and using a new EBS volume is fairly simple, and with the aid of Elasticfox it is simpler still.
While the technicalities of the migration did not pose a serious obstacle, I realized that all of a sudden there was too much storage available and I needed a strategy for making the best and most cost-effective use of it.
Let me explain.
The site I was migrating was a small site, so my choice of the EC2 instance was obvious: the smaller one.
The small instance comes with 160 GB of storage, split in two: 10 GB for the operating system (only a quarter of it actually used) and the remaining 150 GB completely free.
Now, the problem with EC2 instances is that once they are shut down, storage and data evaporate into thin air. Frequent backups are, therefore, a must.
Before the introduction of EBS, one could back up the data to S3 (but the process was rather tedious), or to another server.
With EBS, one can decide how many gigabytes of persistent storage are needed, create a volume, attach it to the instance and make the necessary changes so that all data, from then on, is stored on the EBS volume. If something goes wrong with the instance, all the data stays intact.
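For those who prefer a script to Elasticfox, the create-and-attach step could look roughly like the sketch below. It assumes the boto3 library and an EC2 client created with boto3.client("ec2"), neither of which I used for this migration; the device name /dev/sdf is only an example.

```python
def create_and_attach_volume(ec2, instance_id, zone, size_gb=50, device="/dev/sdf"):
    """Create an EBS volume and attach it to a running instance.

    `ec2` is assumed to be a boto3 EC2 client. The volume has to be
    created in the same availability zone as the instance.
    """
    volume_id = ec2.create_volume(AvailabilityZone=zone, Size=size_gb)["VolumeId"]
    # Block until the new volume is ready to be attached.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return volume_id
```

Once attached, the volume still has to be formatted and mounted from inside the instance before the application data can be moved onto it.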
For the record, I created a 50 GB volume, so the total storage available was now 210 GB.
EBS offers a kind of backup to S3 called a snapshot. With Elasticfox it is a piece of cake to take a snapshot of an EBS volume and store it in S3, and while the risk of an EBS failure is 1/10 that of a normal hard disk, if it does happen, the data can be restored from the latest snapshot in S3.
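The same snapshot-and-restore cycle can be scripted. The sketch below is only an illustration and again assumes a boto3 EC2 client rather than Elasticfox; the function names are my own.

```python
from datetime import datetime, timezone


def snapshot_volume(ec2, volume_id):
    """Take a snapshot of an EBS volume; the snapshot itself is stored in S3."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    response = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Backup of {volume_id} taken {stamp}",
    )
    return response["SnapshotId"]


def restore_volume(ec2, snapshot_id, zone):
    """Create a fresh volume from a snapshot, e.g. after a volume failure."""
    response = ec2.create_volume(SnapshotId=snapshot_id, AvailabilityZone=zone)
    return response["VolumeId"]
```

The restored volume is a new one, so it has to be attached and mounted again before the instance can use it.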
Now, this is all well and good, but isn’t that too much redundant storage? And doesn’t this mean wasted money for AWS customers?
My first thought was to place the database files on the EBS volume too, to make them safe from failure. The uploaded files had to go there as well, for the same reason.
The application itself, not a really gigantic one, stayed in the 10 GB partition. And since it is not going to change that often, I built an instance image with the application included and stored it in S3.
What else is left to store? Not much. See the problem? The instance now has 150 GB of storage completely unused.
On second thought, I decided to move the database back from the EBS volume to the EC2 instance. The reason? Well, the database is responsible for most of the I/O, and I/O, as we said, is a cost factor.
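To put a rough number on that, here is a small back-of-the-envelope calculation using the prices quoted earlier. The sustained request rates are hypothetical, not measurements from the site I migrated, and a 30-day month is assumed.

```python
# Prices as quoted in this post.
STORAGE_USD_PER_GB_MONTH = 0.10
IO_USD_PER_MILLION_REQUESTS = 0.10
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def monthly_storage_cost(size_gb):
    """Monthly charge for the provisioned volume size."""
    return size_gb * STORAGE_USD_PER_GB_MONTH


def monthly_io_cost(requests_per_second):
    """Monthly charge for a sustained average I/O request rate."""
    requests = requests_per_second * SECONDS_PER_MONTH
    return requests / 1_000_000 * IO_USD_PER_MILLION_REQUESTS


if __name__ == "__main__":
    print(f"50 GB volume: ${monthly_storage_cost(50):.2f} per month")
    for rate in (1, 10, 100):  # hypothetical sustained request rates
        print(f"{rate:>3} req/s: ${monthly_io_cost(rate):.2f} per month")
```

Under these assumptions, a database averaging 100 requests per second would cost about $26 a month in I/O alone, roughly five times the $5 storage charge for the 50 GB volume.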
Keeping the database on the instance does not consume too much space. A cron script can back it up at regular intervals and save the backup to EBS. If the instance fails, though, any data written between two consecutive backups is at risk.
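Such a cron job could be as simple as the sketch below. It assumes the database has already been dumped to a file by whatever dump tool it provides; the paths, the file naming scheme and the retention count are all hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_dump(dump_file, backup_dir, keep=7, now=None):
    """Copy a database dump to the EBS mount and prune the oldest copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    now = now or datetime.now()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"db-{now:%Y%m%d-%H%M%S}.sql"
    shutil.copy2(dump_file, target)
    # Timestamped names sort chronologically, so the oldest come first.
    backups = sorted(backup_dir.glob("db-*.sql"))
    for old in backups[:-keep]:
        old.unlink()
    return target


# Example call (hypothetical paths), to be run from cron after the dump:
#   backup_dump("/var/backups/db.sql", "/mnt/ebs/backups", keep=24)
```

Run hourly with keep=24, this would hold one day of dumps on the EBS volume, so at most an hour of data is exposed if the instance dies.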
EBS cost issues aside, there is always this 150 GB left unused that puzzles me. Assuming most AWS customers will play it safe and use EBS, what are they supposed to do with the storage of their instances?
Either they have to come up with a reasonable use, or Amazon needs to change the instance policies. What do you think?
Photo courtesy of Flickr user skimaniac