Update (3/3/2010): a better measure of RAID performance is available here.
While considering different options for a database server, I decided to do some digging into Amazon Web Services (AWS) as an alternative to dedicated servers from an ISP. I was most curious about the I/O of the Elastic Block Storage (EBS) on the Elastic Compute Cloud (EC2). What I tested was a number of different file systems EXT3, JFS, XFS, ReiserFS as single block devices and then some different software RAID configurations leveraging JFS. The tests were run using Bonnie++.
The configuration was vanilla, no special tuning was done, just the default values that are assigned by the tools. I used Fedora Core 9 as the OS from the default Amazon AMI and used “yum install” to aquire the necessary utilities (more on that below). I expect with further tuning, some increases in performance can still be obtained. I used the small instance for cost reasons, which includes “moderate” I/O performance. Running on a large or extra-large standard instance should perform even better with “high” I/O performance. You can get all the instance specifications from Amazon.
First I wanted to determine what the EBS devices would compare to in the physical world. I ran Bonnie against a few entry level boxes provided by a number of ISP’s and found the performance roughly matched a locally attached SATA or SCSI drive when formatted with EXT3. I also found that JFS, XFS and ReiserFS performed slightly better than EXT3 in most tests except block writes.
Again, let me re-iterate that some numbers may not be accurately reflected in your production environment. Amazon states, small instances have “moderate” I/O availability. Presumably if your running this for a production DB, you’ll want to consider a large or extra-large instance for the memory and so you should see slightly better performance from your configuration. Also note, that the drives I allocated were rather small (to keep testing costs low) so you may experience different results with larger capacities.
Note: The graph below is in KB, not bytes as titled.
|Size (Filesystem)||Output Per Char||Output Block||Output Re-write||Input Per Char||Input Block|
|4x5Gb RAID5 (JFS)||22,349||58,672||39,149||25,332||84,863|
|4x5Gb RAID0 (JFS)||24,271||99,152||43,053||26,086||96,320|
As expected, RAID 0 does best with read/write speed and RAID 5 does very well on reads (input block) as well. For InnoDB, the re-write and block read (input)/write (output) operations are the most critical values. Longer bars mean better values. To better understand what the test is doing, be sure to read the original Bonnie description of each field.
The process for making a device is simple. There are many tutorials on how to make this persistent and you can certainly build this into your own AMI when you’re done – this is not a tutorial on how to do that. To get a volume up and running you’ll follow these basic steps:
- Determine what you want to create – capacity, filesystem type etc.
- Allocate EBS storage
- Attache the EBS storage to your EC2 instance
- If using RAID, create the volume.
- Format the filesystem
- Create the mount point on the instance filesystem
- Mount the storage
- Add any necessary entries to mount storage at boot time.
Single Disk Images
Remember, the speed and efficiency of the single EBS device is roughly comparable to a modern SATA or SCSI drive. Use of a different filesystem (other than EXT3) can increase different aspects of drive performance, just as it would with a physical hard drive. This isn’t a comparison of the pros and cons of different engines, but simply providing my findings during testing.
|JFS||yum install jfsutils|
|XFS||yum install xfsprogs|
|ReiserFS||yum install reiserfs-utils|
I didn’t test any other filesystems such as ZFS, because I’ve read some other filesystems are unstable on Linux and I’ll be running production on Linux so the extra time for the tests seemed unnecessary. I am interested in other alternatives that could increase performance if you have any to share I’d love to hear about them.
You can quickly get a volume setup with the following:
mkfs -t ext3 /dev/sdf mkdir /vol1 mount /dev/sdf /vol1
Next time you mount the volume, you won’t need to use “mkfs” because the drive is already formatted.
The default AMI already includes support for RAID, but if you needed to add them to your yum enabled system, it’s “yum install mdadm”. On the Fedora Core 9 test rig I was using, RAID 0, 1, 5, 6 were supported, YMMV.
To create a 4 disk RAID 0 volume, it’s simply:
mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi mkfs -t ext3 /dev/md0 mkdir /raid mount /dev/md0 /raid
To create a 4 disk RAID 5 volume instead, it’s simply:
mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi mkfs -t ext3 /dev/md0 mkdir /raid mount /dev/md0 /raid
This example assumes you have 4 EBS volumes attached to the system. AWS shows 7 possible mount points /dev/sdf – /dev/sdl in the web console, however, the documentation states you can use through /dev/sdp, which is 11 EBS volumes in addition to the non-persistent storage. This would be a theoretical maximum of 10TB of RAID 5 or 11TB of RAID 0 storage!
Checking in on things…
- cat /proc/mdstat
is a great way to check in on the RAID volume. If you run it directly after creating a mirroring or striping array, you’ll also be able to see the scrubbing process and how far along it is.
- mount -l
shows the currently mounted devices and any options specified.
disk free provides a nice list of device mounts and their total, available and used space.
It’s clear from the numbers that software RAID offer a clear performance advantage over a ESB volume. Since with EBS you pay per Gb not per disk, it’s certainly cost effective to create a robust RAID volume. The question that remains is how careful do you need to be with your data? RAID 0 offered blistering fast performance but like a traditional array, without redundancy. You can always set it up as RAID 5, RAID 6 or RAID 10 but this of course requires more unusable space to handle the redundancy.
Since the volumes on EBS are theoretically invincible, it may be okay to run unprotected by a mirror or parity drive, however, I haven’t found anyone who would recommend this in production. If anyone knows of a good reason to ignore the saftey of RAID 10 or RAID 6 or RAID 5, I’d love to hear the reasoning.
I am also curious if these drives maintain a consistent throughput over the full capacity of the disk or will they slow down as the drive fills like a traditional drive? I did not test this. It remains open for another test (and subsequent blog post). Should anyone run ZCAV against a 100Gb+ drive and figure that out, please let me know.
Fine Print – The Costs
Storage starts at a reasonable $0.10/GB-Month which is reasonable and is prorated for only the time you use it. A 1Tb RAID 0 volume made of 10x100Gb volumes would only cost $1,200 per year. Good luck getting performance/dollar costs for 1Tb like that from any SAN solution at a typical ISP. There are however some hidden costs in the I/O that you’ll need to pay attention to. Each time you read or write a block to disk, there’s an incremental cost. The pricing is $0.10 per million I/O requests – which seems cheap, but just running the simple tests I ran with Bonnie++ I consumed almost 2 million requests in less than 3 hours of instance time. If you have a high number of reads or writes, which you likely do if you’re considering reading this, you’ll need to factor these costs in.
The total AWS cost for running these tests was $0.71 of which $0.19 were storage related. The balance was the machine instances and bandwidth.