Posts Tagged ‘ebs’

Honesty Box: EBS Performance Revisited

Tuesday, March 2nd, 2010

As part of my work on Honesty Box, I’ve been reviewing EBS disk performance once again. This was a great opportunity to expand on the research from last year. After re-reading what I posted then, along with the wealth of data that has been compiled since, I realized I still didn’t have sufficient information to answer two key questions.

  1. How does the number of EBS volumes impact the performance of RAID 0?
  2. Does the instance size make a significant difference in RAID performance?

As before, I used Bonnie++ to take the measurements. You can read about the full method I used below.

Results

  • RAID 0 performed better with an even number of EBS volumes.
  • RAID 0 performed best with 8 volumes for writes and random seek.
  • RAID 0 performed poorly for reads!
  • Larger instances performed significantly better than smaller instances.
  • The ephemeral store has very good overall performance.

Data

The titles of the Bonnie tests can be confusing for folks removed from the programming process. Be sure to read the full explanation of what each test is doing.

Sequential Output is a measure of the write performance to the drive. Higher bars are better. With RAID 0 it appears that an even number of drives performs significantly better than an odd number.

Sequential Create is a measure of the files created by Bonnie. Higher values are better. Tests that complete too quickly return no values, which is why the Read/sec bars are missing above. You can safely consider those values too fast to measure.

Sequential Input is a measure of the read performance from the drive. Higher values are better. The result is concerning: block read performance declined steadily as the number of volumes increased. This may have to do with the time of day that these tests were run and really warrants more investigation. Note also that this is a measure of sequential performance, so unless you’re reading contiguous files off the disk, this number may be irrelevant to you.

Random Create measures how quickly files are created and deleted. Higher values are better. Again, tests that finish too quickly are discarded, which explains why the Read/sec result has no values.

Random Seeks should scale consistently with the number of EBS volumes added. Higher values are better. However, that did not appear to be the case: a limit appeared to be reached at 8 volumes.

Effect of CPU

To test the impact of the CPU units, I selected the 4 volume array and then compared it with the tests run last year. Both were using 4 volume EBS RAID 0 with XFS file systems. They both used the noop IO scheduler. The underlying OS did change from Fedora to Ubuntu and a year has passed.

Sequential Output: Taller is better. Clearly the additional I/O capacity in the larger instance makes a big difference in the performance of the volumes. I would expect smaller increments in CPU capacity to result in smaller differences.

Sequential Input: Taller is better. Clearly the m1.large instance outperforms the m1.small instance.

Thoughts and Next Steps

After reviewing the performance of the native ephemeral storage, I wonder whether partitioning the ephemeral store and assembling a RAID array from the partitions might be the best route for high-speed storage. Of course, backup would be a potential issue, but snapshotting of XFS may be able to mitigate that. For future tests, I would like to study the impact of the -b flag, which causes Bonnie++ to flush to disk. I also think that larger volume sets, as these tests suggest, and different I/O schedulers may yield different results.

Method

As before, I used Bonnie++ to measure disk performance. Its limitations are fairly well understood, and it gives us a metric that can be compared with other metrics. You can read the full explanation of what each value actually means here. Armed with 16 EBS volumes attached to an unused m1.large instance, I began running tests. The process was as follows:

  1. Create a new RAID set using a chunk size of 256
  2. Use XFS to format the drives
  3. Mount the filesystem w/ Ubuntu defaults
  4. Capture Bonnie results
  5. Disassemble the RAID set
  6. Rinse and repeat

I did this for 2-10 volumes and then one additional test with 16 volumes. For comparison, I also ran the test with the ephemeral store and a single EBS volume. Those are the results represented in each of the graphs above. I reran the 6 volume test 3 times over the course of a day and took an average value for the graphs.
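
The six steps above can be sketched as a small shell function. This is a dry run that only prints the commands for one pass. The device names, the /mnt/raid mount point, the bonnie++ options and the reading of “256” as mdadm’s chunk size in KiB are my assumptions for illustration, not the exact commands from the test runs.

```shell
#!/bin/sh
# Dry-run sketch of one benchmark pass: prints the commands instead of
# running them, so nothing gets formatted by accident.
# Assumptions (not from the original test runs): the array is /dev/md0,
# the mount point /mnt/raid already exists, and bonnie++ runs as root.

raid_pass() {
    n=$#    # number of EBS devices passed as arguments
    # 1. Create a RAID 0 set with a 256 KiB chunk size
    echo "mdadm --create /dev/md0 --level=0 --chunk=256 --raid-devices=$n $*"
    # 2. Format with XFS (-f overwrites a filesystem left by the previous pass)
    echo "mkfs.xfs -f /dev/md0"
    # 3. Mount with default options
    echo "mount /dev/md0 /mnt/raid"
    # 4. Capture Bonnie++ results
    echo "bonnie++ -d /mnt/raid -u root"
    # 5. Disassemble the RAID set
    echo "umount /mnt/raid"
    echo "mdadm --stop /dev/md0"
}
```

Calling `raid_pass /dev/sdf /dev/sdg` prints the commands for a 2-volume pass. Piping that output to sh as root would actually run them, which destroys any data on the listed devices.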

EC2 Instances Die and Other Lessons From The Cloud

Sunday, March 29th, 2009

Today I learned a few painful lessons. First, EC2 instances die, and a simple reboot will not recover them. Second, unlike many web hosts, Amazon doesn’t offer any level of monitoring. Third, backups are only useful if they’re current. :(

Lesson 1:

When I originally built my EC2 instance to host this site (and a few other applications), I was learning about Amazon’s EC2, so I spent a good amount of time trying to be as thorough as possible in documenting my server setup. The result was an install script, “install_server”, that effectively repeats all the steps I took when I first turned on the server. The script goes something like this:

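#!/bin/sh
# Bring the base system up to date, then install the mail, web and cache stack.
yum -y upgrade
yum -y install sendmail
yum -y install httpd
yum -y install php
yum -y install php-mysql
yum -y install php-pecl-memcache
yum -y install memcached
yum -y install subversion
pear install HTTP_Request
# Copy the saved configuration files over the package defaults.
cp configs/freetds.conf /etc/freetds.conf
cp configs/httpd.conf /etc/httpd/conf/httpd.conf
cp configs/memcached /etc/sysconfig/memcached
cp configs/php.ini /etc/php.ini
cp configs/fstab /etc/fstab
# Unpack the web root backup (extracts relative to the current directory).
tar -xzf webroot.01.tar.gz
# Restart the services so they pick up the restored configuration.
/etc/init.d/memcached restart
/etc/init.d/httpd restart
/etc/init.d/sendmail restart
# The crontab is not restored automatically; print a reminder instead.
echo configs/crontab.txt
echo NOW SETUP CRONTAB!!!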

This effectively copies into place all of the settings I need for a server. This is nice, because I am able to bring a new server online within 10 minutes or so of the old one going down. I should probably just automate the creation of a server-specific AMI once or twice a day, but I’m just not there yet. Also, I know there are some weak points in the startup script; I’ll be working those out soon now that I see it’s really a useful tool.

Lesson 2:

Monitoring of EC2 instances needs to be done by an external service. Fortunately, not too many people care what’s going on with this site between a Saturday night and a Sunday morning. I’m looking at options in two realms: first, a basic alert that there’s a problem, and second, a more proactive approach that can do some instance killing and restarting on its own.

Right now I’m looking at these services for quick and dirty SMS alerts about the status of my instances:

And I’m looking at these for a more holistic approach to monitoring, but I’m gun-shy about relying on them to manage the instances until I learn more about them:

Any experience anyone has had with any of these products is always appreciated.

Lesson 3:

Last but not least are backups. I have another script, aptly named “backup_server”, that makes a snapshot of the settings and configurations every 24 hours during off-peak times, storing the data on an Elastic Block Store device that I have mounted on the application server. It goes something like this:

crontab -l > configs/crontab.txt
cp /etc/php.ini configs/php.ini
cp /etc/httpd/conf/httpd.conf configs/httpd.conf
cp /etc/sysconfig/memcached configs/memcached
cp /etc/fstab configs/fstab
tar -czvf backups/configs.01.tar.gz configs
tar -czvf backups/webroot.01.tar.gz /mnt

Where I got burned is that my cron job only backed up my data every 24 hours, at 4am, and the application server crashed and burned at 2am EDT. Clearly I need to consider something like rsync to prevent this type of data loss. Rsync can grab the incremental changes hourly, reducing the work lost between the full backups every 24 hours. As a stopgap, I’ve increased the frequency of the backups until I can get back to the system and set up rsync.
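
As a rough sketch of the hourly rsync idea, the function below prints a crontab line that copies a directory into the backup volume at the top of every hour. The destination path /ebs/backups/webroot is hypothetical. /mnt is the web root that backup_server already archives.

```shell
#!/bin/sh
# Sketch only: print an hourly crontab entry that rsyncs SRC into DEST.
# The paths are supplied by the caller; nothing is installed or copied here.

hourly_rsync_cron() {
    src=$1     # directory to protect, e.g. /mnt
    dest=$2    # directory on the mounted EBS volume (hypothetical path)
    # "0 * * * *" fires at minute 0 of every hour.
    # rsync -a preserves permissions and timestamps and skips files
    # whose size and modification time have not changed.
    echo "0 * * * * rsync -a $src/ $dest/"
}
```

For example, `hourly_rsync_cron /mnt /ebs/backups/webroot` prints a line that can be pasted in via `crontab -e`. Because rsync only transfers files that changed since the previous run, an hourly schedule stays cheap next to a full tar every time.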

Amazon EC2 Disk Performance

Friday, February 27th, 2009

Update (3/3/2010): a better measure of RAID performance is available here.

While considering different options for a database server, I decided to do some digging into Amazon Web Services (AWS) as an alternative to dedicated servers from an ISP. I was most curious about the I/O of the Elastic Block Store (EBS) on the Elastic Compute Cloud (EC2). I tested a number of different filesystems (EXT3, JFS, XFS and ReiserFS) as single block devices and then some different software RAID configurations using JFS. The tests were run using Bonnie++.

The configuration was vanilla: no special tuning was done, just the default values that are assigned by the tools. I used Fedora Core 9 as the OS from the default Amazon AMI and used “yum install” to acquire the necessary utilities (more on that below). I expect that with further tuning, some increases in performance can still be obtained. I used the small instance for cost reasons, which includes “moderate” I/O performance. Running on a large or extra-large standard instance, with “high” I/O performance, should perform even better. You can get all the instance specifications from Amazon.

First I wanted to determine what the EBS devices would compare to in the physical world. I ran Bonnie against a few entry-level boxes provided by a number of ISPs and found the performance roughly matched a locally attached SATA or SCSI drive when formatted with EXT3. I also found that JFS, XFS and ReiserFS performed slightly better than EXT3 in most tests except block writes.

The Numbers

Again, let me reiterate that some numbers may not be accurately reflected in your production environment. Amazon states that small instances have “moderate” I/O availability. Presumably if you’re running this for a production DB, you’ll want to consider a large or extra-large instance for the memory, and so you should see slightly better performance from your configuration. Also note that the drives I allocated were rather small (to keep testing costs low), so you may experience different results with larger capacities.

Note: The graph below is in KB, not bytes as titled.

Bonnie Disk Performance on EC2

Size (Filesystem)  Output Per Char  Output Block  Output Re-write  Input Per Char  Input Block
4x5GB RAID5 (JFS)           22,349        58,672           39,149          25,332       84,863
4x5GB RAID0 (JFS)           24,271        99,152           43,053          26,086       96,320
10GB (XFS)                  20,944        43,897           24,386          25,029       65,710
10GB (ReiserFS)             22,864        57,248           17,880          21,716       44,554
10GB (JFS)                  23,905        47,868           21,725          24,585       55,688
10GB (EXT3)                 22,986        57,840           22,100          24,317       48,502

As expected, RAID 0 does best with read/write speed and RAID 5 does very well on reads (input block) as well. For InnoDB, the re-write and block read (input)/write (output) operations are the most critical values. Longer bars mean better values. To better understand what the test is doing, be sure to read the original Bonnie description of each field.

Making Devices

The process for making a device is simple. There are many tutorials on how to make this persistent and you can certainly build this into your own AMI when you’re done – this is not a tutorial on how to do that. To get a volume up and running you’ll follow these basic steps:

  1. Determine what you want to create – capacity, filesystem type etc.
  2. Allocate EBS storage
  3. Attach the EBS storage to your EC2 instance
  4. If using RAID, create the volume.
  5. Format the filesystem
  6. Create the mount point on the instance filesystem
  7. Mount the storage
  8. Add any necessary entries to mount storage at boot time.

Single Disk Images

Remember, the speed and efficiency of a single EBS device is roughly comparable to a modern SATA or SCSI drive. Using a filesystem other than EXT3 can improve different aspects of drive performance, just as it would with a physical hard drive. This isn’t a comparison of the pros and cons of different filesystems; I’m simply providing my findings from testing.

JFS: yum install jfsutils
XFS: yum install xfsprogs
ReiserFS: yum install reiserfs-utils

I didn’t test any other filesystems such as ZFS, because I’ve read that some of them are unstable on Linux, and since I’ll be running production on Linux, the extra time for those tests seemed unnecessary. I am interested in alternatives that could increase performance; if you have any to share, I’d love to hear about them.

You can quickly get a volume set up with the following:

mkfs -t ext3 /dev/sdf
mkdir /vol1
mount /dev/sdf /vol1

Next time you mount the volume, you won’t need to use “mkfs” because the drive is already formatted.
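
To have the volume mounted at boot (step 8 in the list above), an /etc/fstab entry is the usual approach. The helper below only prints the line. The “defaults” options field and the trailing “0 0” (no dump, no boot-time fsck) are my assumptions rather than settings from these tests.

```shell
#!/bin/sh
# Sketch only: print an /etc/fstab line for a device, mount point and
# filesystem type. Field order is: device, mount point, type, options,
# dump flag, fsck pass number.

fstab_entry() {
    device=$1        # e.g. /dev/sdf
    mount_point=$2   # e.g. /vol1
    fs_type=$3       # e.g. ext3
    echo "$device $mount_point $fs_type defaults 0 0"
}
```

Running `fstab_entry /dev/sdf /vol1 ext3` prints `/dev/sdf /vol1 ext3 defaults 0 0`, which can then be appended to /etc/fstab as root.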

RAID

The default AMI already includes support for RAID, but if you need to add it to your yum-enabled system, it’s “yum install mdadm”. On the Fedora Core 9 test rig I was using, RAID 0, 1, 5 and 6 were supported; YMMV.

To create a 4 disk RAID 0 volume, it’s simply:

mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
mkfs -t ext3 /dev/md0
mkdir /raid
mount /dev/md0 /raid

To create a 4 disk RAID 5 volume instead, it’s simply:

mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
mkfs -t ext3 /dev/md0
mkdir /raid
mount /dev/md0 /raid

This example assumes you have 4 EBS volumes attached to the system. AWS shows 7 possible mount points (/dev/sdf – /dev/sdl) in the web console; however, the documentation states you can use devices through /dev/sdp, which allows 11 EBS volumes in addition to the non-persistent storage. At the maximum volume size of 1TB, this would be a theoretical maximum of 10TB of RAID 5 or 11TB of RAID 0 storage!
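
The capacity figures follow from simple arithmetic: RAID 0 uses every volume for data, while RAID 5 gives up one volume’s worth of space to parity (and RAID 6 two). A minimal sketch, assuming equally sized volumes and the 1TB per-volume maximum; the function name is my own:

```shell
#!/bin/sh
# Sketch only: usable capacity of an array of equally sized volumes.
# Usage: raid_capacity_tb LEVEL VOLUMES SIZE_TB

raid_capacity_tb() {
    level=$1
    volumes=$2
    size_tb=$3
    case "$level" in
        0) echo $((volumes * size_tb)) ;;        # striping: all space usable
        5) echo $(((volumes - 1) * size_tb)) ;;  # one volume's worth of parity
        6) echo $(((volumes - 2) * size_tb)) ;;  # two volumes' worth of parity
        *) echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}
```

With 11 volumes of 1TB each, `raid_capacity_tb 0 11 1` prints 11 and `raid_capacity_tb 5 11 1` prints 10.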

Checking in on things…

  • cat /proc/mdstat
    is a great way to check in on the RAID volume. If you run it directly after creating a mirrored or parity array (RAID 1, 5 or 6), you’ll also be able to see the initial sync and how far along it is.
  • mount -l
    shows the currently mounted devices and any options specified.
  • df
    disk free provides a nice list of device mounts and their total, available and used space.

Conclusion

The numbers show that software RAID offers a clear performance advantage over a single EBS volume. Since with EBS you pay per GB, not per disk, it’s certainly cost effective to create a robust RAID volume. The question that remains is how careful you need to be with your data. RAID 0 offered blisteringly fast performance but, like a traditional array, no redundancy. You can always set it up as RAID 5, RAID 6 or RAID 10, but this of course gives up usable space to handle the redundancy.

Since the volumes on EBS are theoretically invincible, it may be okay to run unprotected by a mirror or parity drive; however, I haven’t found anyone who would recommend this in production. If anyone knows of a good reason to ignore the safety of RAID 10, RAID 6 or RAID 5, I’d love to hear the reasoning.

I am also curious whether these drives maintain a consistent throughput over the full capacity of the disk, or whether they slow down as the drive fills, like a traditional drive. I did not test this. It remains open for another test (and subsequent blog post). Should anyone run ZCAV against a 100GB+ drive and figure that out, please let me know.

Fine Print – The Costs

Storage starts at a reasonable $0.10/GB-month and is prorated for only the time you use it. A 1TB RAID 0 volume made of 10x100GB volumes would cost only $1,200 per year. Good luck getting performance per dollar like that for 1TB from any SAN solution at a typical ISP. There are, however, some hidden costs in the I/O that you’ll need to pay attention to. Each time you read or write a block to disk, there’s an incremental cost. The pricing is $0.10 per million I/O requests, which seems cheap, but just running the simple tests I ran with Bonnie++, I consumed almost 2 million requests in less than 3 hours of instance time. If you have a high number of reads or writes, which you likely do if you’re reading this, you’ll need to factor these costs in.
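
To estimate these costs for your own setup, the arithmetic can be done in integer cents. This is a sketch using the prices quoted above (10 cents per GB-month and 10 cents per million I/O requests); the function names are my own, and current AWS pricing may differ.

```shell
#!/bin/sh
# Sketch only: EBS cost estimates using integer arithmetic.

# Yearly storage cost in whole dollars.
# Usage: storage_cost_per_year TOTAL_GB CENTS_PER_GB_MONTH
storage_cost_per_year() {
    gb=$1
    cents_per_gb_month=$2
    echo $((gb * cents_per_gb_month * 12 / 100))
}

# I/O cost in cents, rounded down.
# Usage: io_cost_cents REQUESTS CENTS_PER_MILLION
io_cost_cents() {
    requests=$1
    cents_per_million=$2
    echo $((requests * cents_per_million / 1000000))
}
```

`storage_cost_per_year 1000 10` prints 1200, matching the yearly figure for ten 100GB volumes, and `io_cost_cents 2000000 10` prints 20, so 2 million requests cost about 20 cents.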

The total AWS cost for running these tests was $0.71, of which $0.19 was storage related. The balance was the machine instances and bandwidth.


Amazon Block Storage is huge

Thursday, August 21st, 2008

This morning my spam filter caught this message from Amazon announcing the new Amazon Block Storage service. I’m looking forward to seeing this work its way into implementations from hosting companies that are currently re-selling a service layer on top of the already interesting EC2.

One of the largest drawbacks with EC2 (and a major reason I wouldn’t touch it with a 10′ pole) has been the lack of persistent storage. If you’re just purchasing occasional horsepower, say for a large compute project, you would have to configure your instance, import your data, do your calculations and then bring all your data back down. How many times have you needed to wait until the middle of the night to run complex queries or analysis on your data and didn’t want to take down your database to do it? It’s been more than I can count for me. Now you can run the server only when you need it. I can see this as a serious boon for folks who want to do data analysis but don’t need the large EC2 container running over weekends and holidays and don’t want to pay for transfer costs to and from local sources (or deal with pushing data in and out of S3).

From the email:

Prior to Amazon EBS, block storage within an Amazon EC2 instance was tied to the instance itself so that when the instance was terminated, the data within the instance was lost. Now with Amazon EBS, users can chose to allocate storage volumes that persist reliably and independently from Amazon EC2 instances. Amazon EBS volumes can be created in any size between 1 GB and 1 TB, and multiple volumes can be attached to a single instance. Additionally, for even more durable backups and an easy way to create new volumes, Amazon EBS provides the ability to create point-in-time, consistent snapshots of volumes that are then stored to Amazon S3.

Amazon EBS is well suited for databases, as well as many other applications that require running a file system or access to raw block-level storage. As Amazon EC2 instances are started and stopped, the information saved in your database or application is preserved in much the same way it is with traditional physical servers. Amazon EBS can be accessed through the latest Amazon EC2 APIs, and is now available in public beta.

© 1998-2008 AF-Design, All rights reserved.