EC2 Instances Die and Other Lessons From The Cloud
March 29th, 2009 by Erik
Today I learned a few painful lessons. First, EC2 instances die, and a simple reboot will not recover them. Second, unlike many web hosts, Amazon doesn’t offer any level of monitoring. Third, backups are only useful if they’re current.
Lesson 1:
When I originally built my EC2 instance to host this site (and a few other applications), I was still learning about Amazon’s EC2, so I spent a good amount of time documenting my server setup as thoroughly as I could. The result was an install script, “install_server”, that effectively repeats every step I took when I first turned on the server. The script goes something like this:
    #!/bin/sh
    yum -y upgrade
    yum -y install sendmail
    yum -y install httpd
    yum -y install php
    yum -y install php-mysql
    yum -y install php-pecl-memcache
    yum -y install memcached
    yum -y install subversion
    pear install HTTP_Request
    cp configs/freetds.conf /etc/freetds.conf
    cp configs/httpd.conf /etc/httpd/conf/httpd.conf
    cp configs/memcached /etc/sysconfig/memcached
    cp configs/php.ini /etc/php.ini
    cp configs/fstab /etc/fstab
    tar -xzf webroot.01.tar.gz
    /etc/init.d/memcached restart
    /etc/init.d/httpd restart
    /etc/init.d/sendmail restart
    echo configs/crontab.txt
    echo NOW SETUP CRONTAB!!!
This copies into place all of the settings I need for a server. This is nice, because I can bring a new server online within 10 minutes or so of the old one going down. I should probably just automate the creation of a server-specific AMI once or twice a day, but I’m just not there yet. Also, I know there are some weak points in the startup script. I’ll be working those out soon, now that I see it’s really a useful tool.
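A minimal sketch of one way to tighten up a script like this, assuming one of its weak points is that it keeps going after a failed step (a bad yum mirror or a missing config file would leave a half-built server that looks finished). The `run_step` helper and the `install_server` function wrapper are illustrative names, not part of the original script, and only a few of the steps are shown:

```shell
#!/bin/sh
# Sketch only: run_step is a hypothetical helper, not part of the original script.

# Run one command; on failure, report which step broke and signal the caller.
run_step() {
    echo "==> $*"
    if ! "$@"; then
        echo "FAILED: $*" >&2
        return 1
    fi
}

# Install packages and copy configs, stopping at the first failing step.
install_server() {
    run_step yum -y upgrade || return 1
    run_step yum -y install httpd php php-mysql memcached || return 1
    run_step cp configs/httpd.conf /etc/httpd/conf/httpd.conf || return 1
    run_step /etc/init.d/httpd restart || return 1
}
```

Ending the script with `install_server || exit 1` would run the steps in order and exit non-zero as soon as one fails, so a broken build is obvious instead of silent.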
Lesson 2:
Monitoring of EC2 instances needs to be done by an external service. Fortunately, not too many people care what’s going on with this site between Saturday night and Sunday morning. I’m looking at options in two realms: first, a basic alert that there’s a problem, and second, a more proactive approach that can do some instance killing and restarting on its own.
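For the basic-alert half, a rough sketch of what an external check could look like, assuming a second machine outside EC2 with `curl` and cron available. The function names are made up for illustration, and this is not a substitute for any of the hosted services:

```shell
#!/bin/sh
# Sketch only: function names are placeholders chosen for this example.

# Print the HTTP status code for a URL, or 000 if the request fails or times out.
http_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1" || true
}

# Treat 2xx and 3xx responses as healthy, everything else as down.
is_healthy() {
    case "$1" in
        2??|3??) return 0 ;;
        *) return 1 ;;
    esac
}

# Check one URL and print an alert line when it looks down.
check_site() {
    status=$(http_status "$1")
    if is_healthy "$status"; then
        echo "OK $1 ($status)"
    else
        echo "ALERT $1 is down (status $status)"
        return 1
    fi
}
```

Run every few minutes from cron on the outside machine, `check_site` followed by the site URL returns non-zero when the site stops answering, and the ALERT line can then be handed to whatever sends the message. The important part is that the check runs somewhere other than the instance it is watching.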
Right now I’m looking at these services for quick and dirty SMS alerts about the status of my instances:
And I’m looking at these for a more holistic approach to monitoring, but I’m gun-shy about relying on them to manage the instances until I learn more about them:
Any experience anyone has had with any of these products is always appreciated.
Lesson 3:
Last but not least are backups. I have another script, aptly named “backup_server”, that takes a snapshot of the settings and configurations every 24 hours during off-peak times and stores the data on an Elastic Block Store (EBS) volume that I have mounted on the application server. That goes something like this:
    crontab -l > configs/crontab.txt
    cp /etc/php.ini configs/php.ini
    cp /etc/httpd/conf/httpd.conf configs/httpd.conf
    cp /etc/sysconfig/memcached configs/memcached
    cp /etc/fstab configs/fstab
    tar -czvf backups/configs.01.tar.gz configs
    tar -czvf backups/webroot.01.tar.gz /mnt
Where I got burned is that my cron job only backed up my data every 24 hours, at 4am. However, the application server crashed and burned at 2am EDT. Clearly I need to consider something like rsync to prevent this type of data loss. Rsync can grab the incremental changes hourly, reducing the work lost between the full backups every 24 hours. As a stopgap, I’ve increased the frequency of the backups until I can get back to the system and set up rsync.
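A minimal sketch of what the hourly rsync step might look like, assuming rsync is installed and the EBS volume is mounted somewhere writable. The `sync_dir` name and any paths passed to it are placeholders, not the actual setup:

```shell
#!/bin/sh
# Sketch only: the function name and the paths passed to it are placeholders.

# Mirror a source directory into a destination, copying only what changed.
sync_dir() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # -a preserves permissions, times and symlinks.
    # --delete removes files from the destination that are gone from the source.
    # The trailing slash on the source copies its contents, not the directory itself.
    rsync -a --delete "$src/" "$dest/"
}
```

A crontab entry starting with `0 * * * *` would run a script that calls `sync_dir` at the top of every hour, for example with `/mnt` as the source and a directory on the EBS volume as the destination. Because `--delete` mirrors deletions too, this complements the full tar backups rather than replacing them: a file removed by mistake disappears from the mirror at the next sync.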
Tags: amazon, cloud computing, ebs, ec2, gotcha, monitoring
March 30th, 2009 at 1:20 pm
Hi Erik, do you find memcached’s lack of HA/failover to be a problem?
March 30th, 2009 at 2:12 pm
@bret No, but I’ve designed my apps to run either with or without memcached, so it’s simply offloading work from MySQL, meaning that the lack of HA in memcached doesn’t really affect me.
April 4th, 2009 at 4:49 am
I would recommend installing applications and logs on EBS and using links to link them into the OS. Then you don’t need to use EBS as a backup. But you still need to create snapshots of the EBS volume to back that up, or back up a database to S3 using a database utility.
Neill Turner
http://ec2dream.blogspot.com
May 12th, 2009 at 7:40 am
I would suggest using http://www.monitis.com for monitoring your instance, because it provides not only external monitoring but also resource utilization of your instances. Give it a try and you will like it.
July 7th, 2009 at 5:53 pm
Hi Erik,
I just got started on cloud computing stuff. I was looking at AWS and, to be honest, was a bit afraid of the complexity. Scalr seems to be a good solution and doesn’t seem to cost an arm and a leg like RightScale.
Have you used Scalr yet? Do you recommend it?
Thanks,
Fuji
July 9th, 2009 at 12:51 pm
@Fuji
I haven’t used Scalr or RightScale myself. Now that Amazon has added auto scaling and elastic load balancing, I’ve opted to roll my own solution with those technologies.
Erik