Earlier this year, Amazon launched a suite of new services that replace the need for products like Scalr and RightScale when building scalable applications on the EC2 platform. Those tools help you allocate more resources according to current application load. The key benefit of using a cloud-based service is that you only pay for what you use. However, without one of the aforementioned providers, and their additional costs, you were left in the lurch designing a system that could detect the current load on your infrastructure and respond accordingly. Amazon has now made it very simple to create infrastructure that can expand AND contract, while, of course, only paying for what you use.
Elastic Load Balancer
Solutions for load balancing ranged from round-robin DNS to running a load balancer on an instance (I'd been using Nginx on an m1.small instance at $0.10/hr). Nginx worked well, with an assigned elastic IP (static IP) that could move from machine to machine as needed, and special scripts to manage the pool of servers (or you could do it manually). It worked, but it was by no means efficient or even easy to maintain. Furthermore, the Nginx host was a single point of failure. Being proactive, it was possible to monitor Nginx and, should it fail, bring up and configure a new server before redirecting the elastic IP to the new host. It was a hack and certainly not elegant!
Enter Elastic Load Balancing. You create an elastic load balancer and then add your instances to it. That's it! Amazon handles the redundancy, and the best part is that it's only $0.025/hr, a savings of $54/month over running a load balancer on an m1.small instance. There is, of course, a drawback. With Nginx and other load balancers, you have the option of intelligent load balancing; advanced functionality like sticky sessions and response rewriting isn't available with the Amazon solution. With a well-designed application, however, this should be irrelevant.
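Adding instances really is a single command. A quick sketch, assuming a load balancer named ApplicationServer and a hypothetical instance ID:

```shell
# Register a running instance with the load balancer
# (the instance ID here is a placeholder; use your own)
elb-register-instances-with-lb ApplicationServer --instances i-12345678
```

Auto Scaling can also do this registration for you automatically, as we'll see below.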
CloudWatch
Monitoring the cloud is VERY important. Amazon has issues with all sorts of things, from EBS stores going offline to instances being completely unavailable. Before CloudWatch, I used a mix of systems, including SNMP monitoring and the 3rd party service Pingdom, to keep tabs on my instances. CloudWatch doesn't replace these, but rather supplements the data I gather from them. CloudWatch costs an additional $0.015/hr per instance above the default instance cost; it takes about 2 minutes to come online, and the statistics are available through the API almost immediately after that. CloudWatch provides access to a monitored instance's CPU utilization, disk read bytes, disk read operations, disk write bytes, disk write operations, network in, and network out. For my needs, I find CPU utilization is an excellent indicator of server performance, and I use it to determine when to add a new server or take one away.
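Pulling those statistics from the command line looks something like this (the instance ID and time window are placeholders; adjust to taste):

```shell
# Average CPU utilization for one instance, sampled every minute
# over a one-hour window
mon-get-stats CPUUtilization \
  --namespace "AWS/EC2" \
  --statistics "Average" \
  --dimensions "InstanceId=i-12345678" \
  --period 60 \
  --start-time 2009-07-01T12:00:00 \
  --end-time 2009-07-01T13:00:00
```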
You can gather these statistics grouped by AMI, instance ID, instance type, and even Auto Scaling group. If you can reliably detect the need for an additional server from these statistics, you'll be able to take advantage of Auto Scaling; more on that in a minute. If not, it's very simple to write a script that determines when it's time to start a new server to help with processing and register it with the load balancer. Oh, and did I mention that for the load balancer you also get access to healthy host count, latency, request count, and unhealthy host count? These metrics can be helpful for rolling your own scaling scripts, or may be sufficient for knowing when you need an additional server.
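If you do roll your own, the decision logic is the easy part. A minimal sketch (the 70% threshold and the mon-get-stats parsing are assumptions; you'd wire the scale-up branch to ec2-run-instances and elb-register-instances-with-lb yourself):

```shell
#!/bin/sh
# Decide whether to scale based on a CPU utilization sample.
# The 70% threshold mirrors the trigger configured later in this post.
THRESHOLD=70

above_threshold() {
  # $1 = average CPU percentage; exits 0 (true) when above the threshold
  awk -v cpu="$1" -v t="$THRESHOLD" 'BEGIN { exit !(cpu > t) }'
}

# In a real script, CPU would come from mon-get-stats output, e.g.:
# CPU=$(mon-get-stats CPUUtilization --namespace "AWS/EC2" ... | awk '{print $3}' | tail -1)
CPU=85

if above_threshold "$CPU"; then
  echo "scale up"    # here: ec2-run-instances + elb-register-instances-with-lb
else
  echo "load is fine"
fi
```

Run from cron every few minutes, with some bookkeeping so you don't launch a new instance on every breach, this gets you most of what Auto Scaling's triggers provide.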
Auto Scaling
This is the glue that brings it all together. Auto Scaling watches your statistics from CloudWatch, starts new instances when they're needed, and turns them back off when they're not. Currently this is all offered for FREE if you are using CloudWatch! The setup is simple once you've been through it the first time, but it took me a couple of tries to get it right. In my case, I monitor my application server pool, and when I see that it's stressed, I add another server. The way it's configured also gives me some safeguards that keep me from starting thousands of instances.
How To Do It
This assumes you’ve installed all the Amazon CLI tools for Elastic Load Balancing, Auto Scaling, and CloudWatch, you’re fairly comfortable at the command line, and you know how to make your own AMI. Now you’ll need to determine the best way to publish your code to a new server. Possible solutions include rsync, Subversion, an NFS mount, S3, or a mix of technologies; some folks just bundle the code into their AMI (which works well if your codebase is static). Regardless, that’s a bit beyond the scope of this post. Once you have a solution, create a server image (AMI) that can boot up and correctly get a copy of the code you’re running. If you already have one, you can of course just use that. Once you have an instance that can be turned on and handle traffic…
- Create the Load Balancer
- Create the Auto Scale Launch Config
- Create the Auto Scale Group
- Create the Auto Scale Trigger(s)
Create the Load Balancer
The DNS-NAME that is returned is the point you’ll direct all traffic to. Add this as a CNAME in your DNS for your domain.
elb-create-lb ApplicationServer --availability-zones us-east-1a --listener "protocol=http,lb-port=80,instance-port=80"
DNS-NAME  ApplicationServer-12345678.us-east-1.elb.amazonaws.com
Create the Auto Scale Launch Config
The AMI will of course be your own, one that knows how to come online and get a fresh copy of your code, and you may be using a different instance type. Definitely look over the documentation to make sure you're doing it all correctly.
as-create-launch-config AppServerConfig --image-id ami-12345678 --instance-type m1.small --group default
Create the Auto Scale Group
I use a nice long cooldown period here (10 minutes) so that servers don't come online or go offline too quickly. If you expect the occasional slashdotting, you might want this to be shorter. This also provides a boundary: there will always be at least 1 server and never more than 3. Finally, it tells Auto Scaling that you want each new instance to join the load balancer.
as-create-auto-scaling-group AppServerGroup --launch-configuration AppServerConfig --availability-zones us-east-1a --min-size 1 --max-size 3 --cooldown 600 --load-balancers ApplicationServer
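Before moving on to triggers, it's worth sanity-checking that the group came out the way you intended. Something like:

```shell
# Confirm the group exists with the expected min/max sizes,
# cooldown, and load balancer attachment
as-describe-auto-scaling-groups AppServerGroup
```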
Create the Auto Scale Trigger(s)
You will likely spend a good bit of time working on this portion. What it does: if the average CPU utilization across my servers stays above 70% for 10 minutes, bring a new server online; likewise, if it falls below 30% for 10 minutes, turn one off. The Auto Scaling group we created ensures there is always at least 1 server online.
as-create-or-update-trigger AppServerTrigger --auto-scaling-group AppServerGroup --namespace "AWS/EC2" --measure CPUUtilization --statistic Average --dimensions "AutoScalingGroupName=AppServerGroup" --period 60 --lower-threshold 30 --upper-threshold 70 --lower-breach-increment=-1 --upper-breach-increment 1 --breach-duration 600
That is all there is to it! You now have a system that can grow your pool of application servers on demand. I hope this helps you build out an infrastructure that lets you scale up your next web application. You might want to look over the command line tool documentation before getting started.
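Once it's live, a couple of commands are handy for keeping an eye on what Auto Scaling is actually doing (names match the examples above):

```shell
# See recent scaling activity (instance launches and terminations)
as-describe-scaling-activities AppServerGroup

# Check which instances the load balancer currently considers healthy
elb-describe-instance-health ApplicationServer
```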