Exploring Amazon CloudFront
July 5th, 2009 by Erik
A few weeks ago I switched Sexii to use Amazon’s CloudFront content delivery network (CDN) instead of serving images through my own Apache server. The decision came after watching my EC2 instance get slammed through a couple of high-volume peaks. I crunched my logs and found that over 80% of my traffic was serving images! Adding a new instance to handle images exclusively would certainly solve the problem, but at ~$75/month for a new instance, I figured there was probably a better way. My ultimate goal here is to increase the work my infrastructure can do per instance hour, not necessarily to save costs, but I don’t want to spend more than I need to.
Setting Up
Setting up CloudFront is like setting up any AWS service: agree to the terms and costs and get going. There’s a full API for interacting with it, but rather than learn another new API, I opted for the point-and-click features of S3 Organizer. If you’re unfamiliar with the product, it gives you Finder- or Explorer-like functionality for managing the data you store on the Amazon S3 platform.
I started by creating a unique bucket in S3 and uploading my images. Now with my content happily hosted on S3, I actually had a CDN of sorts. I could have set the access control list (ACL) to allow anonymous reads for that bucket, updated my code accordingly, and stopped there. S3 would handle serving the content and my server would be free to do other tasks. Some popular websites are already doing this; for example, Twitter serves your profile picture from S3. This actually accomplishes my primary goal, reducing the stress on my server. But, being the curious geek that I am, I wanted to try out the full-blown CDN, and with the pay-per-use pricing model it was low risk.
With the bucket set up, S3 Fox was able to create the distribution with a couple of clicks (literally). What surprised me was how long it actually took (10 minutes) for the distribution to come online. Since my image URLs are assembled on the fly in code, I updated my configuration file to reflect the new source and I was live. I did run into one problem: the URLs for images had to be syntactically correct, and I was bitten by this. By default, Apache is more forgiving and just ignores a double forward slash; CloudFront does not.
Works: <img src="http://a5e3px8iw78h4.cloudfront.net/images/logo.png" />
Doesn't: <img src="http://a5e3px8iw78h4.cloudfront.net/images//logo.png" />
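Here is a minimal sketch of one way to guard against that double-slash problem when assembling image URLs in code. It is written in Python for illustration only and is not the code Sexii runs; the `build_image_url` helper and the `CDN_BASE` setting are hypothetical names, and only the distribution host comes from the example above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Distribution host taken from the example above; in a real application this
# would come from the configuration file (CDN_BASE is a hypothetical name).
CDN_BASE = "http://a5e3px8iw78h4.cloudfront.net"


def build_image_url(base, path):
    """Join a CDN base URL and an image path, collapsing repeated slashes.

    Only the path component is normalized, so the "//" that follows the
    scheme ("http://") is left untouched.
    """
    parts = urlsplit(base)
    joined = parts.path + "/" + path
    clean_path = re.sub(r"/{2,}", "/", joined)
    return urlunsplit((parts.scheme, parts.netloc, clean_path, "", ""))


if __name__ == "__main__":
    # Both the clean and the sloppy path produce the same working URL.
    print(build_image_url(CDN_BASE, "images/logo.png"))
    print(build_image_url(CDN_BASE, "/images//logo.png"))
```

Normalizing in one helper means a stray slash in a template or a config value can't silently break every image on the page.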
A nice feature that I didn’t use is the ability to map your own domain name to the delivery network, effectively masking a name like http://a5e3px8iw78h4.cloudfront.net with http://mycdn.example.com if you want. This wasn’t critical for me, so I skipped it; besides, Amazon’s DNS servers are likely much more robust than mine.
Costs
Before dissecting the costs, it’s important to understand the content I moved. Sexii doesn’t actually have many images. It relies heavily on CSS and the layout as the design. It was done this way to eliminate hosting costs early on. MySpace applications that have home and profile surfaces incur a very high traffic volume with a low return on the investment. The assets hosted by the application are a set of small icons used in a feature that facilitates flirting. The feature shows a grid of around 30 icons that the user can attach to a flirt. Each icon is a separate image file, and all of these icons are shown at the same time, resulting in about 30 requests for <1k images. See below for the consequences of this design mistake.
Costs are cumulative across the AWS platform and so are a bit difficult to break down on a per application basis, but I'm going to try to give you an overview of what the change to CloudFront cost me and where I can potentially save more money. I've skipped over the costs of uploading and hosting on S3 as they're insignificant for my data and usage pattern, less than $0.08/month.

During the roughly 4 1/2 day window analyzed, the cost for CloudFront was $6.44, roughly $1.46/day. I served over 5.4 million images, or 14.3 images per second, to the 4 regions. If I had built a dedicated server and configured my site to use that for all images, it would have cost me around $11.44 including bandwidth for the same time period. The largest portion of the cost for me was the overall number of requests. By rewriting the icon code to use a single combined image with CSS positioning, I expect to reduce the request count by more than 90% (-$4.95 or -$1.12/day). The bandwidth number should stay about the same because the one larger icon file transfers about the same amount of information as the 30 independent requests.
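As a rough sanity check, the figures above can be plugged into a few lines of Python. This is only a sketch that uses the numbers quoted in this post. The variable names are mine, and the per-grid reduction assumes the 30 icon requests collapse into exactly one request, which ignores any other images the application serves.

```python
SECONDS_PER_DAY = 86_400

# Figures quoted in the post.
total_cost = 6.44                 # CloudFront cost for the window, in dollars
cost_per_day = 1.46               # quoted daily rate, in dollars
images_served = 5_400_000         # "over 5.4 million"
images_per_second = 14.3
dedicated_server_cost = 11.44     # estimate for the same window, in dollars
expected_request_savings = 4.95   # expected saving from fewer requests
icons_per_grid = 30               # separate icon files shown at once

# Window length implied by the cost figures and by the request rate.
window_from_cost = total_cost / cost_per_day
window_from_rate = images_served / images_per_second / SECONDS_PER_DAY

# Assumption: one combined image replaces all 30 icon requests in the grid.
sprite_reduction = 1 - 1 / icons_per_grid

# What the same window might cost after the rewrite, and the gap to a server.
projected_cost = total_cost - expected_request_savings
saving_vs_dedicated = dedicated_server_cost - total_cost

if __name__ == "__main__":
    print(f"window implied by cost: {window_from_cost:.2f} days")
    print(f"window implied by rate: {window_from_rate:.2f} days")
    print(f"request reduction per grid view: {sprite_reduction:.1%}")
    print(f"projected CloudFront cost: ${projected_cost:.2f}")
    print(f"saving vs dedicated server: ${saving_vs_dedicated:.2f}")
```

Both ways of deriving the window land a little under 4 1/2 days, so the quoted daily rate and images-per-second figures are consistent with each other.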

For those of you who are curious, the Japanese traffic was the smallest portion at 0.11% and Hong Kong was 0.16%.
Where Next?
Move more to CloudFront!
The first step, after reducing the number of requests, is to move externally loaded CSS and JavaScript files. These are no-brainers (and are even suggested on the CloudFront site). Taking it a bit further, I’m considering moving static advertising iframe files, common in social network based applications. I can continue to maintain my generic demographic targeting by creating a series of files that include the appropriate demographic distinctions: skyscraper_m_18.html, skyscraper_f_18.html, skyscraper_m_25.html, skyscraper_f_25.html and so on. I’ll be looking to see if this can be better handled through S3 alone to keep the costs of these low-value CPM/CPC ads lower. MySpace hosts the actual application code on their servers, but sites like Hi5 and Orkut do not. I would strongly suggest moving your application.xml files to a CDN and only running the AJAX response files on your application servers.
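To make the file naming scheme concrete, here is a small Python sketch that generates the names and picks one for a given viewer. It is only an illustration: the helper names are hypothetical, only the 18 and 25 brackets appear in this post, and falling back to the lowest bracket for younger viewers is my assumption.

```python
from itertools import product

GENDERS = ("m", "f")
AGE_BRACKETS = (18, 25)  # only these two brackets are named in the post


def ad_filenames(unit="skyscraper", genders=GENDERS, ages=AGE_BRACKETS):
    """List every static ad file to upload, one per gender and age bracket."""
    return [f"{unit}_{g}_{a}.html" for a, g in product(ages, genders)]


def ad_file_for(gender, age, unit="skyscraper", ages=AGE_BRACKETS):
    """Pick the file for a viewer: the highest bracket not above their age.

    Assumption: viewers younger than the lowest bracket get the lowest one.
    """
    eligible = [a for a in ages if a <= age]
    bracket = max(eligible) if eligible else min(ages)
    return f"{unit}_{gender}_{bracket}.html"


if __name__ == "__main__":
    for name in ad_filenames():
        print(name)
    print(ad_file_for("f", 30))
```

Because every variant is a plain static file, the application only has to compute a file name; it never has to render the ad itself.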
Where would it all end?
It’s conceivable you could host an entire brochureware site on CloudFront but that may be taking it too far.
Conclusion
I’m pleased with CloudFront. I’ve definitely increased the capacity of the application server which is especially useful during peak times. It also exposed a design weakness, a sad but important lesson to learn. This is a GREAT way for companies to get the speed benefits of a CDN and increase the processing capacity of an application server.
Tags: amazon, aws, cdn, cloudfront
July 6th, 2009 at 5:17 pm
I always enjoy learning how other people employ Amazon S3 and CloudFront. You can also check out CloudBerry Explorer that helps to manage S3 and CloudFront. It is a freeware. http://cloudberrylab.com/