Posts Tagged ‘cloud’

Testing Your AWS Elastic Load Balancer

Tuesday, July 27th, 2010

Vijay Ramachandran asked me, via Twitter, how to test whether an Amazon Elastic Load Balancer is really doing its job. Because 140 characters isn’t sufficient space to handle this answer, I’ve created this post. Feel free to use any of this in any of your environments.

First, I’ll assume you’ve covered some of the basics with ELB.

The default configuration you’ll end up with following my guides above is a stateless system that distributes the requests more or less evenly across all configured servers. However, when you do it the first time, it’s nice to see that it’s actually doing what you think it should be. The steps are simple:

  1. Verify each instance is working as expected
  2. Verify the load balancer is distributing the requests across multiple instances
  3. Verify the instances are working behind the load balancer

1. Verify each instance is working

This is far and away the easiest step. You can simply access each machine by the Amazon-assigned address for that specific instance and ensure that it’s doing what you expect. The only potential issue here is that you might jump from one machine to a different machine if you are not watching your URL. For example, if you are on ec2-123-123-123-123.compute-1.amazonaws.com, access your application at that address and ensure it works as expected. If it jumps to a domain name because you’ve hard-coded a link somewhere, you may not be testing the new server at all.
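One way to catch that kind of jump automatically is to let curl follow redirects and compare the host it ends on with the host it started from. This is only a rough sketch: the function names url_host and check_instance are my own, it assumes curl is on your PATH, and it only catches HTTP redirects, not hard-coded links inside a page.

```shell
#!/bin/sh
# url_host and check_instance are hypothetical helper names, not part of any tool.

# Print the host portion of a URL (scheme, path and port removed).
url_host() {
    rest=${1#*://}      # drop "http://" or "https://"
    rest=${rest%%/*}    # drop everything from the first "/"
    rest=${rest%%:*}    # drop an optional ":port"
    printf '%s\n' "$rest"
}

# Succeed only if following redirects from $1 ends on the same host.
check_instance() {
    final_url=$(curl -s -o /dev/null -L -w '%{url_effective}' "$1") || return 2
    [ "$(url_host "$1")" = "$(url_host "$final_url")" ]
}

# Example (replace with your own instance address):
# check_instance http://ec2-123-123-123-123.compute-1.amazonaws.com/ && echo "stayed on instance"
```

You would still want to click through the application by hand, but this gives a quick first signal for each instance.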

2. Verify the load balancer is distributing the requests across multiple instances

To test that requests are being distributed across multiple machines, I use a test file. I generate my test file automatically by running the following script as part of the boot-up routine. This simply saves the instance-id from the metadata into a text file. If you are uncomfortable placing this information in the web root, you can optionally place it behind basic authentication, put it into a script that hashes it (md5 or sha1) or some other application based logic to access it.

/usr/local/bin/curl http://169.254.169.254/latest/meta-data/instance-id > /var/www/html/instance-id.txt

Check the path for curl and the web root on your local system and adjust accordingly. This should work on Red Hat-flavored distributions.
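If you would rather not expose the raw instance ID, the hashing option mentioned above might look something like this. It is a sketch under a few assumptions: hash_instance_id is a name I made up, and sha1sum is the GNU coreutils tool (other systems may ship shasum or openssl instead).

```shell
#!/bin/sh
# hash_instance_id is a hypothetical helper; sha1sum is assumed to be installed.

# Print the SHA-1 hex digest of the string passed as $1.
hash_instance_id() {
    printf '%s' "$1" | sha1sum | cut -d ' ' -f 1
}

# At boot (same curl path and web root as the command above; adjust as needed):
# id=$(/usr/local/bin/curl -s http://169.254.169.254/latest/meta-data/instance-id)
# hash_instance_id "$id" > /var/www/html/instance-id.txt
```

Each instance still produces a distinct value, which is all the load balancer test needs.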

Once you’ve run this on each of your instances, you can tell that requests are being distributed to both machines by simply requesting the test file through your load balancer address several times and verifying that the instance ID it returns changes. (Obviously replace the following request with the correct address for your load balancer.)

http://applicationservers-123456789.us-east-1.elb.amazonaws.com/instance-id.txt
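Refreshing the browser works, but a small loop makes the distribution easier to see. The following is a minimal sketch, assuming curl is on your PATH; sample_elb and tally_instances are names I invented for illustration, and the default of 20 requests is arbitrary.

```shell
#!/bin/sh
# sample_elb and tally_instances are hypothetical helper names.

# Request the test file repeatedly through the load balancer.
# $1 = URL of instance-id.txt, $2 = number of requests (default 20).
sample_elb() {
    url=$1
    count=${2:-20}
    i=0
    while [ "$i" -lt "$count" ]; do
        curl -s "$url"
        echo    # make sure each response ends with a newline
        i=$((i + 1))
    done
}

# Read one instance ID per line on stdin, skip blank lines, and print
# "<count> <instance-id>" for each distinct ID.
tally_instances() {
    awk 'NF { seen[$1]++ } END { for (id in seen) print seen[id], id }' | sort -k 2
}

# Example:
# sample_elb http://applicationservers-123456789.us-east-1.elb.amazonaws.com/instance-id.txt 20 | tally_instances
```

If only one instance ID ever shows up, the load balancer is not spreading requests the way you expect.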

3. Verify the instances are working behind the load balancer

Now for the final test. Confident that your requests are being distributed across both machines, test that your application works as expected: first under the Amazon-assigned name (applicationservers-123456789.us-east-1.elb.amazonaws.com in this example), then under your CNAME’d alias.

If everything still works, you can assume all is good.

4. Bonus Check

If you really, really, really want to know… you can also verify using your access logs. Check in /var/log/httpd/access_log or wherever your web server logs are kept to see that requests are being distributed to each machine.
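For the log check, counting hits on the test file on each instance is usually enough. This sketch assumes Apache’s common or combined log format, where the request path is the seventh whitespace-separated field; count_requests is a name I made up.

```shell
#!/bin/sh
# count_requests is a hypothetical helper name.

# Count log lines on stdin whose request path equals $1.
# Assumes Apache common/combined log format (path is field 7).
count_requests() {
    awk -v path="$1" '$7 == path { n++ } END { print n + 0 }'
}

# Example, run on each instance and compare the numbers:
# count_requests /instance-id.txt < /var/log/httpd/access_log
```

Roughly similar counts on each machine suggest the requests are being spread across them.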

DNS Tips:

1. Never use the real IP returned from dig or nslookup as an A record in DNS unless you automate checking it (and even then I wouldn’t), because the actual IP changes from time to time. Only use CNAME entries.

2. If you are using GoDaddy’s DNS tool, you can’t CNAME the root of a domain (i.e. example.com). For this case I use one instance as a permanent instance with an Elastic IP and point the root A record for my domains to it. I then assign www as a CNAME for the load balancer’s AWS-assigned domain. Last but not least, I use .htaccess and mod_rewrite to ensure requests are sent to www.example.com. This ensures traffic is being sent to the load balancer address.
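The redirect in that last step can be sketched as follows. This is a minimal fragment, assuming Apache with mod_rewrite enabled and .htaccess overrides allowed; example.com stands in for your own domain, and the exact rules on my servers may differ.

```apache
# Send any request for the bare domain to the www host (the ELB CNAME).
# example.com is a placeholder domain.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

This goes in the web root of the permanent instance, so anything arriving on the bare domain gets bounced to the load-balanced hostname.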

I Love the Cloud/I Hate the Cloud

Friday, March 5th, 2010

Developers have been consuming “cloud” services since long before it was a buzzword. For me the first real transition to a cloud mentality was with web services. WSDLs provided a uniform way to consume a remote resource that was tuned to provide specific information. There were of course limitations with data typing, etc., but most of those could be worked around. I didn’t concern myself with how the services I called generated or manipulated the information, only that they responded quickly and were correct. Jump forward a few years and now we can get more than data: we can get infrastructure, platforms and software via simple requests. The terminology has changed but the underlying ideas are the same.

I’ve spent a fair amount of time working and thinking about “cloud” technology in the last year. Some of this time has been joyful and some of it painful. I started this list a while ago and feel it’s finally reached a critical mass, so I’m unleashing it on the world. The remainder of this post is some of the things I Love and Hate about the cloud and the services it provides me today.

I Love the Cloud

  • Provision 500GB of storage on an 8-volume RAID array in less than 10 minutes
  • Incremental backup in seconds
  • 500GB of redundant storage costs pennies
  • Access to mountains of meaningful data about almost any topic (Twitter, Flickr, Google)
  • Geo-location encoding/decoding!
  • Work from anywhere (although Coworking would be my first choice)
  • A powerful server online in minutes, use it for a day and then turn it off
  • New ways of building applications using loosely coupled systems
  • I don’t have to manage failed/failing hardware
  • Server Software as a Service (MySQL, SQS, SMTP etc)
  • Rapid scalability without capital expense
  • Wide variety of service offerings (and growing every day)

I Hate the Cloud

  • Inconsistent performance from infrastructure providers
  • Inconsistent performance from APIs (ahem Facebook)
  • Automating EC2 is labor intensive
  • Inconsistent use of terminology confuses developers, executives, media, consumers… really everyone
  • Difficult to monitor resource usage to see if upgrades are necessary
  • I still have to patch and administer infrastructure (EC2)
  • Code isn’t portable
  • More vague technology acronyms and buzzwords
  • Many points of failure within applications that leverage multiple services
  • Merging / Evolving / Failing / Deprecating platforms and services
  • Quotas and request limits

What do you love and hate about “the cloud”?

People Don’t Use Clouds

Thursday, March 4th, 2010

Are Microsoft Outlook and Apple’s Mail software? Are web-based products like Gmail and Windows Live Mail cloud offerings? What about Flickr? I can edit my photos using Picnik (for now), giving me basic photo editing functionality. Does moving traditional desktop applications into a web browser make them “cloud” software? If so, it should hold true that any web-based product or service is in some way a “cloud” service from a customer perspective.

Online CRM solutions like Salesforce are often referred to as software as a service (SaaS). How then do we classify software that enables other software to function? For example, Amazon’s SimpleDB and SQS services. These are SaaS solutions for developers to build products on. Do we need to further break down SaaS into more granular distinctions? CDW offers 13 different categories for software; Best Buy has 15. Clearly SaaS is too broad in scope to accurately describe what is being offered.

When I talk with non-technical people and mention the word “cloud”, eyes quickly glaze over. People don’t store their family photos in the cloud, they use Flickr. They don’t care how Flickr stores them, so long as they continue to have access to them. They can get their minds around products and services and generally don’t care how that product or service is delivered. Do Gmail, Flickr and Twitter make their lives easier, more enjoyable, more profitable, more fun and so on? These are the areas they care about, not whether it’s built on scalable cloud infrastructure or redundant dedicated hardware in an enterprise-grade datacenter. In the mindset of my non-technical friends, these are services, tools, websites and in some cases just verbs, adjectives and nouns. The non-technical people I talk with don’t use “search engines” anymore, they “Google it” or use Bing. People don’t use “social networking sites”, they use Facebook, Twitter or MySpace. They have little idea how their computer interacts with these services and generally don’t care. They describe the usable features a specific product has and what it does for them. Clouds, for them, are things in the sky! People don’t use clouds; people use products and services. People interact with brands. Individuals outside of technology circles are far less likely to understand or even care about the distinction between SaaS, IaaS or PaaS. The current cloud terminology is lost on them.

As for technologists, what’s most important is to be clear about what you mean, even when speaking with folks who might “get it.” Yesterday, I described what the different types of clouds mean to me. I did this to clarify for myself and for others who might read my thoughts later. After my experience at CloudCamp, it became clear to me that irrespective of how savvy an audience may seem, it is worthwhile to take a minute or two up front to define what you mean when you use different cloud terminology. After all, not everyone is functioning with the same operational knowledge.

So if people don’t use clouds, who does? Applications use clouds, or more specifically, cloud services. Applications and products are built to interact with services. Services are abstracted access to some unknown back end. If an application needs to write a file, it simply creates a file handle, stores the information and moves on. It doesn’t need to be aware of the underlying technology (SAN, NFS, SSD etc) that might be driving it. Using cloud services is really a discussion of how to architect applications, products and solutions to effectively and efficiently take advantage of the growing array of on-demand infrastructure, platforms and software. We do this to avoid provisioning physical resources. We do this to reduce time to market. We do this to be able to rapidly prototype products. We do this so we can throw something out there and see if it sticks. We do this to change technology cost from capital expenses to operational expenses.

Thinking about the cloud in this way moves the discussion from the philosophical to the tangible. Engineers, system architects and developers can use these services to build products. Need a way to store data? The discussion becomes which vendor’s product meets the need of the application. Do the relational table structures in Microsoft’s SQL Azure meet the current need, or would it be better served using hash tables like BigTable or SimpleDB? The focus for me is on the correct solution and not on clouds.

Three Types of Clouds

Wednesday, March 3rd, 2010


Last night I attended CloudCamp in Minneapolis. While there was much healthy discussion about the “cloud”, one thing became crystal clear for me: the cloud means different things to different people. George Reese summed it up well: there are three distinct types of clouds, Infrastructure, Platform and Software. I took away from the discussions that this distinction wasn’t clear for many people (including myself).

Infrastructure as a Service (IaaS):
Amazon and Rackspace are the two largest players in this space, but there are other solid offerings (including ReliaCloud) that compete with them. This is very similar in concept to leasing a dedicated server from an ISP, but with flexible pricing. Keith Schacht pointed out on my post about cloud pricing models that some providers are offering non-virtualized infrastructure on a per-hour basis. Is there any benefit to choosing a virtualized machine vs. a real machine? I think that goes beyond the scope of this discussion. Something companies need to be aware of here is that running infrastructure in the cloud doesn’t reduce the need for good system administrators, and that in terms of architecture very little has changed. Developers in this tier still need to be concerned with system capacity, etc. The upside is that many of these problems are well understood and the solutions for dealing with them are commonplace.

Platform as a Service (PaaS):
Salesforce and Google App Engine run platforms which you can build services on. These providers abstract everything away so that the product can become the focus. Designing products for platforms doesn’t require an in-depth understanding of the sub-systems. Developers don’t need to know if MySQL, Oracle, MS SQL Server or some other storage engine is handling the data storage layer; they can just trust that the data is being stored and retrieve it when they need it. Of course this model has limitations, and anyone building a product would be well served to learn about the best way to leverage the platform efficiently. The drawbacks are obvious as well. Google is an extremely reliable provider; however, they do have downtime. It is also extremely difficult to migrate platforms. None of the vendors currently provide import/export style functionality for data.

Software as a Service (SaaS):
SaaS isn’t consumer offerings, such as those from 37signals. Those are products or applications. What I’m referring to is lower-level software like MySQL, SQL Server, Amazon’s SQS and so on. Leveraging these services provides a unique opportunity to use the best solution for each task, instead of complete vendor lock-in. Developers interact with the tools and sub-systems they’re already familiar with. What the SaaS vendor does is abstract the management and scalability tasks. Unfortunately this has a dark side. Reliance on multiple providers requires building systems that degrade gracefully when any single sub-system is no longer available. Zynga, developers of the massively popular Facebook game Farmville, build their scalable systems in the cloud using the notion of degradable services: architecting the solution such that its dependence on external systems can be dialed back on demand. Building this into products up front requires a different way of thinking about application design. Someone raised the point last night, during the breakout about architecting for the cloud, that these are the same problems that were being solved in the ’70s. Designing networks of loosely coupled systems is not a new problem. However, it is a problem that many developers I’ve met haven’t spent much time thinking about… yet.
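To make “degrade gracefully” a little more concrete, here is a toy sketch of the idea: call the external service, and if the call fails, fall back to a cached or default value instead of failing the whole request. The name with_fallback is my own invention, and this is in no way how Zynga actually does it.

```shell
#!/bin/sh
# with_fallback is a hypothetical helper name.

# Run a command that depends on an external service.
# $1 = fallback value; remaining arguments = the command to run.
# Prints the command's output on success, or the fallback if it fails.
with_fallback() {
    fallback=$1
    shift
    if output=$("$@" 2>/dev/null); then
        printf '%s\n' "$output"
    else
        printf '%s\n' "$fallback"
    fi
}

# Example (the URL is a placeholder):
# with_fallback "scores unavailable" curl -sf http://scores.example.com/top10
```

The application keeps working with reduced functionality while the dependency is down, which is the behavior you want when a page leans on several outside services.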

Cloud Databases Coming Soon

Monday, April 7th, 2008

Startups looking for a database back end will soon have more options (hopefully). MySQL is a fantastic lightweight database that scales reasonably well for most projects. Microsoft SQL Server, Oracle, DB2 and other commercial offerings scale well and have lots of great support but at a massive cost, creating a barrier to entry for most providers. Last year, Amazon announced their solution, a database in the cloud (my thoughts). Not to be outdone, Google is rumored to be announcing their own online equivalent tonight at one of their campfire format press events. They used Campfire One to announce OpenSocial, and TechCrunch is reporting a rumor that they may be announcing their cloud database platform, BigTable. With this, small businesses will be able to better outsource databases (and database management) with minimal costs and potentially unlimited scale. This is ideal as the amount of user-generated, tagged, indexed, and categorized content online continues to grow. Many companies are having great success using Amazon’s Web Services for storage and computationally intensive projects. This is a natural extension of that.

© 1998-2008 AF-Design, All rights reserved.