Posts Tagged ‘memcached’

Cloud Pricing Models

Monday, December 14th, 2009

Yesterday Amazon announced their Spot pricing model, effectively providing market-driven pricing for instances on EC2. Depending on your product, this probably won’t impact you much, but it got me thinking about pricing of the cloud. Amazon’s Web Services was a game changer when it launched: buy the computing resources you need for only the time you need them. However, you’re stuck with a very limited set of instances, and therefore you need to architect your systems around their pre-defined instance sizes. While they expanded their instance offering to include high-CPU and, more recently, high-memory instances, you’re still stuck with a fairly rigid set of boxes from which to run your systems.

A specific sore spot for me with the pre-defined box sizes is Memcached. It turns out that Memcached is fairly light on the processor and requires essentially no disk I/O. Really, the processor is just a go-between for the memory and the network card. If you are looking at putting a 32GB server online to manage the caching tier for your app, you’d need to buy the “High-Memory Double Extra Large Instance” for $1.20/hr (or $10,512/year). Wait… what?! Okay, obviously we should pre-pay this. The typical business model is to run the hardware over a 3 year cycle, so let’s pay the $4,900 up front and then enjoy a more comfortable $0.42/hr (or $3,679.20/year + $1,633.33/year for the pre-pay = $5,312.53 each year for 3 years). Obviously the $15,937.60 we pay over 3 years is easier to swallow than the $31,536 if we don’t pre-pay it.
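The arithmetic above can be checked with a few lines of PHP. This is only a sketch: the rates are the ones quoted in this post, a year is assumed to be 8,760 hours, and the helper name yearly_cost is made up for illustration.

```php
<?php
// Hours in a non-leap year; the yearly figures in this post assume this value.
const HOURS_PER_YEAR = 8760;

// Hypothetical helper: cost of running one instance for a full year.
function yearly_cost(float $hourlyRate): float
{
    return round($hourlyRate * HOURS_PER_YEAR, 2);
}

$onDemandYear = yearly_cost(1.20); // on-demand rate quoted above
$reservedYear = yearly_cost(0.42); // reserved hourly rate quoted above
$upfront      = 4900.00;           // 3-year pre-pay quoted above

$onDemandTotal = round($onDemandYear * 3, 2);
$reservedTotal = round($reservedYear * 3 + $upfront, 2);

printf("On-demand: \$%s over 3 years\n", number_format($onDemandTotal, 2));
printf("Reserved:  \$%s over 3 years\n", number_format($reservedTotal, 2));
printf("Savings:   \$%s\n", number_format($onDemandTotal - $reservedTotal, 2));
```

Swapping in a different hourly rate or pre-pay amount makes it easy to compare other instance sizes the same way.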

Now, if you’re running your infrastructure in the cloud and considering Memcached, you really can’t put a box in a rack somewhere else: the increased latency and unreliability mean you may not be able to get data from your cache in a cost-effective way. So I’m not going to look at what buying a box with that kind of memory would cost. Besides, there is so much variation in buying rack/ping/power that it would be too messy to calculate here.

This has me intrigued to see how other providers are doing their billing. I love the idea of a-la-carte servers paid by the hour. But what would be really great is being able to choose the CPU, memory, and I/O I need. This brings me to two smaller cloud providers who seem to have interesting offerings.

First up is 3Tera. 3Tera offers a completely different take on the cloud infrastructure model. The idea behind their offering is that you purchase hardware (or lease it) and then slice the box however you want. Basically, you run your own virtual cloud! You can consider different hardware options, including stuffing a ton of RAM into weaker boxes and so on. Ultimately the product is a resource allocation tool. The dark side is that you have to pay for all that hardware, even if you’re not using it. Really, this isn’t a cost savings over EC2, although it’s an interesting idea if your system resource needs shift significantly over time but are consistent enough to warrant buying or leasing hardware. I’m really interested in their technology, and they have an impressive list of partners running the software, from whom you can lease virtual images.

The second provider is OpSource Cloud. OpSource charges a base fee for the VLAN service and then you build your infrastructure on top of that. The beauty is that it’s a-la-carte down to the CPU cycles and memory! Currently the memory footprint is limited to 8GB and each machine needs between 1 and 4 CPUs. However, this pricing model is interesting, as you can provision a single CPU with 8GB of RAM, which comes out to roughly $0.24/hr (or $2,102.40/year). Starting 4 of these instances to hold the 32GB of cache is only slightly cheaper than Amazon’s model, coming in at a whopping $8,409.60/year. There are some cost savings available if you buy a silver, gold or platinum pricing tier for a monthly pre-pay. The pricing for those starts at $500/month and goes up, so you really need to have some significant hardware running to justify those costs. Another gotcha with this plan is that you need to provision a network, which is $0.20/hr. I’m going to be keeping an eye on this provider. I think in the future they may have a winning solution.
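To see what the network charge does to that comparison, here is a rough sketch in PHP. The rates are the ones quoted above. The assumptions are mine: a single network billed for every hour of an 8,760-hour year, and no pricing-tier discount.

```php
<?php
const HOURS_PER_YEAR = 8760; // non-leap year, matching the yearly figures quoted above

$instanceRate = 0.24; // $/hr for 1 CPU + 8GB, as quoted above
$networkRate  = 0.20; // $/hr for the network, as quoted above
$instances    = 4;    // 4 x 8GB = 32GB of cache

$instancesYear = round($instanceRate * $instances * HOURS_PER_YEAR, 2);
$networkYear   = round($networkRate * HOURS_PER_YEAR, 2); // assumption: one network, billed every hour
$totalYear     = round($instancesYear + $networkYear, 2);

printf("Instances: \$%s/year\n", number_format($instancesYear, 2));
printf("Network:   \$%s/year\n", number_format($networkYear, 2));
printf("Total:     \$%s/year\n", number_format($totalYear, 2));
```

Under those assumptions the total lands close to the $10,512/year on-demand price of the single Amazon instance, which is why the savings feel slight.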

Unfortunately, I don’t yet see a solution that fits my specific need. Perhaps I need to adjust my thinking and look at alternatives. It may be time to consider Amazon’s SimpleDB, which provides simple key/value storage like Memcached, although as a service. Is it the answer for putting large amounts of data into a non-RDBMS? I’ll consider that in another post.


Creative Commons Photo by ArcticNomad

PHP Memory Caching Performance

Wednesday, January 28th, 2009

Today I had the great pleasure to work with both APC and Memcached in a production environment. I’ve read that APC is roughly 3-5x faster than Memcached, so I decided to do my own test to see which performed better on my rig. If you want more numbers that may better mimic a production server environment, be sure to read Peter from the MySQL Performance Blog’s initial article, which found APC to be roughly 3x faster. That said, I found his results to be spot on a little over 2 years later.

Being ultimately more concerned with the number of requests Apache serves back to my clients, I opted to use metrics similar to Jay Pipes’s method, leveraging ApacheBench to determine the best way to push more content to the users faster. I created two test files which are nearly identical, accessing the cache layer using the simplest calls possible.

The tests were run against localhost after a pre-warmup to ensure the correct number of Apache workers were initiated and the cache had the correct content. This is important, because it means these numbers are a good comparison of read performance but say nothing about write throughput. The test was performed against a desktop system, so background processes may have varied slightly from test to test despite efforts to disable everything.

ApacheBench was run over 10,000 requests with a concurrency of 50. The values are the average of 3 consecutive runs. Standard deviation for connection time was consistent over each of the runs. APC proved significantly faster (about 30% more requests per second) even with Memcached on the localhost. However, Memcached scales out, whereas APC is tied to the local machine. That alone may be sufficient reason to use it over APC in your environment despite APC’s performance benefit.

Some large-scale applications benefit from multiple layers of caching. According to Matt Raible’s notes from OSCON 2008, Facebook uses $GLOBALS, APC and Memcached as their first lines of caching defense. This seems to further validate Peter’s findings.
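A layered lookup of this kind can be sketched roughly as follows. This is not how Facebook implements it. The function names are invented for illustration, and plain arrays stand in for $GLOBALS, APC and Memcached so that the sketch runs without any extensions.

```php
<?php
// Hypothetical helper: wrap a plain array as a cache layer. A layer is any
// callable that takes a key and returns the value, or null on a miss.
function array_layer(array $store): callable
{
    return function (string $key) use ($store) {
        return $store[$key] ?? null;
    };
}

// Try each layer in order (fastest first) and report which one answered.
function layered_get(array $layers, string $key): ?array
{
    foreach ($layers as $name => $lookup) {
        $value = $lookup($key);
        if ($value !== null) {
            return array('layer' => $name, 'value' => $value);
        }
    }
    return null; // missed everywhere; the caller would fall back to the database
}

$layers = array(
    'globals'   => array_layer(array()),                                         // per-request
    'apc'       => array_layer(array('user:12' => 'Ada')),                       // per-machine
    'memcached' => array_layer(array('user:12' => 'Ada', 'user:13' => 'Grace')), // shared
);

print_r(layered_get($layers, 'user:12'));
print_r(layered_get($layers, 'user:13'));
var_dump(layered_get($layers, 'user:99'));
```

Here the first lookup is answered by the stand-in for APC and the second only by the stand-in for Memcached. A real version would also copy a value found in a slower layer back into the faster ones.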

Results:

                              APC        Memcached   Comparison
Requests/Second               2,088.43   1,611.59    APC ~30% More
Time/Request (mean)           0.48ms     0.62ms      APC ~23% Faster
99% of Requests Finished in   63ms       102ms       APC ~38% Faster

Test Code:

The scripts are fairly straightforward. I didn’t want this to be a comparison of MySQL database accesses and so I manually created a simple object to cache.

<?php
 
	// APC Cache 
 
	$success = false;
	$data = apc_fetch("test_object", $success); // $success is filled in by reference
	if($success){
		print_r($data);
	} else {
		// store an object in the cache for the next call
		$test_object = array();
		$test_object['key1'] = 1;
		$test_object['key2'] = "hello world";
		$test_object['key3'] = array(1,2,3);
		$result = apc_store("test_object", $test_object, 3600);
		print_r($test_object); // on a miss, print the freshly built object
	}
?>
 
<?php
 
	// Memcached Cache 
 
	$memcache = new Memcache();
	$memcache->addServer("127.0.0.1", 11211);
	$data = $memcache->get("test_object");
	if($data !== false){
		print_r($data);
	} else {
		// store an object in the cache for the next call
		$test_object = array();
		$test_object['key1'] = 1;
		$test_object['key2'] = "hello world";
		$test_object['key3'] = array(1,2,3);
		// arguments are key, value, flag, expire: no flags, expire after one hour
		$memcache->set("test_object", $test_object, 0, 3600);
		print_r($test_object); // on a miss, print the freshly built object
	}
?>

System Configuration:

  • Mac OS 10.5.6
  • 2.16Ghz Intel Core Duo
  • 2 Gb RAM
  • PHP 5.2.6
  • APC 3.1.2
  • Memcached 1.1.12
  • Apache 2.2.9
  • ApacheBench 2.3

Memcached with PHP on Mac OS X

Saturday, October 18th, 2008

Nate Haug provides a great script for installing memcached along with some very detailed instructions on setting up a sandbox environment. I’m not using his MAMP sandbox, instead opting for the built in PHP / Apache install, so I needed to change a few things from his tutorial. My system is a fully updated Intel MacBook Pro running OS X 10.5.5 with the Xcode tools installed – YMMV. PHP is currently reporting version 5.2.6.

  1. I added my revised start script for memcached.
  2. The PHP version that ships with OS X doesn’t have PECL, so I downloaded the memcache extension source and compiled it manually.
    phpize; ./configure; make; sudo make install
  3. Edit to /etc/php.ini: Changed: extension_dir = /usr/lib/php/extensions/no-debug-non-zts-20060613/
  4. Edit to /etc/php.ini: Added: extension=memcache.so

You can skip his Apache scripts. Restart apache by restarting Web Sharing in the System Preferences.

The major changes I made from Nate’s memcached startup script were running a single instance and binding the service to localhost (127.0.0.1) only. This keeps memcached slightly more secure by only having it listen on the loopback adapter. If you need more space, just set the -m option higher; it’s measured in MB.

#!/bin/sh
memcached -m 1 -l 127.0.0.1 -p 11211 -d

NOTE: As with any other service running on your system, opening a web server exposes your system to potential attack and worse. Be sure to keep production data away from your test environment. Someone at Starbucks, sharing your WiFi connection, may be surfing your development site too. Consider yourself warned!

Memcache Feature-Bug Gotchas

Thursday, July 17th, 2008

Recently I’ve been doing a lot of work with memcached using PHP and have been bitten a few times by how things work. I’m calling those items out here so anyone getting started with Memcache can learn from my mistakes. Memcached is an amazingly powerful caching layer with lots and lots of online documentation. It’s easy to get it running on Linux and to hook PHP into it; I’ll spare you yet another post about how to do that since there are so many excellent resources already. The hard part is determining where you’ll implement it and in what way. For this post, I’ll leave the implementation strategy aside and walk you through a couple of examples of where I’ve been bitten. I’ve created a layer to further abstract the Memcache object in PHP so I can avoid dealing with the add() vs. replace() vs. set() distinctions and have one-stop shopping for all of my configuration settings. The source for that basic class is included at the bottom of this post; feel free to use/distribute as you see fit.

One last item before we delve into the examples. Memcached and memcache are not quite the same thing. When I reference Memcached – I’m actually referring to the server instance of Memcache which you are running on the server, which is accessible using a variety of methods, on a variety of platforms including but not limited to PHP. When I use memcache in this post, I’m referring to the API hooks that have been created for PHP to interact with your Memcached server instance. Some (or possibly all) of these items ONLY APPLY TO PHP and shouldn’t be construed as feature-bugs with the Memcached server itself.

Compression and Object Sizes

It took me a while to track down this bug, but I finally read up on memcache a bit more and learned that memcache behaves oddly with small chunks of information if compression is turned on. I’m pretty sure this is memcache, not memcached, that’s causing the issue because the compression layer happens in PHP. The output for the following code is “The value is not the same”. However, if $compression is set to false, it will work as expected. Integers and character strings seem to be okay with compression on, as do complex objects. It may also matter that I pass a boolean as the flag argument here; the PHP manual documents that parameter as taking the MEMCACHE_COMPRESSED constant. The setCompressThreshold method allows adjustment of the size at which compression kicks in, but I’ve gotten in the habit of not caching simple values like true and false, instead opting to cache objects, classes, arrays and alternately JSON.

// create the cache
$cache = new Memcache();
$cache->addServer("localhost","11211");
$compression = true;
 
// create the value in the cache
$x = true;
$cache->add("x", $x, $compression, 10);
 
// access the value
$y = $cache->get("x", $compression);
 
// check what happened
if($x == $y){
   print "The value is the same";
} else {
   print "The value is not the same";
}
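One way to follow that habit is to wrap every value in JSON before it goes into the cache, so a bare true or false is never stored directly. The sketch below is an assumption about how that could look, not the code I run in production; encode_for_cache and decode_from_cache are made-up names.

```php
<?php
// Hypothetical helper: wrap the value in a small array and store it as JSON,
// so booleans and other scalars reach the cache as ordinary strings.
function encode_for_cache($value): string
{
    return json_encode(array('v' => $value));
}

// Hypothetical helper: returns the original value, or null when the cache
// returned something other than one of our JSON wrappers (e.g. false on a miss).
function decode_from_cache($stored)
{
    if (!is_string($stored)) {
        return null;
    }
    $decoded = json_decode($stored, true);
    if (!is_array($decoded) || !array_key_exists('v', $decoded)) {
        return null;
    }
    return $decoded['v'];
}

$stored = encode_for_cache(true);           // this string is what you would pass to the cache
var_dump(decode_from_cache($stored));       // the boolean survives the round trip
var_dump(decode_from_cache(false));         // what a cache miss would look like
```

The trade-off is that a cached null and a miss look the same to the caller, which is fine for my purposes but worth knowing.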

Caching of Class Objects

When caching complex objects like classes, memcache serializes the object’s state (its class name and property values) and caches it as it was when it was stored. So be warned: if your class definition changes, you’ll need to flush your cache entirely of those objects or you might find it behaves a little differently than you’re expecting. Let’s say you have a class whose methods update multiple properties when called, and you wish to change how one of those properties is calculated. Any objects that are in the cache already will continue to carry the old values until they are flushed from the cache. It’s not sufficient to read the object out and put it back in, because the stored state was produced by your OLD class definition.

class Foo{
   protected $property = array();
   public function __construct($arr){
      if(count($arr) > 0){
         $this->property = $arr;
      }
   }
   public function __get($key){
      return $this->property[$key];
   }
   public function __set($key, $value){
      $this->property[$key] = $value;
   }
}

So now you can create Foo objects all day and stuff all sorts of information into them and cache them. You can also get them back out willy nilly later (I’ll use my cache class to save time on the code below).

$foo1 = new Foo(array("apples"=>11,"orange"=>20));
$cache->set("foo1",$foo1, 60);
$foo2 = $cache->get("foo1");
echo $foo2->apples; // should be 11

So we see that it all works, but what if we change the way the class works? For example, adding a layer of math to calculate a tax or something along those lines.

class Foo{
   protected $property = array();
   public function __construct($arr){
      if(count($arr) > 0){
         $this->property = $arr;
      }
   }
   public function __get($key){
      return round($this->property[$key] * .9); // calculate storage/depreciation loss
   }
   public function __set($key, $value){
      $this->property[$key] = $value;
      $this->property['num_items'] = count($this->property) - 1;
   }
}

Our existing cached object doesn’t behave as expected. One of two things seems to happen, and I haven’t fully fleshed out which happens when. Either the object comes back just as it was initially instantiated, or it silently dies without returning an error. This might be a good reason to create a version value for your cached objects so you can switch on the version to determine if the cached value is valid.

// let's access our existing cache object from before again...
$foo2 = $cache->get("foo1");
echo $foo2->apples; // we might hope for 10 but...
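The version value suggested above could be sketched like this. The constant and function names are hypothetical, and the payload is a plain array so the example runs without a cache server; in practice you would store the wrapped value with your cache layer and unwrap it after reading.

```php
<?php
// Hypothetical constant: bump this whenever the cached class definition changes.
const FOO_CACHE_VERSION = 2;

// Wrap a payload together with the version of the code that produced it.
function wrap_versioned($payload, int $version): array
{
    return array('version' => $version, 'payload' => $payload);
}

// Returns the payload when the version matches, or null to signal
// "treat this as a cache miss and rebuild the value".
function unwrap_versioned($cached, int $expectedVersion)
{
    if (!is_array($cached) || !isset($cached['version']) || $cached['version'] !== $expectedVersion) {
        return null;
    }
    return $cached['payload'];
}

$stale = wrap_versioned(array('apples' => 11), 1);                 // written by the old code
$fresh = wrap_versioned(array('apples' => 11), FOO_CACHE_VERSION); // written by the new code

var_dump(unwrap_versioned($stale, FOO_CACHE_VERSION)); // rejected as stale
var_dump(unwrap_versioned($fresh, FOO_CACHE_VERSION)); // accepted
```

With this in place a deploy that changes the class only needs to bump the version; stale entries are then ignored and age out on their own instead of requiring a full flush.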

Database Results

Caching of resources doesn’t work. The data being cached needs to be able to be serialized by memcache so it can be inserted into memcached. Database handles are much like your memcached connection – they’re a socket you talk to and unfortunately, so are MySQL results. There are good reasons for this so don’t gripe about it. You’ll need to write a simple wrapper that does all of the result parsing for you prior to caching. Then you can easily create a cacheable MySQL object that can be inserted into memcached. It only takes a few minutes to do this and I may even post later describing the wrappers I’m now using to do just this. Until then – know that you can quickly create an array of your data using the following code and cache that result instead.

// create a cache object (using class from below)
$cache = new Cache();
 
// create an array to populate the data with
$data_array = array();
 
// run the query 
$mysqli_result = $mysqli->query("select * from table where condition=true");
 
// stuff all the data into the array
while($row = $mysqli_result->fetch_assoc()){
   $data_array[] = $row;
}
 
// cache the array
$cache->set("query_data",$data_array,90);

Cache Time to Live

Nothing too big here, but if you provide 0 (zero) or false for a time to live/expiration value, the item never expires; it just gets pushed out later if space is needed. This all happens on an LRU basis and is well documented.

Protect your Namespaces

This may seem trivial, but I’ve been bitten here too. Often it’s sufficient to use one server for multiple tasks. Since it’s easy to run Memcached as one large pool (much like you would with MySQL), it’s easy to share it across multiple applications. There are some nice economies of scale this will afford you. But consider the following bug you could create for yourself in your logic.

Application 1 accessing its DB table.

 
// Application 1 fetching content about a user
$memcache = new Cache();
$query = "select * from users where userid = 12";
$result = $memcache->get(hash('md5', $query));
 
// The data wasn't in cache, so we run the query below and store the data
if(!$result){
     $result = $mysql->query($query);
     $memcache->set(hash('md5', $query), $result, 600);
}

Application 2 accessing its DB table.

// Application 2 fetching content about a user
 
$memcache = new Cache();
$query = "select * from users where userid = 12";
$result = $memcache->get(hash('md5', $query));
 
// the value existed in cache - so it skips the query and uses the cached value
if(!$result){
     $result = $mysql->query($query);
     $memcache->set(hash('md5', $query), $result, 600);
}

Application 2 and Application 1 are using the EXACT same key to reference their data. Unless this is intentional (because they share a common database) it can be a real pain to debug. The easiest way to correct this is to create a namespace for the cache layer and append it to any keys you may use. The example class provided below does just that with minimal fuss. The code above would be changed to reflect the correct namespace for each application and they could co-exist using the memcached server together.

// in app 1 - use that namespace
$cache = new Cache("app1");
 
// in app 2 - use that namespace
$cache = new Cache("app2");
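To make the collision concrete, here is a small sketch of how a namespace changes the key that each application ends up using. The helper name namespaced_key is made up, and it prefixes the namespace with a colon separator, whereas my class below appends it; either works as long as every application is consistent.

```php
<?php
// Hypothetical helper: build a cache key that is unique per application.
// Hashing the query keeps the key short and free of whitespace.
function namespaced_key(string $namespace, string $query): string
{
    return $namespace . ':' . md5($query);
}

$query = "select * from users where userid = 12";

$app1Key = namespaced_key("app1", $query);
$app2Key = namespaced_key("app2", $query);

// Same query, different applications, different keys.
var_dump($app1Key === $app2Key);
// Same query, same application, same key on every request.
var_dump($app1Key === namespaced_key("app1", $query));
```

The first comparison is false and the second is true, which is exactly the behaviour you want when two applications share one memcached pool.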

Memcache Abstraction Class

This is the abstraction class I use to handle all memcache interaction. It’s little more than a thin veneer over the existing PHP object. You can see where it’s easy to expand this basic cache layer within the constructor and you can tune for your data, servers and other bits relevant to your implementation as needed.

class Cache{
 
   protected $cache = false;
   protected $namespace = "";
 
   public function __construct($namespace = ""){
      $this->cache = new Memcache();
      $this->cache->addServer("localhost", 11211);
      // automatically compress larger values (threshold 127 bytes) when that saves at least 20%
      $this->cache->setCompressThreshold(127, 0.2);
      $this->namespace = $namespace;
   }
 
   public function __destruct(){
      $this->cache->close();
   }
 
   // store a value under the namespaced key; $ttl is in seconds
   public function set($key, $value, $ttl = 600){
      return $this->cache->set($key . $this->namespace, $value, MEMCACHE_COMPRESSED, $ttl);
   }
 
   // fetch a value by its namespaced key; returns false on a miss
   public function get($key){
      return $this->cache->get($key . $this->namespace);
   }
 
}
© 1998-2008 AF-Design, All rights reserved.