Posts Tagged ‘performance’

Share Using Data Pipelining to Improve Gadget Speed

Wednesday, June 2nd, 2010

Beginning with OpenSocial 0.9, support for a new featured called Data Pipelining was introduced. Pipelining is really simple to implement and can bring a lot of speed and power to your applications. Pipelined requests are sent before a gadget has been loaded so they are significantly faster.

Data pipelining can be used to load all sorts of information from friends lists, remote json data and plain HTML. For this example, I’m showing how to use pipelining to load in HTML to generate a home view more quickly than using gadgets.io.makeRequest() or osapi.http() after the gadget loads. Another option for quickly loading content is to use tag which provides essentially the same functionality as the data pipelining example here, but may or may not be supported by your container. To really get into the power of these alternatives to fetching data after load, be sure to read the developer wiki notes on remote data requests.

This simple gadget that will load the contents of a file and inject them into the DOM using simple JavaScript.

<?xml version="1.0" encoding="UTF-8" ?>
<Module>
	<ModulePrefs>
		<Require feature="opensocial-0.9" />
		<Require feature="opensocial-data"/>
		<Require feature="views" />
		<Require feature="dynamic-height" />
	</ModulePrefs>
	<Content type="html" view="home">
	<![CDATA[
<script xmlns:os="http://ns.opensocial.org/2008/markup" type="text/os-data">
	<os:HttpRequest key="home" href="http://example.com/data_pipeline.php" method="post" format="text" authz="signed" />
</script>
<script type="text/javascript" src="http://example.com/display.js"></script>
<div id="target_div">
</div>
}]>
	</Content>
</Module>

The content of display.js simply loads the pre-fetched content into the target_div on the home page view.

View Code JAVASCRIPT
function init(){
	var home = opensocial.data.getDataContext().getDataSet('home');
	if(home && !home.error){
		if(home.content != undefined){
			document.getElementById('target_div').innerHTML = home.content;
		} else {
			document.getElementById('target_div').innerHTML = home.result.content;
		}
	}
}
gadgets.util.registerOnLoadHandler(init);

The entire “page” that you would want to generate for this display should be passed back as the output from the file http://example.com/data_pipeline.php. The file will be called with as a signed post containing the current viewer $_POST['opensocial_viewer_id'] and owner $_POST['opensocial_owner_id'] user id values along with the other required signature fields.

You can now query your systems to generate the appropriate display. For example, if the owner and viewer are the same, you might show a summary of activity to their account since they last visited the application. If the viewer and owner are not the same, you might display a public view of the owners activity within the application (and possibly promotional material if the viewer isn’t currently a user of the app).

Share Using CSS for Icons

Friday, July 17th, 2009

During an exploration of CloudFront it was noted that requests were the lionshare of the costs for one application. How much savings could be realized by merging all icon files into one and serving them using positioning in CSS? Quite a bit actually.

First, I created an icon file that is 128 pixels square to hold my 16 pixel square icons. Each of the icons individually averages out to ~700 bytes. After adding all the images, the total filesize for this new icon file was ~21.5K. This already equates to a significant space savings on any page that loads all 64 icons which would translate to ~44.8K. More space could be saved had I used a gif, but I wanted retain the alpha – personal preference. Next I created the necessary CSS to position these images using intelligent names like s-icon-heart to replace images/heart.png, this makes the code very readable. The CSS positioning code for my icon file to support all 64 icons is ~3.6K which needs to be added to the stylesheet sitewide.

Adding the icon file and the CSS together yields a total savings of ~19.7K for pages that use all icons in the file. While not every page uses every icon, stylesheets and images are usually cached by the browser so future requests for these assets won’t incur a trip to the server. Furthermore we’ve eliminated up to 63 http requests which, although individually are quick, cumulatively they add up to serious time for the browser!

You can see how I made my icon file by dropping the icons into a grid in Photoshop. I then save this out as a large PNG. Be aware that older versions of IE do not support PNG transparency so you might get some odd behaviors there. As always test in the browsers you wish to support.

Icon Screenshot

I then add the necessary positioning elements into my CSS file for each of the icons. Getting the names and positions was probably the most tedious part of the process.

div.icon { width:16px; height:16px; }
.icon { background-image: url('images/icons.png'); }
.icon-star-bronze-red { background-position: 0 0; }
.icon-star-bronze-green { background-position: -16px 0; }
.icon-star-bronze-blue { background-position: -32px 0; }
.icon-star-gold-red { background-position: -48px 0; }
.icon-star-gold-green { background-position: -64px 0; }
.icon-star-gold-blue { background-position: -80px 0; }
/* ... 58 more definitions */

There are a number of ways to then use this in your code. You can point towards a clear.gif, if you’re already using one for positioning or layout. You can apply it as the background image on an element. Get creative.

<!-- using an image -->
<img src="images/clear.gif" width="16" height="16" class="icon icon-star-gold-red" />
<!-- using a div -->
<div class="icon icon-star-gold-red"></div>

If you are looking for a kick ass icon library, look no further than the FAMFAMFAM Silk Icons which are available for free under a creative commons license.

Share Amazon EC2 Disk Performance

Friday, February 27th, 2009

Amazon Web Services Logo
Update (3/3/2010): a better measure of RAID performance is available here.

While considering different options for a database server, I decided to do some digging into Amazon Web Services (AWS) as an alternative to dedicated servers from an ISP. I was most curious about the I/O of the Elastic Block Storage (EBS) on the Elastic Compute Cloud (EC2). What I tested was a number of different file systems EXT3, JFS, XFS, ReiserFS as single block devices and then some different software RAID configurations leveraging JFS. The tests were run using Bonnie++.

The configuration was vanilla, no special tuning was done, just the default values that are assigned by the tools. I used Fedora Core 9 as the OS from the default Amazon AMI and used “yum install” to aquire the necessary utilities (more on that below). I expect with further tuning, some increases in performance can still be obtained. I used the small instance for cost reasons, which includes “moderate” I/O performance. Running on a large or extra-large standard instance should perform even better with “high” I/O performance. You can get all the instance specifications from Amazon.

First I wanted to determine what the EBS devices would compare to in the physical world. I ran Bonnie against a few entry level boxes provided by a number of ISP’s and found the performance roughly matched a locally attached SATA or SCSI drive when formatted with EXT3. I also found that JFS, XFS and ReiserFS performed slightly better than EXT3 in most tests except block writes.

The Numbers

Again, let me re-iterate that some numbers may not be accurately reflected in your production environment. Amazon states, small instances have “moderate” I/O availability. Presumably if your running this for a production DB, you’ll want to consider a large or extra-large instance for the memory and so you should see slightly better performance from your configuration. Also note, that the drives I allocated were rather small (to keep testing costs low) so you may experience different results with larger capacities.

Note: The graph below is in KB, not bytes as titled.

Bonnie Disk Performance on EC2

Size (Filesystem) Output Per Char Output Block Output Re-write Input Per Char Input Block
4x5Gb RAID5 (JFS) 22,349 58,672 39,149 25,332 84,863
4x5Gb RAID0 (JFS) 24,271 99,152 43,053 26,086 96,320
10Gb (XFS) 20,944 43,897 24,386 25,029 65,710
10Gb (ReiserFS) 22,864 57,248 17,880 21,716 44,554
10Gb (JFS) 23,905 47,868 21,725 24,585 55,688
10Gb (EXT3) 22,986 57,840 22,100 24,317 48,502

As expected, RAID 0 does best with read/write speed and RAID 5 does very well on reads (input block) as well. For InnoDB, the re-write and block read (input)/write (output) operations are the most critical values. Longer bars mean better values. To better understand what the test is doing, be sure to read the original Bonnie description of each field.

Making Devices

The process for making a device is simple. There are many tutorials on how to make this persistent and you can certainly build this into your own AMI when you’re done – this is not a tutorial on how to do that. To get a volume up and running you’ll follow these basic steps:

  1. Determine what you want to create – capacity, filesystem type etc.
  2. Allocate EBS storage
  3. Attache the EBS storage to your EC2 instance
  4. If using RAID, create the volume.
  5. Format the filesystem
  6. Create the mount point on the instance filesystem
  7. Mount the storage
  8. Add any necessary entries to mount storage at boot time.

Single Disk Images

Remember, the speed and efficiency of the single EBS device is roughly comparable to a modern SATA or SCSI drive. Use of a different filesystem (other than EXT3) can increase different aspects of drive performance, just as it would with a physical hard drive. This isn’t a comparison of the pros and cons of different engines, but simply providing my findings during testing.

JFS yum install jfsutils
XFS yum install xfsprogs
ReiserFS yum install reiserfs-utils

I didn’t test any other filesystems such as ZFS, because I’ve read some other filesystems are unstable on Linux and I’ll be running production on Linux so the extra time for the tests seemed unnecessary. I am interested in other alternatives that could increase performance if you have any to share I’d love to hear about them.

You can quickly get a volume setup with the following:

mkfs -t ext3 /dev/sdf
mkdir /vol1
mount /dev/sdf /vol1

Next time you mount the volume, you won’t need to use “mkfs” because the drive is already formatted.

RAID

The default AMI already includes support for RAID, but if you needed to add them to your yum enabled system, it’s “yum install mdadm”. On the Fedora Core 9 test rig I was using, RAID 0, 1, 5, 6 were supported, YMMV.

To create a 4 disk RAID 0 volume, it’s simply:

mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
mkfs -t ext3 /dev/md0
mkdir /raid
mount /dev/md0 /raid

To create a 4 disk RAID 5 volume instead, it’s simply:

mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
mkfs -t ext3 /dev/md0
mkdir /raid
mount /dev/md0 /raid

This example assumes you have 4 EBS volumes attached to the system. AWS shows 7 possible mount points /dev/sdf – /dev/sdl in the web console, however, the documentation states you can use through /dev/sdp, which is 11 EBS volumes in addition to the non-persistent storage. This would be a theoretical maximum of 10TB of RAID 5 or 11TB of RAID 0 storage!

Checking in on things…

  • cat /proc/mdstat
    is a great way to check in on the RAID volume. If you run it directly after creating a mirroring or striping array, you’ll also be able to see the scrubbing process and how far along it is.
  • mount -l
    shows the currently mounted devices and any options specified.
  • df
    disk free provides a nice list of device mounts and their total, available and used space.

Conclusion

It’s clear from the numbers that software RAID offer a clear performance advantage over a ESB volume. Since with EBS you pay per Gb not per disk, it’s certainly cost effective to create a robust RAID volume. The question that remains is how careful do you need to be with your data? RAID 0 offered blistering fast performance but like a traditional array, without redundancy. You can always set it up as RAID 5, RAID 6 or RAID 10 but this of course requires more unusable space to handle the redundancy.

Since the volumes on EBS are theoretically invincible, it may be okay to run unprotected by a mirror or parity drive, however, I haven’t found anyone who would recommend this in production. If anyone knows of a good reason to ignore the saftey of RAID 10 or RAID 6 or RAID 5, I’d love to hear the reasoning.

I am also curious if these drives maintain a consistent throughput over the full capacity of the disk or will they slow down as the drive fills like a traditional drive? I did not test this. It remains open for another test (and subsequent blog post). Should anyone run ZCAV against a 100Gb+ drive and figure that out, please let me know.

Fine Print – The Costs

Storage starts at a reasonable $0.10/GB-Month which is reasonable and is prorated for only the time you use it. A 1Tb RAID 0 volume made of 10x100Gb volumes would only cost $1,200 per year. Good luck getting performance/dollar costs for 1Tb like that from any SAN solution at a typical ISP. There are however some hidden costs in the I/O that you’ll need to pay attention to. Each time you read or write a block to disk, there’s an incremental cost. The pricing is $0.10 per million I/O requests – which seems cheap, but just running the simple tests I ran with Bonnie++ I consumed almost 2 million requests in less than 3 hours of instance time. If you have a high number of reads or writes, which you likely do if you’re considering reading this, you’ll need to factor these costs in.

The total AWS cost for running these tests was $0.71 of which $0.19 were storage related. The balance was the machine instances and bandwidth.

Resources

Share PHP MySQL vs MySQLi Database Access Metrics

Friday, January 30th, 2009

Last week I had a MySQL DBA/Developer tell me, “MySQLi sucks” when I asked why some code I was reviewing used the mysql_* extensions provided in PHP instead of mysqli_*. This really didn’t sit well with me. I turned to a trusted source for PHP information, and found they recommend mysqli_* for connecting to MySQL servers versions 4.1.3 and above. Although, I read many anecdotal stories about how mysqli_* was faster, I was unable to find any quantified proof of just how much faster. So in keeping with my performance measurement track I’ve been on this week, I decided to conduct my own test. I was surprised by the result. MySQL is faster in some cases.

The database I used is a pet project of mine that tracks some vehicle information. It uses a varchar primary key based on an automotive VIN number and the sample table I queried against has about 30K rows. The average row length is about 150bytes. I conducted 2 types of tests. The first was a completely unrealistic test measuring how fast a single row could be returned using each of the extension types. The other also returned a random sampling of rows. Since this isn’t a performance test of MySQL, I allowed each test to run one time completely to ensure the query cache was fully populated in MySQL. Then each test was run 3 subsequent times. Each test executed 1,000,000 queries to the server.

Single Hard Coded Query:

The first listing is the mysql_* calls. You’ll notice the

mysql_free_result()

function is commented out. I did test with this enabled and disabled and found that with this function enabled the queries returned at the same rate as with it enabled. So in the interest of fairness, I disabled it to more closely mimic the code executing with the mysqli_* calls.

<?php
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysql_connect("127.0.0.1","user","");
	mysql_select_db("test_schema", $conn);
	for($i=0; $i<$tq; $i++){
		$data = mysql_query("SELECT * FROM vehicles LIMIT 1", $conn);
		$data_object = mysql_fetch_assoc($data);
		// mysql_free_result($data);
	}
	mysql_close($conn);
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

The next listing is the same code converted to leverage the mysqli_* calls. You’ll see that the result is not freed each time through the loop.

<?php
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	for($i=0; $i<$tq; $i++){
		$data = $conn->query("SELECT * FROM vehicles LIMIT 1");
		$data_obj = $data->fetch_assoc();
	}
	$conn->close();
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

The results were surprising to me. The mysql_* extension actually ran considerably faster than the mysqli_* extenstion. In fact, the mysql_* extension ran 3.6% faster. Perhaps this DBA was on to something? I also tested this same methodology using prepared statements with MySQLi and found the results actually degraded. Discouraged, I devised a new test that would hopefully be more realistic. The values provided are queries per second.

Method Run #1 Run #2 Run #3 Avg
mysql_* 7,227.7 7,222.2 7,220.7 7,223.5
mysqli_* 6,963.0 6,974.0 6,974.4 6,970.5
mysqli prepared statements 4,849.6 5,005.3 4,978.7 4,944.5

Parameterized Queries:

More often than not, we do not run the same select query 1 million times, and we are actually more likely to run a query over a series of data sets. While this in itself is a terrible idea in production code because you should re-write the query to do it all as a single database call, it provides a better picture of how the database might be interacted with in PHP; especially if persistent connections are being utilized. However, if you find that you have some code that fits the patterns used here, it might be to your advantage to review the prepared statement code below.

The basic flow of the scripts is to first fetch 10 thousand (~1/3) of the table’s primary keys and populate an array. Then the code will select 1 million rows using a randomly selected primary key (vin). The data is still relatively small, and fits into the query cache after a warm up period. This shows how efficiently the query passing is instead of how efficient MySQL is at answering different questions, making it a better comparison of the two technologies. The three test scripts I ran are provided here for your reference.

Leveraging the mysql_* calls

 
<?php
	print "Fetching VIN array...\n";
	$vin_array = array();
	$conn = mysql_connect("127.0.0.1","user","");
	mysql_select_db("test_schema",$conn);
	$vins = mysql_query("SELECT vin FROM vehicles LIMIT 10000", $conn);
	while($vin = mysql_fetch_assoc($vins)){
		$vin_array[] = $vin['vin'];
	}
	mysql_close($conn);
	unset($conn);
 
	print "Starting test...\n";
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysql_connect("127.0.0.1","user","");
	mysql_select_db("test_schema", $conn);
	for($i=0; $i<$tq; $i++){
		$data = mysql_query("SELECT * FROM vehicles WHERE vin = '" . $vin_array[rand(0,count($vin_array)-1)] . "'", $conn);
		$data_object = mysql_fetch_assoc($data);
		// mysql_free_result($data);
	}
	mysql_close($conn);
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

Leveraging the mysqli_* calls

<?php
	print "Fetching VIN Array...\n";
	$vin_array = array();
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	$vins = $conn->query("SELECT vin FROM vehicles LIMIT 10000");
	while($vin = $vins->fetch_assoc()){
		$vin_array[] = $vin['vin'];
	}
	$conn->close();
	unset($conn);
 
	print "Starting test...\n";
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	for($i=0; $i<$tq; $i++){
		$data = $conn->query("SELECT * FROM vehicles WHERE vin = '" . $vin_array[rand(0,count($vin_array)-1)] . "'");
		$data_obj = $data->fetch_assoc();
	}
	$conn->close();
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

Leveraging mysqli_ calls and prepared statements.

<?php
	print "Fetching VIN array...\n";
	$vin_array = array(); 
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	$vins = $conn->query("SELECT vin FROM vehicles LIMIT 10000");
	while($vin = $vins->fetch_assoc()){
		$vin_array[] = $vin['vin'];
	}
	$conn->close();
	unset($conn);
 
	print "Running tests...\n";
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	$stmt = $conn->stmt_init();
	$stmt->prepare("SELECT * FROM vehicles WHERE VIN = ?");
	for($i=0; $i<$tq; $i++){
		$stmt->bind_param("s", $vin_array[rand(0,count($vin_array) - 1)]);
		$stmt->execute();
		$stmt->bind_result($a,$b,$c,$d,$e,$f,$g,$h,$z,$j,$k,$l,$m,$n);
		$stmt->fetch();
	}
	$conn->close();
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

The performance of this “harder” task degraded for all technologies. Again, the mysql_* extension was about 3.6% faster, but the prepared statements were 9.7% faster than the old mysql_* calls and 13.7% faster than the mysqli_ calls. This improvement makes a good case for transitioning repetitive calls in your code to prepared statements leveraging the MySQLi extenstions. Again, the values are measured in queries per second.

Method Run #1 Run #2 Run #3 Avg
mysql_* 4,063.6 4,016.7 4,054.0 4,044.8
mysqli_* 3,902.3 3,905.8 3,896.0 3,901.4
mysqli prepared statements 4,443.8 4,421.8 4,445.7 4,437.1

Without testing stored procedures, it’s difficult to say if they would have continued to improve the performance of this test, however, it is clear that the speed advantages of mysqli_* extensions are only present if you are utilizing the advanced features of the more recent versions of the database.

Conclusion:

If your code is purely hard coded statements, the old school MySQL syntax may be your best bet for a performance lift. However, most likely your code would benefit more from prepared statements and some of the other performance enhancements provided by the MySQLi extension set. I wouldn’t say, “MySQLi Sucks” but it does appear there is a valid argument for continuing to use it if you are not leveraging any of the performance enhancements of the newer versions of MySQL.

System Configuration:

  • Mac OS 10.5.6
  • 2.16Ghz Intel Core Duo
  • 2 Gb RAM
  • PHP 5.2.6
  • MySQL Server 5.0.51b

Share PHP Memory Caching Performance

Wednesday, January 28th, 2009

Today I had the great pleasure to work with both APC and Memcached in a production environment. I’ve read that the performance of APC is roughly 3-5x faster than Memcache. So I decided to do my own test to see which performed better on my rig. If you want more numbers that may better mimic a production server environment, be sure to read Peter from the MySQL Performance Blog’s initial article which found APC to be roughly 3x faster. Although, I found that his results are spot on a little over 2 years later.

Being ultimately more concerned with the number of requests Apache serves back to my clients, I opted to use metrics similar to Jay Pipes method, leveraging ApacheBench to see determine the best way to push more content to the users faster. I created two test files which are nearly identical, accessing the cache layer using the simplest calls possible.

The tests were run against localhost after a pre-warmup to ensure the correct number of Apache workers were initiated and the cache had the correct content. This is important, because these numbers are a good comparison of read performance, but do not provide write throughput. The test was performed against a desktop system, so background processes may have varied slightly from test to test despite efforts to disable everything.

Apache bench was run over 10,000 requests with concurrency of 50. The values are the average of 3 consecutive runs. Standard deviation for connection time was consistent over each of the runs. APC proved significantly faster (30%) even with Memcached on the localhost. However, memcached scales out where as APC is tied to the local machine. That alone may be sufficient reason to use it over APC in your environment despite the performance benefit.

Some large scale applications benefit from multiple layers of caching. According to Matt Raible’s notes from OSCON 2008 Facebook uses $GLOBALS, APC and Memcached as their first lines of caching defense. This seems to further validate Peter’s findings.

Results:

APC Memcached
Requests/Second 2,088.43 1,611,.59 APC ~30% More
Time/Request (mean) 0.48ms 0.62ms APC ~23% Faster
99% of Requests Finished in 63ms 102ms APC ~39% Faster

Test Code:

The scripts are fairly straightforward. I didn’t want this to be a comparison of MySQL database accesses and so I manually created a simple object to cache.

<?php
 
	// APC Cache 
 
	$data = apc_fetch("test_object", &$success);
	if($success){
		print_r($data);
	} else {
		// store an object in the cache for the next call
		$test_object = array();
		$test_object['key1'] = 1;
		$test_object['key2'] = "hello world";
		$test_object['key3'] = array(1,2,3);
		$result = apc_store("test_object", $test_object, 3600);
		print_r($data);
	}
?>
 
<?php
 
	// Memcached Cache 
 
	$memcache = new Memcache();
	$memcache->addServer("127.0.0.1","11211");
	$data = $memcache->get("test_object");
	if($data){
		print_r($data);
	} else {
		// store an object in the cache for the next call
		$test_object = array();
		$test_object['key1'] = 1;
		$test_object['key2'] = "hello world";
		$test_object['key3'] = array(1,2,3);
		$memcache->set("test_object", $test_object, 3600);
		print_r($data);
	}
?>

System Configuration:

  • Mac OS 10.5.6
  • 2.16Ghz Intel Core Duo
  • 2 Gb RAM
  • PHP 5.2.6
  • APC 3.1.2
  • Memcached 1.1.12
  • Apache 2.2.9
  • ApacheBench 2.3

Share Before You Buy that new DB

Monday, October 27th, 2008

Database Server Icon - Visio Database servers are tricky. Before jumping to the conclusion that more hardware is needed, make sure you’re getting the most out of the hardware you already have. Hiring a DBA is definitely the right way to go. They’ll quickly be able to point out where the pain points are in your configuration. If you can’t afford a DBA, you can use this quick guide to get a starting point on what might need adjustment.

At the system level, start by checking over the following items.

  • Does the machine have sufficient RAM?
  • How much idle CPU time is available?
  • Is the hard disk okay?
  • Is the networking speed sufficient?
  • Are there services that should run somewhere else?

Next investigate the application itself.

  • Is the RDBMS allowed to use all of the RAM available?
  • Are the tables indexed; properly?
  • Are the indexes up to date?
  • Are the statistics up to date?
  • Are there queries that can be offloaded; refactored; eliminated?

This, at minimum, will save time when your thinking about sticking more power in-front of the database. If you still decide to make the move, make sure you check over these values as soon as you are done or you may find your performance decreases when it should have increased.

Share Not So Fast! MySQL Stored Procedures Are Not Panacea

Wednesday, May 21st, 2008

MySQL Logo MySQL 5.0 introduced stored procedures, but are you taking advantage of them? MySQL claims a 50% lift in performance just having an execution plan on hand, but what real world performance boost can you expect to see? I set out to discover just that and found the results unsatisfying.

Column Type Indexed
id int(11) unsigned Primary Key
content varchar(4096) No

I then conducted a test which would closely mimic a real world example to determine just that. I started with a table with a few million rows of variable length data in it. I then created a stored procedure that would as closely as possible match my hand coded SQL query. Then I created a simple PHP script to test it all out.

CREATE PROCEDURE sp_getdata (i INT(11))
BEGIN
	SELECT * FROM test_table WHERE id = t;
END;
$iterations = 100000;
$conn = mysqli_database($server, $user, $pass, $schema);
 
// find the extents of our random range and make sure the connection actually opens
$max = $conn->query("select max(id) as maxid from test_table");
$max = $max->fetch_assoc();
$max = $max['maxid'];
 
$sql = "SELECT * FROM test_table WHERE id = %d";
$procedure = "CALL sp_getmessage(%d)";
 
// run the stored procedure query
for($i=0;$i<$iterations; $i++){
     $query = sprintf($procedure, rand(1,$max));
     $conn->multi_query($query);
     $result = $conn->store_result();
}
print number_format(microtime(true)-$t2,2) . "s\n";
 
// run the straight sql query
print "Test SQL Query: ";
$t1 = microtime(true);
for($i=0; $i<$iterations; $i++){
     $query = sprintf($sql, rand(1,$max));
     $result = $conn->query($sql);
}
print number_format(microtime(true)-$t1,2) . "s\n";

I then ran the test 5 times to eliminate the so called warm-up period which MySQL would inevitably use to more accurately answer future questions. I then ran the test a number of times swapping the position of the two “tests” within the code so that neither received an unfair advantage. The results are below:

Run SQL Stored Procedure Order
1. 0.28s 0.39s Proc/SQL
2. 0.28s 0.37s Proc/SQL
3. 0.28s 0.41s Proc/SQL
4. 0.28s 0.38s Proc/SQL
5. 0.28s 0.39s Proc/SQL
6. 0.28s 0.80s SQL/Proc
7. 0.28s 0.82s SQL/Proc
8. 0.28s 0.81s SQL/Proc
9. 0.28s 0.82s SQL/Proc
10. 0.27s 0.80s SQL/Proc

I was taken by surprise by two obvious trends. First, the SQL statement was amazingly consistent regardless of when it was run. Second, not only was the stored procedure slower, it was significantly slower if a batch of plain old SQL statement executions were run first.

Next I decided to track the response time of each individual query that I ran to get a better look at the execution time costs for each. I expected to see a bell curve showing some queries running very fast and others running very slow with the majority falling in the middle somewhere. I had to increase the sensitivity of my timer millionths of a second to get a good picture of just how fast things were or were not running. Again I was surprised by the results. The basic query was still much faster in total elapsed time, but suddenly the performance of the stored procedure began showing itself. I ran the tests the same way as before and swapped the order in which they ran to eliminate preferential execution order. The following table gives results from one run of 100,000 so you can see the results first hand. The method was to insert the time value into an array and increment the value of that index by one each time a query executed in that time. So the first value you see is also the first query that gets sent to the server. This is important because for the stored procedure, you’ll notice it’s MUCH longer than all subsequent runs. Oddly, the query to get the maxium value had already been run so I’m not sure what causes the increase delay in runtime.

Stored Procedure Basic SQL Statement
    [0.00144] => 1
    [0.00005] => 2
    [0.00002] => 52
    [0.00001] => 99938
    [0.00003] => 6
    [0.00004] => 1
    [0.00063] => 1
    [0.00040] => 1
    [0.00047] => 16
    [0.00049] => 42172
    [0.00048] => 26
    [0.00299] => 21
    [0.00050] => 1735
    [0.00124] => 88
    [0.00074] => 517
    [0.00249] => 32
    [0.00224] => 38
    [0.00174] => 171
    [0.00149] => 138
    [0.00274] => 22
    [0.00099] => 139
    [0.00324] => 19
    [0.00399] => 12
    [0.00374] => 12
    [0.00349] => 12
    [0.00199] => 102
    [0.00054] => 1
    [0.00043] => 1
    [0.00075] => 35
    [0.00046] => 5
    [0.00250] => 2
    [0.00024] => 49970
    [0.00100] => 4
    [0.00025] => 4657
    [0.00125] => 6
    [0.00023] => 11
    [0.00175] => 2
    [0.00022] => 6
    [0.00449] => 4
    [0.00028] => 2
    [0.00052] => 1
    [0.00044] => 1
    [0.00200] => 1
    [0.00030] => 1
    [0.00041] => 1
    [0.00026] => 3
    [0.00027] => 4
    [0.00070] => 1
    [0.00499] => 1
    [0.00150] => 2
    [0.29807] => 1
    [0.00275] => 1
    [0.00424] => 2

Clearly the fluctuation in times we see in the first table are largely related to the first query we execute. After that initial procedure call, stored procedures run consistently fast, faster in fact than the plain old SQL query. But back to the “real world test”. How often do we create a single page that executes 100,000 queries at a time? More likely its 2-20 queries which are all very different than the others in the group. Given that variable, which is really the best solution? Ultimately it’s going to depend on your setup. This was a mostly idle app server communicating over a 100Mb connection to a moderately loaded DB server with ample RAM and disk performance.

Future exploration areas will be to research the most efficient way to execute stored procedures for maximum performance and deciding if a stored procedure is faster for more complex queries involving joins etc. If anyone has some good directions I can head in here, I’m all ears.

© 1998-2008 AF-Design, All rights reserved.