Archive for January, 2009

Share PHP MySQL vs MySQLi Database Access Metrics

Friday, January 30th, 2009

Last week I had a MySQL DBA/Developer tell me, “MySQLi sucks” when I asked why some code I was reviewing used the mysql_* extensions provided in PHP instead of mysqli_*. This really didn’t sit well with me. I turned to a trusted source for PHP information, and found they recommend mysqli_* for connecting to MySQL servers versions 4.1.3 and above. Although, I read many anecdotal stories about how mysqli_* was faster, I was unable to find any quantified proof of just how much faster. So in keeping with my performance measurement track I’ve been on this week, I decided to conduct my own test. I was surprised by the result. MySQL is faster in some cases.

The database I used is a pet project of mine that tracks some vehicle information. It uses a varchar primary key based on an automotive VIN number and the sample table I queried against has about 30K rows. The average row length is about 150bytes. I conducted 2 types of tests. The first was a completely unrealistic test measuring how fast a single row could be returned using each of the extension types. The other also returned a random sampling of rows. Since this isn’t a performance test of MySQL, I allowed each test to run one time completely to ensure the query cache was fully populated in MySQL. Then each test was run 3 subsequent times. Each test executed 1,000,000 queries to the server.

Single Hard Coded Query:

The first listing is the mysql_* calls. You’ll notice the

mysql_free_result()

function is commented out. I did test with this enabled and disabled and found that with this function enabled the queries returned at the same rate as with it enabled. So in the interest of fairness, I disabled it to more closely mimic the code executing with the mysqli_* calls.

<?php
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysql_connect("127.0.0.1","user","");
	mysql_select_db("test_schema", $conn);
	for($i=0; $i<$tq; $i++){
		$data = mysql_query("SELECT * FROM vehicles LIMIT 1", $conn);
		$data_object = mysql_fetch_assoc($data);
		// mysql_free_result($data);
	}
	mysql_close($conn);
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

The next listing is the same code converted to leverage the mysqli_* calls. You’ll see that the result is not freed each time through the loop.

<?php
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	for($i=0; $i<$tq; $i++){
		$data = $conn->query("SELECT * FROM vehicles LIMIT 1");
		$data_obj = $data->fetch_assoc();
	}
	$conn->close();
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

The results were surprising to me. The mysql_* extension actually ran considerably faster than the mysqli_* extenstion. In fact, the mysql_* extension ran 3.6% faster. Perhaps this DBA was on to something? I also tested this same methodology using prepared statements with MySQLi and found the results actually degraded. Discouraged, I devised a new test that would hopefully be more realistic. The values provided are queries per second.

Method Run #1 Run #2 Run #3 Avg
mysql_* 7,227.7 7,222.2 7,220.7 7,223.5
mysqli_* 6,963.0 6,974.0 6,974.4 6,970.5
mysqli prepared statements 4,849.6 5,005.3 4,978.7 4,944.5

Parameterized Queries:

More often than not, we do not run the same select query 1 million times, and we are actually more likely to run a query over a series of data sets. While this in itself is a terrible idea in production code because you should re-write the query to do it all as a single database call, it provides a better picture of how the database might be interacted with in PHP; especially if persistent connections are being utilized. However, if you find that you have some code that fits the patterns used here, it might be to your advantage to review the prepared statement code below.

The basic flow of the scripts is to first fetch 10 thousand (~1/3) of the table’s primary keys and populate an array. Then the code will select 1 million rows using a randomly selected primary key (vin). The data is still relatively small, and fits into the query cache after a warm up period. This shows how efficiently the query passing is instead of how efficient MySQL is at answering different questions, making it a better comparison of the two technologies. The three test scripts I ran are provided here for your reference.

Leveraging the mysql_* calls

 
<?php
	print "Fetching VIN array...\n";
	$vin_array = array();
	$conn = mysql_connect("127.0.0.1","user","");
	mysql_select_db("test_schema",$conn);
	$vins = mysql_query("SELECT vin FROM vehicles LIMIT 10000", $conn);
	while($vin = mysql_fetch_assoc($vins)){
		$vin_array[] = $vin['vin'];
	}
	mysql_close($conn);
	unset($conn);
 
	print "Starting test...\n";
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysql_connect("127.0.0.1","user","");
	mysql_select_db("test_schema", $conn);
	for($i=0; $i<$tq; $i++){
		$data = mysql_query("SELECT * FROM vehicles WHERE vin = '" . $vin_array[rand(0,count($vin_array)-1)] . "'", $conn);
		$data_object = mysql_fetch_assoc($data);
		// mysql_free_result($data);
	}
	mysql_close($conn);
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

Leveraging the mysqli_* calls

<?php
	print "Fetching VIN Array...\n";
	$vin_array = array();
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	$vins = $conn->query("SELECT vin FROM vehicles LIMIT 10000");
	while($vin = $vins->fetch_assoc()){
		$vin_array[] = $vin['vin'];
	}
	$conn->close();
	unset($conn);
 
	print "Starting test...\n";
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	for($i=0; $i<$tq; $i++){
		$data = $conn->query("SELECT * FROM vehicles WHERE vin = '" . $vin_array[rand(0,count($vin_array)-1)] . "'");
		$data_obj = $data->fetch_assoc();
	}
	$conn->close();
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

Leveraging mysqli_ calls and prepared statements.

<?php
	print "Fetching VIN array...\n";
	$vin_array = array(); 
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	$vins = $conn->query("SELECT vin FROM vehicles LIMIT 10000");
	while($vin = $vins->fetch_assoc()){
		$vin_array[] = $vin['vin'];
	}
	$conn->close();
	unset($conn);
 
	print "Running tests...\n";
	$tq = 1000000;
	$mt = microtime(true);
	$conn = mysqli_connect("127.0.0.1","user","","test_schema");
	$stmt = $conn->stmt_init();
	$stmt->prepare("SELECT * FROM vehicles WHERE VIN = ?");
	for($i=0; $i<$tq; $i++){
		$stmt->bind_param("s", $vin_array[rand(0,count($vin_array) - 1)]);
		$stmt->execute();
		$stmt->bind_result($a,$b,$c,$d,$e,$f,$g,$h,$z,$j,$k,$l,$m,$n);
		$stmt->fetch();
	}
	$conn->close();
	$et = microtime(true) - $mt;
	print "Elapsed time: " . number_format($et) . "s\n";
	print "QPS: " . ($tq / $et) . "\n";
?>

The performance of this “harder” task degraded for all technologies. Again, the mysql_* extension was about 3.6% faster, but the prepared statements were 9.7% faster than the old mysql_* calls and 13.7% faster than the mysqli_ calls. This improvement makes a good case for transitioning repetitive calls in your code to prepared statements leveraging the MySQLi extenstions. Again, the values are measured in queries per second.

Method Run #1 Run #2 Run #3 Avg
mysql_* 4,063.6 4,016.7 4,054.0 4,044.8
mysqli_* 3,902.3 3,905.8 3,896.0 3,901.4
mysqli prepared statements 4,443.8 4,421.8 4,445.7 4,437.1

Without testing stored procedures, it’s difficult to say if they would have continued to improve the performance of this test, however, it is clear that the speed advantages of mysqli_* extensions are only present if you are utilizing the advanced features of the more recent versions of the database.

Conclusion:

If your code is purely hard coded statements, the old school MySQL syntax may be your best bet for a performance lift. However, most likely your code would benefit more from prepared statements and some of the other performance enhancements provided by the MySQLi extension set. I wouldn’t say, “MySQLi Sucks” but it does appear there is a valid argument for continuing to use it if you are not leveraging any of the performance enhancements of the newer versions of MySQL.

System Configuration:

  • Mac OS 10.5.6
  • 2.16Ghz Intel Core Duo
  • 2 Gb RAM
  • PHP 5.2.6
  • MySQL Server 5.0.51b

Share PHP Memory Caching Performance

Wednesday, January 28th, 2009

Today I had the great pleasure to work with both APC and Memcached in a production environment. I’ve read that the performance of APC is roughly 3-5x faster than Memcache. So I decided to do my own test to see which performed better on my rig. If you want more numbers that may better mimic a production server environment, be sure to read Peter from the MySQL Performance Blog’s initial article which found APC to be roughly 3x faster. Although, I found that his results are spot on a little over 2 years later.

Being ultimately more concerned with the number of requests Apache serves back to my clients, I opted to use metrics similar to Jay Pipes method, leveraging ApacheBench to see determine the best way to push more content to the users faster. I created two test files which are nearly identical, accessing the cache layer using the simplest calls possible.

The tests were run against localhost after a pre-warmup to ensure the correct number of Apache workers were initiated and the cache had the correct content. This is important, because these numbers are a good comparison of read performance, but do not provide write throughput. The test was performed against a desktop system, so background processes may have varied slightly from test to test despite efforts to disable everything.

Apache bench was run over 10,000 requests with concurrency of 50. The values are the average of 3 consecutive runs. Standard deviation for connection time was consistent over each of the runs. APC proved significantly faster (30%) even with Memcached on the localhost. However, memcached scales out where as APC is tied to the local machine. That alone may be sufficient reason to use it over APC in your environment despite the performance benefit.

Some large scale applications benefit from multiple layers of caching. According to Matt Raible’s notes from OSCON 2008 Facebook uses $GLOBALS, APC and Memcached as their first lines of caching defense. This seems to further validate Peter’s findings.

Results:

APC Memcached
Requests/Second 2,088.43 1,611,.59 APC ~30% More
Time/Request (mean) 0.48ms 0.62ms APC ~23% Faster
99% of Requests Finished in 63ms 102ms APC ~39% Faster

Test Code:

The scripts are fairly straightforward. I didn’t want this to be a comparison of MySQL database accesses and so I manually created a simple object to cache.

<?php
 
	// APC Cache 
 
	$data = apc_fetch("test_object", &$success);
	if($success){
		print_r($data);
	} else {
		// store an object in the cache for the next call
		$test_object = array();
		$test_object['key1'] = 1;
		$test_object['key2'] = "hello world";
		$test_object['key3'] = array(1,2,3);
		$result = apc_store("test_object", $test_object, 3600);
		print_r($data);
	}
?>
 
<?php
 
	// Memcached Cache 
 
	$memcache = new Memcache();
	$memcache->addServer("127.0.0.1","11211");
	$data = $memcache->get("test_object");
	if($data){
		print_r($data);
	} else {
		// store an object in the cache for the next call
		$test_object = array();
		$test_object['key1'] = 1;
		$test_object['key2'] = "hello world";
		$test_object['key3'] = array(1,2,3);
		$memcache->set("test_object", $test_object, 3600);
		print_r($data);
	}
?>

System Configuration:

  • Mac OS 10.5.6
  • 2.16Ghz Intel Core Duo
  • 2 Gb RAM
  • PHP 5.2.6
  • APC 3.1.2
  • Memcached 1.1.12
  • Apache 2.2.9
  • ApacheBench 2.3

Share iPhone and BlackBerry App Inequality

Saturday, January 17th, 2009

weather.com logo Weather.com really frustrated me tonight while checking the weather on my BlackBerry. I was using the WAP site and saw an ad to download the weather.com application for BlackBerry. “Sweet!” I’m thinking to myself, I use it all the time on my iPod Touch how handy to have it when I’m away from WiFi. Then I realized it would be a big fat fee to use it, $12.99 per quarter! Whaaaat??? Don’t get me wrong, I’m not opposed to paying for applications and software. I usually upgrade versions for pretty much everything within the first 1-2 months of availability. But when I get the same service for free on the iPod (and by extension iPhone) it really rubs me the wrong way. Either charge me for the iPod Touch app or find another way to subsidize your income.

Weather.com App Download Screenshots

Weather.com App Download Screenshots

© 1998-2008 AF-Design, All rights reserved.