Archive for April, 2009

Message Queue Solutions

Tuesday, April 28th, 2009

While I’m a fan of Gearman so far, I thought it prudent to look at alternatives. This is a survey of the message queue solutions I’ve located so far. Most of my clients run LAMP (with PHP as the P), so I’ll probably ignore the language-specific packages that don’t support PHP. After a cursory overview of the list, I know I’ll be checking in on Amazon SQS and Beanstalkd before I make my final selection.

Linden Lab (publisher of Second Life) posted its evaluation of message queue software. Of course, they’re an edge case in terms of scale, so some of the options they eliminated may work just fine for your uses.

Getting up and Running with Gearman

Monday, April 27th, 2009

Gearman is a job scheduling service and I’m very excited about it. I’m only using it in a development capacity, so your mileage may vary in production, but I wanted to share my experience thus far. As I said, I’m very bullish on this project, and I see it as hugely helpful in eliminating latency in applications that get bogged down by unnecessary synchronous communication.

Compiling gearman required a package that wasn’t part of my default Fedora Core install and, for me, wasn’t intuitive to locate. The UUID header file lives in the package e2fsprogs-devel, which I found using yum provides "*/uuid.h". After that it was rather smooth. gearmand -d -u nobody got it up and running as a daemon, and I was able to connect to it using telnet on port 4730. Next I compiled the source for the PHP client and hooked it into PHP by adding an extension include file in /etc/php.d to load the module, then restarted Apache so it would be loaded there too.

Process to install and get running:

# First the server
tar -xzvf gearmand-0.5.tar.gz.tar
cd gearmand-0.5
yum install e2fsprogs-devel
./configure && make && make install
gearmand -d -u nobody
 
# Next the PHP client
tar -xzvf gearman-php-ext-0.2.tar.gz.tar
cd gearman-php-ext-0.2
phpize
./configure && make && make install
echo "extension=gearman.so" > /etc/php.d/gearman.ini
service httpd restart
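
Once both pieces are installed, a quick sanity check can save some head scratching. This is a minimal sketch, not part of the Gearman distribution: check_gearman_setup() is a hypothetical helper that only uses PHP’s built-in extension_loaded() and fsockopen(), and it assumes the server is listening on 127.0.0.1:4730 as in the steps above.

```php
<?php
// Hypothetical helper (not part of Gearman): reports whether the PHP
// extension is loaded and whether something accepts TCP connections
// on the gearmand port.
function check_gearman_setup($host = '127.0.0.1', $port = 4730, $timeout = 1.0)
{
    $result = array(
        'extension_loaded' => extension_loaded('gearman'),
        'server_reachable' => false,
    );

    $errno  = 0;
    $errstr = '';
    // Suppress the warning fsockopen() raises when the connection fails.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket !== false) {
        $result['server_reachable'] = true;
        fclose($socket);
    }

    return $result;
}

var_export(check_gearman_setup());
echo PHP_EOL;
```

Run it from the command line to check the CLI configuration. Apache may load a different set of extensions, so it is worth running the same check through the web server as well.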

So now to do some work that takes a long time, even if it’s useless. It just so happens that creating a file with 1,000,000 sequential numbers takes a few seconds on a small EC2 instance, which is perfect for my test. I realize this is a highly insecure process: NEVER pass filenames as parameters in production code. Here’s the worker that creates a file (named by the parameter) in the current system’s /tmp directory.

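$worker = new gearman_worker();
$worker->add_server('127.0.0.1', 4730);
// Register the job name and the callback that handles it.
$worker->add_function('fill_file', 'fill_file_fn');
 
// Block and process jobs forever.
while(1) $worker->work();
 
// Writes the numbers 1 through 1,000,000 to /tmp/<workload>.
// Insecure: the filename comes straight from the job payload.
function fill_file_fn($job){
	$data = $job->workload();
	$fh = fopen("/tmp/" . $data, "w");
	if($fh === false) return; // could not open the file
	for($i=1;$i<=1000000;$i++){
		fwrite($fh, $i . "\n");
	}
	fclose($fh);
	return;
}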
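
To make the filename warning concrete, here is a minimal sketch of one way a worker could validate the workload before touching the filesystem. safe_tmp_path() is a hypothetical helper, not part of the Gearman extension, and the whitelist (letters, digits, dot, underscore and hyphen, at most 64 characters, no leading dot) is an assumption chosen for this test rather than a general recommendation.

```php
<?php
// Hypothetical helper (not part of Gearman): returns a path inside
// $baseDir for a simple filename, or null if the name looks unsafe.
function safe_tmp_path($name, $baseDir = '/tmp')
{
    // Accept only plain names such as "file7.txt". Slashes, leading
    // dots, null bytes and trailing newlines are all rejected.
    if (!is_string($name)
        || !preg_match('/\A[A-Za-z0-9_-][A-Za-z0-9._-]{0,63}\z/', $name)) {
        return null;
    }

    return rtrim($baseDir, '/') . '/' . $name;
}

var_dump(safe_tmp_path('file0.txt'));     // accepted
var_dump(safe_tmp_path('../etc/passwd')); // rejected
```

In the worker, the callback would call this on the workload and return early when it gets null instead of passing the raw string to fopen().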

The calling client just invokes this 20 times in the background.

$client = new gearman_client();
$client->add_server('127.0.0.1', 4730);
for($i=0; $i<20; $i++){
	$client->do_background('fill_file', 'file' . $i . '.txt');
}

Workers are started from the command line with something like “php worker.php &”. If you want more, just run more of them. You can also kill some off if they’re no longer needed.

The client completes its run in about 5 seconds, while 5 workers toil away in the background until they finish about 3 minutes later. The use cases from the Gearman team show its utility as a spider and for image manipulation. I also see uses for sending mass emails to distribution lists: the worker, rather than the client, takes a template and substitutes parameters to create a unique email for each person. That reduces the processing time to get the mail ready and speeds delivery, since multiple workers (which can even be on remote machines) do the sending. This product is definitely worth checking out.
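
The template substitution a mail worker would perform might look something like the sketch below. Everything here is an assumption for illustration: render_template(), the {{name}} placeholder syntax, and the idea of sending the template and parameters as a JSON workload are hypothetical, and the actual sending is left out.

```php
<?php
// Hypothetical helper: replaces {{key}} placeholders with values.
function render_template($template, array $params)
{
    $pairs = array();
    foreach ($params as $key => $value) {
        $pairs['{{' . $key . '}}'] = (string) $value;
    }
    // Nothing to substitute: return early (older PHP versions return
    // false from strtr() when given an empty array).
    if (!$pairs) {
        return $template;
    }
    // strtr() replaces all placeholders in one pass, so a substituted
    // value is never scanned again for further placeholders.
    return strtr($template, $pairs);
}

// The client would queue one small JSON payload per recipient...
$workload = json_encode(array(
    'template' => 'Hello {{name}}, your code is {{code}}.',
    'params'   => array('name' => 'Ada', 'code' => 'X1'),
));

// ...and the worker callback would decode it and build the message.
$data = json_decode($workload, true);
echo render_template($data['template'], $data['params']), PHP_EOL;
```

With one job per recipient, the client only has to queue small payloads, and the rendering and sending cost is spread across however many workers are running.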

Hopefully this helps you get up and running with Gearman!

Subversion Hosting Part 2 of 2

Thursday, April 2nd, 2009

This is the second part of an article looking at how to effectively host a small Subversion-based project that is no longer going through rapid development. The first part looked at using EC2 to run Subversion and S3 for persistent storage. While an intriguing solution, it raised some concerns.

The alternative is to outsource the hosting of Subversion and ticket management to another provider. Our repository is less than 1GB, so I’m using that as the price point. Additionally, there are 2-3 developers who’ll require access to the repository. There are many great “free” services, including Google, but this is not an open source project, so those are out. In the hosted Subversion realm, there are a number of providers with basic accounts that handle a repository of this size. The following table is a price comparison at the 1GB storage level. Many providers offer a free service for smaller projects, with different limitations on bandwidth, tickets and so on, so YMMV.

ProjectLocker $2.50
Wush $6.67
SVNRepository.com $6.95
CVSDude (2GB) $6.99
Hosted-Projects $7.00
Assembla [1] $8.00
Code Spaces $9.99
Beanstalk (3GB) [2] $15.00
Versionshelf (3GB) [2] $19.00
Unfuddle (2GB) [2] $24.00
DevGuard (2GB) [2] $29.95
  • [1] Pricing dependent on storage and developers
  • [2] Offers a cheaper or free plan with less than 1GB of storage.

The real benefit of a hosted solution is the addition of services such as Trac, user management, automated backups and more. If you are building a project with multiple developers who are not in the same physical location, hosting your project with a service is definitely the way to go. It’s cheaper, and the overhead of configuring and maintaining your own EC2 instance (or even a dedicated server) adds significantly to the cost of hosting it yourself.

Subversion Hosting Part 1 of 2

Thursday, April 2nd, 2009

Over the last few weeks I’ve been considering some options for cutting development costs for myself and a few clients. One of the continuing questions is how to manage the code base. Keeping a development server on hand is great during periods of active development, but when the site reaches maturity and only bug fixes are required, development servers sit idle for weeks on end. This got me thinking about how best to manage the source in a persistent way. This first post looks at how this might be accomplished using EC2.

I’ve been thinking about moving a development environment from a dedicated server to Amazon EC2. The problem is that, at least for this project, development only occurs a few hours per day, and the project may go entire weeks without anyone working on it. Obviously a small instance at $0.10/hour is sufficient for the load. Running around the clock, that would cost roughly $72/month. But even if I’m working on the application 40 hours per week, I should be able to reduce that charge to about $16/month by paying only for the time the server is actually on. Additionally, after being burned by an instance failure last weekend, I want to be sure the data is securely backed up as well. I thought about using EBS but, as readers have pointed out, even EBS volumes can fail. Furthermore, I don’t want to create a drive sized in GB if I only need a few MB of storage. Last but not least, if I need to scale the drive, I don’t want to re-create the AMI each time to reflect new drive ids.

My initial thought was to start with a public Fedora Core instance and install PersistentFS, automating all of the startup and shutdown process to ensure data integrity. Next, configure subversion to use that mount point for file storage. Last but not least, I’ll create a script I can run from my local machine (or a remote server) that starts and stops an instance and binds a known elastic IP to that instance at boot time. I think my overall costs will be greatly reduced.

Estimate of costs on EC2:

EC2 Small Instance Run Time (40hr week) $16.00
S3 Storage Cost (~10GB AMI) $1.50
S3 Storage Cost – Filesystem $0.15
S3 Bandwidth Cost (guess) $2.00
EC2 Bandwidth Cost (guess) $2.00
Total Cost (Monthly) $21.65
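
The arithmetic behind these numbers is easy to re-run with your own figures. This is a minimal sketch using only the values quoted in this post; instance_cost() is a hypothetical helper, and the 30-day month and 4-week month are simplifying assumptions. Summing the line items as listed gives $21.65.

```php
<?php
// Hypothetical helper: cost of running an instance for a number of hours.
function instance_cost($hourlyRate, $hours)
{
    return $hourlyRate * $hours;
}

$hourlyRate = 0.10;                                // small instance, USD per hour
$alwaysOn   = instance_cost($hourlyRate, 24 * 30); // assumes a 30-day month
$partTime   = instance_cost($hourlyRate, 40 * 4);  // 40 hours/week, about 4 weeks

// Line items from the estimate; both bandwidth figures are guesses.
$lineItems = array(
    'EC2 instance (40hr week)' => $partTime,
    'S3 storage (AMI)'         => 1.50,
    'S3 storage (filesystem)'  => 0.15,
    'S3 bandwidth (guess)'     => 2.00,
    'EC2 bandwidth (guess)'    => 2.00,
);

$total       = array_sum($lineItems);
$storageOnly = $lineItems['S3 storage (AMI)'] + $lineItems['S3 storage (filesystem)'];

printf('Always on:     $%.2f' . PHP_EOL, $alwaysOn);
printf('40hr weeks:    $%.2f' . PHP_EOL, $partTime);
printf('Monthly total: $%.2f' . PHP_EOL, $total);
printf('Storage only:  $%.2f' . PHP_EOL, $storageOnly);
```

Changing $hourlyRate or the hours per week shows how quickly the instance charge dominates the estimate, while the storage costs stay fixed.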

The bare minimum, if no development work were done at all, would be the storage costs of $1.65, which is certainly cheap enough! However, the time to build the initial environment and create the scripts, plus the time lost starting up and shutting down the server each time, made me think there may be a better alternative. Read more on Subversion hosting in the second part.

© 1998-2008 AF-Design, All rights reserved.