Jack and the Beanstalkd

Vagmi Mudumbai

Every webapp needs to run some background tasks, such as sending emails, processing stats, creating PDF documents and so on. The rule of thumb is that if a request takes more than a couple of seconds to process, it should be turned into a background job. There are several alternatives in the Ruby world, from delayed_job to RabbitMQ. However, we at Artha42 love simplicity, and one really simple solution for processing background jobs is beanstalkd.

Beanstalk is a simple, fast work queue.

Its interface is generic, but was originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously.

It has a simple text-based protocol, inspired by memcached, for putting messages on a queue that workers then pick up and process. It can persist jobs to disk when started with its binlog option (-b), and most of its clients support connections to multiple beanstalkd servers. There are several Ruby clients to choose from. Besides the standard Ruby client, beanstalk-client, there is an asynchronous EventMachine version called em-jack. Which one you pick depends on the type of jobs. If the jobs are IO bound and do not involve much processing, em-jack is a good choice. If the jobs are CPU intensive, you are better off using a plain blocking loop or implementing your own threaded client.
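To make the memcached-style protocol a little more concrete, here is a minimal sketch that builds the bytes of a put command by hand. The helper name build_put_command and its default priority, delay and TTR values are our own assumptions for illustration, not part of any client library. In a real app the client gem does this for you.

```ruby
# Build the raw bytes of a beanstalkd "put" command.
# Wire format: put <pri> <delay> <ttr> <bytes>\r\n<data>\r\n
# build_put_command is a hypothetical helper; the defaults are illustrative.
def build_put_command(body, pri: 65536, delay: 0, ttr: 120)
  # <bytes> is the size of the body in bytes, not in characters.
  "put #{pri} #{delay} #{ttr} #{body.bytesize}\r\n#{body}\r\n"
end

puts build_put_command("hello").inspect
# prints "put 65536 0 120 5\r\nhello\r\n"
```

On success the server answers with a single line of the form INSERTED <id>, where the id identifies the new job.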

Tubes

In addition to the standard global queue (the tube named default), beanstalk supports the concept of tubes. Tubes are independent job queues. You can imagine an app having multiple producers, each pushing jobs to a tube, and multiple consumers reserving and processing the jobs from these tubes. This is not as elaborate as RabbitMQ's queues, exchanges and routing keys, but it is simple enough for most web apps.
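The idea of tubes can be sketched with a toy in-memory model. This is not beanstalkd itself: the ToyTubes class and its methods are made up for illustration, and it ignores priorities, delays and blocking reserves. It only shows that producers choose a tube to put into and consumers only see the tubes they watch.

```ruby
# Toy in-memory model of tubes. NOT beanstalkd, only an illustration.
class ToyTubes
  def initialize
    # Each tube is created lazily as an empty FIFO list of jobs.
    @tubes = Hash.new { |hash, name| hash[name] = [] }
  end

  # Producer side: append a job to the named tube.
  def put(tube, job)
    @tubes[tube] << job
  end

  # Consumer side: take the oldest job from the first watched tube
  # that has one, or return nil when all watched tubes are empty.
  def reserve(watched)
    tube = watched.find { |name| !@tubes[name].empty? }
    tube && @tubes[tube].shift
  end
end

queue = ToyTubes.new
queue.put("emailtube", "welcome mail")
queue.put("pdftube", "invoice 1")
queue.put("emailtube", "reset mail")

email_worker = ["emailtube"]     # this worker never sees pdftube jobs
p queue.reserve(email_worker)    # prints "welcome mail"
p queue.reserve(email_worker)    # prints "reset mail"
p queue.reserve(email_worker)    # prints nil
```

The PDF job stays in its own tube until a worker watching pdftube asks for it.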

Usage

Connecting to beanstalkd and creating a job is fairly simple.

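require 'beanstalk-client'

body = 'Hello from the background queue' # placeholder text for this example
email = {:to=>['email@example.com'], :subject=>'Some Subject', :body=>body}
beanstalk = Beanstalk::Pool.new(['localhost:11300'])
beanstalk.use "emailtube" # jobs put from now on go into the "emailtube" tube
beanstalk.yput(email) # yput converts the object to YAML before putting it in the queue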

Processing the job is equally simple.

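require 'beanstalk-client'
require 'yaml'

beanstalk = Beanstalk::Pool.new(['localhost:11300'])
beanstalk.watch "emailtube" # consumers watch a tube; use only decides where put sends jobs
loop do
  job = beanstalk.reserve # blocks until a job is available
  email = YAML.load(job.body)
  # send the email
  job.delete # remove the job so that it is not handed out again
end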
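A worker loop like this usually also needs to decide what happens when a job fails. One option, sketched below, is to delete the job on success and bury it on failure so that it can be inspected later. The beanstalkd protocol has a bury command for this. We are assuming here that the client's job object exposes it as bury. The process_job helper and the FakeJob stand-in are hypothetical and exist only so that the sketch runs without a server.

```ruby
require 'yaml'

# Process one reserved job: delete it on success, bury it on failure.
# `job` is any object that responds to body, delete and bury.
# The block passed to process_job does the actual work.
def process_job(job)
  payload = YAML.load(job.body)
  yield payload
  job.delete
  :deleted
rescue StandardError => e
  warn "job failed: #{e.message}"
  job.bury # assumption: the client exposes the protocol's bury command
  :buried
end

# Hypothetical stand-in for a reserved job, used only for this demo.
FakeJob = Struct.new(:body, :state) do
  def delete
    self.state = :deleted
  end

  def bury
    self.state = :buried
  end
end

ok = FakeJob.new({ 'to' => ['email@example.com'] }.to_yaml)
process_job(ok) { |email| puts "sending to #{email['to'].join(', ')}" }
puts ok.state   # prints deleted

bad = FakeJob.new({ 'to' => [] }.to_yaml)
process_job(bad) { |_email| raise "no recipients" }
puts bad.state  # prints buried
```

Inside the real loop, the call would look like process_job(job) { |email| ... } with the mail-sending code in the block.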

We have used beanstalkd in a couple of applications as a replacement for RabbitMQ, and so far we have had a great experience with it. Try it out and let me know if it works for you.

Posted on 2010-12-20T05:09:30Z by Vagmi Mudumbai