By now, open source software has reached even corporate enterprises. Open source is a great way to make software available. Some companies sell a commercial version of it that comes with dedicated support and care. But because it is open source, people can kick the tires before they commit to using it.
After MySQL’s acquisition, many people left. Some I met after that, but others were pretty much gone from my circle of contacts. I did not talk to Brian Aker much when he was director of architecture at MySQL. The last time I saw him was at MySQL’s Tokyo office with Larry Stefonic, president of MySQL Japan. I remember that the three of us had lunch after walking far under intense sun to find seats because many restaurants were crowded at lunchtime.
In any event, I knew Brian was doing fine after MySQL but did not know where he was until recently. At a recent cloud meet-up, I was glad to find him among the speakers. His talk was about something called Gearman. It is not German, by the way. What is it? In a word, he said, it is an effective queue. Other descriptions are here.
Those who do not know what Mechanical Turk is can find the explanation here.
On its own website, it is described as:
Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability websites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates.
He is a fellow at HP now and in charge of Gearman. I did not know what Gearman was or how well it was received by the developer community. Brian said it is being used by Wal-Mart, Disney, Yelp, Craigslist, Flickr, Instagram (which was recently acquired by Facebook) and others. He knows this because he receives bug reports from these organizations. In running an open source shop, you learn that many people download your package for the fun of it. Some may actually install and use it, while others are content that they downloaded it and do nothing with it. So you do not know who is actually using your package until you hear from the users.
My understanding is cursory because it was the first time I heard about Gearman. But the main philosophy is to separate UI and the backend (Gearman’s term is worker), and let Gearman be in the middle to control traffic. In this way, UI does not have to know anything about backends or workers. In a way, this is similar to object-oriented programming, which separates an interface from its implementation.
In the following slide from his talk, the UI requests a resize function but does not know or care how it is done, where it is done, or who does it.
The UI on the far left wants to do “resize.” The Gearman (its head is a gear) in the middle routes the resize command to an appropriate worker (far right) that knows how to do “resize.” Gearman looks at the requirements and spins up as many appropriate workers as needed to process a job.
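To make the routing idea concrete, here is a minimal sketch in Python. It is not Gearman's API. ToyJobServer, resize_on_host_a and resize_on_host_b are hypothetical names I made up for illustration, everything runs in one process, and the round-robin choice of worker is my assumption, not something Brian described.

```python
from collections import defaultdict


class ToyJobServer:
    """In-process stand-in for a job server. Hypothetical, not the Gearman API."""

    def __init__(self):
        self._workers = defaultdict(list)  # function name -> registered handlers
        self._counter = defaultdict(int)   # function name -> jobs dispatched so far

    def register(self, function_name, handler):
        # A worker announces which named function it can perform.
        self._workers[function_name].append(handler)

    def submit(self, function_name, payload):
        # The caller names the function only; it never sees which worker runs it.
        handlers = self._workers.get(function_name)
        if not handlers:
            raise LookupError(f"no worker registered for {function_name!r}")
        index = self._counter[function_name] % len(handlers)  # round-robin (assumed)
        self._counter[function_name] += 1
        return handlers[index](payload)


def resize_on_host_a(size):
    # Hypothetical worker: halves the dimensions and reports who did the work.
    width, height = size
    return ("host-a", (width // 2, height // 2))


def resize_on_host_b(size):
    # A second hypothetical worker offering the same function.
    width, height = size
    return ("host-b", (width // 2, height // 2))


if __name__ == "__main__":
    server = ToyJobServer()
    server.register("resize", resize_on_host_a)
    server.register("resize", resize_on_host_b)
    print(server.submit("resize", (800, 600)))
    print(server.submit("resize", (800, 600)))
```

The UI side is just the call to submit with the name "resize." Adding or removing workers changes nothing in that call, which is the separation the slide is getting at.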
The Gearman stack looks like the following.
The usefulness of Gearman:
As you can see from its architecture, you can spin up any number of workers, depending on the size of the load. Because Gearman can do aggregation, a large job can be chopped into smaller pieces, with one piece given to each worker. Upon job completion, Gearman aggregates the results and returns them to the UI. This is like MapReduce or Hadoop. No wonder it is used by many social networking applications.
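The chop-and-aggregate pattern can be sketched as follows. This is again not Gearman itself: a thread pool from Python's standard library stands in for the pool of workers, and the word-count task and the function names (split_into_chunks, count_words_worker, merge_counts, run_job) are hypothetical choices of mine for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    # Chop one large job into pieces of at most chunk_size items.
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def count_words_worker(lines):
    # What each worker does with its piece: count the words in its lines.
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def merge_counts(partials):
    # Aggregate the partial results from all workers into one result.
    total = {}
    for partial in partials:
        for word, n in partial.items():
            total[word] = total.get(word, 0) + n
    return total


def run_job(lines, chunk_size=2, max_workers=4):
    # Fan the pieces out to the pool, then combine what comes back.
    chunks = split_into_chunks(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(count_words_worker, chunks))
    return merge_counts(partials)


if __name__ == "__main__":
    print(run_job(["a b", "b c", "c c a"], chunk_size=1))
```

The caller sees only run_job. How many pieces there were and which worker handled which piece stay hidden, much as in the map and reduce steps the comparison to MapReduce suggests.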
To conclude, here’s how to obtain more information about Gearman. By the way, I forgot to ask why it is named Gearman.