As is well known, there are three types of cloud computing: software as a service (SaaS), such as Salesforce.com; platform as a service (PaaS), such as Google App Engine; and infrastructure as a service (IaaS), such as Amazon’s AWS.
Google has been in the PaaS space but recently dived into IaaS. A great thing about living in Silicon Valley is that local meetings with the movers and shakers of technology market segments are within an easy 30-minute drive. The meeting that prompted this blog was on cloud computing. Cloud computing is changing the way IT is delivered, and thereby how and where we conduct computing; consequently, it has a vast impact on the way we use energy. Some conclude that cloud computing saves energy, while others are skeptical. In any event, Google, a mammoth of computing power, has entered another category of cloud computing, so I attended SVForum's cloud computing and virtualization special interest group meeting on Google's new service. (By the way, Google Compute Engine was announced about a month ago at the Google I/O conference.) The SIG meeting was well attended; more than 100 people showed up.
These were the speakers:
- Marc Cohen and Kathryn Hurley of Google, developer relations engineers
- M. C. Srivas of MapR, cofounder and CTO
Dave Nielsen, one of the cochairs of this SIG, and the other organizers did a good job putting this meeting together. Normally, when a new product or service is introduced, the vendor or provider gives a dry presentation about how great the offering is. Google wanted to show how solid their offering is on speed and security, so they included a couple of good demos.
Well, that was not enough. On top of that, they invited MapR, one of their partners, to share their experience deploying MapR's version of Hadoop on the Google Compute Engine platform. Hadoop is open-source software from the Apache Software Foundation and is used to crunch Big Data. MapR took Hadoop and added features to meet enterprise requirements.
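For readers new to Hadoop, the map-shuffle-reduce idea it is built on can be sketched with an ordinary shell pipeline. This is only an illustration of the concept (a word count over two made-up input lines), not how Hadoop itself is invoked:

```shell
# "Map": split each record into one word per line.
# "Shuffle": sort brings identical keys next to each other.
# "Reduce": uniq -c counts each group of identical keys.
# The two input lines are made-up sample records.
printf 'big data\nbig compute\n' \
  | tr ' ' '\n' \
  | sort \
  | uniq -c
```

Hadoop does the same thing, except the map and reduce steps run in parallel across many machines and the shuffle moves data between them over the network.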
Big Data and Hadoop have opened up a tremendous number of market opportunities, so MapR is not the only one to exploit them; Cloudera and Hortonworks also provide souped-up versions of Hadoop for enterprises. By the way, I found a research report by Dave Ohara, The big machine: creating value out of machine-driven bigdata, to be very good tutorial material on Big Data.
MapR has also deployed its engine on Amazon AWS, so a performance comparison would be interesting. During the discussion, MapR shared their performance data on Google Compute Engine but said they had not run the same benchmark on AWS.
Back to Google Compute Engine. Google is no stranger to cloud computing. Here’s a simplified list of Google’s cloud offerings:
Marc Cohen speaks about Google’s history of cloud offerings.
What is Google Compute Engine? Marc summarized it on one slide:
Marc Cohen presented an overview of Google Compute Engine.
- Infrastructure as a service (IaaS)
- Supports Ubuntu and CentOS (more operating systems, such as Windows, will come later)
- KVM as hypervisor
- Deployable in two territories (one Eastern and two Central time zone data centers; a workload stays within a single data center rather than spanning data centers)
- In private beta (you need to apply for access to deploy on the platform; see here for more detail)
- Free for now (qualified users only), but it will become a paid service later
- No SLA guarantee yet (under consideration for the official release)
Marc did not elaborate on the SLA, but judging from what he said, you need to specify which territory you want to deploy your load in; all the computing and the data associated with it stay in the same data center (i.e., cloud). No matter how much we improve our technologies, we are still bound by the laws of physics, and we cannot send packets faster than the speed of light. If providers want to guarantee an SLA, they need to make a lot of assumptions and impose restrictions on their customers. I have not heard any cloud service provider discuss SLAs in detail, and I wonder how they could guarantee one, even with conditions.
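The speed-of-light point is easy to make concrete with back-of-envelope arithmetic. Assuming an illustrative ~4,000 km fiber path between distant data centers and light traveling at roughly 200,000 km/s in glass (both numbers are my assumptions, not Google's figures), the floor on latency is already tens of milliseconds:

```shell
# Minimum possible latency over a long fiber path, ignoring routing,
# queuing, and processing delays entirely.
# Both numbers below are illustrative assumptions.
distance_km=4000        # assumed inter-data-center fiber distance
fiber_speed_km_s=200000 # light in fiber is ~2/3 of c in vacuum
awk -v d="$distance_km" -v c="$fiber_speed_km_s" \
  'BEGIN { printf "one-way: %.1f ms, round trip: %.1f ms\n", d/c*1000, 2*d/c*1000 }'
```

A 40 ms round trip before any real work happens is why keeping a workload and its data in one data center matters, and why an SLA that spans data centers is hard to promise.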
More details can be found in their data sheet here, and here are two useful links:
Some architectural information follows.
The yellow disklike box behind Kathryn Hurley is cloud storage.
One unique thing about this is the command-line interface, the gcutil tool.
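To give a flavor of it, here is a sketch of what a gcutil session might look like. The subcommand and flag names below are my recollection of the limited-preview tool and may well differ from the shipping version; the project, instance, and image names are made up:

```shell
# Hypothetical gcutil session (flags and image path are illustrative
# and may not match the actual preview release).
gcutil --project=my-project addinstance my-instance \
  --machine_type=n1-standard-1 \
  --zone=us-central1-a

gcutil --project=my-project listinstances   # confirm the VM is up
gcutil --project=my-project ssh my-instance # log in over SSH
```

The appeal is that the whole lifecycle of a VM can be scripted, which fits the Hadoop-style use case MapR demonstrated, where clusters are created and torn down programmatically.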
Now to the benchmarks and usability experience shared by MapR. It is more convincing when an actual user, rather than the vendor, says good things about an offering. Even though this is not a blog on Hadoop or MapR, here is some basic information shared by Jack Norris, MapR's VP of marketing.
He also summarized MapR’s deployment on Google Compute Engine, as follows:
Srivas added the following, more detailed benchmark:
The deployment on the Google platform outperformed the one hosted on physical hardware in every benchmark except processing time:
There was no cost comparison. Srivas joked that we had better think twice before buying and owning a bunch of servers: a server's useful life is probably only two to three years, and the minute you buy a new server it starts becoming obsolete, because servers with newer technologies appear constantly. Spending that much money is one thing, but when your business goals change and you no longer need that many servers, what do you do with them? They do not disappear magically.
In the IaaS area, AWS was way ahead of the market curve, followed by Rackspace and others. As Dave Nielsen said at the beginning of the meeting, those who were working on IaaS (e.g., Amazon) are adding PaaS solutions (e.g., Elastic Beanstalk), and those who were in the PaaS market (e.g., Google) are adding IaaS solutions. The cloud market is still expanding; in spite of some problems, such as lock-in (due to a lack of standards), security worries, and loss of control, it keeps growing rapidly because of its wide and broad market. As long as mobile computing and sensor networks (such as the smart grid) keep growing, the end of that growth does not seem anywhere near. And while very few interoperable cloud platforms exist, it is always good to have competition.