Living in Silicon Valley is good. Within a 30-minute drive, I can attend many interesting meetings for free or for a nominal charge. The speakers are top-notch, and many of them are movers and shakers in the technical fields that are shaping the market.
Some regard social networking systems (SNS) as timewasters, but many others, like me, take them seriously. I am not sure if the current shape and form of SNS will continue to exist for many more years, but certainly mass participation and sharing of a vast amount of information in an elastic fashion will continue in one shape or another for many years to come. This means we need infrastructures to handle such requirements. I will try not to be too analytical, but these are two of those requirements:
- Massive traffic that rises and falls instantly (second by second) and fluctuates dynamically
- A vast amount of structured and unstructured data processing
I am sure there are more, but let’s keep things simple and consider only these two. The first calls for a powerful infrastructure that can process an unprecedented amount of information. Because the load shifts dynamically, a static infrastructure is not a good solution. In other words, we need an infrastructure that satisfies somewhat conflicting requirements: it must be powerful yet flexible. One solution is to overprovision each and every component in the infrastructure. However, most of the time you cannot accurately predict the level of demand, especially for SNS applications. Because you cannot prepare an infinite amount of resources, this solution would be very hard to implement. Also, as stated in the second point, a vast amount of data in many formats, structured and unstructured, must be processed. On top of that, the data arrive en masse in real time. How do you handle that?
At a recent cloud computing meet-up held at Microsoft’s Silicon Valley office, there were two talks on accommodating these requirements.
The first talk was by Jim Zimmerman, CTO of Thuzi, which is based in Tampa, Florida. According to their website,
Thuzi is focused on making social media count.
To that end, they provide infrastructures and tools to assist clients’ SNS-based marketing campaigns, which might be hosted on Facebook or other sites. Their tools provide a solution to the above two requirements.
Jim said that the techie population in Tampa is much smaller than that of Silicon Valley, and when they hold a meet-up, they can do it at a local Panera Bread Restaurant. In contrast, a good-size auditorium at the Microsoft Research facility was full of techies. I must confess that I could follow most of his talk, but here and there I was lost because I have not touched code for some time. But the problems he articulated were clear. They have a solution for both business and consumer applications.
A set of problems in dealing with business applications.
The problem statement is an expanded version of what I described. For this, their solution is here.
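Command query responsibility segregation (CQRS) may not be a familiar term. Jim elaborated on it in the following architecture slide. Strictly speaking, CQRS separates the operations that change state (commands) from the operations that read it (queries). In this architecture, the result is basically a complete separation of UI and backend. The same philosophy was discussed in the subsequent talk by Brian Aker.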
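To make the idea concrete, here is a minimal Python sketch of CQRS. It is not Thuzi's implementation. The voting scenario and every name in it (`CastVote`, `VoteWriteModel`, `VoteReadModel`, `sync`) are hypothetical and chosen only for illustration. The write side accepts commands and records what happened; the read side keeps a separate view that is cheap to query.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CastVote:
    """Command: a request to change state (hypothetical example)."""
    campaign_id: str
    user_id: str


@dataclass
class VoteWriteModel:
    """Write side: validates commands and records accepted ones as events."""
    events: list = field(default_factory=list)
    seen: set = field(default_factory=set)

    def handle(self, cmd: CastVote) -> bool:
        key = (cmd.campaign_id, cmd.user_id)
        if key in self.seen:
            return False  # reject a duplicate vote
        self.seen.add(key)
        self.events.append(cmd)
        return True


@dataclass
class VoteReadModel:
    """Read side: a denormalized view that only answers queries."""
    totals: dict = field(default_factory=dict)

    def apply(self, event: CastVote) -> None:
        self.totals[event.campaign_id] = self.totals.get(event.campaign_id, 0) + 1

    def total_for(self, campaign_id: str) -> int:
        return self.totals.get(campaign_id, 0)


def sync(write: VoteWriteModel, read: VoteReadModel, start: int = 0) -> int:
    """Copy events recorded since `start` into the read model.

    Returns the position to pass as `start` on the next call.
    """
    for event in write.events[start:]:
        read.apply(event)
    return len(write.events)
```

Because the two sides share nothing but the stream of events, each can be scaled or relocated on its own. In a real system, `sync` would typically be replaced by a queue or some other asynchronous channel, which is an assumption on my part and not something shown in the talk.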
The consumer application has its own problems, as follows:
And their solution for that is as follows.
They also have an application that manipulates photos. Their earlier version sent everything to the server for processing, but it took too much time. In the current version, they process the photos on the client side and send the processed photos to the server. This is summarized in the following slide.
Accommodating SNS applications requires very different thinking about computing and infrastructure. Public cloud is an ideal platform for handling unknown bandwidth because you can increase your computing and other resources on the fly. If traffic subsides, you can decrease the resources accordingly. So for you, the customer, it is a good way to save energy, but how about the provider? In theory, a public cloud provider has many customers running various loads simultaneously. Some may increase and others may decrease traffic at a given moment. So in the end, everything balances out and energy is not wasted. Intuitively, that seems to be true. But it really depends upon many parameters, including each cloud site’s configuration and customer mix. I think the jury is still out on this.
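The "balancing out" argument can be checked with a little arithmetic. The following Python sketch is purely illustrative: the function names, the traffic numbers, and the assumption that capacity scales linearly with instance count are all mine, not from the talk. It compares the capacity needed when each customer is provisioned for its own peak with the capacity needed when a provider pools the loads.

```python
import math


def instances_needed(requests_per_sec, capacity_per_instance,
                     min_instances=1, max_instances=100):
    """Instances required for a given load, clamped to a allowed range."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    wanted = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, wanted))


def peak_instances(load_series, capacity_per_instance):
    """Instances needed to survive the busiest moment in a load series."""
    return max(instances_needed(x, capacity_per_instance) for x in load_series)


def separate_vs_pooled(customers, capacity_per_instance):
    """Return (sum of per-customer peaks, peak of the combined load)."""
    separate = sum(peak_instances(series, capacity_per_instance)
                   for series in customers)
    pooled_load = [sum(step) for step in zip(*customers)]
    pooled = peak_instances(pooled_load, capacity_per_instance)
    return separate, pooled


# Hypothetical traffic (requests per second) at three moments in time.
staggered = [[100, 900, 100], [900, 100, 100]]   # peaks at different times
coincident = [[100, 900, 100], [100, 900, 100]]  # peaks at the same time
```

With 100 requests per second per instance, `separate_vs_pooled(staggered, 100)` gives `(18, 10)`, so pooling helps when peaks are staggered. `separate_vs_pooled(coincident, 100)` gives `(18, 18)`, so pooling saves nothing when every customer peaks at once. That is why the customer mix matters so much.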
I cannot generalize their architecture, but it seems that for scalability, UI and backend processing should be separated to increase processing power, regardless of how light or heavy traffic may be. In the cloud environment, UI and backends can be anywhere. They may not be stationary at one physical location, and that makes the whole design very complex. Brian Aker addressed that problem in his talk.
The last slide contains the relevant reference information.