Real-Time Streaming Data Analysis with SaaS

At a recent Technology Convergence Conference, Dean Nelson, Vice President of Global Foundation Services (GFS) at eBay, said something like the following. People embrace cloud computing, the digital world, and e-commerce, but they seldom pay much attention to the infrastructure that makes them possible. Indeed, many people seem to think applications and clouds appear out of nowhere. It is important to recognize that solid infrastructure, consisting of computing, storage, and networking, is what makes them possible.

When we consider the components of these infrastructures, networks stand out. After all, without connectivity, none of the other parts of the infrastructure matter much. Beyond simply transmitting data, networks have other requirements. One of those is network analysis, which has become more complex and demanding in this new age and which is not easily accomplished.

The requirements for network analysis include:

  1. Capturing data that reach your networks at high speed, in a variety of data formats (streaming data) through various network protocols (NetFlow, PCAP, IPFIX, sFlow, GeoIP, BGP, SNMP and others)
  2. Storing such captured data in data storage in real time
  3. Analyzing such data to take appropriate action in real or near real time

The very rudimentary diagram below depicts these.

Figure 1: Oversimplified Kentik Architecture (diagram by the author)
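
To make the three requirements (capture, store, analyze) a little more concrete, here is a minimal Python sketch. It is purely illustrative and is not Kentik's implementation. The FlowRecord fields, the StreamPipeline class, and the "top talkers" analysis are all my own assumptions, and the in-memory deque only stands in for a real data store. The sketch also assumes records arrive in timestamp order.

```python
from collections import Counter, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical, simplified flow record (not a real NetFlow/IPFIX layout)."""
    timestamp: float  # seconds
    src_ip: str
    dst_ip: str
    num_bytes: int


class StreamPipeline:
    """Capture -> store -> analyze over a sliding time window."""

    def __init__(self, window_seconds: float = 60.0):
        self.window_seconds = window_seconds
        self.store = deque()  # stand-in for a real-time data store

    def capture(self, record: FlowRecord) -> None:
        # Requirements 1 and 2: accept a record as it arrives and store it at once.
        self.store.append(record)
        # Evict records older than the window (assumes in-order arrival).
        cutoff = record.timestamp - self.window_seconds
        while self.store and self.store[0].timestamp < cutoff:
            self.store.popleft()

    def top_talkers(self, n: int = 3):
        # Requirement 3: analyze the stored window, here total bytes per source.
        totals = Counter()
        for rec in self.store:
            totals[rec.src_ip] += rec.num_bytes
        return totals.most_common(n)


pipeline = StreamPipeline(window_seconds=60)
pipeline.capture(FlowRecord(0.0, "10.0.0.1", "10.0.0.9", 500))
pipeline.capture(FlowRecord(30.0, "10.0.0.2", "10.0.0.9", 1500))
pipeline.capture(FlowRecord(70.0, "10.0.0.1", "10.0.0.9", 200))
print(pipeline.top_talkers())
```

The point of the sketch is only that analysis runs against data that is stored as it arrives, rather than against a batch loaded later.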

I totally agree with Nelson’s comment, and I wondered whether a company existed that contributes to satisfying such demanding requirements for network analysis. I had an opportunity to talk to Alex Henthorn-Iwane late last year; I had interviewed him when he was with a previous company. His new company, Kentik, sounded interesting, and I recently got to talk to him and Jim Frey, VP, Product, at Kentik Technologies, Inc.

As usual, I will summarize our discussion at a level deeper than a press release but appropriate for laymen.

Target Markets

According to Frey, Kentik's products and services are useful to cloud companies like Amazon and Microsoft, ISPs/telecom companies like AT&T and Verizon, and e-commerce companies like eBay. I think data center service providers, such as Equinix, could benefit from them as well. In fact, Kentik colocates at an Equinix data center and is in talks with Equinix about adopting Kentik's services.

Differentiations

Below, I have added my own points to Frey's.

1.    SaaS

Frey said that delivering their services as SaaS is one of their differentiators. Ideally, you can set up the service easily and receive results without having to place any additional appliances on your premises.

2.    Multitenancy

Another one is multitenancy. Multitenancy security can mean two different things: logical or physical data separation. Logical separation means each user's data segments are placed on the same physical storage, and separation is done via software. Physical separation means each user's data segments are placed on separate physical storage. Kentik implements logical separation now but plans to implement physical separation later. For those who are really worried about data security, their services can be hosted on-premises at each user's site.
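
The difference between the two kinds of separation can be sketched in Python. This is only an illustration under my own assumptions and says nothing about how Kentik actually stores data. The class names SharedFlowStore and PerTenantFlowStore are hypothetical, and an in-memory list per tenant merely approximates separate physical storage.

```python
from collections import defaultdict


class SharedFlowStore:
    """Logical separation: all tenants share one store; software filters access."""

    def __init__(self):
        self._rows = []  # every tenant's rows live in the same container

    def insert(self, tenant_id: str, row: dict) -> None:
        # Tag each row with its owner so that reads can be filtered later.
        self._rows.append((tenant_id, dict(row)))

    def query(self, tenant_id: str) -> list:
        # A tenant only ever sees rows tagged with its own ID.
        return [row for owner, row in self._rows if owner == tenant_id]


class PerTenantFlowStore:
    """Physical separation, approximated by one independent container per tenant."""

    def __init__(self):
        self._stores = defaultdict(list)

    def insert(self, tenant_id: str, row: dict) -> None:
        self._stores[tenant_id].append(dict(row))

    def query(self, tenant_id: str) -> list:
        return list(self._stores.get(tenant_id, []))


shared = SharedFlowStore()
shared.insert("tenant-a", {"dst_ip": "10.0.0.9", "num_bytes": 500})
shared.insert("tenant-b", {"dst_ip": "10.0.0.7", "num_bytes": 900})
print(shared.query("tenant-a"))
```

In the shared store, a bug in the filter could expose one tenant's rows to another, which is why security-conscious users tend to prefer physical separation or on-premises hosting.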

3.    Investments and management team

Their war chest is stuffed with about $15M from several investors. In addition, the management team consists of people with experience in relevant fields, such as CDNs.

4.    Technologies

In addition to Frey's points, the following, in my opinion, are noteworthy differentiators in this demanding digital age:

1.      It can capture streaming data of various kinds and speeds.

2.      It can store such data in a database in real time.

3.      It can apply analysis quickly.

The current packet capture market requires speeds of 10 to 40 gigabits per second but will soon require 100 gigabits per second. Kentik provides adequate support for such high speeds. Points 2 and 3 are requirements for storing and analyzing streaming data, which is a trend in data analysis these days. I asked Frey for some details of the three components.

His answer was that all three were developed by Kentik from the ground up. Well-known technologies for managing and analyzing streaming data include the Lambda architecture by Nathan Marz and Apache Spark. There are a few more, like Storm, Flink, and Samza. Kentik does not use any of those, said Frey. In the past, Avi Freedman, CEO of Kentik, contributed to a Quora question, “What kinds of applications is Apache Spark not suitable for?” His answer was “Superfast semi-random access to superlarge persistent data stores.” I wonder what “superfast” and “superlarge persistent data stores” mean in concrete terms, though. Kentik has filed patents on these technologies, but Frey did not elaborate on them, as they are still in the patent process.

Data Center

Kentik colocates at an Equinix data center in Ashburn, Virginia. Incidentally, it is well known that if you start a data center, you want to start with one in Virginia. I thought this was quite a good move on their part. These days, it is becoming common knowledge that unless your power requirements are larger than several tens of MW, it does not make sense to own your data centers. In addition, I think that Equinix could use their services, but Frey declined to comment on that, other than to say that they are in discussions with them.

From another point of view, teaming up with Equinix is a really good idea. I think Equinix is disrupting the Internet market: users and their ISPs do not have to worry about peering too much, because Equinix now has direct connection services to major clouds, including Amazon, Microsoft, Google, and SoftLayer (IBM). More and more people and companies are dealing with clouds. On top of that, Equinix has data centers in strategic locations all around the world. By colocating with Equinix, Kentik can deploy its services worldwide without major effort.

Analytics

One such analysis is the detection of DDoS attacks. Frey said their analysis is along the lines of a well-known detection method. I could spend multiple blog posts on this alone, but I won't go into detail here. He said their customers also want to know how to mitigate DDoS attacks, and Kentik is working on that as well.
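
Frey did not name the detection method, so the following Python sketch is only my own illustration of one simple and common idea: flag any destination whose packet rate in a fixed interval exceeds a threshold. It should not be read as Kentik's approach. The function name, the (timestamp, dst_ip, packet_count) tuple layout, and the default threshold are all assumptions.

```python
from collections import defaultdict


def detect_volumetric_targets(flows, interval_seconds=10, pps_threshold=1000.0):
    """Return (interval_start, dst_ip, pps) for each interval over the threshold.

    `flows` is an iterable of hypothetical (timestamp, dst_ip, packet_count)
    tuples, with timestamps in seconds.
    """
    packets = defaultdict(int)
    for timestamp, dst_ip, packet_count in flows:
        # Assign each flow to a fixed-size time bucket.
        bucket = int(timestamp // interval_seconds) * interval_seconds
        packets[(bucket, dst_ip)] += packet_count

    alerts = []
    for (bucket, dst_ip), count in sorted(packets.items()):
        pps = count / interval_seconds  # average packets per second in the bucket
        if pps > pps_threshold:
            alerts.append((bucket, dst_ip, pps))
    return alerts


sample = [
    (0.5, "203.0.113.10", 5000),
    (3.0, "203.0.113.10", 7000),
    (4.0, "203.0.113.20", 100),
    (12.0, "203.0.113.10", 200),
]
print(detect_volumetric_targets(sample))
```

A fixed threshold is crude. Real detectors usually compare current traffic against a learned baseline for each destination, but the bucketing step looks much the same.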

One interesting analysis is to optimize peering. As indicated above, the Internet market is changing. Bill Norton’s book is a good reference for understanding how the current Internet is implemented, with abundant peering information. More analytics can be developed by applying machine-learning technologies.
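
As a rough illustration of what a peering analysis might look like, the Python sketch below totals traffic by destination autonomous system (AS) and lists the networks that carry a large share of traffic but are not yet peers. This is my own simplified assumption of the idea, not a description of Kentik's analytics. The function name, the (dst_asn, num_bytes) input layout, and the 5% default cutoff are hypothetical, and the AS numbers in the example come from the range reserved for documentation.

```python
from collections import Counter


def peering_candidates(flows, current_peers, min_share=0.05):
    """Return (asn, traffic_share) for non-peer ASNs at or above `min_share`.

    `flows` is an iterable of hypothetical (dst_asn, num_bytes) tuples.
    """
    totals = Counter()
    for dst_asn, num_bytes in flows:
        totals[dst_asn] += num_bytes

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    candidates = []
    # most_common() yields ASNs from the largest byte count to the smallest.
    for asn, num_bytes in totals.most_common():
        share = num_bytes / grand_total
        if asn not in current_peers and share >= min_share:
            candidates.append((asn, share))
    return candidates


sample = [(64500, 600), (64501, 300), (64502, 100)]
print(peering_candidates(sample, current_peers={64501}, min_share=0.2))
```

A real decision would also weigh transit cost, latency, and where the two networks can physically interconnect, but traffic share by AS is a natural starting point.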

Summary

Because of a lack of time, I could not get into the details of the implementation, which is what interests me most. I want to find out more about their streaming data technologies, which are said (by the company's CEO) to be better than Spark. After all, IBM has invested so heavily in Spark that it created the Spark Technology Center.

Finally, Figure 1 is expanded into the following Figure 2.

Zen Kishimoto

About Zen Kishimoto

Seasoned research and technology executive with varied functional expertise, including roles as analyst, writer, CTO, VP Engineering, and in general management, sales, and marketing, across diverse high-tech and cleantech industry segments, including software, mobile embedded systems, Web technologies, and networking. Current focus and expertise are in the application of IT to energy, such as smart grid, green IT, building/data center energy efficiency, and cloud computing.
