Amazon has added new features to EC2, including the ability to auto scale. Clients have been asking for auto scaling for a while. Until now, Amazon's EC2 service was best suited to computational tasks that leveraged its heavily distributed architecture. With the new service in place, Amazon better serves clients that want automatic load balancing in the cloud.

Judging from the activity on Twitter, enthusiasm is high for Amazon's new features. Amazon responds quickly to customer feedback and develops new features accordingly. Still, it is easy to get caught up in the enthusiasm and lose sight of the risks that come with auto scaling in the cloud. But first, let's define what we are discussing.

What is Amazon's Auto Scaling service?

Amazon's Auto Scaling feature pairs with its new CloudWatch service. CloudWatch:

"…tracks and stores a number of per-instance performance metrics including CPU load, Disk I/O rates, and Network I/O rates. The metrics are rolled-up at one minute intervals and are retained for two weeks. Once stored, you can retrieve metrics across a number of dimensions including Availability Zone, Instance Type, AMI ID, or Auto Scaling Group."

The Auto Scaling service:

"…allows you to automatically scale your Amazon EC2 capacity up or down according to conditions you define. With Auto Scaling, you can ensure that the number of Amazon EC2 instances you're using scales up seamlessly during demand spikes to maintain performance, and scales down automatically during demand lulls to minimize costs. Auto Scaling is particularly well suited for applications that experience hourly, daily, or weekly variability in usage."

A Different View: Dynamic Scaling

George Reese draws a distinction between dynamic scaling and auto-scaling. Reese is CTO of EnStratus, a cloud service provider, and the author of Cloud Application Architectures, published by O'Reilly. He also writes a blog for O'Reilly about cloud computing.
Here’s his definition:
Auto-scaling takes advantage of a critical feature of the cloud called dynamic scaling. Dynamic scaling is the ability to add and remove capacity into your cloud infrastructure on a whim—ideally because you know your traffic patterns are about to change and you are adjusting accordingly.
His point? He explained it to me today in a Twitter exchange. Reese wrote last December that Amazon and other cloud providers cannot respond fast enough to sudden increases in capacity needs: it can take 10 minutes for Amazon to bring new capacity online, and that may be too late. He also says there are security risks that can lead to a whole set of problems without "governors" in place. But even governors may prove ineffective without guidelines that deal with two kinds of capacity:
* Capacity demands you should have planned for, and thus don't need auto-scaling for.
* Capacity demands you could not have planned for, and thus you have no idea whether the governor level you have set is even appropriate to the traffic.
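Reese's 10-minute figure is worth quantifying. The back-of-the-envelope sketch below uses entirely hypothetical numbers (they come from neither Reese nor Amazon) to show how many requests arrive beyond current capacity while a newly requested instance is still booting:

```python
# Back-of-the-envelope sketch with hypothetical numbers: how much traffic
# exceeds current capacity during the lag between a spike starting and a
# new EC2 instance coming online.

def excess_requests(spike_rps, capacity_rps, lag_seconds=600):
    """Requests arriving beyond capacity during a ~10-minute provisioning lag.

    spike_rps: request rate during the spike (requests per second)
    capacity_rps: rate the current fleet can actually serve
    """
    shortfall = max(0, spike_rps - capacity_rps)
    return shortfall * lag_seconds
```

For example, a spike of 500 requests per second against a fleet that can serve 300 leaves 120,000 requests queued or dropped over a 10-minute lag, which is why a slow-reacting auto-scaler may respond too late to matter.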
And what about getting Slashdotted? You want the site to scale to handle the traffic of unexpected attention, but auto scaling can be blind in its approach:
But you don’t want it to auto-scale. Auto-scaling cannot differentiate between valid traffic and nonsense. You can. If your environment is experiencing a sudden, unexpected spike in activity, the appropriate approach is to have minimal auto-scaling with governors in place, receive a notification from your cloud infrastructure management tools, then determine the best way to respond going forward.
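Reese's "minimal auto-scaling with governors" amounts to a hard cap plus an escalation path. The toy sketch below illustrates the idea; the function name and values are hypothetical, not EnStratus or Amazon code:

```python
# Illustrative toy of a scaling "governor": a hard cap on how far the
# auto-scaler may grow the fleet, plus a flag telling the operator that
# the cap was hit. Hypothetical sketch, not EnStratus or Amazon code.

def govern(requested, governor_cap):
    """Clamp a requested instance count and flag when a human should look.

    Returns (allowed, needs_human): the capacity the governor permits,
    and whether the request exceeded the cap, e.g. a Slashdot spike or
    junk traffic an operator should inspect before paying for it.
    """
    if requested <= governor_cap:
        return requested, False
    # Hold at the cap and notify, rather than blindly buying instances
    # that may only be serving nonsense traffic.
    return governor_cap, True
```

A request for 12 instances against a cap of 10 yields `(10, True)`: the fleet holds at the governor level while a human decides whether the traffic is legitimate.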
At what point should there be human intervention? Reuven Cohen of ElasticVapor chimed in after Reese wrote his post last December:
No scaling operation should be fully or completely automated; it should be a series of controls/rules, policies, quotas, and monitors that are tailored to reduce the need for human operator involvement, or for the purposes of achieving a set of requirements such as the quality of my users' experience. I'm personally all for a Dynamic Automated Infrastructure.
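In miniature, Cohen's "series of controls, rules, policies, quotas and monitors" might look like the sketch below: a CPU-threshold rule bounded by quota limits, with anything outside the quota left to a human operator. All thresholds, sizes, and names here are hypothetical illustrations, not Amazon's Auto Scaling implementation:

```python
# Toy sketch of a rule-bounded scaling policy in the spirit of Cohen's
# comment. Thresholds, quota sizes, and names are hypothetical, not
# Amazon's Auto Scaling implementation.

def plan_capacity(current, avg_cpu, min_size=2, max_size=8,
                  scale_up_at=70.0, scale_down_at=30.0):
    """Return the instance count the policy requests, clamped to quota.

    Adds one instance during demand spikes (high average CPU), sheds one
    during lulls, and never leaves the [min_size, max_size] quota; any
    demand beyond the quota is an operator's decision, not the policy's.
    """
    if avg_cpu > scale_up_at:
        target = current + 1      # spike: add capacity
    elif avg_cpu < scale_down_at:
        target = current - 1      # lull: shed capacity to cut costs
    else:
        target = current          # comfortable band: hold steady
    return max(min_size, min(target, max_size))
```

With these hypothetical settings, `plan_capacity(4, 85.0)` requests a fifth instance, while `plan_capacity(8, 95.0)` holds at the quota ceiling of 8 no matter how hot the CPUs run, which is exactly the point at which monitoring should page a human.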