Tuesday, April 28, 2009

Azure Best Practice #1: Always Run at Least 2 Instances of Any Role


Some best practices for the Azure cloud computing platform are starting to emerge, and I'll be blogging them as I discover them. I expect the sources for these will be a combination of Microsoft guidance, personal experience, and experiences shared by others in the community.

Best Practice #1 is to always run at least 2 instances of any role (this applies to both web roles and worker roles). There's a very important reason for this: your application may be highly unavailable if you fail to do so. Why? Because Azure is constantly upgrading your instances with the latest OS patches. From what I understand talking to people on the product teams it's doing this very, very frequently. When you have 2 or more instances of a role running, Azure is careful to keep some of them running through the concept of upgrade domains; but when you're running a single instance you're going to have periods where you can't access your application.

I first encountered this issue when I wrote the Lifetracks demo application last year around Thanksgiving, though I didn't recognize the problem for what it was at first. Lifetracks would work after deploying it, but I noticed after a few days it would be "down" and I'd have to re-deploy it. In February I was at a Microsoft event and mentioned this behavior, which I had chalked up to instability in the platform. When I received the advice to run more instances I was skeptical this would make any difference, but I'm happy to report Lifetracks has stayed up from the moment I did so, over 10 weeks ago. So I can attest this is a best practice from direct personal experience.

Some additional evidence is the 22-hour outage the Azure platform experienced in March 2009. If you read Microsoft's analysis of the problem, you'll note that single-instance applications were primarily affected while multiple-instance applications continued to be available. The guidance to run at least 2 instances is stated there.

2 comments:

Unknown said...

My reason for running more than 1 instance is the concurrency access issue we may not encountered when running only a single instance.

Unknown said...
This comment has been removed by the author.