Managed Availability in Exchange 2013

Managed Availability (MA) is a new component introduced in Exchange 2013 that helps keep your Exchange server up and running within acceptable service levels. The basic premise is that if Exchange knows that certain counters indicate a degraded client experience on a server, it can take action to rectify the issue or, if all else fails, reboot the server.
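
To make that premise a bit more concrete, here is a rough Python sketch of the "watch a counter, try a fix, reboot as a last resort" idea. This is only an illustration of the concept and not how Exchange actually implements it: the names `RecoveryAction` and `respond`, the latency counter, and the 200 ms threshold are all made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RecoveryAction:
    """A named recovery step (hypothetical; not an Exchange type)."""
    name: str
    run: Callable[[], None]


def respond(read_counter: Callable[[], float],
            threshold: float,
            actions: List[RecoveryAction]) -> List[str]:
    """Run actions in order while the counter stays above the threshold.

    Returns the names of the actions that were actually attempted.
    """
    attempted = []
    for action in actions:
        if read_counter() <= threshold:
            break  # counter is back within limits, stop escalating
        action.run()
        attempted.append(action.name)
    return attempted


# Simulated server state, standing in for a real performance counter.
state = {"latency_ms": 900.0}


def read_latency() -> float:
    return state["latency_ms"]


def restart_service() -> None:
    state["latency_ms"] = 850.0  # pretend the restart did not help


def reboot_server() -> None:
    state["latency_ms"] = 40.0  # pretend the reboot cleared the problem


# Least disruptive action first, reboot last.
actions = [
    RecoveryAction("restart service", restart_service),
    RecoveryAction("reboot server", reboot_server),
]

attempted = respond(read_latency, 200.0, actions)
print(attempted)
```

The ordering of the list is the whole point of the sketch: the least disruptive action goes first and the reboot only runs if nothing before it brought the counter back under the threshold.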

Microsoft developed MA to help run the Office 365 service. The engineers responsible for fixing O365 issues were the Exchange Team, and since they were so close to the product itself they were able to build in a ton of probes and monitors along with actions. Basically, they took all the automation they needed to run O365 (which runs an enormous number of servers!) and pushed it down into on-premises Exchange.
This sounds like a great thing! Right?

Well, yes and no. The problem is that the Exchange Team has a mountain of data and access to all the "inside" knowledge. It is very easy for them to say, "Hmmm, this counter seems outside expected norms, let me go talk to the guy who wrote the code to see whether this indicates a problem and how to respond to it."

For those of us running Exchange on-premises, when we have a question about the MA probes and counters we have to try pulling info from TechNet, blogs, etc. We don't have the intellectual and technical resources that are available at MS. This leads to some really horrible experiences trying to figure out what MA is telling us and why it is acting the way it is when there's a problem.

In the long run MA will be an amazing feature in Exchange that allows servers to self-heal. Anything that reduces the frantic calls at 2am is a good thing, right?! The problem for MS will be in how they distill the information down to the rank and file. How will they enable us IT folks to use the data that MA provides in a clear and concise manner?

Many options were discussed here at MEC during the Ask The Experts session on Managed Availability, and the panel was one of the best I went to at MEC 2014. MS is definitely aware of the issues, and some really novel ideas were discussed, such as allowing MS to collect MA data from clients and turn around and provide a better "view" of that data (right now the data is only straightforward to an Exchange Team dev!).

Again, I want to stress that the idea and impetus for Managed Availability is great; it's just that the interaction with it needs a lot of work. The Exchange Team should be commended for their efforts, but they need to realize that this is just the first couple of miles of a marathon.

Check out some more information on Managed Availability at TechNet:

Abram Jackson wrote a great article on MA and what it does on the EHLO blog. You really need to read this!