I was recently asked an interesting question, so I thought I would share the answer here. The question was “Why did Microsoft decide to use RPC over HTTP for internal communication between Outlook and Exchange 2013?” As with most design questions for a product as complex as Exchange, there are a lot of reasons for this design, but I think the best answer comes down to simplicity.
With a product as complex as Exchange, any design element that reduces that complexity makes for a better, more reliable product that is less expensive to maintain. Any idiot can make a solution complex; it takes real genius to make a solution simple.
Why is using RPC over HTTP simpler than using MAPI communication? You might guess that the reason is that Exchange uses RPC to insert messages into the database, so making all communication RPC-based makes it easier to update the Exchange database. While this would be a sensible guess, it’s exactly wrong. MAPI is still the language that the database engine understands, so using RPC for all client-server communication means that one of the last steps Transport has to perform is to “translate” the RPC into MAPI. No added simplicity there.
To understand why RPC over HTTP makes Exchange 2013 simpler, you need to understand site-resilient designs for Exchange 2010. The introduction of Database Availability Groups (DAGs) in Exchange 2010 was a great step forward for Exchange design. DAGs make high availability design much simpler, much easier to deploy and maintain, and much easier to support. DAGs make it easy to deploy up to 16 copies of a database on different servers in one site or in multiple sites. The problem with the introduction of DAGs in Exchange 2010 is that they do nothing to change the high availability design of the Client Access (CA) servers. I’m not going to go into all the specifics of HA design for Client Access servers in Exchange 2010 in this blog post, but I will say that designing two-site redundancy for the CA servers in Exchange 2010 requires nine namespaces and a manual failover process. I’ve seen a lot of Exchange 2010 deployments where the designers assumed that building a cross-site DAG meant that if they lost site one, all Exchange services would automatically fail over to site two within a few seconds and everything would be A-OKAY. In reality, this is not the case with Exchange 2010.
So back to the question at hand. The answer is that eliminating direct MAPI communication between Outlook clients and Exchange servers means that Exchange 2013 can be deployed in highly available, site-redundant configurations with no additional namespaces and with automatic site failover. This is all possible because of the way the internet, and pretty much every LAN, works at the transport layer. MAPI is not internet routable, so we need to do all sorts of complicated magic to get Outlook clients from one site to connect to Exchange CA servers in another site after a failure. If those Outlook clients are using RPC over HTTP, with HTTP using TCP at the transport layer, then those clients can seamlessly connect to a new CA server at an entirely different site. With the proper design and deployment, an Exchange 2013 site failure will result in all Outlook clients reconnecting to servers in a second site in about 15 seconds.
Simple.
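To make the reconnect idea concrete, here is a minimal sketch in Python. It is not Outlook or Exchange code. It assumes, purely for illustration, that a single namespace resolves to one address per site and that the client simply tries the next address when one does not answer. The namespace `mail.example.com`, the addresses, and the function names are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def connect_with_failover(
    addresses: Iterable[str],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first address that accepts a connection, or None."""
    for address in addresses:
        if try_connect(address):
            return address
    return None


# Hypothetical data: one namespace, one address per site (documentation ranges).
NAMESPACE_ADDRESSES = {
    "mail.example.com": ["192.0.2.10", "198.51.100.10"],  # site one, site two
}


def simulate(site_one_up: bool) -> Optional[str]:
    """Simulate a client connecting while site one is up or down."""
    reachable = {"198.51.100.10"}  # site two is always up in this sketch
    if site_one_up:
        reachable.add("192.0.2.10")
    return connect_with_failover(
        NAMESPACE_ADDRESSES["mail.example.com"],
        lambda address: address in reachable,  # stand-in for a real TCP connect
    )
```

The point of the sketch is that the client keeps using the same name before and after the failure; only the address that answers changes, so no extra namespace or manual switch is involved.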
Of course, RPC over HTTP has its own issues. Exchange 2013 SP1 introduces MapiHttp, which is a new communications method. You can read my brief summary of MapiHttp in my blog post about Exchange 2013 SP1.