Friday, 18 February 2011

Exchange 2010 SP1 DAC in a single AD site

With the release of SP1 for Exchange 2010 it is now possible to enable Datacenter Activation Coordination (DAC) mode in a single AD site. This is perfect for smaller environments, because it makes Exchange 2010 disaster recovery possible where two DAG members are separated across two system rooms in the same building, or in separate buildings that are still in the same AD site.

The configuration would usually consist of the following:
  • Primary datacenter with an Exchange Database Availability Group (DAG) member and a file share witness, which is usually located on an Exchange CAS or Hub Transport server
  • Secondary datacenter with the second Exchange DAG member
The underlying cluster mechanism, Windows Failover Clustering, now has three votes: two DAG members and a file share witness. In this setup we can lose any single machine and still have the Exchange databases online.

However, if disaster strikes and we lose our primary datacenter, two out of three votes are lost, the cluster loses quorum, and Failover Clustering brings the entire cluster down. This leaves the Exchange databases unavailable to users.
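The vote arithmetic above can be checked with a few lines of PowerShell. This is only an illustration of the majority rule (more than half of the votes must be reachable). The variable names are made up for the example and nothing here talks to a real cluster.

```powershell
# Illustrative only: majority arithmetic for a three-vote cluster.
$totalVotes     = 3                                   # two DAG members + one file share witness
$votesNeeded    = [math]::Floor($totalVotes / 2) + 1  # majority of 3 votes is 2
$votesSurviving = 1                                   # only the secondary DAG member is left

"Votes needed for quorum: $votesNeeded"
"Quorum kept: $($votesSurviving -ge $votesNeeded)"
```

Setting $votesSurviving to 2 models the loss of any single machine, in which case quorum is kept.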


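In this example EXMBX01 is the DAG member in the primary datacenter and EXMBX02 is the DAG member in the secondary datacenter. If you had databases mounted on EXMBX01, their copies on EXMBX02 would have a status of "Disconnected and Healthy", and any databases that were mounted on EXMBX02 would have a status of "Dismounted".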
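To see these statuses yourself, something like the following should work from the Exchange Management Shell on the surviving server. It is a minimal sketch that assumes the server name EXMBX02 from this example. The selected columns are just the ones I find useful.

```powershell
# List every database copy hosted on EXMBX02 together with its current status.
Get-MailboxDatabaseCopyStatus -Server EXMBX02 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```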


Note: For the following procedure to work you should have already enabled Datacenter Activation Coordination mode using the following command (replace dagname with the name of your DAG, which is EXDAG01 in this example):

Set-DatabaseAvailabilityGroup -Identity dagname -DatacenterActivationMode DagOnly
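Before relying on the procedure it is worth confirming that the setting took effect. A quick check, assuming the DAG name EXDAG01 used later in this post:

```powershell
# DatacenterActivationMode should show DagOnly when DAC mode is enabled (Off otherwise).
Get-DatabaseAvailabilityGroup -Identity EXDAG01 |
    Format-List Name, DatacenterActivationMode
```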


To mount databases on the Exchange DAG member in the secondary datacenter, run the following PowerShell commands on the second DAG member (EXMBX02):

Stop-Service clussvc

This stops the Failover Cluster service (clussvc) on EXMBX02.

Stop-DatabaseAvailabilityGroup -Identity EXDAG01 -MailboxServer EXMBX01 -ConfigurationOnly

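This marks the failed DAG member (EXMBX01) as stopped in the DAG configuration. Since that DAG member is down and unreachable, we use the ConfigurationOnly switch, which updates only Active Directory without trying to contact the server.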
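The result of this step can be checked by looking at the started and stopped server lists on the DAG object. A small sketch, again assuming the names used in this post:

```powershell
# After the stop command, EXMBX01 should be listed under StoppedMailboxServers
# and EXMBX02 should remain under StartedMailboxServers.
Get-DatabaseAvailabilityGroup -Identity EXDAG01 |
    Format-List Name, StartedMailboxServers, StoppedMailboxServers
```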

Restore-DatabaseAvailabilityGroup -Identity EXDAG01 -AlternateWitnessServer EXHUBCAS02 -AlternateWitnessDirectory C:\EXDAG01_W02\

This command reconfigures the Failover Cluster so that it now consists of only one cluster member (EXMBX02) and a file share witness, which will be placed on EXHUBCAS02 in the secondary datacenter. After the command finishes we should have our cluster online and all databases mounted on EXMBX02. We can omit the AlternateWitnessServer and AlternateWitnessDirectory switches if we have previously set them in the properties of the DAG using the Set-DatabaseAvailabilityGroup cmdlet.
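Setting the alternate witness ahead of time might look like this. It is a sketch that reuses the server and directory names from this example, so adjust them to your environment.

```powershell
# Done once, while everything is healthy: store the alternate witness on the DAG object.
Set-DatabaseAvailabilityGroup -Identity EXDAG01 `
    -AlternateWitnessServer EXHUBCAS02 `
    -AlternateWitnessDirectory C:\EXDAG01_W02

# During a switchover the restore command can then be run without the witness switches.
Restore-DatabaseAvailabilityGroup -Identity EXDAG01
```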

If we now open the Failover Cluster management console we should see only EXMBX02 as a member of the cluster and file share witness should point to EXHUBCAS02.
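The same check can be done from PowerShell instead of the console. This assumes the FailoverClusters module that ships with Windows Server 2008 R2 and is run locally on EXMBX02.

```powershell
# Load the Failover Clustering cmdlets.
Import-Module FailoverClusters

# List the cluster members and their state.
Get-ClusterNode

# Show which resource currently acts as the quorum witness.
Get-ClusterQuorum | Format-List Cluster, QuorumResource, QuorumType
```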

But what happens when the primary datacenter comes back online? This is when the DAC magic comes into play. If EXHUBCAS01 and EXMBX01 are brought online while the link between the two datacenters is still down, DAC mode will not allow them to mount databases even though two out of three of the original votes are available. This is because in DAC mode a DAG member that has just started must successfully contact either all other DAG members or at least one DAG member whose Active Manager holds a bit value of 1 in memory (Microsoft's documentation calls this the Datacenter Activation Coordination Protocol, or DACP, bit). Since EXMBX01 cannot contact EXMBX02, it will not mount its databases, which prevents the split brain scenario.
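The rule described above can be modelled in a few lines. This is purely a conceptual sketch and not how Exchange is implemented. The function name and its parameters are invented for the illustration.

```powershell
# Conceptual model only: may a freshly started DAG member mount its databases?
function Test-MountAllowed {
    param(
        [int[]]$ReachablePeerBits,  # bit values (0 or 1) reported by the peers that answered
        [int]$TotalPeers            # number of other DAG members
    )
    # Allowed if any reachable peer already has its bit set to 1...
    $peerHasBitSet   = $ReachablePeerBits -contains 1
    # ...or if every other DAG member could be contacted.
    $allPeersReached = ($ReachablePeerBits.Count -eq $TotalPeers)
    return ($peerHasBitSet -or $allPeersReached)
}

# EXMBX01 restarts while the link to EXMBX02 is down: nobody answers.
Test-MountAllowed -ReachablePeerBits @() -TotalPeers 1     # False

# The link is restored and EXMBX02 answers with its bit set to 1.
Test-MountAllowed -ReachablePeerBits @(1) -TotalPeers 1    # True
```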

I recommend you read Scott Feltmann's blog for more information on how this Active Manager bit works.

When everything is back online again, follow these steps to put everything back the way it was before the disaster:

Start-DatabaseAvailabilityGroup -Identity EXDAG01 -ActiveDirectorySite Default-First-Site-Name

This command essentially adds EXMBX01 back into the cluster. Databases are still mounted on EXMBX02, but replication should resume and all passive database copies on EXMBX01 should become healthy. If they do not, use the Update-MailboxDatabaseCopy cmdlet to reseed them.
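Checking the copies and reseeding one that did not recover could look roughly like this. DB01 is a hypothetical database name used only for illustration. Note that DeleteExistingFiles removes the existing copy files on EXMBX01 before seeding.

```powershell
# See which copies on EXMBX01 are healthy again.
Get-MailboxDatabaseCopyStatus -Server EXMBX01

# For a copy that stays failed: suspend it, then reseed it from the active copy.
# DB01 is a placeholder; use the name of the affected database.
Suspend-MailboxDatabaseCopy -Identity "DB01\EXMBX01" -Confirm:$false
Update-MailboxDatabaseCopy -Identity "DB01\EXMBX01" -DeleteExistingFiles
```

Replication for that copy resumes automatically once seeding completes, unless the ManualResume switch is used.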

At this point the file share witness is still set to EXHUBCAS02. We can confirm this by opening the Failover Cluster management console, but if we look at the DAG properties using the Get-DatabaseAvailabilityGroup cmdlet we will see that it shows EXHUBCAS01 as the file share witness. All we need to do is run this cmdlet, without any other parameters, so that the cluster is reconfigured to match the DAG properties:

Set-DatabaseAvailabilityGroup -Identity EXDAG01

The Failover Cluster management console now shows EXHUBCAS01 as the file share witness.

There is one more thing to do: move the active databases back to EXMBX01. Everything is then the way it was before the disaster.
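Moving a database back might look like the sketch below. DB01 is again a hypothetical name. Repeat the command for each database that was active on EXMBX01 before the disaster.

```powershell
# Activate the copy of DB01 on EXMBX01; the copy on EXMBX02 becomes passive.
# DB01 is a placeholder database name.
Move-ActiveMailboxDatabase -Identity DB01 -ActivateOnServer EXMBX01 -Confirm:$false

# Confirm where each copy is now mounted.
Get-MailboxDatabaseCopyStatus -Identity DB01
```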






7 comments:

  1. Thanks for the info Dinko. I'm setting this up for a customer this week with a similar single AD site, 3 member DAG with a remote data center scenario.

  2. Hi, great article. Just so I haven't missed anything, is this suggesting that the mbx servers are set up in an active/active manner?

    Thanks

  3. Yes, both mbx servers have active mailbox databases on them.

  4. hi, great procedure


    I shut down the mailbox servers and ran the command to activate the databases. I encountered the error "error getting db copy status from mailbox server", along with a message that the replication service is not running. I checked, and the replication service is running. Any advice?

  5. Will it work if I have one Hub and CAS server in each datacenter and 3 mailbox servers in the DAG: 2 mailbox servers in the primary datacenter and 1 mailbox server in the second datacenter?

    Thanks,

  6. Check this out about Exchange datacenter HA switchover and stretching a DAG between sites.
    These are notes from engineers in the field doing datacenter switchovers for Exchange 2010, with tips and real-world scenarios. It explains everything step by step, with examples.
    http://wp.me/p1eUZH-8U

  7. DAC mode prevents split brain at the database level. It has nothing whatsoever to do with failovers, and therefore leaving DAC mode disabled will not enable automatic datacenter failovers.