Features, discussions, tips, tricks, questions, problems and feedback

Configuring Windows Server Failover Clustering with SQL AlwaysON Availability Groups

If Adroit/MAPS Redundant system is placed on Windows Server machines, additional safety feature for smooth and reliable failover would be to also configure OS-based cluster together with setting up replication on SQL instances, if they are installed on the same machines.

In following example, we will setup simple Windows Server Failover Cluster with SQL Replication (using AlwaysON AG mechanism) using Windows Server 2016 version (not connected to domain controller, both servers will simply be in same workgroup) and Microsoft SQL Server 2017 Enterprise Edition (however, this tutorial is also applicable for Standard Edition as well).

It is strongly recommended to first understand mechanisms behind this example by reading documentation available at Microsoft webpages:

Because servers in this example are not in domain, it is very important they are part of the same workgroup and local users configuring whole process will have the same name and password on both machines. Make sure these 2 prerequisites are met before going onwards or have these servers in the same domain.

NOTE: Points 2,3,4 are not required if the servers are part of domain, because inside domain DNS resolution would be solved by the DNS controller. Additionally, if in domain, point 6 can be done from a Failover Cluster Manager through clicking, not PowerShell.

  1. Install Failover Cluster role on both servers. Go to Server Manager Dashboard and click Add roles and features. On features section select Failover Clustering and proceed until the installation is finished.
    obraz
  2. At Control Panel->System->Name of computer, domain and workgroup click Change settings->More… and set the same Primary DNS suffix on both servers so they can communicate between each other in the same network.
    obraz
  3. In Advanced TCP/IP Settings of active network card (under TCP/IPv4 protocol) untick Register this connection in addresses in DNS option. Also, make sure Your server has a static IP address in the network.
    obraz
  4. Configure hosts file of Windows Server so nodes can resolve the names of each other and the cluster itself. The hosts file in Windows Server 2016 is located under C:\Windows\System32\drivers\etc. Example of configuration:

192.168.40.2 MAPSNode1.maps.net
192.168.40.2 MAPSNode1
192.168.40.3 MAPSNode2.maps.net
192.168.40.3 MAPSNode2
192.168.40.20 MAPSCluster.maps.net
192.168.40.20 MAPSCluster

  1. Now in Server Manager Dashboard tools pick Failover Cluster Manager. To create a cluster, it needs to validate first, so click the Validate Configuration… option.
    obraz
  2. Create Cluster. Because we are not in domain we cannot simply select after validation option to create cluster based on validation results. Instead, start PowerShell with administrator rights and type in following command (of course modifying the names of nodes, cluster and IP Address of cluster accordingly to former settings):

New-Cluster -Name MAPSCluster -Node MAPSNode1.maps.net, MAPSNode2.maps.net -AdministrativeAccessPoint DNS -StaticAddress 192.168.40.20

  1. If everything went ok, now it should be possible to connect to cluster status using Failover Cluster Manager tool. Simply right click on it and select “Connect to cluster…”. As cluster name, type in same name which was picked during creation. In this example, it was MAPSCluster:
    obraz
  2. Now install SQL Server 2017 Standard (or higher) edition. Make sure that SQL Server Service after installation is started using the same privileges that Windows Cluster itself was created with. DO NOT USE BUILT-IN SERVICE ACCOUNT.
    obraz
  3. In SQL Server Configuration Manager right click on SQL Server service and enable AlwaysOn Availability Groups option. Restart service after doing so. It is required to do that on both servers.
    obraz
  4. Make sure to have firewall exceptions for SQL connections (by default it is inbound TCP port 1433 and for endpoints it is TCP 5033)
  5. Start SQL Management Tools and connect to SQL server on first node (make sure connection to second node is also working). Create at least one database to use in the cluster and make a full backup of it first.
  6. Create Availability Groups. Right click on Availability Groups and go through wizard.
    obraz
  7. While reading through steps, in “Specify Replicas” add second node as second replica. Set it up accordingly to needs (recommended automatic failover option)
    obraz
  8. In the same step it is possible to specify a listener (single IP address to access the clustered SQL services). If it will be configured, make sure before doing that to add that entry as well in hosts file (from step 4) if this is a non-domain cluster.
    obraz
  9. If everything was configured correctly (especially firewall and privileges) cluster should be created.

From now on in MAPS/Adroit itself (DBLog, Auditing, Alarm Management etc.) when using SQL string it should point to SQL cluster IP address/DNS Name instead of specific node for high availability. Also, in MAPS/Adroit Redundant system remember to select in SQL mechanisms “disable on standby” option to prevent from inserting doubled entries to database.

If MAPS/Adroit is installed also on such clustered solution, then Operator should connect to IP Address/name of the cluster instead of specific node. This way it doesn’t even need to be aware of clustering happening.

NOTE: This tutorial showed very crude basics of Windows Failover Clustering. It is very important to understand more when deciding to use such solution (such as, what is Shared Storage, Failover Cluster Quorum and how to configure it correctly etc.). Remember to read the official documentation beforehand!

Please note that in several recent projects where Windows Server Clustering has been implemented, it resulted in unstable behavior. Reasons for this included incorrect network adapter teaming that resulted in intermittent connectivity issues (for example when VMQ capable high speed adapters are used), but in some cases no verifiable cause could be found other than to disable the use of Windows Clustering. This is often due to the intricate nature of real-time Operational Technology hardware and networking software which is not well supported through Windows Clustering.

It is recommended that only Adroit/MAPS Clustering is used in these projects and NOT Windows Server Failover Clustering .

Similar failure scenarios have been experienced where Virtual Machine technology (such as HyperV) is used, especially in the case of specific PLC hardware and high speed network communication requirements. Physical Hardware instead of using Virtual Machines are recommended in these cases instead. Also discussed further here: Virtualization (VMWare, HyperV, VirtualBox)