This is part deux from the post Configuring Sql Server AlwaysOn in AWS. I’m calling it “Testing Sql Server AlwaysOn in AWS 2; The Reckoning”.
Keep in mind, this is a multi-subnet configuration. This plays into the testing below, but it actually works out to our advantage. I’ve not tested AlwaysOn in a single subnet configuration yet.
Once again, a shout out is deserved to Jeremiah Peschka (blog | twitter), as I’d never have gotten this far without this post.
So, now we have our listener set up, ip’s configured, and we’re ready to start seeing how well the failovers work. Let’s start by pinging the listener from the primary server:
So far so good. Let’s ping it from the Secondary:
Awesome. Let’s ping it from the DNS server:
So, let’s check connections with the native client. On the desktop of the Primary, right-click and create a file called test.udl and save it. Next, double-click the file. It should bring up a dialog like this:
Click on the Provider tab, and change the provider to SQL Server Native Client 11.0, click Next and fill out the connection information, entering in your listener name in the “Select or enter a server name” drop-down. Next, select or type in a database name and click Test Connection.
If you typed everything in correctly, you should see this:
You should get the same results on the secondary as well.
Now, let’s manually fail the database over and re-test. Go on the primary server, right-click on the Availability Group and click Failover…
Go through the various steps required to failover the Availability group to the secondary server. I’ll skip images and descriptions for this process, as this is not the main focus of this post.
So, the Availability Group has been failed over. Let’s test the primary, secondary & dns server connectivity to the listener via ping again:
|Secondary (was primary):
No Love (host unreachable)
|Primary (was secondary):
So, at this point only 1 out of 3 servers can connect. Let’s check our UDL files again on the Secondary (used to be primary):
No love. But wait! Click on the All tab and find the property named Extended Properties, click Edit Value… and enter in MultiSubnetFailover=Yes. Now, select the Connection tab and click Test Connection again.
The reason this works, is that the provider will look for this property and will aggressively try both IP addresses listed in DNS. Refer to the section Connecting with MultiSubnetFailover in this article for more details.
What I’ve found is the process AWS uses for DNS updating itself to point to the correct host is inconsistent at best. Sometimes it will update itself after 5 minutes, sometimes 15, sometimes never. It’s no big deal for clients that use the SQL Server Native Client connection with the MultiSubnetFailover=Yes setting, but for other clients it will definitely present a problem.
Let’s try one more thing. Let’s cause an automatic failover. Let’s go into the AWS console and stop the instance that the current primary is running on. In my case, the Primary is running on RHDB1.
Go into the AWS EC2 dashboard and stop the instance.
Okay, automatic failover. RHDB2 is now the primary.
Ping from the DNS server resolves, the UDL resolves, pinging the listener resolves. Just for grins, fire the other server back up and lets shut down the primary via the AWS console again. Currently, the Primary is RHDB2 and RHDB1 is the secondary.
Let’s kill RHDB2.
Automatic failover again:
Pinging the listener from RHDB1 works just fine. The UDL connection also works fine. Ping from the DNS server, not immediately, but it seemed to resolve after about a minute.
By no means is this a 100% thorough test, but my results are promising. Automatic failover seems to work just fine in this case. What doesn’t work reliably is the DNS resolving. But, if you are using the MultiSubnetFailover when connecting with the sql server native client, it more or less negates the dns not resolving issues.