Testing Sql Server AlwaysOn in AWS

This is part deux from the post Configuring Sql Server AlwaysOn in AWS.  I’m calling it “Testing Sql Server AlwaysOn in AWS 2; The Reckoning”.

Keep in mind, this is a multi-subnet configuration.  This plays into the testing below, but it actually works out to our advantage.  I’ve not tested AlwaysOn in a single subnet configuration yet.

Once again, a shout out is deserved to Jeremiah Peschka (blog | twitter), as I’d never have gotten this far without this post.

So, now we have our listener set up, ip’s configured, and we’re ready to start seeing how well the failovers work.  Let’s start by pinging the listener from the primary server:

image

So far so good.  Let’s ping it from the Secondary:

image

Awesome.  Let’s ping it from the DNS server:

image

Looking good.

So, let’s check connections with the native client.  On the desktop of the Primary, right-click and create a file called test.udl and save it.  Next, double-click the file.  It should bring up a dialog like this:

image

Click on the Provider tab, and change the provider to SQL Server Native Client 11.0, click Next and fill out the connection information, entering in your listener name in the “Select or enter a server name” drop-down.  Next, select or type in a database name and click Test Connection.

image image

If you typed everything in correctly, you should see this:

image

You should get the same results on the secondary as well.

Now, let’s manually fail the database over and re-test.  Go on the primary server, right-click on the Availability Group and click Failover…

image

Go through the various steps required to failover the Availability group to the secondary server.  I’ll skip images and descriptions for this process, as this is not the main focus of this post.

So, the Availability Group has been failed over.  Let’s test the primary, secondary & dns server connectivity to the listener via ping again:

Secondary (was primary):
image
No Love (host unreachable)
Primary (was secondary):
image
Love
DNS Server:

image
No Love (timeout)

 

So, at this point only 1 out of 3 servers can connect.  Let’s check our UDL files again on the Secondary (used to be primary):

image

No love.  But wait!  Click on the All tab and find the property named Extended Properties, click Edit Value… and enter in MultiSubnetFailover=Yes.  Now, select the Connection tab and click Test Connection again.

image

 

Yay!

image

The reason this works, is that the provider will look for this property and will aggressively try both IP addresses listed in DNS.  Refer to the section Connecting with MultiSubnetFailover in this article for more details.

image

What I’ve found is the process AWS uses for DNS updating itself to point to the correct host is inconsistent at best.  Sometimes it will update itself after 5 minutes, sometimes 15, sometimes never.  It’s no big deal for clients that use the SQL Server Native Client connection with the MultiSubnetFailover=Yes setting, but for other clients it will definitely present a problem. 

Let’s try one more thing.  Let’s cause an automatic failover.  Let’s go into the AWS console and stop the instance that the current primary is running on.  In my case, the Primary is running on RHDB1. 

image

Go into the AWS EC2 dashboard and stop the instance.

image

Okay, automatic failover.  RHDB2 is now the primary. 

image

Ping from the DNS server resolves, the UDL resolves, pinging the listener resolves.  Just for grins, fire the other server back up and lets shut down the primary via the AWS console again.  Currently, the Primary is RHDB2 and RHDB1 is the secondary. 

image

Let’s kill RHDB2.

image

Automatic failover again:

image

Pinging the listener from RHDB1 works just fine.  The UDL connection also works fine.  Ping from the DNS server, not immediately, but it seemed to resolve after about a minute.

By no means is this a 100% thorough test, but my results are promising.  Automatic failover seems to work just fine in this case.  What doesn’t work reliably is the DNS resolving.  But, if you are using the MultiSubnetFailover when connecting with the sql server native client, it more or less negates the dns not resolving issues. 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.