December 13, 2013 · Failover, Read-Access Geo-Redundant Storage, Traffic Manager

Adding failover to your application with Read-Access Geo-Redundant Storage and the Windows Azure Traffic Manager

Today Scott Guthrie announced the public preview of Read-Access Geo-Redundant Storage. Before we start talking about failover, let's take a quick look at how you can create a Storage Account:

Until now, the data replicated to the secondary location of a geo-redundant Storage Account was only accessible after contacting support. With the preview of Read-Access Geo-Redundant Storage, or simply RA-GRS (simply?), you now get read-only access to this replicated copy of your storage account.

"Real" Failover

Traffic Manager has been available for a while now. But if you've ever really used it, you've probably been thinking: "What about my data?" For the Windows Azure SQL Database we have the Data Sync functionality (does this thing really work?), but up until now there was no out-of-the-box solution to make your storage data available in a different datacenter.

Let's make a simple application that takes advantage of RA-GRS and the Traffic Manager.

The Wall

So I've built this little social application (this must be social-app #298439292) which allows you to post messages on someone's wall:

This is a simple application which runs in the West-Europe datacenter (http://wallapp.cloudapp.net/) and uses Table Storage for storing the messages on the wall. These are stored in the wallprod Storage Account, also located in the West-Europe datacenter. If the West-Europe datacenter starts having issues with Compute or Storage, my application will go down.

Traffic Manager

The first thing we'll want to do is create a new Cloud Service in a secondary location. Since my application is deployed in the West-Europe datacenter I'll deploy the "spare" application to the North-Europe datacenter.

Now that my spare Cloud Service is online I'll configure Traffic Manager. I created a new profile called "wallapp" in which I added my Cloud Services as endpoints.

As you can see I've set my profile to Failover mode and the wallapp.cloudapp.net Cloud Service is the first endpoint in the priority list. As soon as this Cloud Service goes down, Traffic Manager will kick in and the wallapp.trafficmanager.net endpoint will point to wallapp-failover.cloudapp.net. Note that in a real scenario I'd have something like www.thewall.com pointing to wallapp.trafficmanager.net.
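
To make the Failover mode a bit more concrete, here is a minimal sketch of the rule described above: walk the priority list and hand out the first endpoint that is currently healthy. This is only a model of the behaviour, not how Traffic Manager is actually implemented. The FailoverProfile class and the isHealthy callback are hypothetical names I made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a Failover profile: endpoints are kept in priority order.
public class FailoverProfile
{
    private readonly List<string> endpoints;

    public FailoverProfile(params string[] endpointsInPriorityOrder)
    {
        endpoints = new List<string>(endpointsInPriorityOrder);
    }

    // Returns the first endpoint the callback reports as healthy,
    // or null when every endpoint is down (an assumption of this model).
    public string Resolve(Func<string, bool> isHealthy)
    {
        foreach (var endpoint in endpoints)
        {
            if (isHealthy(endpoint))
                return endpoint;
        }
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        var profile = new FailoverProfile(
            "wallapp.cloudapp.net",           // primary (West-Europe)
            "wallapp-failover.cloudapp.net"); // spare (North-Europe)

        // Everything healthy: the primary is returned.
        Console.WriteLine(profile.Resolve(e => true));

        // Primary stopped: the spare is returned instead.
        Console.WriteLine(profile.Resolve(e => e != "wallapp.cloudapp.net"));
    }
}
```

The order of the endpoints is the only thing that decides who gets the traffic, which is why the primary Cloud Service has to be first in the list.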

In order to test the failover I'll simply stop the wallapp.cloudapp.net Cloud Service. Once Traffic Manager notices that the application is offline, it can take up to 30 seconds (the DNS TTL) before I'm forwarded to the failover environment. I can test this by navigating to http://wallapp.trafficmanager.net/ or by pinging wallapp.trafficmanager.net (even if PING isn't enabled, you'll see that wallapp.trafficmanager.net points to wallapp-failover.cloudapp.net):

Now if I visit my wall you'll see the following message:

Let's see why we're doing this…

What about my data?

OK, so what are your options when we're talking about failover?

Using the Traffic Manager our application is able to fail over to a secondary location. But if there's an issue with Storage, our application will still break. That's why it's useful to also enable RA-GRS on our storage account. Since this is a preview feature we'll need to activate it first: http://www.windowsazure.com/en-us/services/preview/

After the preview feature is active you can enable RA-GRS on your account:

As you can see the Secondary Region for my Storage Account is North-Europe. Connecting to the Read-Access Storage Account in the Secondary Region works by convention. Just add -secondary to the name of your Storage Account and use the same keys to connect to the Storage Account. In my case I'll be connecting to wallprod-secondary.table.core.windows.net.
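
The naming convention is easy to capture in code. The following is a small sketch that only builds the endpoint addresses. StorageEndpoints is a hypothetical helper (it is not part of the storage client library), and using https is my assumption.

```csharp
using System;

// Hypothetical helper: derives Table Storage endpoints by naming convention.
public static class StorageEndpoints
{
    // Primary endpoint, e.g. https://wallprod.table.core.windows.net/
    public static Uri PrimaryTable(string accountName)
    {
        return new Uri(string.Format("https://{0}.table.core.windows.net/", accountName));
    }

    // Read-only secondary endpoint: same address with "-secondary" appended to the account name.
    public static Uri SecondaryTable(string accountName)
    {
        return new Uri(string.Format("https://{0}-secondary.table.core.windows.net/", accountName));
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(StorageEndpoints.PrimaryTable("wallprod"));
        Console.WriteLine(StorageEndpoints.SecondaryTable("wallprod"));
    }
}
```

Keep in mind that only the address changes. The account name and keys you authenticate with stay the same, and the secondary endpoint only accepts read operations.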

Configuration

Since I'm working with a Cloud Service I can take advantage of different Service Configurations to configure how the Failover version of the application should work. Start by right-clicking your Cloud Service, choose Manage Configurations and take a copy of the Cloud configuration (I called it CloudFailover). This is what I changed in that copy:

  1. I made sure the storage account points to the secondary Read-Access version.
  2. I added a property called IsFailover and set its value to 1. This allows me to disable certain features or show warning messages.
  3. I changed the Diagnostics Storage Account for my failover configuration to use a Storage Account in North-Europe (you wouldn't want diagnostics to write to the primary location, which might be broken).

This means that the failover deployment will connect to the read-only Storage Account. It could be possible that some data is missing (maybe the replication wasn't complete before Storage in West-Europe went down), but at least my users will still have access to the application (even though some features might not completely work).

Since I now have a setting which tells me whether the application is deployed in a failover environment, I can read this setting in my web application and use it to show notifications or to limit access to specific features. This is how I'm showing the notification that posting a message is not possible:

@if (ViewBag.IsFailover)
{
    <div class="alert alert-danger">We're having some technical issues. Until we solve the issue, you won't be able to write new posts.</div>
}

<div class="jumbotron">  
    <h1>@Model.Username's Wall</h1>
    @if (!Model.Messages.Any())
    {
        <p>There are no messages. Hurry up and post a message!</p>
    }
</div>  

Or how we're restricting access to the "Post a message" feature:

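using System.Web.Mvc;
using Microsoft.WindowsAzure.ServiceRuntime;

public class MessagesController : Controller
{
    // Runs before every action: expose the IsFailover setting from the
    // active Service Configuration to the actions and the views.
    protected override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        ViewBag.IsFailover = RoleEnvironment.GetConfigurationSettingValue("IsFailover") == "1";
        base.OnActionExecuting(filterContext);
    }

    // Reading the wall only needs read access, so it works in both modes.
    public ActionResult Index(string username)
    {
        return View(new MessagesModel()
        {
            Username = username,
            Messages = MessageService.List(username)
        });
    }

    [HttpPost]
    public ActionResult Post(PostMessageModel model)
    {
        // The secondary Storage Account is read-only, so refuse writes in failover mode.
        if (ViewBag.IsFailover)
            return View("ReadOnly");
        MessageService.Add(model.Username, model.Subject, model.Body);
        return RedirectToAction("Index", new {username = model.Username});
    }
}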

And that's it. Now I can deploy my application to the failover environment with a specific Service Configuration where the IsFailover option is set to 1. This will cause the application to run in degraded mode (showing messages and limiting certain features).

Considerations

When the Traffic Manager monitors your application it will connect to your homepage by default. But if your homepage doesn't use storage (or Service Bus or whatever…), Traffic Manager might think everything is OK while it's not. That's why it's important to have a custom health probe (like /HealthCheck.aspx) which also checks if the services you depend on are working correctly.
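
As a rough sketch of what such a probe could do: run a check for every dependency and only answer with HTTP 200 (which is what Traffic Manager treats as healthy) when all of them pass. The HealthCheck class and the registered checks below are hypothetical placeholders. A real probe would, for example, try to read a row from Table Storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical health check aggregator; the dependency checks are supplied by the caller.
public class HealthCheck
{
    private readonly Dictionary<string, Func<bool>> checks =
        new Dictionary<string, Func<bool>>();

    public void Register(string name, Func<bool> check)
    {
        checks[name] = check;
    }

    // Returns 200 when every dependency check passes, otherwise 503.
    // A check that throws counts as a failure.
    public int GetStatusCode()
    {
        foreach (var check in checks)
        {
            try
            {
                if (!check.Value())
                    return 503;
            }
            catch (Exception)
            {
                return 503;
            }
        }
        return 200;
    }
}

public static class Program
{
    public static void Main()
    {
        var health = new HealthCheck();

        // Placeholder checks standing in for real calls to the services you depend on.
        health.Register("storage", () => true);
        health.Register("servicebus", () => { throw new TimeoutException(); });

        // One check throws, so the probe reports 503.
        Console.WriteLine(health.GetStatusCode());
    }
}
```

In an MVC application the status code could be returned from an action with new HttpStatusCodeResult(code). In a Web Forms page like /HealthCheck.aspx you would set Response.StatusCode instead.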

When you build an application, keep in mind that systems will fail eventually, so you'd better come prepared. Read-Access Geo-Redundant Storage and the Traffic Manager make this a lot easier to do.
