Warming up your WCF Service on an Azure Cloud Service

You might remember me writing on how to warm up your App Service instances when moving between slots. The applicationInitialization-element is supported on nearly every IIS web server nowadays and works great, until it doesn’t.

I’ve been working on a project which has been designed as what I’d like to call a distributed monolith. To give you an oversimplified overview, here’s what we have.

First off we have a single page web application which communicates directly with an ASP.NET Web API, which in turn communicates with a backend WCF service, which in turn also communicates with a bunch of other services. You can probably imagine I’m not very happy with this kind of design, but I can’t do much about it currently.

One of the problems with this design is the cold start you get whenever a service is deployed.

Since we’re deploying continuously to test & production there are a lot of cold starts. Using the applicationInitialization-element really helped with spinning up our App Services, but we were still facing some slowness whenever the WCF service was being deployed to any of our environments. This service is being deployed to an ‘old-fashioned’ Cloud Service, so we figured the applicationInitialization-element should just work as it’s still running on IIS.
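For reference, an applicationInitialization configuration in web.config typically looks something like the sketch below. The `/warmup` path is a placeholder for illustration, not our actual endpoint.

```xml
<system.webServer>
  <applicationInitialization doAppInitAfterRestart="true">
    <!-- Hypothetical warmup path; IIS issues a GET request to it when the application starts. -->
    <add initializationPage="/warmup" />
  </applicationInitialization>
</system.webServer>
```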

After having added some logging to our warmup endpoints in the WCF service we quickly discovered these methods were never hit whenever the service was being deployed or swapped. Strange!

I started looking inside the WCF configuration and discovered HTTP GET requests should just work as expected.

<behaviors>
  <serviceBehaviors>
    <behavior name="MyBehavior">
      <serviceMetadata httpsGetEnabled="true" httpGetEnabled="true" />
    </behavior>
  </serviceBehaviors>
</behaviors>

The thing is, we apparently also have some XML transformation going on for all of our cloud environments, so these attributes were set to false for both the Test and Production services. This means we can’t do a warmup request via the applicationInitialization-element as it’s doing a normal GET request to the specified endpoints.
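A transform that flips these attributes typically looks something like the sketch below. This is an illustration of the XDT syntax, not our actual transform file; only the behavior name is taken from the snippet above.

```xml
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <system.serviceModel>
    <behaviors>
      <serviceBehaviors>
        <!-- Match the behavior by its name attribute. -->
        <behavior name="MyBehavior" xdt:Locator="Match(name)">
          <!-- Overwrite only the two GET-related attributes. -->
          <serviceMetadata httpsGetEnabled="false" httpGetEnabled="false"
                           xdt:Transform="SetAttributes(httpsGetEnabled,httpGetEnabled)" />
        </behavior>
      </serviceBehaviors>
    </behaviors>
  </system.serviceModel>
</configuration>
```

A transform like this is easy to overlook, because the base configuration file keeps showing `true` for both attributes.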

Since we still need this WCF service to be hot right after a deployment and VIP swap, I had to come up with something else. There are multiple things you can do at this point, like creating a PowerShell script which runs right after a deployment, or running a smoke test in your release pipeline which touches all instances. None of them felt quite right.

What I came up with is extending the WebRole.OnStart method! This way I know for sure that my code is executed every time the WCF service starts. Plus, all of the warmup source code is located in the same project as the service implementation, which makes it easier to track.

In order to do a warmup request you need to do a couple of things. First off, we need to figure out the local IP address of the currently running instance. Once we’ve retrieved this IP address we can use it to do an HTTP request to our (local) warmup endpoint. As I mentioned earlier, HTTP GET is disabled via our configuration, so we need to use a different method, like POST.
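The OnStart hook itself is only a few lines. Here is a minimal sketch, assuming the role class derives from RoleEntryPoint in the Azure ServiceRuntime SDK and that the Warmup method shown further down lives in the same class. The empty Warmup body is a placeholder so the snippet stands on its own.

```csharp
using Microsoft.WindowsAzure.ServiceRuntime;

public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // The instance is not put into the load balancer rotation until
        // OnStart returns, so the warmup runs before any traffic arrives.
        Warmup();

        return base.OnStart();
    }

    private void Warmup()
    {
        // Placeholder: the actual warmup request is shown later in this post.
    }
}
```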

There’s an excellent blog post by Kevin Williamson on how to implement such a feature. In this post he states the following:

If your website takes several minutes to warmup (either standard IIS/ASP.NET warmup of precompilation and module loading, or warming up a cache or other app specific tasks) then your clients may experience an outage or random timeouts. After a role instance restarts and your OnStart code completes then your role instance will be put back in the load balancer rotation and will begin receiving incoming requests. If your website is still warming up then all of those incoming requests will queue up and time out. If you only have 2 instances of your web role then IN_0, which is still warming up, will be taking 100% of the incoming requests while IN_1 is being restarted for the Guest OS update. This can lead to a complete outage of your service until your website is finished warming up on both instances. It is recommended to keep your instance in OnStart, which will keep it in the Busy state where it won’t receive incoming requests from the load balancer, until your warmup is complete.

He also posts a small code snippet which can be used as inspiration for implementation, so I went and used this piece of code.

One of the problems you will face when implementing this is that your IoC-container isn’t spun up yet. This doesn’t have to be a problem per se, but I wanted to add some logging to this code in order to check if everything was working correctly (or not). Well, this isn’t possible!

In order to add some kind of logging, I had to resort to writing entries to the Windows Event Log. This isn’t ideal of course, but it’s the cleanest way I could come up with. By adding entries to the event log you ‘only’ have to enable RDP on your cloud service in order to check what has happened. Needless to say, the RDP settings are reset every time you deploy a new update to the cloud service, so enabling it is quite safe in our scenario.

Adding this logging to my solution really saved the day, because after having added the HTTP request to the OnStart method I still couldn’t see the warmup events being triggered. By checking the Event Log I quickly discovered this had to do with the certificate installed on the endpoint. The error I was facing told me the following:

Could not establish a trust relationship for the SSL/TLS secure channel

This makes sense of course, as the certificate is registered on the hostname and we’re now making a request directly to the internal IP address, which obviously is different (service.customer.com !== 12.34.56.78). Removing the certificate check is rather easy, but you should only do this when you’re 100% sure it’s safe in your situation. If you remove the certificate check on a global scope you’re opening yourself up to a massive amount of problems!
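If you’d rather not accept every certificate, one narrower option is to tolerate only the hostname mismatch and still reject anything else, such as an untrusted chain. This is a minimal sketch and not what I ended up using. The class and method names are hypothetical, and whether this is strict enough depends on your environment.

```csharp
using System.Net;
using System.Net.Security;

public static class WarmupCertificatePolicy
{
    // Accepts a certificate that is fully valid, or whose only problem is
    // that its name doesn't match the host we connected to (the instance IP).
    public static bool IsAcceptable(SslPolicyErrors errors)
    {
        return errors == SslPolicyErrors.None
            || errors == SslPolicyErrors.RemoteCertificateNameMismatch;
    }

    // Applies the policy to a single request only, never globally.
    public static void Apply(HttpWebRequest request)
    {
        request.ServerCertificateValidationCallback =
            (sender, certificate, chain, errors) => IsAcceptable(errors);
    }
}
```

Because the callback is set on the request itself, other outgoing calls in the same process keep their normal certificate validation.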

For future reference, here’s a piece of code which resembles what I came up with.

// Requires: using System.Diagnostics; using System.Net; using System.Net.Sockets;
private void Warmup()
{
    string loggingPrefix = $"{nameof(WebRole)} - {nameof(Warmup)} - ";

    // The IoC-container isn't available yet, so log to the Windows Event Log.
    using (var eventLog = new EventLog("Application"))
    {
        eventLog.Source = "Application";
        eventLog.WriteEntry($"{loggingPrefix}Starting {nameof(Warmup)}.", EventLogEntryType.Information);

        // Find the IPv4 address of the current instance (the last one found wins).
        IPHostEntry ipEntry = Dns.GetHostEntry(Dns.GetHostName());
        string ip = null;
        foreach (IPAddress ipaddress in ipEntry.AddressList)
        {
            if (ipaddress.AddressFamily == AddressFamily.InterNetwork)
            {
                ip = ipaddress.ToString();
                eventLog.WriteEntry($"{loggingPrefix}Found IPv4 address is `{ip}`.", EventLogEntryType.Information);
            }
        }

        // Without an address the URL below would be invalid, so stop here.
        if (ip == null)
        {
            eventLog.WriteEntry($"{loggingPrefix}No IPv4 address found, skipping {nameof(Warmup)}.", EventLogEntryType.Warning);
            return;
        }

        // HTTP GET is disabled in our configuration, so send an empty POST instead.
        string urlToPing = $"https://{ip}/V1/MyService.svc/WarmUp";
        var req = (HttpWebRequest)WebRequest.Create(urlToPing);
        req.Method = "POST";
        req.ContentLength = 0;
        req.ContentType = "application/my-webrole-startup";

        RemoveCertificateValidationToMakeRequestOnInstanceInsteadOfPublicHostname(req);

        try
        {
            eventLog.WriteEntry($"{loggingPrefix}Posting to `{urlToPing}`.", EventLogEntryType.Information);
            // Dispose the response so the underlying connection is released.
            using (req.GetResponse())
            {
            }
        }
        catch (WebException webException)
        {
            // Warmup failed for some reason.
            eventLog.WriteEntry($"{loggingPrefix}Posting to endpoint `{urlToPing}` failed for reason: {webException.Message}.", EventLogEntryType.Error);
        }
        eventLog.WriteEntry($"{loggingPrefix}Finished {nameof(Warmup)}.", EventLogEntryType.Information);
    }
}

private static void RemoveCertificateValidationToMakeRequestOnInstanceInsteadOfPublicHostname(HttpWebRequest req)
{
    req.ServerCertificateValidationCallback = delegate { return true; };
}

This piece of code is based on the sample provided by Kevin Williamson, with some Event Log logging added, a POST request, and the certificate check removed.

Hope it helps you when facing a similar issue!

