Tactics to Safely Rewrite a Major API

Here at HubSpot, we ship new code quite often. Development at such a rapid pace only works if you can ensure as little disruption as possible (if any) for your customers by using strategic patterns and techniques. We have a system in place that allows us to toggle features on a per-customer basis, and we recently applied it to the successful rewrite of an existing, high-traffic API. We implemented this with our own homegrown gating system and our nginx load-balancing tier, but the general pattern could be applied to other architectures, too. Here's a rundown of how we did it and the steps you can take to address issues with a new system before they impact your overall customer base.

Step 1: Create an internal proxy in the nginx load balancer

The first thing we want to do is allow the load balancer to transparently proxy a request to the existing API on a conditional basis, as determined by our new application (in this case, the condition is the evaluation of a feature gate). Nginx has built-in support for internal proxying via a feature known as X-Accel (its take on x-sendfile). If, during the processing of a reverse-proxied request, nginx receives a special response header from the upstream server, X-Accel-Redirect, containing a URI, it will internally proxy the request to the named location and return the response directly to the originating client.


Here's our internal proxy setup:
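The original config snippet didn't survive into this copy of the post, but a minimal sketch of the pattern looks like the following. The location name, header names, and resolver address are illustrative placeholders, not our actual production values:

```nginx
# Named internal location the app targets via "X-Accel-Redirect: /internal-proxy".
location /internal-proxy {
    internal;                  # only reachable via X-Accel-Redirect, never from clients
    resolver 8.8.8.8;          # external resolver; placeholder address

    # Read the X-Downstream-Url response header set by the upstream app.
    # Using a variable in proxy_pass forces nginx to re-resolve the host via DNS.
    set $downstream $upstream_http_x_downstream_url;
    proxy_pass $downstream;    # must be an absolute URL, scheme included
}
```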
The critical bit here is that this looks for an additional response header from the upstream server, X-Downstream-Url, and uses that value as the internal proxy source. Assigning the value to a variable forces nginx to resolve the host via DNS, which is absolutely necessary in an EC2-like environment where your hosts can be reassigned IPs on a whim, especially if you use ELBs. Note: the downstream URL must be an absolute URL, complete with scheme (http://), or else it won't work, since we're using an external DNS resolver.

Step 2: Check the gate early in a request filter in the new API

Next, we need to create a request filter in our new API service which can evaluate our condition (the gate check) and trigger the redirect to the existing API if necessary. Since we are a Dropwizard shop, our filter is naturally a ContainerRequestFilter:
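The filter itself was lost from this copy of the post, so here is a hedged sketch of the pattern. The class name, the gating client interface, the gate name, the customer-ID header, and the /internal-proxy location are all illustrative assumptions, not HubSpot's actual code:

```java
import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.container.PreMatching;
import javax.ws.rs.core.Response;

// Hypothetical sketch of the gate-check filter described above.
@PreMatching
public class LegacyApiRedirectFilter implements ContainerRequestFilter {

    // Stand-in for whatever feature-gating client you use.
    public interface GateService {
        boolean isUngated(String gateName, String customerId);
    }

    private final GateService gates;
    private final String legacyBaseUrl; // e.g. "http://legacy-api.internal" (absolute, scheme included)

    public LegacyApiRedirectFilter(GateService gates, String legacyBaseUrl) {
        this.gates = gates;
        this.legacyBaseUrl = legacyBaseUrl;
    }

    @Override
    public void filter(ContainerRequestContext request) {
        // Illustrative: however you identify the customer on the request.
        String customerId = request.getHeaderString("X-Customer-Id");

        if (gates.isUngated("new-api-rollout", customerId)) {
            return; // gate open: let the new API serve the request
        }

        // Gate closed: short-circuit and tell nginx to internally proxy to the old API.
        String path = request.getUriInfo().getRequestUri().getRawPath();
        request.abortWith(Response.ok()
                .header("X-Accel-Redirect", "/internal-proxy")    // named internal location in nginx
                .header("X-Downstream-Url", legacyBaseUrl + path) // absolute URL, scheme included
                .build());
    }
}
```

Because nginx intercepts the X-Accel-Redirect response from the upstream, the client never sees this short-circuit response; it only sees the proxied response from the existing API.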

Step 3: Reroute all existing API traffic to the new API load balancer

This is the potentially risky step, as we're now affecting all existing traffic, even though our gate will initially be set up to let no traffic past our new filter. Once the traffic has been rerouted, however, we are free to ungate and/or re-gate customers whenever we need to!

Step 4: Validate, Verify, Vanquish (the old API)

Our final step is to validate that the new API functions correctly. We've built a number of specialized tools for this: side-by-side response comparison, automated smoke and regression tests, and visual diff checking, all of which would make for a good follow-up post. Stay tuned!


As you can see, using an internal proxy at the load balancer can be a powerful technique for rolling out new code in a controlled fashion. An added benefit is that you aren't tying up your new API application server with the I/O of proxying requests to the existing API during the rollout period.


Written by Jared Stehler

Software Engineer @ HubSpot
