Blog - HubSpot Product Team

Our Journey to Multi-Region: An Introduction

Written by Jonathan Haber (He/Him) | Mar 23, 2022

This is the first installment of a five-part series about HubSpot’s year-long project to rework our platform for multi-region support.

In July 2021, HubSpot launched an EU data center. Prior to that, HubSpot systems were deployed to a single AWS region (spread across three availability zones in us-east-1). The EU launch was the culmination of a year-long project to rework our platform for multi-region support. The project took more than 10,000 pull requests and drew contributions from nearly all of our 1,000+ engineers. Work at this scale required collaboration across the organization, both to make design decisions and to solve the challenges we encountered along the way.

As with any project, we first needed to understand the motivations and goals for the project to help inform all of the design decisions and trade-offs. When we started the multi-region project, we laid out four primary goals:

  1. Our approach to multi-region should give HubSpot customers control over where their data is processed and stored. Businesses are becoming increasingly interested in the details of where their data is processed and stored, and HubSpot customers are no exception. This is true across the globe, but especially within the EU. 
  2. Our approach to multi-region should improve performance for customers located outside of the United States. While a customer is using HubSpot, our frontend apps constantly make API calls to read and write data. Currently, each of these API calls needs to make a round trip to AWS us-east-1 (where HubSpot systems are hosted). For customers on other continents, these round trips have high latency and make the HubSpot product feel sluggish.
  3. Our approach to multi-region should isolate failures and prevent global outages. When there's an AWS outage in us-east-1 or when HubSpot has an infrastructure issue, it currently affects 100% of our customers. In the long term, we want these global outages to be the exception, rather than the norm.
  4. Our approach to multi-region should allow us to more easily scale the whole HubSpot platform. Over the past few years, HubSpot has grown by leaps and bounds in terms of customers and data. This is great for the business, but it means that all of our systems need to scale to support that growth. This has created plenty of exciting engineering challenges, but solving these scaling issues can take engineering focus away from other product improvements.

Based on these goals, we decided that our multi-region design should move HubSpot towards a "pod" architecture. This would allow us to have multiple, independent copies of the entire HubSpot platform, each serving a subset of our customers. Normally, each of these copies would be called a pod. However, to avoid confusion with Kubernetes Pods, we invented a HubSpot-specific term: Hublet. Here are some of the basic guardrails we established:

  • Each Hublet will run a full copy of the HubSpot platform, including all infrastructure, databases, and backend systems
  • Each Hublet will be hosted in a single AWS region, with backups replicated to a secondary region
  • Each Hublet will be named with a geographic identifier followed by an incrementing number, such as na1 or eu1 (to be followed by na2 and eu2, etc.)
  • External traffic will primarily be routed via Hublet-specific DNS records (for example, EU customers will access the product via app-eu1.hubspot.com, which makes API calls to api-eu1.hubspot.com)
  • Each Hublet will get its own AWS account and VPC
  • Databases will be locked down at the network level to prevent accidental cross-Hublet traffic
  • To the extent possible, each Hublet will not communicate with or depend on any other Hublets
  • Secrets such as encryption/decryption keys and 3rd party API keys will be unique to each Hublet, and only stored within that Hublet
  • Each HubSpot account will be hosted in a single Hublet, chosen at creation time (i.e., a single HubSpot account can't have half of its data in the US and half in the EU)
  • All of our existing infrastructure in us-east-1 will become the "na1" Hublet, and therefore all existing HubSpot customers are hosted on na1
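To make the naming and DNS guardrails above concrete, here is a minimal Python sketch. It is an illustration, not HubSpot's implementation: the `Hublet` class and `parse_hublet` helper are hypothetical names, and the hostname pattern is inferred from the two eu1 examples given above (app-eu1.hubspot.com and api-eu1.hubspot.com). Whether other Hublets follow exactly the same pattern is an assumption.

```python
import re
from dataclasses import dataclass

# A geographic identifier (lowercase letters) followed by a positive integer,
# e.g. "na1" or "eu1".
_HUBLET_NAME = re.compile(r"([a-z]+)([1-9][0-9]*)")


@dataclass(frozen=True)
class Hublet:
    """Hypothetical value type for a Hublet identifier."""

    geo: str    # e.g. "na" or "eu"
    index: int  # 1, 2, 3, ...

    @property
    def name(self) -> str:
        return f"{self.geo}{self.index}"

    def next_in_geo(self) -> "Hublet":
        """The Hublet that would follow this one in the same geography."""
        return Hublet(self.geo, self.index + 1)

    def hostname(self, service: str) -> str:
        # Pattern taken from the eu1 examples in the post.
        # Assumption: other Hublets use the same "<service>-<hublet>" form.
        return f"{service}-{self.name}.hubspot.com"


def parse_hublet(name: str) -> Hublet:
    """Parse a name such as "eu1" into its geography and number."""
    match = _HUBLET_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid Hublet name: {name!r}")
    return Hublet(geo=match.group(1), index=int(match.group(2)))


eu1 = parse_hublet("eu1")
print(eu1.hostname("app"))     # app-eu1.hubspot.com
print(eu1.hostname("api"))     # api-eu1.hubspot.com
print(eu1.next_in_geo().name)  # eu2
```

Keeping the Hublet name as the only variable part of the hostname means the frontend for one Hublet can derive its API host without consulting any other Hublet, which fits the isolation guardrail.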

In the long term, you can imagine dozens of Hublets spread around the world. This gives our customers relatively granular control over where their data is processed and stored. It also improves performance of the HubSpot product, by bringing the data closer to the customer. And because each Hublet is well-isolated from the others, outages will only affect a small fraction of our customers. Finally, we can define self-imposed scaling limits. As we near these limits in a particular Hublet, we can stop adding customers to that Hublet and launch a new one.
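The scaling-limit idea can be sketched in a few lines of Python. Everything here is hypothetical: the post does not say how limits are measured, so this sketch assumes a simple cap on account count, and `HubletCapacity`, `assign_account`, and `NoOpenHublet` are made-up names. It also reflects the rule that an account is pinned to one Hublet at creation time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HubletCapacity:
    name: str           # e.g. "eu1"
    geo: str            # e.g. "eu"
    account_count: int  # accounts currently hosted
    max_accounts: int   # assumed self-imposed scaling limit

    def accepts_new_accounts(self) -> bool:
        return self.account_count < self.max_accounts


class NoOpenHublet(Exception):
    """Every Hublet in the requested geography is at its limit."""


def assign_account(
    account_id: int,
    geo: str,
    hublets: List[HubletCapacity],
    assignments: Dict[int, str],
) -> str:
    """Pin a new account to the first open Hublet in the chosen geography."""
    # An account lives in exactly one Hublet, chosen at creation time,
    # so an existing assignment is never changed.
    if account_id in assignments:
        return assignments[account_id]
    for hublet in hublets:
        if hublet.geo == geo and hublet.accepts_new_accounts():
            hublet.account_count += 1
            assignments[account_id] = hublet.name
            return hublet.name
    # Signal that it is time to launch a new Hublet in this geography.
    raise NoOpenHublet(f"no open Hublet in {geo!r}")


hublets = [
    HubletCapacity("eu1", "eu", account_count=2, max_accounts=2),  # full
    HubletCapacity("eu2", "eu", account_count=0, max_accounts=2),
]
assignments: Dict[int, str] = {}
print(assign_account(101, "eu", hublets, assignments))  # eu2
```

In this sketch a full Hublet keeps serving its existing accounts; it only stops receiving new ones.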

However, this design isn't without its own set of trade-offs. Because each Hublet is isolated and hosted in a single AWS region, region-level outages will still cause downtime (for the subset of HubSpot customers hosted in that region). Also, each HubSpot customer needs to choose a single location to host their account. For larger companies with employees around the world, there's no single location that will provide low latency to everyone. We decided that these limitations are acceptable for now, but we may try to mitigate them in the future.

While the long-term plan is to have dozens of Hublets spread around the world, we decided to take a more incremental approach to the project. The first milestone would be to launch a new Hublet in the EU, named eu1. To do so, we had to "Hublet-ize" all of our systems, which was a one-time cost that will make future launches much more straightforward.

In future posts, we'll talk about the details of this project and some of the technical challenges we had to overcome.

These are the types of challenges we solve on a daily basis at HubSpot. If projects like this sound exciting to you, we’re hiring! Check out our open positions and apply.