Written by Ray Mattingly, Engineering Lead, HBase @ HubSpot
At HubSpot, managing resource usage across our HBase clusters is a critical and difficult problem. Our HBase clusters serve over 25 million requests per second at our daily peak traffic. Thousands of distinct background processes and user-facing web services generate traffic against our most critical HBase clusters, and this traffic is often routed through a small number of internal microservices that act as a dedicated proxy layer for these clusters. With this architecture, a single bad job or worker could generate pathological load on the underlying HBase cluster, causing reliability issues across every feature that depends on it.
A concrete example of this applications-to-proxy-to-HBase pipeline is our CRM. The CRM is the centerpiece of the HubSpot platform, and its basic data type is a “CRM object.” HubSpot CRM has your standard CRM objects: contacts, companies, deals, etc., but it also has support for “custom objects” that can be flexibly designed to meet each customer’s unique requirements. This framework is powerful, customizable, and is used to some degree across virtually every HubSpot feature. Here is a very simplified example of how many different systems at HubSpot may interact with the CRM objects that are stored in HBase:
In this post I will explain how to achieve scalable resource sharing both at the cluster level and per RegionServer.
Introductory Glossary
HBase
HBase is a distributed, scalable, big data store modeled after Google's Bigtable. It's part of the Apache Hadoop ecosystem and is designed to provide random, low-latency read/write access to huge datasets.
HBase Tables
Data in HBase is organized and accessed through tables, which consist of lexicographically sorted rows. The rows in each table are partitioned into regions, which are the basic building blocks for scalability and distribution.
HBase Regions
A region is a subset of a table containing all the rows between a start row and an end row. As data grows, regions can be split and merged automatically to maintain system performance. Regions are spread across the RegionServers in your cluster to distribute the read/write workload for your table.
HBase RegionServers
An HBase RegionServer is the process responsible for handling read and write requests. Each RegionServer is responsible for some number of regions. This post will often refer to an HBase deployment as an “HBase cluster”, meaning a cluster of HBase RegionServers.
Hotspots
An HBase hotspot occurs when a disproportionate amount of traffic is directed to a single RegionServer. This can lead to performance bottlenecks as that server becomes overloaded, resulting in slower response times and reduced reliability. Hotspots often arise from poor data organization, but can also be caused by application layer bugs or general misunderstandings regarding “how much traffic is too much?” When one system at HubSpot generates a hotspot that causes an outage for other users of a given cluster, we refer to it as a “noisy neighbor hotspot.”
HBase Quotas
HBase Quotas are the out-of-the-box solution for managing resource sharing, and they come in several forms. There are space quotas, which dictate how large the data of tables, namespaces, etc. may become; this post ignores that type entirely, and we don’t use space quotas at HubSpot. Separately, there are throttle quotas, which restrict throughput. For example, one can specify that only 10 writes may be executed against a given table per minute, or that only 1000 requests may be executed by a given user per second. Throughput limits can be defined based on request counts, write size, and read IO. Throttles can also be optionally configured on a per-machine basis, meaning that a 1000 request/second/machine limit would restrict its subjects to executing only 1000 requests/second against any given RegionServer (rather than 1000/second against the whole cluster). In our opinion, per-machine throttles are a nice choice because they scale horizontally with the cluster and protect against hotspots. This post will primarily focus on per-RegionServer, per-second, user-specific throttles.
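For illustration, here is a minimal sketch of how the two example throttles above could be defined through HBase’s Admin API (the table and user names here are hypothetical):

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.quotas.QuotaSettingsFactory;
import org.apache.hadoop.hbase.quotas.ThrottleType;

public final class ExampleThrottles {

  // Defines the two example throttle quotas described above.
  public static void defineThrottles(Admin admin) throws IOException {
    // Allow at most 10 writes per minute against a given table.
    admin.setQuota(QuotaSettingsFactory.throttleTable(
        TableName.valueOf("some-table"), ThrottleType.WRITE_NUMBER, 10, TimeUnit.MINUTES));

    // Allow a given user at most 1000 requests per second.
    admin.setQuota(QuotaSettingsFactory.throttleUser(
        "some-user", ThrottleType.REQUEST_NUMBER, 1000, TimeUnit.SECONDS));
  }
}
```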
Why is Resource Sharing a Problem?
Users don’t always behave. With thousands of distinct systems generating workloads against such critical clusters, we need to assume that some systems will go awry. In this real example, you can see the load across every RegionServer drastically increase by approximately 10x in a matter of moments:
Digging deeper into our metrics, we can prove that this increased workload originated from a single system:
We see here that the rate of reads per second per user looks normal for everything except this severe outlier, the brown line.
A 10x, 200k RPS increase in read traffic is a drastic example, but it is a real one, and it demonstrates that even a well-distributed schema can be at risk of these “noisy neighbor” abuses — where a single user may saturate the available resources. Incidents like this are surprisingly common when thousands of microservices, workers, and batch analytics are at play. Further, in a strongly consistent database like HBase, even a single overloaded commodity server can be significantly problematic for the application layer. To put it simply, we need to find a way to eliminate the possibility of runaway users, hot clusters, and even hotspots; this example demonstrates that thoughtful schema design alone is not an adequate guardrail.
In case you’re wondering “why not autoscale?”, I would like to address this idea proactively. First of all, autoscaling cannot be fast enough to guarantee our reliability standards across all systems when unthrottled traffic is in the mix. Traffic can multiply in an instant, but machines take time to bootstrap, acquire balanced traffic, and warm their caches. Further, autoscaling is not a cost-effective solution for database abuses like this. In our experience, the traffic most likely to be problematic is not latency sensitive. Our end-users do not generate 200k RPS of latency sensitive traffic at a moment’s notice, but our internal systems, batch analytics, and async processes sometimes do. By throttling these internal, latency insensitive workloads, you can maintain high reliability standards and efficient bottom-line costs.
Quotas as a Solution
Our solution can be broken down into a few steps:

1. Enabling quotas
2. Supplying quota user overrides
3. Configuring default user quotas
4. Customizing back off strategies
5. Improving workload estimation
Each step is important and useful in its own right, but it is the culmination of these new features that multiplies the utility of HBase Quotas.
Enabling Quotas
HBase Quotas are the out-of-the-box solution for resource sharing. After several contributions to Apache HBase, we have found them to be a very effective tool at HubSpot. This system is disabled by default, but can be enabled by setting hbase.quota.enabled to true in your HBase configuration. With no other custom configurations, and no quotas defined, enabling quotas should not have any effect on requests.
HBase quotas will periodically “refresh” via a scheduled chore. A quota refresh updates the RegionServers’ understanding of the defined limitations — for example, if you update a quota, then it will take effect in the next refresh. At the time of writing this blog post, this chore runs on five-minute intervals by default. We decided that five minutes is not quick enough, because we would like to modify system throughputs much more seamlessly in an emergency. We have configured hbase.quota.refresh.period to 30000 (this value is in milliseconds); we have not observed the more frequent quota refreshes to be a burden on performance, and we have appreciated the improved throughput flexibility.
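For reference, here is how those two settings might look in hbase-site.xml:

```xml
<property>
  <name>hbase.quota.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Refresh quota state every 30 seconds rather than the default 5 minutes -->
  <name>hbase.quota.refresh.period</name>
  <value>30000</value>
</property>
```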
Supplying Quota User Overrides
What is a Quota User Override?
To put it simply, a quota user override is a new feature in HBase that allows you to specify a custom username (for use only by Quotas) on any given request. This will make more sense as you read through this section. This is also where we needed to make some contributions to Quotas in order to make it work at HubSpot, and where our custom configurations begin.
As a refresher from our glossary definitions, at HubSpot we have focused on per-RegionServer, per-second, user-specific throttles. It made sense to us that, if we wanted to ensure that no user could monopolize resources, then user throttles were the way to go.
But we had a problem — because we use web services as proxies between our traffic generators and the database itself, the original user was obfuscated. It wouldn’t make sense to throttle the direct caller, our proxy microservice, because that would limit traffic without any regard for its true source. For example, if the EmailSendingKafkaWorker called the CrmObjectsWebService, which called the CrmObjectsHBaseCluster, then the CrmObjectsWebService would erroneously be considered the “user” from the quota’s perspective.
To solve this, we added a couple of new features:
Connection and Request Attributes
Each request to HBase contains at least one Operation (a Get, Put, Delete, …), and at HubSpot we will often send batches of hundreds or thousands of Operations in a single request. Operations already supported the notion of “attributes” — a map of arbitrary key/value metadata. Expanding the payload of each operation feels quite wasteful when we send millions of operations per second at our peak, so we added support for connection and request attributes, which allow metadata to be communicated far less frequently than once per Operation.
Quota User Overrides
This feature allows you to specify a key that, when passed in as a request attribute, will override the quota’s “user” value. In other words, this allows you to tell HBase quotas who this request is coming from via a request attribute. Please note that this system is not “secure” in any sense — any HBase client could trivially change their username to bypass your quotas.
Specifying a Quota User Override Key
In order to use quota user overrides, you must specify the request attribute key to be used for this feature. You can do this by configuring hbase.quota.user.override.key to your desired key. For this example, we will assume that you have configured hbase.quota.user.override.key to qu (short for “quota user”; kept terse because the key travels over the network with every request).
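In hbase-site.xml, that configuration would look like this:

```xml
<property>
  <!-- Request attribute key that carries the quota user override -->
  <name>hbase.quota.user.override.key</name>
  <value>qu</value>
</property>
```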
Sending the Request Attribute
Let’s now pretend that you’re designing an endpoint which will interact with HBase. A minimal sketch might look like the following (the table name, class names, and caller-identification scheme are illustrative):
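```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;

public class CrmObjectsEndpoint {

  // Must match the server's hbase.quota.user.override.key configuration.
  private static final String QUOTA_USER_OVERRIDE_KEY = "qu";

  private final Connection connection;

  public CrmObjectsEndpoint(Connection connection) {
    this.connection = connection;
  }

  // Fetches a row on behalf of a calling system, attributing the request
  // to that caller (rather than to this proxy service) for quota purposes.
  public Result getCrmObject(String callingSystem, byte[] rowKey) throws IOException {
    try (Table table = connection.getTableBuilder(TableName.valueOf("crm-objects"), null)
        .setRequestAttribute(QUOTA_USER_OVERRIDE_KEY,
            callingSystem.getBytes(StandardCharsets.UTF_8))
        .build()) {
      return table.get(new Get(rowKey));
    }
  }
}
```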
By building your Table with the TableBuilder class, as in the example above, you can use TableBuilder#setRequestAttribute to customize the quota user for every request that Table sends.
Circling back to our original example:
the EmailSendingKafkaWorker called the CrmObjectsWebService which called the CrmObjectsHBaseCluster
Let’s assume that the endpoint defined above lives in CrmObjectsWebService, and it is appropriately passing in the request attribute qu=EmailSendingKafkaWorker for requests from said worker. You would now be empowered to create user throttles (via the shell or the Admin interface) for the EmailSendingKafkaWorker in isolation — you could restrict it to, say, 100MB/sec of IO per RegionServer. If you strike a balance between a reasonable throughput for the given system and a throughput which cannot monopolize any servers, then you’ve established a reliable client/service relationship!
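As a sketch, that throttle could be created through the Admin interface like so (assuming the quota user override plumbing described above is in place):

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.quotas.QuotaSettingsFactory;
import org.apache.hadoop.hbase.quotas.ThrottleType;

public final class EmailWorkerThrottle {

  // Restrict EmailSendingKafkaWorker (as identified by the qu request
  // attribute) to 100MB/sec of read IO.
  public static void apply(Admin admin) throws IOException {
    admin.setQuota(QuotaSettingsFactory.throttleUser(
        "EmailSendingKafkaWorker",
        ThrottleType.READ_SIZE,
        100L * 1024 * 1024, // 100MB, in bytes
        TimeUnit.SECONDS));
  }
}
```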
Configuring Default User Quotas
What are Default User Quotas?
If you’ve gotten to this step, then you’re already in a much better position. In an emergency, you could manually implement a strict quota and throttle a single user in isolation, despite any obfuscation introduced by a microservice architecture. This is a great start, but it is reactive, and manual operations don’t scale.
To become proactive and scalable, we introduced default user quotas. Default user quotas allow you to specify per-RegionServer, per-second user throttles that will be applied to each user out-of-the-box.
Default User Configurations
For example, one can configure hbase.quota.default.user.machine.read.size to 524288000. This would ensure that, without any other configuration, workers like EmailSendingKafkaWorker may only read 500MB/second from any single RegionServer. To be clear, this limits each distinct user to 500MB/second, not the aggregate of all users, so one user’s exhaustion of its quota does not affect other users’ quota availability.
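In hbase-site.xml:

```xml
<property>
  <!-- Each distinct user may read up to 500MB/sec from any single RegionServer -->
  <name>hbase.quota.default.user.machine.read.size</name>
  <value>524288000</value>
</property>
```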
At the time of writing this blog post, there are several different request types that support default user quotas (see HBase’s QuotaUtil class):

- hbase.quota.default.user.machine.read.num
- hbase.quota.default.user.machine.read.size
- hbase.quota.default.user.machine.request.num
- hbase.quota.default.user.machine.request.size
- hbase.quota.default.user.machine.write.num
- hbase.quota.default.user.machine.write.size
All of these defaults are applied on a per-second and per-RegionServer basis. The size quotas are all in bytes, and the read/write/request number limits are self-explanatory.
Customizing Back Off Strategies
At this point, you have clusters with great guardrails against runaway traffic. Perhaps you have modern SSDs in production that are easily capable of 1GB/sec of IO, so you have restricted each original caller to no more than 100MB/sec out-of-the-box. This ensures that, barring some sort of coordination across original callers, hotspots and cluster-wide resource saturation become extremely unlikely.
At HubSpot, we found this approach to have one glaring gap: we hadn’t thought much about the relationship between throttling servers and throttled clients.
AverageIntervalRateLimiter’s Insufficiencies
Under the hood, HBase Quotas are powered by RateLimiters. The rate limiters monitor resource usage against the defined limit, and handle refreshing of resource availability as time passes. By default, HBase uses the AverageIntervalRateLimiter and we found this to be inadequate.
The AverageIntervalRateLimiter is designed to refill quota availability in chunks of flexible size. It does this by proactively refilling the proportion of the TimeUnit (at HubSpot, always per-second) that has passed since the quota was last checked. So if you have defined a 1000 request per second limit, and one millisecond has passed since the quota was exhausted, then it would allow a single request to be executed (because 1ms is 1/1000 of a second, and 1/1000 of the 1000 RPS quota is 1 request). The AverageIntervalRateLimiter sounds like a good way to balance a desire for low latency and high quota utilization while still safely backing off as necessary, but in practice we found it to be far too optimistic.
The AverageIntervalRateLimiter’s problematic optimism is showcased when your average request size is far smaller than your overall quota. For example, let’s say that a RegionServer is serving 10,000 reads per second, and each read fetches one 64KB block from disk; this workload would require 625MB/sec of IO. If you put a 500MB/sec throttle in place, then the throttle would quickly be exhausted and the server would begin throwing RpcThrottlingExceptions. RpcThrottlingExceptions are the client’s cue to back off before retrying — they contain a recommended back off time (or “wait interval”) that is the server’s estimate of when appropriate resources will be available to serve the given request, and the HBase client implicitly sleeps for the RpcThrottlingException’s recommended back off.
The RpcThrottlingException back off time is calculated by determining how much time would need to pass, in a vacuum, for the quota to be adequately refilled for the given request. So, circling back to our example, let’s say that our 500MB/sec quota has been exhausted, but single block requests keep coming in to read 64KB. The quota would calculate that a 64KB request is only about 0.0125% of the entire 500MB/sec (512,000KB/sec) limit, so it would estimate a back off time of about 0.0125% of one second, or roughly 0.1ms — which gets rounded down to 0ms!
A zero millisecond back off is problematic for several reasons:

- The client will retry immediately, so the retry almost certainly arrives while the quota is still exhausted.
- Each rejected request still costs the RegionServer real work: RPC handling, queueing, and quota bookkeeping.
- At high request rates, this cycle of instant rejections and instant retries amounts to a self-inflicted denial of service.
Here’s an example of a real throttle-induced DOS in a HubSpot cluster:
FixedIntervalRateLimiter: A Suitable Alternative
AverageIntervalRateLimiter may not be the best solution for your HBase setup, but what’s our alternative? HBase offers a FixedIntervalRateLimiter, which has a much simpler design: it refills your quota to its full limit once per TimeUnit (again, at HubSpot, always per-second). So if you implement a 1000 request/second quota and you exhaust it within 1 millisecond, then you will need to wait 999ms to execute your next request (or your next 1000 requests, if you’d like).
The FixedIntervalRateLimiter has its own drawbacks: in a latency sensitive production environment, you don’t want to needlessly wait around for seconds at a time. Further, long wait intervals cause a poor developer experience: all of the wasted cycles make it difficult to fully utilize your quota, and if you cannot consistently utilize a meaningful proportion of your quota, then it becomes difficult to reason about ideal limits; you’re often left wondering why you’re being throttled despite throughput metrics suggesting you’re doing an acceptable amount of work.
Making FixedIntervalRateLimiter Better
To combat the wait interval pessimism of the FixedIntervalRateLimiter, we added a new configuration to HBase (HBASE-28453). The new configuration, hbase.quota.rate.limiter.refill.interval.ms, defines the interval in milliseconds on which a FixedIntervalRateLimiter should refill proportions of its quota (rather than always waiting out the quota’s full TimeUnit). We refer to this interval as the quota’s “refill interval.” The refill interval allows you to avoid pessimistic back offs while also guarding against frivolous retries.
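For example, a 50ms refill interval could be configured like so in hbase-site.xml:

```xml
<property>
  <!-- Refill each quota in 50ms increments rather than once per full TimeUnit -->
  <name>hbase.quota.rate.limiter.refill.interval.ms</name>
  <value>50</value>
</property>
```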
After a few different trials, we landed on hbase.quota.rate.limiter.refill.interval.ms set to 50ms across all clusters at HubSpot, and it has worked well enough that we have not been motivated to continue experimentation. This means that, if we define a 1000 request/second throttle, then it will refill 50 requests every 50ms. Here is how a 100ms refill interval changed our ability to consistently utilize a 25MB/sec quota’s full throughput:
And here is how the 100ms refill interval affected RpcThrottlingExceptions’ wait intervals as we rolled out the new configuration across our QA environment.
With default configurations, we see a pattern of presumably quick quota exhaustion followed by long back offs. As RegionServers picked up a 100ms refill interval, these back offs took on a much more normal distribution, approaching instant retries on one end and one-second back offs on the other.
Improving Workload Estimation
Throughout our work here, we also realized that our Quotas setup was only as good as HBase’s workload estimation. Without accurate estimation, an expensive request may be underestimated and allowed through an already-saturated throttle. Further, workload estimates are used to determine wait interval recommendations, so poor estimation could result in tiny back offs for expensive requests, and we have already discussed why that can be so problematic.
To improve quota estimation, we would recommend upgrading to HBase 2.6 or backporting the following to your fork of HBase:
With these changes in place your workload estimation, and consequently your Quotas performance, should be greatly improved.
Quotas at HubSpot Today
To provide a concrete example, we can look back at a MapReduce job in January 2024 — before we implemented any default throttling. This job, without any server-side guardrails, would generate approximately 150MB/s of IO from the average RegionServer, and as much as 400MB/s from the hottest.
This particular job generates as much as 400MB/s from a single RegionServer, while its average workload is far less significant.
This is a very precarious situation for the cluster and its other users because this client, and any others like it, have the freedom to request resource-saturating workloads from our RegionServers. It’s especially disappointing because MapReduce jobs exist to do batch analytics, making them inherently latency insensitive — customers are not waiting on this work to be done as quickly as possible. But the hotspots that this job produces can degrade performance for customer-facing traffic and too easily cause meaningful outages for our product.
In times past, we would wait until a job like this caused an outage, and then we would require that the relevant product developers put more thought into their schema, traffic distribution, and client-side rate limiting. In other words, we were solving for outages retroactively and forcing our product developers to spend time focused on details that are irrelevant to their feature set — neither aspect being aligned with HubSpot’s customer focused philosophy.
Fast forward to today and look at the same job’s workload, bearing in mind that we now have a 100MB/s default user throttle in place on this cluster:
This job’s traffic remains “hotspotty” in nature — its maximum and average IO workloads differ by about 2x. But by capping the max workload at a predetermined, safe ceiling, we have mitigated the risk posed by this application.
On typical runs we will now see this job generate 50MB/s of IO from the average RegionServer in this cluster, and 100MB/s in the maximum case. This default throttling setup mitigates would-be hotspots in real-time and without any manual intervention required.
By leaning into default user throttles for all applications, we have made HBase at HubSpot easier than ever to both manage and use. We are virtually always throwing some volume of RpcThrottlingExceptions across our clusters, and we have found this to be a healthy, well-utilized, and predictable state for both operators and users.
Conclusion
Whether you are new to the HBase ecosystem or you have been working in it for a decade, you have likely run into resource sharing issues. It’s also likely that you considered HBase Quotas as a potential solution, but found it to be insufficient.
At HubSpot, we have successfully isolated bad users to a precise degree via the Quotas setup described in this post. Even beyond bad user isolation, it is a huge win to allow the application layer to pass the responsibility of rate limiting up to the RegionServers — by proactively solving the complex choices required to run a reliable application at scale, we’ve been able to drive up developer productivity across product teams at HubSpot.
With an upgrade to 2.6 and a little bit of customization, Quotas can now be a very powerful and scalable guardrail, even for clusters with thousands of distinct tenants.
Are you ready to make an impact? Check out our careers page for your next opportunity! And to learn more about our culture, follow us on Instagram @HubSpotLife.