Rate Limiting - Tag - Lorenzo's Blog

HPA vs Rate-limit

Lorenzo Girardi — Tue, 14 Feb 2023 00:00:00 +0000

INTRO

Strange… we are using HPA to increase availability and introducing rate limiting to reduce it?

Well, let’s set the context.

This analysis is based on specific assumptions:

Cloud environment
Dynamic infrastructure
Minimum resources available

HPA

In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), scaling the number of pods up or down to match demand.
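To make "match demand" concrete, here is a minimal sketch of the scaling rule described in the Kubernetes documentation: the desired replica count is the current count scaled by the ratio of the observed metric to the target, rounded up, and no action is taken when that ratio is within a tolerance of 1.0 (0.1 by default). The function name and parameters are illustrative, and the real controller also handles missing metrics, min/max bounds, and stabilization, none of which appear here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA rule: ceil(currentReplicas * current / target)."""
    ratio = current_metric / target_metric
    # When the ratio is close enough to 1.0 the controller skips scaling.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

For example, 4 pods averaging 90% CPU against a 60% target would be scaled to 6, while 4 pods at 63% would be left alone because the ratio sits inside the tolerance.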

Patterns

Type	Behaviour
Slow and temporary	Daily fluctuations, peaking during the day and troughing at night
Rapid and temporary	Short bursts from poorly-behaved downstream services
Slow and persistent	Request volume slowly increases as the product sees adoption
Rapid and persistent	Abrupt shift from low to high volumes, e.g. when called by batch jobs

Ideal Practice

Type	Ideal Practice
Slow and temporary	HPA should add and remove pods as necessary
Rapid and temporary	HPA should NOT modify pod count; leave enough headroom to absorb brief spikes
Slow and persistent	HPA should add and remove pods as necessary
Rapid and persistent	Leave headroom for the initial jump; HPA should add pods quickly to restore target utilization
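The "headroom" in the table can be estimated with simple arithmetic. The sketch below assumes utilization grows linearly with traffic, which is a simplification: with a target utilization of 60%, the existing pods can take roughly 1 / 0.6 ≈ 1.67 times their steady traffic before saturating. Both function names are hypothetical.

```python
def burst_headroom(target_utilization: float) -> float:
    """Largest traffic multiplier the current pods can absorb before saturating.

    Assumes utilization scales linearly with traffic (a simplification).
    """
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    return 1.0 / target_utilization


def absorbs_burst(target_utilization: float, burst_factor: float) -> bool:
    """True if a burst of `burst_factor` times normal traffic fits in the headroom."""
    return burst_factor <= burst_headroom(target_utilization)
```

Under that assumption, a 1.5x spike fits when the target is 60% but not when it is 80%, which is the trade-off between spare capacity and cost.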

Rate Limit

A rate limit is the maximum number of API calls an app or user can make within a given time period. If this limit is exceeded, or if CPU or time limits are exceeded, the app may be throttled, and throttled requests fail.
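One simple way to enforce "N calls per time period" is a fixed-window counter. The sketch below is only one of several possible algorithms (token bucket and sliding window are common alternatives) and is not taken from any particular library; the class name and the injectable clock are assumptions made for illustration and testing.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per client in each `window_seconds` window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self._counts = {}           # client_id -> (window_index, count)

    def allow(self, client_id: str) -> bool:
        window_index = int(self.clock() // self.window)
        seen_index, count = self._counts.get(client_id, (window_index, 0))
        if seen_index != window_index:
            # A new window has started: reset the counter for this client.
            seen_index, count = window_index, 0
        if count >= self.limit:
            self._counts[client_id] = (seen_index, count)
            return False            # throttled: the caller should fail the request
        self._counts[client_id] = (seen_index, count + 1)
        return True
```

A caller would check `limiter.allow(client_id)` before doing any work and reject the request when it returns `False`, conventionally with HTTP 429 (Too Many Requests).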

Application Rate Limit

Lorenzo Girardi — Sat, 11 Feb 2023 00:00:00 +0000

I needed to implement rate limiting within an application for reasons I’ll get into in a follow-up post. When you start thinking about this, you basically have two paths:

Logic embedded directly in the application code
A sidecar container that handles the rate limiting role

Both work. Both have trade-offs. Let me go through each one.

The Code Way

This is the simpler approach on the surface, but it comes with some annoying limitations. It can only be reused by applications written in the same programming language. And embedding rate limiting logic gives the application a secondary role: the request interceptor consumes CPU in the same process, so it can skew the application’s metrics if you’re not tracking it separately.
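As a rough illustration of the in-code approach, the sketch below wraps a request handler in a decorator backed by a token bucket. Everything here is hypothetical (the `TokenBucket` class, the `rate_limited` decorator, the `(status, body)` return shape, and the `stats` dictionary); it is not the implementation from the post. It counts throttled requests separately from served ones, which is one way to keep the interceptor's work from hiding inside the application's own metrics.

```python
import time
from functools import wraps


class TokenBucket:
    """Refill `rate` tokens per second up to `capacity`; each request costs one."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def take(self) -> bool:
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def rate_limited(bucket: TokenBucket, stats: dict):
    """Decorator that rejects calls when the bucket is empty."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(request):
            if not bucket.take():
                stats["throttled"] += 1   # tracked apart from real work
                return 429, "Too Many Requests"
            stats["served"] += 1
            return handler(request)
        return wrapper
    return decorator
```

Note that the interceptor runs inside the application process, so every rejected request still costs CPU there; that is the secondary role mentioned above.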

Kubernetes API Gateway

Lorenzo Girardi — Sun, 08 Nov 2020 00:00:00 +0000

It’s time to talk about the API gateway.

In a modern infrastructure — especially in a microservices environment — you probably know what I’m referring to. But it’s worth being explicit about it:

“An API gateway takes all API calls from clients, then routes them to the appropriate microservice with request routing, composition, and protocol translation. Typically it handles a request by invoking multiple microservices and aggregating the results, to determine the best path.”
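The request routing part of that definition can be sketched as a longest-prefix lookup from request path to upstream service. The route table, the hostnames, and the `resolve` function below are all made up for illustration; a real gateway also deals with composition, protocol translation, retries, and authentication, which are left out here.

```python
# Hypothetical route table: path prefix -> upstream base URL.
ROUTES = {
    "/orders": "http://orders.internal:8080",
    "/orders/archive": "http://archive.internal:8080",
    "/users": "http://users.internal:8080",
}


def resolve(path: str, routes: dict = ROUTES):
    """Return the upstream URL for `path`, or None if no route matches.

    The longest matching prefix wins, and a prefix only matches on a
    whole path segment ("/orders" does not match "/ordersx").
    """
    matches = [p for p in routes if path == p or path.startswith(p + "/")]
    if not matches:
        return None
    best = max(matches, key=len)
    return routes[best] + path
```

With this table, `/orders/42` would go to the orders service, while `/orders/archive/2020` would go to the more specific archive route.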