Application Rate Limit
I needed to implement rate limiting within an application for reasons I’ll get into in a follow-up post. When you start thinking about this, you basically have two paths:
- Logic embedded directly in the application code
- A sidecar container that handles the rate limiting role
Both work. Both have trade-offs. Let me go through each one.
The Code Way
This is the simpler approach on the surface, but it comes with some annoying limitations. The logic can only be reused by applications written in the same programming language. And putting rate limiting inside the application gives it a secondary role: the request interceptor consumes CPU in the same process as the business logic, so the application's own metrics can be misleading if you don't track the limiter's overhead separately.
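To make the in-code approach concrete, here is a minimal sketch of a request interceptor, assuming a token-bucket algorithm (my pick for illustration; any limiting algorithm would do). The names `TokenBucket` and `rate_limited`, and the `(status, body)` return shape, are hypothetical and not tied to any framework.

```python
import time
from typing import Callable


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so tests can fake time
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then try to spend one token."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def rate_limited(bucket: TokenBucket, handler: Callable):
    """Wrap a handler so it runs only while the bucket has tokens left."""
    def wrapper(request):
        if not bucket.allow():
            # Hypothetical response shape: (status code, body)
            return 429, "Too Many Requests"
        return handler(request)
    return wrapper
```

Note that the check runs in the same process as the handler, which is exactly the CPU overhead mentioned above. This sketch is also not thread-safe; a real service would need a lock around `allow()` or a per-worker bucket.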











