application rate limit

For reasons that I will share later in a dedicated discussion, I needed to create a rate limit inside the application.
For this scenario we have two quick options:
- have the logic inside the app, in code
- have a sidecar container that takes this role
the code way
This is the simplest one, but it has some side effects.
First of all, it can be reused only by apps written in the same language.
Also, as before, by having this rate limit inside the app we are giving the app an extra role,
which can be good or bad depending on the strategy applied:
the request interceptor, for example, will consume CPU and may skew metrics if we are not aware of it.
But anyway, it can be a solution, so in a simple Python app you just have to add the following
(I will use the usual app I'm using for lab stuff).
It's Flask, so you just have to add the following:
In requirements.txt --> Flask-Limiter
Inside your code, import the libraries:
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
Enable the limiter:
limiter = Limiter(
    get_remote_address,
    app=app,
    storage_uri="memory://",
)
And add the limiting decorator to the route you'd like to rate limit:
@limiter.limit("28 per second")
Hands-on
I just moved from @limiter.limit("28 per second")
to @limiter.limit("2 per second")
$ docker build -t lgirardi/pytbakrated .
[+] Building 11.1s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 34B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.9-alpine 0.6s
=> [internal] load build context 0.0s
=> => transferring context: 4.47kB 0.0s
=> CACHED [1/4] FROM docker.io/library/python:3.9-alpine@sha256:8bda1e9a98fa4e87ff6e3a7682f496532b06fcbae10326a59c8656126051d4df 0.0s
=> [2/4] COPY . /app 0.0s
=> [3/4] WORKDIR /app 0.0s
=> [4/4] RUN pip install -r requirements.txt 9.9s
=> exporting to image 0.4s
=> => exporting layers 0.4s
=> => writing image sha256:583ec19905662414ad6bdb3b0da1042ec799ec46f8a676fac6553364cd02bf1e 0.0s
=> => naming to docker.io/lgirardi/pytbakrated
$ docker run -p 5000:5000 lgirardi/pytbakrated
* Serving Flask app 'app'
* Debug mode: off
2023-02-11 10:36:06,264 INFO werkzeug MainThread : WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:5000
* Running on http://172.17.0.2:5000
2023-02-11 10:36:06,264 INFO werkzeug MainThread : Press CTRL+C to quit
With a simple curl you can check that it's working: the rate limit is triggered if you go over 2 req/sec.
$ while true; do curl -I localhost:5000/api/fib/1 && sleep 0.4;done
HTTP/1.1 200 OK
Server: Werkzeug/2.2.2 Python/3.9.16
Date: Sat, 11 Feb 2023 10:37:54 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Vary: Accept-Encoding
Connection: close
HTTP/1.1 200 OK
Server: Werkzeug/2.2.2 Python/3.9.16
Date: Sat, 11 Feb 2023 10:37:54 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Vary: Accept-Encoding
Connection: close
HTTP/1.1 429 TOO MANY REQUESTS
Server: Werkzeug/2.2.2 Python/3.9.16
Date: Sat, 11 Feb 2023 10:37:55 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 117
Vary: Accept-Encoding
Connection: close
HTTP/1.1 200 OK
Server: Werkzeug/2.2.2 Python/3.9.16
Date: Sat, 11 Feb 2023 10:37:55 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Vary: Accept-Encoding
Connection: close
HTTP/1.1 200 OK
Server: Werkzeug/2.2.2 Python/3.9.16
Date: Sat, 11 Feb 2023 10:37:56 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Vary: Accept-Encoding
Connection: close
HTTP/1.1 429 TOO MANY REQUESTS
Server: Werkzeug/2.2.2 Python/3.9.16
Date: Sat, 11 Feb 2023 10:37:56 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 117
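
The same check can also be scripted; here is a minimal sketch with Python's requests library (an extra dependency, same endpoint and pacing as the curl loop above):

import time
import requests

# fire ~2.5 req/sec at the rate-limited endpoint and collect the status codes
url = "http://localhost:5000/api/fib/1"
codes = []
for _ in range(10):
    codes.append(requests.head(url).status_code)
    time.sleep(0.4)

print(codes)  # expect mostly 200s with some 429s mixed in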
In the same way, the container logs show the rate limiting:
2023-02-11 10:37:54,359 INFO werkzeug Thread-34 : 172.17.0.1 - - [11/Feb/2023 10:37:54] "HEAD /api/fib/1 HTTP/1.1" 200 -
2023-02-11 10:37:54,778 INFO werkzeug Thread-36 : 172.17.0.1 - - [11/Feb/2023 10:37:54] "HEAD /api/fib/1 HTTP/1.1" 200 -
2023-02-11 10:37:55,196 INFO flask-limiter Thread-38 : ratelimit 2 per 1 second (172.17.0.1) exceeded at endpoint: fib
2023-02-11 10:37:55,197 INFO werkzeug Thread-38 : 172.17.0.1 - - [11/Feb/2023 10:37:55] "HEAD /api/fib/1 HTTP/1.1" 429 -
2023-02-11 10:37:55,616 INFO werkzeug Thread-40 : 172.17.0.1 - - [11/Feb/2023 10:37:55] "HEAD /api/fib/1 HTTP/1.1" 200 -
2023-02-11 10:37:56,035 INFO werkzeug Thread-42 : 172.17.0.1 - - [11/Feb/2023 10:37:56] "HEAD /api/fib/1 HTTP/1.1" 200 -
2023-02-11 10:37:56,455 INFO flask-limiter Thread-44 : ratelimit 2 per 1 second (172.17.0.1) exceeded at endpoint: fib
2023-02-11 10:37:56,456 INFO werkzeug Thread-44 : 172.17.0.1 - - [11/Feb/2023 10:37:56] "HEAD /api/fib/1 HTTP/1.1" 429 -
Honestly, there are a few pros and cons... I'd like to get into those details with a more practical example, but
a few key points anyway are:
- it uses the same pool of connections as the app
- it impacts the same metric I'm using for autoscaling
- it steals resources from the app
- it's a really simple rate limit that can conflict with other mechanisms
the sidecar way
Yes, I know, this is a stretch, since we have service meshes, gateways, etc. that embrace this concept.
However, for a specific reason, I created just a sidecar container that can be reused without the need to implement the infrastructure mentioned before.
First of all, the choice: tons of possibilities, from an nginx reverse proxy to haproxy to whatever proxy...
In this scenario I took the opportunity to use Envoy; I've already used it with Istio, and it's really, really flexible
(sometimes too much, to the point of being confusing :-)
Anyway... it's just a sidecar.
To add Envoy to an existing app, you have to add the following to the pod spec:
- name: sidecar
  image: envoyproxy/envoy:v1.22-latest
  resources:
    limits:
      cpu: 100m
      memory: 150Mi
    requests:
      cpu: 30m
      memory: 55Mi
  ports:
    - name: http
      containerPort: 5002
      protocol: TCP
  volumeMounts:
    - name: sidecar-config
      mountPath: "/etc/envoy"
      readOnly: true
volumes:
  - name: sidecar-config
    configMap:
      name: pytbakt-configmap
(great imagination for the name :-)
And... well, the most important piece is the pytbakt-configmap,
which holds the Envoy configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pytbakt-configmap
  labels:
    app: pytbak
  namespace: pytbak
data:
  envoy.yaml: |
    static_resources:
      listeners:
      - name: listener_0
        address:
          socket_address:
            address: 0.0.0.0
            port_value: 5002
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              stat_prefix: ingress_http
              route_config:
                name: local_route
                virtual_hosts:
                - name: local_service
                  domains: ["*"]
                  routes:
                  - match:
                      prefix: "/"
                    route:
                      cluster: pytbak
              http_filters:
              - name: envoy.filters.http.local_ratelimit
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
                  stat_prefix: http_local_rate_limiter
                  token_bucket:
                    max_tokens: 29
                    tokens_per_fill: 29
                    fill_interval: 1s
                  filter_enabled:
                    runtime_key: local_rate_limit_enabled
                    default_value:
                      numerator: 100
                      denominator: HUNDRED
                  filter_enforced:
                    runtime_key: local_rate_limit_enforced
                    default_value:
                      numerator: 100
                      denominator: HUNDRED
                  response_headers_to_add:
                  - append_action: OVERWRITE_IF_EXISTS_OR_ADD
                    header:
                      key: x-local-rate-limit
                      value: 'true'
                  local_rate_limit_per_downstream_connection: false
              - name: envoy.filters.http.router
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      clusters:
      - name: pytbak
        connect_timeout: 0.25s
        type: LOGICAL_DNS
        dns_lookup_family: V4_ONLY
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: service_a
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: localhost
                    port_value: 5000
In a human-readable way...
- name: listener_0
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 5002
The Envoy proxy is listening on port 5002.
- name: local_service
  domains: ["*"]
  routes:
  - match:
      prefix: "/"
    route:
      cluster: pytbak
No path or domain restrictions: it just handles everything under /.
clusters:
- name: pytbak
  connect_timeout: 0.25s
  type: LOGICAL_DNS
  dns_lookup_family: V4_ONLY
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: service_a
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: localhost
              port_value: 5000
Something like a reverse proxy configuration, since the app is listening on port 5000.
And now the rate limiting section:
http_filters:
- name: envoy.filters.http.local_ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
    stat_prefix: http_local_rate_limiter
    token_bucket:
      max_tokens: 29
      tokens_per_fill: 29
      fill_interval: 1s
    filter_enabled:
      runtime_key: local_rate_limit_enabled
      default_value:
        numerator: 100
        denominator: HUNDRED
    filter_enforced:
      runtime_key: local_rate_limit_enforced
      default_value:
        numerator: 100
        denominator: HUNDRED
    response_headers_to_add:
    - append_action: OVERWRITE_IF_EXISTS_OR_ADD
      header:
        key: x-local-rate-limit
        value: 'true'
    local_rate_limit_per_downstream_connection: false
We have a bucket of 29 tokens, and every second the bucket is refilled with 29 tokens... I mean... roughly 29 req/second :-)
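
To make the token bucket semantics concrete, here is a minimal sketch of the same algorithm in Python (just an illustration, not Envoy's actual implementation):

import time

class TokenBucket:
    # mirrors Envoy's max_tokens / tokens_per_fill / fill_interval knobs
    def __init__(self, max_tokens, tokens_per_fill, fill_interval):
        self.max_tokens = max_tokens
        self.tokens_per_fill = tokens_per_fill
        self.fill_interval = fill_interval
        self.tokens = max_tokens           # the bucket starts full
        self.last_fill = time.monotonic()

    def allow(self):
        # add tokens_per_fill tokens for every elapsed fill_interval
        elapsed = time.monotonic() - self.last_fill
        fills = int(elapsed // self.fill_interval)
        if fills > 0:
            self.tokens = min(self.max_tokens,
                              self.tokens + fills * self.tokens_per_fill)
            self.last_fill += fills * self.fill_interval
        if self.tokens > 0:
            self.tokens -= 1
            return True   # the request goes through
        return False      # the request gets a 429

bucket = TokenBucket(max_tokens=29, tokens_per_fill=29, fill_interval=1.0)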
Hands-on
Again, I changed the rate limit to 2 req/second:
token_bucket:
  max_tokens: 2
  tokens_per_fill: 2
  fill_interval: 1s
In Kubernetes the pod is now running with both containers:
pytbak    pytbak-stable-bd648fd46-nj95m   2/2   Running   0   13s
$ kubectl describe pod pytbak-stable-bd648fd46-nj95m -n pytbak
Name:             pytbak-stable-bd648fd46-nj95m
Namespace:        pytbak
Priority:         0
Node:             instance-20220215-1853/10.0.254.135
Start Time:       Sat, 11 Feb 2023 11:19:45 +0000
Labels:           app=pytbak
                  pod-template-hash=bd648fd46
                  track=pytbak-stable
Annotations:      cni.projectcalico.org/podIP: 10.1.156.173/32
                  cni.projectcalico.org/podIPs: 10.1.156.173/32
                  prometheus.io/path: /metrics
                  prometheus.io/port: 5000
                  prometheus.io/scrape: true
Status:           Running
IP:               10.1.156.173
IPs:
  IP:             10.1.156.173
Controlled By:    ReplicaSet/pytbak-stable-bd648fd46
Containers:
  pytbak:
    Container ID:   containerd://9a54504e73a14746299366e2941b51f7069b6649208329a3f87708b053ef1eaf
    Image:          lgirardi/rest-test-multip:0.6
    Image ID:       docker.io/lgirardi/rest-test-multip@sha256:c94695a04fb3b862bfb576ead460aba3528bd098327f88122d087f48380506dd
    Port:           5000/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 11 Feb 2023 11:19:46 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     300m
      memory:  250Mi
    Requests:
      cpu:     30m
      memory:  125Mi
    Liveness:   http-get http://:5000/api/ delay=40s timeout=10s period=10s #success=1 #failure=3
    Readiness:  http-get http://:5000/api/ delay=5s timeout=15s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xrr7c (ro)
  sidecar:
    Container ID:   containerd://a0d45982aa56704983a89bacd664e5cbb8b38b1df943df92ad879e7c66986c22
    Image:          envoyproxy/envoy:v1.22-latest
    Image ID:       docker.io/envoyproxy/envoy@sha256:3b1e0114dbead3fbd9f561994f3894f5d113a815e023065cabdf0c48d55396ce
    Port:           5002/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 11 Feb 2023 11:19:47 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  150Mi
    Requests:
      cpu:     30m
      memory:  55Mi
    Environment:  <none>
    Mounts:
      /etc/envoy from sidecar-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xrr7c (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  sidecar-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      pytbakt-configmap
    Optional:  false
  kube-api-access-xrr7c:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  3m53s  default-scheduler  Successfully assigned pytbak/pytbak-stable-bd648fd46-nj95m to instance-20220215-1853
  Normal  Pulled     3m53s  kubelet            Container image "lgirardi/rest-test-multip:0.6" already present on machine
  Normal  Created    3m53s  kubelet            Created container pytbak
  Normal  Started    3m53s  kubelet            Started container pytbak
  Normal  Pulled     3m53s  kubelet            Container image "envoyproxy/envoy:v1.22-latest" already present on machine
  Normal  Created    3m53s  kubelet            Created container sidecar
  Normal  Started    3m52s  kubelet            Started container sidecar
With curl:
$ while true;do curl -I http://oracolo.k8s.it/api/fib/1 && sleep 0.4;done
HTTP/1.1 200 OK
Date: Sat, 11 Feb 2023 11:29:48 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Connection: keep-alive
vary: Accept-Encoding
x-envoy-upstream-service-time: 2
HTTP/1.1 200 OK
Date: Sat, 11 Feb 2023 11:29:49 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Connection: keep-alive
vary: Accept-Encoding
x-envoy-upstream-service-time: 2
HTTP/1.1 429 Too Many Requests
Date: Sat, 11 Feb 2023 11:29:49 GMT
Content-Type: text/plain
Content-Length: 18
Connection: keep-alive
x-local-rate-limit: true
HTTP/1.1 200 OK
Date: Sat, 11 Feb 2023 11:29:50 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Connection: keep-alive
vary: Accept-Encoding
x-envoy-upstream-service-time: 2
HTTP/1.1 200 OK
Date: Sat, 11 Feb 2023 11:29:50 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Connection: keep-alive
vary: Accept-Encoding
x-envoy-upstream-service-time: 2
HTTP/1.1 429 Too Many Requests
Date: Sat, 11 Feb 2023 11:29:50 GMT
Content-Type: text/plain
Content-Length: 18
Connection: keep-alive
x-local-rate-limit: true
HTTP/1.1 200 OK
Date: Sat, 11 Feb 2023 11:29:51 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Connection: keep-alive
vary: Accept-Encoding
x-envoy-upstream-service-time: 2
HTTP/1.1 200 OK
Date: Sat, 11 Feb 2023 11:29:51 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1
Connection: keep-alive
vary: Accept-Encoding
x-envoy-upstream-service-time: 2
HTTP/1.1 429 Too Many Requests
Date: Sat, 11 Feb 2023 11:29:51 GMT
Content-Type: text/plain
Content-Length: 18
Connection: keep-alive
x-local-rate-limit: true
Both solutions work, in different ways and with different impacts on the app. In the next episode I will explain why I did this job; for now, in case you need a simple way to rate limit your app, you can have a sidecar container out of the box.