Forum Discussion
Configuring semantic caching on F5 AI Gateway
I can't speak to the semantic caching feature, but token-based rate limiting can be done on the Authorization header with XC, as I have shown in F5 XC Session tracking with User Identification Policy | DevCentral.
On NGINX you can use the njs JavaScript module to build a rate limit that is not keyed on the source IP (a sketch follows below), and on BIG-IP an iRule should do the trick:
GitHub - nginx/njs-examples: NGINX JavaScript examples
3.1.2. Lab 2 - HTTP Throttling
The idea is to place XC, NGINX, or BIG-IP in front of the AI Gateway: the AI Gateway handles the AI-specific protections, while pure API protection is done on the usual systems like XC, NGINX, or BIG-IP.
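As a rough illustration (my own sketch, not taken from the lab linked above), a non-source-IP rate limit in front of the AI Gateway could be built with njs roughly like this. The upstream name ai_gateway, the zone sizes, and the rates are placeholders, not values from the thread:

```javascript
// ratelimit_key.js -- minimal njs sketch: derive the rate-limit key from the
// Authorization header instead of the client source IP.
//
// Assumed nginx.conf wiring (http context unless noted):
//
//   js_import ratelimit_key.js;
//   js_set $limit_key ratelimit_key.key;
//   limit_req_zone $limit_key zone=per_client:10m rate=10r/s;
//
//   upstream ai_gateway { server 10.0.0.10:8080; }   # placeholder address
//
//   server {
//       location / {
//           limit_req zone=per_client burst=20 nodelay;
//           proxy_pass http://ai_gateway;    # NGINX sits in front of the AI Gateway
//       }
//   }

function key(r) {
    // Rate-limit per caller: prefer the Authorization header (API key or JWT),
    // and fall back to the source IP when the header is absent.
    return r.headersIn['Authorization'] || r.remoteAddress;
}

export default { key };
```

The same idea applies on BIG-IP, where an iRule would key its rate-limiting table on the Authorization header value instead of the client address.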
- devopssong, Jun 04, 2025
Nimbostratus
Hi. Thanks for the reply.
Actually, I was asking about the input/output text tokens used in LLM API pricing, not JWT tokens.
For AI Gateway use cases, I believe token-based rate limiting would be more effective than traditional request-based limits.
- Nikoolayy1, Jun 04, 2025
MVP
If the token count is not in a header but in the message body, then extracting it and rate limiting on it will be a little harder. BIG-IP with iRules or NGINX with the njs JavaScript module could do it, but it will be complex (see the sketch after the links below).
https://clouddocs.f5.com/training/community/nginx/html/class3/module1/module12.html
https://github.com/nginx/njs-examples
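To sketch the njs route (my own illustration, not from the links above, and assuming an OpenAI-style "usage" object in the response body): accumulate each caller's token usage in a shared dictionary and reject requests once a budget is exceeded.

```javascript
// token_budget.js -- minimal njs sketch of token-based limiting when the token
// counts only appear in the LLM response body. The budget, zone name, and
// upstream are placeholders I made up for illustration.
//
// Assumed nginx.conf wiring:
//
//   js_import token_budget.js;
//   js_shared_dict_zone zone=tokens:1m type=number;   # shared counters
//   server {
//       location /v1/ {
//           auth_request /_budget;               # reject callers over budget
//           js_body_filter token_budget.count;   # count tokens from responses
//           proxy_pass http://ai_gateway;
//       }
//       location = /_budget {
//           internal;
//           js_content token_budget.check;
//       }
//   }

const LIMIT = 100000;   // allowed tokens per caller (placeholder value)

// Per-request response buffer; assumes njs gives each request its own VM
// instance, so this module-level variable acts as per-request state.
let body = '';

function clientKey(r) {
    // Identify the caller by the Authorization header, else by source IP.
    return r.headersIn['Authorization'] || r.remoteAddress;
}

// auth_request handler: 204 allows the request, 403 denies it.
function check(r) {
    const used = ngx.shared.tokens.get(clientKey(r)) || 0;
    r.return(used < LIMIT ? 204 : 403);
}

// Body filter: pass the upstream response through, buffer it, and on the last
// chunk read usage.total_tokens and add it to the caller's running total.
function count(r, data, flags) {
    r.sendBuffer(data, flags);
    body += data;
    if (!flags.last) {
        return;
    }
    try {
        const usage = JSON.parse(body).usage;
        if (usage && usage.total_tokens) {
            ngx.shared.tokens.incr(clientKey(r), usage.total_tokens, 0);
        }
    } catch (e) {
        // Streaming (SSE) or non-JSON responses would need different handling.
    }
}

export default { check, count };
```

Streaming responses and multi-instance deployments make this noticeably harder in practice, which is why the iRule or njs route gets complex quickly.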