Forum Discussion
Configuring semantic caching on F5 AI Gateway
Hi. Thanks for the reply.
Actually, I was wondering about the input/output text tokens used in LLM API pricing, not JWT tokens.
For AI Gateway use cases, I believe token-based rate limiting would be more effective than traditional request-based limits.
If the token count is not in a header but in the request body, then extracting it and rate limiting on it will be a little harder. BIG-IP with iRules, or NGINX with the njs JavaScript module, could do it, but it will be complex.
https://clouddocs.f5.com/training/community/nginx/html/class3/module1/module12.html
https://github.com/nginx/njs-examples
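To illustrate the idea, here is a minimal sketch of the token-counting logic one might embed in an njs handler. It assumes an OpenAI-style response body with a `usage.total_tokens` field; the helper names (`extractTokens`, `makeBucket`) and the per-minute budget are hypothetical, and real njs code would also need the request/response hooks and shared state across workers.

```javascript
// Sketch only: hypothetical helpers, assuming an OpenAI-style body
// like {"usage": {"total_tokens": N}, ...}

// Extract the billed token count from a JSON response body string.
function extractTokens(body) {
  const parsed = JSON.parse(body);
  return (parsed.usage && parsed.usage.total_tokens) || 0;
}

// Simple per-client token bucket: allow up to `limit` LLM tokens
// per fixed window of `windowMs` milliseconds.
function makeBucket(limit, windowMs) {
  let used = 0;
  let windowStart = Date.now();
  return {
    consume(tokens) {
      const now = Date.now();
      if (now - windowStart >= windowMs) { // new window: reset usage
        used = 0;
        windowStart = now;
      }
      if (used + tokens > limit) return false; // over budget: reject (e.g. 429)
      used += tokens;
      return true;
    }
  };
}

// Example: a 1000-token budget per minute.
const bucket = makeBucket(1000, 60000);
const body = JSON.stringify({ usage: { total_tokens: 420 } });
console.log(bucket.consume(extractTokens(body))); // true: 420 of 1000 used
```

In a real deployment the bucket state would have to live somewhere shared (e.g. an NGINX keyval zone) rather than in a per-worker closure, which is part of the complexity mentioned above.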