Forum Discussion
Configuring semantic caching on F5 AI Gateway
Hi. Thanks for the reply.
Actually, I was wondering about the input/output text tokens used in LLM API pricing, not JWT tokens.
For AI Gateway use cases, I believe token-based rate limiting would be more effective than traditional request-based limits.
If the token count is not in a header but in the request body, then extracting it and rate limiting on it will be a little harder. BIG-IP with iRules, or NGINX with the njs JavaScript module, could do it, but it will be complex.
https://clouddocs.f5.com/training/community/nginx/html/class3/module1/module12.html
https://github.com/nginx/njs-examples
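To illustrate the idea, here is a minimal sketch of the token-counting logic one might embed in an njs handler. It assumes an OpenAI-style response body with a `usage.total_tokens` field; the helper names (`extractTokens`, `makeBucket`) and the per-minute budget are hypothetical, and real njs code would also need the request/response hooks and shared state across workers.

```javascript
// Sketch only: hypothetical helpers, assuming an OpenAI-style body
// like {"usage": {"total_tokens": N}, ...}

// Extract the billed token count from a JSON response body string.
function extractTokens(body) {
  const parsed = JSON.parse(body);
  return (parsed.usage && parsed.usage.total_tokens) || 0;
}

// Simple per-client token bucket: allow up to `limit` LLM tokens
// per fixed window of `windowMs` milliseconds.
function makeBucket(limit, windowMs) {
  let used = 0;
  let windowStart = Date.now();
  return {
    consume(tokens) {
      const now = Date.now();
      if (now - windowStart >= windowMs) { // new window: reset usage
        used = 0;
        windowStart = now;
      }
      if (used + tokens > limit) return false; // over budget: reject (e.g. 429)
      used += tokens;
      return true;
    }
  };
}

// Example: a 1000-token budget per minute.
const bucket = makeBucket(1000, 60000);
const body = JSON.stringify({ usage: { total_tokens: 420 } });
console.log(bucket.consume(extractTokens(body))); // true: 420 of 1000 used
```

In a real deployment the bucket state would have to live somewhere shared (e.g. an NGINX keyval zone) rather than in a per-worker closure, which is part of the complexity mentioned above.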