Forum Discussion
CPU load when Prometheus is scraping metrics from F5 BIG-IP LTM
We are experiencing an issue where Prometheus scrapes of metrics from our F5 BIG-IP LTM cause high CPU and memory utilization on the F5 device.
As an initial step, we increased the scrape interval to 1 minute, but the issue persists.
Are there any recommended tuning options or best practices?
4 Replies
Which version of TMOS are you running on the BIG-IP? Are you just using Prometheus as a 'pull consumer', or do you also have a 'push consumer' configured to send telemetry to something like Elasticsearch?
Is the icrd_child on the BIG-IP showing high CPU usage? You might want to consider configuring custom endpoints as per:
CPU utilization increase after enabling Telemetry Streaming
- panuwong
Nimbostratus
Hi,
We are currently running TMOS version 17.5.1 and using the pull consumer.
Regarding icrd_child, we are not certain at the moment, but we will try your recommendation first.
Thanks,
Panu
- panuwong
Nimbostratus
I'm still seeing the same issue. Currently, the icrd_child CPU usage is around 40%. The metric collection interval in Prometheus is set to 60 seconds, and there are approximately 300 virtual servers configured. From the restjavad/restnoded logs, it appears that the F5 system is receiving a high number of requests and may not be able to respond in time. Could you please advise if there are any recommendations for using Prometheus to collect metrics at this scale?
- carlbidwell268
Nimbostratus
Yeah, just increasing the scrape interval usually isn’t enough if each scrape itself is heavy. On BIG-IP, the main load often comes from how many endpoints and stats are being queried per request. A good next step is to reduce the scope; only collect the metrics you actually need instead of the full set. If you’re using an exporter or iControl REST, try limiting objects (like specific pools/virtual servers) rather than pulling everything.
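As a rough illustration of narrowing the scope, here is a sketch of building an iControl REST stats query that uses the `$select` query parameter to request only the counters you graph, rather than the full stat set. The hostname, credentials, and chosen fields are placeholders, not from this thread, and whether `$select` trims the payload enough to matter will depend on your TMOS version and object count.

```python
# Sketch: ask BIG-IP iControl REST for a reduced set of stats properties
# instead of the full collection. Hostname/credentials are hypothetical.
import base64
import json
import ssl
import urllib.request

BIGIP = "https://bigip.example.com"  # placeholder management address


def stats_url(base, module="ltm/virtual", fields=None):
    """Build a stats URL; $select limits which properties BIG-IP returns."""
    url = f"{base}/mgmt/tm/{module}/stats"
    if fields:
        url += "?$select=" + ",".join(fields)
    return url


# Only request the counters we actually need, not everything.
url = stats_url(BIGIP, fields=["clientside.curConns", "status.availabilityState"])

if __name__ == "__main__":
    # Unverified TLS context only because many lab BIG-IPs use self-signed certs.
    ctx = ssl._create_unverified_context()
    token = base64.b64encode(b"admin:secret").decode()  # placeholder credentials
    req = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
        print(json.load(resp))
```

The same idea applies per object: querying `/mgmt/tm/ltm/virtual/~Common~my_vs/stats` for specific virtual servers is lighter than sweeping the whole collection on every scrape.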
Also check if multiple Prometheus jobs are hitting the device at the same time; staggering them can smooth out CPU spikes. Another practical trick is enabling some form of caching on the exporter side (if supported), so BIG-IP isn’t recalculating stats every single scrape. In short, focus on making each scrape lighter, not just less frequent; that’s usually where the real improvement comes from.
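If your exporter doesn't offer caching out of the box, the idea is simple enough to sketch: hold the last response for a short TTL so back-to-back scrapes (or overlapping Prometheus jobs) reuse it instead of hitting the BIG-IP again. The 30-second TTL below is an assumption; keep it below your Prometheus scrape interval so metrics stay fresh.

```python
# Sketch: a minimal TTL cache for exporter-side responses, so repeated
# scrapes within the TTL window don't trigger new iControl REST calls.
import time


class TTLCache:
    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable clock, handy for testing
        self._store = {}        # key -> (expiry_time, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only when stale."""
        now = self.clock()
        hit = self._store.get(key)
        if hit and hit[0] > now:
            return hit[1]       # still fresh; skip the expensive call
        value = fetch()
        self._store[key] = (now + self.ttl, value)
        return value
```

In an exporter, `fetch` would be the function that actually queries the BIG-IP; every scrape calls `cache.get_or_fetch("vs_stats", fetch_from_bigip)` and only one request per TTL window reaches the device.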