Monitoring Flapping
I need some help investigating a problem with our GTMs whereby all pools at one of our two sites repeatedly change state between available and down.
Our setup is two GTMs in one sync group, located at two different data centres. We use the GTMs to load balance a simple HTTP web service between the two sites.
Our monitoring uses HTTP monitors that make a simple HTTP request and check for a response string. The HTTP monitor timeout is set to 121 seconds, the interval to 30 seconds and the probe timeout to 5 seconds.
The monitors report that the pool members are timing out, but I know this is not the case. We monitor all of our services directly (not via the GTMs) from three different locations, and the services generally respond in under a second, with occasional very small blips of up to 3 seconds.
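For anyone who wants to reproduce the check outside the GTMs, the following Python sketch approximates what I understand the monitor to do: make one HTTP request, look for a string in the response, and treat anything slower than the probe timeout as a failure. It is only an approximation and not how the GTM implements its probe. The names `fetch_body`, `probe` and `watch` are made up for illustration, and the URL and response string passed to them are placeholders.

```python
import http.client
import socket
import time
import urllib.error
import urllib.request


def fetch_body(url, timeout):
    """Return the response body of a GET request as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def probe(url, expected, probe_timeout=5.0, fetch=fetch_body):
    """Run one check and return (status, elapsed_seconds).

    status is "up" if the body contains `expected` and the request
    finished within probe_timeout, "timeout" if it took too long,
    and "down" for any other failure.
    """
    start = time.monotonic()
    try:
        body = fetch(url, probe_timeout)
    except (OSError, http.client.HTTPException) as exc:
        elapsed = time.monotonic() - start
        # urllib wraps connect timeouts in URLError, so inspect .reason too.
        reason = getattr(exc, "reason", exc)
        timed_out = isinstance(reason, (socket.timeout, TimeoutError))
        return ("timeout" if timed_out else "down"), elapsed
    elapsed = time.monotonic() - start
    # The socket timeout bounds each blocking call, not the whole request,
    # so the total elapsed time is compared against the limit as well.
    if elapsed > probe_timeout:
        return "timeout", elapsed
    return ("up" if expected in body else "down"), elapsed


def watch(url, expected, count=10, interval=30.0, probe_timeout=5.0):
    """Probe `count` times, `interval` seconds apart, printing each result."""
    for _ in range(count):
        status, elapsed = probe(url, expected, probe_timeout)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {status} {elapsed:.3f}s")
        time.sleep(interval)
```

Running something like `watch("http://service.example/health", "OK")` from a host on the same network segment as the affected GTM would show whether slow or failed responses are visible from that vantage point, which our three external monitoring locations might not reveal.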
Any ideas on how I can get more (debugging) information out of the GTMs to help find the solution would be great. Any ideas about what might be the cause would be even better!
Thanks,
David.