BIGIP LTM Automated Pool Monitor Flap Troubleshooting Script in Bash
Problem this snippet solves:
This bash script collects data when an F5 BIG-IP LTM pool member monitor flaps over a period of time, to help determine the root cause of the BIG-IP monitor health check failure. The script watches the LTM log; when a new pool member down message appears, it performs the following steps:
1. Turn on LTM bigd debug.
2. Start a tcpdump capture of the relevant traffic.
3. Turn off bigd debug and terminate the tcpdump process when the timer elapses (the timer is configurable).
4. Generate a qkview (optional).
5. Tar the full log files under the /var/log/ directory (optional).
Script has been tested on v11.x
Code :
#!/bin/bash
########## log file that the script monitors
filename="/var/log/ltm"
########## how long (in seconds) bigd debug and tcpdump run; change it as needed
timer=60
########## IP address of the flapping pool member
poolMemberIP="10.10.10.229"
########## self IP address the LTM uses to send health monitor traffic
ltmSelfip="10.10.10.248"
########## pool member service port number
poolMemberPort="443"
########## TMOS command to turn on bigd debug
turnonBigdDebug="tmsh modify sys db bigd.debug value enable"
########## TMOS command to turn off bigd debug
turnoffBigdDebug="tmsh modify sys db bigd.debug value disable"
########## bash command to tar the BIG-IP log files
tarLogs="tar -czpf /var/tmp/logfiles.tar.gz /var/log/*"

####### file check: continue only if /var/log/ltm exists on the system
if [ -f "$filename" ]
then
    echo "$filename exists and program is running to collect data when BIG-IP pool member flaps"
else
    ####### if it does not exist, log a message and terminate with a non-zero status
    echo "no $filename file found and program is terminated"
    exit 1
fi
####### file check ends

###### write a timestamp to /var/log/ltm for tracking purposes
echo "$(date) monitoring the log" >> "$filename"

###### monitor /var/log/ltm for new events
###### -F (rather than -f) follows the file by name, so monitoring survives log rotation
tail -F -n 0 "$filename" | while read -r line
do
    ###### count pool down messages in this line (0 or 1)
    hit=$(echo "$line" | grep -c "$poolMemberIP:$poolMemberPort monitor status down")
    #echo $hit
    if [ "$hit" == "1" ]; then
        ###### display the pool down log event from /var/log/ltm
        echo "$line"
        ###### show timestamp when debug is turned on
        echo "$(date) Turning on system bigd debug"
        ###### turn on bigd debug
        $turnonBigdDebug
        ###### start the tcpdump capture in the background
        tcpdump -ni 0.0:nnn -s0 -w /var/tmp/Monitor.pcap port "$poolMemberPort" and \( host "$poolMemberIP" and host "$ltmSelfip" \) &
        ###### let debug and capture run for the configured time
        sleep "$timer"
        ###### show timestamp when debug is turned off
        echo "$(date) Turning off system bigd debug"
        ###### turn off bigd debug
        $turnoffBigdDebug
        ###### terminate the tcpdump process (this stops every running tcpdump)
        killall tcpdump
        ###### generate qkview; optional, enable it by removing the "#" sign
        #qkview
        ###### tar log files; optional, enable it by removing the "#" sign
        #$tarLogs
        break
    #else
        #echo "Monitor in progress"
    fi
done
###### show message that the program has ended
echo "$(date) exiting from program"
###### exit from the program
exit 0
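A note on the match inside the loop: grep treats each dot in the IP address as a regex wildcard, so the pattern is slightly looser than it looks. The following is only a minimal sketch of a fixed-string variant. The helper name is_member_down and the sample log line are assumptions made for illustration; they are not part of the original script, and the exact log format may differ on your version.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Returns 0 when the log line reports the given pool member as down.
is_member_down() {
    local line="$1" ip="$2" port="$3"
    # -F: fixed-string match, so the dots in the IP are literal
    # -q: print nothing, only the exit status matters
    printf '%s\n' "$line" | grep -qF -- "${ip}:${port} monitor status down"
}

# Illustrative log line (assumed format, shortened)
sample='Pool /Common/web_pool member /Common/10.10.10.229:443 monitor status down.'

if is_member_down "$sample" "10.10.10.229" "443"; then
    echo "pool member down event detected"
fi
```

Inside the monitoring loop, the hit counter could then be replaced by a direct test such as: if is_member_down "$line" "$poolMemberIP" "$poolMemberPort"; then ... fi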
Tested this on version:
11.6
- JRahm (Admin)
davidfisher, the bash sleep command value is in seconds.
Vasim, the script would need to be called from an iCall action tied to an event trigger. There are examples of this; just search for iCall. I also have an article that walks through an example, though you won't need iRules for this.
- the full logs will be in /var/tmp and should include the window under duress for analysis
- it will run whenever triggered, but in the iCall script that executes it, you can select windows to avoid
- any time you run tcpdump it impacts the system; the impact could be small or large, depending on your filters and how much data you are writing to disk in the window it's running.
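To make the two points above concrete (the timer is in seconds, and the capture cost depends on how much is written), here is a rough sketch of one way to bound the capture. The variables timerMinutes and maxPackets and the function build_capture_cmd are hypothetical additions, not part of the original script. It relies only on the standard tcpdump -c option, which stops the capture after a given number of packets. The sketch builds and prints the command without running it.

```shell
#!/bin/bash
# Hypothetical settings, not part of the original script
timerMinutes=2      # desired capture window, in minutes
maxPackets=5000     # tcpdump -c: stop after this many packets

# sleep takes seconds, so convert minutes to seconds
timer=$(( timerMinutes * 60 ))

# Build the capture command as an array so every argument stays intact
build_capture_cmd() {
    local ip="$1" selfip="$2" port="$3"
    captureCmd=(tcpdump -ni 0.0:nnn -s0 -c "$maxPackets"
                -w /var/tmp/Monitor.pcap
                port "$port" and host "$ip" and host "$selfip")
}

build_capture_cmd "10.10.10.229" "10.10.10.248" "443"

echo "timer=${timer}s"
printf '%s ' "${captureCmd[@]}"
echo
# To actually start the capture in the background on the BIG-IP:
#   "${captureCmd[@]}" &
```

With a packet cap in place, the capture ends at whichever comes first: the packet limit or the kill issued after the timer.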
- linjing (Employee)
Thanks for sharing, it's useful.
- Kiozs_131042 (Altocumulus)
There's a flaw in this script. By default, LTM rotates the LTM logs, and the command "tail -f -n 0 $filename | while read -r line" tracks the file by its file descriptor (inode), so once the file is rotated the script keeps tracking the old file. Changing the command to "tail -F -n 0 $filename | while read -r line" should overcome this issue, as "-F" tracks the file by name.
Cheers
Best Regards
Kiozs
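The difference between the two tail modes can be tried safely away from /var/log. This is a throwaway sketch, assuming GNU tail; the temp directory and the file names seen_f and seen_F are made up for the demo. It simulates a rotation by renaming the file and creating a new one with the same name, then shows what each follower captured.

```shell
#!/bin/bash
# Throwaway demo in a temp directory; nothing under /var/log is touched.
workdir=$(mktemp -d)
log="$workdir/ltm"
: > "$log"

# One follower per mode, each writing what it sees to its own file
tail -f -n 0 "$log" > "$workdir/seen_f" 2>/dev/null &
pid_f=$!
tail -F -n 0 "$log" > "$workdir/seen_F" 2>/dev/null &
pid_F=$!
sleep 1

echo "before rotation" >> "$log"
sleep 1

# Simulate log rotation: rename the old file, create a new one with the same name
mv "$log" "$log.1"
: > "$log"
sleep 2
echo "after rotation" >> "$log"
sleep 2

# Stop both followers
kill "$pid_f" "$pid_F" 2>/dev/null
wait "$pid_f" "$pid_F" 2>/dev/null

echo "tail -f saw:"
cat "$workdir/seen_f"
echo "tail -F saw:"
cat "$workdir/seen_F"
# Clean up when done: rm -rf "$workdir"
```

The follower started with -f stays attached to the renamed file, so it should miss the line written after the rotation, while the -F follower reopens the new file by name.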
- davidfisher (Cirrus)
Is the timer in minutes or seconds?
- Vasim (Altocumulus)
How do I run this script in production?
Where will the report be collected? Is there any impact on the working environment? Is it allowed to run during business hours?
Vasim & davidfisher - this codeshare has been around a while and it looks like Kiozs_131042 may not be registered on DevCentral anymore. I'll see if I can find someone to address your questions. Thanks!