Moz_58995
Sep 20, 2011 · Nimbostratus
External Monitor marking pool member down. Why?!
Hi All,
I’m fairly new to F5 and new to External Monitors. A customer has asked us to implement an External Monitor (because the template monitor we configured sometimes fails to detect a failure of the customer’s application), but the External Monitor is marking the pool member as down. I’ve gone through the implementation and troubleshooting guides here on DevCentral, but still haven’t been able to figure out why our setup isn’t working.
The Perl script (diffusionPing.pl) looks like this and was supplied to us by the customer:
#!/usr/bin/perl -w
use strict;
use Diffusion::Client;
use Diffusion::ClientDetails;
use Diffusion::Credentials;
use Getopt::Std;
require 5;

sub usage() {
    print STDERR "usage: $0 [-v] [-u username] [-p password] host port\n";
    exit 1;
}

# Handle options
my %options = ();
usage() unless( getopts( "vu:p:", \%options ) );

# Check the number of arguments, should be host port
usage() unless( 2 == @ARGV );
my $verbose = $options{v};
my( $hostName, $portNumber ) = @ARGV;

# Convert an IPv4 over IPv6 address into pure IPv4 format
$hostName =~ s/::ffff://g;

eval {
    my $cnxDetails = new Diffusion::ClientDetails( $hostName, $portNumber );
    # Handle optional credentials
    if( $options{u} and $options{p} ) {
        my ($user, $password) = ( $options{u}, $options{p} );
        $cnxDetails->setCredentials( new Diffusion::Credentials( $user, $password ) );
        print "Connecting to $hostName:$portNumber as $user\n" if( $verbose );
    } else {
        print "Connecting to $hostName:$portNumber\n" if( $verbose );
    }
    my $diffClient = new Diffusion::Client( $cnxDetails );
    $diffClient->connect();
    my $result = $diffClient->pingServer();
    if( $verbose ) {
        print "Ping response received\n";
    } else {
        print "UP\n";
    }
    $diffClient->close();
    1; # return 1, or the exception handler is called
} or do {
    # print nothing
    print "Cannot ping $hostName:$portNumber: $@" if( $verbose );
}
Running it from the CLI works:
./diffusionPing.pl -v 172.27.11.27 18080
Connecting to 172.27.11.27:18080
Ping response received
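A successful run from an interactive shell may not prove much on its own, because the monitor process probably starts the script with a different environment, and it may pass the address in IPv4-mapped form (::ffff:172.27.11.27). A minimal sketch of a closer test, assuming the Diffusion modules might only be found through variables set in the login shell; `run_clean` is a hypothetical helper name and the reduced PATH is a guess:

```shell
#!/bin/sh
# Hypothetical helper: run a command with an almost empty environment,
# to approximate (not reproduce) how a monitor daemon might start it.
run_clean() {
    env -i PATH=/usr/bin:/bin "$@"
}

# Example call, guarded so that nothing runs if the script is absent
if [ -x ./diffusionPing.pl ]; then
    run_clean ./diffusionPing.pl -v ::ffff:172.27.11.27 18080
fi
```

If Perl then complains that it cannot locate Diffusion/Client.pm in @INC, the module search path would be the first thing to look at.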
On LTM:
monitor diffusionPing {
   defaults from external
   interval 45
   timeout 136
   run "/usr/bin/monitors/diffusionPing.pl"
}
tcpdump -ni 1.4 host 172.27.11.27 and port 18080
tcpdump: listening on 1.4
16:18:15.950940 802.1Q vlan4093 P0 172.27.11.231.36535 > 172.27.11.27.18080: S 1770442755:1770442755(0) win 5840 (DF)
16:18:15.950940 802.1Q vlan4093 P0 172.27.11.27.18080 > 172.27.11.231.36535: S 1105197470:1105197470(0) ack 1770442756 win 5792 (DF)
16:18:15.950940 802.1Q vlan4093 P0 172.27.11.231.36535 > 172.27.11.27.18080: . ack 1 win 5840 (DF)
16:18:15.950940 802.1Q vlan4093 P0 172.27.11.231.36535 > 172.27.11.27.18080: F 1:1(0) ack 1 win 5840 (DF)
16:18:15.950940 802.1Q vlan4093 P0 172.27.11.27.18080 > 172.27.11.231.36535: . ack 2 win 46 (DF)
From the tcpdump I can see the F5 connecting to 172.27.11.27 on port 18080 and what look like responses coming back. However, I recently removed the monitor from the pool and tcpdump still shows the same traffic?! So I’m not even sure the F5 is calling the script.
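One way to find out whether the script is invoked at all would be to point the monitor's run line at a small wrapper that logs every call. A minimal sketch, assuming the monitor passes the member address and port as the first two arguments (which is what diffusionPing.pl already expects); the log path and the `run_check` name are made up for illustration:

```shell
#!/bin/sh
# Hypothetical logging wrapper around the real check script.
# Both paths are assumptions for illustration; adjust as needed.
LOG="${MONITOR_LOG:-/var/tmp/diffusionPing.log}"
CHECK="${MONITOR_CHECK:-/usr/bin/monitors/diffusionPing.pl}"

run_check() {
    # $1 = member address (possibly ::ffff:a.b.c.d), $2 = member port
    host=$(printf '%s' "$1" | sed 's/^::ffff://')
    echo "$(date) called: host=$host port=$2" >> "$LOG"

    # Capture stdout; send the script's stderr to the log
    out=$("$CHECK" "$host" "$2" 2>> "$LOG")
    rc=$?
    echo "$(date) exit=$rc output=[$out]" >> "$LOG"

    # Pass the script's stdout through unchanged
    if [ -n "$out" ]; then
        printf '%s\n' "$out"
    fi
    return 0
}

# Only run when called with an address and a port
if [ "$#" -ge 2 ]; then
    run_check "$@"
fi
```

If the log stays empty while the monitor is assigned to the pool, the script is not being called. If it shows calls with empty output, the failure is inside the Perl script, and the captured stderr should say why.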
Any tips would be awesome. As I understand it, the F5 marks a pool member as up when it receives “UP” (or any output at all) from the script, and marks it as down if the script outputs nothing. Is that right? The F5 is running 10.0.0 (I know. We’ve been pressing the customer for ages to give us downtime to upgrade).
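That understanding of the contract can be sketched in a few lines: anything written to stdout counts as up, and silence counts as down. This is an illustration of the assumed behaviour, not something confirmed against 10.0.0, and `report_status` is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical illustration of the assumed external monitor contract:
# print something on success, print nothing on failure.
report_status() {
    # Run the given command, discarding its own output
    if "$@" > /dev/null 2>&1; then
        echo "UP"
    fi
    return 0
}

report_status true     # prints UP
report_status false    # prints nothing
```

Swapping the real check for a script that does nothing but echo "UP" would be a quick way to test the monitor plumbing separately from the Diffusion client.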
Many thanks,
Matt