scaissie_82066
Jun 05, 2007

LB::reselect doesn't select another node

I have an iRule that sets persistence based on a usrid string in the TCP payload. That works, but when the node goes down, I can't get the same usrid values to move to another server. They stay stuck to the node that's down. I've tried many of the ideas I've seen posted, but none of them work.

when CLIENT_ACCEPTED {
    TCP::collect 34
}

when CLIENT_DATA {
    set usrid [string range [TCP::payload] 23 33]
    if {[regexp {\d{10}} $usrid]} {
        persist uie $usrid
        log local0. "Persisting $usrid"
    }
}

when LB_FAILED {
    set p [LB::server pool]
    set n [LB::server addr]
    log local0. "Set node down: $n"
    LB::down node $n
    log local0. "Deleting $usrid from Pool $p"
    persist delete uie $usrid
    LB::detach
    LB::reselect pool $p
    set x [LB::server addr]
    log local0. "After reselect node is $x"
}

Here's the ltm log file:

Jun 5 10:48:19 tmm tmm[13956]: Rule phoneIDoffset : Persisting 0000000001
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : Persisting 0000000001
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : Set node down: 10.3.0.254
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : Deleting 0000000001 from Pool pool.asr
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : After reselect node is
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : Persisting 0000000001
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : Set node down: 10.3.0.254
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : Deleting 0000000001 from Pool pool.asr
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : After reselect node is
Jun 5 10:48:30 tmm tmm[13956]: Rule phoneIDoffset : Persisting 0000000001

It just loops forever.

 

  • I found a bug in my code. The start offset in the "string range" call should be 24 instead of 23; with 23, $usrid picked up an extra space at the beginning. I'm guessing that "persist delete uie $usrid" was stripping that space, which means it was deleting an entry that didn't exist.
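
    For anyone hitting the same thing, here is a minimal sketch in plain Tcl (runnable in tclsh, outside the BIG-IP) of why the off-by-one goes unnoticed. The payload is made up for illustration: 23 filler bytes, a space, then a 10-digit id. Because "string range" is inclusive at both ends and the regexp is unanchored, the 11-character value with the leading space still passes the check. Anchoring the pattern is a suggestion on top of the fix described above, not something taken from the original rule.

    ```tcl
    # Hypothetical 34-byte payload: 23 filler bytes, a space at index 23,
    # then a 10-digit id at indexes 24-33.
    set payload "[string repeat X 23] 0000000001"

    # string range is inclusive at both ends, so 23..33 yields 11 characters.
    set bad  [string range $payload 23 33]
    set good [string range $payload 24 33]

    puts "bad  = '$bad' ([string length $bad] chars)"
    puts "good = '$good' ([string length $good] chars)"

    # The unanchored pattern matches anywhere in the string, so both pass.
    set badLoose  [regexp {\d{10}} $bad]
    set goodLoose [regexp {\d{10}} $good]
    puts "unanchored: bad=$badLoose good=$goodLoose"

    # Anchoring the pattern rejects the value with the stray leading space.
    set badStrict  [regexp {^\d{10}$} $bad]
    set goodStrict [regexp {^\d{10}$} $good]
    puts "anchored:   bad=$badStrict good=$goodStrict"
    ```

    With an anchored pattern, the bad offset would have shown up as the rule never persisting at all, instead of persisting under a key with a leading space.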

     

  • I also had a problem with LB::reselect always selecting the same node, but in different circumstances. My iRule uses "hand-made" persistence, i.e. it extracts the server address from the packet contents and selects it using the node command. A simplified version of the iRule is below (the actual one is too complex to post here):

     

    
    when HTTP_REQUEST {
      HTTP::collect [HTTP::header "Content-Length"]
    }
    when HTTP_REQUEST_DATA {
      # Assume that payload contains host:port
      set host [substr [HTTP::payload] 0 ":"]
      set port [findstr [HTTP::payload] ":" 1]

      # Route there only if node status is "up"
      set status [LB::status pool $::pool member $host $port]
      log local0.debug "Status: $status"
      if { $status eq "up" } {
        node $host $port
      }
    }
    when LB_FAILED {
      # If the selected server doesn't work, route it to any server
      LB::reselect
    }

     

     

    This didn't work correctly: LB::reselect always selected the same node that the "node" command had selected previously. Adding LB::detach before the reselect didn't change anything. However, it turned out that using LB::reselect with a pool name works (even without detach)! So the code for the LB_FAILED event now looks as follows:


    
    when LB_FAILED {
      # If the selected server doesn't work, route it to any server
      LB::reselect pool mypool
    }

     

    It seems that when I was using LB::reselect without an argument, the node command was still in force.