Forum Discussion

Steve_Scott_873
Historic F5 Account
Nov 23, 2010

Table Replace command with CMP - Connections reset

Hi All

 

Might be better for support, but let me try here first as it tends to be more code orientated

 

Running an F5 1600 with BIG-IP 10.2 HF2, provisioned with LTM.

 

I've got a new bit of code which caches DNS results for outbound connections (mainly so we can ride out any flakiness with DNS).

 

The code makes use of tables (CMP compatible according to the docs) and NAME::lookup @Virtual -a $Host to fire off an async DNS request, then sets the node to the address from the cache.

 

 

When NAME_RESOLVED fires, it should update the cache ready for the next connection, so far so good.

 

After some validation, we get to this bit of code:

 

log local0. "Got good DNS result [lindex $response 0]"
# "table set" seems to delete and then add; use "table replace" instead, which appears to update correctly
table replace $Host [lindex $response 0] 3600
log local0. "Updated"
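
For anyone following along, the pattern described so far might look roughly like the sketch below. This is not the actual iRule (it was not posted in the thread). The event layout, the lowercased key, the `cached` variable, and port 80 on `node` are assumptions for illustration, and the `@Virtual -a` arguments to NAME::lookup are left out for brevity. `table replace` only updates a key that already exists, so the sketch also assumes the cache entry has been seeded elsewhere.

```tcl
when HTTP_REQUEST {
    # Assumption: the cache key is the bare, lowercased Host header
    set Host [string tolower [getfield [HTTP::host] ":" 1]]

    # Serve this connection from whatever is cached right now
    set cached [table lookup -notouch $Host]
    if { $cached ne "" } {
        node $cached 80
    }

    # Fire an asynchronous lookup to refresh the cache for the next connection
    NAME::lookup $Host
}

when NAME_RESOLVED {
    set response [NAME::response]
    if { [lindex $response 0] ne "" } {
        log local0. "Got good DNS result [lindex $response 0]"
        # Updates the existing entry in place (nothing is created if the key is absent)
        table replace $Host [lindex $response 0] 3600
        log local0. "Updated"
    }
}
```

This is the shape that misbehaves under CMP in this thread, so treat it as an illustration of the problem rather than something to deploy.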

 

Tested on the F5 virtual appliance (Non CMP) again, so far, so good.

 

 

Finally got to implementation tonight. Half the time I get connected to the backend service; the other half of the time the connection just terminates.

 

 

curl -H "host: portal.something.uk" http://2620:....

 

curl: (52) Empty reply from server

 

 

Log files show

 

Nov 23 20:32:08 local/tmm info tmm[5142]: Rule Rulename : Got good DNS result 123.123.123.123

 

curl -H "host: portal.something.uk" http://2620:....

 

... LB Failed text, as the firewall's not sorted yet

 

 

Nov 23 20:31:53 local/tmm1 info tmm1[5143]: Rule Rulename : Got good DNS result 123.123.123.123

 

Nov 23 20:31:53 local/tmm1 info tmm1[5143]: Rule Rulename : Updated

 

Basically, every time this executes on TMM1 it works fine; every time it executes on TMM it fails. There are no other error messages in the logs, just connection resets. (This is part of an HTTP rule, so there is some normal packet exchange first for the F5 to get a request.)

 

20:50:03.818500 IP Removed.39928 > Removed.http: P 1:175(174) ack 1 win 45

 

20:50:03.818562 IP Removed0.39928 > Removed.http: P 1:175(174) ack 1 win 45

 

20:50:03.835191 IP Removed.http > Removed.39928: R 1:1(0) ack 175 win 4494

 

20:50:03.835272 IP Removed.http > Removed.39928: R 1:1(0) ack 175 win 4494

 

 

 

Any thoughts? I've tried a catch block, and I've tried using add instead of replace. (Add seems to delete and recreate, while replace seems to update. Add has a half-second window when there doesn't seem to be any entry.)

 

  • hoolio
    Cirrostratus
    Hi Steve,

     

     

    Can you post the full iRule? If you disable CMP on the virtual server does the issue clear?

     

     

    Also, you could try using RESOLV::lookup instead of NAME::lookup. You eliminate the need for adding code to NAME_RESOLVED and it should be more efficient than NAME::lookup.

     

     

    http://devcentral.f5.com/wiki/default.aspx/iRules/name__lookup

     

     

    Aaron
  • Steve_Scott_873
    Historic F5 Account
    The full iRule will need some agreement from management and a bit of hacking up before I can post it. Am I right in saying you're working for our supplier anyway? I can email it across.

     

     

    I don't want to use RESOLV::lookup because I don't want the transaction blocked for the duration of the DNS request. Normally this will be short, but if one of the DNS servers goes down we won't want a 15 second delay before we can process the messages.

     

     

    Changing over to CMP Disabled mode locks it to TMM0, but the persistence engine entry for this one has been distributed to TMM1, hence it always fails for that particular entry. The other half of the entries are on TMM0 and these work 100% of the time. So I'd need to globally disable CMP.
  • Steve_Scott_873
    Historic F5 Account
    Hi Aaron,

     

     

    The company decided at short notice that it actually wasn't such a good idea having all of the engineers with any F5 experience (or even logons) out of the office for the day, so I got told I was staying back, as I had the most experience with iRules, having done some development and had the 3 day iRules course at Chertsey (very good, btw).

     

    I'll double check I'm OK to pop in a sanitised version tomorrow. My experience with support on this sort of thing involves a lot of hoop jumping.

     

     

    The NAME::lookup is intentional. The idea is that the existing connection can be sent to the cached entry, and if the cache is a bit stale then I can use NAME::lookup to update the cache for the next request. This way I don't have to hold up the connection until a DNS response comes back. (There doesn't seem to be any control over this with RESOLV::lookup, which as far as I could see stalls the iRule for up to 15 seconds if the DNS servers are down or iffy.)

     

     

    The session table entries are indeed accessible. Reading works perfectly if the rule executes on either TMM process, but it only seems to write correctly when it's on its "home" process. Hence disabling CMP on the virtual (forcing execution on TMM0) caused about half of the entries to write perfectly 100% of the time, and the other half to fail 100% of the time, depending on whether the "home" process is TMM0. That's my theory anyway!

     

     

    On a non-CMP system (F5 LTM Virtual Edition) it works perfectly. I could disable CMP globally on the F5, but I suspect I should do a lot of paperwork before I can do that.
  • hoolio
    Cirrostratus
    Regarding the RESOLV::lookup timeouts, I just got some info from PD:

     

     

     

    http://devcentral.f5.com/wiki/default.aspx/iRules/resolv__lookup.html

     

     

    By default, TMM will make up to 4 consecutive query attempts (1 original with 3 retries) with an individual query timeout of 5 seconds. These parameters are globally configurable using bigpipe database keys:

     

     

    b db tmm.resolv.retry

     

    b db tmm.resolv.timeout

     

     

     

    Aaron
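
As a rough worked example of what those defaults imply: assuming the attempts run strictly back to back with no backoff between them (an assumption, not something stated above), the worst-case stall is the number of attempts multiplied by the per-query timeout. The snippet is plain Tcl rather than an iRule, and the variable names are purely illustrative.

```tcl
# Defaults quoted above: 1 original query plus 3 retries, 5 seconds each
set attempts 4
set timeout_s 5

# Upper bound on how long a lookup could stall if every attempt times out
set worst_case_s [expr {$attempts * $timeout_s}]
puts "worst case: $worst_case_s s"
```

Under the same assumption, halving the per-query timeout halves that bound.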
  • Steve_Scott_873
    Historic F5 Account
    Ooh, hadn't spotted that.

     

     

    Might well strip some parts of the rule out to get it going, and raise a case with support re the table command.
  • spark_86682
    Historic F5 Account
    Are you trying to use a table command in NAME_RESOLVED? Yeah, that's going to give you the behavior you describe: it works on one tmm, and fails on the others. This is bug 247742 ("iRule NAME_RESOLVED event does not handle suspension"). Your best bet is to switch to RESOLV::lookup instead, so you don't have to deal with the NAME_RESOLVED event.
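
A minimal sketch of what that switch might look like, keeping the table-based cache so that most requests never wait on DNS. The event choice, the key format, port 80, the placement of the 3600-second timeout, and the 503 fallback are all assumptions for illustration, not the poster's actual rule.

```tcl
when HTTP_REQUEST {
    # Assumption: the cache key is the bare, lowercased Host header
    set host [string tolower [getfield [HTTP::host] ":" 1]]

    # Try the shared table first; -notouch leaves the entry's timeout alone,
    # so a busy entry still expires and gets refreshed
    set addr [table lookup -notouch $host]

    if { $addr eq "" } {
        # Cache miss or expired entry: resolve inline (this can suspend the rule)
        set addr [lindex [RESOLV::lookup $host] 0]
        if { $addr ne "" } {
            table set $host $addr 3600
        }
    }

    if { $addr ne "" } {
        node $addr 80
    } else {
        # No cached address and no DNS answer
        HTTP::respond 503 content "DNS lookup failed"
    }
}
```

Unlike the original design, this does not serve a stale entry while refreshing in the background: a cache miss holds the request for the duration of the lookup, so the tmm.resolv.retry and tmm.resolv.timeout keys mentioned earlier in the thread are worth reviewing alongside it.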