Forum Discussion

JBlogs_314812
Nimbostratus
Sep 14, 2017

DNS listeners, DNS Express & BIND

I'm a little confused over what is and isn't deemed best practice. Is there anything wrong with the following points?

 

  • Listener configured; queries are for both WIP and non-WIP records. BIND is enabled so that non-WIP records can be created - is this correct?

     

  • Recursion has been enabled in the named config and restricted to an ACL of RFC 1918 addresses.

     

  • DNS Express is configured to import the local zone from BIND for performance purposes.

     

  • Unhandled Query Action is set to Drop in the DNS profile. My understanding is that, with this set, requests would not be passed to BIND, making it more secure? With this enabled the WIP times out 3 times before resolving on the 4th try. Coincidentally, I have 4 virtual servers in the GSLB pool. (The profile settings I mean are sketched just below this list.)
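
For reference, this is roughly how I read those profile settings in tmsh (a minimal sketch only; the profile name is a placeholder and the property names are my assumption for v13):

    # hypothetical profile name - listener DNS profile with DNS Express,
    # local BIND and Unhandled Query Action = Drop (assumed tmsh property names)
    tmsh create ltm profile dns dns_listener_profile \
        defaults-from dns \
        enable-dns-express yes \
        use-local-bind yes \
        unhandled-query-action drop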

     

I did try disabling BIND completely and found my WIPs again timed out several times before eventually resolving.

 

Any pointers or help much appreciated.

 

  • What version are you running? Try enabling GSLB decision logging as outlined HERE to get an idea of what GTM is "thinking". Also, in the wide IP or GSLB pool stats, is the WIP resolving with the preferred, alternate, or fallback LB method?
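
    If it helps, roughly along these lines, assuming the v11.5+ tmsh property name (the wide IP name is a placeholder, and the article covers the logging-profile side):

        # hypothetical wide IP name - turn up GSLB decision logging
        tmsh modify gtm wideip a app.example.com \
            load-balancing-decision-log-verbosity { pool-selection pool-member-selection pool-traversal pool-member-traversal }

        # then check which LB method is actually answering
        tmsh show gtm wideip a app.example.com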

     

  • Kevin_K_51432
    Historic F5 Account

    Greetings,

    This is a fairly short article and offers some good information and recommendations (at the bottom) regarding DNS packet processing:
    K14510: Overview of DNS query processing on BIG-IP systems
    
    https://support.f5.com/csp/article/K14510
    

    Hope this is helpful!

    Kevin
  • Thanks both for the help. Agreed, that's a good article; unfortunately it's also what led to my further questions.

     

    I don't believe this problem is a load balancing one; the decision logging looks correct. I only encounter these timeouts when BIND is disabled, Unhandled Query Action is set to Drop, or both.

     

    With BIND enabled I get an immediate resolution.

     

    Regardless of whether BIND is enabled or not, the statistics suggest the record is served out of DNS Express.

     

    BIG-IP is version 13.0
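
    For what it's worth, this is how I've been checking where the answer is served from (a sketch; the profile and wide IP names are placeholders, and I'm assuming the per-profile DNS statistics are exposed in tmsh on 13.0):

        # listener DNS profile statistics (GSLB / DNS Express / unhandled counters)
        tmsh show ltm profile dns dns_external_profile

        # wide IP statistics
        tmsh show gtm wideip a app.example.com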

     

  • A WIP should always be resolved by GSLB, never by DNS Express or BIND. Do the WIP pool stats show that the fallback LB method is being used? If the fallback method is set to "Return to DNS" then GTM has given up on intelligent GSLB resolution and will attempt standard resolution using DNS Express/DNS pool/BIND. If your WIP is not being resolved with the preferred or alternate LB methods then something is wrong with the GSLB config. Resolving a WIP does not require BIND or DNS Express to be enabled in the listener's DNS profile. Can you post the GSLB config from /config/bigip_gtm.conf?
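
    For example, something like this should show whether fallback is in play (a sketch; the pool name is a placeholder):

        # list the pool's LB settings - "fallback-mode return-to-dns" is the one
        # that hands resolution back to DNS Express / DNS pool / BIND
        tmsh list gtm pool a your_gslb_pool load-balancing-mode alternate-mode fallback-mode

        # the pool statistics break out preferred / alternate / fallback decisions
        tmsh show gtm pool a your_gslb_pool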

     

  • Thanks, I must have been half asleep when I wrote that last comment. I see the DNS Express stats increment when the WIP is resolved, as well as the wide IP stats. I'm not sure if the latter is anything to go by.

    The odd thing is that, regardless of the timeouts, the WIP always resolves to the correct address, even when I change the priority of the pools, so GSLB is working. This is what makes me think it's not LB-related.

    I basically have 2 listeners, one internal and one external. Internally, DNS Express is disabled and BIND is enabled; externally, DNS Express is enabled and BIND is disabled.

    Internally it resolves immediately; externally it times out but resolves on the 3rd/4th attempt.

    Config is below:

    gtm datacenter /Common/DC_1 { }
    gtm datacenter /Common/DC_2 { }
    gtm prober-pool /Common/DC_1_prober {
        load-balancing-mode round-robin
        members {
            /Common/f51.sample.com {
                order 0
            }
        }
    }
    gtm region /Common/rfc1918 {
        region-members {
            subnet 10.0.0.0/8 { }
            subnet 172.16.0.0/12 { }
            subnet 192.168.0.0/16 { }
        }
    }
    gtm server /Common/f51.sample.com {
        datacenter /Common/DC_1
        devices {
            f51.sample.com {
                addresses {
                    192.168.157.254 { }
                }
            }
        }
        monitor /Common/bigip_5s
        product bigip
        virtual-servers {
            APP_1 {
                destination 1.1.1.10:443
                translation-address 192.168.157.10
                translation-port 443
            }
            APP_1_int {
                destination 192.168.157.10:443
            }
            APP_2 {
                destination 1.1.1.11:443
                translation-address 192.168.157.11
                translation-port 443
            }
            APP_2_int {
                destination 192.168.157.11:443
            }
        }
    }
    gtm topology ldns: region /Common/rfc1918 server: subnet 192.168.157.0/24 {
        order 1
    }
    gtm topology ldns: not region /Common/rfc1918 server: not subnet 192.168.157.0/24 {
        order 2
    }
    gtm global-settings metrics {
        metrics-collection-protocols { icmp }
    }
    gtm global-settings metrics-exclusions {
        addresses none
    }
    gtm monitor bigip /Common/bigip_5s {
        aggregate-dynamic-ratios none
        defaults-from /Common/bigip
        destination *:*
        interval 5
        timeout 15
    }
    gtm pool a /Common/APP_1_gslbpool {
        alternate-mode none
        fallback-mode none
        load-balancing-mode topology
        members {
            /Common/f51.sample.com:APP_1_int {
                member-order 1
            }
        }
    }
    
  • DNS Express stats should not increment when resolving a WIP.

     

    Have you done a packet capture to see what the client is actually sending in the DNS queries? (See the capture/query sketch at the end of this reply.)

     

    What type of client are you using to test resolution? nslookup from Windows sends queries for AAAA (IPv6) RRs first before it gets to A (IPv4).

     

    A couple of questions: what is the LB method on the wide IP? Does the wide IP have a single pool or multiple pools assigned?
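
    For example (a sketch; the listener address and hostname are placeholders):

        # on the BIG-IP: capture DNS traffic to the listener to see exactly what
        # the client sends (qname, appended suffixes, A vs AAAA) and what comes back
        tcpdump -nni 0.0 -s0 host 192.0.2.53 and port 53

        # from the client: query the A record explicitly to take AAAA out of the picture
        dig @192.0.2.53 app.example.com. A +short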

     

  • It looks as though Windows is appending an invalid DNS suffix, which accounted for one of the timeouts and explains the DNS Express stats. The other timeouts were the AAAA queries.

     

    With BIND enabled, the request with the appended suffix was rejected straight away, whereas with BIND disabled (and Unhandled Query Action set to Drop) it was silently dropped, causing the timeout on the client.
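
    In case anyone else hits this, querying the FQDN with a trailing dot and asking for the A record explicitly reproduced the clean behaviour for me (a sketch with placeholder names):

        # trailing dot = absolute name, so the Windows resolver cannot append its search suffix;
        # -type=A avoids the extra AAAA lookups
        nslookup -type=A app.example.com. 192.0.2.53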

     

    Appreciate the help!