LTM External Monitors: The Basics
LTM's external monitors are incredibly flexible, fairly easy to implement, and especially useful for monitoring applications for which there is no built-in monitor template. They let you monitor the health of just about any application by writing custom scripts that interact with your servers the same way users would. In this article, I will explain the basic LTM external monitoring paradigm, then dissect and explain one of the sample monitors from the Advanced Design & Config codeshare. (Thanks to poster pgroven for inspiring me to finally write this up.)

An "External Monitor" is a script, "external" to the configuration file, that contains specific logic designed to interact with your servers to verify the health of load balanced services. LTM runs a unique instance of the script against each pool member to which it is applied, passing command line arguments and environment variables as specified in the monitor definition that calls the script. The script logic formulates and submits a request (or requests) to the target pool member, evaluates the response(s), and manages the pool member's availability based on the results.

The Tools

The sample monitor scripts

The external script itself should be a shell script (if at all possible) to minimize overhead. If absolutely necessary, a perl script may be used instead, but keep in mind that the overhead of invoking the interpreter and required modules for multiple instances may negatively impact overall performance. In addition, LTM was not intended to be a development platform or a dedicated monitoring device, and thus has a limited set of development tools and modules included in the software build, so you may not find the perl modules you need. You can add them, but doing so is neither recommended nor supported, and those customizations will likely not survive an upgrade.
(You can also use an external monitor to invoke a compiled program, but that discussion is beyond the scope of this article.)

cURL

cURL is a very flexible command line tool you can use in shell and perl scripts for complex interactions with HTTP and FTP servers.

netcat

netcat is another useful command line tool that facilitates interaction with TCP and UDP services.

The LTM external monitor template

The LTM external monitor template allows you to specify the name of the script to run, the interval & timeout, the command line arguments and variables the script requires, and an alternate destination for the monitor traffic.

The Tips ("good to know" stuff and best practices recommendations)

There are a few special considerations to keep in mind when writing the script and configuring the LTM monitor definition that calls it.

Do you really need an external monitor?

Never use an external monitor when a built-in one will work as well. Forking a shell and running even the simplest shell script takes a significant amount of system resources, so external monitors should be avoided whenever possible. If possible, have the server administrator script the required transaction on the server itself (or locate/author an alternative script on the server) so that it reliably reflects the server's availability. Then, instead of an external monitor, you can define a built-in monitor that requests that dynamic script from the server, and let the server run the script locally and report results. For example, the simple request/response HTTP transaction in the sample script below would be much better implemented using the built-in basic HTTP monitor.

Optimization

Use the lowest overhead tools, make the simplest possible request, & minimize the amount of response parsing required to determine the pool member's status. The script can contain just about any logic you want to determine whether the server is healthy.
You can use commandline tools like netcat and cURL to replicate server transactions, from a basic request and response parsing for an expected string, to more complicated exchanges where cookies or persistence tokens are used, login is required, or some other dynamic transaction must take place in order to establish the usability of a server by its intended users.

Redundant pairs

Both units in a redundant pair will independently run the configured monitor, even when running as Standby. Monitor status is not shared between the units of a redundant pair.

Variables

Variables may be passed to indicate service or hostname, the URI you need to request, or just about any piece of information that would be needed to construct a valid query and receive a valid response from the server. Variables can contain static values, basic regular expressions, or even expressions that contain other variables. As long as your script receives the expected variables from the monitor definition and the logic handles them appropriately, the possibilities are nearly limitless.

Authentication

If your script must pass authentication tokens to the pool members to sufficiently transact with them, make sure the authentication method will allow multiple concurrent logins. Each pool-member-specific instance on each member of a redundant pair may attempt to log in simultaneously. If only a single login is allowed per credential, authentication collisions will most likely cause pool members to be falsely marked down on a rolling basis, since only one monitor request can succeed at a time.

Script against one pool member

The script should be written to determine the health of one specific pool member. An LTM monitor script is really a template for monitoring a single pool member.
Whether you apply an LTM monitor to an individual pool member or to the entire pool in the GUI, a separate copy of the monitor runs for each pool member, passing only that specific IP & port to be tested and maintaining only that single tested pool member's availability. (Discrete monitoring of a single pool member by an external monitor is especially important if other monitors will also be applied to the pool members.)

Minimize the work

Keep the amount of work your monitor script must perform as small as possible. Both the script that runs on LTM and the request against the server itself should represent the minimum interaction required to adequately determine the server's health. If you consider how often the monitor will make that request against each pool member, you can get an idea of the scale of the work that you're asking both BIG-IP and your servers to do.

The Ins & Outs

A script intended for use as an external monitor must conform to some specific input and output requirements.

Command line arguments

The IP and port of the pool member are passed automatically as the first 2 command line arguments for all external monitors. The IP address is always passed in IPv6 format (TMOS' internal address format). IPv4 addresses are passed using IPv6's special "IPv4 mapped address" transition notation: the IPv4 address prefixed with "::ffff:". In that notation, the IP address for pool member 10.0.0.1:80 would be "::ffff:10.0.0.1". Handling the address format correctly is critical to proper operation of your monitor script. More on that later.

Additional command line arguments may be defined in the monitor configuration. When defined, they are passed to the script by the monitoring daemon as the 3rd, 4th, and subsequent arguments.

Variables

Variables in the form of Name/Value pairs may also be defined in the monitor configuration. When defined, they are created as environment variables in the shell forked for each instance of the script.
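The inputs described above can be sketched in shell as follows. This is a minimal illustration and not part of the codeshare sample: the function names strip_mapped_prefix and show_inputs are hypothetical, and URI stands in for whatever Name/Value pairs your monitor definition supplies. The diagnostic line is written to stderr on purpose, because anything written to stdout would be read as "mark this pool member up".

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not from the codeshare script.

# Remove a leading IPv4-mapped prefix ("::ffff:") if present;
# any other address is returned unchanged.
strip_mapped_prefix() {
    printf '%s\n' "${1#::ffff:}"
}

# Show how a monitor script sees its inputs:
#   $1 = pool member IP, $2 = port, $3... = extra arguments from the monitor
#   definition; Name/Value pairs (URI here) arrive as environment variables.
# Output goes to stderr so it cannot be mistaken for an "up" signal.
show_inputs() {
    ip=$(strip_mapped_prefix "$1")
    port=$2
    shift 2
    printf 'member=%s:%s extra_args=%s URI=%s\n' \
        "$ip" "$port" "$*" "${URI:-unset}" >&2
}
```

Calling show_inputs ::ffff:10.0.0.1 80 from a shell where URI has been exported is a quick way to confirm what the script would receive. Parameter expansion is used here instead of echo piped to sed only to avoid forking two extra processes per run; either approach yields the same address.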
Script Output

IF ANY VALUE AT ALL GOES TO STANDARD OUTPUT, THE POOL MEMBER WILL BE MARKED UP.

If the pool member satisfies the script logic and is therefore considered healthy enough to receive load balanced traffic, the script should write any non-empty value to standard output, and the monitoring daemon will mark the pool member up. If the pool member does not respond as expected, the script should write nothing to stdout, and the lack of output will cause the monitoring daemon to mark the pool member down at the expiration of the timeout. All other output from the script is ignored by the monitoring daemon.

The Timing

The interval is the amount of time that will elapse between the start of each monitor attempt. In order to avoid creating a Denial of Service situation by sending your servers excessive monitor traffic, you should increase the interval as much as possible. The interval MUST be longer than the longest possible healthy response should take, since each successive instance of the script run against a pool member will kill off any already-running previous instance, assuming it is hung and will never complete.

F5 recommends a timeout value of 3 times the interval value plus 1 second, but you can use a different ratio if necessary. Setting the timeout shorter than the interval is not recommended. If you consider that the monitor will make that request every <interval> against each pool member, you can get an idea of the scale of the work that you're asking both LTM and your servers to do, so some careful testing is in order, with the goal of minimizing the timeout value and maximizing the interval. (If you notice that your healthy pool members are being marked down and then back up again on the next interval, your timeout may be too short, and some further experimentation may be in order.)

There are also ways that you can control and tighten up the tolerance for timing in some monitors.
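The timing guidance above reduces to simple arithmetic, sketched here in shell. The helper names recommended_timeout and requests_per_hour are hypothetical and exist only for illustration; the formulas are the 3-times-interval-plus-1 guideline and one script instance per pool member per interval.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Timeout (seconds) suggested by the "3 x interval + 1" guideline.
recommended_timeout() {  # arg: interval in seconds
    echo $(( 3 * $1 + 1 ))
}

# Monitor requests per hour generated by ONE unit:
# one script instance per pool member per interval.
requests_per_hour() {  # args: member-count interval-seconds
    echo $(( $1 * 3600 / $2 ))
}
```

For example, recommended_timeout 5 gives 16, and requests_per_hour 20 10 gives 7200. Because both units of a redundant pair monitor independently, the servers in that second case would see twice that number.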
In another article, we will take a closer look at a different external monitor that marks pool members down a little more aggressively than waiting for the monitor timeout.

The Gory Details

Here's a sample monitor from the codeshare: HTTPMonitor_cURL_BasicGET

Let's go through the script a section at a time and take a closer look at what's going on. First of all, notice that the script documentation tells us it is expecting 2 variable definitions:

    # This example expects the following Name/Value pairs:
    #  URI  = the URI to request from the server
    #  RECV = the expected response (not case sensitive)

For this example, we are going to request the URI "/testpath/testfile.html" over HTTP from each server, and expect a string that says "Server is UP!!!". (As noted earlier, the simple request/response HTTP transaction demonstrated here would be much better implemented using the built-in basic HTTP monitor with static request/receive strings, but it is still helpful in demonstrating the basic requirements for external monitor implementation.)

Now that we know what variables we need to define, the monitor configuration will look like this:

    monitor ExternalHTTP {
       defaults from external
       RECV "Server is UP!!!"
       run "HTTPMonitor_cURL_BasicGET"
       URI "/testpath/testfile.html"
    }

Once the monitor is defined, it can be applied to the pool members. (The monitor can be applied to individual pool members or the entire pool. Either way, a unique instance of the script is run for each pool member at each interval to monitor each pool member independently.)

When the monitoring daemon (bigd) runs the script according to the monitor definition, it forks a new shell and creates the required environment variables, then invokes the script with the 2 default command line arguments (the target pool member's IP address and port). At the start of the script, the command line arguments are processed.
First it checks whether the IPv6 address passed is in the IPv4 mapped format, and if so, converts it to a standard IPv4 address, then assigns both arguments to named shell variables:

    # remove IPv6/IPv4 compatibility prefix (LTM passes addresses in IPv6 format)
    IP=`echo ${1} | sed 's/::ffff://'`
    PORT=${2}

Once the IP and PORT variables are defined, they are used to set up a process management scheme intended to prevent multiple copies of the monitor from running against the same pool member at the same time. It works like this: each instance of the script first looks for a unique file named "monitorname.IP_port.pid" in /var/run containing the process ID of the last instance of the script run against that pool member. If it exists, the last instance of the script has not completed. Since multiple copies of the same script running against the same pool member may interfere with proper monitor operation, the script kills that process, then rewrites the PID file with the process ID of the current instance for reference by the next instance.

    PIDFILE="/var/run/`basename ${0}`.${IP}_${PORT}.pid"
    # kill off the last instance of this monitor if hung and log current pid
    if [ -f $PIDFILE ]
    then
       kill -9 `cat $PIDFILE` > /dev/null 2>&1
    fi
    echo "$$" > $PIDFILE

Now the heavy lifting begins. In this example, we're simply requesting a URI and examining the response to see if it contains the RECV string:

    # send request & check for expected response
    curl -fNs http://${IP}:${PORT}${URI} | grep -i "${RECV}" > /dev/null 2>&1

(Remember this is a simplified example. In a real world example, the logic inserted here would replicate whatever transactions you identified earlier as the minimum required interaction to determine the pool member's health. cURL has a wide range of options you can use to mimic almost any browser operation, including sending and receiving cookies, to replicate multi-step transactions or validate complex responses.)
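The steps walked through so far can also be pulled apart into small functions, which makes each one easy to exercise on its own. What follows is a rough sketch under stated assumptions, not the codeshare script: the function names and the PID_DIR override are hypothetical, and grep -q is swapped in for redirecting grep's output (it prints nothing on a match, so stdout stays clean). It assumes a grep that supports -q, as GNU grep does.

```shell
#!/bin/sh
# Sketch only: function names and the PID_DIR override are hypothetical.
# PID_DIR defaults to /var/run, as in the article; override it for testing.
PID_DIR="${PID_DIR:-/var/run}"

# Build the per-member PID file path: <dir>/<monitor>.<ip>_<port>.pid
pidfile_for() {  # args: monitor-name ip port
    printf '%s/%s.%s_%s.pid\n' "$PID_DIR" "$1" "$2" "$3"
}

# Kill any previous instance recorded in the PID file (ignoring errors
# from a stale or empty file), then record the current shell's PID.
claim_pidfile() {  # arg: pidfile path
    if [ -f "$1" ]; then
        kill -9 "$(cat "$1")" > /dev/null 2>&1 || true
    fi
    echo "$$" > "$1"
}

# Succeed if the response read from stdin contains the expected string,
# ignoring case. Nothing is written to stdout either way.
response_matches() {  # arg: expected string
    grep -qi -- "$1"
}
```

With these in place, the request step could be written as curl -fNs "http://${IP}:${PORT}${URI}" | response_matches "${RECV}" && echo "UP". If you do redirect instead, note that order matters in the shell: "> /dev/null 2>&1" silences both streams, whereas "2>&1 > /dev/null" leaves stderr pointed at the script's original stdout.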
If the response contained the value of the RECV variable, the "grep" command will return 0, causing the script to send the string "UP" to stdout, and the pool member will be marked up immediately. If the response did NOT contain the value of the RECV variable, the "grep" command will return a non-zero value, the script will output nothing to stdout, and the pool member will be marked down when the timeout expires.

    # mark node UP if expected response was received
    if [ $? -eq 0 ]
    then
        echo "UP"
    fi

And finally the script will delete the PID file written earlier (since it has finished cleanly and won't need to be killed off by the next instance) and then exit:

    rm -f $PIDFILE
    exit

It doesn't work... what now?

Troubleshooting external monitors can be challenging. In my next article, I'll cover the basic process you can follow to track down and resolve any issues that may interfere with proper monitor operation. (LTM External Monitors: Troubleshooting)

LTM External Monitors: Troubleshooting
My last article explained the basics of implementing an LTM external monitor script. This article covers some helpful tools and techniques you can use to validate/troubleshoot/debug your external monitor implementation.

There are 2 basic parts to an external monitor: 1) the monitor definition in the LTM configuration; and 2) the external program it calls. The external program should be validated before configuring an external monitor to use it. I'll continue to use the shell script from the previous article as an example, but the basic technique is the same regardless of the nature of the program.

Validating the external program

Validating the external program is fairly straightforward. Simply run the program at the command line with the expected commandline arguments and the expected environment variables defined. The program must either send something to standard output (stdout) if the pool member is to be marked up, or output nothing to stdout if it is to be marked down. (Output to standard error (stderr) is ignored by the monitoring daemon, but should be resolved before deploying the monitor.)

Start by installing the program in the /usr/bin/monitors directory and making it executable. Instructions are here.

Debugging: Validating program I/O

Identify the environment variables the program requires and export them to the shell at the command line. For our example script, the variables URI and RECV are required, so export them like this:

    export RECV="Server is UP!!!"
    export URI="/testpath/testfile.html"

Verify they are set as expected by running the "set" command and grepping for the expected variable names:

    set | grep RECV
    set | grep URI

Execute the program, giving the expected commandline arguments, which at a minimum will be the IP address and port of the pool member to be tested.
For our example script, the commandline arguments are simply that -- the IP and port of one pool member:

    /usr/bin/monitors/HTTPMonitor_cURL_BasicGET 10.0.0.1 80

The pool member against which the test is run should be in a known state. I recommend starting with a pool member that's known to be up. If the pool member is up and the script operates as expected, the script should return the expected character string to stdout. This code snippet from our example specifies the character string that will be sent to stdout only if the expected response is seen. The string that will be conditionally output to stdout is "UP":

    if [ $? -eq 0 ]
    then
        echo "UP"
    fi

The expected result for a healthy node would be:

    monitors # /usr/bin/monitors/HTTPMonitor_cURL_BasicGET 10.0.0.1 80
    UP
    monitors #

Once you have verified that a healthy pool member is marked up as expected by the program, test against a known down pool member. For a known down pool member, the script should return nothing, and the monitoring daemon will mark it down at the expiration of the timeout. The expected result for an unhealthy node would be no output:

    monitors # /usr/bin/monitors/HTTPMonitor_cURL_BasicGET 10.0.0.1 80
    monitors #

If you've tested against both a healthy pool member and an unhealthy one, and the script doesn't return the expected output, or if any unexpected output is seen on either stdout or stderr, then some debugging is in order.

The most common not-so-obvious mistake we see in external monitors is uncontrolled output to stdout -- commands that send data to stdout every time the script runs. For example, pgroven's initial version included the following command to display a variable name & value on screen for debugging purposes:

    echo "exstatus = $exstatus"

Given that code, the result for a healthy node would be:

    monitors # /usr/bin/monitors/example_script 10.0.0.1 80
    exstatus = 0
    UP
    monitors #

which is no problem -- the pool member will be marked up because text was seen on stdout.
However, the result for an unhealthy node would be:

    monitors # /usr/bin/monitors/example_script 10.0.0.1 80
    exstatus = 1
    monitors #

and in this case, the pool member would still be marked up even if the server didn't respond at all. Because there is always something sent to stdout, the script would always cause the pool member to be marked up regardless of the server response (or lack thereof).

Debugging: Validating network I/O

Looking at the server logs can be helpful, but the most definitive information is on the network itself. A packet trace can be used to look directly at the conversation that is happening on the wire (or, as the case may be, not happening on the wire). You'll want to capture on the server-facing interface, and filter for the non-floating self IP on the server-facing VLAN and the port of the pool member:

    tcpdump -nni <server_vlan> -Xs 0 host <internal_non-floating_selfIP> and port <pool_member_port>

If the expected request traffic is not seen being sent from LTM to the server, it is typically for one of 2 reasons: 1) the command issuing the request is flawed (we'll address that in the next section); or 2) the network address doesn't appear to be valid.

If the network address doesn't appear to be valid, all the usual suspects come into play: missing/bad IP address, no route to host, missing/bad L2 address (pool member is hard down or not visible for some other reason), missing/bad port, service not listening, etc. Verify connectivity for L1-4 as you usually would.

Poster pgroven's case highlighted a common issue that manifests as the request never being sent: the monitoring daemon passes the IP address of the pool member to the external program using IPv6's IPv4 mapped format, as explained in the previous article. The IPv6 address passed to the cURL command is not a valid address for the corresponding IPv4 pool member -- it doesn't map to any L2 address.
Since there is no valid IP address to which to send the cURL request, no packets will be seen on the wire. Adding the code to translate the IPv6 address format to IPv4 resolves that issue.

This situation raises an important troubleshooting detail: the most authoritative test for all of the script execution examples shown here would actually be to use the same format the monitoring daemon uses when passing the commandline argument for the IP address:

    /usr/bin/monitors/HTTPMonitor_cURL_BasicGET ::ffff:10.0.0.1 80

Otherwise you're not actually testing the entire script.

Debugging: Validating individual commands

If you've examined a packet trace and you see the request being sent to the server, but no response or an error response is seen, take a closer look at the request itself and make sure that it is valid and well formed. Using our shell script as an example, you can debug the request itself by issuing the same command at the command line with the appropriate values inserted, adjusting as necessary until the expected response is received. You may need to remove output suppression flags if they were included. For example, the cURL command in the example shell script is embedded in this line of code:

    curl -fNs http://${IP}:${PORT}${URI} | grep -i "${RECV}" > /dev/null 2>&1

so you would want to remove the output suppression flags (-f & -s) and substitute the appropriate variable values to submit this request from the command line:

    curl -N http://10.0.0.1:80/testpath/testfile.html

Once any existing problems have been identified and resolved, substitute the corrected command(s) into your script and continue testing.

Debugging: Validating logic

If your request and response look correct, but the script is still not producing the expected output, you'll have to dig into the script logic.
For a shell script such as the one we're using in our example, the simplest way is to invoke the shell command explicitly with the xtrace flag, passing it the script name and its commandline arguments. The xtrace flag (-x) causes the shell to write each command (preceded by '+') to stderr before it is executed, displaying all variables fully expanded and showing the results of any logical comparisons. Running this command to test the script

    sh -x /usr/bin/monitors/HTTPMonitor_cURL_BasicGET ::ffff:10.0.0.1 80

gives this output (on stderr) for a pool member NOT returning the expected response:

    ++ echo ::ffff:10.0.0.1
    ++ sed s/::ffff://
    + IP=10.0.0.1
    + PORT=80
    ++ basename /usr/bin/monitors/HTTPMonitor_cURL_BasicGET
    + PIDFILE=/var/run/HTTPMonitor_cURL_BasicGET.10.0.0.1_80.pid
    + '[' -f /var/run/HTTPMonitor_cURL_BasicGET.10.0.0.1_80.pid ']'
    + echo 19955
    + curl -fNs http://10.0.0.1:80/testpath/testfile.html
    + grep -i 'Server is UP!!!'
    + '[' 1 -eq 0 ']'
    + rm -f /var/run/HTTPMonitor_cURL_BasicGET.10.0.0.1_80.pid
    + exit

A healthy pool member returning the expected response will produce output similar to the following:

    ++ echo ::ffff:10.0.0.1
    ++ sed s/::ffff://
    + IP=10.0.0.1
    + PORT=80
    ++ basename /usr/bin/monitors/HTTPMonitor_cURL_BasicGET
    + PIDFILE=/var/run/HTTPMonitor_cURL_BasicGET.10.0.0.1_80.pid
    + '[' -f /var/run/HTTPMonitor_cURL_BasicGET.10.0.0.1_80.pid ']'
    + echo 20064
    + curl -fNs http://10.0.0.1:80/testpath/testfile.html
    + grep -i 'Server is UP!!!'
    + '[' 0 -eq 0 ']'
    + echo UP
    UP
    + rm -f /var/run/HTTPMonitor_cURL_BasicGET.10.0.0.1_80.pid
    + exit

All lines prefixed with "+" are output on stderr from xtrace. Notice that in the first example, all output was from xtrace, and nothing went to stdout, so the monitoring daemon would have marked the pool member down after the timeout expired unless another instance marked it up again.
In the second, the one line without the + prefix is the expected output to stdout, the string "UP", and the monitoring daemon would have marked the pool member up.

Those are the primary tools and techniques I use for troubleshooting external monitor scripts. You can follow this sequence, or you could do it all backwards, or start in the middle. It is mildly complex, so where you dig in really depends on what you observe and/or your gut instinct about what might be wrong.

Validating the external monitor template configuration

Validation of the external monitor template configuration is simply a matter of comparing its settings to the commandline arguments and variables you used successfully during validation of the external program itself, then applying the monitor to a pool member to watch it in action. You can list the monitor template from the LTM configuration with the bigpipe command:

    bigpipe monitor <monitor_name> list
    bigpipe monitor ExternalHTTP list

For our example, the monitor definition looks like this, which reflects the command line arguments and variable values with which we tested above:

    monitor ExternalHTTP {
       defaults from external
       RECV "Server is UP!!!"
       run "HTTPMonitor_cURL_BasicGET"
       URI "/testpath/testfile.html"
    }

Once you've configured the external monitor template with the name of the script, the commandline arguments, and the variables it requires to function, apply it to a single pool member and monitor the results. If the results are as expected, then congratulations! You are the proud parent of a functional external monitor. If the pool member is not marked up or down as expected, then start again by examining a packet trace, and make sure command line testing gives you the exact results you expect.

In the next article, I'll show you a neat trick used by another codeshare sample to more aggressively mark pool members down before the timeout expires.

iRule Editor - System Config Editing
In the latest release of the iRule Editor v 0.10.1, I added several new features. This tutorial will walk through System Level Configuration editing, which allows you to work with your bigip.conf and bigip_base.conf files without having to open a terminal session to the BIG-IP. Usage:

iRules 101 - #12 - The Session Command
One of the things that makes iRules so incredibly powerful is the fact that it is a true scripting language, or at least based on one. The fact that it gives you the tools that TCL brings to the table - regular expressions, string functions, even things as simple as storing, manipulating and recalling variable data - sets iRules apart from the rest of the crowd. It also makes it possible to do some pretty impressive things with connection data, massaging and directing it the way you want.

Other articles in the series:

Getting Started with iRules: Intro to Programming with Tcl | DevCentral
Getting Started with iRules: Control Structures & Operators | DevCentral
Getting Started with iRules: Variables | DevCentral
Getting Started with iRules: Directing Traffic | DevCentral
Getting Started with iRules: Events & Priorities | DevCentral
Intermediate iRules: catch | DevCentral
Intermediate iRules: Data-Groups | DevCentral
Getting Started with iRules: Logging & Comments | DevCentral
Advanced iRules: Regular Expressions | DevCentral
iRules 101 - #12 - The Session Command | DevCentral
Intermediate iRules: Nested Conditionals | DevCentral
Intermediate iRules: Handling Strings | DevCentral
Intermediate iRules: Handling Lists | DevCentral
Advanced iRules: Scan | DevCentral
Advanced iRules: Binary Scan | DevCentral

Sometimes, though, a simple variable won't do. You've likely heard of global variables in one of the earlier 101 series articles and read the warning there, and are looking for another option. So here you are: you have some data you need to store, and it needs to persist across multiple connections. You need it to be efficient and fast, and you don't want to have to do a whole lot of complex management of a data structure. One of the many ways that you can store and access information in your iRule fits all of these requirements perfectly, little known as it may be.
For this scenario I'd recommend the session command. There are three main permutations of the session command that you'll be using when storing and referencing data within the session table. These are:

session add: Stores user's data under the specified key for the specified persistence mode
session lookup: Returns user data previously stored using session add
session delete: Removes user data previously stored using session add

A simple example of adding some information to the session table would look like:

    when CLIENTSSL_CLIENTCERT {
      set ssl_cert [SSL::cert 0]
      session add ssl $ssl_cert 90
    }

By using the session add command, you can manually place a specific piece of data into the LTM's session table. You can then look it up later, by unique key, with the session lookup command and use the data in a different section of your iRule, or in another connection altogether. This can be helpful in situations where data needs to be passed between iRules or events in a way that a simple variable would not allow, such as mining SSL data from the connection events, as below:

    when CLIENTSSL_CLIENTCERT {
      # Set results in the session so they are available to other events
      session add ssl [SSL::sessionid] [list [X509::issuer] [X509::subject] [X509::version]] 180
    }

    when HTTP_REQUEST {
      # Retrieve certificate information from the session
      set sslList [session lookup ssl [SSL::sessionid]]
      # lindex needs the list value, so dereference the variable with $
      set issuer [lindex $sslList 0]
      set subject [lindex $sslList 1]
      set version [lindex $sslList 2]
    }

Because the session table is optimized and designed to handle every connection that comes into the LTM, it's very efficient and can handle quite a large number of items. Also note that, as above, you can pass structured information such as TCL Lists into the session table and they will remain intact.
Keep in mind, though, that there is currently no way to count the number of entries in the table with a certain key, so you'll have to build all of your own processing logic for now, where necessary.

It's also important to note that there is more than one session table. If you look at the above example, you'll see that before we listed any key or data to be stored, we used the command session add ssl. Note the "ssl" portion of this command. This is a reference to which session table the data will be stored in. For our purposes here there are effectively two session tables: ssl, and uie. Be sure you're accessing the same one in your session lookup section as you are in your session add section, or you'll never find the data you're after. This is pretty easy to keep straight, once you see it. It looks like:

    session add uie ...
    session lookup uie

Or:

    session add ssl ...
    session lookup ssl

You can find complete documentation on the session command here, in the iRules, as well as some great examples that depict some more advanced iRules making use of the session command to great success. Check out Codeshare for more examples.

Intermediate iRules: Nested Conditionals
Conditionals are a pretty standard tool in every programmer's toolbox. They are the functions that allow us to decide when we want certain actions to happen, based on, well, conditions that can be determined within our code. This concept is as old as compilers. Chances are, if you're writing code, you're going to be using a slew of these things, even in an event-based language like iRules.

iRules is no different from any other programming/scripting language when it comes to conditionals; we have them. Sure, how they're implemented and what they look like change from language to language, but most of the same basic tools are there: if, else, switch, elseif, etc. Just about any example that you might run across on DevCentral is going to contain some example of these being put to use. Learning which conditional to use in each situation is an integral part of learning how to code effectively.

Once you have that under control, however, there's still plenty more to learn. Now that you're comfortable using a single conditional, what about starting to combine them? There are many times when it makes more sense to use a pair or more of conditionals in place of a single conditional along with logical operators. For example:

    if { [HTTP::host] eq "bob.com" and [HTTP::uri] starts_with "/uri1" } {
      pool pool1
    } elseif { [HTTP::host] eq "bob.com" and [HTTP::uri] starts_with "/uri2" } {
      pool pool2
    } elseif { [HTTP::host] eq "bob.com" and [HTTP::uri] starts_with "/uri3" } {
      pool pool3
    }

This can be re-written to use a pair of conditionals instead, making it far more efficient. To do this, you take the common case shared among the example strings and perform that comparison only once, then perform the other comparisons only if that first comparison succeeds.
This is more easily described as nested conditionals, and it looks like this: if { [HTTP::host] eq "bob.com" } { if {[HTTP::uri] starts_with "/uri1" } { pool pool1 } elseif {[HTTP::uri] starts_with "/uri2" } { pool pool2 } elseif {[HTTP::uri] starts_with "/uri3" } { pool pool3 } } These two examples are logically equivalent, but the latter example is more efficient. This is because the host is compared exactly once per request, and when it is not equal to "bob.com" no other inspection needs to be done. In the first example the host check is repeated in every branch, up to three times per request, even though its answer cannot change between branches. While basic, this concept is important in general when coding. It becomes exponentially more important, as do almost all optimizations, when talking about programming in iRules. A script being executed on a server firing perhaps once per minute benefits from small optimizations. An iRule being executed somewhere on the order of 100,000 times per second benefits that much more. A slightly more interesting example, perhaps, is performing the same logical nesting while using different operators. In this example we'll look at a series of if/elseif statements that are already using nesting, and take a look at how we might use the switch command to optimize things even further. I've seen multiple examples of people shying away from switch when nesting their logic because it looks odd to them or they're not quite sure how it should be structured. Hopefully this will help clear things up. 
First, the example using if statements: when HTTP_REQUEST { if { [HTTP::host] eq "secure.domain.com" } { HTTP::header insert "Client-IP:[IP::client_addr]" pool sslServers } elseif { [HTTP::host] eq "www.domain.com" } { HTTP::header insert "Client-IP:[IP::client_addr]" pool httpServers } elseif { [HTTP::host] ends_with "domain.com" and [HTTP::uri] starts_with "/secure"} { HTTP::header insert "Client-IP:[IP::client_addr]" pool sslServers } elseif {[HTTP::host] ends_with "domain.com" and [HTTP::uri] starts_with "/login"} { HTTP::header insert "Client-IP:[IP::client_addr]" pool httpServers } elseif { [HTTP::host] eq "intranet.myhost.com" } { HTTP::header insert "Client-IP:[IP::client_addr]" pool internal } } As you can see, this is completely functional and would do the job just fine. There are definitely some improvements that can be made, though. Let's try using a switch statement instead of several if comparisons for improved performance. To do that, we're going to have to use an if nested inside a switch comparison. While this might be new to some or look a bit odd if you're not used to it, it's completely valid and oftentimes the most efficient you’re going to get. This is what the above code would look like cleaned up and put into a switch: when HTTP_REQUEST { HTTP::header insert "Client-IP:[IP::client_addr]" switch -glob [HTTP::host] { "secure.domain.com" { pool sslServers } "www.domain.com" { pool httpServers } "*.domain.com" { if { [HTTP::uri] starts_with "/secure" } { pool sslServers } else { pool httpServers } } "intranet.myhost.com" { pool internal } } } As you can see this is not only easier to read and maintain, but it will also prove to be more efficient. We've moved to the more efficient switch structure, we've gotten rid of the repeated host comparisons that were happening above with the /secure vs /login URIs, and while I was at it I got rid of all those examples of inserting a header, since that was happening in every case anyway. One caveat: the rewrite is not strictly identical to the original. The else branch now sends every other URI on a *.domain.com host to httpServers rather than only /login, and the header is inserted even for hosts that match no case, so confirm that both changes are acceptable before adopting it. 
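The same shape (match the host once, then inspect the URI only inside the branch that needs it) can be sketched outside of iRules. The following is a minimal bash analogy, not iRules code: the function name choose_pool is hypothetical, bash's case plays the role of switch -glob, and the pool names are simply printed. This sketch keeps the narrower behavior of the original if/elseif chain, so a *.domain.com host with any other URI selects no pool.

```shell
#!/bin/bash
# Hypothetical helper: prints the pool name chosen for a host and URI.
# "case" with glob patterns stands in for "switch -glob"; the first
# matching pattern wins, so the exact hosts are listed before the wildcard.
choose_pool() {
    local host=$1 uri=$2
    case "$host" in
        secure.domain.com)   echo "sslServers" ;;
        www.domain.com)      echo "httpServers" ;;
        *.domain.com)
            # The host has already been matched once; only the URI is
            # inspected inside this branch.
            if [[ $uri == /secure* ]]; then
                echo "sslServers"
            elif [[ $uri == /login* ]]; then
                echo "httpServers"
            fi
            ;;
        intranet.myhost.com) echo "internal" ;;
    esac
}

choose_pool "shop.domain.com" "/secure/cart"   # prints sslServers
choose_pool "intranet.myhost.com" "/"          # prints internal
```

Listing the exact host names ahead of the wildcard pattern matters in both languages, because the first matching case is the one that runs.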
Hopefully the benefit this technique can offer is clear, and these examples did the topic some justice. With any luck, you'll nest those conditionals with confidence now.
SSL Profiles Part 8: Client Authentication
This is the eighth article in a series of Tech Tips that highlight SSL Profiles on the BIG-IP LTM.
1. SSL Overview and Handshake
2. SSL Certificates
3. Certificate Chain Implementation
4. Cipher Suites
5. SSL Options
6. SSL Renegotiation
7. Server Name Indication
8. Client Authentication
9. Server Authentication
10. All the "Little" Options
This article will discuss the concept of Client Authentication, how it works, and how the BIG-IP system allows you to configure it for your environment. Client Authentication In a TLS handshake, the client and the server exchange several messages that ultimately result in an encrypted channel for secure communication. During this handshake, the client authenticates the server's identity by verifying the server certificate (for more on the TLS handshake, see SSL Overview and Handshake - Article 1 in this series). Although the client always authenticates the server's identity, the server is not required to authenticate the client's identity. However, there are some situations that call for the server to authenticate the client. Client authentication is a feature that lets you authenticate users that are accessing a server. In client authentication, a certificate is passed from the client to the server and is verified by the server. Client authentication gives you assurance that the party presenting the certificate holds the matching private key and is therefore the person you expect. Many companies want to ensure that only authorized users can gain access to the services and content they provide. As more personal and access-controlled information moves online, client authentication becomes more of a reality and a necessity. How Does Client Authentication Work? Before we jump into client authentication, let's make sure we understand server authentication. During the TLS handshake, the client authenticates the identity of the server by verifying the server's certificate and using the server's public key to encrypt data that will be used to compute the shared symmetric key. 
The server can only generate the symmetric key used in the TLS session if it can decrypt that data with its private key. The following diagram shows an abbreviated version of the TLS handshake that highlights some of these concepts. Ultimately, the client and server need to use a symmetric key to encrypt all communication during their TLS session. In order to calculate that key, the server shares its certificate with the client (the certificate includes the server's public key), and the client sends a random string of data to the server (encrypted with the server's public key). Now that the client and server each have the random string of data, they can each calculate (independently) the symmetric key that will be used to encrypt all remaining communication for the duration of that specific TLS session. In fact, the client and server both send a "Finished" message at the end of the handshake, and that message is encrypted with the symmetric key that they have both calculated on their own. So, if all that stuff works and they can both read each other's "Finished" message, then the server has been authenticated by the client and they proceed along with smiles on their collective faces (encrypted smiles, of course). You'll notice in the diagram above that the server sent its certificate to the client, but the client never sent its certificate to the server. When client authentication is used, the server still sends its certificate to the client, but it also sends a "Certificate Request" message to the client. This lets the client know that it needs to get its certificate ready because the next message from the client to the server (during the handshake) will need to include the client certificate. The following diagram shows the added steps needed during the TLS handshake for client authentication. 
So, you can see that when client authentication is enabled, the public and private keys are still used to encrypt and decrypt critical information that leads to the shared symmetric key. In addition to the public and private keys being used for authentication, the client and server both send certificates and each verifies the certificate of the other. This certificate verification is also part of the authentication process for both the client and the server. The certificate verification process includes four important checks. If any of these checks does not return a valid response, the certificate verification fails (which makes the TLS handshake fail) and the session will terminate. These checks are as follows: check the digital signature, check the certificate chain, check the expiration date and validity period, and check the certificate revocation status. Here's how the client and server accomplish each of the checks for client authentication: Digital Signature: The client sends a "Certificate Verify" message that contains a digital signature computed over the preceding handshake messages. This signature is created using the private key that corresponds to the client certificate. The server can validate the message digest of the digital signature by using the client's public key (which is found in the client certificate). Once the digital signature is validated, the server knows that the public key belonging to the client matches the private key used to create the signature. Certificate Chain: The server maintains a list of trusted CAs, and this list determines which certificates the server will accept. The server will use the public key from the CA certificate (which it has in its list of trusted CAs) to validate the CA's digital signature on the certificate being presented. If the message digest has changed or if the public key doesn't correspond to the CA's private key used to sign the certificate, the verification fails and the handshake terminates. 
Expiration Date and Validity Period: The server compares the current date to the validity period listed in the certificate. If the expiration date has not passed and the current date is within the period, everything is good. If it's not, then the verification fails and the handshake terminates. Certificate Revocation Status: The server compares the client certificate to the list of revoked certificates on the system. If the client certificate is on the list, the verification fails and the handshake terminates. As you can see, a bunch of stuff has to happen in just the right way for the Client-Authenticated TLS handshake to finalize correctly. But, all this is in place for your own protection. After all, you want to make sure that no one else can steal your identity and impersonate you on a critically important website! BIG-IP Configuration Now that we've established the foundation for client authentication in a TLS handshake, let's figure out how the BIG-IP is set up to handle this feature. The following screenshot shows the user interface for configuring Client Authentication. To get here, navigate to Local Traffic > Profiles > SSL > Client. The Client Certificate drop down menu has three settings: Ignore (default), Require, and Request. The "Ignore" setting specifies that the system will ignore any certificate presented and will not authenticate the client before establishing the SSL session. This effectively turns off client authentication. The "Require" setting enforces client authentication. When this setting is enabled, the BIG-IP will request a client certificate and attempt to verify it. An SSL session is established only if a valid client certificate from a trusted CA is presented. Finally, the "Request" setting enables optional client authentication. When this setting is enabled, the BIG-IP will request a client certificate and attempt to verify it. 
However, an SSL session will be established regardless of whether or not a valid client certificate from a trusted CA is presented. The Request option is often used in conjunction with iRules in order to provide selective access depending on the certificate that is presented. For example, let's say you would like to allow clients who present a certificate from a trusted CA to gain access to the application, while clients who do not provide the required certificate are redirected to a page detailing the access requirements. If you are not using iRules to enforce a different outcome based on the certificate details, there is no significant benefit to using the "Request" setting versus the default "Ignore" setting. In both cases, an SSL session will be established regardless of the certificate presented. Frequency specifies the frequency of client authentication for an SSL session. This menu offers two options: Once (default) and Always. The "Once" setting specifies that the system will authenticate the client only once for an SSL session. The "Always" setting specifies that the system will authenticate the client once when the SSL session is established as well as each time that session is reused. The Retain Certificate box is checked by default. When checked, the client certificate is retained for the SSL session. Certificate Chain Traversal Depth specifies the maximum number of certificates that can be traversed in a client certificate chain. The default for this setting is 9. Remember that "Certificate Chain" part of the verification checks? This setting is where you configure the depth that you allow the server to dig for a trusted CA. For more on certificate chains, see article 2 of this SSL series. The Trusted Certificate Authorities setting is used to specify the BIG-IP's Trusted Certificate Authorities store. These are the CAs that the BIG-IP trusts when it verifies a client certificate that is presented during client authentication. 
The default value for the Trusted Certificate Authorities setting is None, indicating that no CAs are trusted. Don't forget...if the BIG-IP Client Certificate menu is set to Require but the Trusted Certificate Authorities is set to None, clients will not be able to establish SSL sessions with the virtual server. The drop down list in this setting includes the name of all the SSL certificates installed in the BIG-IP's /config/ssl/ssl.crt directory. A newly-installed BIG-IP system will include the following certificates: default certificate and ca-bundle certificate. The default certificate is a self-signed server certificate used when testing SSL profiles. This certificate is not appropriate for use as a Trusted Certificate Authorities certificate bundle. The ca-bundle certificate is a bundle of CA certificates from most of the well-known PKIs around the world. This certificate may be appropriate for use as a Trusted Certificate Authorities certificate bundle. However, if this bundle is specified as the Trusted Certificate Authorities certificate store, any valid client certificate that is signed by one of the popular Root CAs included in the default ca-bundle.crt will be authenticated. This provides some level of identification, but it provides very little access control since almost any valid client certificate could be authenticated. If you want to trust only certificates signed by a specific CA or set of CAs, you should create and install a bundle containing the certificates of the CAs whose certificates you trust. The bundle must also include the entire chain of CA certificates necessary to establish a chain of trust. Once you create this new certificate bundle, you can select it in the Trusted Certificate Authorities drop down menu. The Advertised Certificate Authorities setting is used to specify the CAs that the BIG-IP advertises as trusted when soliciting a client certificate for client authentication. 
The default value for the Advertised Certificate Authorities setting is None, indicating that no CAs are advertised. When set to None, no list of trusted CAs is sent to a client with the certificate request. If the Client Certificate menu is set to Require or Request, you can configure the Advertised Certificate Authorities setting to send clients a list of CAs that the server is likely to trust. Like the Trusted Certificate Authorities list, the Advertised Certificate Authorities drop down list includes the name of all the SSL certificates installed in the BIG-IP /config/ssl/ssl.crt directory. A newly-installed BIG-IP system includes the following certificates: default certificate and ca-bundle certificate. The default certificate is a self-signed server certificate used for testing SSL profiles. This certificate is not appropriate for use as an Advertised Certificate Authorities certificate bundle. The ca-bundle certificate is a bundle of CA certificates from most of the well-known PKIs around the world. This certificate may be appropriate for use as an Advertised Certificate Authorities certificate bundle. If you want to advertise only a specific CA or set of CAs, you should create and install a bundle containing the certificates of the CA to advertise. Once you create this new certificate bundle, you can select it in the Advertised Certificate Authorities setting drop down menu. You are allowed to configure the Advertised Certificate Authorities setting to send a different list of CAs than that specified for the Trusted Certificate Authorities. This allows greater control over the configuration information shared with unknown clients. You might not want to reveal the entire list of trusted CAs to a client that does not automatically present a valid client certificate from a trusted CA. Finally, you should avoid specifying a bundle that contains a large number of certificates when you configure the Advertised Certificate Authorities setting. 
This will cut down on the number of certificates exchanged during a client SSL handshake. The maximum size allowed by the BIG-IP for native SSL handshake messages is 14,304 bytes. Most handshakes don't result in large message lengths, but if the SSL handshake is negotiating a native cipher and the total length of all messages in the handshake exceeds the 14,304 byte threshold, the handshake will fail. The Certificate Revocation List (CRL) setting allows you to specify a CRL that the BIG-IP will use to check revocation status of a certificate prior to authenticating a client. If you want to use a CRL, you must upload it to the /config/ssl/ssl.crl directory on the BIG-IP. The name of the CRL file may then be entered in the CRL setting dialog box. Note that this box will offer no drop down menu options until you upload a CRL file to the BIG-IP. Since CRLs can quickly become outdated, you should use either OCSP or CRLDP profiles for more robust and current verification functionality. Conclusion Well, that wraps up our discussion on Client Authentication. I hope the information helped, and I hope you can use this to configure your BIG-IP to meet the needs of your specific network environment. Be sure to come back for our next article in the SSL series. As always, if you have any other questions, feel free to post a question here or Contact Us directly. See you next time!
Controlling a Pool Members Ratio and Priority Group with iControl
A Little Background A question came in through the iControl forums about controlling a pool member's ratio and priority programmatically. The issue really involves how the APIs use multi-dimensional arrays, but I thought it would be a good opportunity to talk about ratio and priority groups for those that don’t understand how they work. In the first part of this article, I’ll talk a little about what pool members are and how their ratios and priorities affect how traffic is assigned to them in a load balancing setup. The details in this article were based on BIG-IP version 11.1, but the concepts apply to earlier versions as well. Load Balancing In its most basic form, a load balancing setup involves a virtual IP address (referred to as a VIP) that virtualizes a set of backend servers. The idea is that if your application gets very popular, you don’t want to have to rely on a single server to handle the traffic. A VIP contains an object called a “pool”, which is essentially a collection of servers that it can distribute traffic to. The method of distributing traffic is referred to as a “Load Balancing Method”. You may have heard the term “Round Robin” before. In this method, connections are passed one at a time from server to server. In most cases, though, this is not the best method due to characteristics of the application you are serving. Here is a list of the available load balancing methods in BIG-IP version 11.1. Load Balancing Methods in BIG-IP version 11.1 Round Robin: Specifies that the system passes each new connection request to the next server in line, eventually distributing connections evenly across the array of machines being load balanced. This method works well in most configurations, especially if the equipment that you are load balancing is roughly equal in processing speed and memory. 
Ratio (member): Specifies that the number of connections that each machine receives over time is proportionate to a ratio weight you define for each machine within the pool. Least Connections (member): Specifies that the system passes a new connection to the node that has the least number of current connections in the pool. This method works best in environments where the servers or other equipment you are load balancing have similar capabilities. This is a dynamic load balancing method, distributing connections based on various aspects of real-time server performance analysis, such as the current number of connections per node or the fastest node response time. Observed (member): Specifies that the system ranks nodes based on the number of connections. Nodes that have a better balance of fewest connections receive a greater proportion of the connections. This method differs from Least Connections (member), in that the Least Connections method measures connections only at the moment of load balancing, while the Observed method tracks the number of Layer 4 connections to each node over time and creates a ratio for load balancing. This dynamic load balancing method works well in any environment, but may be particularly useful in environments where node performance varies significantly. Predictive (member): Uses the ranking method used by the Observed (member) methods, except that the system analyzes the trend of the ranking over time, determining whether a node's performance is improving or declining. The nodes in the pool with better performance rankings that are currently improving, rather than declining, receive a higher proportion of the connections. This dynamic load balancing method works well in any environment. Ratio (node): Specifies that the number of connections that each machine receives over time is proportionate to a ratio weight you define for each machine across all pools of which the server is a member. 
Least Connections (node): Specifies that the system passes a new connection to the node that has the least number of current connections out of all pools of which a node is a member. This method works best in environments where the servers or other equipment you are load balancing have similar capabilities. This is a dynamic load balancing method, distributing connections based on various aspects of real-time server performance analysis, such as the number of current connections per node, or the fastest node response time. Fastest (node): Specifies that the system passes a new connection based on the fastest response of all pools of which a server is a member. This method might be particularly useful in environments where nodes are distributed across different logical networks. Observed (node): Specifies that the system ranks nodes based on the number of connections. Nodes that have a better balance of fewest connections receive a greater proportion of the connections. This method differs from Least Connections (node), in that the Least Connections method measures connections only at the moment of load balancing, while the Observed method tracks the number of Layer 4 connections to each node over time and creates a ratio for load balancing. This dynamic load balancing method works well in any environment, but may be particularly useful in environments where node performance varies significantly. Predictive (node): Uses the ranking method used by the Observed (member) methods, except that the system analyzes the trend of the ranking over time, determining whether a node's performance is improving or declining. The nodes in the pool with better performance rankings that are currently improving, rather than declining, receive a higher proportion of the connections. This dynamic load balancing method works well in any environment. 
Dynamic Ratio (node): This method is similar to Ratio (node) mode, except that weights are based on continuous monitoring of the servers and are therefore continually changing. This is a dynamic load balancing method, distributing connections based on various aspects of real-time server performance analysis, such as the number of current connections per node or the fastest node response time. Fastest (application): Passes a new connection based on the fastest response of all currently active nodes in a pool. This method might be particularly useful in environments where nodes are distributed across different logical networks. Least Sessions: Specifies that the system passes a new connection to the node that has the least number of current sessions. This method works best in environments where the servers or other equipment you are load balancing have similar capabilities. This is a dynamic load balancing method, distributing connections based on various aspects of real-time server performance analysis, such as the number of current sessions. Dynamic Ratio (member): This method is similar to Ratio (node) mode, except that weights are based on continuous monitoring of the servers and are therefore continually changing. This is a dynamic load balancing method, distributing connections based on various aspects of real-time server performance analysis, such as the number of current connections per node or the fastest node response time. L3 Address: This method functions in the same way as the Least Connections methods. We are deprecating it, so you should not use it. Weighted Least Connections (member): Specifies that the system uses the value you specify in Connection Limit to establish a proportional algorithm for each pool member. The system bases the load balancing decision on that proportion and the number of current connections to that pool member. For example, member_a has 20 connections and its connection limit is 100, so it is at 20% of capacity. 
Similarly, member_b has 20 connections and its connection limit is 200, so it is at 10% of capacity. In this case, the system selects member_b. This algorithm requires all pool members to have a non-zero connection limit specified. Weighted Least Connections (node): Specifies that the system uses the value you specify in the node's Connection Limit and the number of current connections to a node to establish a proportional algorithm. This algorithm requires all nodes used by pool members to have a non-zero connection limit specified. Ratios The ratio is used by the ratio-related load balancing methods to load balance connections. The ratio specifies the ratio weight to assign to the pool member. Valid values range from 1 through 100. The default is 1, which means that each pool member has an equal ratio proportion. So, if you have server1 with a ratio value of “10” and server2 with a ratio value of “1”, server1 will get served 10 connections for every one that server2 receives. This can be useful when you have different classes of servers with different performance capabilities. Priority Group The priority group is a number that groups pool members together. The default is 0, meaning that the member has no priority. To specify a priority, you must activate priority group usage when you create a new pool or when adding or removing pool members. When activated, the system load balances traffic according to the priority group number assigned to the pool member. The higher the number, the higher the priority, so a member with a priority of 3 has higher priority than a member with a priority of 1. The easiest way to think of priority groups is as if you are creating mini-pools of servers within a single pool. You put members A, B, and C into priority group 5 and members D, E, and F into priority group 1. Members A, B, and C will be served traffic according to their ratios (assuming you have ratio load balancing configured). 
If all those servers have reached their thresholds, then traffic will be distributed to servers D, E, and F in priority group 1. The default setting for priority group activation is Disabled. Once you enable this setting, you can specify pool member priority when you create a new pool or on a pool member's properties screen. The system treats same-priority pool members as a group. To enable priority group activation in the admin GUI, select Less than from the list, and in the Available Member(s) box, type a number from 0 to 65535 that represents the minimum number of members that must be available in one priority group before the system directs traffic to members in a lower priority group. When a sufficient number of members become available in the higher priority group, the system again directs traffic to the higher priority group. Implementing in Code The two methods to retrieve the priority and ratio values are very similar. They both take two parameters: a list of pools to query, and a 2-D array of members (one list of members for each pool passed in). long [] [] get_member_priority( in String [] pool_names, in Common::AddressPort [] [] members ); long [] [] get_member_ratio( in String [] pool_names, in Common::AddressPort [] [] members ); The following PowerShell function (utilizing the iControl PowerShell Library) takes as input a pool and a single member. It then makes a call to query the ratio and priority for the specific member and writes them to the console. 
function Get-PoolMemberDetails() { param( $Pool = $null, $Member = $null ); $AddrPort = Parse-AddressPort $Member; $RatioAofA = (Get-F5.iControl).LocalLBPool.get_member_ratio( @($Pool), @( @($AddrPort) ) ); $PriorityAofA = (Get-F5.iControl).LocalLBPool.get_member_priority( @($Pool), @( @($AddrPort) ) ); $ratio = $RatioAofA[0][0]; $priority = $PriorityAofA[0][0]; "Pool '$Pool' member '$Member' ratio '$ratio' priority '$priority'"; } The set_member_priority and set_member_ratio methods take the same first two parameters as their associated get_* methods, but add a third parameter for the priorities and ratios of the pool members. set_member_priority( in String [] pool_names, in Common::AddressPort [] [] members, in long [] [] priorities ); set_member_ratio( in String [] pool_names, in Common::AddressPort [] [] members, in long [] [] ratios ); The following PowerShell function takes as input the Pool and Member with optional values for the Ratio and Priority. If either of those is set, the function will call the appropriate iControl method to set its value. function Set-PoolMemberDetails() { param( $Pool = $null, $Member = $null, $Ratio = $null, $Priority = $null ); $AddrPort = Parse-AddressPort $Member; if ( $null -ne $Ratio ) { (Get-F5.iControl).LocalLBPool.set_member_ratio( @($Pool), @( @($AddrPort) ), @($Ratio) ); } if ( $null -ne $Priority ) { (Get-F5.iControl).LocalLBPool.set_member_priority( @($Pool), @( @($AddrPort) ), @($Priority) ); } } In case you were wondering how to create the Common::AddressPort structure for the $AddrPort variables in the above examples, here’s a helper function I wrote to allocate the object and fill in its properties. 
function Parse-AddressPort() { param($Value); $tokens = $Value.Split(":"); $r = New-Object iControl.CommonAddressPort; $r.address = $tokens[0]; $r.port = $tokens[1]; $r; } Download The Source The full source for this example can be found in the iControl CodeShare under PowerShell PoolMember Ratio and Priority.
A Brief Introduction To External Application Verification Monitors
Background EAV (External Application Verification) monitors are one of the most useful and extensible features of the BIG-IP product line. They give the end user the ability to utilize the underlying Linux operating system to perform complex and thorough service checks. Given a service that does not have a monitor provided, a lot of users will assign the closest related monitor and consider the solution complete. There are more than a few cases where a TCP or UDP monitor will mark a service “up” even while the service is unresponsive. EAVs give us the ability to dive much deeper than merely performing a 3-way handshake and neglecting the other layers of the application or service. How EAVs Work An EAV monitor is an executable script located on the BIG-IP’s file system (usually under /usr/bin/monitors) that is executed at regular intervals by the bigd daemon and reports its status. One of the most common misconceptions (especially amongst those with *nix backgrounds) is that the exit status of the script dictates the fate of the pool member. The exit status has nothing to do with how bigd interprets the pool member’s health. Any output to stdout (standard output) from the script will mark the pool member “up”. This is a nuance that should receive special attention when architecting your next EAV. Analyze each line of your script and make sure nothing will inadvertently get directed to stdout during monitor execution. The most common example is when someone writes a script that echoes “up” when the checks execute correctly and “down” when they fail. The BIG-IP will mark the pool member up in both cases, rendering the monitor useless. Bigd automatically provides two arguments to the EAV’s script upon execution: node IP address and node port number. The node IP address is provided with an IPv4-mapped IPv6 prefix (“::ffff:”) that may need to be removed in order for the script to function correctly. 
You’ll notice we remove the “::ffff:” prefix with a sed substitution in the example below. Other arguments can be provided to the script when configured in the UI (or command line). The user-provided arguments are available as $3, $4, and so on. Without further ado, let’s take a look at a service-specific monitor that gives us a more complete view of the application’s health.

An Example

On more than one occasion I have seen a DNS pool member pass the TCP monitor while the DNS service itself was unresponsive. As a result, a more invasive inspection is required to make sure that the DNS service is in fact serving valid responses. Let’s take a look at an example:

    #!/bin/bash

    # $1 = node IP
    # $2 = node port
    # $3 = hostname to resolve

    [[ $# != 3 ]] && logger -p local0.error -t "${0##*/}" -- "usage: ${0##*/} <node IP> <node port> <hostname to resolve>" && exit 1

    node_ip=$(echo "$1" | sed 's/::ffff://')

    dig +short @"$node_ip" "$3" IN A &> /dev/null

    [[ $? == 0 ]] && echo "UP"

We are using the dig (Domain Information Groper) command to query our DNS server for an A record. We use the exit status from dig to determine whether the monitor will pass. Notice how the script never outputs anything to stdout other than “UP” in the case of success. If there aren’t enough arguments for the script to proceed, we log the usage to /var/log/ltm and exit. This is a very simple 13-line script, but an effective example.
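The two rules above (strip the address prefix, and keep stdout silent unless the check passes) can be sketched as a pair of reusable bash helpers. This is a minimal sketch rather than anything shipped with BIG-IP: the function names strip_v4_mapped_prefix and run_check are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Sketch only: both helper names below are hypothetical, not part of BIG-IP.

# Remove the "::ffff:" prefix that bigd prepends to the node address.
# Addresses without the prefix are returned unchanged.
strip_v4_mapped_prefix() {
    local addr="$1"
    printf '%s\n' "${addr#::ffff:}"
}

# Run the given command with its stdout and stderr discarded, then print a
# single token only if it succeeded. Because bigd treats any stdout output as
# "up", a failing check must print nothing at all.
run_check() {
    if "$@" > /dev/null 2>&1; then
        echo "UP"
    fi
}
```

Inside a monitor script, the DNS check from the example could then be written as `run_check dig +short @"$(strip_v4_mapped_prefix "$1")" "$3" IN A`, which keeps every byte of dig's own output away from stdout.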
The Takeaways

The command should be as lightweight and efficient as possible.
If the same result can be accomplished with a built-in monitor, use it.
EAV monitors don’t rely on the command’s exit status, only standard output.
Send all error and informational messages to logger instead of stdout or stderr (standard error).
“UP” has no significance; it is just a series of characters sent to stdout. The monitor would still pass if the script echoed “DOWN”.

Conclusion

When I first discovered EAV monitors, they opened up a whole realm of possibilities that I could not accomplish with built-in monitors. They give you the ability to do more thorough checking as well as place logic in your monitors. While my example was a simple bash script, BIG-IP also ships with Perl and Python along with their standard libraries, which offer endless possibilities. In addition to using the built-in commands and libraries, it would be just as easy to write a monitor in a compiled language (C, C++, or whatever your flavor may be) and statically compile it before uploading it to the BIG-IP. If you are new to EAVs, I hope this gives you the tools to make your environments more robust and resilient. If you’re more of a seasoned veteran, we’ll have more fun examples in the near future.

Investigating the LTM TCP Profile: The Finish Line
Introduction

The LTM TCP profile has over thirty settings that can be manipulated to enhance the experience between client and server. Because the TCP profile is applied to the virtual server, the flexibility exists to customize the stack (in both client & server directions) for every application delivered by the LTM. In this series, we will dive into several of the configurable options and discuss the pros and cons of their inclusion in delivering applications.

Nagle's Algorithm
Max Syn Retransmissions & Idle Timeout
Windows & Buffers
Timers
QoS
Slow Start
Congestion Control Algorithms
Acknowledgements
Extended Congestion Notification & Limited Transmit Recovery
The Finish Line

Quick aside for those unfamiliar with TCP: the transmission control protocol (layer 4) rides on top of the internet protocol (layer 3) and is responsible for establishing connections between clients and servers so data can be exchanged reliably between them. Normal TCP communication consists of a client and a server, a 3-way handshake, reliable data exchange, and a four-way close. With the LTM as an intermediary in the client/server architecture, the session setup/teardown is duplicated, with the LTM playing the role of server to the client and client to the server. These sessions are completely independent, even though the LTM can duplicate the TCP source port over to the server-side connection in most cases, and depending on your underlying network architecture, can also duplicate the source IP.

Deferred Accept

Disabled by default, this option defers the allocation of resources to the connection until payload is received from the client. It is useful in dealing with three-way handshake DoS attacks, and delays the allocation of server-side resources until necessary, but delaying the accept could impact the latency of the server responses, especially if OneConnect is disabled.
Bandwidth Delay

This setting, enabled by default, specifies that the TCP stack tries to calculate the optimal bandwidth based on round-trip time and historical throughput. The resulting bandwidth-delay product then helps determine the optimal congestion window without first exceeding the available bandwidth.

Proxy MSS & Options

These settings signal the LTM to use, on the server side of the connection, only the MSS and options negotiated with the client. Both are disabled by default. Enabling them prevents the LTM from isolating poor TCP performance on one side of the connection and from offloading the client or server. Use cases for these options are rare, so enable them sparingly. Examples: troubleshooting performance problems isolated to the server, or a special case that requires negotiating TCP options end to end.

Appropriate Byte Counting

Defined in RFC 3465, this option bases the increase of the congestion window on the number of previously unacknowledged bytes that each ACK covers. This option is enabled by default, and it is recommended that it remain enabled. Advantages: it increases the congestion window more appropriately, mitigates the impact of delayed and lost acknowledgements, and prevents attacks from misbehaving receivers. Disadvantages include an increase in burstiness and a small increase in the overall loss rate (directly related to the increased aggressiveness).

Congestion Metrics Cache

This option is enabled by default and signals the LTM to use route metrics to the peer for initializing the congestion window. This improves the initial slow-start ramp for previously encountered peers, as the congestion information is already known and cached. If the majority of the client base is sourced from rapidly changing and unstable routing infrastructures, disabling this option ensures that the LTM will not use stale information that leads to wrong behavior on the initial connection.
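To make the bandwidth-delay product from the Bandwidth Delay section concrete, here is the textbook arithmetic as a small bash sketch. It is not the LTM's internal calculation, which the article describes only as being based on round-trip time and historical throughput. The function names bdp_bytes and bdp_segments and the sample numbers are assumptions for illustration.

```shell
#!/bin/bash
# Sketch only: textbook bandwidth-delay arithmetic with hypothetical helper
# names. This does not reproduce how the LTM computes its estimate.

# bdp_bytes BANDWIDTH_BITS_PER_SEC RTT_MS
# Bytes that can be in flight on the path: bandwidth * round-trip time / 8.
bdp_bytes() {
    local bw_bps="$1" rtt_ms="$2"
    echo $(( bw_bps * rtt_ms / 1000 / 8 ))
}

# bdp_segments BANDWIDTH_BITS_PER_SEC RTT_MS MSS_BYTES
# The same quantity expressed in full-size segments, rounded up.
bdp_segments() {
    local bytes mss="$3"
    bytes=$(bdp_bytes "$1" "$2")
    echo $(( (bytes + mss - 1) / mss ))
}
```

For a 100 Mbit/s path with a 40 ms round trip, `bdp_bytes 100000000 40` prints 500000, and `bdp_segments 100000000 40 1460` prints 343. A congestion window near that size keeps the path full without overrunning it.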
Conclusion

This concludes our trip through the TCP profile. I hope you've enjoyed the ride. I'd like to thank the developers, UnRuleY in particular, for their help along the way.

Update: This series is more than a decade old. It is still relevant, but Martin Duke has also written a series of articles on the TCP profile, with updates and considerations you should read up on.

Investigating the LTM TCP Profile: ECN & LTR
Introduction

The LTM TCP profile has over thirty settings that can be manipulated to enhance the experience between client and server. Because the TCP profile is applied to the virtual server, the flexibility exists to customize the stack (in both client & server directions) for every application delivered by the LTM. In this series, we will dive into several of the configurable options and discuss the pros and cons of their inclusion in delivering applications.

Nagle's Algorithm
Max Syn Retransmissions & Idle Timeout
Windows & Buffers
Timers
QoS
Slow Start
Congestion Control Algorithms
Acknowledgements
Extended Congestion Notification & Limited Transmit Recovery
The Finish Line

Quick aside for those unfamiliar with TCP: the transmission control protocol (layer 4) rides on top of the internet protocol (layer 3) and is responsible for establishing connections between clients and servers so data can be exchanged reliably between them. Normal TCP communication consists of a client and a server, a 3-way handshake, reliable data exchange, and a four-way close. With the LTM as an intermediary in the client/server architecture, the session setup/teardown is duplicated, with the LTM playing the role of server to the client and client to the server. These sessions are completely independent, even though the LTM can duplicate the TCP source port over to the server-side connection in most cases, and depending on your underlying network architecture, can also duplicate the source IP.

Extended Congestion Notification

The extended congestion notification (ECN) option in the TCP profile is disabled by default. (RFC 3168 expands ECN as Explicit Congestion Notification.) ECN is another TCP option that must be negotiated between peers at connection setup. Support is not widely adopted yet, and effective use of this feature relies heavily on the underlying infrastructure's handling of the ECN bits, as routers must participate in the process.
If you recall from the QoS tech tip, the IP TOS field has 8 bits: the first six for DSCP and the final two for ECN.

DSCP ECN Codepoints

DSCP          ECN   Comments
X X X X X X   0 0   Not-ECT
X X X X X X   0 1   ECT(1) ECN-capable
X X X X X X   1 0   ECT(0) ECN-capable
X X X X X X   1 1   CE Congestion Experienced

Routers implementing ECN RED (random early detection) will mark ECN-capable packets and drop Not-ECT packets (only under congestion and only according to the policies configured on the router). If ECN is enabled, the presence of the ECE (ECN-Echo) bit will trigger the TCP stack to halve its congestion window and reduce the slow start threshold (cwnd and ssthresh, respectively...remember these?) just as if the packet had been dropped. The benefits of enabling ECN are reducing or avoiding drops where they normally would occur and reducing packet delay due to shorter queues. Another benefit is that the TCP peers can distinguish between transmission loss and congestion signals. However, because of the tightly integrated relationship between routers and TCP peers, unless you control the infrastructure or have agreements in place as to its expected behavior, I wouldn't recommend enabling this feature, as there are several ways to subvert ECN (you can read up on it in RFC 3168).

Limited Transmit Recovery

Defined in RFC 3042, Limited Transmit Recovery allows the sender to transmit new data on receipt of each of the first two duplicate acknowledgements, provided the peer's receive window allows for it and outstanding data would remain within the congestion window plus two segments. Remember that with fast retransmit, a retransmit occurs after the third duplicate acknowledgement or after a timeout. The congestion window is not updated when LTR triggers the transmission of new data. Note also that if utilized with selective acknowledgements, LTR must not transmit unless the ACK contains new SACK information.
In the event of a congestion window of three segments and one is lost, fast retransmit would never trigger since three duplicate ACKs couldn't be received. This would result in a timeout, which could be a penalty of at least one second. Utilizing LTR can significantly reduce the number of timeout-based retransmissions. This option is enabled by default in the profile.
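The limited transmit conditions described above can be sketched as a simple predicate. This is an illustration of the rule in RFC 3042, not the LTM's implementation: the function name ltr_may_send and its argument order are hypothetical, the SACK condition is left out, and the byte counts in the usage note assume a 1460-byte MSS.

```shell
#!/bin/bash
# Sketch only: ltr_may_send is a hypothetical helper modelling the RFC 3042
# conditions. It ignores the SACK rule and is not the LTM's code.

# ltr_may_send DUPACKS FLIGHT_BYTES CWND_BYTES RWND_BYTES MSS_BYTES
# Exit status 0 if one new segment may be sent, 1 otherwise.
ltr_may_send() {
    local dupacks="$1" flight="$2" cwnd="$3" rwnd="$4" mss="$5"

    # Only the first and second duplicate ACKs qualify. The third duplicate
    # ACK is handled by fast retransmit instead.
    (( dupacks == 1 || dupacks == 2 )) || return 1

    # The peer's receive window must have room for one more segment.
    (( flight + mss <= rwnd )) || return 1

    # Outstanding data must stay within cwnd plus two segments.
    (( flight + mss <= cwnd + 2 * mss )) || return 1

    return 0
}
```

With the three-segment window from the example (cwnd and data in flight both 4380 bytes), `ltr_may_send 1 4380 4380 65535 1460` succeeds, and so does the call for the second duplicate ACK with 5840 bytes in flight. The ACKs for those two new segments are what let the sender reach a third duplicate ACK and fast retransmit instead of waiting for the timeout.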