performance
OneConnect not keeping connection open on HTTP 204 No Content
We have an application that returns an 'HTTP 204 No Content' response on 99% of all requests. These connections are being kept open and reused on the client side of the F5. The problem is that the load balancer closes these connections on the server side right after the HTTP 204 response is received from the server. When we send an HTTP 200, the connection is kept open and reused (normal OneConnect operation). Is there an iRule that we can apply to the VIP to keep the connection open even when the server returns an 'HTTP 204 No Content'? Thanks
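One workaround sometimes suggested for this class of problem is to make the 204 response look explicitly complete and reusable to the proxy. The sketch below is untested and hypothetical; whether OneConnect will actually reuse the server-side connection after a 204 depends on your BIG-IP version, so validate it in a lab first:

    when HTTP_RESPONSE {
        # A 204 carries no body; give it an explicit zero length and a
        # keep-alive hint so the response can be treated as complete.
        # Untested workaround sketch, not a confirmed fix.
        if { [HTTP::status] == 204 } {
            HTTP::header replace Connection "Keep-Alive"
            if { ![HTTP::header exists "Content-Length"] } {
                HTTP::header insert "Content-Length" "0"
            }
        }
    }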
Improve BIG-IP APM VPN speed with TLS dynamic record size

After successfully setting up BIG-IP APM network access and running it for some time, you may be looking for ways to optimize VPN speed for your users. This article discusses one way you can do that.

Feature Description
Beginning in BIG-IP 12.1.0, the Client SSL profile includes a feature that enables dynamic record size in TLS. When applied to an F5 BIG-IP Access Policy Manager (APM) network access VPN TLS virtual server, this can improve VPN speeds for your users. It has been found that certain protocols, notably HTTP, show better client response times using this method. For more information on the Allow Dynamic Record Sizing setting, down to the packet level, refer to the following resources:
The About dynamic record sizing section of the BIG-IP System: SSL Administration manual.
Boosting TLS Performance with Dynamic Record Sizing on BIG-IP on DevCentral.
SSL Profiles Part 11: TLS Optimization on DevCentral.
Important: Dynamic record size is a TLS enhancement and does not apply to BIG-IP APM network access DTLS virtual servers. Do not enable dynamic record size on DTLS.
When you want to optimize network performance, you must allocate time to tune each configuration to match the requirements specific to your environment. Additionally, note that configuration changes that improve performance may increase BIG-IP system resource (CPU, memory) usage.

Testing dynamic record size on VPN speeds
Having discussed the theory behind the feature, we will now perform tests to see how it affects VPN speeds. Network bandwidth can vary depending on many factors, for instance, peak vs non-peak hours. When more users are connected to a VPN, download speeds can decrease significantly. It is therefore important to establish a baseline network bandwidth and download speed at the beginning.

Baseline AWS environment:
Windows client (Seattle) --VPN--> BIG-IP APM (Oregon) --local LAN--> Apache and iperf servers
BIG-IP APM 17.1.0 VE on AWS (F5 BIG-IP VE - ALL modules, m5.xlarge, 1 Gbps, AWS) located in us-west-2 Oregon. Note: Ensure you use at least the recommended size (m5.xlarge) and at least 1 Gbps on AWS to make sure there are no bandwidth and resource limits.
Windows client is located in Seattle.
Using iperf3 to measure network bandwidth.
Using curl to download a 377MB apmclient.iso.
Optional: You can also test using the developer tools in your browser. I used Firefox; as the results did not differ significantly from curl, they are not included in this article.
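Before looking at test results, note that the setting itself can be enabled on a Client SSL profile from tmsh. A minimal sketch, assuming an illustrative custom profile name (the option requires BIG-IP 12.1.0 or later; verify the attribute name on your version):

    tmsh create ltm profile client-ssl clientssl_dynrec defaults-from clientssl allow-dynamic-record-sizing enabled
    tmsh list ltm profile client-ssl clientssl_dynrec allow-dynamic-record-sizing

The new profile would then replace the existing Client SSL profile on the VPN TLS virtual server.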
Baseline test results
These are measured with all default settings on BIG-IP APM and dynamic record sizing not enabled:

curl download results. Average download speed: 4950k

C:\Windows\system32>curl -k -o null https://10.0.128.23/apmclient.iso
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  377M  100  377M    0     0  4950k      0  0:01:18  0:01:18 --:--:-- 4988k

iperf3 results. Network bandwidth: 4873 KBytes/sec

c:\Users\klau\Desktop\iperf-3.1.3-win64>iperf3.exe -c 10.0.128.24 --get-server-output -i 1 -f K -R
Connecting to host 10.0.128.24, port 5201
Reverse mode, remote host 10.0.128.24 is sending
[  4] local 10.0.128.31 port 61284 connected to 10.0.128.24 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  4.33 MBytes  4434 KBytes/sec
[  4]   1.00-2.00   sec  4.67 MBytes  4785 KBytes/sec
[  4]   2.00-3.00   sec  4.86 MBytes  4977 KBytes/sec
[  4]   3.00-4.00   sec  4.77 MBytes  4878 KBytes/sec
[  4]   4.00-5.00   sec  4.72 MBytes  4834 KBytes/sec
[  4]   5.00-6.00   sec  4.78 MBytes  4898 KBytes/sec
[  4]   6.00-7.00   sec  4.87 MBytes  4989 KBytes/sec
[  4]   7.00-8.00   sec  4.81 MBytes  4925 KBytes/sec
[  4]   8.00-9.00   sec  4.71 MBytes  4823 KBytes/sec
[  4]   9.00-10.00  sec  4.82 MBytes  4934 KBytes/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  48.0 MBytes  4919 KBytes/sec    9   sender
[  4]   0.00-10.00  sec  47.6 MBytes  4873 KBytes/sec        receiver
Server output:
[...]
[  5]   0.00-10.04  sec  48.0 MBytes  4900 KBytes/sec    9   sender

Test 1: Enabling dynamic record size from baseline
Comparing with baseline results after enabling dynamic record size:

                                      Baseline (disabled)   Enabled   Improvement
curl average download, k              4950                  5272      6.51%
iperf3 network bandwidth, KBytes/sec  4873                  5138      5.44%

While the improvement may not appear large in the AWS cloud, customers have reported greater improvements in their own environments, especially as end-to-end latencies increase.

Implementation strategy and recommendations
As you plan to introduce this in your environment, take note of the following recommendations:
Every environment is unique. Many factors can affect network performance, ranging from VLAN settings (for example, MTU) to TCP settings, intermediate network device throttling, and so on. You must perform testing in your own environment before enabling the feature.
Implement the feature incrementally for a selected group of users. There are different ways to do this. For example, use an iRule to redirect users, based on a URL, to a separate virtual server using a different Client SSL profile that has the feature enabled (see the sketch after this section). Refer to SSL::allow_dynamic_record_sizing on Clouddocs.
Monitor BIG-IP system logs and resource usage. After you enable dynamic record size, make sure that your BIG-IP system continues to function as expected by monitoring the /var/log/ltm and /var/log/apm log files, as well as BIG-IP CPU and memory usage. For example, you can select Dashboard in the Configuration utility, or generate a QKview and analyze it in iHealth. For more information, refer to K71764661: Understanding BIG-IP CPU usage and K16419: Overview of BIG-IP memory usage.
Verify and analyze SSL statistics. Use the tmsh command in K41057430: Enhanced SSL profile statistics and check for failures. The SSL Dynamic Record Sizes section should also indicate use of large record sizes.
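For the incremental rollout mentioned above, a minimal sketch of the iRule approach follows. The URI trigger and the virtual server name are assumptions; the target virtual server would carry the Client SSL profile with dynamic record sizing enabled:

    when HTTP_REQUEST {
        # Steer opted-in pilot users to a second virtual server whose
        # Client SSL profile has Allow Dynamic Record Sizing enabled.
        if { [HTTP::uri] starts_with "/pilot" } {
            virtual /Common/vs_vpn_dynrec
        }
    }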
Conclusion
There are a variety of ways to improve VPN speeds, and this article describes just one. For other options and considerations, refer to K31143831: VPN for business continuity | Chapter 5: Optimizing Network Access VPN.
Tuning the TCP Profile, Part One

A few months ago I pointed out some problems with the existing F5-provided TCP profiles, especially the default one. Today I'll begin a pass through the (long) TCP profile to point out the latest thinking on how to get the most performance for your applications. We'll go in the order you see these profile options in the GUI.
But first, a note about programmability: in many cases below, I'm going to ask you to generalize about the clients or servers you interact with, and the nature of the paths to those hosts. In a perfect world, we'd detect that stuff automatically and set it for you, and in fact we're rolling that out setting by setting. In the meantime, you can customize your TCP parameters on a per-connection basis using iRules for many of the settings described below, something I'll explain further where applicable.
In general, when I refer to "performance" below, I'm referring to the speed at which your customer gets her data. Performance can also refer to the scalability of your application delivery due to CPU and memory limitations, and when that's what I mean, I'll say so.

Timer Management
The one here with a big performance impact is Minimum RTO. When TCP computes its Retransmission Timeout (RTO), it takes the average measured Round Trip Time (RTT) and adds a few standard deviations to make sure it doesn't falsely detect loss. (False detections have very negative performance implications.) But if RTT is low and stable, that RTO may be too low, and the minimum is designed to catch known fluctuations in RTT that the connection may not have observed. Set Minimum RTO too low, and TCP may improperly enter congestion response and reduce the sending rate all the way down to one packet per round trip. Set it too high, and TCP sits idle when it ought to retransmit lost data.
So what's the right value? Obviously, if you have a sense of the maximum RTT to your clients (which you can get with the ping command), that's a floor for your value. Furthermore, many clients and servers will implement some sort of Delayed ACK, which reduces ACK volume by sometimes holding them back for up to 200ms to see if it can aggregate more data in the ACK. RFC 5681 actually allows delays of up to 500ms, but this is less common. So take the maximum RTT and add 200 to 500 ms.
Another group of settings isn't really about throughput, but helps clients and servers close gracefully, at the cost of consuming some system resources. Long Close Wait, Fin Wait 1, Fin Wait 2, and Time Wait timers will keep connection state alive to make sure the remote host got all the connection close messages. Enabling Reset On Timeout sends a message that tells the peer to tear down the connection. Similarly, disabling Time Wait Recycle will prevent new connections from using the same address/port combination, making sure that the old connection with that combination gets a full close.
The last group of settings keeps possibly dead connections alive, using system resources to maintain state in case they come back to life. Idle Timeout and Zero Window Timeout commit resources until the timer expires. If you set Keep Alive Interval to a value less than the Idle Timeout, then on the clientside BIG-IP will keep the connection alive as long as the client keeps responding to keepalives and the server doesn't terminate the connection itself. In theory, this could be forever!
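As a rough illustration of the timer guidance above, here is one way the settings might be collected into a custom profile in tmsh. The profile name and values are illustrative (a 200 ms maximum RTT plus a 300 ms delayed-ACK allowance for Minimum RTO), and the attribute names and units should be verified against your TMOS version:

    tmsh create ltm profile tcp tcp_tuned defaults-from tcp \
        minimum-rto 500 \
        reset-on-timeout enabled \
        time-wait-recycle disabled \
        keep-alive-interval 75 \
        idle-timeout 300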
Memory Management
In terms of high throughput performance, you want all of these settings to be as large as possible up to a point. The tradeoff is that setting them too high may waste memory and reduce the number of supportable concurrent connections. I say "may" waste because these are limits on memory use, and BIG-IP doesn't allocate the memory until it needs it for buffered data. Even so, the trick is to set the limits large enough that there are no performance penalties, but no larger.
Send Buffer and Receive Window are easy to set in principle, but can be tricky in practice. For both, answer these questions: What is the maximum bandwidth (Bytes/second) that BIG-IP might experience sending or receiving? Out of all paths data might travel, what minimum delay among those paths is the highest? (What is the "maximum of the minimums"?) Then you simply multiply Bytes/second by seconds of delay to get a number of bytes. This is the maximum amount of data that TCP ought to have in flight at any one time, which should be enough to prevent TCP connections from idling for lack of memory. If your application doesn't involve sending or receiving much data on that side of the proxy, you can probably get away with lowering the corresponding buffer size to save on memory. For example, a traditional HTTP proxy's clientside can probably afford a smaller receive buffer if memory-constrained.
There are three principles to follow in setting Proxy Buffer Limits:
Proxy Buffer High should be at least as big as the Send Buffer. Otherwise, if a large ACK clears the send buffer all at once, there may be less data available than TCP can send.
Proxy Buffer Low should be at least as big as the Receive Window on the peer TCP profile (i.e., for the clientside profile, use the receive window on the serverside profile). If not, when the peer connection exits the zero-window state, new data may not arrive before BIG-IP sends all the data it has.
Proxy Buffer High should be significantly larger than Proxy Buffer Low (we like to use a 64 KB gap) to avoid constant flapping to and from the zero-window state on the receive side.
Obviously, figuring out bandwidth and delay before a deployment can be tricky. This is a place where some iRule mojo can really come in handy. The TCP::rtt and TCP::bandwidth* commands can give you estimates of both quantities you need, even though the RTT isn't a minimum RTT. Alternatively, if you've enabled cmetrics-cache in the profile, you can also obtain historical data for a destination using the ROUTE::cwnd* command, which is a good (possibly low) guess at the value you should plug into the send and receive buffers. You can then set buffer limits directly using TCP::sendbuf**, TCP::recvwnd**, and TCP::proxybuffer**. Getting this to work very well will be difficult, and I don't have any examples where someone worked it through and proved a benefit. But if your application travels highly varied paths and you have the inclination to tinker, you could end up with an optimized configuration. If not, set the buffer sizes using conservatively high inputs and carry on.
*These iRule commands are only supported in TMOS® version 12.0.0 and later.
**These iRule commands are only supported in TMOS® version 11.6.0 and later.
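To make the per-connection idea concrete, here is a rough sketch using the commands above. It assumes TCP::bandwidth returns bytes per second, TCP::rtt returns milliseconds, and TCP::sendbuf and TCP::recvwnd each take a byte count; those unit and signature assumptions, along with the event choice, should be checked against the iRules documentation for your TMOS version:

    when HTTP_REQUEST {
        # Bandwidth-delay product in bytes; unit assumptions as noted above.
        set bdp [expr { ([TCP::bandwidth] * [TCP::rtt]) / 1000 }]
        # Only override the profile defaults when the path needs more
        # in-flight data than a 64 KB buffer allows.
        if { $bdp > 65536 } {
            TCP::sendbuf $bdp
            TCP::recvwnd $bdp
        }
    }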
Is TCP's Nagle Algorithm Right for Me?

Of all the settings in the TCP profile, the Nagle algorithm may get the most questions. Designed to avoid sending small packets wherever possible, the question of whether it's right for your application rarely has an easy, standard answer.

What does Nagle do?
Without the Nagle algorithm, in some circumstances TCP might send tiny packets. In the case of BIG-IP®, this would usually happen because the server delivers packets that are small relative to the clientside Maximum Transmission Unit (MTU). If Nagle is disabled, BIG-IP will simply send them, even though waiting for a few milliseconds would allow TCP to aggregate data into larger packets.
The result can be pernicious. Every TCP/IP packet has at least 40 bytes of header overhead, and in most cases 52 bytes. If payloads are small enough, most of your network traffic will be overhead, reducing the effective throughput of your connection. Second, clients with battery limitations really don't appreciate turning on their radios to send and receive packets more frequently than necessary. Lastly, some routers in the field give preferential treatment to smaller packets. If your data has a series of differently-sized packets, and the misfortune to encounter one of these routers, it will experience severe packet reordering, which can trigger unnecessary retransmissions and severely degrade performance.
Specified in RFC 896 all the way back in 1984, the Nagle algorithm gets around this problem by holding sub-MTU-sized data until the receiver has acked all outstanding data. In most cases, the next chunk of data is coming up right behind, and the delay is minimal.

What are the Drawbacks?
The benefits of aggregating data in fewer packets are pretty intuitive. But under certain circumstances, Nagle can cause problems:
In a proxy like BIG-IP, rewriting arriving packets in memory into a different, larger spot in memory taxes the CPU more than simply passing payloads through without modification.
If an application is "chatty," with message traffic passing back and forth, the added delay could add up to a lot of time. For example, imagine a network has a 1500 Byte MTU and the application needs a reply from the client after each 2000 Byte message. In the figure at right, the left diagram shows the exchange without Nagle: BIG-IP sends all the data in one shot, and the reply comes in one round trip, allowing it to deliver four messages in four round trips. On the right is the same exchange with Nagle enabled. Nagle withholds the 500 byte packet until the client acks the 1500 byte packet, meaning it takes two round trips to get the reply that allows the application to proceed. Thus sending four messages takes eight round trips. This scenario is a somewhat contrived worst case, but if your application is more like this than not, then Nagle is a poor choice.
If the client is using delayed acks (RFC 1122), it might not send an acknowledgment until up to 500ms after receipt of the packet. That's time BIG-IP is holding your data, waiting for acknowledgment. This multiplies the effect on chatty applications described above.

F5 Has Improved on Nagle
The drawbacks described above sound really scary, but I don't want to talk you out of using Nagle at all. The benefits are real, particularly if your application servers deliver data in small pieces and the application isn't very chatty.
More importantly, F5® has made a number of enhancements that remove a lot of the pain while keeping the gain:
Nagle-aware HTTP Profiles: all TMOS HTTP profiles send a special control message to TCP when they have no more data to send. This tells TCP to send what it has without waiting for more data to fill out a packet.
Autonagle: in TMOS v12.0, users can configure Nagle as "autotuned" instead of simply enabling or disabling it in their TCP profile. This mechanism starts out not executing the Nagle algorithm, but uses heuristics to test if the receiver is using delayed acknowledgments on a connection; if not, it applies Nagle for the remainder of the connection. If delayed acks are in use, TCP will not wait to send packets but will still try to concatenate small packets into MSS-size packets when all are available. [UPDATE: v13.0 substantially improves this feature.]
One small packet allowed per RTT: beginning with TMOS® v12.0, when 'auto' mode has enabled Nagle, TCP will allow one unacknowledged undersize packet at a time, rather than zero. This speeds up sending the sub-MTU tail of any message while not allowing a continuous stream of undersized packets. This averts the nightmare scenario above completely.
Given these improvements, the Nagle algorithm is suitable for a wide variety of applications and environments. It's worth looking at both your applications and the behavior of your servers to see if Nagle is right for you.
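If you want to experiment with the autotuned behavior described above, the knob lives in the TCP profile. A minimal sketch, with an illustrative profile name (verify the attribute spelling and the availability of the auto value on your TMOS version):

    tmsh create ltm profile tcp tcp_nagle_auto defaults-from tcp nagle auto
    tmsh list ltm profile tcp tcp_nagle_auto nagle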
Implementing ECC+PFS on LineRate (Part 1/3): Choosing ECC Curves and Preparing SSL Certificates

(Editors note: the LineRate product has been discontinued for several years. 09/2023)
---
Overview
In case you missed it, Why ECC and PFS Matter: SSL offloading with LineRate details some of the reasons why ECC-based SSL has advantages over RSA cryptography for both performance and security. This article will generate all the necessary ECC certificates with the secp384r1 curve so that they may be used to configure a LineRate System for SSL Offload.

Getting Started with LineRate
In order to appreciate the advantages of SSL/TLS Offload available via LineRate as discussed in this article, let's take a closer look at how to configure SSL/TLS Offloading on a LineRate system. This example will implement Elliptical Curve Cryptography and Perfect Forward Secrecy. SSL Offloading will be added to an existing LineRate System that has one public-facing Virtual IP (10.10.11.11) that proxies web requests to a Real Server on an internal network (10.10.10.1). The following diagram demonstrates this configuration:
Figure 1: A high-level implementation of SSL Offload
Overall, these steps will be completed in order to enable SSL Offloading on the LineRate System:
Generate a private key specifying the secp384r1 elliptic curve
Obtain a certificate from a CA
Configure an SSL profile and attach it to the Virtual IP
Note that this implementation will enable only ECDHE cipher suites. ECDH cipher suites are available, but these do not implement the PFS feature. Further, in production deployments, considerations to implement additional types of SSL cryptography might be needed in order to allow backward compatibility for older clients.

Generating a private key for Elliptical Curve Cryptography
When considering the ECC curve to use for your environment, you may choose one from the currently available curves list in the LineRate documentation. It is important to be cognizant of the curve support for the browsers or applications your application targets. Generally, the NIST P-256, P-384, and P-521 curves have the widest support. This example will use the secp384r1 (NIST P-384) curve, which provides an RSA-equivalent key strength of 7680 bits. Supported curves with OpenSSL can be found by running the openssl ecparam -list_curves command, which may be important depending on which curve is chosen for your SSL/TLS deployment.
Using OpenSSL, a private key is generated for use with ssloffload.lineratesystems.com. The ECC SECP curve over a 384-bit prime field (secp384r1) is specified:

openssl ecparam -genkey -name secp384r1 -out ssloffload.lineratesystems.com.key.pem

This command results in the following private key:

-----BEGIN EC PARAMETERS-----
BgUrgQQAIg==
-----END EC PARAMETERS-----
-----BEGIN EC PRIVATE KEY-----
MIGkAgEBBDD1Kx9hghSGCTujAaqlnU2hs/spEOhfpKY9EO3mYTtDmKqkuJLKtv1P
1/QINzAU7JigBwYFK4EEACKhZANiAASLp1bvf/VJBJn4kgUFundwvBv03Q7c3tlX
kh6Jfdo3lpP2Mf/K09bpt+4RlDKQynajq6qAJ1tJ6Wz79EepLB2U40fC/3OBDFQx
5gSjRp8Y6aq8c+H8gs0RKAL+I0c8xDo=
-----END EC PRIVATE KEY-----
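Before requesting a certificate, it can be worth a quick sanity check that the key really uses the intended curve. This uses only standard OpenSSL:

    openssl ec -in ssloffload.lineratesystems.com.key.pem -noout -text | head

The output should include "ASN1 OID: secp384r1", confirming the NIST P-384 curve.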
Generating a Certificate Request (CSR) to provide the Certificate Authority (CA)
After the private key is obtained, a certificate request (CSR) can be created. Using OpenSSL again, the following command is issued, filling out all relevant information in the successive prompts:

openssl req -new -key ssloffload.lineratesystems.com.key.pem -out ssloffload.lineratesystems.com.csr.pem

This results in the following CSR:

-----BEGIN CERTIFICATE REQUEST-----
MIIB3jCCAWQCAQAwga8xCzAJBgNVBAYTAlVTMREwDwYDVQQIEwhDb2xvcmFkbzET
MBEGA1UEBxMKTG91aXN2aWxsZTEUMBIGA1UEChMLRjUgTmV0d29ya3MxGTAXBgNV
BAsTEExpbmVSYXRlIFN5c3RlbXMxJzAlBgNVBAMTHnNzbG9mZmxvYWQubGluZXJh
dGVzeXN0ZW1zLmNvbTEeMBwGCSqGSIb3DQEJARYPYS5yYWdvbmVAZjUuY29tMHYw
EAYHKoZIzj0CAQYFK4EEACIDYgAEi6dW73/1SQSZ+JIFBbp3cLwb9N0O3N7ZV5Ie
iX3aN5aT9jH/ytPW6bfuEZQykMp2o6uqgCdbSels+/RHqSwdlONHwv9zgQxUMeYE
o0afGOmqvHPh/ILNESgC/iNHPMQ6oDUwFwYJKoZIhvcNAQkHMQoTCGNpc2NvMTIz
MBoGCSqGSIb3DQEJAjENEwtGNSBOZXR3b3JrczAJBgcqhkjOPQQBA2kAMGYCMQCn
h1NHGzigooYsohQBzf5P5KO3Z0/H24Z7w8nFZ/iGTEHa0+tmtGK/gNGFaSH1ULcC
MQCcFea3plRPm45l2hjsB/CusdNo0DJUPMubLRZ5mgeThS/N6Eb0AHJSjBJlE1fI
a4s=
-----END CERTIFICATE REQUEST-----

Obtaining a Certificate from a Certificate Authority (CA)
Rather than using a self-signed certificate, a test certificate is obtained from Entrust. Upon completing the certificate request and receiving the certificate from Entrust, a simple conversion to PEM format needs to be done. This can be done with the following OpenSSL command:

openssl x509 -inform der -in ssloffload.lineratesystems.com.cer -out ssloffload.lineratesystems.com.cer.pem

This results in the following certificate:

-----BEGIN CERTIFICATE-----
MIIC5jCCAm2gAwIBAgIETUKHWzAKBggqhkjOPQQDAzBtMQswCQYDVQQGEwJVUzEW
MBQGA1UEChMNRW50cnVzdCwgSW5jLjEfMB0GA1UECxMWRm9yIFRlc3QgUHVycG9z
ZXMgT25seTElMCMGA1UEAxMcRW50cnVzdCBFQ0MgRGVtb25zdHJhdGlvbiBDQTAe
Fw0xNDA4MTExODQ3MTZaFw0xNDEwMTAxOTE3MTZaMGkxHzAdBgNVBAsTFkZvciBU
ZXN0IFB1cnBvc2VzIE9ubHkxHTAbBgNVBAsTFFBlcnNvbmEgTm90IFZlcmlmaWVk
MScwJQYDVQQDEx5zc2xvZmZsb2FkLmxpbmVyYXRlc3lzdGVtcy5jb20wdjAQBgcq
hkjOPQIBBgUrgQQAIgNiAASLp1bvf/VJBJn4kgUFundwvBv03Q7c3tlXkh6Jfdo3
lpP2Mf/K09bpt+4RlDKQynajq6qAJ1tJ6Wz79EepLB2U40fC/3OBDFQx5gSjRp8Y
6aq8c+H8gs0RKAL+I0c8xDqjgeEwgd4wDgYDVR0PAQH/BAQDAgeAMB0GA1UdJQQW
MBQGCCsGAQUFBwMBBggrBgEFBQcDAjA3BgNVHR8EMDAuMCygKqAohiZodHRwOi8v
Y3JsLmVudHJ1c3QuY29tL0NSTC9lY2NkZW1vLmNybDApBgNVHREEIjAggh5zc2xv
ZmZsb2FkLmxpbmVyYXRlc3lzdGVtcy5jb20wHwYDVR0jBBgwFoAUJAVL4WSCGvgJ
zPt4eSH6cOaTMuowHQYDVR0OBBYEFESqK6HoSFIYkItcfekqqozX+z++MAkGA1Ud
EwQCMAAwCgYIKoZIzj0EAwMDZwAwZAIwXWvK2++3500EVaPbwvJ39zp2IIQ98f66
/7fgroRGZ2WoKLBzKHRljVd1Gyrl2E3BAjBG9yPQqTNuhPKk8mBSUYEi/CS7Z5xt
dXY/e7ivGEwi65z6iFCWuliHI55iLnXq7OU=
-----END CERTIFICATE-----

Note that the certificate generation process with Elliptical Curve Cryptography is very similar to that of traditional cryptographic algorithms like RSA. Only a few differences are found in the generation of the private key, where an ECC curve is specified.

Continue the Configuration
Now that the certificates needed to configure Elliptical Curve Cryptography have been created, it is time to configure SSL Offloading on LineRate. Part 2: Configuring SSL Offload on LineRate continues the demonstration of SSL Offloading by importing the certificate information generated in this article and getting the system up and running. In case you missed it, Why ECC and PFS Matter: SSL offloading with LineRate details some of the reasons why ECC-based SSL has advantages over RSA cryptography for both performance and security.
(Editors note: the LineRate product has been discontinued for several years. 09/2023)

Stay Tuned!
Next week, a demonstration of how to verify a correct implementation of SSL with ECC+PFS on LineRate will make its debut on DevCentral. The article will detail how to check for ECC SSL on the wire via Wireshark and in the browser. In the meantime, take some time to download LineRate and test out its SSL Offloading capabilities.
In case you missed any content, or would like to reference it again, here are the articles related to implementing SSL Offload with ECC and PFS on LineRate:
Why ECC and PFS Matter: SSL offloading with LineRate
Implementing ECC+PFS on LineRate (Part 1/3): Choosing ECC Curves and Preparing SSL Certificates
Implementing ECC+PFS on LineRate (Part 2/3): Configuring SSL Offload on LineRate
Implementing ECC+PFS on LineRate (Part 3/3): Confirming the Operation of SSL Offloading
iRule Execution Tracing and Performance Profiling, Part 2

In the last article we discussed some intriguing questions around iRules tracing and profiling. There is a lot we can do to facilitate iRules debugging and performance tuning, and we took the first step in BIG-IP v13.1. We are pleased to announce that the iRules tracing and profiling feature has been added to the TMOS tool chain in this release. This feature enables both users and the iRules infrastructure development team to broaden iRules functionality and application prospects. In this article we will describe the focal use cases and design principles of the feature, and we will use examples to demonstrate how to use it.

Use case and consideration
Following are the major use models we target and the design principles we employ for the feature in this release:
This feature is mostly used in a debug environment. Execution tracing and performance tuning are part of the development process. We envision that the most typical use case is that users turn to the tracing functionality while scripts are still in development. It is possible that the iRules are running on production systems; yet when users want to inspect performance issues, they take the scripts to a lab environment to collect performance data, analyze bottlenecks and experiment with algorithm tuning.
The solution needs to be non-intrusive. The Tcl language has tracing constructs in its core infrastructure; for example, trace is a Tcl core command. Historically the trace command is not supported by iRules. Although reusing Tcl's trace command as the basic facility for the tracing feature is a viable option, we made the choice that the feature does not require users to modify the script; instead, it is a passive observation solution: users configure the conditions and leave the script intact. This provides a smooth debugging experience: users can enable the feature, observe the result and adjust the feature configuration continuously.
A graphical interface is not critical in the first release. Ideally the feature would be presented in a GUI environment. We believe a command line interaction in the first release properly addresses the most critical use cases.
Separate data collecting and data scoping. Tracing comes with a high volume of data: the deeper and wider the tracing goes, the more data is generated. In this release we focus on picking the right data (the "occurrences" presented in the last article) and delivering the data timely and securely. Providing a tool chain to help users mine the data is a logical next step.

The Feature
OK, now we are ready to present the feature. The audience of this article is familiar with BIG-IP and iRules, so let's have fun and jump to some examples.

Example
Here is a simple iRules script:

ltm rule dc_1 {
    when CLIENT_ACCEPTED {
        if {[IP::remote_addr] eq "10.10.10.2"} {
            set the_one true
        } else {
            set the_one false
        }
    }
    when HTTP_REQUEST {
        if {$the_one} {
            HTTP::uri [string map {myPortal UserPortal} [HTTP::uri]]
        }
    }
}

Now issue the following TMSH commands to insert the tracer:

tmsh create ltm rule-profiler dc_1_tracer event-filter add { CLIENT_ACCEPTED HTTP_REQUEST } vs-filter add { /Common/vs1 } publisher tracer_pub1
tmsh modify ltm rule-profiler dc_1_tracer occ-mask { cmd cmd-vm event rule rule-vm var-mod }
tmsh modify ltm rule-profiler dc_1_tracer state enabled

The feature introduces a new TMSH configuration object, rule-profiler. There can be multiple rule-profiler objects; this facilitates different tracing scenarios.
The following attributes of rule-profiler are configured in the above TMSH commands:
event-filter, the iRule events to trace; if not defined, all events are traced.
vs-filter, the virtual servers to trace; if not defined, all virtual servers are traced.
occ-mask, the occurrences to trace. The viable values are as explained in the last article.
publisher, the syslog publisher to receive the tracing log.
The user needs to enable the rule-profiler object after the configuration. Now the tracing facility is ready. We start the tracing on running traffic with the following command:

tmsh start ltm rule-profiler dc_1_tracer

The lab setup for this example has the publisher point to the local syslog, and here is a partial capture of the output in /var/log/ltm:

1511494952932589,RP_EVENT_ENTRY,/Common/vs1,CLIENT_ACCEPTED,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80,0
1511494952932622,RP_RULE_ENTRY,/Common/vs1,/Common/dc_1,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80,0
1511494952932625,RP_RULE_VM_ENTRY,/Common/vs1,/Common/dc_1,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80,0
1511494952932628,RP_CMD_VM_ENTRY,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80,0
1511494952932630,RP_CMD_ENTRY,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80,0
1511494952932633,RP_CMD_EXIT,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80,0
1511494952932636,RP_CMD_VM_EXIT,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80,0
1511494952932638,RP_VAR_MOD,/Common/vs1,the_one=true,18291,0x94558076344064,10.10.10.2,46052,0,10.10.10.160,80

Now let us take a close look at the logs.

Occurrence fields
Each tracing occurrence, as defined by the "occ-mask" attribute, dumps one line to the log file; the following fields appear within each line:
Timestamp - This is the TMM timestamp at the occurrence, in microseconds.
Occurrence type - This is the type of the occurrence, corresponding to the "occ-mask" attribute definition. The meaning of each occurrence is defined in the last article. "RP" stands for "Rule Profiler".
Virtual server - This is the name of the virtual server on which the iRules are running.
Occurrence value - This is the value of the corresponding occurrence. For example, the first log in the above snippet is at the entry occurrence of iRule event CLIENT_ACCEPTED.
Process ID - This is the TMM process ID.
Flow ID - This is the flow ID.
Remote tuple - There are 3 fields: IP address, port and routing domain.
Local tuple - There are 3 fields: IP address, port and routing domain.

Stop Tracing
The user can stop the tracing by issuing this TMSH command:

tmsh stop ltm rule-profiler <rule-profiler name>

The tracer has a built-in timer: it stops tracing after 10 milliseconds. The user can adjust it through the following command:

tmsh modify ltm rule-profiler dc_1_tracer period <new value in ms>

Bytecode
As mentioned in the first article, the tracing feature supports bytecode tracing. Using the example above, add "bytecode" to "occ-mask":

tmsh modify ltm rule-profiler dc_1_tracer occ-mask { bytecode cmd cmd-vm event rule rule-vm var-mod }

With this addition, you will see the bytecode execution. But keep one thing in mind: you will see a lot more logging.
1511673715911247,RP_EVENT_ENTRY,/Common/vs1,CLIENT_ACCEPTED,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,
1511673715911273,RP_RULE_ENTRY,/Common/vs1,/Common/dc_1,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911277,RP_RULE_VM_ENTRY,/Common/vs1,/Common/dc_1,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911280,RP_CMD_BYTECODE,/Common/vs1,push1,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911282,RP_CMD_BYTECODE,/Common/vs1,invokeStk1,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911284,RP_CMD_VM_ENTRY,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911286,RP_CMD_ENTRY,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911288,RP_CMD_EXIT,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,0
1511673715911291,RP_CMD_VM_EXIT,/Common/vs1,IP::remote_addr,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,0
1511673715911293,RP_CMD_BYTECODE,/Common/vs1,push1,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911295,RP_CMD_BYTECODE,/Common/vs1,streq,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911296,RP_CMD_BYTECODE,/Common/vs1,push1,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911298,RP_CMD_BYTECODE,/Common/vs1,push1,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911300,RP_CMD_BYTECODE,/Common/vs1,storeScalarStk,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0,2
1511673715911302,RP_VAR_MOD,/Common/vs1,the_one=true,18291,0x94558076344064,10.10.10.2,41772,0,10.10.10.160,80,0

What's to come
This article described the TMSH commands to configure rule-profiler and how to interpret the tracing logs. The next article will talk about tips and tricks for using the tracing feature.
Authors: Jibin Han, Bonny Rais
URL rewrite through iRule

Hi guys,
I have one "Performance (HTTP)" virtual server on an F5 1600 series, and I want to change the URL "http://www.abc.com" to "http://partner.abc.com/xyz". I have tried all the scripts below:

1- when HTTP_REQUEST {
       if {([string tolower [HTTP::host]] equals "http://www.abc.com")}{
           HTTP::header replace Host "http://partner.abc.com/xyz"
       }
   }

2- when HTTP_REQUEST {
       if { not ([HTTP::uri] starts_with "/xyz") } {
           HTTP::uri /xyz[HTTP::uri]
       }
   }

3- when HTTP_REQUEST {
       if {[HTTP::uri] equals {http://www.abc.com}} {
           HTTP::uri {http://partner.abc.com/xyz}
       }
   }

but I wasn't successful! Can anyone help me with how I can do this through an iRule?
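For anyone landing on this thread: a common stumbling block in attempts like these is that HTTP::host contains only the hostname, never the "http://" scheme, so the first comparison can never match. A sketch combining the host and URI rewrites follows; note that Performance (HTTP) virtual servers support only a subset of iRule commands, so a Standard virtual server with an http profile is the safer home for this:

    when HTTP_REQUEST {
        # Compare against the bare hostname; HTTP::host carries no scheme.
        if { [string tolower [HTTP::host]] equals "www.abc.com" } {
            HTTP::header replace Host "partner.abc.com"
            # Prefix the path once, preserving the rest of the URI.
            if { not ([HTTP::uri] starts_with "/xyz") } {
                HTTP::uri "/xyz[HTTP::uri]"
            }
        }
    }

If the goal is for the client's address bar to show the new URL, a rewrite is not enough; an HTTP::redirect to "http://partner.abc.com/xyz" would be needed instead.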
high cpu usage independent from Traffic

Hello,
For a few weeks we have noticed very high CPU usage on the control plane and analysis plane every day for about 4 hours, from 9:00 to 13:00. Overall concurrent client-side connections are between 1200 and 1800. It also happens on the standby machine, so it is independent of traffic (this F5 handles traffic from the web and terminates SSL). The hardware is an i4800, but it is the same on our virtual test machine. Version: 16.1.3.3; on test: 16.1.3.4. Any hint where to look for the cause?
Thank you, Karl
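A few generic starting points for narrowing down a scheduled control-plane spike like this, using standard shell and tmsh commands (the daemon names you find will point at the subsystem, for example avrd or monpd for the analysis plane):

    # During the 09:00-13:00 window, see which host daemons consume CPU:
    top -b -d 5 -n 2 | head -25
    # Split of control-plane vs data-plane usage:
    tmsh show sys cpu
    # Look for jobs or log bursts that start around 09:00:
    ls /etc/cron.d/
    grep " 09:0" /var/log/ltm | head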
LTM - IP Forwarder Performance issues (Stateless Router config)

Hi all,
I'm wondering if anyone else has issues with using an IP forwarder in the manner described in this article (specifically, "Emulating stateless IP routing with BIG-IP LTM forwarding virtual servers"): https://support.f5.com/kb/en-us/solutions/public/7000/500/sol7595.html.
Here's the scenario: a VLAN attached behind the BIG-IP, which has the web servers on it. The MSSQL servers sit on a VLAN reachable through the BIG-IP. The connections all work; it's just that if the SQL traffic isn't routed through the BIG-IP, it works fine. Otherwise, behind the BIG-IP, there are severe delays. I'd suggest it would be a good idea not to route this through the BIG-IP, but I wondered what the F5 community's take on this would be. In short: a simple IP forwarder (stateless) for MSSQL traffic... good or bad idea?
Thanks, JD
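For reference, the stateless setup the linked article describes boils down to a FastL4 profile with loose flow handling attached to a forwarding virtual server. A sketch of what that might look like in tmsh; the object names are illustrative and the attribute spellings should be verified against your version:

    tmsh create ltm profile fastl4 fastL4_loose defaults-from fastL4 \
        loose-initialization enabled loose-close enabled reset-on-timeout disabled
    tmsh create ltm virtual vs_forward_sql destination 0.0.0.0:any mask any \
        ip-forward profiles add { fastL4_loose } vlans-enabled vlans add { internal }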
Back to Basics: The Many Modes of Proxies

The simplicity of the term "proxy" belies the complex topological options available. Understanding the different deployment options will enable your proxy deployment to fit your environment and, more importantly, your applications.
It seems so simple in theory. A proxy is a well-understood concept that is not peculiar to networking. Indeed, some folks vote by proxy, speak by proxy (translators), and even, on occasion, marry by proxy. A proxy, regardless of its purpose, sits between two entities and performs a service. In network architectures the most common use of a proxy is to provide load balancing services to enable scale, reliability and even performance for applications. Proxies can log data exchanges, act as a gatekeeper (authentication and authorization), scan inbound and outbound traffic for malicious content and more.
Proxies are a key strategic point of control in the data center because they are typically deployed as the go-between for end-users and applications. These go-between services are often referred to as virtual services, and for purposes of this blog that's what we'll call them. It's an important distinction because a single proxy can actually act in multiple modes on a per-virtual service basis.
That's all pretty standard stuff. What's not simple is when you start considering how you want your proxy to act. Should it be a full proxy? A half proxy? Should it route or forward? There are multiple options for these components and each has its pros and cons. Understanding each proxy "mode" is an important step toward architecting a suitable solution for your environment, as the mode determines the behavior of traffic as it traverses the proxy.

Standard Virtual Service (Full Application Proxy)
The standard virtual service provided by a full proxy fully terminates the transport layer connections (typically TCP) and establishes completely separate transport layer connections to the applications. This enables the proxy to intercept, inspect and ultimately interact with the data (traffic) as it's flowing through the system. Any time you need to inspect payloads (JSON, HTML, XML, etc.) or steer requests based on HTTP headers (URI, cookies, custom variables) on an ongoing basis, you'll need a virtual service in full proxy mode. A full proxy is able to perform application layer services. That is, it can act on protocol and data transported via an application protocol, such as HTTP.

Performance Layer 4 Service (Packet by Packet Proxy)
Before application layer proxy capabilities came into being, the primary model for proxies (and load balancers) was layer 4 virtual services. In this mode, a proxy can make decisions and interact with packets up to layer 4, the transport layer. For web traffic this almost always equates to TCP. This is the highest layer of the network stack at which SDN architectures based on OpenFlow are able to operate. Today this is often referred to as flow-based processing, as TCP connections are generally considered flows for purposes of configuring network-based services.
In this mode, a proxy processes each packet and maps it to a connection (flow) context. This type of virtual service is used for traffic that requires simple load balancing, policy-based network routing or high availability at the transport layer. Many proxies deployed on purpose-built hardware take advantage of FPGAs that make this type of virtual service execute at wire speed.
A packet-by-packet proxy is able to make decisions based on information related to layer 4 and below.
It cannot interact with application-layer data. The connection between the client and the server is actually "stitched" together in this mode, with the proxy primarily acting as a forwarding component after the initial handshake is completed, rather than as an endpoint or originating source as is the case with a full proxy.

IP Forwarding Virtual Service (Router)
For simple packet forwarding, where the destination is based not on a pooled resource but simply on a routing table, an IP forwarding virtual service turns your proxy into a packet-layer forwarder. An IP forwarding virtual server can be provisioned to rewrite the source IP address as the traffic traverses the service. This is done to force data to return through the proxy and is referred to as SNATing traffic. It uses transport layer (usually TCP) port multiplexing to accomplish stateful address translation. The address it chooses can be load balanced from a pool of addresses (a SNAT pool), or you can use an automatic SNAT capability.

Layer 2 Forwarding Virtual Service (Bridge)
For situations where a proxy should be used to bridge two different Ethernet collision domains, a layer 2 forwarding virtual service can be used. It can be provisioned as an opaque, semi-opaque, or transparent bridge. Bridging two Ethernet domains is like an old-timey water brigade: one guy fills a bucket of water (the client) and hands it to the next guy (the proxy), who hands it to the destination (the server/service), where it's thrown on the fire. The guy in the middle (the proxy) just bridges the gap (you're thinking what I'm thinking: that's where the term came from, right?) between the two Ethernet domains (networks).
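To tie the modes together, here is roughly how each virtual service type could be expressed in BIG-IP tmsh. The names, addresses and pool are illustrative, and the keywords should be verified against your version:

    # Standard virtual service (full application proxy):
    tmsh create ltm virtual vs_standard destination 10.10.11.11:80 pool web_pool profiles add { tcp http }
    # Performance Layer 4 virtual service (packet-by-packet):
    tmsh create ltm virtual vs_fastl4 destination 10.10.11.12:80 pool web_pool profiles add { fastL4 }
    # IP forwarding virtual service (router):
    tmsh create ltm virtual vs_ipfwd destination 0.0.0.0:any mask any ip-forward
    # Layer 2 forwarding virtual service (bridge):
    tmsh create ltm virtual vs_l2fwd destination 0.0.0.0:any mask any l2-forward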