F5 Friday: Latency, Logging, and Sprawl
#v11 Logging, necessary for a variety of reasons in the data center, can consume resources and introduce undesirable latency. Avoiding that latency improves application performance and, in some cases, the quality of the logs themselves.
Logging. It’s mandatory and, in some industries, critical. Logs are used not only for auditing and tracking but also for debugging, for data mining and analysis, and, in some tiers of the architecture, for replication and synchronization of data.
Logs are a critical component across the data center, of that there is no doubt. That’s why it’s particularly frustrating that their performance cost is also among the highest, lagging only slightly behind graphics processing. Given that very little graphics-related processing goes on in data center components, disk I/O leaps to the top of the stack when it comes to performance-impeding operations. The latency introduced by writing to a log often degrades the overall performance experienced by end users because the logging operations consume resources on the component. While logging is generally out-of-band and thus non-blocking today, it can still drag performance down by draining memory and spending CPU cycles on its required tasks.
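To make the out-of-band point concrete, here is a minimal Python sketch, illustrative rather than BIG-IP-specific, of decoupling log I/O from the request path using the standard library’s QueueHandler and QueueListener: the application thread only enqueues a record, and a background thread absorbs the disk latency.

    import logging
    import logging.handlers
    import queue

    # The request path only enqueues records; a background listener
    # thread performs the slow disk I/O, so write latency does not
    # block request handling.
    log_queue = queue.Queue(-1)                      # unbounded queue
    queue_handler = logging.handlers.QueueHandler(log_queue)

    file_handler = logging.FileHandler("app.log")    # the slow sink
    listener = logging.handlers.QueueListener(log_queue, file_handler)
    listener.start()                                 # background thread

    logger = logging.getLogger("app")
    logger.setLevel(logging.INFO)
    logger.addHandler(queue_handler)

    logger.info("handled request")  # returns immediately; I/O happens later
    listener.stop()                 # flush remaining records on shutdown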
Increasingly, as components are deployed in pairs, triples, and more (owing to scaling out both physically and virtually) these logs also introduce “log sprawl,” which increases administrative costs and makes troubleshooting more difficult. After all, if you aren’t sure through which instance of a device a specific request was sent, you can’t easily find it in the log file.
For all these reasons, centralized and generally off-box logging for data center components is becoming more critical. Consider it “logging as a service,” if you will. This is not a new concept; centralized syslog servers have long provided an easier-to-manage log service that just about every data center component can leverage.
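From a client’s perspective, “logging as a service” can be as simple as pointing a standard logging library at the central collector instead of the local disk. A minimal Python sketch, where the hostname loghost.example.com and port 514 are placeholders for your own syslog server:

    import logging
    import logging.handlers

    # Ship each record as a UDP datagram to a central syslog server
    # rather than writing it to local disk.
    logger = logging.getLogger("web-tier")
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=("loghost.example.com", 514))  # assumed collector
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)

    logger.info("request served")  # one datagram to the collector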
For load balancing services, the need is not only to centralize web-related logs but also to ensure they are written as fast as possible, to keep up with today’s demanding application environments. BIG-IP is no stranger to the need for high-speed, off-device logging, and with v11 it brings an open, high-speed logging engine to bear.
BIG-IP HIGH-SPEED LOGGING ENGINE
One of the benefits of a unified, internal architecture is the ability to share improvements in the underlying platform across all products ultimately deployed on that platform. This is the case with TMOS, F5’s core application delivery technology. By equipping TMOS with a high-speed logging engine capable of up to 200,000 UDP/TCP messages per second, every module deployed on the TMOS platform (LTM, GTM, APM, ASM, WA, and so on) automatically gains the benefits.
Support for both local and external (off-box) logging enables you to centralize the data in third-party logging engines and meet security and compliance requirements. That means you can leverage the visibility of a strategic point of control in the network to log web requests (and responses, if required) rather than spread the responsibility across what may be an unknown number of web servers. Consider that in a highly virtualized or cloud computing-based architecture, the number of servers required to meet current demand is variable, which makes collecting web-server-written logs difficult unless an off-server log service is leveraged. That’s because virtualized servers often simply write logs to the local disk, which may or may not be persistent enough to meet compliance or operational demands.
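The service side of that arrangement can be equally simple: a collector that persists whatever the ephemeral instances send it. The following Python sketch is purely illustrative; the address, port, and file name are assumptions:

    import socket

    # Receive datagrams from ephemeral web servers and append them to
    # one durable file, tagging each record with its source address so
    # a request can be traced back to the instance that handled it.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 5140))             # assumed listening port
    with open("central.log", "a", buffering=1) as out:  # line-buffered
        while True:
            datagram, (src_ip, _port) = sock.recvfrom(8192)
            out.write(f"{src_ip} {datagram.decode(errors='replace')}\n")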
Some upstream infrastructure may also modify the request and/or response, leaving logs with incomplete information. This happens, for example, when an external application delivery controller acts as a cookie gateway, a common function for adding security and consistency to web applications. Logging at a strategic point, closest to the client, therefore provides the most accurate picture of the request.
Consider, too, the impact of writing logs while under attack. DDoS counts on the consumption of resources to drain server and network component capacity, and by increasing the number of requests a server has to handle, it gains an added consumption bonus from the need to write each one to the log. The same is true on upstream network components, which compounds the impact and drains more resources than necessary. By enabling high-speed logging on upstream devices, offloading responsibility to a log service, and eliminating the need for web servers to also write to disk, the impact of a concerted DDoS attack can be managed more effectively.
And if you’re going to use an off-server log service, it is more efficient to do so at a point upstream from the web servers, where it also reduces resource consumption on those servers. Eliminating the resources that logging consumes on the web server can have a very positive impact on its performance and capacity, which, combined with improvements in logging speed and reduced consumption on the BIG-IP, translates into faster web applications and a simplified log management strategy.
High-speed logging (HSL) is configurable using the GUI (via the Request Logging Profile) and supports the W3C extended log format.
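For a sense of what consuming that output involves, here is a small Python sketch of a W3C extended log format reader; the field names in the sample are typical of the format, not a fixed BIG-IP schema:

    # W3C extended log format: '#' lines are directives; the
    # '#Fields:' directive names the columns for the entries below.
    def parse_w3c_extended(lines):
        fields = []
        for line in lines:
            line = line.rstrip("\n")
            if line.startswith("#Fields:"):
                fields = line.split()[1:]
            elif line and not line.startswith("#"):
                yield dict(zip(fields, line.split()))

    sample = [
        "#Version: 1.0",
        "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
        "2011-08-26 12:00:01 10.0.0.7 GET /index.html 200",
    ]
    for record in parse_w3c_extended(sample):
        print(record["c-ip"], record["cs-uri-stem"], record["sc-status"])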
Happy Logging!