dynamic infrastructure
282 TopicsBack to Basics: Health Monitors and Load Balancing
#webperf #ado Because every connection counts One of the truisms of architecting highly available systems is that you never, ever want to load balance a request to a system that is down. Therefore, some sort of health (status) monitoring is required. For applications, that means not just pinging the network interface or opening a TCP connection, it means querying the application and verifying that the response is valid. This, obviously, requires the application to respond. And respond often. Best practices suggest determining availability every 5 seconds or so. That means every X seconds the load balancing service is going to open up a connection to the application and make a request. Just like a user would do. That adds load to the application. It consumes network, transport, application and (possibly) database resources. Resources that cannot be used to service customers. While the impact on a single application may appear trivial, it's not. Remember, as load increases performance decreases. And no matter how trivial it may appear, health monitoring is adding load to what may be an already heavily loaded application. But Lori, you may be thinking, you expound on the importance of monitoring and visibility all the time! Are you saying we shouldn't be monitoring applications? Nope, not at all. Visibility is paramount, providing the actionable data necessary to enable highly dynamic, automated operations such as elasticity. Visibility through health-monitoring is a critical means of ensuring availability at both the local and global level. What we may need to do, however, is move from active to passive monitoring. PASSIVE MONITORING Passive monitoring, as the modifier suggests, is not an active process. The Load balancer does not open up connections nor query an application itself. Instead, it snoops on responses being returned to clients and from that infers the current status of the application. For example, if a request for content results in an HTTP error message, the load balancer can determine whether or not the application is available and capable of processing subsequent requests. If the load balancer is a BIG-IP, it can mark the service as "down" and invoke an active monitor to probe the application status as well as retrying the request to another available instance – insuring end-users do not see an error. Passive (inband) monitors are not binary. That is, they aren't simple "on" or "off" based on HTTP status codes. Such monitors can be configured to track the number of failures and evaluate failure rates against a configurable failure interval. When such thresholds are exceeded, the application can then be marked as "down". Passive monitors aren't restricted to availability status, either. They can also monitor for performance (response time). Failure to meet response time expectations results in a failure, and the application continues to be watched for subsequent failures. Passive monitors are, like most inline/inband technologies, transparent. They quietly monitor traffic and act upon that traffic without adding overhead to the process. Passive monitoring gives operations the visibility necessary to enable predictable performance and to meet or exceed user expectations with respect to uptime, without negatively impacting performance or capacity of the applications it is monitoring.2.9KViews1like2CommentsThe Limits of Cloud: Gratuitous ARP and Failover
#Cloud is great at many things. At other things, not so much. Understanding the limitations of cloud will better enable a successful migration strategy. One of the truisms of technology is that takes a few years of adoption before folks really start figuring out what it excels at – and conversely what it doesn't. That's generally because early adoption is focused on lab-style experimentation that rarely extends beyond basic needs. It's when adoption reaches critical mass and folks start trying to use the technology to implement more advanced architectures that the "gotchas" start to be discovered. Cloud is no exception. A few of the things we've learned over the past years of adoption is that cloud is always on, it's simple to manage, and it makes applications and infrastructure services easy to scale. Some of the things we're learning now is that cloud isn't so great at supporting application mobility, monitoring of deployed services and at providing advanced networking capabilities. The reason that last part is so important is that a variety of enterprise-class capabilities we've come to rely upon are ultimately enabled by some of the advanced networking techniques cloud simply does not support. Take gratuitous ARP, for example. Most cloud providers do not allow or support this feature which ultimately means an inability to take advantage of higher-level functions traditionally taken for granted in the enterprise – like failover. GRATUITOUS ARP and ITS IMPLICATIONS For those unfamiliar with gratuitous ARP let's get you familiar with it quickly. A gratuitous ARP is an unsolicited ARP request made by a network element (host, switch, device, etc… ) to resolve its own IP address. The source and destination IP address are identical to the source IP address assigned to the network element. The destination MAC is a broadcast address. Gratuitous ARP is used for a variety of reasons. For example, if there is an ARP reply to the request, it means there exists an IP conflict. When a system first boots up, it will often send a gratuitous ARP to indicate it is "up" and available. And finally, it is used as the basis for load balancing failover. To ensure availability of load balancing services, two load balancers will share an IP address (often referred to as a floating IP). Upstream devices recognize the "primary" device by means of a simple ARP entry associating the floating IP with the active device. If the active device fails, the secondary immediately notices (due to heartbeat monitoring between the two) and will send out a gratuitous ARP indicating it is now associated with the IP address and won't the rest of the network please send subsequent traffic to it rather than the failed primary. VRRP and HSRP may also use gratuitous ARP to implement router failover. Most cloud environments do not allow broadcast traffic of this nature. After all, it's practically guaranteed that you are sharing a network segment with other tenants, and thus broadcasting traffic could certainly disrupt other tenant's traffic. Additionally, as security minded folks will be eager to remind us, it is fairly well-established that the default for accepting gratuitous ARPs on the network should be "don't do it". The astute observer will realize the reason for this; there is no security, no ability to verify, no authentication, nothing. A network element configured to accept gratuitous ARPs does so at the risk of being tricked into trusting, explicitly, every gratuitous ARP – even those that may be attempting to fool the network into believing it is a device it is not supposed to be. That, in essence, is ARP poisoning, and it's one of the security risks associated with the use of gratuitous ARP. Granted, someone needs to be physically on the network to pull this off, but in a cloud environment that's not nearly as difficult as it might be on a locked down corporate network. Gratuitous ARP can further be used to execute denial of service, man in the middle and MAC flooding attacks. None of which have particularly pleasant outcomes, especially in a cloud environment where such attacks would be against shared infrastructure, potentially impacting many tenants. Thus cloud providers are understandably leery about allowing network elements to willy-nilly announce their own IP addresses. That said, most enterprise-class network elements have implemented protections against these attacks precisely because of the reliance on gratuitous ARP for various infrastructure services. Most of these protections use a technique that will tentatively accept a gratuitous ARP, but not enter it in its ARP cache unless it has a valid IP-to-MAC mapping, as defined by the device configuration. Validation can take the form of matching against DHCP-assigned addresses or existence in a trusted database. Obviously these techniques would put an undue burden on a cloud provider's network given that any IP address on a network segment might be assigned to a very large set of MAC addresses. Simply put, gratuitous ARP is not cloud-friendly, and thus it is you will be hard pressed to find a cloud provider that supports it. What does that mean? That means, ultimately, that failover mechanisms in the cloud cannot be based on traditional techniques unless a means to replicate gratuitous ARP functionality without its negative implications can be designed. Which means, unfortunately, that traditional failover architectures – even using enterprise-class load balancers in cloud environments – cannot really be implemented today. What that means for IT preparing to migrate business critical applications and services to cloud environments is a careful review of their requirements and of the cloud environment's capabilities to determine whether availability and uptime goals can – or cannot – be met using a combination of cloud and traditional load balancing services.1.1KViews1like0CommentsThe Full-Proxy Data Center Architecture
Why a full-proxy architecture is important to both infrastructure and data centers. In the early days of load balancing and application delivery there was a lot of confusion about proxy-based architectures and in particular the definition of a full-proxy architecture. Understanding what a full-proxy is will be increasingly important as we continue to re-architect the data center to support a more mobile, virtualized infrastructure in the quest to realize IT as a Service. THE FULL-PROXY PLATFORM The reason there is a distinction made between “proxy” and “full-proxy” stems from the handling of connections as they flow through the device. All proxies sit between two entities – in the Internet age almost always “client” and “server” – and mediate connections. While all full-proxies are proxies, the converse is not true. Not all proxies are full-proxies and it is this distinction that needs to be made when making decisions that will impact the data center architecture. A full-proxy maintains two separate session tables – one on the client-side, one on the server-side. There is effectively an “air gap” isolation layer between the two internal to the proxy, one that enables focused profiles to be applied specifically to address issues peculiar to each “side” of the proxy. Clients often experience higher latency because of lower bandwidth connections while the servers are generally low latency because they’re connected via a high-speed LAN. The optimizations and acceleration techniques used on the client side are far different than those on the LAN side because the issues that give rise to performance and availability challenges are vastly different. A full-proxy, with separate connection handling on either side of the “air gap”, can address these challenges. A proxy, which may be a full-proxy but more often than not simply uses a buffer-and-stitch methodology to perform connection management, cannot optimally do so. A typical proxy buffers a connection, often through the TCP handshake process and potentially into the first few packets of application data, but then “stitches” a connection to a given server on the back-end using either layer 4 or layer 7 data, perhaps both. The connection is a single flow from end-to-end and must choose which characteristics of the connection to focus on – client or server – because it cannot simultaneously optimize for both. The second advantage of a full-proxy is its ability to perform more tasks on the data being exchanged over the connection as it is flowing through the component. Because specific action must be taken to “match up” the connection as its flowing through the full-proxy, the component can inspect, manipulate, and otherwise modify the data before sending it on its way on the server-side. This is what enables termination of SSL, enforcement of security policies, and performance-related services to be applied on a per-client, per-application basis. This capability translates to broader usage in data center architecture by enabling the implementation of an application delivery tier in which operational risk can be addressed through the enforcement of various policies. In effect, we’re created a full-proxy data center architecture in which the application delivery tier as a whole serves as the “full proxy” that mediates between the clients and the applications. THE FULL-PROXY DATA CENTER ARCHITECTURE A full-proxy data center architecture installs a digital "air gap” between the client and applications by serving as the aggregation (and conversely disaggregation) point for services. Because all communication is funneled through virtualized applications and services at the application delivery tier, it serves as a strategic point of control at which delivery policies addressing operational risk (performance, availability, security) can be enforced. A full-proxy data center architecture further has the advantage of isolating end-users from the volatility inherent in highly virtualized and dynamic environments such as cloud computing . It enables solutions such as those used to overcome limitations with virtualization technology, such as those encountered with pod-architectural constraints in VMware View deployments. Traditional access management technologies, for example, are tightly coupled to host names and IP addresses. In a highly virtualized or cloud computing environment, this constraint may spell disaster for either performance or ability to function, or both. By implementing access management in the application delivery tier – on a full-proxy device – volatility is managed through virtualization of the resources, allowing the application delivery controller to worry about details such as IP address and VLAN segments, freeing the access management solution to concern itself with determining whether this user on this device from that location is allowed to access a given resource. Basically, we’re taking the concept of a full-proxy and expanded it outward to the architecture. Inserting an “application delivery tier” allows for an agile, flexible architecture more supportive of the rapid changes today’s IT organizations must deal with. Such a tier also provides an effective means to combat modern attacks. Because of its ability to isolate applications, services, and even infrastructure resources, an application delivery tier improves an organizations’ capability to withstand the onslaught of a concerted DDoS attack. The magnitude of difference between the connection capacity of an application delivery controller and most infrastructure (and all servers) gives the entire architecture a higher resiliency in the face of overwhelming connections. This ensures better availability and, when coupled with virtual infrastructure that can scale on-demand when necessary, can also maintain performance levels required by business concerns. A full-proxy data center architecture is an invaluable asset to IT organizations in meeting the challenges of volatility both inside and outside the data center. Related blogs & articles: The Concise Guide to Proxies At the Intersection of Cloud and Control… Cloud Computing and the Truth About SLAs IT Services: Creating Commodities out of Complexity What is a Strategic Point of Control Anyway? The Battle of Economy of Scale versus Control and Flexibility F5 Friday: When Firewalls Fail… F5 Friday: Platform versus Product4.4KViews1like1CommentUseful IT. Bringing Health Record Transfer into the 21st Century.
I read the Life as a Healthcare CIO blog on occasion, mostly because as a former radiographer, health care records integration and other non-diagnostic IT use in healthcare is a passing interest of mine. Within the last hospital I worked at the systems didn’t communicate – not even close, as in there was no effort to make them do so. This intrigues me, as since I’ve entered IT I have watched technology uptake in healthcare slowly ramp up at a great curve behind the rest of the business world. Oh make no mistake, technology has been in overdrive on the equipment used, but things like systems interoperability and utilizing technology to make doctors, nurses, and tech’s lives easier is just slower in the medical world. A huge chunk of the resistance is grounded in a very common sense philosophy. “When people’s lives are on the line you do not rush willy-nilly to the newest gadget.” No one in healthcare says it that way – at least not to my knowledge – but that’s the essence of what they think. I can think of a few businesses that could use that same mentality applied occasionally with a slightly different twist: “When the company’s viability is on the line…” but that’s a different blog. Even with this very common-sense resistance, there has been a steady acceleration of uptake in technology use for things like patient records and prescriptions. It has been interesting to watch, as someone on the outside with plenty of experience with the way hospitals worked and their systems were all silos. Healthcare IT is to be commended for things like electronic prescription pads and instant transfer of (now nearly all electronic) X-Rays to those who need them to care for the patient. Applying the “this can help with little impact on critical care” or even “this can help with positive impact on critical care and little risk of negative impact” viewpoint as a counter to the above-noted resistance has produced some astounding results. A friend of mine from my radiographer days is manager of a Cardiac Cath Lab, and talking with him is just fun. “Dude, ninety percent of the pups coming out of Radiology schools can’t set an exposure!” is evidence that diagnostic tools are continuing to take advantage of technology – in this case auto-detecting XRay exposure limits. He has more glowing things to say about the non-diagnostic growth of technology within any given organization. But outside the organization? Well that’s a completely different story. The healthcare organization wants to keep your records safe and intact, and rarely even want to let you touch them. That’s just a case of the “intact” bit. Some people might want their records to not contain some portion – like their blood alcohol level when brought to the ER – and some people might inadvertently lose some portion of the record. While they’re more than happy to send them on a referral, and willing to give you a copy if you’re seeking a second opinion, these records all have one archaic quality. Paper. If I want to buy a movie, I can go to netflix, sign up, and stream it (at least many of them) to watch. If I want my medical records transferred to a specialist so I can get treatment before my left eye oozes out of its socket, they have to be copied, verified, and mailed. If they’re short or my eye is on the verge of falling out right this instant, then they might be faxed. But the bulk of records are mailed. Even overnight is another day lost in the treatment cycle. Recently – the last couple of years – there has been a movement to replicate the records delivery process electronically. As time goes on, more and more of your medical records are being stored digitally. It saves room, time, and makes it easier for a doctor to “request” your record should he need it in a hurry. It also makes it easier to track accidental or even intentional changes in records. While it didn’t happen as often as fear-mongers and ambulance chasers want you to believe, of course there are deletions and misplacements in the medical records of the 300 million US citizens. An electronic system never forgets, so while something as simple as a piece of paper falling out of a record could forever change it, in electronic form that can’t happen. Even an intentional deletion can be “deleted” as in not show up, but still there, stored with your other information so that changes can be checked should the need ever arise. The inevitable off-shoot of electronic records is the ability to communicate them between hospitals. If you’re in the ER in Tulsa, and your normal doctor is in Manhattan, getting your records quickly and accurately could save your life. So it made sense that as the percentage of new records that were electronic grew, someone would start to put together a way to communicate them. No doubt you’re familiar with the debate about national health information databases, a centralized location for records is a big screaming target from many people’s perspectives, while it is a potentially life-saving technological advancement to others (they’re both right, but I think the infosec crowd has the stronger argument). But a smart group of people put together a project to facilitate doing electronically exactly what is being done today physically. The process is that the patient (or another doctor) requests the records be sent, they are pulled out, copied, mailed or faxed, and then a follow-up or “record received” communication occurs to insure that the source doctor got your records where they belong. Electronically this equates to the same thing, but instead of “selected” you get “looked up”, and instead of “mailed or faxed” you get “sent electronically”. There’s a lot more to it, but that’s the gist of The Direct Project. There are several reasons I got sucked into reading about this project. From a former healthcare worker’s perspective, it’s very cool to see non-diagnostic technology making a positive difference in healthcare, from a patient perspective, I would like the transfer of records to be as streamlined as possible, from the InfoSec perspective (I did a couple of brief stints in InfoSec), I like that it is not a massive database, but rather a “faster transit” mechanism, and from an F5 perspective, the possibilities for our gear to help make this viable were in my mind while reading. While Dr. Halamka has a lot of interesting stuff on his blog, this is one I followed the links and read the information about. It’s a pretty cool initiative, and what may seem very limiting in their scope assumptions holds true to the Direct Project’s idea of replacing the transfer mechanism and not creating a centralized database. While they’re not specifying formats to use during said transfer, they do list some recommended reading on that topic. What they do have is a registry of people who can receive records, and a system for transferring data over the wire. They worry about DNS-style health-care provider lookups, transfer protocols, and encryption, which is certainly a large enough chunk for them to bite off, and then they show how they fit into the larger nation-wide healthcare electronic records efforts going on. I hope they get it right, and the system they’re helping to build results in near-instantaneous secure records transfers, but many inventions are a product of the time and society in which they live, and even if The Direct Project fails, something like it will eventually succeed. If you’re in Healthcare IT, this is certainly a way to add value to the organization, and worth checking out. Meanwhile, I’m going to continue to delve into their work and the work of other organizations they’ve linked to and see if there isn’t a way F5 can help. After all, we can compress, dedupe, and encrypt communications on-the-wire, and the entire system is about on-the-wire communications, so it seems like a perfectly logical route to explore. Though the patient care guy in me will be reading up as much as the IT guy, because healthcare was a very rewarding field that seriously needed a bit more non-diagnostic technology when I was doing it.282Views1like0Comments