What is an Application Delivery Controller - Part I
A Little History Application Delivery got its start in the form of network-based load balancing hardware. It is the essential foundation on which Application Delivery Controllers (ADCs) operate. The second iteration of purpose-built load balancing (following application-based proprietary systems) materialized in the form of network-based appliances. These are the true founding fathers of today's ADCs. Because these devices were application-neutral and resided outside of the application servers themselves, they could load balance using straightforward network techniques. In essence, these devices would present a "virtual server" address to the outside world, and when users attempted to connect, they would forward the connection to the most appropriate real server doing bi-directional network address translation (NAT). Figure 1: Network-based load balancing appliances. With the advent of virtualization and cloud computing, the third iteration of ADCs arrived as software delivered virtual editions intended to run on hypervisors. Virtual editions of application delivery services have the same breadth of features as those that run on purpose-built hardware and remove much of the complexity from moving application services between virtual, cloud, and hybrid environments. They allow organizations to quickly and easily spin-up application services in private or public cloud environments. Basic Application Delivery Terminology It would certainly help if everyone used the same lexicon; unfortunately, every vendor of load balancing devices (and, in turn, ADCs) seems to use different terminology. With a little explanation, however, the confusion surrounding this issue can easily be alleviated. Node, Host, Member, and Server Most ADCs have the concept of a node, host, member, or server; some have all four, but they mean different things. There are two basic concepts that they all try to express. One concept—usually called a node or server—is the idea of the physical or virtual server itself that will receive traffic from the ADC. This is synonymous with the IP address of the physical server and, in the absence of a ADC, would be the IP address that the server name (for example, www.example.com) would resolve to. We will refer to this concept as the host. The second concept is a member (sometimes, unfortunately, also called a node by some manufacturers). A member is usually a little more defined than a server/node in that it includes the TCP port of the actual application that will be receiving traffic. For instance, a server named www.example.com may resolve to an address of 172.16.1.10, which represents the server/node, and may have an application (a web server) running on TCP port 80, making the member address 172.16.1.10:80. Simply put, the member includes the definition of the application port as well as the IP address of the physical server. We will refer to this as the service. Why all the complication? Because the distinction between a physical server and the application services running on it allows the ADC to individually interact with the applications rather than the underlying hardware or hypervisor. A host (172.16.1.10) may have more than one service available (HTTP, FTP, DNS, and so on). By defining each application uniquely (172.16.1.10:80, 172.16.1.10:21, and 172.16.1.10:53), the ADC can apply unique load balancing and health monitoring based on the services instead of the host. However, there are still times when being able to interact with the host (like low-level health monitoring or when taking a server offline for maintenance) is extremely convenient. Most load balancing-based technology uses some concept to represent the host, or physical server, and another to represent the services available on it— in this case, simply host and services. Pool, Cluster, and Farm Load balancing allows organizations to distribute inbound application traffic across multiple back-end destinations, including cloud deployments. It is therefore a necessity to have the concept of a collection of back-end destinations. Pools, as we will refer to them (also known as clusters or farms) are collections of similar services available on any number of hosts. For instance, all services that offer the company web page would be collected into a pool called "company web page" and all services that offer e-commerce services would be collected into a pool called "e-commerce." The key element here is that all systems have a collective object that refers to "all similar services" and makes it easier to work with them as a single unit. This collective object—a pool—is almost always made up of services, not hosts. Virtual Server Although not always the case, today the term virtual server means a server hosting virtual machines. It is important to note that like the definition of services, virtual server usually includes the application port was well as the IP address. The term "virtual service" would be more in keeping with the IP:Port convention; but because most vendors, ADC and Cloud alike use virtual server, this article uses virtual server as well. Putting It All Together Putting all of these concepts together makes up the basic steps in load balancing. The ADC presents virtual servers to the outside world. Each virtual server points to a pool of services that reside on one or more physical hosts. Figure 2: Application Delivery comprises four basic concepts—virtual servers, pool/cluster, services, and hosts. While the diagram above may not be representative of a real-world deployment, it does provide the elemental structure for continuing a discussion about application delivery basics. ps Next Steps ReadWhat is Load Balancing?if you haven't already and check out What is an Application Delivery Controller -Part II.3.4KViews0likes1CommentUsing F5 Distributed Cloud private connectivity orchestration for secure multi-cloud infrastructure
Introduction Enterprise businesses use modern apps that access services in many locations. Users running productivity apps, like Office365, must connect to services in the cloud from on-prem locations. To keep this running well, enterprises must provide connectivity that’s fast, reliable, and private. Traditionally, it has taken many steps to create private connections to a public cloud subscription and route application specific traffic to it. F5 Distributed Cloud Platform orchestrates ExpressRoutes in Azure and Direct Connect services in AWS, eliminating many of the steps needed for routing end-to-end. Distributed Cloud private connectivity orchestration makes it easier than ever to connect and configure routing over existing private and dedicated circuits from on-prem locations to cloud services running in AWS and in Azure. The illustration below outlines the basic components to an ExpressRoute service in Azure but there’s a lot more you’ll need to know about just under the cover. Without orchestration, many steps are needed to enable routing between on-prem sites and Azure. This requires expert knowledge of Azure Networking, numerous dependent resources to be built, and advanced routing protocols knowledge -- specifically the Border Gateway Protocol (BGP). Extend on-prem network to a colo provider Create and provision the ExpressRoute Circuit Create a Virtual Network Gateway Create a connection between ExpressRoute Circuit & Virtual Network Gateway (VNG) Configure a Route Server to propagate routes between VNG and on-prem Configure user-defined routes on each subnet on each VNet in Azure Using Distributed Cloud to orchestrate ExpressRoutes in Azure and Direct Connect in AWS, the total number of steps is effectively reduced to just an essential few. Additional benefits include no longer needing to be an expert in Azure Networking or in BGP routing, and you get the ability to control connectivity with intent-based policies natively built into the Distributed Cloud Platform. An example of an intent-based policy is to configure VNet tagging in Azure to use with a firewall policy that just allow access to specific apps or by select users. Additional policies that support tagging include Distributed Cloud WAAP and Distributed Cloud App Infrastructure Protection. The following details cover the key components needed to support direct connectivity and show how to create the services and deploy a privately routed app in Distributed Cloud. Building ExpressRoute to Azure Extend on-prem network to a colo provider Create and provision the ExpressRoute Circuit Enable the ExpressRoute orchestration feature on an Azure VNet Site configured in Distributed Cloud To create an ExpressRoute orchestrated configuration in Distributed Cloud, navigate to Multi-Cloud Network Connect > Site Management > Azure VNET Sites > Add Azure VNET Site or Manage Configuration for an existing Site. Enter the required parameters, and when you reach the “Ingress Gateway or Ingress/Egress Gateway”, select “Ingress/Egress Gateway (Two Interface) …””. Here you have the option to deploy on a Recommended Region or an Alternate Region. This selection depends entirely on your business’ cloud deployment model. After choosing the model that best fits your environment, configure the number of Availability Zones for the Gateway and subnets (new/existing) that it will join and Apply the settings. Now scroll down to Advanced Options (enabling Advanced Fields) and Select VNet type: Hub VNet. Click “View Configuration”, and any existing VNet’s from your Azure Subscription that should inherit orchestrated routing. Next, change the “Express Route Configuration” to Enabled to expand the dropdown to access the ExpressRoute Circuit and Virtual Network Gateway settings. Under “* Connections”, add the ExpressRoute Circuit configuration for your Azure subscription(s). The required fields are the Name and the Express Route Circuit, this is the Resource ID for the circuit in Azure. Note: When configuring more than one circuit, you may want to also configure the Routing Weight for circuit preference. When configuring an express route circuit from another subscription (not shown below), you’ll also need an Authorization Key. For ease of deployment, it’s recommended to use the default values for the remaining fields, including for the Gateway SKU, Subnet for Azure VNet Gateway, and Subnet for Azure Route Server, including ASN Configuration for BGP between Site and Azure Route Servers. After the configuration is fully saved and deployed, with site status Applied on the Cloud Sites page, all resources in Azure will now be set to use ExpressRoute Circuit(s) for all designated L3 routed traffic. Next, we’ll configured the orchestration of Direct Connect in an AWS VPC connected site. AWS TGW connected sites are also support. Building Direct Connect for AWS To create a Direct Connect orchestrated configuration in Distributed Cloud, navigate to Multi-Cloud Network Connect > Site Management > AWS VPC Sites > Add AWS VPC Site or Manage Configuration for an existing Site. Enter the required parameters, and when you reach the “Ingress Gateway or Ingress/Egress Gateway”, choose the form factor that meets your deployment requirements. Scroll down to Advanced Configuration, enable Advanced Fields, and then Enable Direct Connect. Configuring the Direct Connect connection feature, choose either Hosted VIF or Standard VIF mode. Use Hosted VIF when you’re already using the Direct Connect connection for other purposes in AWS or when the VIF is in another AWS subscription. Otherwise, choosing Standard VIF allows Distributed Cloud to automatically create the VIF, and dependent services in AWS mentioned below to access the Direct Connect connection. Standard VIF mode creates the following additional resources in AWS: Virtual Gateway (VGW): associating it to the VPC and enabling route propagation to inside route tables Direct Connect Gateway (DCG): associating it to the VGW Note: In Standard VIF mode, at the end of the deployment admins may copy the direct connect gateway ID and use it to create other VIF’s. Admins may also copy the ASN. This is the AWS side of the ASN that’s needed by network ops teams to configure BGP peering. Note: In Hosted VIF mode, independent site deployment is responsible for: Creating VGW, and associating it to the VPC and enabling route propagation to inside route tables Creating DCG and associating it to the VGW Accepting the Hosted VIF and linking to the DCGW VIF Optionally, you may configure the Custom ASN if needed to work with an existing BGP configuration or choose Auto to let Distributed Cloud figure it out. Apply the config, save changes, and exit to the general Sites page. After the configuration is fully saved and deployed and having site status Applied on the Cloud Sites page, all resources in AWS will now be set to use the Direct Connect Gateway for all designated L3 routed traffic. Adding Private Connectivity On-Prem The final part to this deployment is routing both the ExpressRoute and Direct Connect circuits to an on-prem site. Both circuits must terminate at a colo space, and then standard IT/NetOps teams handle the routing outside the realm of Distributed Cloud to the destination. Building a Distributed App w/ Private Connectivity With Distributed Cloud having orchestrated the routing to each site’s workload, and IT/NetOps configured routing on-prem, including propagating the on-prem routes on BGP, an app with components that work independently can now be accessed as one unified interface. An example of a distributed app that run perfectly in this environment is the demo app, Arcadia Finance. This app has four components: Main – Frontend Web interface API – An App module accessed by Main to support money transfers Refer-A-Friend (Not used) – An App module interface accessed by Main to invite friends Backend – A DB server that stores money transfer accounts used by the API module, stock portfolio positions used by the Main module, and email addresses saved by the Refer-A-Friend module. Functionally, the connection flow is as follows: Users access a VIP advertised by an F5 Global Network Regional Edge to the Internet User traffic is connected to the Main (frontend) app running in AWS via the F5 Global Network Main App connects to API in Azure to load the money-transfer side frame, and then to the Backend DB on-prem to load the stocks portfolio balances. These connections transit the private connectivity links created in this article. API App in Azure connects to the Backend DB on-prem to retrieve money transfer accounts. This connection transits the private connectivity links created in this article. To support this topology and configuration, the apps are divided and run as follows: AWS Frontend (nginx) Main (Web) Refer-A-Friend Azure API (App) On-Prem Backend (DB) To make the app reachable to users, use the Distributed Cloud console Sites Distributed Apps feature to create one HTTP Load Balancer with the VIP advertised to the Internet, and with the origin pool of the Frontend (nginx) app. Note: This step assumes that you have previously created a fully connected AWS CE Site with connectivity to your VPC’s and a Direct Connect circuit in the section above. Navigate to Multi-Cloud App Connect > Manage > Load Balancers > Origin Pools, create a new origin pool. In the pool creation menu, at the top, select “JSON” and change the format to YAML, then paste the following example, changing the specific values, such as the namespace, to match your environment: metadata: name: mcn-aws-workload-pool namespace: mcn-privatelinks labels: ves.io/app_type: arcadia annotations: {} disable: false spec: origin_servers: - private_ip: ip: 10.100.2.238 site_locator: site: tenant: acmecorp-tnxbsial namespace: system name: soln-eng-aws-dc kind: site inside_network: {} labels: {} no_tls: {} port: 8000 same_as_endpoint_port: {} loadbalancer_algorithm: LB_OVERRIDE endpoint_selection: LOCAL_PREFERRED With the origin pool created, navigate to Distributed Apps > Manage > Load Balancers > HTTP Load Balancers, and add a new one with the following YAML provided as an example: metadata: name: mcn-arcadia-frontend namespace: mcn-privatelinks labels: ves.io/app_type: arcadia annotations: {} disable: false spec: domains: - mcn-arcadia-frontend.demo.internal http: dns_volterra_managed: false port: 80 downstream_tls_certificate_expiration_timestamps: [] advertise_on_public_default_vip: {} default_route_pools: - pool: tenant: acmecorp-tnxbsial namespace: mcn-privatelinks name: mcn-aws-workload-pool kind: origin_pool weight: 1 priority: 1 endpoint_subsets: {} Internally verify end-to-end connectivity Opening a command line shell to the Frontend Web App, a variation of traceroute with the tool hping3 and using curl, reveals each hop identified as privately connected along with connectivity established directly to the destination working without an intermediary. The following IP addresses are used to support a TCP connection from the Frontend (Web) app running in AWS to the API app running in Azure: 10.100.2.238 (Source): Frontend (Web) 172.18.0.1: Container host node 192.168.1.6: AWS Direct Connect Gateway 192.168.1.5: On-Prem router 192.168.1.22: Azure ExpressRoute Circuit endpoint 10.101.1.5 (Destination): App (API) In the CLI output, note each hop and the value in the HTTP Response header “Server”: root@1e40062cb314:/etc/nginx# hping3 -ST -p 8080 api HPING api (eth2 10.101.1.5): S set, 40 headers + 0 data bytes hop=1 TTL 0 during transit from ip=172.18.0.1 name=UNKNOWN hop=1 hoprtt=7.7 ms hop=2 TTL 0 during transit from ip=192.168.1.6 name=UNKNOWN hop=2 hoprtt=31.4 ms hop=3 TTL 0 during transit from ip=192.168.1.5 name=UNKNOWN hop=3 hoprtt=67.3 ms hop=4 TTL 0 during transit from ip=192.168.1.22 name=UNKNOWN hop=4 hoprtt=67.2 ms ^C --- api hping statistic --- 8 packets transmitted, 4 packets received, 50% packet loss round-trip min/avg/max = 7.7/43.4/67.3 ms root@1e40062cb314:/etc/nginx# curl -v api:8080 * Rebuilt URL to: api:8080/ * Hostname was NOT found in DNS cache * Trying 10.101.1.5... * Connected to api (10.101.1.5) port 8080 (#0) > GET / HTTP/1.1 > User-Agent: curl/7.35.0 > Host: api:8080 > Accept: */* > < HTTP/1.1 200 OK * Server nginx/1.18.0 (Ubuntu) is not blacklisted < Server: nginx/1.18.0 (Ubuntu) < Date: Thu, 12 Jan 2023 18:59:32 GMT < Content-Type: text/html < Content-Length: 612 < Last-Modified: Fri, 11 Nov 2022 03:24:47 GMT < Connection: keep-alive < ETag: "636dc07f-264" < Accept-Ranges: bytes Conclusion As more services continue to be deployed to and run in the cloud, dedicated, reliable, and secure private connectivity is increasingly required by Enterprises. Establishing connectivity is not a rudimentary task and requires the assistance of many hands in different departments. Distributed Cloud private connectivity orchestration helps streamline this process by eliminating many of the steps required in each cloud provider, including no longer requiring dedicated cloud and routing protocol experts just to configure these services manually. To see all of this in action and to see how all the parts come together, watch the following video, a companion to this article. Visit the following resources for more information about this feature and other Distributed Cloud services: Multi-Cloud Network Connect Product Information Direct Connect orchestration for AWS TGW Sites Direct Connect orchestration for AWS VPC Sites ExpressRoutes orchestration for Azure VNet Sites YouTube Video2.7KViews8likes0CommentsWhat is Load Balancing?
tl;dr - Load Balancing is the process of distributing data across disparate services to provide redundancy, reliability, and improve performance. The entire intent of load balancing is to create a system that virtualizes the "service" from the physical servers that actually run that service. A more basic definition is to balance the load across a bunch of physical servers and make those servers look like one great big server to the outside world. There are many reasons to do this, but the primary drivers can be summarized as "scalability," "high availability," and "predictability." Scalability is the capability of dynamically, or easily, adapting to increased load without impacting existing performance. Service virtualization presented an interesting opportunity for scalability; if the service, or the point of user contact, was separated from the actual servers, scaling of the application would simply mean adding more servers or cloud resources which would not be visible to the end user. High Availability (HA) is the capability of a site to remain available and accessible even during the failure of one or more systems. Service virtualization also presented an opportunity for HA; if the point of user contact was separated from the actual servers, the failure of an individual server would not render the entire application unavailable. Predictability is a little less clear as it represents pieces of HA as well as some lessons learned along the way. However, predictability can best be described as the capability of having confidence and control in how the services are being delivered and when they are being delivered in regards to availability, performance, and so on. A Little Background Back in the early days of the commercial Internet, many would-be dot-com millionaires discovered a serious problem in their plans. Mainframes didn't have web server software (not until the AS/400e, anyway) and even if they did, they couldn't afford them on their start-up budgets. What they could afford was standard, off-the-shelf server hardware from one of the ubiquitous PC manufacturers. The problem for most of them? There was no way that a single PC-based server was ever going to handle the amount of traffic their idea would generate and if it went down, they were offline and out of business. Fortunately, some of those folks actually had plans to make their millions by solving that particular problem; thus was born the load balancing market. In the Beginning, There Was DNS Before there were any commercially available, purpose-built load balancing devices, there were many attempts to utilize existing technology to achieve the goals of scalability and HA. The most prevalent, and still used, technology was DNS round-robin. Domain name system (DNS) is the service that translates human-readable names (www.example.com) into machine recognized IP addresses. DNS also provided a way in which each request for name resolution could be answered with multiple IP addresses in different order. Figure 1: Basic DNS response for redundancy The first time a user requested resolution for www.example.com, the DNS server would hand back multiple addresses (one for each server that hosted the application) in order, say 1, 2, and 3. The next time, the DNS server would give back the same addresses, but this time as 2, 3, and 1. This solution was simple and provided the basic characteristics of what customer were looking for by distributing users sequentially across multiple physical machines using the name as the virtualization point. From a scalability standpoint, this solution worked remarkable well; probably the reason why derivatives of this method are still in use today particularly in regards to global load balancing or the distribution of load to different service points around the world. As the service needed to grow, all the business owner needed to do was add a new server, include its IP address in the DNS records, and voila, increased capacity. One note, however, is that DNS responses do have a maximum length that is typically allowed, so there is a potential to outgrow or scale beyond this solution. This solution did little to improve HA. First off, DNS has no capability of knowing if the servers listed are actually working or not, so if a server became unavailable and a user tried to access it before the DNS administrators knew of the failure and removed it from the DNS list, they might get an IP address for a server that didn't work. Proprietary Load Balancing in Software One of the first purpose-built solutions to the load balancing problem was the development of load balancing capabilities built directly into the application software or the operating system (OS) of the application server. While there were as many different implementations as there were companies who developed them, most of the solutions revolved around basic network trickery. For example, one such solution had all of the servers in a cluster listen to a "cluster IP" in addition to their own physical IP address. Figure 2: Proprietary cluster IP load balancing When the user attempted to connect to the service, they connected to the cluster IP instead of to the physical IP of the server. Whichever server in the cluster responded to the connection request first would redirect them to a physical IP address (either their own or another system in the cluster) and the service session would start. One of the key benefits of this solution is that the application developers could use a variety of information to determine which physical IP address the client should connect to. For instance, they could have each server in the cluster maintain a count of how many sessions each clustered member was already servicing and have any new requests directed to the least utilized server. Initially, the scalability of this solution was readily apparent. All you had to do was build a new server, add it to the cluster, and you grew the capacity of your application. Over time, however, the scalability of application-based load balancing came into question. Because the clustered members needed to stay in constant contact with each other concerning who the next connection should go to, the network traffic between the clustered members increased exponentially with each new server added to the cluster. The scalability was great as long as you didn't need to exceed a small number of servers. HA was dramatically increased with these solutions. However, since each iteration of intelligence-enabling HA characteristics had a corresponding server and network utilization impact, this also limited scalability. The other negative HA impact was in the realm of reliability. Network-Based Load balancing Hardware The second iteration of purpose-built load balancing came about as network-based appliances. These are the true founding fathers of today's Application Delivery Controllers. Because these boxes were application-neutral and resided outside of the application servers themselves, they could achieve their load balancing using much more straight-forward network techniques. In essence, these devices would present a virtual server address to the outside world and when users attempted to connect, it would forward the connection on the most appropriate real server doing bi-directional network address translation (NAT). Figure 3: Load balancing with network-based hardware The load balancer could control exactly which server received which connection and employed "health monitors" of increasing complexity to ensure that the application server (a real, physical server) was responding as needed; if not, it would automatically stop sending traffic to that server until it produced the desired response (indicating that the server was functioning properly). Although the health monitors were rarely as comprehensive as the ones built by the application developers themselves, the network-based hardware approach could provide at least basic load balancing services to nearly every application in a uniform, consistent manner—finally creating a truly virtualized service entry point unique to the application servers serving it. Scalability with this solution was only limited by the throughput of the load balancing equipment and the networks attached to it. It was not uncommon for organization replacing software-based load balancing with a hardware-based solution to see a dramatic drop in the utilization of their servers. HA was also dramatically reinforced with a hardware-based solution. Predictability was a core component added by the network-based load balancing hardware since it was much easier to predict where a new connection would be directed and much easier to manipulate. The advent of the network-based load balancer ushered in a whole new era in the architecture of applications. HA discussions that once revolved around "uptime" quickly became arguments about the meaning of "available" (if a user has to wait 30 seconds for a response, is it available? What about one minute?). This is the basis from which Application Delivery Controllers (ADCs) originated. The ADC Simply put, ADCs are what all good load balancers grew up to be. While most ADC conversations rarely mention load balancing, without the capabilities of the network-based hardware load balancer, they would be unable to affect application delivery at all. Today, we talk about security, availability, and performance, but the underlying load balancing technology is critical to the execution of all. Next Steps Ready to plunge into the next level of Load Balancing? Take a peek at these resources: Go Beyond POLB (Plain Old Load Balancing) The Cloud-Ready ADC BIG-IP Virtual Edition Products, The Virtual ADCs Your Application Delivery Network Has Been Missing Cloud Balancing: The Evolution of Global Server Load Balancing22KViews0likes1CommentIntro to Load Balancing for Developers – The Algorithms
If you’re new to this series, you can find the complete list of articles in the series on my personal page here If you are writing applications to sit behind a Load Balancer, it behooves you to at least have a clue what the algorithm your load balancer uses is about. We’re taking this week’s installment to just chat about the most common algorithms and give a plain- programmer description of how they work. While historically the algorithm chosen is both beyond the developers’ control, you’re the one that has to deal with performance problems, so you should know what is happening in the application’s ecosystem, not just in the application. Anything that can slow your application down or introduce errors is something worth having reviewed. For algorithms supported by the BIG-IP, the text here is paraphrased/modified versions of the help text associated with the Pool Member tab of the BIG-IP UI. If they wrote a good description and all I needed to do was programmer-ize it, then I used it. For algorithms not supported by the BIG-IP I wrote from scratch. Note that there are many, many more algorithms out there, but as you read through here you’ll see why these (or minor variants of them) are the ones you’ll see the most. Plain Programmer Description: Is not intended to say anything about the way any particular dev team at F5 or any other company writes these algorithms, they’re just an attempt to put the process into terms that are easier for someone with a programming background to understand. Hopefully a successful attempt. Interestingly enough, I’ve pared down what BIG-IP supports to a subset. That means that F5 employees and aficionados will be going “But you didn’t mention…!” and non-F5 employees will likely say “But there’s the Chi-Squared Algorithm…!” (no, chi-squared is theoretical distribution method I know of because it was presented as a proof for testing the randomness of a 20 sided die, ages ago in Dragon Magazine). The point being that I tried to stick to a group that builds on each other in some connected fashion. So send me hate mail… I’m good. Unless you can say more than 2-5% of the world’s load balancers are running the algorithm, I won’t consider that I missed something important. The point is to give developers and software architects a familiarity with core algorithms, not to build the worlds most complete lexicon of algorithms. Random: This load balancing method randomly distributes load across the servers available, picking one via random number generation and sending the current connection to it. While it is available on many load balancing products, its usefulness is questionable except where uptime is concerned – and then only if you detect down machines. Plain Programmer Description: The system builds an array of Servers being load balanced, and uses the random number generator to determine who gets the next connection… Far from an elegant solution, and most often found in large software packages that have thrown load balancing in as a feature. Round Robin: Round Robin passes each new connection request to the next server in line, eventually distributing connections evenly across the array of machines being load balanced. Round Robin works well in most configurations, but could be better if the equipment that you are load balancing is not roughly equal in processing speed, connection speed, and/or memory. Plain Programmer Description: The system builds a standard circular queue and walks through it, sending one request to each machine before getting to the start of the queue and doing it again. While I’ve never seen the code (or actual load balancer code for any of these for that matter), we’ve all written this queue with the modulus function before. In school if nowhere else. Weighted Round Robin (called Ratio on the BIG-IP): With this method, the number of connections that each machine receives over time is proportionate to a ratio weight you define for each machine. This is an improvement over Round Robin because you can say “Machine 3 can handle 2x the load of machines 1 and 2”, and the load balancer will send two requests to machine #3 for each request to the others. Plain Programmer Description: The simplest way to explain for this one is that the system makes multiple entries in the Round Robin circular queue for servers with larger ratios. So if you set ratios at 3:2:1:1 for your four servers, that’s what the queue would look like – 3 entries for the first server, two for the second, one each for the third and fourth. In this version, the weights are set when the load balancing is configured for your application and never change, so the system will just keep looping through that circular queue. Different vendors use different weighting systems – whole numbers, decimals that must total 1.0 (100%), etc. but this is an implementation detail, they all end up in a circular queue style layout with more entries for larger ratings. Dynamic Round Robin (Called Dynamic Ratio on the BIG-IP): is similar to Weighted Round Robin, however, weights are based on continuous monitoring of the servers and are therefore continually changing. This is a dynamic load balancing method, distributing connections based on various aspects of real-time server performance analysis, such as the current number of connections per node or the fastest node response time. This Application Delivery Controller method is rarely available in a simple load balancer. Plain Programmer Description: If you think of Weighted Round Robin where the circular queue is rebuilt with new (dynamic) weights whenever it has been fully traversed, you’ll be dead-on. Fastest: The Fastest method passes a new connection based on the fastest response time of all servers. This method may be particularly useful in environments where servers are distributed across different logical networks. On the BIG-IP, only servers that are active will be selected. Plain Programmer Description: The load balancer looks at the response time of each attached server and chooses the one with the best response time. This is pretty straight-forward, but can lead to congestion because response time right now won’t necessarily be response time in 1 second or two seconds. Since connections are generally going through the load balancer, this algorithm is a lot easier to implement than you might think, as long as the numbers are kept up to date whenever a response comes through. These next three I use the BIG-IP name for. They are variants of a generalized algorithm sometimes called Long Term Resource Monitoring. Least Connections: With this method, the system passes a new connection to the server that has the least number of current connections. Least Connections methods work best in environments where the servers or other equipment you are load balancing have similar capabilities. This is a dynamic load balancing method, distributing connections based on various aspects of real-time server performance analysis, such as the current number of connections per node or the fastest node response time. This Application Delivery Controller method is rarely available in a simple load balancer. Plain Programmer Description: This algorithm just keeps track of the number of connections attached to each server, and selects the one with the smallest number to receive the connection. Like fastest, this can cause congestion when the connections are all of different durations – like if one is loading a plain HTML page and another is running a JSP with a ton of database lookups. Connection counting just doesn’t account for that scenario very well. Observed: The Observed method uses a combination of the logic used in the Least Connections and Fastest algorithms to load balance connections to servers being load-balanced. With this method, servers are ranked based on a combination of the number of current connections and the response time. Servers that have a better balance of fewest connections and fastest response time receive a greater proportion of the connections. This Application Delivery Controller method is rarely available in a simple load balancer. Plain Programmer Description: This algorithm tries to merge Fastest and Least Connections, which does make it more appealing than either one of the above than alone. In this case, an array is built with the information indicated (how weighting is done will vary, and I don’t know even for F5, let alone our competitors), and the element with the highest value is chosen to receive the connection. This somewhat counters the weaknesses of both of the original algorithms, but does not account for when a server is about to be overloaded – like when three requests to that query-heavy JSP have just been submitted, but not yet hit the heavy work. Predictive: The Predictive method uses the ranking method used by the Observed method, however, with the Predictive method, the system analyzes the trend of the ranking over time, determining whether a servers performance is currently improving or declining. The servers in the specified pool with better performance rankings that are currently improving, rather than declining, receive a higher proportion of the connections. The Predictive methods work well in any environment. This Application Delivery Controller method is rarely available in a simple load balancer. Plain Programmer Description: This method attempts to fix the one problem with Observed by watching what is happening with the server. If its response time has started going down, it is less likely to receive the packet. Again, no idea what the weightings are, but an array is built and the most desirable is chosen. You can see with some of these algorithms that persistent connections would cause problems. Like Round Robin, if the connections persist to a server for as long as the user session is working, some servers will build a backlog of persistent connections that slow their response time. The Long Term Resource Monitoring algorithms are the best choice if you have a significant number of persistent connections. Fastest works okay in this scenario also if you don’t have access to any of the dynamic solutions. That’s it for this week, next week we’ll start talking specifically about Application Delivery Controllers and what they offer – which is a whole lot – that can help your application in a variety of ways. Until then! Don.21KViews1like9CommentsUse F5 Distributed Cloud to service chain WAAP and CDN
The Content Delivery Network (CDN) market has become increasingly commoditized. Many providers have augmented their CDN capabilities with WAFs/WAAPs, DNS, load balancing, edge compute, and networking. Managing all these solutions together creates a web of operational complexity, which can be confusing. F5’s synergistic bundling of CDN with Web Application and API Protection (WAAP) benefits those looking for simplicity and ease of use. It provides a way around the complications and silos that many resource-strapped organizations face with their IT systems. This bundling also signifies how CDN has become a commodity product often not purchased independently anymore. This trend is encouraging many competitors to evolve their capabilities to include edge computing – a space where F5 has gained considerable experience in recent years. F5 is rapidly catching up to other providers’ CDNs. F5’s experience and leadership building the world’s best-of-breed Application Delivery Controller (ADC), the BIG-IP load balancer, put it in a unique position to offer the best application delivery and security services directly at the edge with many of its CDN points of presence. With robust regional edge capabilities and a global network, F5 has entered the CDN space with a complementary offering to an already compelling suite of features. This includes the ability to run microservices and Kubernetes workloads anywhere, with a complete range of services to support app infrastructure deployment, scale, and lifecycle management all within a single management console. With advancements made in the application security space at F5, WAAP capabilities are directly integrated into the Distributed Cloud Platform to protect both web apps and APIs. Features include (yet not limited to): Web Application Firewall: Signature + Behavioral WAF functionality Bot Defense: Detect client signals, determining if clients are human or automated DDoS Mitigation: Fully managed by F5 API Security: Continuous inspection and detection of shadow APIs Solution Combining the Distributed Cloud WAAP with CDN as a form of service chaining is a straightforward process. This not only gives you the best security protection for web apps and APIs, but also positions apps regionally to deliver them with low latency and minimal compute per request. In the following solution, we’ve combined Distributed Cloud WAAP and CDN to globally deliver an app protected by a WAF policy from the closest regional point of presence to the user. Follow along as I demonstrate how to configure the basic elements. Configuration Log in to the Distributed Cloud Console and navigate to the DNS Management service. Decide if you want Distributed Cloud to manage the DNS zone as a Primary DNS server or if you’d rather delegate the fully qualified domain name (FQDN) for your app to Distributed Cloud with a CNAME. While using Delegation or Managed DNS is optional, doing so makes it possible for Distributed Cloud to automatically create and manage the SSL certificates needed to securely publish your app. Next, in Distributed Cloud Console, navigate to the Web App and API Protection service, then go to App Firewall, then Add App Firewall. This is where you’ll create the security policy that we’ll later connect our HTTP LB. Let’s use the following basic WAF policy in YAML format, you can paste it directly in to the Console by changing the configuration view to JSON and then changing the format to YAML. Note: This uses the namespace “waap-cdn”, change this to match your individual tenant’s configuration. metadata: name: buytime-waf namespace: waap-cdn labels: {} annotations: {} disable: false spec: blocking: {} detection_settings: signature_selection_setting: default_attack_type_settings: {} high_medium_low_accuracy_signatures: {} enable_suppression: {} enable_threat_campaigns: {} default_violation_settings: {} bot_protection_setting: malicious_bot_action: BLOCK suspicious_bot_action: REPORT good_bot_action: REPORT allow_all_response_codes: {} default_anonymization: {} use_default_blocking_page: {} With the WAF policy saved, it’s time to configure the origin server. Navigate to Load Balancers > Origin Pools, then Add Origin Pool. The following YAML uses a FQDN DNS name reach the app server. Using an IP address for the server is possible as well. metadata: name: buytime-pool namespace: waap-cdn labels: {} annotations: {} disable: false spec: origin_servers: - public_name: dns_name: webserver.f5-cloud-demo.com labels: {} no_tls: {} port: 80 same_as_endpoint_port: {} healthcheck: [] loadbalancer_algorithm: LB_OVERRIDE endpoint_selection: LOCAL_PREFERRED With the supporting WAF and Origin Pool resources configured, it’s time to create the HTTP Load Balancer. Navigate to Load Balancers > HTTP Load Balancers, then create a new one. Use the following YAML to create the LB and use both resources created above. metadata: name: buytime-online namespace: waap-cdn labels: {} annotations: {} disable: false spec: domains: - buytime.waap.f5-cloud-demo.com https_auto_cert: http_redirect: true add_hsts: true port: 443 tls_config: default_security: {} no_mtls: {} default_header: {} enable_path_normalize: {} non_default_loadbalancer: {} header_transformation_type: default_header_transformation: {} advertise_on_public_default_vip: {} default_route_pools: - pool: tenant: your-tenant-uid namespace: waap-cdn name: buytime-pool kind: origin_pool weight: 1 priority: 1 endpoint_subsets: {} routes: [] app_firewall: tenant: your-tenant-uid namespace: waap-cdn name: buytime-waf kind: app_firewall add_location: true no_challenge: {} user_id_client_ip: {} disable_rate_limit: {} waf_exclusion_rules: [] data_guard_rules: [] blocked_clients: [] trusted_clients: [] ddos_mitigation_rules: [] service_policies_from_namespace: {} round_robin: {} disable_trust_client_ip_headers: {} disable_ddos_detection: {} disable_malicious_user_detection: {} disable_api_discovery: {} disable_bot_defense: {} disable_api_definition: {} disable_ip_reputation: {} disable_client_side_defense: {} resource_version: "517528014" With the HTTP LB successfully deployed, check that its status is ready on the status page. You can verify the LB is working by sending a basic request using the command line tool, curl. Confirm that the value of the HTTP header “Server” is “volt-adc”. da.potter@lab ~ % curl -I https://buytime.waap.f5-cloud-demo.com HTTP/2 200 date: Mon, 17 Oct 2022 23:23:55 GMT content-type: text/html; charset=UTF-8 content-length: 2200 vary: Origin access-control-allow-credentials: true accept-ranges: bytes cache-control: public, max-age=0 last-modified: Wed, 24 Feb 2021 11:06:36 GMT etag: W/"898-177d3b82260" x-envoy-upstream-service-time: 136 strict-transport-security: max-age=31536000 set-cookie: 1f945=1666049035840-557942247; Path=/; Domain=f5-cloud-demo.com; Expires=Sun, 17 Oct 2032 23:23:55 GMT set-cookie: 1f9403=viJrSNaAp766P6p6EKZK7nyhofjXCVawnskkzsrMBUZIoNQOEUqXFkyymBAGlYPNQXOUBOOYKFfs0ne+fKAT/ozN5PM4S5hmAIiHQ7JAh48P4AP47wwPqdvC22MSsSejQ0upD9oEhkQEeTG1Iro1N9sLh+w+CtFS7WiXmmJFV9FAl3E2; path=/ x-volterra-location: wes-sea server: volt-adc Now it’s time to configure the CDN Distribution and service chain it to the WAAP HTTP LB. Navigate to Content Delivery Network > Distributions, then Add Distribution. The following YAML creates a basic CDN configuration that uses the WAAP HTTP LB above. metadata: name: buytime-cdn namespace: waap-cdn labels: {} annotations: {} disable: false spec: domains: - buytime.f5-cloud-demo.com https_auto_cert: http_redirect: true add_hsts: true tls_config: tls_12_plus: {} add_location: false more_option: cache_ttl_options: cache_ttl_override: 1m origin_pool: public_name: dns_name: buytime.waap.f5-cloud-demo.com use_tls: use_host_header_as_sni: {} tls_config: default_security: {} volterra_trusted_ca: {} no_mtls: {} origin_servers: - public_name: dns_name: buytime.waap.f5-cloud-demo.com follow_origin_redirect: false resource_version: "518473853" After saving the configuration, verify that the status is “Active”. You can confirm the CDN deployment status for each individual region by going to the distribution’s action button “Show Global Status”, and scrolling down to each region to see that each region’s “site_status.status” value is “DEPLOYMENT_STATUS_DEPLOYED”. Verification With the CDN Distribution successfully deployed, it’s possible to confirm with the following basic request using curl. Take note of the two HTTP headers “Server” and “x-cache-status”. The Server value will now be “volt-cdn”, and the x-cache-status will be “MISS” for the first request. da.potter@lab ~ % curl -I https://buytime.f5-cloud-demo.com HTTP/2 200 date: Mon, 17 Oct 2022 23:24:04 GMT content-type: text/html; charset=UTF-8 content-length: 2200 vary: Origin access-control-allow-credentials: true accept-ranges: bytes cache-control: public, max-age=0 last-modified: Wed, 24 Feb 2021 11:06:36 GMT etag: W/"898-177d3b82260" x-envoy-upstream-service-time: 63 strict-transport-security: max-age=31536000 set-cookie: 1f945=1666049044863-471593352; Path=/; Domain=f5-cloud-demo.com; Expires=Sun, 17 Oct 2032 23:24:04 GMT set-cookie: 1f9403=aCNN1JINHqvWPwkVT5OH3c+OIl6+Ve9Xkjx/zfWxz5AaG24IkeYqZ+y6tQqE9CiFkNk+cnU7NP0EYtgGnxV0dLzuo3yHRi3dzVLT7PEUHpYA2YSXbHY6yTijHbj/rSafchaEEnzegqngS4dBwfe56pBZt52MMWsUU9x3P4yMzeeonxcr; path=/ x-volterra-location: dal3-dal server: volt-cdn x-cache-status: MISS strict-transport-security: max-age=31536000 To see a security violation detected by the WAF in real-time, you can simulate a simple XSS exploit with the following curl: da.potter@lab ~ % curl -Gv "https://buytime.f5-cloud-demo.com?<script>('alert:XSS')</script>" * Trying x.x.x.x:443... * Connected to buytime.f5-cloud-demo.com (x.x.x.x) port 443 (#0) * ALPN, offering h2 * ALPN, offering http/1.1 * successfully set certificate verify locations: * CAfile: /etc/ssl/cert.pem * CApath: none * (304) (OUT), TLS handshake, Client hello (1): * (304) (IN), TLS handshake, Server hello (2): * TLSv1.2 (IN), TLS handshake, Certificate (11): * TLSv1.2 (IN), TLS handshake, Server key exchange (12): * TLSv1.2 (IN), TLS handshake, Server finished (14): * TLSv1.2 (OUT), TLS handshake, Client key exchange (16): * TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1): * TLSv1.2 (OUT), TLS handshake, Finished (20): * TLSv1.2 (IN), TLS change cipher, Change cipher spec (1): * TLSv1.2 (IN), TLS handshake, Finished (20): * SSL connection using TLSv1.2 / ECDHE-ECDSA-AES256-GCM-SHA384 * ALPN, server accepted to use h2 * Server certificate: * subject: CN=buytime.f5-cloud-demo.com * start date: Oct 14 23:51:02 2022 GMT * expire date: Jan 12 23:51:01 2023 GMT * subjectAltName: host "buytime.f5-cloud-demo.com" matched cert's "buytime.f5-cloud-demo.com" * issuer: C=US; O=Let's Encrypt; CN=R3 * SSL certificate verify ok. * Using HTTP2, server supports multiplexing * Connection state changed (HTTP/2 confirmed) * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0 * Using Stream ID: 1 (easy handle 0x14f010000) > GET /?<script>('alert:XSS')</script> HTTP/2 > Host: buytime.f5-cloud-demo.com > user-agent: curl/7.79.1 > accept: */* > * Connection state changed (MAX_CONCURRENT_STREAMS == 128)! < HTTP/2 200 < date: Sat, 22 Oct 2022 01:04:39 GMT < content-type: text/html; charset=UTF-8 < content-length: 269 < cache-control: no-cache < pragma: no-cache < set-cookie: 1f945=1666400679155-452898837; Path=/; Domain=f5-cloud-demo.com; Expires=Fri, 22 Oct 2032 01:04:39 GMT < set-cookie: 1f9403=/1b+W13c7xNShbbe6zE3KKUDNPCGbxRMVhI64uZny+HFXxpkJMsCKmDWaihBD4KWm82reTlVsS8MumTYQW6ktFQqXeFvrMDFMSKdNSAbVT+IqQfSuVfVRfrtgRkvgzbDEX9TUIhp3xJV3R1jdbUuAAaj9Dhgdsven8FlCaADENYuIlBE; path=/ < x-volterra-location: dal3-dal < server: volt-cdn < x-cache-status: MISS < strict-transport-security: max-age=31536000 < <html><head><title>Request Rejected</title></head> <body>The requested URL was rejected. Please consult with your administrator.<br/><br/> Your support ID is 85281693-eb72-4891-9099-928ffe00869c<br/><br/><a href='javascript:history.back();'>[Go Back]</a></body></html> * Connection #0 to host buytime.f5-cloud-demo.com left intact Notice that the above request intentionally by-passes the CDN cache and is sent to the HTTP LB for the WAF policy to inspect. With this request rejected, you can confirm the attack by navigating to the WAAP HTTP LB Security page under the WAAP Security section within Apps & APIs. After refreshing the page, you’ll see the security violation under the “Top Attacked” panel. Demo To see all of this in action, watch my video below. This uses all of the configuration details above to make a WAAP + CDN service chain in Distributed Cloud. Additional Guides Virtually deploy this solution in our product simulator, or hands-on with step-by-step comprehensive demo guide. The demo guide includes all the steps, including those that are needed prior to deployment, so that once deployed, the solution works end-to-end without any tweaks to local DNS. The demo guide steps can also be automated with Ansible, in case you'd either like to replicate it or simply want to jump to the end and work your way back. Conclusion This shows just how simple it can be to use the Distributed Cloud CDN to frontend your web app protected by a WAF, all natively within the F5 Distributed Cloud’s regional edge POPs. The advantage of this solution should now be clear – the Distributed Cloud CDN is cloud-agnostic, flexible, agile, and you can enforce security policies anywhere, regardless of whether your web app lives on-prem, in and across clouds, or even at the edge. For more information about Distributed Cloud WAAP and Distributed Cloud CDN, visit the following resources: Product website: https://www.f5.com/cloud/products/cdn Distributed Cloud CDN & WAAP Demo Guide: https://github.com/f5devcentral/xcwaapcdnguide Video: https://youtu.be/OUD8R6j5Q8o Simulator: https://simulator.f5.com/s/waap-cdn Demo Guide: https://github.com/f5devcentral/xcwaapcdnguide7.3KViews10likes0CommentsThe Concise Guide to Proxies
We often mention that the benefits derived from some application delivery controllers are due to the nature of being a full proxy. And in the same breath we might mention reverse, half, and forward proxies, which makes the technology sound more like a description of the positions on a sports team than an application delivery solution. So what does these terms really mean? Here's the lowdown on the different kinds of proxies in one concise guide. PROXIES Proxies (often called intermediaries in the SOA world) are hardware or software solutions that sit between the client and the server and do something to requests and sometimes responses. The most often heard use of the term proxy is in conjunction with anonymizing Web surfing. That's because proxies sit between your browser and your desired destination and proxy the connection; that is you talk to the proxy while the proxy talks to the web server and neither you nor the web server know about each other. Proxies are not all the same. Some are half proxies, some are full proxies; some are forward and some are reverse. Yes, that came excruciatingly close to sounding like a Dr. Seuss book. (Go ahead, you know you want to. You may even remember this from .. .well, when it was first circulated.) FORWARD PROXIES Forward proxies are probably the most well known of all proxies, primarily because most folks have dealt with them either directly or indirectly. Forward proxies are those proxies that sit between two networks, usually a private internal network and the public Internet. Forward proxies have also traditionally been employed by large service providers as a bridge between their isolated network of subscribers and the public Internet, such as CompuServe and AOL in days gone by. These are often referred to as "mega-proxies" because they managed such high volumes of traffic. Forward proxies are generally HTTP (Web) proxies that provide a number of services but primarily focus on web content filtering and caching services. These forward proxies often include authentication and authorization as a part of their product to provide more control over access to public content. If you've ever gotten a web page that says "Your request has been denied by blah blah blah. If you think this is an error please contact the help desk/your administrator" then you've probably used a forward proxy. REVERSE PROXIES A reverse proxy is less well known, generally because we don't use the term anymore to describe products used as such. Load balancers (application delivery controllers) and caches are good examples of reverse proxies. Reverse proxies sit in front of web and application servers and process requests for applications and content coming in from the public Internet to the internal, private network. This is the primary reason for the appellation "reverse" proxy - to differentiate it from a proxy that handles outbound requests. Reverse proxies are also generally focused on HTTP but in recent years have expanded to include a number of other protocols commonly used on the web such as streaming audio (RTSP), file transfers (FTP), and generally any application protocol capable of being delivered via UDP or TCP. HALF PROXIES Half-proxy is a description of the way in which a proxy, reverse or forward, handles connections. There are two uses of the term half-proxy: one describing a deployment configuration that affects the way connections are handled and one that describes simply the difference between a first and subsequent connections. The deployment focused definition of half-proxy is associated with a direct server return (DSR) configuration. Requests are proxied by the device, but the responses do not return through the device, but rather are sent directly to the client. For some types of data - particularly streaming protocols - this configuration results in improved performance. This configuration is known as a half-proxy because only half the connection (incoming) is proxied while the other half, the response, is not. The second use of the term "half-proxy" describes a solution in which the proxy performs what is known as delayed binding in order to provide additional functionality. This allows the proxy to examine the request before determining where to send it. Once the proxy determines where to route the request, the connection between the client and the server are "stitched" together. This is referred to as a half-proxy because the initial TCP handshaking and first requests are proxied by the solution, but subsequently forwarded without interception. Half proxies can look at incoming requests in order to determine where the connection should be sent and can even use techniques to perform layer 7 inspection, but they are rarely capable of examining the responses. Almost all half-proxies fall into the category of reverse proxies. FULL PROXIES Full proxy is also a description of the way in which a proxy, reverse or forward, handles connections. A full proxy maintains two separate connections - one between itself and the client and one between itself and the destination server. A full proxy completely understands the protocols, and is itself an endpoint and an originator for the protocols. Full proxies are named because they completely proxy connections - incoming and outgoing. Because the full proxy is an actual protocol endpoint, it must fully implement the protocols as both a client and a server (a packet-based design does not). This also means the full proxy can have its own TCP connection behavior, such as buffering, retransmits, and TCP options. With a full proxy, each connection is unique; each can have its own TCP connection behavior. This means that a client connecting to the full proxy device would likely have different connection behavior than the full proxy might use for communicating with servers. Full proxies can look at incoming requests and outbound responses and can manipulate both if the solution allows it. Many reverse and forward proxies use a full proxy model today. There is no guarantee that a given solution is a full proxy, so you should always ask your solution provider if it is important to you that the solution is a full proxy.4.2KViews2likes12CommentsWILS: Virtual Server versus Virtual IP Address
load balancing intermediaries have long used the terms “virtual server” and “virtual IP address”. With the widespread adoption of virtualization these terms have become even more confusing to the uninitiated. Here’s how load balancing and application delivery use the terminology. I often find it easiest to explain the difference between a “virtual server” and a “virtual IP address (VIP)” by walking through the flow of traffic as it is received from the client. When a client queries for “www.yourcompany.com” they get an IP address, of course. In many cases if the site is served by a load balancer or application delivery controller that IP address is a virtual IP address. That simply means the IP address is not tied to a specific host. It’s kind of floating out there, waiting for requests. It’s more like a taxi than a public bus in that a public bus has a predefined route from which it does not deviate. A taxi, however, can take you wherever you want within the confines of its territory. In the case of a virtual IP address that territory is the set of virtual servers and services offered by the organization. The client (the browser, probably) uses the virtual IP address to make a request to “www.yourcompany.com” for a particular resource such as a web application (HTTP) or to send an e-mail (SMTP). Using the VIP and a TCP port appropriate for the resource, the application delivery controller directs the request to a “virtual server”. The virtual server is also an abstraction. It doesn’t really “exist” anywhere but in the application delivery controller’s configuration. The virtual server determines – via myriad options – which pool of resources will best serve to meet the user’s request. That pool of resources contains “nodes”, which ultimately map to one (or more) physical or virtual web/application servers (or mail servers, or X servers). A virtual IP address can represent multiple virtual servers and the correct mapping between them is generally accomplished by further delineating virtual servers by TCP destination port. So a single virtual IP address can point to a virtual “HTTP” server, a virtual “SMTP” server, a virtual “SSH” server, etc… Each virtual “X” server is a separate instantiation, all essentially listening on the same virtual IP address. It is also true, however, that a single virtual server can be represented by multiple virtual IP addresses. So “www1” and “www2” may represent different virtual IP addresses, but they might both use the same virtual server. This allows an application delivery controller to make routing decisions based on the host name, so “images.yourcompany.com” and “content.yourcompany.com” might resolve to the same virtual IP address and the same virtual server, but the “pool” of resources to which requests for images is directed will be different than the “pool” of resources to which content is directed. This allows for greater flexibility in architecture and scalability of resources at the content-type and application level rather than at the server level. WILS: Write It Like Seth. Seth Godin always gets his point across with brevity and wit. WILS is an ATTEMPT TO BE concise about application delivery TOPICS AND just get straight to the point. NO DILLY DALLYING AROUND. Server Virtualization versus Server Virtualization Architects Need to Better Leverage Virtualization Using "X-Forwarded-For" in Apache or PHP SNAT Translation Overflow WILS: Client IP or Not Client IP, SNAT is the Question WILS: Why Does Load Balancing Improve Application Performance? WILS: The Concise Guide to *-Load Balancing WILS: Network Load Balancing versus Application Load Balancing All WILS Topics on DevCentral If Load Balancers Are Dead Why Do We Keep Talking About Them?2.8KViews1like1CommentWhat is an Application Delivery Controller - Part II
Application Delivery Basics One of the unfortunate effects of the continued evolution of the load balancer into today's application delivery controller (ADC) is that it is often too easy to forget the basic problem for which load balancers were originally created—producing highly available, scalable, and predictable application services. We get too lost in the realm of intelligent application routing, virtualized application services, and shared infrastructure deployments to remember that none of these things are possible without a firm basis in basic load balancing technology. So how important is load balancing, and how do its effects lead to streamlined application delivery? Let’s examine the basic application delivery transaction. The ADC will typically sit in-line between the client and the hosts that provide the services the client wants to use. As with most things in application delivery, this is not a rule, but more of a best practice in a typical deployment. Let's also assume that the ADC is already configured with a virtual server that points to a cluster consisting of two service points. In this deployment scenario, it is common for the hosts to have a return route that points back to the load balancer so that return traffic will be processed through it on its way back to the client. The basic application delivery transaction is as follows: The client attempts to connect with the service on the ADC. The ADC accepts the connection, and after deciding which host should receive the connection, changes the destination IP (and possibly port) to match the service of the selected host (note that the source IP of the client is not touched). The host accepts the connection and responds back to the original source, the client, via its default route, the load balancer. The ADC intercepts the return packet from the host and now changes the source IP (and possible port) to match the virtual server IP and port, and forwards the packet back to the client. The client receives the return packet, believing that it came from the virtual server or host, and continues the process. Figure 1. A basic load balancing transaction. This very simple example is relatively straightforward, but there are a couple of key elements to take note of. First, as far as the client knows, it sends packets to the virtual server and the virtual server responds—simple. Second, the NAT takes place. This is where the ADC replaces the destination IP sent by the client (of the virtual server) with the destination IP of the host to which it has chosen to load balance the request. Step three is the second half of this process (the part that makes the NAT "bi-directional"). The source IP of the return packet from the host will be the IP of the host; if this address were not changed and the packet was simply forwarded to the client, the client would be receiving a packet from someone it didn't request one from, and would simply drop it. Instead, the ADC, remembering the connection, rewrites the packet so that the source IP is that of the virtual server, thus solving this problem. The Application Delivery Decision So, how does the ADC decide which host to send the connection to? And what happens if the selected host isn't working? Let's discuss the second question first. What happens if the selected host isn't working? The simple answer is that it doesn't respond to the client request and the connection attempt eventually times out and fails. This is obviously not a preferred circumstance, as it doesn't ensure high availability. That's why most ADC technology includes some level of health monitoring that determines whether a host is actually available before attempting to send connections to it. There are multiple levels of health monitoring, each with increasing granularity and focus. A basic monitor would simply PING the host itself. If the host does not respond to PING, it is a good assumption that any services defined on the host are probably down and should be removed from the cluster of available services. Unfortunately, even if the host responds to PING, it doesn't necessarily mean the service itself is working. Therefore most devices can do "service PINGs" of some kind, ranging from simple TCP connections all the way to interacting with the application via a scripted or intelligent interaction. These higher-level health monitors not only provide greater confidence in the availability of the actual services (as opposed to the host), but they also allow the load balancer to differentiate between multiple services on a single host. The ADC understands that while one service might be unavailable, other services on the same host might be working just fine and should still be considered as valid destinations for user traffic. This brings us back to the first question: How does the ADC decide which host to send a connection request to? Each virtual server has a specific dedicated cluster of services (listing the hosts that offer that service) which makes up the list of possibilities. Additionally, the health monitoring modifies that list to make a list of "currently available" hosts that provide the indicated service. It is this modified list from which the ADC chooses the host that will receive a new connection. Deciding the exact host depends on the ADC algorithm associated with that particular cluster. The most common is simple round-robin where the ADC simply goes down the list starting at the top and allocates each new connection to the next host; when it reaches the bottom of the list, it simply starts again at the top. While this is simple and very predictable, it assumes that all connections will have a similar load and duration on the back-end host, which is not always true. More advanced algorithms use things like current-connection counts, host utilization, and even real-world response times for existing traffic to the host in order to pick the most appropriate host from the available cluster services. Sufficiently advanced application delivery systems will also be able to synthesize health monitoring information with load balancing algorithms to include an understanding of service dependency. This is the case when a single host has multiple services, all of which are necessary to complete the user's request. A common example would be in e-commerce situations where a single host will provide both standard HTTP services (port 80) as well as HTTPS (SSL/TLS at port 443) and any other potential service ports that need to be allowed. In many of these circumstances, you don't want a user going to a host that has one service operational, but not the other. In other words, if the HTTPS services should fail on a host, you also want that host's HTTP service to be taken out of the cluster list of available services. This functionality is increasingly important as HTTP-like services become more differentiated with this things like XML and scripting. To Load Balance or Not to Load Balance? Load balancing in regards to picking an available service when a client initiates a transaction request is only half of the solution. Once the connection is established, the ADC must keep track of whether the following traffic from that user should be load balanced. There are generally two specific issues with handling follow-on traffic once it has been load balanced: connection maintenance and persistence. Connection maintenance If the user is trying to utilize a long-lived TCP connection (telnet, FTP, and more) that doesn't immediately close, the ADC must ensure that multiple data packets carried across that connection do not get load balanced to other available service hosts. This is connection maintenance and requires two key capabilities: 1) the ability to keep track of open connections and the host service they belong to; and 2) the ability to continue to monitor that connection so the connection table can be updated when the connection closes. This is rather standard fare for most ADCs. Persistence Increasingly more common, however, is when the client uses multiple short-lived TCP connections (for example, HTTP) to accomplish a single task. In some cases, like standard web browsing, it doesn't matter and each new request can go to any of the back-end service hosts; however, there are many more instances (XML, JavaScript, e-commerce "shopping cart," HTTPS, and so on) where it is extremely important that multiple connections from the same user go to the same back-end service host and not be load balanced. This concept is called persistence, or server affinity. There are multiple ways to address this depending on the protocol and the desired results. For example, in modern HTTP transactions, the server can specify a "keep-alive" connection, which turns those multiple short-lived connections into a single long-lived connection that can be handled just like the other long-lived connections. However, this provides little relief. Even worse, as the use of web and mobile services increases, keeping all of these connections open longer than necessary would strain the resources of the entire system. In these cases, most ADCs provide other mechanisms for creating artificial server affinity. One of the most basic forms of persistence is source-address affinity. Source address affinity persistence directs session requests to the same server based solely on the source IP address of a packet. This involves simply recording the source IP address of incoming requests and the service host they were load balanced to, and making all future transaction go to the same host. This is also an easy way to deal with application dependency as it can be applied across all virtual servers and all services. In practice however, the wide-spread use of proxy servers on the Internet and internally in enterprise networks renders this form of persistence almost useless; in theory it works, but proxy-servers inherently hide many users behind a single IP address resulting in none of those users being load balanced after the first user's request—essentially nullifying the ADC capability. Today, the intelligence of ADCs allows organizations to actually open up the data packets and create persistence tables for virtually anything within it. This enables them to use much more unique and identifiable information, such as user name, to maintain persistence. However, organizations one must take care to ensure that this identifiable client information will be present in every request made, as any packets without it will not be persisted and will be load balanced again, most likely breaking the application. Final Thoughts It is important to understand that basic load balancing technology, while still in use, is now only considered a feature of Application Delivery Controllers. ADCs evolved from the first load balancers through the service virtualization process and today with software only virtual editions. They can not only improve availability, but also affect the security and performance of the application services being requested. Today, most organizations realize that simply being able to reach an application doesn't make it usable; and unusable applications mean wasted time and money for the enterprise deploying them. ADCs enable organizations to consolidate network-based services like SSL/TLS offload, caching, compression, rate-shaping, intrusion detection, application firewalls, and even remote access into a single strategic point that can be shared and reused across all application services and all hosts to create a virtualized Application Delivery Network. Basic load balancing is the foundation without which none of the enhanced functionality of today's ADCs would be possible. And if you missed What is an ADC Part 1, you can find it here. ps Next Steps Now that you’ve gotten this far, would you like to dig deeper or learn more about how application delivery works? Cool, then check out these resources: Go Beyond POLB (Plain Old Load Balancing) The Cloud-Ready ADC BIG-IP Virtual Edition Products, The Virtual ADCs Your Application Delivery Network Has Been Missing Cloud Balancing: The Evolution of Global Server Load Balancing994Views0likes0CommentsBIG-IQ Grows UP [End of Life]
The F5 and Cisco APIC integration based on the device package and iWorkflow is End Of Life. The latest integration is based on the Cisco AppCenter named ‘F5 ACI ServiceCenter’. Visit https://f5.com/cisco for updated information on the integration. Today F5 is announcing a new F5® BIG-IQ™ 4.5. This release includes a new BIG-IQ component – BIG-IQ ADC. Why is 4.5 a big deal? This release introduces a critical new BIG-IQ component, BIG-IQ ADC. With ADC management, BIG-IQ can finally control basic local traffic management (LTM) policies for all your BIG-IP devices from a single pane of glass. Better still, BIG-IQ’s ADC function has been designed with the concept of “roles” deeply ingrained. In practice, this means that BIG-IQ offers application teams a “self-serve portal” through which they can manage load balancing of just the objects they are “authorized” to access and update. Their changes can be staged so that they don’t go live until the network team has approved the changes. We will post follow up blogs that dive into the new functions in more detail. In truth, there are a few caveats around this release. Namely, BIG-IQ requires our customer’s to be using BIG-IP 11.4.1 or above. Many functions require 11.5 or above. Customers with older TMOS version still require F5’s legacy central management solution, Enterprise Manager. BIG-IQ still can’t do some of the functions Enterprise Manager provides, such as iHealth integration and advanced analytics. And BIG-IQ can’t yet manage some advanced LTM options. Never-the-less, this release will an essential component of many F5 deployments. And since BIG-IQ is a rapidly growing platform, the feature gaps will be filled before you know it. Better still, we have big plans for adding additional components to the BIG-IQ framework over the coming year. In short, it’s time to take a long hard look at BIG-IQ. What else is new? There are hundreds of new or modified features in this release. Let me list a few of the highlights by component: 1. BIG-IQ ADC - Role-based central Management of ADC functions across the network · Centralized basic management of LTM configurations · Monitoring of LTM objects · Provide high availability and clustering support of BIG-IP devices and application centric manageability services · Pool member management (enable/disable) · Centralized iRules Management (though not editing) · Role-based management · Staging and manual of deployments 2. BIG-IQ Cloud - Enhanced Connectivity and Partner Integration · Expand orchestration and management of cloud platforms via 3rd party developers · Connector for VMware NSX and (early access) connector for Cisco ACI · Improve customer experience via work flows and integrations · Improve tenant isolation on device and deployment 3. BIG-IQ Device - Manage physical and virtual BIG-IP devices from a single pane of glass · Support for VE volume licensing · Management of basic device configuration & templates · UCS backup scheduling · Enhanced upgrade advisor checks 4. BIG-IQ Security - Centralizes security policy deployment, administration, and management · Centralized feature support for BIG-IP AFM · Centralized policy support for BIG-IP ASM · Consolidated DDoS and logging profiles for AFM/ASM · Enhanced visibility and notifications · API documentation for ASM · UI enhancements for AFM policy management My next blog will include a video demonstrating the new BIG-IQ ADC component and showing how it enhances collaboration between the networking and application teams with fine grained RBAC.800Views0likes3CommentsBig IQ route domain ID when setting up a Virtual Server
Good day, So we are firing up Big IQ and I have hit on an issues in provision in Big IQ ADC. When creating a server in Big IP LTM we typically use something like xxx.xxx.xxx.xxx%980/32 to define the IP address and associate it the the correct RD. When attempting this in Big IQ I keep getting "Invalid CIDR Notation" I don't see anything in the Properties that would allow this entry either. Could someone point me to either the correct notation or where I need to define it? Thanks, Jeremy210Views0likes0Comments