global load balancing
F5 Distributed Cloud (XC) Global Applications Load Balancing in Cisco ACI
Introduction

F5 Distributed Cloud (XC) simplifies cloud-based DNS management with global server load balancing (GSLB) and disaster recovery (DR). F5 XC efficiently directs application traffic across environments globally, performs health checks, and automates responses to activities and events to maintain high application performance with high availability and robustness.

In this article, we will discuss how we can ensure high application performance with high availability and robustness by using XC to load balance global applications across public clouds and Cisco Application Centric Infrastructure (ACI) sites that are geographically apart. We will look at two different XC in ACI use cases. Each uses a different approach for global application delivery and leverages a different XC feature to load balance the applications globally and for disaster recovery.

XC DNS Load Balancer

Our first XC in ACI use case is very commonly seen: a traditional network-centric approach for global application delivery and disaster recovery. We use our existing network infrastructure to provide global application connectivity, and we deploy GSLB to load balance the applications across sites globally and for disaster recovery. In our example, we will show you how to use XC DNS Load Balancer to load balance a global application across ACI sites that are geographically dispersed. One of the many advantages of using XC DNS Load Balancer is that we no longer need to manage GSLB appliances. We can also expect high DNS performance thanks to the XC global infrastructure. In addition, we have a single pane of glass, the XC console, to manage all of our services such as multi-cloud networking, application delivery, DNS services, WAAP, etc.

Example Topology

In our example, we use Distributed Cloud (XC) DNS Load Balancer to load balance our global application hello.bd.f5.com, which is deployed in a hybrid multi-cloud environment across two ACI sites located in San Jose and New York. Here are some highlights for each ACI site:

New York location
- XC CE is deployed in ACI using layer three attached with BGP
- XC advertises custom VIP 10.10.215.215 to ACI via BGP
- XC custom VIP 10.10.215.215 has an origin server 10.131.111.88 on AWS
- BIG-IP is integrated into ACI
- BIG-IP has a public VIP 12.202.13.149 that has two pool members: the on-premise origin server 10.131.111.161 and XC custom VIP 10.10.215.215

San Jose location
- XC CE is deployed in ACI using layer three attached with BGP
- XC advertises custom VIP 10.10.135.135 to ACI via BGP
- XC custom VIP 10.10.135.135 has an origin server 10.131.111.88 on Azure
- BIG-IP is integrated into Cisco ACI
- BIG-IP has a public VIP 12.202.13.147 that has two pool members: the on-premise origin server 10.131.111.55 and XC custom VIP 10.10.135.135

*Note: Click here to review how to deploy XC CE in ACI using layer three attached with BGP.

DNS Load Balancing Rules

A DNS Load Balancer is an ingress controller for the DNS queries made to your DNS servers. The DNS Load Balancer receives the requests and answers with an IP address from a pool of members based on the configured load balancing rules. On the XC console, go to "DNS Management" -> "DNS Load Balancer Management" to create a DNS Load Balancer and then define the load balancing rules.
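Conceptually, the rule evaluation a DNS Load Balancer performs boils down to matching the client's geography against each rule in order and answering with the VIP of the first site that is currently healthy. The sketch below illustrates that logic in plain Python, using the countries and VIPs from the rules defined in the next section; it is purely illustrative and is not how XC implements rule processing.

```python
# Illustrative sketch of geo-based DNS load balancing rules -- not XC's actual
# implementation. Rules are evaluated in order; the first match whose site is
# healthy wins, mirroring the three rules described for hello.bd.f5.com.

RULES = [
    # (client countries, answer VIP, site name)
    ({"US", "GB"}, "12.203.13.149", "New York ACI"),   # Rule #1
    ({"US", "GB"}, "12.203.13.147", "San Jose ACI"),   # Rule #2 (fallback if New York is down)
    (None,         "12.203.13.147", "San Jose ACI"),   # Rule #3 (everyone else)
]

def resolve(client_country: str, site_is_healthy) -> str:
    """Return the A-record answer for a query, given the client's country and
    a callable that reports whether a site is currently available."""
    for countries, vip, site in RULES:
        if countries is not None and client_country not in countries:
            continue
        if site_is_healthy(site):
            return vip
    raise RuntimeError("no healthy site available")

# Example: a US client while the New York site is down falls through to Rule #2.
health = {"New York ACI": False, "San Jose ACI": True}
print(resolve("US", lambda site: health[site]))   # -> 12.203.13.147
```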
Here in our example, we created a DNS Load Balancer and defined the load balancing rules for our global application hello.bd.f5.com (note: as a prerequisite, F5 XC must be providing primary DNS for the domain):

Rule #1: If the DNS request to hello.bd.f5.com comes from the United States or United Kingdom, respond with BIG-IP VIP 12.203.13.149 in the DNS response so that the application traffic will be directed to the New York ACI site and forwarded to an origin server that is located in AWS or on-premise.

Rule #2: If the DNS request to hello.bd.f5.com comes from the United States or United Kingdom and the New York ACI site becomes unavailable, respond with BIG-IP VIP 12.203.13.147 in the DNS response so that the application traffic will be directed to the San Jose ACI site and forwarded to an origin server that is located on-premise or in Azure.

Rule #3: If the DNS request to hello.bd.f5.com comes from somewhere outside of the United States or United Kingdom, respond with BIG-IP VIP 12.203.13.147 in the DNS response so that the application traffic will be directed to the San Jose ACI site and forwarded to an origin server that is located on-premise or in Azure.

Validation

Now, let's see what happens. When a machine located in the United States tries to reach hello.bd.f5.com and both ACI sites are up, the traffic is directed to the New York ACI site and forwarded to an origin server that is located on-premise or in AWS, as expected. When a machine located in the United States tries to reach hello.bd.f5.com and the New York ACI site is down or becomes unavailable, the traffic is re-directed to the San Jose ACI site and forwarded to an origin server that is located on-premise or in Azure, as expected. When a machine tries to access hello.bd.f5.com from outside of the United States or United Kingdom, it is directed to the San Jose ACI site and forwarded to an origin server that is located on-premise or in Azure, as expected. (A small scripted version of this kind of check is sketched below.)

On the XC console, go to "DNS Management" and select the appropriate DNS Zone to view the Dashboard for information such as the DNS traffic distribution across the globe, the query types, etc., and Requests for DNS request details.

XC HTTP Load Balancer

Our second XC in ACI use case uses a different approach for global application delivery and disaster recovery. Instead of using the existing network infrastructure for global application connectivity and utilizing XC DNS Load Balancer for global application load balancing, we simplify the network layer management by securely deploying XC to connect our applications globally and leveraging XC HTTP Load Balancer to load balance our global applications and for disaster recovery.

Example Topology

Here in our example, we use XC HTTP Load Balancer to load balance our global application global.f5-demo.com, which is deployed across a hybrid multi-cloud environment. Here are some highlights:

- XC CE is deployed in each ACI site using layer three attached with BGP
- New York location: ACI advertises on-premise origin server 10.131.111.161 to XC CE via BGP
- San Jose location: ACI advertises on-premise origin server 10.131.111.55 to XC CE via BGP
- An origin server 10.131.111.88 is located in AWS
- An origin server 10.131.111.88 is located in Azure

*Note: Click here to review how to deploy XC CE in ACI using layer three attached with BGP.

XC HTTP Load Balancer

On the XC console, go to "Multi-Cloud App Connect" -> "Manage" -> "Load Balancers" -> "HTTP Load Balancers" to "Add HTTP Load Balancer". In our example, we created an HTTPS load balancer named global with the domain name global.f5-demo.com.
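Before going further with the HTTP Load Balancer configuration, here is the scripted version of the DNS Load Balancer validation referenced above: a small dnspython check of which BIG-IP VIP is currently being handed out for hello.bd.f5.com. The resolver address is a placeholder, and the expected VIPs are simply the ones from the rules above; this is a sketch, not an official validation tool.

```python
# Sketch: confirm which site the DNS Load Balancer hands out for hello.bd.f5.com.
# Requires dnspython (pip install dnspython). The resolver IP below is a
# placeholder -- point it at whichever resolver represents your test location.
import dns.resolver

EXPECTED = {
    "12.203.13.149": "New York ACI site",
    "12.203.13.147": "San Jose ACI site",
}

resolver = dns.resolver.Resolver(configure=False)
resolver.nameservers = ["8.8.8.8"]          # placeholder resolver

answers = resolver.resolve("hello.bd.f5.com", "A")
for rr in answers:
    site = EXPECTED.get(rr.address, "unexpected address")
    print(f"{rr.address} -> {site}")
```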
Instead of bringing our own certificate, we took advantage of the automatic TLS certificate generation and renewal supported by XC. Go to the "Origins" section to specify the origin servers for the global application. In our example, we included all origin servers across the public clouds and ACI sites for our global application global.f5-demo.com. Next, go to "Other Settings" -> "VIP Advertisement". Here, select either "Internet" or "Internet (Specified VIP)" to advertise the HTTP Load Balancer to the Internet. In our example, we selected "Internet" to advertise global.f5-demo.com globally because we decided not to manage or acquire a public IP.

In our first use case, we defined a set of DNS load balancing rules on the XC DNS Load Balancer to direct the application traffic based on our requirements:

- If the request to global.f5-demo.com comes from the United States or United Kingdom, application traffic should be directed to an origin server that is located on-premise in the New York ACI site or in AWS.
- If the request to global.f5-demo.com comes from the United States or United Kingdom and the origin servers in the New York ACI site and AWS become unavailable, application traffic should be re-directed to an origin server that is located on-premise in the San Jose ACI site or in Azure.
- If the request to global.f5-demo.com comes from somewhere outside of the United States or United Kingdom, application traffic should be directed to an origin server that is located on-premise in the San Jose ACI site or in Azure.

We can accomplish the same with XC HTTP Load Balancer by configuring Origin Server Subset Rules. XC HTTP Load Balancer Origin Server Subset Rules allow users to create match conditions on incoming source traffic to the XC HTTP Load Balancer and direct the matched traffic to the desired origin server(s). The match condition can be based on country, ASN, regional edge (RE), IP address, or client label selector. As a prerequisite, we create and assign a label (key-value pair) to each origin server so that we can specify where to direct the matched traffic by referencing the label in the Origin Server Subset Rules.

Go to "Shared Configuration" -> "Manage" -> "Labels" -> "Known Keys" and "Add Known Key" to create labels. In our example, we created a key named jy-key with two labels: us-uk and other. Now, go to "Origin Pools" under "Multi-Cloud App Connect" and apply the labels to the origin servers. In our example, origin servers in the New York ACI site and AWS are labeled us-uk, while origin servers in the San Jose ACI site and Azure are labeled other.

Then, go to "Other Settings" to enable subset load balancing. In our example, jy-key is our origin server subsets class, and we configured the fallback policy to use the default subset labeled other, based on our requirement that if the origin servers in the New York ACI site and AWS become unavailable, traffic should be directed to an origin server in the San Jose ACI site or Azure.

Next, on the HTTP Load Balancer, configure the Origin Server Subset Rules by enabling "Show Advanced Fields" in the "Origins" section. In our example, we created the following Origin Server Subset Rules based on our requirements:

us-uk-rule: If the request to global.f5-demo.com comes from the United States or United Kingdom, direct the application traffic to an origin server labeled us-uk that is either in the New York ACI site or AWS.
other-rule: If the request to global.f5-demo.com does not come from the United States or United Kingdom, direct the application traffic to an origin server labeled other that is either in the San Jose ACI site or Azure.

Validation

As a reminder, we use the XC automatic TLS certificate generation and renewal feature for our HTTPS load balancer in this example. First, let's confirm the certificate status: we can see the certificate is valid with an auto-renew date. Now, let's run some tests and see what happens.

First, let's try to access global.f5-demo.com from the United Kingdom: we can see the traffic is directed to an origin server located in the New York ACI site or AWS, as expected. Next, let's see what happens if the origin servers at both of these sites become unavailable: the traffic is re-directed to an origin server located in the San Jose ACI site or Azure, as expected. Last, let's try to access global.f5-demo.com from somewhere outside of the United States or United Kingdom: the traffic is directed to an origin server located in the San Jose ACI site or Azure, as expected.

To check the requests on the XC console, go to "Multi-Cloud App Connect" -> "Performance" -> "Requests" for the selected HTTP Load Balancer. Below is a screenshot from our example: we can see that a request to global.f5-demo.com that came from Australia was directed to the origin server 10.131.111.55 located in the San Jose ACI site, based on the configured Origin Server Subset Rule other-rule. In another example, a request that came from the United States was sent to the origin server 10.131.111.88 located in AWS, based on the configured Origin Server Subset Rule us-uk-rule.

Summary

F5 XC simplifies cloud-based DNS management with global server load balancing (GSLB) and disaster recovery (DR). By deploying F5 XC in Cisco ACI, we can securely deploy and load balance our global applications across ACI sites (and public clouds) efficiently while maintaining high application performance, availability, and robustness at all times.

Related Resources

- On-Demand Webinar: Deploying F5 Distributed Cloud Services in Cisco ACI
- Deploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Three Attached Deployment
- Deploying F5 Distributed Cloud (XC) Services in Cisco ACI - Layer Two Attached Deployment
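As a closing illustration for this use case, the Origin Server Subset Rules above are essentially a label-selection step in front of ordinary load balancing: match the client's country, pick the origin subset carrying the corresponding label, and fall back to the default subset when that pool is unavailable. The sketch below is conceptual only, reusing the labels and IPs from the example; it is not how XC evaluates subset rules internally.

```python
# Conceptual sketch of label-based origin subset selection -- illustrative,
# not XC's implementation. Origins carry a jy-key label; rules map the client's
# country to a label, and the fallback subset is "other", as configured above.
ORIGINS = [
    {"ip": "10.131.111.161", "site": "New York ACI", "jy-key": "us-uk"},
    {"ip": "10.131.111.88",  "site": "AWS",          "jy-key": "us-uk"},
    {"ip": "10.131.111.55",  "site": "San Jose ACI", "jy-key": "other"},
    {"ip": "10.131.111.88",  "site": "Azure",        "jy-key": "other"},
]

def pick_subset(client_country: str, healthy_sites: set) -> list:
    # us-uk-rule / other-rule: choose the label based on the client's country.
    label = "us-uk" if client_country in {"US", "GB"} else "other"
    subset = [o for o in ORIGINS if o["jy-key"] == label and o["site"] in healthy_sites]
    if not subset:
        # Fallback policy: default subset labeled "other".
        subset = [o for o in ORIGINS if o["jy-key"] == "other" and o["site"] in healthy_sites]
    return subset

# Example: New York ACI and AWS origins are down, so a US client falls back to "other".
healthy = {"San Jose ACI", "Azure"}
print([o["site"] for o in pick_subset("US", healthy)])   # -> ['San Jose ACI', 'Azure']
```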
Keep the Old, In With the New

For decades, the infrastructure needed to keep your public-facing websites online has had a relatively simple design. After all, it was all contained in one or a few datacenters under your control, or under the control of a contractor who did your bidding where network architecture was concerned. It looked, in its simplest form, something like this: Requests came in, DNS resolved them, the correct server or pool of servers handled them. There was security, and sometimes a few other steps in between, but that still didn’t change that it was a pretty simplistic architecture. Even after virtualization took hold in the datacenter, the servers were still the same, just where they sat became more fluid.

And then cloud computing became the rage. Business users started demanding that IT look into it, started asking hard questions about whether it would be cheaper or more cost-effective to run some applications from the cloud, where the up-front costs are less and the overall costs might be higher than internal deployment, might not, but were spread out over time. All valid questions that will have different answers depending upon the variables in question. And that means you have to worry about it. The new world looks a little different. It’s more complex, and thus is more in need of a makeover. DNS can no longer authoritatively say “oh, that server is over there” because over there might be outside the DNS’s zone. In really complex environments, it is possible that the answer DNS gives out may be vastly different depending upon a variety of factors. If the same application is running in two places, DNS has to direct users to the one most suited to them at this time. Not so simple, really, but comes with the current state of the world. In the new world, a better depiction would be this:

In this scenario, a more global solution than just DNS is required. The idea that a user in one country could be sent to an app in a public cloud and users in another could be sent to the corporate datacenter kind of implies that localized DNS services are short of a full solution. That’s where GDNS – Global DNS – comes into play. It directs users between local DNS copies to get the correct response required. If it is combined with something like F5 GTM’s Wide IPs, then it can direct people to shifting locations by offering an IP address that is understood at each location. Indeed, it isn’t even understood at each location, it’s a proxy for a local address. Use a Wide IP to give users (and more importantly your apps) a single place to go when GDNS responds, and then choose from a group of local DNS resolutions to direct the user where they need to go. It’s another layer on DNS, but it is an idea whose time has come. You can only go so far with zone-based LDNS, then you’re outside the zone, or directing users inefficiently.

Global DNS is here, and the need is getting more imperative every day. Check it out. Call your ADC vendor to find out what they have on offer, or check out the InfoBlox and F5 information on F5.com if your ADC vendor doesn’t have a Global DNS solution. Cloud does have its uses, you just have to plan for the issues it creates. This is an easy one to overcome, and you might just get increased agility out of GDNS and Global Server Load Balancing (GSLB).

Related Articles and Blogs:
- Carrier Grade DNS: Not your Parents DNS
- Audio White Paper - High-Performance DNS Services in BIG-IP ...
- DNS is Like Your Mom
- The End of DNS As We Know It
- Enhanced DNS Services: For Administrators, Managers and Marketers
- GTM Classless Classes
- v.10: GTM, Meet CLI
- DevCentral Interviews | Audio - GTM
- DevCentral Weekly Roundup | Audio Podcast - GTM
- Real IT Video Blog - Jason Reflects On GTM Topology

Let me tell you Where To Go.
One thing in life, whether you are using a Garmin to go to a friend’s party or planning your career, you need to know where you’re going. Failure to have a destination in mind makes it very difficult to get directions. Even when you know where you’re going, you will have a terrible time getting there if your directions are bad. Take, for example, using a GPS to navigate between when they do major road construction and when you next update your GPS device’s maps. On a road by my house, I can actually drive down the road and be told that I’m on the highway 100 feet (30 meters) distant. Because I haven’t updated my device since they built this new road, it maps to the nearest one it can find going in the same direction. It is misinformed. And, much like the accuracy of a GPS, we take DNS for granted until it goes horribly wrong. Unfortunately, with both you can be completely lost in the wild before you figure out that something is wrong. The number of ways that DNS can go wrong is limited – it is a pretty simple system – but when it does, there is no way to get where you need to go. Just like when construction dead-ends a road. Like a road not too far from my house. Notice in the attached screenshot taken from Google Maps, how the satellite data doesn’t match the road data. The roads pictured by the satellite actually intersect. The ones pictured in roadway data do not. That is because they did intersect until about eight months ago. Now the roadway data is accurate, and one road has a roundabout, while the other passes over it. As you can plainly see, a GPS is going to tell you “go up here and turn right on road X”, when in reality it is not possible to do that any more. You don’t want your DNS doing the same thing. Really don’t. There are a couple of issues that could make your DNS either fail to respond or misdirect people. I’ll probably talk about them off-n-on over the next few months, because that’s where my head is at the moment, but we’ll discuss the two obvious ones today, just to keep this blog to blog length. First is failure to respond – either because it is overloaded, or down, or whatever. This one is easy to resolve with redundancy and load balancing. Add in Global Load Balancing, and you can distribute traffic between datacenters, internal clouds, external clouds, whatever, assuming you have the right gear for the job. But if you’re a single datacenter shop, simple redundancy is pretty straight-forward, and the only problem that might compel you to greater measures is a DDoS attack. While a risk, as a single datacenter shop, you’re not likely to attract the attention of crowds that want to participate in DDoS unless you’re in a very controversial market space. So make sure you have redundancy in DNS servers, and test them. Amazing the amount of backup/disaster recovery infrastructure that doesn’t have a regular, formalized testing plan. It does you no good to have it in place if it doesn’t work when you need it. The other is misdirection. The whole point of DNS cache poisoning is to allow someone to masquerade as you. wget can copy the look-n-feel of your website, cache poisoning (or some other as-yet-unutilized DNS vector) can redirect your users to the attacker. They typed in your name, they got a page that looks like your page, but any information they enter goes to someone else. Including passwords and credit card numbers. Scary stuff. So DNS SEC is pretty much required. 
It protects DNS against known attacks, and against a ton of unexplored vectors, by utilizing authorization and encryption. Yeah, that’s a horrible overstatement, but it works for a blog aimed at IT staff as opposed to DNS uber-specialists. So implement DNS SEC, but understand that it takes CPU cycles on DNS servers – security is never free – so if your DNS system is anywhere near capacity, it’s time to upgrade that 80286 to something with a little more zing. It is a tribute to DNS that many BIND servers are running on ancient hardware, because they can, but it doesn’t hurt any to refresh the hardware and get some more cycles out of DNS.

In the real world, you would not use a GPS system that might send you to the wrong place (I shut mine down when in downtown Cincinnati because it is inaccurate, for example), and you wouldn’t use one that a crook could intercept the signal from and send you to a location of his choosing for a mugging rather than to your chosen destination… So don’t use a DNS for which both of these things are possible. Reports indicate that there are still many, many out-of-date DNS systems running out there. Upgrade, implement DNS SEC, and implement redundancy (if you haven’t already, most DNS servers seem to be set up in pairs pretty well) or DNS load balancing. Let your customers know that you’re making it more reliable and secure – for them. And worry about one less thing while you’re grilling out over the weekend. After all, while all of our systems rely on DNS, you have to admit it gets very little of our attention… Unless it breaks. Make yours more resilient, so you can continue to give it very little attention.
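If you do roll out DNS SEC, it is worth scripting a periodic check that a validating resolver still answers for your zone with the Authenticated Data (AD) bit set. Here is a quick sketch with dnspython; the zone and resolver address are placeholders, and this only confirms that the resolver validated the answer, not that your signing chain is otherwise healthy.

```python
# Sketch: verify that a validating resolver returns DNSSEC-authenticated answers
# for a zone (AD bit set). Requires dnspython; zone and resolver are placeholders.
import dns.resolver
import dns.flags

resolver = dns.resolver.Resolver(configure=False)
resolver.nameservers = ["8.8.8.8"]                 # placeholder validating resolver
resolver.use_edns(0, dns.flags.DO, 1232)           # ask for DNSSEC records

answer = resolver.resolve("example.com", "A")      # placeholder zone
authenticated = bool(answer.response.flags & dns.flags.AD)
print("DNSSEC validated" if authenticated else "WARNING: answer not authenticated")
```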
F5 Friday: Infoblox and F5 Do DNS and Global Load Balancing Right.

#F5Friday #F5 Infoblox and F5 improve resilience, compliance, and security for global load balancing.

If you’re a large corporation, two things that are a significant challenge for your Network Administrators are DNS management and Global Load Balancing (GLB) configuration/management. With systems spread across a region, country, or the globe, the amount of time investment required to keep things running smoothly ranges from “near zero” during quiet times to “why am I still here at midnight?” in times of major network change or outages. Until now. Two market leaders – Infoblox and F5 Networks – have teamed up to make DNS, including DNSSEC, and GLB less time-consuming and error prone. Infoblox has extended their Trinzic DDI family of products with Infoblox Load Balancer Manager (LBM) for F5 Global Traffic Manager (GTM). The LBM turns a loose collection of load balancers into a dynamic, automated Infoblox Grid™. What does all that rambling and all those acronyms boil down to? Here’s the bullet list, followed with more detail:

- Centralized management of DNS and global load balancing services.
- Application of the Infoblox Security Framework across F5 GTM devices.
- Automation of best practices.
- Delegation of responsibility for small subsets of the network to responsible individuals.
- Rapid identification of network problems.
- Tracking of changes to load balancing configurations for auditing and compliance.

While F5 GTM brings DNS delivery services, global load balancing, workload management, disaster recovery, and application management to the enterprise, Infoblox LBM places a management layer over both global DNS and global load balancing, making them more manageable, less error prone, and more closely aligned to your organizational structure. LBM is a module available on Infoblox DDI Grid devices and VMs, and GTM is delivered either as a product module on BIG-IP or as a VM. With unified management, Infoblox LBM shows at a glance what is going on in the network.

Since Infoblox DDI and F5 BIG-IP GTM both interface to multiple Authentication, Authorization, and Access Control (AAA) systems, Infoblox LBM allows unified security management with groups and users, and further allows control of a given set of objects (say, all hardware in the San Francisco datacenter) to be handed to local administrators without having to expose the entire infrastructure to those users. For best practices, LBM implements single-click testing of connections to BIG-IP GTM devices, synchronization of settings across BIG-IP GTM instances for consistency, and auto discovery of settings, including protocol, DNS profiles, pools, virtual IPs, servers, and domains being load balanced. In short, LBM gives a solid view of what is happening inside your BIG-IP GTM devices and presents all appliances in a unified user interface. If you use BIG-IP iControl, then you will also be pleased that Infoblox LBM regularly checks the certificates used to secure iControl communications and validates that they are not rejected or expired. For more information about this solution, see the solution page.

Previous F5 Fridays
- F5 Friday: Speed Matters
- F5 Friday: No DNS? No … Anything.
- F5 Friday: Zero-Day Apache Exploit? Zero-Problem
- F5 Friday: What's Inside an F5?
- F5 Friday: Programmability and Infrastructure as Code
- F5 Friday: Enhancing FlexPod with F5
- F5 Friday: Microsoft and F5 Lync Up on Unified Communications
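The certificate validation LBM performs against iControl endpoints, as described above, amounts to pulling the TLS certificate from the device and confirming it is neither rejected nor past its expiry date. Here is a generic sketch of that kind of check using only the Python standard library; the host and port are placeholders, and because it uses the system CA bundle it assumes a CA-signed device certificate (a self-signed certificate would fail verification here).

```python
# Sketch: report how long until a TLS certificate on a management endpoint expires.
# Host/port are placeholders; a self-signed device cert will raise a verification error.
import socket
import ssl
import time

HOST, PORT = "bigip.example.com", 443      # placeholder iControl endpoint

ctx = ssl.create_default_context()         # system CAs; verification failures raise
with socket.create_connection((HOST, PORT), timeout=5) as sock:
    with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
        cert = tls.getpeercert()

not_after = ssl.cert_time_to_seconds(cert["notAfter"])
days_left = (not_after - time.time()) / 86400
print(f"{HOST}: certificate expires in {days_left:.0f} days")
```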
F5 Friday: Elastic Applications are Enabled by Dynamic Infrastructure

You really can’t have the one without the other. VMware enables the former, F5 provides the latter.

The use of public cloud computing as a means to expand compute capacity on-demand, a la during a seasonal or unexpected spike in traffic, is often called cloud bursting, and we’ve been talking about it (at least in the hypothetical sense) for some time now. When we first started talking about it, the big question was, of course: how do you get the application into the cloud in the first place? Everyone kind of glossed over that because there was no real way to do it on-demand.

OVERCOMING the OBSTACLES BIT by BIT and BYTE by BYTE

The challenges associated with dynamically moving a live, virtually deployed application from one location to another were not trivial, but neither were they insurmountable. Early on, these challenges were directly associated with the differences in networking and issues with the distances over which a virtual image could be successfully transferred. As the industry began to address those challenges, others came to the fore. It’s not enough, after all, to just transfer a virtual machine from one location to another – especially if you’re trying to do so on-demand, in response to some event. You want to migrate that application while it’s live and in use, and you don’t want to disrupt service to do it, because no matter what optimizations and acceleration techniques are used to mitigate the transfer time between locations, it’s still going to take some time. The whole point of cloud bursting is to remain available, and if the process to achieve that dynamic growth defeats the purpose, well, it seems like a silly thing to do, doesn’t it? As we’ve gotten past that problem, now another one rears its head: the down side. Not the negatives, no, the other down side – the scaling-down side of cloud bursting. Remember, the purpose of performing this technological feat in the first place is dynamic scalability, to enable an elastic application that scales up and down on-demand. We want to be able to leverage the public cloud when we need it but not when we don’t, to really realize the benefits of cloud and its lower cost of compute capacity.

FORGING AHEAD

F5 has previously proven that a live migration of an application is not only possible, but feasible. This week at VMworld we took the next step: elastic applications. Yes, we not only proved you can burst an application into the cloud and scale up while live and maintaining availability, but that you can also scale back down when demand decreases. The ability to include a BIG-IP LTM Virtual Edition with the cloud-deployed application instance means you can also consistently apply any application delivery policies necessary to maintain security, consistent application access policies, and performance. The complete solution relies on products from F5 and VMware to monitor application response times and expand into the cloud when they exceed predetermined thresholds. Once in the cloud, the solution can further expand capacity as needed based on application demand. The solution comprises the use of:

VMware vCloud Director
A manageable, scalable platform for cloud services, along with the necessary APIs to provision capacity on demand.

F5 BIG-IP® Local Traffic Manager™ (LTM)
One in each data center and/or cloud, providing management and monitoring to ensure application availability.
Application conditions are reported to the orchestration tool of choice, which then triggers actions (scale up or down) via the VMware vCloud API. Encryption and WAN optimization for SQLFabric communications between the data center and the cloud are also leveraged for security and performance.

F5 BIG-IP® Global Traffic Manager™ (GTM)
Determines when and how to direct requests to the application instances in different sites or cloud environments based on pre-configured policies that dynamically respond to application load patterns. Global application delivery (load balancing) is critical for enabling cloud bursting when public cloud-deployed applications are not integrated via a virtual private cloud architecture.

VMware GemStone SQLFabric
Provides the distributed caching and replication of database objects between sites (cloud and/or data center) necessary to keep application content localized and thereby minimize the performance impact of latency between the application and its data.

I could talk and talk about this solution, but if a picture is worth a thousand words then this video ought to be worth at least that much in demonstrating the capabilities of this joint solution. If you’re like me and not into video (I know, heresy, right?) then I invite you to take a gander at some more traditional content describing this and other VMware-related solutions:

- A Hybrid Cloud Architecture for Elastic Applications with F5 and VMware – Overview
- Hybrid Cloud Application Architecture for Elastic Java-Based Web Applications – Deployment Guide
- F5 and VMware Solution Guide

If you do like video, however, enjoy this one explaining cloud bursting for elastic applications in a hybrid cloud architecture.

Related blogs and articles:
- Bursting the Cloud
- vMotion Layer 2 Adjacency Requirements
- Cloud-bursting and the Database
- Cloud Balancing, Cloud Bursting, and Intercloud
- Cloud Balancing, Reverse Cloud Bursting, and Staying PCI-Compliant
- Virtual Private Cloud (VPC) Makes Internal Cloud bursting Reality
- How Microsoft is bursting into the cloud with BizTalk
- So You Put an Application in the Cloud. Now what?
- Migrate a live application across clouds with no downtime? Sure ...
- Just in Case. Bring Alternate Plans to the Cloud Party
- CloudFucius Asks: Will Open Source Open Doors for Cloud Computing?
- The Three Reasons Hybrid Clouds Will Dominate
- Pursuit of Intercloud is Practical not Premature

Databases in the Cloud Revisited
A few of us were talking on Facebook about high speed rail (HSR) and where/when it makes sense the other day, and I finally said that it almost never does. Trains lost out to automobiles precisely because they are rigid and inflexible, while population densities and travel requirements are highly flexible. That hasn’t changed since the early 1900s, and isn’t likely to in the future, so we should be looking at different technologies to answer the problems that HSR tries to address. And since everything in my universe is inspiration for either blogging or gaming, this lead me to reconsider the state of cloud and the state of cloud databases in light of synergistic technologies (did I just use “synergistic technologies in a blog? Arrrggghhh…). There are several reasons why your organization might be looking to move out of a physical datacenter, or to have a backup datacenter that is completely virtual. Think of the disaster in Japan or hurricane Katrina. In both cases, having even the mission critical portions of your datacenter replicated to the cloud would keep your organization online while you recovered from all of the other very real issues such a disaster creates. In other cases, if you are a global organization, the cost of maintaining your own global infrastructure might well be more than utilizing a global cloud provider for many services… Though I’ve not checked, if I were CIO of a global organization today, I would be looking into it pretty closely, particularly since this option should continue to get more appealing as technology continues to catch up with hype. Today though, I’m going to revisit databases, because like trains, they are in one place, and are rigid. If you’ve ever played with database Continuous Data Protection or near-real-time replication, you know this particular technology area has issues that are only now starting to see technological resolution. Over the last year, I have talked about cloud and remote databases a few times, talking about early options for cloud databases, and mentioning Oracle Goldengate – or praising Goldengate is probably more accurate. Going to the west in the US? HSR is not an option. The thing is that the options get a lot more interesting if you have Goldengate available. There are a ton of tools, both integral to database systems and third-party that allow you to encrypt data at rest these days, and while it is not the most efficient access method, it does make your data more protected. Add to this capability the functionality of Oracle Goldengate – or if you don’t need heterogeneous support, any of the various database replication technologies available from Oracle, Microsoft, and IBM, you can seamlessly move data to the cloud behind the scenes, without interfering with your existing database. Yes, initial configuration of database replication will generally require work on the database server, but once configured, most of them run without interfering with the functionality of the primary database in any way – though if it is one that runs inside the RDBMS, remember that it will use up CPU cycles at the least, and most will work inside of a transaction so that they can insure transaction integrity on the target database, so know your solution. 
Running inside the primary transaction is not necessary, and for many uses may not even be desirable, so if you want your commits to happen rapidly, something like Goldengate that spawns a separate transaction for the replica is a good option… Just remember that you then need to pay attention to alerts from the replication tool so that you don’t end up with successful transactions on the primary not getting replicated because something goes wrong with the transaction on the secondary. But for DBAs, this is just an extension of their daily work, as long as someone is watching the logs.

With the advent of Goldengate, advanced database encryption technology, and products like our own BIG-IP WOM, you now have the ability to drive a replica of your database into the cloud. This is certainly a boon for backup purposes, but it also adds an interesting perspective to application mobility. You can turn on replication from your data center to the cloud or from cloud provider A to cloud provider B, then use vMotion to move your application VMs… And you’re off to a new location. If you think you’ll be moving frequently, this can all be configured ahead of time, so you can flick a switch and move applications at will. You will, of course, have to weigh the impact of complete or near-complete database encryption against the benefits of cloud usage. Even if you use the adaptability of the cloud to speed encryption and decryption operations by distributing them over several instances, you’ll still have to pay for that CPU time, so there is a balancing act that needs some exploration before you’ll be certain this solution is a fit for you. And at this juncture, I don’t believe putting unencrypted corporate data of any kind into the cloud is a good idea. Every time I say that, it angers some cloud providers, but frankly, cloud being new and by definition shared resources, it is up to the provider to prove it is safe, not up to us to take their word for it. Until then, encryption is your friend, both going to/from the cloud and at rest in the cloud. I say the same thing about Cloud Storage Gateways; it is just a function of the current state of cloud technology, not some kind of unreasoning bias.

So the key then is to make sure your applications are ready to be moved. This is actually pretty easy in the world of portable VMs, since the entire VM will pick up and move. The only catch is that you need to make sure users can get to the application at the new location. There are a ton of Global DNS solutions like F5’s BIG-IP Global Traffic Manager that can get your users where they need to be, since your public-facing IPs will be changing when moving from organization to organization. Everything else should be set, since you can use internal IP addresses to communicate between your application VMs and database VMs. Utilizing some form of in-flight encryption and some form of acceleration for your database replication will round out the solution architecture, and leave you with a road map that looks more like a highway map than an HSR map. More flexible, more pervasive.

Quick! The Data Center Just Burned Down, What Do You Do?
You get the call at 2am. The data center is on fire, and while the server room itself was protected with your high-tech fire-fighting gear, the rest of the building billowed out smoke and noxious gasses that have contaminated your servers. Unless you have a sealed server room, this is a very real possibility. Another possibility is that the fire department had to spew a ton of liquid on your building to keep the fire from spreading. No sealed room means your servers might have taken a bath. And sealed rooms are a real rarity in datacenter design for a whole host of reasons starting with cost. So you turn to your DR plan, and step one is to make certain the load was shifted to an alternate location. That will buy you time to assess the damage. Little do you know that while a good start, that’s probably not enough of a plan to get you back to normal quickly. It still makes me wonder when you talk to people about disaster recovery how different IT shops have different views of what’s necessary to recover from a disaster. The reason it makes me wonder is because few of them actually have a Disaster Recovery Plan. They have a “Pain Alleviation Plan”. This may be sufficient, depending upon the nature of your organization, but it may not be. You are going to need buildings, servers, infrastructure, and the knowledge to put everything back together – even that system that ran for ten years after the team that implemented it moved on to a new job. Because it wouldn’t still be running on Netware/Windows NT/OS2 if it wasn’t critical and expensive to replace. If you’re like most of us, you moved that system to a VM if at all possible years ago, but you’ll still have to get it plugged into a network it can work on, and your wires? They’re all suspect. The plan to restore your ADS can be painful in-and-of itself, let alone applying the different security settings to things like NAS and SAN devices, since they have different settings for different LUNS or even folders and files. The massive amount of planning required to truly restore normal function of your systems is daunting to most organizations, and there are some question marks that just can’t be answered today for a disaster that might happen in a year or even ten – hopefully never, but we do disaster planning so that we’re prepared if it does, so never isn’t a good outlook while planning for the worst. While still at Network Computing, I looked at some great DR plans ranging from “send us VMs and we’ll ship you servers ready to rock the same day your disaster happens” to “We’ll drive a truck full of servers to your location and you can load them up with whatever you need and use our satellite connection to connect to the world”. Problem is that both of these require money from you every month while providing benefit only if you actually have a disaster. Insurance is a good thing, but increasing IT overhead is risky business. When budget time comes, the temptation to stop paying each month for something not immediately forwarding business needs is palpable. And both of those solutions miss the ever-growing infrastructure part. Could you replace your BIG-IPs (or other ADC gear) tomorrow? You could get new ones from F5 pretty quickly, but do you have their configurations backed up so you can restore? How about the dozens of other network devices, NAS and SAN boxes, network architecture? Yeah, it’s going to be a lot of work. But it is manageable. 
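One piece of that work that is easy to automate up front is verifying that the device configuration backups you will depend on actually exist and are fresh. Here is a minimal sketch; the backup directory layout and the device inventory are hypothetical, so adapt both to whatever your backup jobs actually produce.

```python
# Sketch: flag device config backups that are missing or older than 7 days.
# The directory layout (one file per device under /backups/network) and the
# device list are hypothetical examples.
import os
import time

BACKUP_DIR = "/backups/network"
MAX_AGE_DAYS = 7
DEVICES = ["bigip-gtm-01", "bigip-ltm-01", "core-switch-01"]   # hypothetical inventory

now = time.time()
for device in DEVICES:
    path = os.path.join(BACKUP_DIR, f"{device}.cfg")
    if not os.path.exists(path):
        print(f"MISSING backup for {device}")
        continue
    age_days = (now - os.path.getmtime(path)) / 86400
    status = "OK" if age_days <= MAX_AGE_DAYS else "STALE"
    print(f"{status}: {device} backup is {age_days:.1f} days old")
```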
There is going to be a huge time investment, but it’s disaster recovery, the time investment is in response to an emergency. Even so, adequate planning can cut down the time you have to invest to return to business-as-usual. Sometimes by huge amounts. Not having a plan is akin to setting the price for a product before you know what it costs to produce – you’ll regret it.

What do you need? Well if you’re lucky, you have more than one datacenter, and all you need to do is slightly oversize them to make sure you can pick up the slack if one goes down. If you’re not one of the lucky organizations, you’ll need a plan for getting a building with sufficient power, internet capability, and space, replace everything from power connections to racks to SAN and NAS boxes, restorable backups (seriously, test your backups or replication targets. There are horror stories…), and time for your staff to turn all of these raw elements into a functional datacenter. It’s a tall order, you need backups of the configs of all appliances and information from all of your vendors about replacement timelines. But should you ever need this plan, it is far better to have done some research than to wake up in the middle of the night and then, while you are down, spend time figuring it all out. The toughest bit is keeping it up to date, because a project to implement a DR plan is a discrete project, but updating costs for space and lists of vendors and gear on a regular basis is more drudgery and outside of project timelines. But it’s worth the effort as insurance. And if your timeline is critical, look into one of those semi trailers – or the new thing (since 2005 or 2007 at least), containerized data centers - because when you need them, you need them. If you can’t afford to be down for more than a day or two, they’re a good stopgap while you rebuild.

SecurityProcedure.com has an aggregated list of free DR plans online. I’ve looked at a couple of the plans they list, they’re not horrible, but make certain you customize them to your organization’s needs. No generic plan is complete for your needs, so make certain you cover all of your bases if you use one of these. The key is to have a plan that dissects all the needs post-disaster. I’ve been through a disaster (The Great NWC Lab Flood), and there are always surprises, but having a plan to minimize them is a first step to maintaining your sanity and restoring your datacenter to full function.

In the future – the not-too-distant future – you will likely have the cloud as a backup, assuming that you have a product like our GTM to enable cloud-bursting, and that Global Load Balancer isn’t taken out by the fire. But even if it is, replacing one device to get your entire datacenter emulated in the cloud would not be anywhere near as painful as the rush to reassemble physical equipment.

(Image: marketing image of an IBM/APC containerized data center)

Lori and I? No, we have backups and insurance and that’s about it. But though our network is complex, we don’t have any businesses hosted on it, so this is perfectly acceptable for our needs. No containerized data centers for us. Let’s hope we, and you, never need any of this.

Windows Vista Performance Issue Illustrates Importance of Context
Decisions about routing at every layer require context A friend forwarded a blog post to me last week mainly because it contained a reference to F5, but upon reading it (a couple of times) I realized that this particular post contained some very interesting information that needed to be examined further. The details of the problems being experienced by the poster (which revolve around a globally load-balanced site that was for some reason not being distributed very equally) point to an interesting conundrum: just how much control over site decisions should a client have? Given the scenario described, and the conclusion that it is primarily the result of an over-eager client implementation in Windows Vista of a fairly obscure DNS-focused RFC, the answer to how much control a client should have over site decisions seems obvious: none. The problem (which you can read about in its full detail here) described is that Microsoft Vista, when presented with multiple A records from a DNS query, will select an address “which shares the most prefix bits with the source address is selected, presumably on the basis that it's in some sense "closer" in the network.” This is not a bad thing. This implementation was obviously intended to aid in the choice of a site closer to the user, which is one of the many ways in which application network architects attempt to improve end-user performance: reducing the impact of physical distance on the transfer of application data. The problem is, however, that despite the best intentions of those who designed IP, it is not guaranteed that having an IP address that is numerically close to yours means the site is physically close to you. Which kind of defeats the purpose of implementing the RFC in the first place. Now neither solution (choosing random addresses versus one potentially physically closer) is optimal primarily because neither option assures the client that the chosen site is actually (a) available and (b) physically closer. Ostensibly the way this should work is that the DNS resolution process would return a single address (the author’s solution) based on the context in which the request was made. That means the DNS resolver needs to take into consideration the potential (in)accuracy of the physical location when derived from an IP address, the speed of the link over which the client is making the request (which presumably will not change between DNS resolution and application request) and any other information it can glean from the client. The DNS resolver needs to return the IP address of the site that at the time the request is made appears best able to serve the user’s request quickly. That means the DNS resolver (usually a global load balancer) needs to be contextually aware of not only the client but the sites as well. It needs to know (a) which sites are currently available to serve the request and (b) how well each is performing and (c) where they are physically located. That requires collaboration between the global load balancer and the local application delivery mechanisms that serve as an intermediary between the data center and the clients that interact with it. Yes, I know. A DNS request doesn’t carry information regarding which service will be accessed. A DNS lookup could be querying for an IP address for Skype, or FTP, or HTTP. Therein lies part of the problem, doesn’t it? DNS is a fairly old, in technical terms, standard. It is service agnostic and unlikely to change. 
But providing even basic context would help – if the DNS resolver knows a site is unreachable, likely due to routing outages, then it shouldn’t return that IP address to the client if another is available. Given the ability to do so, a DNS resolution solution could infer service based on host name – as long as the site were architected in such a way as to remain consistent with such conventions. For example, ensuring that www.example.com is used only for HTTP, and ftp.example.com is only used for FTP, would enable many DNS resolvers to make better decisions. Host-based service mappings, inferred or codified, would aid in adding the context necessary to make better decisions regarding which IP address is returned during a DNS lookup – without changing a core standard and potentially breaking teh Internets.

The problem with giving the client control over which site it accesses when trying to use an application is that it lacks the context necessary to make an intelligent decision. It doesn’t know whether a site is up or down or whether it is performing well or whether it is near or at capacity. It doesn’t know where the site is physically located and it certainly can’t ascertain the performance of those sites because it doesn’t even know where they are yet, that’s why it’s performing a DNS lookup.

A well-performing infrastructure is important to the success of any web-based initiative, whether that’s cloud-based applications or locally hosted web sites. Part of a well-performing infrastructure is having the ability to route requests intelligently, based on the context in which those requests are made. Simply returning IP addresses – and choosing which one to use – in a vacuum based on little or no information about the state of those sites is asking for poor performance and availability problems. Context is critical.
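As a footnote, the longest-shared-prefix selection described at the top of this article is easy to demonstrate, and doing so makes the core criticism obvious: numeric closeness says nothing about availability, performance, or physical location. The addresses below are arbitrary documentation examples, and the function is only a sketch of that selection behavior, not Vista's actual implementation.

```python
# Sketch of shared-prefix address selection (in the spirit of RFC 3484's rules):
# pick the candidate A record sharing the most leading bits with the source IP.
# Numeric closeness is not physical closeness, which is the problem described above.
import ipaddress

def shared_prefix_bits(a: str, b: str) -> int:
    x = int(ipaddress.ip_address(a)) ^ int(ipaddress.ip_address(b))
    return 32 - x.bit_length()          # leading bits in common for IPv4

def pick(source: str, candidates: list) -> str:
    return max(candidates, key=lambda c: shared_prefix_bits(source, c))

source = "192.0.2.10"                                        # example client address
candidates = ["192.0.5.25", "203.0.113.7", "198.51.100.3"]   # A records returned
print(pick(source, candidates))                              # -> 192.0.5.25
```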
How Sears Could Have Used the Cloud to Stay Available Black Friday

The predictions of the death of online shopping this holiday season were, apparently, greatly exaggerated. As it's been reported, Sears, along with several other well known retailers, was a victim of heavy traffic on Black Friday. One wonders if the reports of a dismal shopping season this year due to economic concerns led retailers to believe that there would be no seasonal rush to online sites and that preparation to deal with sudden spikes in traffic was therefore unnecessary.

Most of the 63 objects (375 KB of total data) comprising the sears.com home page are served from sears.com and are either images, scripts, or stylesheets. The rest of their site is similar, with a lot of static data comprising a large portion of the objects. That's a lot of static data being served, and a lot of connections required on the servers just for one page. Not knowing Sears' internal architecture, it's quite possible they are already using application delivery and acceleration solutions to ensure availability and responsiveness of their site. If they aren't, they should, because even the simple connection optimizations available in today's application delivery controllers would have likely drastically reduced the burden on servers and increased the capacity of their entire infrastructure.

But let's assume they are already using application delivery to its fullest and simply expended all possible capacity on their servers despite their best efforts due to the unexpected high volume of visitors. It happens. After all, server resources are limited in the data center, and when the servers are full up, they're full up. Assuming that Sears, like most IT shops, isn't willing to purchase additional hardware and incur the associated management, power, and maintenance costs over the entire year simply to handle a seasonal rush, they still could have prepared for the onslaught by taking advantage of cloud computing.

Cloudbursting is an obvious solution, as visitors who pushed Sears' servers over capacity would have been automatically directed via global load balancing techniques to a cloud-hosted version of their site. Not only could they have managed to stay available, this would have also improved performance of their site for all visitors, as cloudbursting can use a wide array of variables to determine when requests should be directed to the cloud, including performance-based parameters.

A second option would have been a hybrid cloud model, where certain files and objects are served from the local data center while others are served from the cloud. Instead of serving up static stylesheets and images from sears.com internal servers, they could have easily been hosted in the cloud. Doing so would translate into fewer requests to sears.com internal servers, which reduces the processing power required and results in higher capacity of servers.

I suppose a third option would have been to commit fully to the cloud and move their entire application infrastructure to the cloud, but even though adoption appears to be imminent for many enterprises according to attendees at the Gartner Data Center Conference, 2008 is certainly not "the year of the cloud" and there are still quite a few kinks in full adoption plans that need to be ironed out before folks can commit fully, such as compliance and integration concerns. Still, there are ways that Sears, and any organization with a web presence, could take advantage of the cloud without committing fully to ensure availability under exceedingly high volume.
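At the global load balancing layer, the cloudbursting behavior described above is a simple decision: send requests to the primary data center until it is unavailable, over capacity, or too slow, and then answer with the cloud-hosted copy instead. The sketch below shows only that decision logic; the site names, VIPs, and thresholds are made up for illustration and this is not any product's API.

```python
# Illustrative cloudbursting decision at the GSLB layer -- not a product API.
# Direct traffic to the primary data center unless it is unavailable, at
# connection capacity, or responding too slowly; otherwise burst to the cloud.
PRIMARY = {"name": "datacenter", "vip": "198.51.100.10"}
CLOUD   = {"name": "cloud",      "vip": "203.0.113.10"}

MAX_CONNECTIONS = 50_000
MAX_RESPONSE_MS = 800

def choose_site(primary_up: bool, connections: int, response_ms: float) -> dict:
    overloaded = connections >= MAX_CONNECTIONS or response_ms >= MAX_RESPONSE_MS
    return CLOUD if (not primary_up or overloaded) else PRIMARY

# Example: Black Friday traffic pushes response time past the threshold.
print(choose_site(primary_up=True, connections=48_000, response_ms=1_250)["name"])  # -> cloud
```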
It just takes some forethought and planning. Yeah, I'm thinking it too, but I'm not going to say it either.

Related articles by Zemanta
- Online retailers overloaded on Black Friday
- Online-only outlets see Black Friday boost over 2007
- Sears.com out on Black Friday [Breakdowns]
- The Context-Aware Cloud
- Top 10 Reasons for NOT Using a Cloud
- Half of Enterprises See Cloud Presence by 2010