disaster recovery
Multiple Certs, One VIP: TLS Server Name Indication via iRules
An age-old question that we've seen time and time again in the iRules forums here on DevCentral is "How can I use iRules to manage multiple SSL certs on one VIP?". The answer has historically always been "I'm sorry, you can't." The reasoning is sound. One VIP, one cert, that's how it's always been. You can't do anything with the connection until the handshake is established and decryption is done on the LTM. We'd like to help, but we just really can't. That is… until now.

The TLS protocol has somewhat recently provided the ability to pass a "desired servername" as a value in the originating SSL handshake. Finally we have what we've been looking for: a way to add contextual server info during the handshake, thereby allowing us to say "cert x is for domain x" and "cert y is for domain y". Known to us mortals as "Server Name Indication" or SNI (hence the title), this functionality is paramount for a device like the LTM that can regularly benefit from hosting multiple certs on a single IP. We should now be able to pull out this information and choose an appropriate SSL profile, with a cert that corresponds to the servername value that was sent. All we need is some logic to make this happen. Lucky for us, one of the many bright minds in the DevCentral community has whipped up an iRule to show how you can finally tackle this challenge head on. Because Joel Moses, the shrewd mind and DevCentral MVP behind this example, has already done a solid write-up, I'll quote liberally from his fine work and add some additional context where fitting. Now on to the geekery:

First things first, you'll need to create a mapping of which servernames correlate to which certs (client SSL profiles, in LTM's case). This could be done in any manner, really, but the most efficient, both from a resource and a management perspective, is to use a class. Classes, also known as DataGroups, are name->value pairs that will allow you to easily retrieve the data later in the iRule. Quoting Joel:

Create a string-type datagroup to be called "tls_servername". Each hostname that needs to be supported on the VIP must be input along with its matching clientssl profile. For example, for the site "testsite.site.com" with a ClientSSL profile named "clientssl_testsite", you should add the following values to the datagroup.

String: testsite.site.com
Value: clientssl_testsite

Once you've finished inputting the different server->profile pairs, you're ready to move on to pools. It's very likely that, since you're now managing multiple domains on this VIP, you'll also want to be able to handle multiple pools to match those domains. To do that you'll need a second mapping that ties each servername to the desired pool. This could again be done in any format you like, but since it's the most efficient option and we're already using it, classes make the most sense here. Quoting from Joel:

If you wish to switch pool context at the time the servername is detected in TLS, then you need to create a string-type datagroup called "tls_servername_pool". You will input each hostname to be supported by the VIP and the pool to direct the traffic towards. For the site "testsite.site.com" to be directed to the pool "testsite_pool_80", add the following to the datagroup:

String: testsite.site.com
Value: testsite_pool_80

If you don't, that's fine, but realize that all traffic from each of these hosts will be routed to the default pool, which is very likely not what you want.
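For reference, here is roughly what those two datagroups could look like if you build them from the command line instead of the GUI. This assumes a current TMOS release that includes tmsh (the original write-up predates tmsh, where the GUI was the usual route), and it reuses the example site, profile, and pool names from above; treat it as a sketch and verify the syntax on your version.

    # Hypothetical tmsh equivalents of the datagroups described above
    tmsh create ltm data-group internal tls_servername type string records add { testsite.site.com { data clientssl_testsite } }
    tmsh create ltm data-group internal tls_servername_pool type string records add { testsite.site.com { data testsite_pool_80 } }

Either way, the iRule below only cares that the class names and the string->value pairs line up with your clientssl profiles and pools.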
Now then, we have two classes set up to manage the mappings of servername->SSL profile and servername->pool; all we need is some application logic in line to do the management and provide each inbound request with the appropriate profile and cert. This is done, of course, via iRules. Joel has written up one heck of an iRule, which is available in the codeshare (here) in its entirety along with his solid write-up, but I'll also include it here in-line, as is my habit. Effectively, the iRule parses the data sent during the SSL handshake and searches for the specific TLS servername extension, which is the bit that allows us to do the profile switching magic. He's written it to fall back to the default client SSL profile and pool, so it's very important that both of these exist on your VIP, or you may find yourself with unhappy users.

One last caveat before the code: not all browsers support Server Name Indication, so be careful not to implement this unless you are very confident that most, if not all, users connecting to this VIP will support SNI. For more info on testing for SNI compatibility and a list of browsers that do and don't support it, click through to Joel's awesome CodeShare entry; I've already plagiarized enough.

So finally, the code. Again, my hat is off to Joel Moses for this outstanding example of the power of iRules. Keep at it Joel, and thanks for sharing!

when CLIENT_ACCEPTED {
    if { [PROFILE::exists clientssl] } {

        # We have a clientssl profile attached to this VIP but we need
        # to find an SNI record in the client handshake. To do so, we'll
        # disable SSL processing and collect the initial TCP payload.

        set default_tls_pool [LB::server pool]
        set detect_handshake 1
        SSL::disable
        TCP::collect

    } else {

        # No clientssl profile means we're not going to work.

        log local0. "This iRule is applied to a VS that has no clientssl profile."
        set detect_handshake 0

    }
}

when CLIENT_DATA {

    if { ($detect_handshake) } {

        # If we're in handshake detection, look for an SSL/TLS header.

        binary scan [TCP::payload] cSS tls_xacttype tls_version tls_recordlen

        # TLS is the only thing we want to process because it's the only
        # version that allows the servername extension to be present. When we
        # find a supported TLS version, we'll check to make sure we're getting
        # only a Client Hello transaction -- those are the only ones we can pull
        # the servername from prior to connection establishment.

        switch $tls_version {
            "769" -
            "770" -
            "771" {
                if { ($tls_xacttype == 22) } {
                    binary scan [TCP::payload] @5c tls_action
                    if { not (($tls_action == 1) && ([TCP::payload length] > $tls_recordlen)) } {
                        set detect_handshake 0
                    }
                }
            }
            default {
                set detect_handshake 0
            }
        }

        if { ($detect_handshake) } {

            # If we made it this far, we're still processing a TLS client hello.
            #
            # Skip the TLS header (43 bytes in) and process the record body. For TLS/1.0 we
            # expect this to contain only the session ID, cipher list, and compression
            # list. All but the cipher list will be null since we're handling a new transaction
            # (client hello) here. We have to determine how far out to parse the initial record
            # so we can find the TLS extensions if they exist.

            set record_offset 43
            binary scan [TCP::payload] @${record_offset}c tls_sessidlen
            set record_offset [expr {$record_offset + 1 + $tls_sessidlen}]
            binary scan [TCP::payload] @${record_offset}S tls_ciphlen
            set record_offset [expr {$record_offset + 2 + $tls_ciphlen}]
            binary scan [TCP::payload] @${record_offset}c tls_complen
            set record_offset [expr {$record_offset + 1 + $tls_complen}]

            # If we're in TLS and we've not parsed all the payload in the record
            # at this point, then we have TLS extensions to process. We will detect
            # the TLS extension package and parse each record individually.

            if { ([TCP::payload length] >= $record_offset) } {
                binary scan [TCP::payload] @${record_offset}S tls_extenlen
                set record_offset [expr {$record_offset + 2}]
                binary scan [TCP::payload] @${record_offset}a* tls_extensions

                # Loop through the TLS extension data looking for a type 00 extension
                # record. This is the IANA code for server_name in the TLS transaction.

                for { set x 0 } { $x < $tls_extenlen } { incr x 4 } {
                    set start [expr {$x}]
                    binary scan $tls_extensions @${start}SS etype elen
                    if { ($etype == "00") } {

                        # A servername record is present. Pull this value out of the packet data
                        # and save it for later use. We start 9 bytes into the record to bypass
                        # type, length, and SNI encoding header (which is itself 5 bytes long), and
                        # capture the servername text (minus the header).

                        set grabstart [expr {$start + 9}]
                        set grabend [expr {$elen - 5}]
                        binary scan $tls_extensions @${grabstart}A${grabend} tls_servername
                        set start [expr {$start + $elen}]
                    } else {

                        # Bypass all other TLS extensions.

                        set start [expr {$start + $elen}]
                    }
                    set x $start
                }

                # Check to see whether we got a servername indication from TLS. If so,
                # make the appropriate changes.

                if { [info exists tls_servername] } {

                    # Look for a matching servername in the Data Group and pool.

                    set ssl_profile [class match -value [string tolower $tls_servername] equals tls_servername]
                    set tls_pool [class match -value [string tolower $tls_servername] equals tls_servername_pool]

                    if { $ssl_profile == "" } {

                        # No match, so we allow this to fall through to the "default"
                        # clientssl profile.

                        SSL::enable
                    } else {

                        # A match was found in the Data Group, so we will change the SSL
                        # profile to the one we found. Hide this activity from the iRules
                        # parser.

                        set ssl_profile_enable "SSL::profile $ssl_profile"
                        catch { eval $ssl_profile_enable }
                        if { not ($tls_pool == "") } {
                            pool $tls_pool
                        } else {
                            pool $default_tls_pool
                        }
                        SSL::enable
                    }
                } else {

                    # No match because no SNI field was present. Fall through to the
                    # "default" SSL profile.

                    SSL::enable
                }

            } else {

                # We're not in a handshake. Keep on using the currently set SSL profile
                # for this transaction.

                SSL::enable
            }

            # Hold down any further processing and release the TCP session further
            # down the event loop.

            set detect_handshake 0
            TCP::release
        } else {

            # We've not been able to match an SNI field to an SSL profile. We will
            # fall back to the "default" SSL profile selected (this might lead to
            # certificate validation errors on non SNI-capable browsers).

            set detect_handshake 0
            SSL::enable
            TCP::release

        }
    }
}
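One housekeeping note to finish the setup: the iRule has to be attached to the virtual server alongside the default clientssl profile and default pool it falls back to. A hypothetical one-liner for a recent TMOS version (the virtual server and iRule names here are invented, not from Joel's write-up):

    # Hypothetical: attach the SNI-switching iRule to an existing virtual server
    tmsh modify ltm virtual vs_https_sni rules { sni_profile_switching }

The same thing can, of course, be done from the Resources tab of the virtual server in the GUI.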
Quick! The Data Center Just Burned Down, What Do You Do?

You get the call at 2am. The data center is on fire, and while the server room itself was protected with your high-tech fire-fighting gear, the rest of the building billowed out smoke and noxious gasses that have contaminated your servers. Unless you have a sealed server room, this is a very real possibility. Another possibility is that the fire department had to spew a ton of liquid on your building to keep the fire from spreading. No sealed room means your servers might have taken a bath. And sealed rooms are a real rarity in datacenter design for a whole host of reasons, starting with cost. So you turn to your DR plan, and step one is to make certain the load was shifted to an alternate location. That will buy you time to assess the damage. Little do you know that, while a good start, that's probably not enough of a plan to get you back to normal quickly.

It still makes me wonder, when talking to people about disaster recovery, how different IT shops' views are of what's necessary to recover from a disaster. The reason it makes me wonder is that few of them actually have a Disaster Recovery Plan. They have a "Pain Alleviation Plan". This may be sufficient, depending upon the nature of your organization, but it may not be. You are going to need buildings, servers, infrastructure, and the knowledge to put everything back together, even that system that ran for ten years after the team that implemented it moved on to a new job. Because it wouldn't still be running on Netware/Windows NT/OS2 if it wasn't critical and expensive to replace. If you're like most of us, you moved that system to a VM if at all possible years ago, but you'll still have to get it plugged into a network it can work on, and your wires? They're all suspect.

The plan to restore your ADS can be painful in and of itself, let alone applying the different security settings to things like NAS and SAN devices, since they have different settings for different LUNs or even folders and files. The massive amount of planning required to truly restore normal function of your systems is daunting to most organizations, and there are some question marks that just can't be answered today for a disaster that might happen in a year or even ten. Hopefully never, but we do disaster planning so that we're prepared if it does, so "never" isn't a good outlook while planning for the worst.

While still at Network Computing, I looked at some great DR plans, ranging from "send us VMs and we'll ship you servers ready to rock the same day your disaster happens" to "we'll drive a truck full of servers to your location and you can load them up with whatever you need and use our satellite connection to connect to the world". Problem is that both of these require money from you every month while providing benefit only if you actually have a disaster. Insurance is a good thing, but increasing IT overhead is risky business. When budget time comes, the temptation to stop paying each month for something not immediately forwarding business needs is palpable. And both of those solutions miss the ever-growing infrastructure part.

Could you replace your BIG-IPs (or other ADC gear) tomorrow? You could get new ones from F5 pretty quickly, but do you have their configurations backed up so you can restore? How about the dozens of other network devices, NAS and SAN boxes, network architecture? Yeah, it's going to be a lot of work. But it is manageable.
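On the "do you have their configurations backed up" point: a BIG-IP's configuration, certificates, and license can be captured in a single UCS archive and copied off-box, which covers a big chunk of the rebuild problem. The sketch below is a minimal example only; it assumes a TMOS version with tmsh and a reachable backup host called backup.example.com, neither of which comes from the article, so adjust to taste.

    # Hypothetical example: archive the BIG-IP configuration and copy it off-site
    UCS=/var/local/ucs/dr-backup-$(date +%Y%m%d).ucs
    tmsh save sys ucs "$UCS"
    scp "$UCS" backup.example.com:/archives/bigip/

Run on a schedule (and tested by actually restoring it somewhere), that one archive is most of what you would need to stand up a replacement unit; the rest of your gear deserves the same treatment.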
There is going to be a huge time investment, but it's disaster recovery; the time investment is in response to an emergency. Even so, adequate planning can cut down the time you have to invest to return to business as usual, sometimes by huge amounts. Not having a plan is akin to setting the price for a product before you know what it costs to produce: you'll regret it.

What do you need? Well, if you're lucky, you have more than one datacenter, and all you need to do is slightly oversize them to make sure you can pick up the slack if one goes down. If you're not one of the lucky organizations, you'll need a plan for getting a building with sufficient power, internet capability, and space; for replacing everything from power connections to racks to SAN and NAS boxes; restorable backups (seriously, test your backups or replication targets; there are horror stories…); and time for your staff to turn all of these raw elements into a functional datacenter. It's a tall order: you need backups of the configs of all appliances and information from all of your vendors about replacement timelines. But should you ever need this plan, it is far better to have done some research than to wake up in the middle of the night and then, while you are down, spend time figuring it all out. The toughest bit is keeping it up to date, because implementing a DR plan is a discrete project, but updating costs for space and lists of vendors and gear on a regular basis is more drudgery and falls outside of project timelines. But it's worth the effort as insurance.

And if your timeline is critical, look into one of those semi trailers, or the newer thing (since 2005 or 2007 at least), containerized data centers, because when you need them, you need them. If you can't afford to be down for more than a day or two, they're a good stopgap while you rebuild.

SecurityProcedure.com has an aggregated list of free DR plans online. I've looked at a couple of the plans they list, and they're not horrible, but make certain you customize them to your organization's needs. No generic plan is complete for your needs, so make certain you cover all of your bases if you use one of these. The key is to have a plan that dissects all the needs post-disaster. I've been through a disaster (The Great NWC Lab Flood), and there are always surprises, but having a plan to minimize them is a first step to maintaining your sanity and restoring your datacenter to full function.

In the future, the not-too-distant future, you will likely have the cloud as a backup, assuming that you have a product like our GTM to enable cloud-bursting, and that the Global Load Balancer isn't taken out by the fire. But even if it is, replacing one device to get your entire datacenter emulated in the cloud would not be anywhere near as painful as the rush to reassemble physical equipment.

(Image: Marketing image of an IBM/APC container)

Lori and I? No, we have backups and insurance and that's about it. But though our network is complex, we don't have any businesses hosted on it, so this is perfectly acceptable for our needs. No containerized data centers for us. Let's hope we, and you, never need any of this.
Virtual server address space on disaster recovery F5 instance

I am working on setting up a disaster recovery instance of an existing HA BIG-IP pair. I would like to ConfigSync (sync only) my local devices to the new device, which is located in a different data center. The issue is that if/when I need to "fail over" (manually) to the new disaster recovery device, the IP space of the virtual servers is different. So, for example, if my virtual server "A" has a destination address of 192.168.1.10 in my existing data center, the new destination address might be 10.1.1.10. (Note: I am less worried about pool member IP space, because I can use priority groups and have the disaster recovery pool member IPs pre-configured but disabled.) My first thought was a "DR go live" script which would search & replace the config file and reload it, but is there a more "elegant" way to handle this dilemma without having them share the same IP space?
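For what it's worth, the "DR go live" script idea can be sketched roughly as follows: dump the configuration to a single configuration file (SCF), rewrite the virtual server addresses, and load the result on the DR unit. This is only an illustration of that approach, not a recommendation; the addresses are the hypothetical ones from the question, and the SCF commands and sed pattern need to be verified (and rehearsed well away from production) on your TMOS version before a real failover depends on them.

    # Hypothetical "DR go live" sketch -- test thoroughly before relying on it
    tmsh save sys config file /var/local/scf/dr-golive.scf
    # Rewrite production virtual server addresses into the DR address space
    sed -i 's/192\.168\.1\./10.1.1./g' /var/local/scf/dr-golive.scf
    tmsh load sys config file /var/local/scf/dr-golive.scf

The appeal of this approach is that the DR unit stays a plain ConfigSync target until the moment it needs to diverge; the cost is that the rewrite has to be re-run and re-tested every time the production configuration changes.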
Multi-Site Redundancy

We have an active site and a disaster recovery site, and we have recently incorporated NSX-T, which stretches the L2/L3 domain to both sites. My question is: in a DR test, our VIP needs to have the same IP address in our DR site as it does in our active site; how can this be accomplished? I've read on here that disabling ARP in the WPO Virtual Server may be a solution. Is there another way to accomplish this without using F5 DNS?
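If the ARP angle is the one you end up testing, note that the setting lives on the virtual address object rather than on the virtual server itself. A minimal sketch, using 192.168.1.10 purely as a placeholder address; confirm the behavior against your TMOS version and your stretched L2/NSX-T design before a real DR test:

    # Hypothetical example: keep the DR unit from answering ARP for the shared VIP address
    tmsh modify ltm virtual-address 192.168.1.10 arp disabled
    # Re-enable it when this site should take over the traffic
    tmsh modify ltm virtual-address 192.168.1.10 arp enabled

The trade-off is that something (or someone) has to flip that switch during the test, which is part of why GTM/DNS-based steering usually comes up in these conversations anyway.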
Configuring a multi-server Testing Environment with VMWare Teams and BIG-IP LTM VE

Having LTM-VE is nice, but setting things up so that you can actually use it for something is a different matter. There is a lot of information floating around out there about configuration, ranging from Joe's Tech Tip to Lori's Architecture Reference to deployment guides. This article focuses on setting up a VMware team that includes your BIG-IP and some test servers. Since I wanted this to be something you can re-create, all of the servers utilized in this Tech Tip are clones of a straightforward CentOS install. I toyed with a large number of OSS operating systems and decided that, of all the OSes I installed, this was the one most suited to enterprise use, so I cloned it a couple of times and changed the IP configuration. All of this is documented below; the goal is to set you up so you can easily copy this environment. If for some reason you cannot, drop me an email and we can arrange for some way to get the files to you.

One thing to remember throughout an install of LTM-VE is that in the end it is just a BIG-IP. Sure, it's in an odd environment that offers some challenges of its own, but from an administration perspective it is just another BIG-IP. In terms of throughput and configuration it is different because of the environment, but in terms of day-to-day administration, it has nodes, pools, virtuals, iRules, etc. that are all accessed and managed in the same manner as they would be on a physical BIG-IP.

I placed my BIG-IP and the three server instances into a single VMware team, meaning I have single-button turn-on/turn-off for the entire test environment. For my purposes, my desktop (the VMware host) is sufficient for representing clients, but I'll point out how and where you would add clients below. One last caveat: this entire Tech Tip is done in VMware Workstation. If you are running one of the server versions, some dialogs and such will be slightly different.

First off, download and open the BIG-IP vmx file. A link to the vmx can be found on the DevCentral VE page. Once you have LTM-VE opened, you're ready to start your team-building exercise. Download CentOS from the link above, or some other OS that you are familiar with. Windows works just fine; I only excluded it from this solution because I can't be offering to give you copies of an OS that charges per copy, and I did say if you had problems to drop me a line. You'll need to install CentOS into a new VM, but VMware makes that so easy that even if you've never done it you should have no problems. Just choose New|VM and tell it where to find the ISO images you downloaded. Once the BIG-IP is loaded (and you can start it and play with it if you like), the default management address will be in the output of an ifconfig on the BIG-IP command line as eth0-mgmt. From your host OS's web browser you can navigate to that address and start licensing and configuration as explained in the release notes.

Next, though, you'll want to create the team. From the VMware menu, select File|New|Team like this... Give it a name and check the directory it will save to... Then add VMs to the team. Choosing "existing VM" will move a VM into the team; choosing "Clone of a VM" will copy a VM into the team. I basically configured one copy of CentOS and then added clones of that one to the team. I used the same process with BIG-IP LTM VE. Finally, add network segments to the team if you are going to model an existing network with your team.
In the case that we're exploring here, I did not add any network segments to the team, but rather relied upon VMware's DHCP capability to assign me networks that I then utilized when configuring the servers to use static IPs (as they would in most environments). So the first thing you need to do after you've built your team is set your servers to use static IPs. They will default to DHCP, but that's not the greatest solution for a load-balancing environment, since any change to the IP addresses of these servers will have an impact on the BIG-IP's ability to load balance. Note: if you are terribly lazy you probably don't have to give them static IPs, because you control what goes in and out of the network, and if you don't add any servers to a segment you should get the same IPs each time you boot. But boy, I just cannot recommend assuming a given server will always get the same address; that's asking for trouble.

One last thing to check: go into the Connections tab of the Team Settings dialog. We started with each of our three servers set to NAT, and eventually had to switch them to Host Only during config. If you can't ping across the VLANs (there should be an implied route between them), this is something worth checking.

And that covers creating the team. In short, the benefits of a team in our case are these:
- Single-button power on/off
- A contained environment with everything you need to test in one VM
- The ability to add network segments to emulate your production environment

As I mentioned above, I put nothing but virtuals directly on the external network because I planned on testing with just my host OS, but you may well want to add clients to the external interface.

Next, you'll have to configure the BIG-IP. You need to create an internal VLAN with the internal network (the one the servers are on) in it, and an external VLAN that faces the outside world. The BIG-IP management interface on first boot will be DHCP'd on your local network; you can change that setting through the browser on your client OS (we made it static on the local network). All of this is detailed in the release notes available on the VE page of DevCentral.

After your VLANs are set, you need to go into the client OSes and set the route to point at the BIG-IP address on the internal VLAN. Bing! At that point you should be able to ping the world. If you can, then you can create the nodes, pools, and virtuals in the BIG-IP and start routing traffic!

I won't go into a ton of detail here, since this article is running long already, but here are some screenshots of BIG-IP LTM VE with everything configured. Here's the list of nodes as we added them… And the shots of the pool, showing it using these nodes… And the VS, showing it using the pool. All is set, and the VS is listening on 192.168.42.110. There's more I'd like to show you, but I'm not only out of room, this is also longer than I'd wanted. So go forth and play, and I'll circle back with some fun stuff like setting up iRules and such.
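One footnote on the static IP step above, since it is the piece that tends to trip people up: on a CentOS clone it comes down to an interface file and a default route pointed at the BIG-IP VE's internal self IP. A rough sketch with made-up addresses (substitute whatever networks VMware handed you):

    # Hypothetical /etc/sysconfig/network-scripts/ifcfg-eth0 for one CentOS guest
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=10.10.10.11
    NETMASK=255.255.255.0
    # Default route through the BIG-IP VE internal self IP
    GATEWAY=10.10.10.245

A service network restart and a quick ping out through the BIG-IP confirms the guest is routing the way the article describes.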
DR Site Failover with GTM

Hi, we are a company with one data center, and we want to build another data center which will act as a DR site for our current one. Can anybody please share some information on how GTM can help us with site failover when one site goes down and the other becomes active? How will user traffic from the internet be processed? Any help will really be appreciated. Thanks.
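At a high level, GTM handles this by answering DNS for the application's hostname and only handing out addresses of sites whose monitors show them healthy, so when the primary site goes down, resolution shifts to the DR site automatically; clients keep using the same name, and what changes (subject to DNS TTLs) is which site's virtual server address they receive. A very rough tmsh sketch of the end state is below. The names are invented, the datacenter, server, and monitor objects it depends on are omitted, and the exact syntax varies by GTM/BIG-IP DNS version, so treat it as a concept sketch rather than a recipe.

    # Hypothetical sketch only -- datacenter, server, monitor, and pool-member objects not shown
    tmsh create gtm pool app_pool_primary
    tmsh create gtm pool app_pool_dr
    # Global availability: answer with the primary site's pool while it is up, otherwise the DR pool
    tmsh create gtm wideip app.example.com pool-lb-mode global-availability pools add { app_pool_primary { order 0 } app_pool_dr { order 1 } }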
Deploying BIG-IP VE in VMware vCloud Director

Beginning with BIG-IP version 11.2, you may have noticed a new package in the Virtual Edition downloads folder for vCloud Director 1.5. VMware's vCloud Director is a software solution enabling enterprises to build multi-tenant private clouds. Each virtual datacenter has its own resource set of CPU, memory, and disk that the vDC owner can allocate as necessary. F5 DevCentral is now running in these virtual datacenter configurations (as announced June 13th, 2012), with full BIG-IP VE infrastructure in place. This article describes the deployment process to get BIG-IP VE installed and running in the vCloud Director environment.

Uploading the vCloud Image

The upload process is fairly simple, but it does take a while. First, after logging in to the vCloud interface, click Catalogs, then select your private catalog. Once in the private catalog, click the upload button highlighted below. This will launch a pop-up. Make sure the vCloud zip file has been extracted. When the .ovf is selected in this screen, it will grab that as well as the disk file after clicking upload. Now get a cup of coffee. Or a lot of them; this takes a while.

Deploying the BIG-IP VE OVF Template

Now that the image is in place, click My Cloud in the top navigation, select vApps, then select the plus sign, which will create a new vApp. (Or the BIG-IP can be deployed into an existing vApp as well.) Select the BIG-IP VE template (bigip11_2 in the screenshot below) and click next. Give the vApp a name and click next. Accept the F5 EULA and click next. At this point, give the VM a full name and a computer name and click finish. I checked the network adapter box to show the network adapter type. It is not configurable at this point, and the flexible NIC is not the right one. After clicking finish, the system will create the vApp and build the VM, so maybe it's time for another cup of coffee.

Once the build is complete, click into the vapp_test vApp. Right-click on the testbigip-11-2 VM and select properties. Do NOT power on the VM yet! CPU and memory should not be altered. More CPU won't help TMM; there is no CMP yet in the virtual edition, and one extra CPU for system tasks is sufficient. TMM can't schedule more than 4G of RAM either. Click "Show network adapter type" and again you'll notice the NICs are not correct. Delete all the network interfaces, then re-add, one at a time, as many NICs (up to 10 in vCloud Director) as are necessary for your infrastructure. To add a NIC, just click the add button, then select the network dropdown and select Add Network. At this point, you'll need to already have a plan for your networking infrastructure. Organizational networks are usable in and between all vApps, whereas vApp networks are isolated to just that instance. I'll show organizational network configuration in this article.

Click Organization network and then click next. Select the appropriate network and click next. I've selected the Management network. For the management NIC I'll leave the adapter type as E1000. The IP Mode is useful for systems where guest customization is enabled, but it is still a required setting. I set it to Static-Manual and entered the self IP addresses assigned to those interfaces. This step is still required within the F5; vCloud will not auto-configure the VLANs and self IPs for you. For the remaining NICs that you add, make sure to set the adapter type to VMXNET 3. Then click OK to apply the new NIC configurations.
Note: adding more than 5 NICs in VE might cause the interfaces to re-order internally. If this happens, you'll need to map the MAC address in vCloud to the MAC addresses reported in tmsh and adjust your VLANs accordingly.

Powering Up!

After the configuration is updated, right-click on the testbigip-11-2 VM and select power on. After the VM powers on, BIG-IP VE will boot. Log in with the root/default credentials and type config at the prompt to set the management IP and netmask: select No on auto-configuration, set the IP address, then set the netmask. I selected No on the default route, but it might be necessary depending on the infrastructure you have in place. Finally, accept the settings.

At this point, the system should be available on the management network. I have a Linux box on that network as well, so I can ssh into the BIG-IP VE to perform the licensing steps, as the vCloud Director console does not support copy/paste.
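From that SSH session the base networking can also be laid down with tmsh once licensing is complete. A minimal sketch; the VLAN names, interface numbers, and addresses below are assumptions rather than values from this article, and as noted above the NIC-to-interface mapping should be confirmed first:

    # Hypothetical tmsh commands -- names, interfaces, and addresses are examples only
    tmsh create net vlan internal interfaces add { 1.2 { untagged } }
    tmsh create net vlan external interfaces add { 1.1 { untagged } }
    tmsh create net self self_internal address 10.10.10.245/24 vlan internal allow-service default
    tmsh create net self self_external address 192.168.42.245/24 vlan external

After that, the GUI on the management address is ready for the usual nodes, pools, and virtual servers.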
More Complexity, New Problems, Sounds Like IT!

It is a very cool world we live in, where technology is concerned. We're looking at a near future where your excess workload, be it applications or storage, can be shunted off to a cloud. Your users have more power in their hands than ever before, and are chomping at the bit to use it on your corporate systems. IBM recently announced a memory/storage breakthrough that will make Flash disks look like 5.25-inch floppies. While we can't know what tomorrow will bring, we can certainly know that the technology will enable us to be more adaptable, responsive, and (yes, I'll say it) secure. Whether we actually are or not is up to us, but the tools will be available. Of course, as has been the case for the last thirty years, those changes will present new difficulties. Enabling technology creates issues… which create opportunity for emerging technology. But we have to live through the change, and deal with making things sane.

In the near future, you will be able to send backup and replication data to the cloud, reducing your on-site storage and storage administration needs by a huge volume. You can today, in fact, with products like F5's ARX Cloud Extender. You will also be able to grant access to your applications from an increasing array of endpoint devices; again, you can do it today, with products like F5's APM for VPN access and ASM for application security, but recent surveys and events in the security space should be spurring you to look more closely into these areas. SaaS is cool again in many areas where it had been ruled out, like email, as a way to move the expense of relatively standardized, high-volume applications out of the datacenter and into the hands of trusted vendors. You can get email "in the cloud" or via traditional SaaS vendors. That's just some of the changes coming along, and guess who is going to implement these important changes and be responsible for making them secure, fast, and available? That would be IT.

To frame the conversation, I'm going to pillage some of Lori's excellent graphics and we'll talk about what you'll need to cover as your environment changes. I won't use the one showing little F5 balls on all of the strategic points of control, but we do have one. First, the points of business value and cost containment possible on the extended datacenter network. Notice that this slide is couched in terms of "how can you help the business". Its genius is that Lori drew an architecture and then inserted business-relevant bits into it, so you can equate what you do every day to helping the business.

Next up is the actual Strategic Points of Control slide, where we can see the technological equivalency of these points. These few points are where you can hook into the existing infrastructure to gain enhanced control of your network (storage, global, WAN, LAN, Internet clients) by putting tools into place that act upon the data passing through them and contain policies and programmability that give you unprecedented automation.

The idea here is that we are stepping beyond traditional deployments to virtualization, remote datacenters, cloud, varied clients, and ever-increasing storage (and cloud storage, of course), while current service levels and security will be expected to be maintained. That's a tall order, and stepping up the stack a bit to put strategic points of control into the network helps you manage the change without killing yourself or implementing a million specialized apps, policies, and procedures just to keep order and control costs.
At the Global Strategic Point of Control, you can direct users to a working instance of your application, even if the primary application is unavailable and users must be routed to a remote instance. At this same place, you can control access to restricted applications and send unauthorized individuals to a completely different server than the application they were trying to access. That's the tip of the iceberg, with load balancing to local strategic points of control being one of the other uses that is beyond the scope of this blog.

The Local Strategic Point of Control offers performance, reliability, and availability in the guise of load balancing; security in the form of content-based routing and application security, before the user has hit the application server; and encryption of sensitive data flowing internally and/or externally, without placing encryption burdens on your servers.

The Storage Strategic Point of Control offers up tiering and storage consolidation through virtual directories, heterogeneous security administration, and abstraction of the NAS heads. By utilizing this point of control between the user and the file services, automation can act across vendors and systems to balance load and consolidate data access. It also reduces management time for endpoint staff, as the device behind a mount/map point can be changed without impacting users.

Remote-site VPN extension and DMZ rules consolidation can happen at the global strategic point of control at the remote site, offering a more hands-off approach to satellite offices. Note that WAN Optimization occurs across the WAN, between the local and global strategic points of control. Web Application Optimization also happens at the global or local strategic point of control, on the way out to the endpoint device.

What's not shown is a large unknown in cloud usage: how to extend the control you have over the LAN out to the cloud via the WAN. Some things are easy enough to cover by sending users to a device in your datacenter and then redirecting to the cloud application, but this can be problematic if you're not careful about redirection and bookmarks. Also, it has not been possible for symmetric tools like WAN Optimization to be utilized in this environment. Virtual appliances like BIG-IP LTM VE are resolving that particular issue, extending much of the control you have in the datacenter out to the cloud.

I've said it before: the times are still changing, and you'll have to stay on top of the new issues that confront you as IT transforms yet again, trying to stay ahead of the curve.
Load Balancing For Developers: Improving Application Performance With ADCs

If you've never heard of my Load Balancing For Developers series, it's a good idea to start here. There are quite a few installments behind us, and I'm not going to look back in this post any more than I must to make it readable without going back… meaning there's much more detail back there than I'll relate here.

Again, after a lengthy sojourn covering other points of interest, I return to Load Balancing For Developers with a more holistic view: application performance. Lori has talked a bit about this topic, and I've talked about it in the form of load balancing benefits and algorithms, but I'd like to look more architecturally again and talk about those difficult-to-uncover performance issues that web apps often face.

You're the IT manager for the company's Zap-n-Go website; it has grown nearly exponentially since launch, and you're the one responsible for keeping it alive. Lately it's online, but your users are complaining of sluggishness. Following the advice of some guy on the Internet, you put a load balancer in about a year ago, and things were better, but after you put in a redundant data center and Global Load Balancing services, things started to degrade again. Time to rethink your architecture before your product gets known as Zap-N-Gone… again.

Thus far you have a complete system with multiple servers behind an ADC in your primary data center, and a complete system with multiple servers behind an ADC in your secondary data center. Failover tests work correctly when you shut down the primary web servers, and the database at the remote location is kept up to date with something like Data Guard for Oracle or Merge Replication Services for SQL Server. This meets the business requirement that the remote database is up to date except for those transactions in progress at the moment of loss. This makes you highly available, and if your ADCs are running as an HA pair and your Global DNS (like our GTM product) is smart enough to switch when it notices your primary site is down, most users won't even know they've been shoved off to the backup datacenter. The business is happy, you're sleeping at night, all is well.

Except that slowly, as usage for the site has grown, performance has suffered. What started as a slight lag has turned into a dragging sensation. You've put more web servers into the pool of available resources, or better yet, used your management tools (in the ADC and on your servers) to monitor all facets of web server performance: disk and network I/O, CPU and memory utilization. And still, performance lags.

Then you check on your WAN connection and database, and find the problem. Either the WAN connection is overloaded, or the database is waiting long periods of time for responses from the secondary datacenter. If you have things configured so that the primary doesn't wait for acknowledgment from the secondary database, then your problem might be even more sinister: some transactions may never get deposited in the secondary datacenter, causing your databases to be out of sync. And that's a problem, because you need the secondary database to be as up to date as possible, but buying more bandwidth is a monthly overhead expense, and sometimes it doesn't help, because the problem isn't always about bandwidth; sometimes it is about latency. In fact, with synchronous real-time replication, it is almost always about latency.
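To put rough numbers on that (mine, not from the series): light in fiber covers roughly 200 km per millisecond, so distance alone puts a floor under every synchronous write that no amount of extra bandwidth can buy back. With primary and DR datacenters a hypothetical 1,500 km apart:

    one-way propagation: 1,500 km / 200 km per ms = about 7.5 ms
    commit plus acknowledgment (one round trip): about 15 ms minimum per synchronous write
    add device hops, serialization, and queuing, and 20-30 ms per write is a realistic floor

That floor is exactly what the TCP optimizations and round-trip reduction discussed below go after.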
Latency, for those who don't know, is a combination of how far your connection must travel over the wire and the number of "bumps in the wire" that have been inserted. Not just the number of devices, but the number and their performance. Each device that touches your data (packet inspection, load balancing, security, whatever the reason) adds time to the delivery window. So does traveling over the wires/fiber. Synchronous replication is very time sensitive. If it doesn't hear back in time, it doesn't commit the changes, and then the primary and secondary databases don't match up.

So you need to cut down the latency and improve the performance of your WAN link. Conveniently, your ADC can help. Out of the box it should have TCP optimizations that cut down the impact of latency by reducing the number of packets going back and forth over the wire. It may have compression too, which cuts down the amount of data going over the wire, reducing the number of packets required, which improves the "apparent" performance and reduces the amount of data on your WAN connection. They might offer more functionality than that too. And you've already paid for an HA pair, putting one in each datacenter, so all you have to do is check what they do out of the box for WAN connections, and then call your sales representative to find out what other functionality is available. F5 includes some functionality in our LTM product, and has more in our add-on WAN Optimization Module (WOM) that can be bought and activated on your BIG-IP. Other vendors have a variety of architectures to offer you similar functionality, but of course I work for and write for F5, so my view is that they aren't as good as our products… Certainly check with your incumbent vendor before looking for other solutions to this problem.

We have seen cases where replication was massively improved with WAN Optimization. More on that in the coming days under a different topic, but consider the thought that you can increase the speed and reliability of transaction-based replication (and indeed file/storage replication, but again, that's another blog), and you as a manager or a developer do not have to do a thing to your code. That implies the other piece: this method of improvement is applicable to applications that you have purchased and do not own the source code for. So check it out… At worst you will lose a few hours tracking down your vendor's options; at best you will be able to go back to sleep at night.

And if you're shifting load between datacenters, as I've mentioned before, Long Distance vMotion is improved by these devices too. F5's architecture for this solution is here (PDF deployment guide). This guide relies upon the WOM functionality mentioned above. And encryption is supported between devices. That means if you are not encrypting your replication, you can start without impacting performance, and if you are encrypting, you can offload the work of encryption to a device designed to handle it. And bandwidth allocation means you can guarantee your replication has enough bandwidth to stay up to date by giving it priority. But you won't care too much about that; you'll be relaxing and dreaming of beaches and stock options… until the next emergency crops up anyway.
Cloud Computing: Location is important, but not the way you think

The debate this week is on location; specifically, we're back to arguing over whether there exist such things as "private" clouds. Data Center Knowledge has a good recap of some of the opinions out there on the subject, and of course I have my own opinion. Location is, in fact, important to cloud computing, but probably not in the way most people are thinking right now.

While everyone is concentrating on defining cloud computing based on whether it's local or remote, folks have lost sight of the fact that location is important for other reasons. It is the location of data centers that is important to cloud computing. After all, a poor choice in physical location can incur additional risk for enterprises trusting their applications to a cloud computing provider. Enterprises residing physically in high-risk areas, those prone to natural disasters primarily, understand this and often try to mitigate that risk by building out a secondary data center in a less risky location, just in case.

But it's not only the physical and natural risk factors that need to be considered. The location of a data center can have a significant impact on the performance of applications delivered out of a cloud computing environment. If a cloud computing provider's primary data center is in India or Russia, for example, and most of your users are in the U.S., the performance of that application will be adversely affected by the speed-of-light problem, the one that says packets can only travel so fast, and no faster, due to the laws of physics. While there are certainly ways to ameliorate the effects of the speed-of-light problem, acceleration and optimization techniques for example, they are not a cure-all. The recent loss of 3 of 4 undersea cables that transport most of the Internet data between continents proves that accidents are not only naturally occurring but man-made as well, and the effects can be devastating on applications and their users. If you're using a cloud computing provider such as Blue Lock as a secondary or tertiary data center for disaster recovery, but their primary data center is merely a few miles from your primary data center, you aren't gaining much protection against a natural disaster, are you?

Location is, in fact, important in the choice of a cloud computing provider. You need to understand where their primary and secondary data centers are located in order to ensure that the business justification for using a cloud computing provider is actually valid. If your business case is built on the reduction of the CapEx and OpEx of maintaining a disaster recovery site, you should make certain that, in the event of a local disaster, the cloud computing provider's data center is unlikely to be affected as well, or you risk wasting your investment in that disaster recovery plan. Waste of budgets, whether large or small, is not looked upon favorably today by those running your business.

Given that portability across cloud computing providers today is limited, despite the claims of providers, it is difficult to simply move your applications from one cloud to another quickly. So choose your provider carefully, based not only on matching your business and technological needs to the model they support but on the physical location and distribution of their data centers. Location is important; not to the definition of cloud computing, but in its usage.