hybrid cloud
Cloud bursting, the hybrid cloud, and why cloud-agnostic load balancers matter
Cloud Bursting and the Hybrid Cloud

When researching cloud bursting, Google may take you in many directions. Perhaps you come across services for airplanes that attempt to turn cloudy wedding days into memorable events. Perhaps you'd rather opt for a service that helps your IT organization avoid rainy days. Enter cloud bursting ... yes, the one involving computers and networks instead of airplanes.

Cloud bursting is a term that has been around in the tech realm for quite a few years. In essence, it is the ability to allocate resources across various public and private clouds as an organization's needs change. These needs could be economic drivers, such as Cloud 2 having lower cost than Cloud 1, or capacity drivers, where additional resources are needed during business hours to handle traffic. For intelligent applications, other interesting things are possible with cloud bursting: for example, when demand in a geographical region suddenly needs capacity that is not local to the primary, private cloud, one can spin up resources to serve that demand locally and provide a better user experience. Nathan Pearce summarizes some of the aspects of cloud bursting in this minute-long video, which is a great resource to remind oneself of some of the nuances of this architecture.

While cloud bursting is generally accepted by the industry to mean an "on-demand capacity burst," Lori MacVittie points out that this architectural solution eventually leads to a hybrid cloud, where multiple compute centers are employed to serve demand among both private and public resources, or clouds, all the time. The primary driver for this is practical: there are limitations around how fast data that is critical to one's application (think databases, for example) can be replicated across the internet to different data centers. Thus, the promise of "on-demand" cloud bursting scenarios may be short-lived, eventually giving way to multiple "always-on compute capacity centers" as loads increase for a given application. In any case, it is important to understand that multiple locations, across multiple clouds, will ultimately be serving application content in the not-too-distant future.

An example hybrid cloud architecture where services are deployed across multiple clouds: the "application stack" remains the same, using LineRate in each cloud to balance the local application, while a BIG-IP Local Traffic Manager balances application requests across all of the clouds.

Advantages of Cloud-Agnostic Load Balancing

As one might conclude from the cloud bursting and hybrid cloud discussion above, having multiple clouds running an application creates a need for user requests to be distributed among the resources and for automated systems to control application access and flow. To provide the best control over how an application behaves, it is optimal to use a load balancer to serve requests. No DNS or network routing changes need to be made, and clients continue using the application as they always did as resources come online or go offline; often these load balancers also offer advanced functionality alongside the load balancing service that provides additional value to the application. Having a load balancer that operates the same way no matter where it is deployed becomes important when resources are distributed among many locations.
Understanding expectations around the configuration, management, reporting, and behavior of a system limits issues for application deployments and discrepancies between how one platform behaves versus another. With a load balancer like F5's LineRate product line, anyone can programmatically manage the servers providing an application to users. Leveraging this programmatic control, application providers have an easy way to spin capacity up and down in any arbitrary cloud, retain a familiar yet powerful feature set for their load balancer, redistribute resources for an application, and provide a seamless experience back to the user. No matter where the load balancer is deployed, LineRate can work hand-in-hand with any web service provider, whether considered a cloud or not. Your data, and perhaps more importantly your cost centers, are no longer locked down to one vendor or one location. With the right application logic paired with LineRate Precision's scripting engine, an application can dynamically react to take advantage of market pricing or general capacity needs.

Consider the following scenarios where cloud-agnostic load balancers have advantages over vendor-specific ones:

Economic Drivers

Time-dependent instance pricing: Spot instances with much lower cost become available at night. Example: my startup's billing system can take advantage of better pricing per unit of work in the public cloud at night versus the private datacenter.

Multiple-vendor instance pricing: Cloud 2 just dropped their high-memory instance pricing lower than Cloud 1's, which is useful for your workload during normal business hours. Example: my application's primary workload is migrated to Cloud 2 with a simple config change.

Competition: Having multiple cloud deployments simultaneously increases competition, and thus your organization's negotiated pricing contracts become more attractive over time.

Computational Drivers

Traffic spikes: Someone in marketing just tweeted about our new product. All of a sudden, the web servers that traditionally handled the load just fine are getting slashdotted by people all around North America placing orders. Instead of having humans react to the load and spin up new instances - or even worse, doing nothing - your LineRate system and application worked hand-in-hand to spin up a few instances in Microsoft Azure's Texas location and a few more in Amazon's Virginia region. This helps you distribute requests from geographically diverse locations: your existing datacenter in Oregon, the central US Microsoft cloud, and the east-coast-based Amazon cloud. Orders continue to pour in without any system downtime or, worse, lost customers.

Compute orchestration: A mission-critical application in your organization's private cloud unexpectedly needs extra compute power but must stay internal for compliance reasons. Fortunately, your application can spin up public cloud instances and migrate traffic out of the private datacenter without affecting any users or data integrity. Your LineRate instance reaches out to Amazon to boot instances and migrate important data. More importantly, application developers and system administrators don't even realize the application has migrated, since everything behaves exactly the same in the cloud location. Once the cloud systems boot, alerts are made to F5's LTM and LineRate instances, which migrate traffic to the new servers, allowing the mission-critical app to compute away. You just saved the day!
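To make the programmatic control described above concrete, the sketch below registers a freshly booted burst instance with the load balancer over its REST API and attaches it to the pool serving the application. This is a hedged illustration only: the management address, credentials, endpoint paths, and payload shape are assumptions made for the example, not the documented LineRate API, so consult the LineRate REST API reference for the real schema.

# Hedged sketch: add a new burst instance to a load balancer via REST.
# The endpoint paths and JSON body below are illustrative assumptions.
LB_HOST="lb.example.com:8443"     # load balancer management address (hypothetical)
NEW_SERVER_IP="10.0.2.15"         # instance just booted in the burst cloud
AUTH="admin:changeme"             # replace with real credential handling

# Create a real-server object representing the new instance
curl -sk -u "$AUTH" -X PUT "https://$LB_HOST/config/app/proxy/realServer/burst-web-01" \
     -H "Content-Type: application/json" \
     -d "{\"ipAddress\": \"$NEW_SERVER_IP\", \"port\": 80}"

# Attach it to the virtual server's pool so it begins receiving traffic
curl -sk -u "$AUTH" -X PUT "https://$LB_HOST/config/app/proxy/virtualServer/www/realServer/burst-web-01"

Wrapped in orchestration logic that boots the instance and waits for a health check first, calls like these are what let capacity come online with no DNS or routing changes visible to the client.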
The benefit of a cloud-agnostic load balancing solution for connecting users with an organization's applications is not only a unified user experience, but also a powerful, unified way of controlling the application for its administrators. If all of a sudden an application needs to be moved from, say, a private datacenter with a 100 Mbps connection to a public cloud with a GigE connection, this can easily be done without having to relearn a new load balancing solution. F5's LineRate product is available for bare-metal deployments on x86 hardware and virtual machine deployments, and has recently been released as an Amazon Machine Image (AMI). All of these deployment types leverage the same familiar, powerful tools that LineRate offers: lightweight and scalable load balancing, modern management through its intuitive GUI or the industry-standard CLI, and automated control via its comprehensive REST API. LineRate Point Load Balancer provides hardened, enterprise-grade load balancing and availability services, whereas LineRate Precision Load Balancer adds powerful Node.js programmability, enabling developers and DevOps teams to leverage thousands of Node.js modules to easily create custom controls for application network traffic. Learn about some of LineRate's advanced scripting and functionality here, or try it out for free to see if LineRate is the right cloud-agnostic load balancing solution for your organization.

The Power of &: F5 Hybrid DNS solution
While some organizations prioritize the advantages of a SaaS solution, like scalability, others value the benefits of an on-premises solution, such as data control and migration flexibility. This is why having the option to deploy a hybrid model can be beneficial, not just for redundancy, but also for allowing organizations to blend the best of both worlds.

Understanding the Architecture's Components

F5 BIG-IP DNS (formerly BIG-IP GTM) is a well-known on-premises solution for delivering high-performance DNS services such as DNS Express and DNS Caching. It is also recognized for offering intelligent DNS responses based on various factors such as the LDNS' geolocation (GSLB) and the health status of applications.

F5 Distributed Cloud DNS (F5 XC DNS) is F5's SaaS-based DNS solution, built on a global data plane that ensures automatic scalability to meet high-volume demand. It also provides GSLB and security features such as DNS DoS protection.

In the diagram above, BIG-IP DNS is the hidden primary DNS, acting as the source of truth for DNS records. This setup ensures centralized control and adds an extra layer of security by reducing exposure to potential attacks. F5 XC DNS functions as the secondary DNS server, receiving DNS records from BIG-IP via zone transfer. It is responsible for handling public DNS queries and providing domain name resolution services to clients.

In the first part of this article, we will show you how to set up and configure BIG-IP DNS as the hidden primary and F5 XC DNS as the authoritative secondary DNS server. For some, this setup is sufficient for their requirements, but for others, there may be additional requirements to consider in this hybrid design. In the later part of this article, we will demonstrate how we can address these challenges by leveraging F5's platform features and capabilities!

Steps on Implementing the F5 Hybrid DNS Solution

Step 1: Configure BIG-IP DNS

First, we need to configure BIG-IP DNS to be able to perform a zone transfer to F5 XC DNS. For more details on the configuration, see: https://community.f5.com/kb/technicalarticles/configuring-big-ip-for-zone-transfer-and-dnssec/330359

Step 2: Configure F5 XC DNS

After configuring BIG-IP DNS, we need to configure F5 XC DNS to be a secondary DNS server. Follow the steps below:

1. Log into the XC Console, select the DNS Management option, and click Add Zone.
2. In the Domain Name field, enter the domain/subdomain. In our example, it will be f5sg.com.
3. Set Zone Type to Secondary DNS Configuration.
4. Under the Secondary DNS Configuration field, click Configure.
5. In the DNS primary server IP field, enter the public IP address of the primary DNS. In our example, this is the public IP of BIG-IP DNS.
6. For the TSIG key name, enter the name we used to generate the TSIG earlier in BIG-IP. In our example, we used example.
7. In the TSIG key algorithm field, select an algorithm from the drop-down. Select hmac-sha256.
8. Click Configure in the "TSIG key value in base 64 format" section. In the Secret Type field, select Clear Secret and paste the secret in the Secret field. Use the same secret we generated earlier in BIG-IP DNS.
9. Click Apply.

You should see the DNS records transferred from BIG-IP DNS to F5 XC DNS.

Step 3: Configure the Domain Registrar

In this example, the domain registrar I'm using is Namecheap. I'll configure it so that the authoritative name servers for the domain f5sg.com are set to F5 XC (ns1.f5clouddns.com and ns2.f5clouddns.com).
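Before cutting over at the registrar, the transfer path can be sanity-checked from the command line. A minimal check, assuming a hypothetical BIG-IP listener address of 203.0.113.10 and the TSIG key name example created in Step 1 (dig's -y flag takes algorithm:keyname:base64-secret):

# Pull the zone from BIG-IP the same way F5 XC does, authenticating with the TSIG key
dig @203.0.113.10 f5sg.com AXFR -y hmac-sha256:example:BASE64SECRET==

# Confirm the secondary has loaded the zone before delegating to it
dig @ns1.f5clouddns.com f5sg.com SOA +short

If the AXFR returns the full record set and the SOA serial reported by ns1.f5clouddns.com matches the one on BIG-IP, the zone is being served correctly and the registrar change is safe to make.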
The steps will vary depending on which domain registrar you are using; refer to your registrar's documentation. See the screenshots below for how I configured it in Namecheap.

F5 XC DNS should now be able to answer DNS queries since it is set to be the authoritative DNS. Now, let's do some testing! On my local machine, I will perform a dig on the f5sg.com domain. See below: You can see in the dig result that the NS records for f5sg.com are set to ns1.f5clouddns.com and ns2.f5clouddns.com! I can also resolve sales.f5sg.com! We have successfully implemented BIG-IP as the hidden primary and F5 XC as the authoritative secondary DNS!

Challenges and Considerations

Now let's discuss the additional requirements or challenges that we might encounter with this hybrid setup:

Security: We need to meet security compliance requirements. Nowadays, there are regulations requiring the implementation of DNSSEC (DNS Security Extensions). We need to consider this in the design and implement it without adding complexity.

Resiliency: Although the F5 XC DNS infrastructure is built to be resilient, we still want a backup plan to fail over to the BIG-IP primary DNS in case of unforeseen events. This process would otherwise be manual, as we would need to change the NS records at the registrar to promote the hidden BIG-IP primary DNS as the authoritative NS for the domain once F5 XC is unavailable.

Synchronization: BIG-IP will not be able to synchronize its GSLB functionality with F5 XC because Wide-IP records are non-standard and cannot be transferred as part of zone transfers.

Solutions to the Challenges

Now comes the fun part: tackling the challenges we've laid out! Fortunately, F5 Distributed Cloud is an API-first platform that enables us to automate configuration. At the same time, we have the power of the BIG-IP platform, where we can run custom scripts that integrate it with F5 XC through APIs.

Solution to Challenge #1: This is easy. DNSSEC records like RRSIG, DNSKEY, DS, NSEC, and NSEC3 are standardized and can be synchronized as part of a zone transfer. Since BIG-IP DNS is our primary DNS and supports DNSSEC, we can enable it. The records will synchronize to F5 XC DNS, which will still respond with signed records, maintaining the integrity and security of our DNS infrastructure. How do you enable it? Check the last part of the technical article below: https://community.f5.com/kb/technicalarticles/configuring-big-ip-for-zone-transfer-and-dnssec/330359

Solution to Challenge #2: We need to automate failover! When automating tasks, you need two things: a trigger and an action. In our scenario, the trigger should be the availability of F5 XC DNS to resolve DNS queries, and the action should be to change our nameservers to BIG-IP at the domain registrar. If you can create and run a script in BIG-IP, you can continuously monitor the health of F5 XC DNS, which gives us the trigger. But what about the action of changing the nameserver records at the registrar? That's easy: if the registrar can be configured via an API, the problem is solved! Let's explore using Namecheap as our registrar for this example. We will use a BIG-IP EAV (external) monitor to run the script. If you're unfamiliar with the BIG-IP external monitor and its capabilities, check this out: https://my.f5.com/manage/s/article/K71282813

A dummy pool configured with an external monitor will run at intervals. The attached script is designed to monitor F5 XC and check whether it can resolve DNS queries.
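To make the trigger-and-action flow concrete before walking through the setup, here is a minimal sketch of what such an EAV script might look like. It is illustrative only: the sentinel hostname, registrar credentials, and the Namecheap query parameters are assumptions for the example; the codeshare script referenced in Step 1 below is the working version.

#!/bin/bash
# Hedged sketch of a BIG-IP EAV monitor script (not the production script).
# BIG-IP runs this at each monitor interval; printing anything to stdout
# marks the monitored member UP, and printing nothing marks it DOWN.

NC_API="https://api.namecheap.com/xml.response"   # registrar API endpoint
NC_USER="myapiuser"                               # hypothetical credentials
NC_KEY="myapikey"
CLIENT_IP="198.51.100.20"                         # IP whitelisted with the registrar

# Trigger: can F5 XC still resolve the sentinel TXT record correctly?
RESULT=$(dig @ns1.f5clouddns.com healthcheck.f5sg.com TXT +short +time=2 +tries=2)

if echo "$RESULT" | grep -q "RESPONSE-OK"; then
    echo "UP"    # F5 XC is healthy: monitor passes, no failover action
else
    # Action: repoint the domain's nameservers at the BIG-IP GTMs
    # (namecheap.domains.dns.setCustom shown as an assumption; verify against registrar docs)
    URL="$NC_API?ApiUser=$NC_USER&ApiKey=$NC_KEY&UserName=$NC_USER&ClientIp=$CLIENT_IP"
    URL="$URL&Command=namecheap.domains.dns.setCustom&SLD=f5sg&TLD=com"
    URL="$URL&Nameservers=gtm1.f5sg.com,gtm2.f5sg.com"
    curl -s "$URL" > /dev/null
    # No stdout here, so the dummy pool is marked down, surfacing the event in BIG-IP
fi

The logic mirrors what the article describes: the monitor continuously checks whether F5 XC can resolve DNS queries.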
If it cannot, the script will trigger an API call to Namecheap (our domain registrar) to change the nameservers back to BIG-IP DNS. Simultaneously, the script will update the domain's NS records from F5 XC to BIG-IP.

Step 1: Create an external monitor using the custom script. Refer to article K71282813 for how to create the external monitor. See the codeshare link for the sample custom script I used: Namecheap and BIG-IP Integration via API | DevCentral

Step 2: Create a dummy pool and attach the custom external monitor.

Let's do some tests! See the results in the later part of this article.

Solution to Challenge #3: We can't use zone transfer to synchronize GSLB configurations? No problem! Instead, we'll harness the power of APIs. We can run a custom script in BIG-IP to convert Wide-IP configurations into F5 XC DNSLB records via API. Let's see below how we can do this.

On BIG-IP DNS, configure the zone records for the domain f5sg.com to delegate the subdomains needed for GSLB. For example, to perform GSLB for www.f5sg.com, we will configure the zone like below:

www.f5sg.com     CNAME   www.gslb.f5sg.com
gslb.f5sg.com    NS      ns1.f5clouddns.com

On BIG-IP we will create a Wide-IP configuration for www.gslb.f5sg.com, which should hold the A records. These Wide-IP configurations can be converted by a script into F5 XC DNSLB configurations. Check the sample script on this codeshare link: BIG-IP Wide-IP to F5XC DNSLB converter | DevCentral

Testing and Results

Challenge #2: Failover Testing

To simulate the scenario in which F5 XC is unable to respond to DNS queries, we designed the script to execute a dig command against F5 XC for a TXT record. If F5 XC responds with "RESPONSE-OK," no further action is needed. However, if it fails to respond correctly or does not respond at all, the script will trigger a failover action.

Scenario 1: F5 XC responds to DNS queries (TXT record value is RESPONSE-OK)

The Namecheap dashboard shows the F5 XC nameservers.
The BIG-IP DNS zone records show the F5 XC nameservers.
The F5 XC zone records show the F5 XC nameservers.
Performing a dig to resolve sales.f5sg.com shows that the F5 XC nameservers are authoritative.

Scenario 2: F5 XC doesn't respond to DNS queries (TXT record value is RESPONSE-NOT-OK)

We changed the TXT record value to 'RESPONSE-NOT-OK,' which should mark the monitor as down. The dummy pool went down, which means the script inside the monitor detected that the dig result was not what it expected. You can see from the zone records below that the NS records have now changed to the GTMs (gtm1.f5sg.com and gtm2.f5sg.com). When we check our domain registrar, Namecheap, we can see that the nameservers are now automatically set to the BIG-IP GTMs. When I issue a dig command from my workstation, I can see that the nameserver responding to my query is gtm1.f5sg.com. Online DNS tools (like MXToolbox) also report that gtm1.f5sg.com is the authoritative NS that responds to DNS queries for sales.f5sg.com, which resolves to 2.2.2.2. We have now solved one of the challenges by implementing a backup failover plan using custom monitors and automation, made possible by the power of BIG-IP and APIs!

Challenge #3: Synchronization Testing

Using this script, we can convert and synchronize the BIG-IP Wide-IP configuration to its F5 XC equivalent. Note: the sample script is limited to handling a Wide-IP with a single GTM pool. Inside the pool is where you will define the IP addresses that you want to load balance.
The pool load balancing method is also limited to Round Robin, Ratio, Static Persist, and Global Availability. The script is designed to run at intervals. There are several ways to execute it: you can use external monitors (as we did earlier), utilize a cronjob, etc. For testing and simplicity, I will use a cronjob set to run every 10 minutes.

Let's begin creating our GSLB configuration. If you've configured BIG-IP GTM/DNS before, one of the first objects you need to create is a GTM server. I've configured two Generic Servers representing the application in two different data centers. Next, we create a GTM Pool with which we will associate the Virtual Server inside the GTM server we created earlier (i.e., I'm assigning 1.1.1.1 and 2.2.2.2 as the members of the pool). Lastly, we will create the Wide-IP record and attach the GTM Pool we created earlier.

After this, the script should get triggered and convert this BIG-IP DNS Wide-IP configuration into an F5 XC DNS configuration. We should see that a new primary zone is created in F5 XC (gslb.f5sg.com). When we view the resource records, we should see a DNSLB record whose record name is equivalent to the subdomain of the Wide-IP record (the BIG-IP DNS Wide-IP record is www.gslb.f5sg.com; in the F5 XC DNS zone gslb.f5sg.com, the record name is www and it points to a DNSLB object). The load balancing rules should have the DNSLB pool (pool-www), which is the equivalent of the GTM Pool (pool_www) configured in BIG-IP DNS. The DNSLB pool members will include the same IP addresses we defined as GTM Pool members in BIG-IP DNS.

There are four load balancing methods available in F5 XC, each with an equivalent BIG-IP DNS load balancing method. The script was created to match these methods, but if you configure the BIG-IP DNS pool load balancing method to something other than these four, it will default to Round Robin.

BIG-IP DNS            F5 XC DNS
Round Robin           Round-Robin
Ratio                 Ratio-Member
Static Persist        Static-Persist
Global Availability   Priority

Based on the results above, we have successfully converted and synchronized the BIG-IP DNS Wide-IP configuration into F5 XC DNSLB records!

Conclusion

We have resolved DNS challenges using the power and integration of F5 solutions! By utilizing both the BIG-IP and F5 XC platforms, which can sign and serve DNSSEC records, we can seamlessly implement DNSSEC in a hybrid setup without complexity. Furthermore, the scalable F5 XC Cloud DNS will shield you from the myriad DNS DoS attacks that are continually evolving, especially with the rise of AI. In terms of DNS resiliency, with the power of our API-first platforms and automation, we can create a hybrid DNS solution capable of automatically failing over from cloud DNS to on-prem DNS. Lastly, we can synchronize the configurations of both platforms using standards like zone transfer and APIs. This capability allows us to convert and synchronize GSLB configurations between our on-prem DNS and cloud DNS, making administration easier and establishing a single source of truth.

All I want for Christmas is a hybrid cloud
**********

Dear Santa,

I hope you and the reindeer are as excited as I am about Christmas! Make sure you get plenty of rest between now and then, so you can deliver all those lovely presents all over the world on Christmas Eve. I've thought a lot about what I want for Christmas, and have finally made a decision. This Christmas all I want is a hybrid cloud. I've heard so much about them, and lots of people either already have them or are getting them soon, and I want to join the fun! I really want a hybrid cloud because of the benefits it brings: increased scalability, improved security, better resource allocation, better availability and resiliency, and much better cost-effectiveness. And who wouldn't want that! I hope I wake up on Christmas morning with a lovely new hybrid cloud infrastructure! I just hope you can fit it on your sleigh.

**********

I think Santa's got some work to do to get me a hybrid cloud infrastructure for Christmas! But the point made above about lots of people going down the hybrid route is true: the recent RightScale 2015 State of the Cloud Report revealed that 82% of businesses have a hybrid cloud strategy in place, up from 74% the year before. That's because businesses are attracted by the cost savings it can offer, as well as the other benefits mentioned in my letter to Santa above. But they also don't want to trade off control in order to realise these benefits; they want to maintain the same visibility, security and control of a traditional infrastructure.

The key to a good hybrid environment is the ability to unify all the hardware, software and managed services resources that make up both on-premises and cloud environments. This combination of physical and virtual resources makes it easier to transition workloads to the cloud as and when needed.

Now, cloud is regularly considered a security risk for organisations. And that's fair, to some extent: cloud computing and increased mobility mean enterprise perimeters have changed; applications are now accessed from a variety of environments and from different devices (both corporate and personal) and locations. But having a hybrid environment means security processes and protocols that apply on-premises can be extended to the cloud, protecting your users - and your data - wherever they are. This approach not only secures cloud-based, web-based, and virtual applications but also ensures high availability and reliable access.

Hybrid clouds are the best of both worlds: all the benefits of cloud computing without sacrificing security, flexibility, or cost savings. However, organisations must consider what it takes to ensure applications and services there are treated the same way as on-premises infrastructure, so they can embrace a hybrid cloud with the same confidence as they approach their own data centre. Thanks Santa, and have a great Christmas folks!

The Three Reasons Hybrid Clouds Will Dominate
In the short term, hybrid cloud is going to be the cloud computing model of choice. Amidst all the disconnect at CloudConnect regarding standards and where "cloud" is going was an undercurrent of adoption of what most have come to refer to as a "hybrid cloud computing" model. This model essentially "extends" the data center into "the cloud" and takes advantage of less expensive compute resources on demand. What's interesting is the granularity of "on-demand" in this use of cheaper compute: the time interval for which resources are utilized is measured more in project timelines than in minutes or even hours. Organizations need additional compute for lab and quality assurance efforts, for certification testing, and for production applications for which budget is limited. These are not snap decisions but rather methodically planned steps along the project management lifecycle. It is on-demand in the sense that it's "when the organization needs it", and in the sense that it's certainly faster than the traditional compute resource acquisition process, which can take weeks or even months.

Also mentioned more than once by multiple panelists and speakers was the notion of separating workloads such that corporate data remains in the local data center while presentation layers and GUIs move into the cloud computing environment for optimal use of available compute resources. This model works well and addresses issues with data security and privacy, a constant top concern in surveys and polls regarding inhibitors of cloud computing.

It's not just the talk at the conference that makes such a conclusion probabilistic. An Evans Data developer survey last year indicated that more than 60 percent of developers would be focusing on hybrid cloud computing in 2010:

Results of the Evans Data Cloud Development Survey, released Jan. 12, show that 61 percent of the more than 400 developers polled said some portion of their organizations' IT resources "will move to the public cloud within the next year," Evans Data said. "However, over 87 percent [of the developers] say half or less then half of their resources will move ... As a result, the hybrid cloud is set to dominate the coming IT landscape."

There are three reasons why this model will become the de facto standard strategy for leveraging cloud computing, at least in the short term and probably for longer than some pundits (and providers) hope.

Site-to-Site Connectivity in F5 Distributed Cloud Network Connect – Reference Architecture
Purpose

This guide describes the reference architecture for deploying F5 Distributed Cloud's (XC) Multicloud Network Connect service to interconnect workloads across private connectivity or the internet. It enumerates the options available to an F5 Distributed Cloud user to configure site-to-site connectivity using F5 XC Customer Edges (CEs) and explains them in detail to help the user make informed decisions and choose the correct topology for their use case.

Audience

This guide is for technical readers, including network admins and architects, who want to understand how the Multicloud Network Connect service works and what network topology they should use to interconnect their workloads across data centers, branches, and public clouds. It assumes the reader is familiar with networking concepts like routing, default gateway configurations, IPSec and SSL encryption, and private connectivity solutions provided by public clouds, such as AWS Direct Connect and Azure ExpressRoute.

Introduction

Ensuring workload reachability across data centers, branches, and/or public clouds can be challenging and operationally complex if done the traditional way. Network teams must design, configure, and maintain multiple networking and security appliances and need expertise across many vendor solutions that provide functionality like NAT, SD-WAN, VPN, firewalls, access control lists (ACLs), etc. This gets even more nuanced when connecting two networks that have overlapping IP address CIDRs, which is often the case in hybrid cloud deployments and during mergers and acquisitions. F5 XC Multicloud Network Connect provides a simple way to configure these interconnections and manage access and security policies across multiple heterogeneous environments, from a single console. It abstracts the complexities by taking the user's intent and automating the underlying networking and security, while providing the flexibility to connect over a private network or the public internet.

Customer Edge as a Gateway

To provide site-to-site reachability and ensure enforcement of the security policies, traffic must flow through the CE site. For this, the CE's Site Local Inside (SLI) IP address must be used as the default gateway or as the next hop to reach the networks on other sites.

Figure: Using the CE as the gateway

Physical vs. Logical Connectivity

Two or more CE sites can be physically connected in multiple ways. But this does not automatically make networks on different sites L3-routable to each other by default; for that, the user must associate networks with segments. The physical connection dictates the path packets take from one site to the other, while the logical connection connects the VLANs (on-prem) and the VPCs/VNETs (in the public cloud) using a network overlay and provides segmentation.

Note: Multicloud Network Connect provides Layer 3 connectivity between networks. Configuring L3 connectivity is not required to have app-to-app connectivity across the sites; that is done using the distributed load balancer feature under App Connect.

Physical Transit Options

Over F5 Global Network Backbone

A CE site is always connected to the two nearest Regional Edges (REs) for redundancy, using IPSec or SSL tunnels. The REs across different regions are connected via a global, private F5 backbone network. F5's global network backbone provides high-speed, private transit across the regions where Regional Edges are located.
Users can use the CE-RE tunnels over the internet to securely connect to this backbone locally and leverage the private connectivity to connect across regions.

Figure: Default CE-CE connectivity over REs and the F5 network backbone

Pros:
- No need to manage underlay networking if using CE-RE tunnels over the internet.
- High-speed private transit between geographically distant regions, at no extra cost.
- End-to-end encryption of traffic between sites.
- Option to have end-to-end private connectivity.

Cons:
- Throughput is limited to the bandwidth of the two tunnels per site.

When To Use:
- The easiest way to connect when private connectivity is not available between data centers, or to the cloud VPCs in the case of hybrid cloud.
- IPSec/SSL tunnels over the internet are acceptable, but you do not want to manage multiple VPN tunnels or SD-WAN devices.
- Connecting geographically distant sites where you need better end-to-end latency and reliability than going over the internet.

Direct Site-to-Site Over Internet

If security regulations prevent the use of F5's private backbone network, users can connect the CE sites directly to each other using IPSec tunnels (SSL encryption is not supported in this case). This is done using the Site Mesh Group (SMG) feature.

Note: Even when the CEs are part of a Site Mesh Group, they still connect to the REs using encrypted tunnels, as this is required for control plane connectivity.

When the sites are in an SMG, data path traffic flows through the CE-CE tunnels as the preferred path. If this link fails, the traffic gets routed to the REs and over the F5 network backbone to the other site as a backup. The number of tunnels on each link between two CE sites depends on the number of control nodes they have. Two single-node sites are connected using only one tunnel. If either site has three control nodes, it forms three tunnels to the other site.

Figure: Number of tunnels between sites in an SMG

Pros:
- Sites are directly connected, so the data path is not dependent on the RE.
- Easy connectivity over the internet.
- Traffic is always encrypted in transit.
- Eliminates the need to manually configure cross-site VPNs or SD-WAN.

Cons:
- Encryption and decryption require more CPU resources on CE nodes as performance requirements increase.

Note: L3 Mode Enhanced Performance can be enabled on the sites to get more performance from the available CPU and memory resources, but it should only be enabled when the site is used solely for L3 connectivity, as it reduces the resources available for L7 features.

Direct Site-to-Site Over Customer's Network Backbone

The CE-CE tunnels can also be connected over a private network if end-to-end private connectivity is required. Customers can leverage their existing private connectivity between data centers, provisioned using private NaaS providers like Equinix. The sites can either be connected directly using an SMG, where the connections are encrypted, or using the DC Cluster Group (DCG) feature, which connects the sites using IP-in-IP tunnels (no encryption). The number of tunnels on each link between two CE sites depends on the number of control nodes they have: two single-node sites are connected using only one tunnel, and if either site has three control nodes, it forms three tunnels to the other site. A DCG will give better performance, while an SMG is more secure. Unlike an SMG, a DCG does not fall back to routing traffic over the RE-CE tunnels if the private connectivity fails.

Pros:
- Data path is confined within the customer's private perimeter.
- Sites are directly connected, so the data path is not dependent on the RE.
- Option to choose encrypted or unencrypted transit.
- Simplifies the ACLs on the physical network and allows users to manage segmentation from the F5 XC console.

Cons:
- The customer needs to manage the private connectivity across data centers or from the data center to the public cloud.

Direct Site-to-Site Connectivity Topologies

For direct site-to-site connectivity, sites can be grouped into a Site Mesh Group (SMG) or a DC Cluster Group (DCG). These allow sites to connect in either Full Mesh or Hub-Spoke topologies, as described below.

Full Mesh Site Mesh Group

All sites that are part of a full-mesh SMG are connected to every other site using IPSec tunnels, forming a full-mesh topology.

Figure: Sites in a full mesh Site Mesh Group

When To Use:
- In hybrid cloud use cases.
- When all sites have equal functionality (e.g., connecting workloads across data centers).
- When high fault tolerance is required for site-to-site connectivity (not dependent on any one site for transit).

Hub-Spoke Site Mesh Group

This mode allows the sites to be grouped into a hub SMG and a spoke SMG. The sites within the hub SMG are connected using a full-mesh topology. The sites within the spoke SMG are connected to the sites in the hub SMG only, and not to other sites in the spoke SMG.

Figure: Sites in a Hub-Spoke Site Mesh Group

Some characteristics of a Hub-Spoke SMG:
- The hub can have multiple sites for redundancy, but in most customer use cases it has one site.
- A hub site can be a spoke site for a different Hub-Spoke SMG.
- A CE site can be a spoke for multiple hubs.

When To Use:
- For data center/cloud to edge/branch connectivity use cases.

Full Mesh DC Cluster Group

A DC Cluster Group only supports a full-mesh topology. Every site in a DCG is connected to every other site using IP-in-IP tunnels. Traffic is not encrypted in transit, but a DCG is only supported when sites can be connected over a private network.

Figure: Sites in a DC Cluster Group

When To Use:
- Connecting VLANs in a data center to VLANs in other data centers or to public cloud VPCs/VNETs, when there is private connectivity between them.
- When security regulations allow unencrypted traffic over the private transit.

Offline Survivability

CE sites require control plane connectivity to the REs and the Global Controller (GC) to exchange routes, renew certificates, and decrypt blindfolded secrets. To enable business continuity during an upstream outage, the Offline Survivability feature can be enabled on all sites in a Full Mesh SMG or DCG. The feature is not supported for Hub-Spoke SMGs. With this feature enabled, the sites can continue normal operations for 7 days without connecting to the REs and the GC.

With offline survivability enabled on a CE site, the local control plane becomes the certificate authority in case of connectivity loss, and the decrypted secrets and certificates are cached locally on the CE. For this reason, the feature is not turned on by default, allowing the user to decide whether enabling it aligns with the company's security regulations.

Logical Connectivity

Once the physical transit is configured and the connection topology is chosen, the workloads on the networks across the sites can be connected using segments, or the applications on one site can be delivered to any other site by configuring a distributed load balancer.

Connect Networks - Segmentation

Users can create segments and add data center VLANs or public cloud VPCs/VNETs to them.
All networks added to a segment become part of a common routing domain, and all workloads on these networks can reach each other using the CE sites as gateways. Users must ensure the networks added to a segment do not overlap.

Segments are isolated Layer 3 domains, so workloads on one segment cannot access workloads on other segments by default. However, users can configure Segment Connectors to allow traffic from one segment to another.

Figure: Segmentation

Connect Applications - Distributed Load Balancing

Instead of allowing the workloads to route directly to one another, the user can configure a distributed load balancer to publish a service from one site to other sites. This is done by adding the service endpoints to an origin pool of a load balancer object and advertising it using a custom VIP to one or more other sites. This allows the client to connect to the service as a local resource. Using distributed load balancing, an LB admin can configure policies to expose only the required APIs of the application, and only to the required sites. This reduces the attack surface and increases app security.

Figure: Distributed Load Balancing

For example, in the figure above, the on-prem database is advertised to client apps on AWS and Azure, which can access the DB using their local VIP, while the on-prem application is advertised to the client on Azure only.

Decision Flow to Choose Physical Connectivity Options

Once you understand the various physical and logical connectivity options, the chart below can help you make an informed decision based on the connectivity requirements, the infrastructure/platform available, and security restrictions. Once the connectivity is decided, you can choose to connect the networks or only publish apps to the sites where they are required, based on the application requirements.

Related Articles

F5XC Site
Site Mesh Group
DC Cluster Group
Segmentation
Distributed Load Balancer

Simplify Network Segmentation for Hybrid Cloud
Introduction

Enterprises have always had the need to maintain separate development and production environments. Operational efficiency, reduction of blast radius, security, and compliance are generally the common objectives behind separating these environments. By dividing networks into smaller, isolated segments, organizations can enhance security, optimize performance, and ensure regulatory compliance. This article demonstrates a practical strategy for implementing network segmentation in modern multicloud environments that also connect on-prem infrastructure. It uses F5 Distributed Cloud (F5 XC) services to connect and secure network segments in cloud environments like Amazon Web Services (AWS) and in on-prem datacenters.

Need for Segmentation

Network segmentation is critical for managing complex enterprise environments. Traditional methods like Virtual Routing and Forwarding (VRF) and Multiprotocol Label Switching (MPLS) have long been used to create isolated network segments in on-prem setups. F5 XC ensures segmentation in environments like AWS, and it can extend the same segmentation to on-prem environments. These techniques separate traffic, enhance security, and improve network management by preventing unauthorized access and minimizing the attack surface.

Scenario Overview

Our scenario depicts an enterprise with three different environments (prod, dev, and shared services) extended between on-prem and cloud. A 3rd-party entity requires access to a subset of the enterprise's services. This article covers the following two network segmentation use cases:

- Hybrid Cloud Transit
- Extranet (servicing external 3rd-party partners/customers)

Hybrid Cloud Transit

Consider an enterprise with three distinct environments: Production (Prod), Development (Dev), and Shared Services. Each environment requires strict isolation to ensure security and performance. Using F5 XC Cloud Connect, we can assign each VPC a network segment, effectively isolating the VPCs. Segments in multiple locations (or VPCs) can traverse F5 XC to reach distant locations, whether in another cloud environment or on-prem. Network segments are isolated by default; for example, our Prod segment cannot access Shared. A segment connector is needed to allow traffic between Prod and Shared. The following diagram shows the VPC segments, ensuring complete "ships in the night" isolation between environments.

In this setup, the Prod, Dev, and Shared Services environments operate independently and are completely isolated from one another at the control plane level. This ensures that any issues or attacks in one environment do not affect the others.

Customer Requirement: Shared Services Access

Many enterprises deploy common services across their organization to support internal workloads and applications. Some examples include DHCP, DNS, NTP, and NFS: services that need to be accessible to both the Prod and Dev environments while keeping Prod and Dev separate from each other. A Segment Connector is a method to allow communication between two isolated segments by leaking the routes between the source and destination segments. It is important to note that a segment connector can be of type Direct or SNAT. Direct allows bidirectional communication between segments, whereas the SNAT option allows unidirectional communication from the source to the destination.

Extending Segmentation to On-Premises

Enterprises already use segmented networks within their on-premises infrastructure.
Extending this segmentation to AWS involves creating similar isolated segments in the cloud and establishing secure communication channels. F5 XC allows you to easily extend this segmentation from on-prem to the cloud regardless of the underlay technology. In this scenario, communication between the on-premises Prod segment and its cloud counterpart is seamless, and the same applies to the Dev segment. Meanwhile, Dev and Prod stay separate, ensuring that existing security and isolation are preserved across the hybrid environment.

Extranet

In this scenario an external entity (customer/partner) needs access to a few applications within our Prod segment. There are two different ways to enable this access: Network-centric and App-centric. Let's refer to the external entity as Company B. In order to connect Company B we generally need appropriate cloud credentials, but Company B will not share their cloud credentials with us. To solve this problem, F5 XC recommends using the AWS STS:AssumeRole functionality, whereby Company B creates an AWS IAM role that trusts F5 XC with the minimum privileges necessary to configure Transit Gateway (TGW) attachments and TGW route table entries to extend access to the F5 XC network or network segments.

Section 1 – Network-centric Extranet

Many times, partners and customers need to access a unique subset of your enterprise's applications. This can be achieved with F5 XC's dedicated network segments and segment connectors. With a segment connector for the external and Prod network segments, we can give Company B access to the required HTTP service without granting broader access to other non-Prod segments.

Locking Down with Firewall Policies

We can implement a Zero Trust firewall policy to lock down access from the external segment. By refining these policies, we ensure that third-party consumers can only access the services they are authorized to use. Our firewall policy on the CE only allows access from the external segment to the intended application on TCP/80 in Prod.

[ec2-user@ip-10-150-10-146 ~]$ curl --head 10.1.10.100
HTTP/1.1 200 OK
Server: nginx/1.24.0 (Ubuntu)
Date: Thu, 30 May 2024 20:50:30 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Wed, 22 May 2024 21:35:11 GMT
Connection: keep-alive
ETag: "664e650f-267"
Accept-Ranges: bytes

[ec2-user@ip-10-150-10-146 ~]$ ping -O 10.1.10.100
PING 10.1.10.100 (10.1.10.100) 56(84) bytes of data.
no answer yet for icmp_seq=1
no answer yet for icmp_seq=2
no answer yet for icmp_seq=3
^C
--- 10.1.10.100 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3153ms

After applying the new policies, we confirm that third-party access is restricted to the intended services only (HTTP responds while ICMP is dropped), enhancing security and compliance. This demonstrates how F5 Distributed Cloud services enable network segmentation across on-prem and cloud environments, with granular control over the security policies applied between the segments.

Section 2 – App-centric Extranet

In the scenario above, Company B can directly access one or more services in Prod with a segment connector, and we've locked it down with a firewall policy. With the App-centric method, we'll instead publish only the intended services that live in Prod to the external segment. App-centric connectivity is made possible without a segment connector by using load balancers within App Connect that target the application within the Prod segment and advertise its VIP address to the external segment.
The following illustration shows how to configure each component in the load balancer.

Visualization of Traffic Flows

The visualization flow analysis tool in the F5 XC Console shows traffic flows between the connected environments. By analyzing these flows, particularly between third-party consumers and the Prod environment, we can identify any unintended access or overreach. The following diagram shows a Network-centric connection flow:

The following diagram shows an App-centric connection flow using the load balancer:

Product Feature Demo

Conclusion

Effective network segmentation is a cornerstone of secure and efficient cloud environments. We've discussed how F5 XC enables hybrid cloud transit and extranet communication, and how extranet access can be delivered with either a Network-centric or an App-centric deployment. F5 XC is an end-to-end platform that manages and orchestrates segmentation and security in hybrid cloud environments. Enterprises can achieve comprehensive segmentation, ensuring isolation, secure access, and compliance. The strategies and examples provided demonstrate how to implement and manage segmentation across hybrid environments, catering to diverse requirements and enhancing overall network security.

Additional Resources

More features and guidance are provided in the comprehensive guide below, which shows exactly how you can use the power and flexibility of F5 Distributed Cloud and Cloud Connect to deliver a Network-centric approach with a firewall and an App-centric approach with a load balancer. Create and manage segmented networks in your own cloud and on-prem environments, and achieve the following benefits:

- Isolate environments within AWS
- Extend segmentation to on-prem environments
- Connect external partners or customers to a specific segment
- Use Enhanced Firewall Policies to limit access and reduce the blast radius
- Meet compliance and regulatory requirements by isolating sensitive data and systems
- Visualize and monitor the traffic flows and policies across segments and network domains

Workflow Guide - Secure Network Fabric (Multi-Cloud Networking)
YouTube: Using network segmentation for hybrid-cloud and extranet with F5 Distributed Cloud Services
DevCentral: Secure Multicloud Networking Article Series
GitHub: S-MCN Use-case Playbooks (Console, Automation) for F5 Distributed Cloud Customers
F5.com: Product Information
Product Documentation: Network Segmentation, Cloud Connect, Network Segment Connectors, App Security, App Networking, CE Site Management

The Dynamic Data Center: Cloud's Overlooked Little Brother
It may be heresy, but not every organization needs or desires all the benefits of cloud. There are multiple trends putting pressure on IT today to radically change the way it operates. From SDN to cloud, the market pressure on organizations to adopt new technological models or utterly fail is immense. That's not to say that new technological models aren't valuable or won't fulfill promises to add value, but it is to say that the market often overestimates the urgency with which organizations must view emerging technology. Mired in its own importance and benefits, the market also often overlooks the fact that not every organization has the same needs, goals, or business drivers. After all, everyone wants to reduce their costs and simplify provisioning processes! And yet goals can often be met through the application of other technologies that carry less risk, which is another factor in the overall enterprise adoption formula – and one that's often overlooked.

DYNAMIC DATA CENTER versus cloud computing

There are two models competing for data center attention today: the dynamic data center and cloud computing. They are closely related, and both promise similar benefits, with cloud computing offering "above and beyond" benefits that may or may not be needed or desired by organizations in search of efficiency. The dynamic data center originates with the same premises that drive cloud computing: the static, inflexible data center models of the past inhibit growth, promote inefficiency, and are fraught with operational risk. Both seek to address these issues with more flexible, dynamic models of provisioning, scale, and application deployment.

The differences are actually quite subtle. The dynamic data center is focused on the NOC and administration, enabling elasticity and shared infrastructure services that improve efficiency and decrease time to market. Cloud computing, even private cloud, is focused on the tenant, enabling self-service capabilities for them across the entire application deployment lifecycle. A dynamic data center is able to rapidly respond to events because it is integrated and automated to enable responsiveness. Cloud computing is able to rapidly respond to events because it necessarily must provide entry points into the processes that drive elasticity and provisioning to enable the self-service aspects that have become the hallmark of cloud computing.

DATA CENTER TRANSFORMATION: PHASE 4

You may recall the cloud maturity model, comprising five distinct steps of maturation from initial virtualization efforts through a fully cloud-enabled infrastructure. A highly virtualized data center, managed via one of the many available automation and orchestration frameworks, may be considered a dynamic data center. When the operational processes codified by those frameworks are made available as services to consumers (business and developers) within the organization, the model moves from dynamic data center to private cloud. This is where the dynamic data center fits in the overall transformational model.

The thing is that some organizations may never desire or need to continue beyond phase 4, the dynamic data center. While cloud computing certainly brings additional benefits to the table, these may be benefits that, when evaluated against the risks and costs to implement (or adopt, if it's public), simply do not measure up. And that's okay.
These organizations are not some sort of technological pariah because they choose not to embark on a journey toward a destination that does not, in their estimation, offer the value necessary to compel an investment. Their business will not, as is too often predicted with an overabundance of hyperbole, disappear, nor is it in danger of being eclipsed by more agile, younger versions who take to cloud like ducks take to water. If you're not sure about that, consider this employment ad from the most profitable insurance company in 2012, United Health Group – also #22 on the Fortune 500 list – which lists among its requirements "3+ years of COBOL programming." Nuff said.

Referenced blogs & articles:
Is Your Glass of Cloud Half-Empty or Half-Full?
Fortune 500 Snapshot: United Health Group
Hybrid Architectures Do Not Require Private Cloud

Cloud Bursting: Gateway Drug for Hybrid Cloud
The first hit's cheap, kid ...

Recently Ben Kepes started a very interesting discussion on cloud bursting by asking whether or not it was real. This led to Christofer Hoff pointing out that "true" cloud bursting requires routing based on business parameters. That needs to be extended to operational parameters, but in general, Hoff's on the mark, in my opinion.

The core of the issue with cloud bursting, however, is not that requests must be magically routed to the cloud in an overflow situation (that seems to be universally accepted as part of the definition), but the presumption that the content must also be dynamically pushed to the cloud as part of the process, i.e. live migration. If we accept that presumption, then cloud bursting is nowhere near reality. Not because live migration can't be done, but because the time it requires prohibits a successful "just in time" bursting approach. There is already a requirement that provisioning of resources in the cloud, in preparation for a bursting event, happen well before the event; it's a predictive, proactive process, not a reactionary one. The inclusion of live migration as part of the process would likely result in false provisioning events (where content is migrated prematurely based on historical trending which fails to continue and therefore does not result in an overflow situation).

So this leaves us with cloud bursting as a viable architectural solution for on-demand scale only if we pre-position content in the cloud, on the assumption that provisioning alone is a less time-intensive process than migration plus provisioning. This results in a more permanent, hybrid cloud architecture.

THE ROAD to HYBRID

The constraints on the network today force organizations who wish to address their seasonal or periodic need for "overflow" capacity to pre-position the content in demand at a cloud provider. This isn't as simple as dropping a virtual machine in EC2; it also requires DNS modifications and the implementation of the policy that will ultimately trigger the routing to the cloud campus. Equally important – actually, perhaps more important – is having the process in place that will actually provision the application at the cloud campus. In other words, the organization is building out the foundation for a hybrid cloud architecture. But in terms of real usage, the cloud-deployed resources may only be used when overflow capacity is required, so they are only used periodically. As the user base grows, though, so does the need for that capacity, and organizations will see those resources provisioned more and more often, until they're virtually always on. There's obviously an inflection point at which the use of cloud-based resources moves out of the realm of "overflow capacity" and into the realm of "capacity", period. At that point, the organization is in possession of a full, hybrid cloud implementation.

LIMITATIONS IMPOSE the MODEL

Some might argue – and I'd almost certainly concede the point – that a cloud bursting model that requires pre-positioning in the first place is a hybrid cloud model and not the original intent of cloud bursting. The only substantive counterargument I could offer is that cloud bursting focuses more on the use of the resources than on the model by which they are used. It's the on-again, off-again nature of the resources deployed at the cloud campus that makes it cloud bursting, not the underlying model.
Regardless, existing limitations on bandwidth force the organization's hand; there's virtually no way to avoid implementing what is a foundation for hybrid cloud as a means to execute on a cloud bursting strategy (which is probably a more accurate description of the concept than tying it to a technical implementation, but I'm getting off on a tangent now). The decision to embark on a cloud bursting initiative, therefore, should be made with the foresight that it requires essentially the same effort and investment as a hybrid cloud strategy (the rough cost sketch below illustrates why "overflow" resources trend toward always-on). Recognizing that up front enables a broader set of options for using those cloud campus resources, particularly the ability to leverage them as true "utility" computing rather than as an application-specific (i.e. dedicated) set of resources. Because of the requirement to integrate and automate to achieve either model, organizations can architect both with an eye toward future integration needs – such as those surrounding identity management, which continues to balloon as a source of concern for those focusing on SaaS and PaaS integration. Whether or not we'll solve the issues with live migration as a barrier to "true" cloud bursting remains to be seen. As we've never managed to adequately solve the database replication issue (aside from accepting eventual consistency as reality), however, it seems likely that a "true" cloud bursting implementation may never be possible for organizations that aren't mainlining the Internet backbone.
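As a back-of-the-envelope illustration of that inflection point, consider when periodically provisioned "overflow" resources become cheaper to simply run all the time. All prices and hours below are invented purely for illustration:

```python
# Illustrative arithmetic only: when does "overflow capacity" become
# plain "capacity"? All prices and hours are invented for illustration.
ON_DEMAND_PER_HOUR = 0.50  # burst instance, paid only while active
RESERVED_PER_HOUR = 0.20   # effective hourly rate of an always-on instance

def always_on_is_cheaper(active_hours_per_month, hours_in_month=730):
    burst_cost = active_hours_per_month * ON_DEMAND_PER_HOUR
    always_on_cost = hours_in_month * RESERVED_PER_HOUR
    return always_on_cost < burst_cost

# Past roughly 292 active hours per month (a 40% duty cycle under these
# made-up rates), keeping the capacity always on is the cheaper model.
for hours in (100, 300, 500):
    print(hours, always_on_is_cheaper(hours))
```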
The Conspecific Hybrid Cloud

Operational consistency and control continue to be driving forces in hybrid cloud architectures. When you're looking to add new tank mates to an existing aquarium ecosystem, one of the concerns you must have is whether a particular breed of fish is amenable to conspecific cohabitants. Many species are not, which means if you put them together in a confined space, they're going to fight. Viciously. To the death. Responsible aquarists try to avoid such situations, so careful attention to the conspecificity of animals is a must. Now, while in many respects the data center ecosystem correlates well to an aquarium ecosystem, in this case it does not. Heterogeneity is what you usually get in the data center today, but it's not actually the best model. That's because what you want in the data center ecosystem – particularly when it extends to include public cloud computing resources – is conspecificity in infrastructure. This desire and practice is being seen both in enterprise data center decision making and in startups suddenly dealing with massive growth and increasingly encountering performance bottlenecks that IT has no ability to resolve.

OPERATIONAL CONSISTENCY

One of the biggest negatives of a hybrid architectural approach to cloud computing is the lack of operational consistency. While enterprise systems may be unified and managed via a common platform, resources and delivery services in the cloud are managed using very different systems and interfaces. This poses a challenge for all of IT, but is a particular impediment to those responsible for devops – for integrating and automating provisioning of the application delivery services required to support applications. It requires diverse sets of skills – often those peculiar to developers, such as programming and standards knowledge (SOAP, XML) – as well as those traditionally found in the data center.

"We own the base, rent the spike. We want a hybrid operation. We love knowing that shock absorber is there." – Allan Leinwand, Zynga's Infrastructure CTO

"Other bottlenecks were found in the networks to storage systems, Internet traffic moving through Web servers, firewalls' ability to process the streams of traffic, and load balancers' ability to keep up with constantly shifting demand. Zynga uses Citrix Systems CloudStack as its virtual machine management interface superimposed on all zCloud VMs, regardless of whether they're in the public cloud or private cloud." – Inside Zynga's Big Move To Private Cloud, by InformationWeek's Charles Babcock

This operational inconsistency also poses a challenge for the codification of policies across the security, performance, and availability spectrum, as diverse systems often require very different methods of encapsulating policies. Amazon security groups are not easily codified in enterprise-class systems, and vice versa. Similarly, the options available to distribute load across instances to achieve availability and performance goals are impeded by a lack of consistent support for algorithms across load balancing services, as well as by differences in visibility and health monitoring that prevent a cohesive set of operational policies from governing the overall architecture. Thus, if hybrid cloud is to become the architectural model of choice, it becomes necessary to unify operations across all environments – whether public or enterprise.
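As a concrete illustration of how quickly policy codification diverges, here is a small sketch that expresses a single load balancing policy once and translates it for two environments. Every class, field, and call below is a hypothetical stand-in rather than a real vendor SDK; the point is simply that the cloud side may not even support the algorithm the enterprise side relies on.

```python
# Illustrative sketch only: one availability policy, two very different
# provider encodings. All names here are hypothetical stand-ins,
# not real vendor SDKs or APIs.
from dataclasses import dataclass

@dataclass
class LbPolicy:
    algorithm: str          # e.g. "least-connections"
    health_check_path: str  # e.g. "/healthz"
    interval_s: int

def to_enterprise_config(p: LbPolicy) -> dict:
    # Enterprise-class systems often express the full policy declaratively.
    return {"lb-method": p.algorithm,
            "monitor": {"send": f"GET {p.health_check_path}",
                        "interval": p.interval_s}}

def to_cloud_config(p: LbPolicy) -> dict:
    # A public cloud service may expose only a subset of algorithms,
    # which is exactly the inconsistency described above.
    supported = {"round-robin", "least-connections"}
    if p.algorithm not in supported:
        raise ValueError(f"{p.algorithm} unsupported here; policy diverges")
    return {"Policy": p.algorithm,
            "HealthCheck": {"Target": f"HTTP:80{p.health_check_path}",
                            "Interval": p.interval_s}}

policy = LbPolicy("least-connections", "/healthz", 15)
print(to_enterprise_config(policy))
print(to_cloud_config(policy))
```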
UNIFIED OPERATIONS

We are seeing this demand more and more as enterprise organizations seek out ways to integrate cloud-based resources into existing architectures to support a variety of business needs – disaster recovery, business continuity, and spikes in application demand. What customers are demanding is a unified approach to integrating those resources, which means infrastructure providers must be able to offer solutions that can be deployed both in a traditional enterprise-class model and in a public cloud environment. This is also true for organizations that may have started in the cloud but are now moving to a hybrid model in order to seize control of the infrastructure as a means to address performance bottlenecks that simply cannot be addressed by cloud providers, due to the innate nature of a shared model.

"This ability to invoke and coordinate both private and public clouds is 'the hidden jewel' of Zynga's success, says Allan Leinwand, CTO of infrastructure engineering at the company." – Lessons From FarmVille: How Zynga Uses The Cloud

While much is made of Zynga's "reverse cloud-bursting" business model, what seems to be grossly overlooked is the conspecificity of infrastructure required to move seamlessly between the two worlds. Whether at the virtualization layer or at the delivery infrastructure layer, a consistent model of operations is a must to transparently take advantage of the business benefits inherent in a cross-environment, aka hybrid, cloud deployment model. As organizations converge on a hybrid model, they will continue to recognize the need for, and advantages of, an operationally consistent model – and they are demanding that it be supported. Whether it's Zynga imposing CloudStack on its own infrastructure to maintain compatibility and consistency with its public cloud deployments, or enterprise IT requiring public-cloud-deployable equivalents of traditional enterprise-class solutions, the message is clear: operational consistency is a must when it comes to infrastructure.

H/T @Archimedius, "The Hybrid Cloud is the Future of IT Infrastructure"

All Your Packets Are Belong to … You?
Yes, even the ones over there, in that there cloud, can be yours.

No one argues that networks haven't exploded in terms of speeds and feeds in the past decade. What with more consumers (and cows), more companies going "online", and more content, it'd be hard to argue that there's less traffic out there today than there was even a mere four or five years ago. The increasing pressure put on the network is often mentioned almost in passing, as though merely moving from 10Gbps to 40Gbps to 100Gbps will solve the problem. Move along now, nothing to see here but a higher flow of packets. But it's that higher density of packets, along with greater diversity of content and distribution through cloud computing, that's creating other issues for network services whose purpose it is to collect, analyze, and act upon those packets. IDS, IPS, secure web gateways, voice analyzers, honeypots. There are myriad network infrastructure devices tasked with analyzing the content of packets flowing in and out of the data center, and they find it more and more difficult to scale along with the rapid growth of data on the network. Application Performance Monitoring (APM) systems, as well, often take advantage of port mirroring as a way to collect and analyze intra-system traffic to pinpoint configuration or network issues that may cause performance degradation. These systems need one thing: all your (relevant) packets. The problem is that on most switches you can designate only a couple of ports as egress span ports, while you may have three, four, or more devices and systems that need those packets. And Heaven forbid you have a desperate need to later tap into the switch to troubleshoot an urgent issue. The answer in the past has been highly complex network topologies that are difficult to maintain and not easy to extend when the next system needing all your packets is deployed. Additionally, cloud-deployed applications and systems are not easily included, even though organizations desire the same level of visibility into and analysis of those packets as is found in the data center. One answer to these issues is found in what Gartner is calling Network Packet Brokers. One such provider in this space is VSS Monitoring, which recently introduced a new set of solutions to resolve this lack of visibility both in the data center and within the cloud.

VSS MONITORING

VSS Monitoring has been around since 2006, shipping aggregation and related management products. Now it has introduced several new products that assist in the goal of collecting packets across the increasingly cloudy landscape and getting them to the right place at the right time, a market being referred to as "Network Packet Brokers" (NPB). Gartner analysts describe these solutions as consisting of "devices that facilitate monitoring and security technologies to see the traffic which is required for those solutions to work more effectively"; they could be called "monitoring switches" or "matrix switches" (Application Aware Network Performance Monitoring (NPM) and Network Packet Broker (NPB) research). NPB solutions must be able to perform many-to-many port mapping using a GUI or CLI, filter packets at L2-4, and perform packet slicing and deduplication as well as aggregation and intelligent distribution. This last criterion is an important one, as it allows operators to filter out noise when directing packets, reducing the requirement that analyzers and systems process (and ultimately discard) irrelevant traffic.
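To make the many-to-many mapping and intelligent distribution requirements concrete, here is a minimal sketch of the rule evaluation at the heart of a packet broker. The rule format, port names, and packet fields are invented for illustration and do not represent any vendor's configuration model:

```python
# Illustrative sketch only: many-to-many port mapping with L2-4
# filtering, as a network packet broker performs it. Rule format,
# port names, and packet fields are invented for illustration.

RULES = [
    # (ingress span port, match predicate, egress tool ports)
    ("span1", lambda pkt: pkt["proto"] == "udp" and pkt["dport"] == 5060,
     ["voice-analyzer"]),
    ("span1", lambda pkt: pkt["dport"] in (80, 443), ["ids", "apm"]),
    ("span2", lambda pkt: True, ["honeypot"]),
]

def broker(ingress_port, pkt):
    """Return the tool ports that should receive this packet.

    Packets matching no rule are dropped, so each analyzer sees only
    actionable traffic instead of inspecting and discarding noise."""
    out = []
    for port, match, egress in RULES:
        if port == ingress_port and match(pkt):
            out.extend(egress)
    return sorted(set(out))  # deduplicate: each tool gets one copy

# A SIP packet reaches only the voice analyzer; web traffic never does.
print(broker("span1", {"proto": "udp", "dport": 5060}))  # ['voice-analyzer']
print(broker("span1", {"proto": "tcp", "dport": 443}))   # ['apm', 'ids']
```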
VSS Monitoring has introduced a set of solutions that meet (and in some cases exceed) the requirements laid out by Gartner (VSS supports L2-7 filtering) and that further expand the scope of such solutions into cloud computing environments:

New packet broker appliances – vBrokers™
Expanded system-level scalability – vMesh™
Topology-level unified management console – vMC™

VSS achieves this inter-cloud monitoring capability by leveraging vMesh, a proprietary L2 bi-directional protocol for its interconnects. Its vBrokers are purpose-built appliances that can interconnect with one another using vMesh to form a virtual network tool optimization fabric. These vBrokers can be deployed across LAN and WAN segments and in a wide variety of cloud network infrastructure environments, with the vMesh architecture effectively forming an overlay network over which packets are shared. From there, it's a matter of dragging and dropping policies and configuration via the vMC unified management console to access network packets on demand and properly direct them based on organizational needs. VSS' new vMesh technology can scale out to 256 devices and upwards of 10,000 ports. VSS also provides an Open XML API that encourages integration; configuration, remote management, metrics, and more can be achieved via this API. VSS solutions today are not supported by common provisioning and automation frameworks (Chef, Puppet, OpenStack), although that may very well be supported in the future. Still, the ability to reach out into the cloud and direct packets to DC-hosted infrastructure services providing analysis, security, or other functions solves a major issue with managing cloud-deployed applications: visibility.

SDN versus NETWORK PACKET BROKERS

At first read, this sounds a lot like a suggested SDN (Software-Defined Networking) use case (found on SDN Central) that posits the use of OpenFlow as a virtual patch panel. However, on deeper inspection there are some distinct differences between the two solutions. While both are focused on solving what is essentially a port forwarding problem (port spanning is really just a case of directing ingress packets on one port to more than one egress port), SDN is (today) the more disruptive solution, both in the enterprise and in the cloud. While it's true that with both solutions you need some means to direct ingress packets to the desired egress port, VSS' solution does not require that the switches in question be OpenFlow-enabled (which may be problematic in cloud environments). Additionally, the forwarding mechanism available with OpenFlow is simple forwarding – packet in, packet out. While a more sophisticated forwarding algorithm could certainly be employed, this would require specific code. VSS, on the other hand, enables intelligent forwarding of actionable packets, reducing the amount of irrelevant traffic any given infrastructure solution might need to process. Voice analyzers, for example, need only see VoIP, SIP, and related traffic. Such a system doesn't need to inspect a JSON exchange, nor will it – the packets will be inspected and discarded. Using a more intelligent approach, VSS can intervene and eliminate the overhead associated with inspecting and discarding non-actionable traffic. This offload-like capability improves the capacity and performance of packet-analyzing systems.
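To ground the comparison, here is a minimal sketch of that "virtual patch panel" use case, assuming a Ryu-based OpenFlow 1.3 controller; the port numbers are illustrative assumptions. Note that the flow rule is pure port forwarding, packet in and packets out, with none of the L2-7 filtering a packet broker layers on top:

```python
# Minimal sketch of OpenFlow as a "virtual patch panel", assuming the
# Ryu controller framework and OpenFlow 1.3. Port numbers are
# illustrative. The rule below is pure port forwarding: every packet
# arriving on port 1 is copied to ports 2 and 3, unfiltered.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class VirtualPatchPanel(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_connect(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        # Mirror everything arriving on port 1 to tool ports 2 and 3.
        match = parser.OFPMatch(in_port=1)
        actions = [parser.OFPActionOutput(2), parser.OFPActionOutput(3)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                             actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                      match=match, instructions=inst))
```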
Furthermore, VSS offers a single-pane-of-glass management system for monitoring and managing its packet brokers, while an OpenFlow-enabled solution currently does not. This is certainly an area of exploration for SDN and OpenFlow-enabled devices, and a future value-add for those banking on SDN; admittedly, the technology is still very much in its nascent phase, and maturation will bring more robust solutions, not only in core device support but in management and niche-market offerings. The other issue is deployment in the cloud as a virtual device. The good news is that Open vSwitch is embedded in many hypervisors and is available as a package for a variety of Linux-based systems. The bad news is that in some cloud environments (like Amazon) these approaches may not be possible to deploy and/or take advantage of, thus rendering an SDN-OpenFlow approach more or less toothless. VSS' packet broker, vBroker, supports a broad set of physical and virtual environments (i.e. physical and virtual span ports, the ability to filter and remove VN-Tags, etc.), which enables a wider set of cloud environments to take advantage of these capabilities. That's not to say the two couldn't be combined, either. In fact, VSS could be described as "SDN for network monitoring", though VSS itself has not chosen to represent its solution this way. But essentially it's acting in the same manner as SDN – simply confined to a specific area of functionality: monitoring. As I posited in the past, I suspect we'll continue to see these kinds of "pockets of SDN" capabilities pop up to resolve pressing issues that simply can't be addressed by traditional networking methods – or at least can't be addressed efficiently or in an acceptably rapid manner. In such an architecture (one comprised of controllers at strategic points of control), VSS Monitoring is certainly positioned to act as the control point for managing a broadly distributed monitoring network.