strategic point of control
Applying ‘Centralized Control, Decentralized Execution’ to Network Architecture
#SDN brings to the fore some critical differences between concepts of control and execution.

While most discussions with respect to SDN are focused on a variety of architectural questions (mine included) and technical capabilities, there’s another very important concept that needs to be considered: control and execution. SDN definitions include the notion of centralized control through a single point of control in the network, a controller. It is through the controller that all important decisions are made regarding the flow of traffic through the network, i.e. execution. This is not feasible, at least not in very large (or even just large) networks. Nor is it feasible beyond simple L2/3 routing and forwarding.

HERE COMES the SCIENCE (of WAR)

There is very little more dynamic than combat operations. People, vehicles, supplies – all are distributed across what can be very disparate locations. One of the lessons the military has learned over time (sometimes quite painfully through experience) is the difference between control and execution. This has led to decisions to employ what is called “Centralized Control, Decentralized Execution.” Joint Publication (JP) 1-02, Department of Defense Dictionary of Military and Associated Terms, defines centralized control as follows: “In joint air operations, placing within one commander the responsibility and authority for planning, directing, and coordinating a military operation or group/category of operations.” JP 1-02 defines decentralized execution as “delegation of execution authority to subordinate commanders.”

Decentralized execution is the preferred mode of operation for dynamic combat operations. Commanders who clearly communicate their guidance and intent through broad mission-based or effects-based orders rather than through narrowly defined tasks maximize that type of execution. Mission-based or effects-based guidance allows subordinates the initiative to exploit opportunities in rapidly changing, fluid situations.

-- “Defining Decentralized Execution in Order to Recognize Centralized Execution,” Lt Col Woody W. Parramore, USAF, Retired

Applying this to IT network operations means a single point of control is contradictory to the “mission” and actually interferes with the ability of subordinates (strategic points of control) to dynamically adapt to rapidly changing, fluid situations such as those experienced in virtual and cloud computing environments. Not only does a single, centralized point of control (which in the SDN scenario implies control over execution through admittedly dynamically configured but rigidly executed policies) abrogate responsibility for adapting to “rapidly changing, fluid situations,” but it also becomes the weakest link. Clausewitz, in the widely read and respected “On War”, defines a center of gravity as "the hub of all power and movement, on which everything depends. That is the point against which all our energies should be directed." Most military scholars and strategists logically infer from the notion of a Clausewitzian center of gravity the existence of a critical weak link. If the “controller” in an SDN is the center of gravity, then it follows that it is likely a critical weak link.

This does not mean the model is broken, or poorly conceived, or a bad idea. What it means is that this issue needs to be addressed. The modern strategy of “Centralized Control, Decentralized Execution” does just that.
Centralized Control, Decentralized Execution in the Network The major issue with the notion of a centralized controller is the same one air combat operations experienced in the latter part of the 20th century: agility, or more appropriately, lack thereof. Imagine a large network adopting fully an SDN as defined today. A single controller is responsible for managing the direction of traffic at L2-3 across the vast expanse of the data center. Imagine a node, behind a Load balancer, deep in the application infrastructure, fails. The controller must respond and instruct both the load balancing service and the core network how to react, but first it must be notified. It’s simply impossible to recover from a node or link failure in 50 milliseconds (a typical requirement in networks handling voice traffic) when it takes longer to get a reply from the central controller. There’s also the “slight” problem of network devices losing connectivity with the central controller if the primary uplink fails. -- OpenFlow/SDN Is Not A Silver Bullet For Network Scalability, Ivan Pepelnjak (CCIE#1354 Emeritus) Chief Technology Advisor at NIL Data Communications The controller, the center of network gravity, becomes the weak link, slowing down responses and inhibiting the network (and IT) from responding in a rapid manner to evolving situations. This does not mean the model is a failure. It means the model must adapt to take into consideration the need to adapt more quickly. This is where decentralized execution comes in, and why predictions that SDN will evolve into an overarching management system rather than an operational one are likely correct. There exist today, within the network, strategic points of control; locations within the data center architecture at which traffic (data) is aggregated, forcing all data to traverse, from which control over traffic and data is maintained. These locations are where decentralized execution can fulfill the “mission-based guidance” offered through centralized control. Certainly it is advantageous to both business and operations to centrally define and codify the operating parameters and goals of data center networking components (from L2 through L7), but it is neither efficient nor practical to assume that a single, centralized controller can achieve both managing and executing on the goals. What the military learned in its early attempts at air combat operations was that by relying on a single entity to make operational decisions in real time regarding the state of the mission on the ground, missions failed. Airmen, unable to dynamically adjust their actions based on current conditions, were forced to watch situations deteriorate rapidly while waiting for central command (controller) to receive updates and issue new orders. Thus, central command (controller) has moved to issuing mission or effects-based objectives and allowing the airmen (strategic points of control) to execute in a way that achieves those objectives, in whatever way (given a set of constraints) they deem necessary based on current conditions. This model is highly preferable (and much more feasible given today’s technology) than the one proffered today by SDN. It may be that such an extended model can easily be implemented by distributing a number of controllers throughout the network and federating them with a policy-driven control system that defines the mission, but leaves execution up to the distributed control points – the strategic control points. 
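To make the distinction concrete, here is a minimal Python sketch of the federated model just described: a central authority publishes mission-style objectives while distributed enforcement points decide locally how to react to current conditions. All class and attribute names are invented for illustration; this is a conceptual sketch, not any particular controller’s API.

```python
# Illustrative sketch only: a central authority distributes intent ("mission"),
# while distributed control points execute locally against live conditions.
# All names here are hypothetical; no vendor API is implied.

from dataclasses import dataclass


@dataclass
class Objective:
    """Mission/effects-based guidance from the central controller."""
    app: str
    max_latency_ms: int      # the effect to achieve
    min_healthy_nodes: int   # a constraint, not a prescribed per-flow action


class CentralController:
    """Centralized control: defines and distributes objectives, not per-flow decisions."""
    def __init__(self):
        self.enforcement_points = []

    def register(self, point):
        self.enforcement_points.append(point)

    def publish(self, objective: Objective):
        for point in self.enforcement_points:
            point.receive(objective)


class EnforcementPoint:
    """Decentralized execution: acts on local, real-time conditions within the objective."""
    def __init__(self, name):
        self.name = name
        self.objective = None

    def receive(self, objective: Objective):
        self.objective = objective

    def handle_node_failure(self, healthy_nodes, observed_latency_ms):
        # Local decision made immediately -- no round trip to the controller.
        if healthy_nodes < self.objective.min_healthy_nodes:
            return f"{self.name}: spin up capacity for {self.objective.app}"
        if observed_latency_ms > self.objective.max_latency_ms:
            return f"{self.name}: reroute {self.objective.app} traffic to another pool"
        return f"{self.name}: no action needed"


if __name__ == "__main__":
    controller = CentralController()
    edge = EnforcementPoint("edge-adc-1")
    controller.register(edge)
    controller.publish(Objective(app="web", max_latency_ms=50, min_healthy_nodes=2))
    # A node fails; the strategic point of control reacts locally and instantly.
    print(edge.handle_node_failure(healthy_nodes=1, observed_latency_ms=80))
```

The point of the sketch is the division of labor: the controller never sees the node failure, and does not need to, because the objective it published already constrains how the enforcement point may respond.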
SDN is new, it’s exciting, it’s got potential to be the “next big thing.” Like all nascent technology and models, it will go through some evolutionary massaging as we dig into it and figure out where and why and how it can be used to its greatest potential and organizations’ greatest advantage. One thing we don’t want to do is replicate erroneous strategies of the past. No network model abrogating all control over execution has ever really worked. All successful models have been distributed, federated models in which control may be centralized, but execution is decentralized. Can we improve upon that? I think SDN does in its recognition that static configuration is holding us back. But its decision to rein in all control while addressing that issue may very well give rise to new issues that will need resolution before SDN can become a widely adopted model of networking.

QoS without Context: Good for the Network, Not So Good for the End user Cyclomatic Complexity of OpenFlow-Based SDN May Drive Market Innovation SDN, OpenFlow, and Infrastructure 2.0 OpenFlow/SDN Is Not A Silver Bullet For Network Scalability Prediction: OpenFlow Is Dead by 2014; SDN Reborn in Network Management OpenFlow and Software Defined Networking: Is It Routing or Switching? Cloud Security: It’s All About (Extreme Elastic) Control Ecosystems are Always in Flux The Full-Proxy Data Center Architecture

Never attribute to technology that which is explained by the failure of people
#cloud Whether it’s Hanlon or Occam or MacVittie, the razor often cuts both ways.

I am certainly not one to ignore the issue of complexity in architecture, nor do I dismiss lightly the risk introduced by cloud computing through increased complexity. But I am one who will point out absurdity when I see it, and especially when that risk is unfairly attributed to technology. Certainly the complexity introduced by attempts to integrate disparate environments, computing models, and networks will give rise to new challenges and introduce new risk. But we need to carefully consider whether the risk we discover is attributable to the technology or to simple failure by those implementing it. Almost all of the concepts and architectures being “discovered” in conjunction with cloud computing are far from original. They are adaptations, evolutions, and maturation of existing technology and architectures. Thus, it is almost always the case that when a “risk” of cloud computing is discovered it is not peculiar to cloud computing at all, and thus likely has its roots in implementation, not the technology. This is not to say there aren’t new challenges or risks associated with cloud computing; there are and will be cloud-specific risks that must be addressed (IP Identity Theft, for example, was unknown before the advent of cloud computing). But let’s not make mountains out of molehills by failing to recognize those “new” risks that actually aren’t “new” at all, but rather are simply being recognized by a wider audience due to the abundance of interest in cloud computing models.

For example, I found this article particularly apocalyptic with respect to cloud and complexity on the surface. Digging into the “simple scenario”, however, revealed that the meltdown referenced was nothing new, and certainly wasn’t a technological problem – it was another instance of lack of control, of governance, of oversight, and of communication. The risk is being attributed to technology, but is more than adequately explained by the failure of people.

The Hidden Risk of a Meltdown in the Cloud

Ford identifies a number of different possibilities. One example involves an application provider who bases its services in the cloud, such as a cloud-based advertising service. He imagines a simple scenario in which the cloud operator distributes the service between two virtual servers, using a power balancing program to switch the load from one server to the other as conditions demand. However, the application provider may also have a load balancing program that distributes the customer load. Now Ford imagines the scenario in which both load balancing programs operate with the same refresh period, say once a minute. When these periods coincide, the control loops start sending the load back and forth between the virtual servers in a positive feedback loop.

Could this happen? Yes. But consider for a moment how it could happen. I see three obvious possibilities: IT has completely abdicated its responsibility for governing foundational infrastructure services like load balancing and allowed the business or developers to run amok without regard for existing services. IT has failed to communicate its overarching strategy and architecture with respect to high-availability and scale in inter-cloud scenarios to the rest of the IT organization, i.e. IT has failed to maintain control (governance) over infrastructure services.
The left hand of IT and the right hand of IT have been severed from the body of IT and geographically separated with no means to communicate. Furthermore, each hand of IT wholeheartedly believes that the other is incompetent and will fail to properly architect for high-availability and scalability, thus requiring each hand to implement such services as required to achieve high-availability. While the third possibility might make a better “made for SyFy tech-horror” flick, the reality is likely somewhere between 1 and 2. This particular scenario, and likely others, is not peculiar to cloud. The same lack of oversight in a traditional architecture could lead to the same catastrophic cascade described by Ford in the aforementioned article. Given a load balancing service in the application delivery tier, and a cluster controller in the application infrastructure tier, the same cascading feedback loop could occur, causing a meltdown and inevitably downtime for the application in question. Astute observers will conclude that an IT organization in which both a load balancing service and a cluster controller are used to scale the same application has bigger problems than duplicated services and a failed application. This is not a failure of technology, nor is it caused by excessive complexity or lack of transparency within cloud computing environments. It’s a failure to communicate, to control, to oversee the technical implementation of business requirements through architecture. That’s a likely conclusion before we even start considering an inter-cloud model with two completely separate cloud providers sharing access to virtual servers deployed in one or the other – maybe both? Still, the same analysis applies – such an architecture would require willful configuration and knowledge of how to integrate the environments. Which ultimately means a failure on the part of people to communicate. THE REAL PROBLEM The real issue here is failure to oversee – control – the integration and use of cloud computing resources by the business and IT. There needs to be a roadmap that clearly articulates what services should be used and in what environments. There needs to be an understanding of who is responsible for what services, where they connect, with whom they share information, and by whom they will (and can be) accessed. Maybe I’m just growing jaded – but we’ve seen this lack of roadmap and oversight before. Remember SOA? It ultimately failed to achieve the benefits promised not because the technology failed, but because the implementations were generally poorly architected and governed. A lack of oversight and planning meant duplicated services that undermined the success promised by pundits. The same path lies ahead with cloud. Failure to plan and architect and clearly articulate proper usage and deployment of services will undoubtedly end with the same disillusioned dismissal of cloud as yet another over-hyped technology. Like SOA, the reality of cloud is that you should never attribute to technology that which is explained by the failure of people. BFF: Complexity and Operational Risk The Pythagorean Theorem of Operational Risk At the Intersection of Cloud and Control… What is a Strategic Point of Control Anyway? The Battle of Economy of Scale versus Control and Flexibility Hybrid Architectures Do Not Require Private Cloud Control, choice, and cost: The Conflict in the Cloud Do you control your application network stack? You should. 
The Wisdom of Clouds: In Cloud Computing, a Good Network Gives You Control...

F5 Friday: Ops First Rule
#cloud #microsoft #iam “An application is only as reliable as its least reliable component”

It’s unlikely there’s anyone in IT today who doesn’t understand the role of load balancing to scale. Whether cloud or not, load balancing is the key mechanism through which load is distributed to ensure horizontal scale of applications. It’s also unlikely there’s anyone in IT who doesn’t understand the relationship between load balancing and high availability (reliability). High-availability (HA) architectures are almost always implemented using load balancing services to ensure seamless transition from one service instance to another in the event of a failure.

What’s often overlooked is that scalability and HA aren’t important just for applications. Services – whether application or network-focused – must also be reliable. It’s the old “only as strong as the weakest link in the chain” argument. An application is only as reliable as its least reliable component – and that includes services and infrastructure upon which that application relies. It is – or should be – ops first rule; the rule that guides design of data center architectures. This requirement becomes more and more obvious as emerging architectures combining the data center and cloud computing are implemented, particularly when federating identity and access services. That’s because it is desirable to maintain control over the identity and access management processes that authenticate and authorize use of applications no matter where they may be deployed. Such an architecture relies heavily on the corporate identity store as the authoritative source of both credentials and permissions. This makes the corporate identity store a critical component in the application dependency chain, one that must necessarily be made as reliable as possible. Which means you need load balancing.

A good example of how this architecture can be achieved is found in BIG-IP load balancing support for Microsoft’s Active Directory Federation Services (AD FS).

AD FS and F5 Load Balancing

Microsoft’s Active Directory Federation Services (AD FS) server role is an identity access solution that extends the single sign-on (SSO) experience for directory-authenticated clients (typically provided on the intranet via Kerberos) to resources outside of the organization’s boundaries, such as cloud computing environments. To ensure high availability, performance, and scalability, the F5 BIG-IP Local Traffic Manager (LTM) can be deployed to load balance an AD FS server farm. There are several scenarios in which BIG-IP can load balance AD FS services.

1. To enable reliability of AD FS for internal clients accessing external resources, such as those hosted in Microsoft Office 365. This is the simplest of architectures and the most restrictive in terms of access for end-users, as it is limited to only internal clients.

2. To enable reliability of AD FS and AD FS proxy servers, which provide external end-user SSO access to both internal federation-enabled resources as well as partner resources like Microsoft Office 365. This is a more flexible option as it serves both internal and external clients.

3. BIG-IP Access Policy Manager (APM) can replace the need for the AD FS proxy servers required for external end-user SSO access, which eliminates another tier and enables pre-authentication at the perimeter, offering both the flexibility required (supporting both internal and external access) as well as a more secure deployment.
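All three scenarios depend on the same underlying behavior: health-monitored distribution of requests across the AD FS farm so that a failed server is transparently removed from rotation. The following Python sketch illustrates that behavior generically – it is not BIG-IP configuration, and the farm hostnames and the simple TCP health check are assumptions made purely for illustration.

```python
# Generic illustration of health-monitored load balancing across an AD FS farm.
# This is NOT BIG-IP configuration; hostnames and the health check are hypothetical.

import itertools
import socket

ADFS_FARM = ["adfs1.example.local", "adfs2.example.local", "adfs3.example.local"]


def is_healthy(host, port=443, timeout=2.0):
    """Crude health check: can we open a TCP connection to the federation service?
    A production monitor would validate the HTTPS endpoint and federation metadata."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class RoundRobinPool:
    """Distributes requests across healthy members only, so a failed AD FS
    server is transparently taken out of rotation -- the HA behavior the
    scenarios above rely on."""

    def __init__(self, members):
        self.members = members
        self._cycle = itertools.cycle(members)

    def next_member(self):
        for _ in range(len(self.members)):
            candidate = next(self._cycle)
            if is_healthy(candidate):
                return candidate
        raise RuntimeError("No healthy AD FS servers available")


if __name__ == "__main__":
    pool = RoundRobinPool(ADFS_FARM)
    try:
        print("Send federation request to:", pool.next_member())
    except RuntimeError as err:
        print("Service unavailable:", err)
```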
In all three scenarios, F5 BIG-IP serves as a strategic point of control in the architecture, assuring reliability and performance of services upon which applications are dependent, particularly those of authentication and authorization. Using BIG-IP APM instead of AD FS proxy servers both simplifies the architecture and makes it more agile. This is because BIG-IP APM is inherently more programmable and flexible in terms of policy creation. BIG-IP APM, being deployed on the BIG-IP platform, can take full advantage of the context in which requests are made, ensuring that identity and access control go beyond simple credentials and take into consideration device, location, and other contextual clues that enable a more secure system of authentication and authorization.

High availability – and ultimately scalability – is preserved for all services by leveraging the core load balancing and HA functionality of the BIG-IP platform. All components in the chain are endowed with HA capabilities, making the entire application more resilient and able to withstand minor and major failures. Using BIG-IP LTM for load balancing AD FS serves as an adaptable and extensible architectural foundation for a phased deployment approach. As a pilot phase, rolling out AD FS services for internal clients only makes sense, and is the simplest in terms of its implementation. Using BIG-IP as the foundation for such an architecture enables further expansion in subsequent phases, such as introducing BIG-IP APM in a phase two implementation that brings flexibility of access location to the table. Further enhancements can then be made regarding access when context is included, enabling more complex and business-focused access policies to be implemented. Time-based restrictions on clients or location can be deployed and enforced, as is desired or needed by operations or business requirements.

Reliability is a Least Common Factor Problem

Reliability must be enabled throughout the application delivery chain to ultimately ensure reliability of each application. Scalability is further paramount for those dependent services, such as identity and access management, that are intended to be shared across multiple applications. While certainly there are many other load balancing services that could be used to enable reliability of these services, an extensible and highly scalable platform such as BIG-IP is required to ensure both reliability and scalability of shared services upon which many applications rely. The advantage of a BIG-IP-based application delivery tier is that its core reliability and scalability services extend to any of the many services that can be deployed. By simplifying the architecture through application delivery service consolidation, organizations further enjoy the benefits of operational consistency that keeps management and maintenance costs reduced. Reliability is a least common factor problem, and Ops First Rule should be applied when designing a deployment architecture to assure that all services in the delivery chain are as reliable as they can be.

F5 Friday: BIG-IP Solutions for Microsoft Private Cloud BYOD–The Hottest Trend or Just the Hottest Term The Four V’s of Big Data Hybrid Architectures Do Not Require Private Cloud The Cost of Ignoring ‘Non-Human’ Visitors Complexity Drives Consolidation What Does Mobile Mean, Anyway? At the Intersection of Cloud and Control… Cloud Bursting: Gateway Drug for Hybrid Cloud Identity Gone Wild! Cloud Edition

Mobile versus Mobile: 867-5309
#mobile #context The identity crisis created by common platforms negatively impacts the ability to serve consumers and corporate IT consistently.

The focus on the explosion of mobile devices is heavily weighted toward IT in terms of management and security. While there’s nothing wrong with that, there’s another aspect of mobility that is often ignored. Much like their tethered counterparts, many mobile devices are constrained by a tight coupling to numbers. In the case of the desktop it’s often the IP address. In the mobile world, it’s another number: your phone number. I love my tablet, I really do. And I love mobile applications. But what I don’t love is mobile applications that, while perfectly able to run on my tablet and written for the same OS that powers many smartphone mobile devices, require tethering to a phone number. Because, Hello? McFly!?! It doesn’t have a phone number!

NUMBER-BASED IDENTITY

We’re going to skip the Prisoner analogy and just assume it was made, okay? While it’s always applicable to discussions like this, it gets a bit tedious and cliché after a while, so let’s just assume that tethering identity to a number of any kind is a Very Bad Idea™, m’kay? In the tethered world this is because such numbers are – especially today – highly volatile. You can’t count on even an application instance having the same IP address from one minute to the next, let alone a user who may be roaming around the world. And in the mobile world, it’s even less of a sure bet, as mobile devices of all kinds are moving between WiFi and mobile networks faster than a four-year-old tears open a Christmas present.

We (as in IT) simply cannot enforce corporate security and serve the access needs of a highly nomadic user community if we’re constrained to doing so based on a single number – whether it be IP address or phone number. We’ve got to leverage as much information as possible about the user – their network, the device, their location, the security posture of the end-point. We’ve got to take into consideration the context of each and every request and use that data as the basis for allowing or denying (or at least limiting) access to corporate resources.

Mobile applications requiring a phone number do so as a means to secure access to resources. It is a failure of Epic Proportions because it tends to engender a false sense of security on the IT side, based on the premise that a phone number is unique to an individual and highly static, which is not always the case. It is a failure of Epic Proportions because it stratifies a much larger technological market into “haves” and “have nots” based on whether a single identifying characteristic is present: a number. Such strategies are further a Very Bad Idea™ because of their impact on policy management, security, and development. Applications relying on numbers for identity work only on devices that have such numbers; for the rest of the market (which is growing by leaps and bounds as more and more consumer devices are Internet-enabled) there must exist a separate but equal application. Developers’ time is already strapped, and maintaining two (in some cases three, when the web is considered separately from mobile devices) discrete applications is madness, I say, madness. The pressure on IT to secure, manage, and support multiple versions of the same application was supposed to go the way of the Dodo with the advent of REST and the ascendancy of the API. And yet here we are, managing and securing and developing applications tied to numbers.
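A small, hypothetical sketch makes the contrast plain: a phone-number-tethered check simply locks out devices without a number, while a context-based decision weighs several characteristics of the request. The attribute names and sample policy below are invented for illustration only.

```python
# Hypothetical sketch: single-number identity vs. a context-based decision.
# Attribute names and the sample policy are invented for illustration.

def number_based_access(request):
    """The anti-pattern described above: no phone number, no access --
    which locks out tablets and other un-numbered devices entirely."""
    return request.get("phone_number") is not None


def context_based_access(request):
    """Considers multiple characteristics of the request instead of one number."""
    trusted_networks = {"corp-wifi", "corp-lan"}
    allowed_devices = {"phone", "tablet", "desktop"}

    device_ok = request.get("device_type") in allowed_devices
    posture_ok = request.get("endpoint_patched", False)
    network_trusted = request.get("network") in trusted_networks

    if not (device_ok and posture_ok):
        return "deny"
    # Untrusted network: allow, but with limited access rather than a flat denial.
    return "allow" if network_trusted else "limit"


if __name__ == "__main__":
    tablet_request = {
        "device_type": "tablet",       # no phone number at all
        "network": "home-wifi",
        "endpoint_patched": True,
        "user": "jdoe",
    }
    print("number-based :", number_based_access(tablet_request))   # False -> locked out
    print("context-based:", context_based_access(tablet_request))  # "limit"
```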
MOBILE-MEDIATION

You may recognize this one from a previous post, “Mobile versus Mobile: An Identity Crisis”, and it’s no less applicable to this problem than it was to the problem of mobile clients and OS or platform-based identification. Whether it’s OS, platform, or IP address/phone number, no single characteristic of a user’s request is enough information upon which to base any kind of decision. Period. No single piece of information gives IT the context in which security and delivery decisions can be accurately made. Without the big picture, without the context, it is nigh-unto impossible to ascertain which decision should be made with respect to the request. It is only by taking advantage of context that we can make decisions that are not only best for the organization and preserve a positive security posture, but that are also best for the user in terms of experience and performance.

It is only at strategic points of control in the network, such as the application delivery tier, that all the variables on both sides of the equation – user and data center – are visible. It is at this tier where the rubber meets the road, as they say, and the two worlds of consumer and corporate meet. It is here where security and performance and access policies are most efficiently applied, where all the requisite variables that make up the context of the request can be extracted, evaluated, and acted upon. It is paramount to both end-user adoption and a positive corporate operational posture that such strategic points of control are leveraged. It is only context that provides insight into the “bigger picture” and ensures a smooth and secure experience for end-users that simultaneously preserves the security and availability of the applications and resources being delivered.

Mobile users are not a number, nor are their tethered counterparts. They are users, with unique characteristics that are increasingly not only varied but volatile. Such variables must be evaluated contextually for every request to ensure the best possible experience without compromising operational or business expectations and requirements.

What is a Strategic Point of Control Anyway?
From mammoth hunting to military maneuvers to the datacenter, the key to success is control Recalling your elementary school lessons, you’ll probably remember that mammoths were large and dangerous creatures and like most animals they were quite deadly to primitive man. But yet man found a way to hunt them effectively and, we assume, with more than a small degree of success as we are still here and, well, the mammoths aren’t. Marx Cavemen PHOTO AND ART WORK : Fred R Hinojosa. The theory of how man successfully hunted ginormous creatures like the mammoth goes something like this: a group of hunters would single out a mammoth and herd it toward a point at which the hunters would have an advantage – a narrow mountain pass, a clearing enclosed by large rock, etc… The qualifying criteria for the place in which the hunters would finally confront their next meal was that it afforded the hunters a strategic point of control over the mammoth’s movement. The mammoth could not move away without either (a) climbing sheer rock walls or (b) being attacked by the hunters. By forcing mammoths into a confined space, the hunters controlled the environment and the mammoth’s ability to flee, thus a successful hunt was had by all. At least by all the hunters; the mammoths probably didn’t find it successful at all. Whether you consider mammoth hunting or military maneuvers or strategy-based games (chess, checkers) one thing remains the same: a winning strategy almost always involves forcing the opposition into a situation over which you have control. That might be a mountain pass, or a densely wooded forest, or a bridge. The key is to force the entire complement of the opposition through an easily and tightly controlled path. Once they’re on that path – and can’t turn back – you can execute your plan of attack. These easily and highly constrained paths are “strategic points of control.” They are strategic because they are the points at which you are empowered to perform some action with a high degree of assurance of success. In data center architecture there are several “strategic points of control” at which security, optimization, and acceleration policies can be applied to inbound and outbound data. These strategic points of control are important to recognize as they are the most efficient – and effective – points at which control can be exerted over the use of data center resources. DATA CENTER STRATEGIC POINTS of CONTROL In every data center architecture there are aggregation points. These are points (one or more components) through which all traffic is forced to flow, for one reason or another. For example, the most obvious strategic point of control within a data center is at its perimeter – the router and firewalls that control inbound access to resources and in some cases control outbound access as well. All data flows through this strategic point of control and because it’s at the perimeter of the data center it makes sense to implement broad resource access policies at this point. Similarly, strategic points of control occur internal to the data center at several “tiers” within the architecture. Several of these tiers are: Storage virtualization provides a unified view of storage resources by virtualizing storage solutions (NAS, SAN, etc…). Because the storage virtualization tier manages all access to the resources it is managing, it is a strategic point of control at which optimization and security policies can be easily applied. 
Application delivery / load balancing virtualizes application instances and ensures availability and scalability of an application. Because it is virtualizing the application, it therefore becomes a point of aggregation through which all requests and responses for an application must flow. It is a strategic point of control for application security, optimization, and acceleration.

Network virtualization is emerging internal to the data center architecture as a means to provide inter-virtual machine connectivity more efficiently than perhaps can be achieved through traditional network connectivity. Virtual switches often reside on a server on which multiple applications have been deployed within virtual machines. Traditionally it might be necessary for communication between those applications to physically exit and re-enter the server’s network card. But by virtualizing the network at this tier, the physical traversal path is eliminated (and the associated latency, by the way) and more efficient inter-VM communication can be achieved. This is a strategic point of control at which control over access to applications at the network layer should be applied, especially in a public cloud environment where inter-organizational residency on the same physical machine is highly likely.

OLD SKOOL VIRTUALIZATION EVOLVES

You might have begun noticing a central theme to these strategic points of control: they are all points at which some kind of virtualization – and thus aggregation – occurs naturally in a data center architecture. This is the original (first) kind of virtualization: the presentation of many resources as a single resource, a la load balancing and other proxy-based solutions. When a one-to-many (1:M) virtualization solution is employed, it naturally becomes a strategic point of control by virtue of the fact that all “X” traffic must flow through that solution, and thus policies regarding access, security, logging, etc… can be applied in a single, centrally managed location. The key here is “strategic” and “control”. The former relates to the ability to apply the latter over data at a single point in the data path. This kind of 1:M virtualization has been a part of data center architectures since the mid 1990s. It’s evolved to provide ever broader and deeper control over the data that must traverse these points of control by nature of network design. These points have become, over time, strategic in terms of the ability to consistently apply policies to data in as operationally efficient a manner as possible. Thus have these virtualization layers become “strategic points of control”. And you thought the term was just another square on the buzz-word bingo card, didn’t you?

F5 and Traffix: When Worlds Collide
#mwc12 #traffix #mobile Strategic points of control are critical to managing the convergence of technology in any network – enterprise or carrier.

What happens when technology converges? When old meets new? A fine example of what might happen is what has happened in the carrier space as voice and data services increasingly meet on the same network, each carrying unique characteristics forward from the older technology from which they sprung. In the carrier space, having moved away from older communications technology does not mean having left behind core technology concepts. Though voice may be moving to IP with the advent of LTE/4G, it still carries with it the notion of signaling as a means to manage communication and users, and the impact on networks from that requisite signaling mechanism is significant. Along with the well-discussed and often-noted explosive growth of mobile and its impact on the enterprise comes a less-discussed and rarely noted explosive growth of signaling traffic and its impact on service providers.

Enterprise experience with voice and signaling remains largely confined to SIP-focused deployments and is on a scale much smaller than that of the service provider. Hence the term “carrier-grade” to indicate the much more demanding environment. The number of signaling messages in 4G networks, for example, associated with a 3-minute IP voice call with data is 520. The same voice call today requires only 3. That exponential growth will put increasing pressure on carriers and require massive scale of infrastructure to support.

All that signaling traffic in carrier networks occurs via Diameter, the standard agreed upon by 3GPP (3rd Generation Partnership Project) for network signaling in all 4G/LTE networks. Diameter is to carrier networks what HTTP is to web applications today: it’s the glue that makes it all happen. As the preeminent Diameter routing agent (DRA) for 3G, 4G/LTE, and IMS environments, Traffix’s solutions are fluent in the signaling language used by carriers across the globe to identify users, manage provisioning, and authorize access to services and networks. One could reasonably describe Diameter as the Identity and Access Management (IAM) technology of choice for service providers. When a user does anything on a 4G network, Diameter is involved somehow.

What the Traffix Signaling Delivery Controller (which is both a highly capable DRA and a Diameter Edge Agent (DEA)) offers is a strategic point of control in the service provider’s network, serving as an intelligent tier in that network that enables interoperability, security, scale, and flexibility in how signaling traffic is managed and optimized. That should sound familiar, as F5 is no stranger to similar responsibilities in enterprise and web-class data centers today. F5, with its application and control plane technologies, serves as an intelligent tier in the network that ensures interoperability, security, scale, and flexibility for how applications and services are delivered, secured, and optimized. What service providers do with Diameter – user identification, permission to roam, authorization to use certain networks, basically anything a user does on a 4G network – is akin to what F5 does with application delivery technology in the data center. F5’s vision has been to create a converged carrier architecture that unifies IP services end-to-end across the application, data, and control plane.
Diameter is a foundational piece of that puzzle, just as any-IP support is critical to providing that same converged application services approach in the data center – a data center routing agent, if you will. Both approaches are ultimately about context, control, and collaboration.

CONVERGENCE BREEDS FRAGMENTATION

These three characteristics (context, control, collaboration) are required for a dynamic data center to handle the volatility inherent in emerging data center models as well as the convergence in service provider networks of voice and data. But as technologies converge, supporting infrastructure tends to fragment. This dichotomy is clearly present even in the enterprise, where unified communications (UC) implementations are creating chaos. In its early days, Diameter deployments in service provider networks experienced similar trends, and it was the development of the DRA that resolved the issue, bringing order out of chaos and providing a strategic point of control through which subscriber activity could be more efficiently managed.

Out of chaos, order. That’s the value Traffix brings to carrier networks with its Signaling Delivery Controller (SDC). Traffix solutions optimize signaling traffic, offering service provider operators scalability, availability, visibility, interoperability, and more in an operationally consistent solution. With the number of mobile devices predicted to exceed the world population in the next year, and the advanced services those devices provide driving exponential growth in signaling traffic, the need to optimize signaling traffic is top of mind for most service providers today.

When diverse systems converge, their infrastructure must also converge in terms of support for the resulting unified system. This is particularly true as mobile and virtual desktops become more prevalent and bring with them their own unique delivery challenges to both the service provider and data center networks. The two worlds are colliding, out there on the Internets and inside data centers, with more and more IP-related traffic requiring management within the carrier networks, and more and more traditionally carrier network traffic, such as voice, being seen inside the data center. What both worlds need is a fully end-to-end IP core infrastructure solution – one that can support IP and Diameter and scale regardless of whether the need is enterprise-class or carrier-grade. One that maintains context and manages access to resources across both voice and data and does so both seamlessly and transparently.

Bringing together F5’s control plane with that of Traffix yields a holistic approach to controlling a converged voice-data network that enhances critical network functions across the application, control, and data planes. Traffix aligns well with F5’s overall vision of enabling intelligence in the network and providing context and control for all types of network services – whether carrier or enterprise.

Additional Resources: F5 Networks Acquires Traffix Systems The LTE signaling challenge F5 Circles The Wagons and Adds Diameter to its Portfolio Traffix Systems F5 Sends LTE Signal With Acquisition F5 Friday: The Dynamic Control Plane

Mobile versus Mobile: An Identity Crisis
#mobile The expansive options consumers revel in create an identity crisis for IT that is best resolved via context-aware mobile mediation.

Back in the days of the browser wars, when standards were still largely ignored and the battle for the desktop was highly competitive, developers had to make choices and compromises. They could either write extensive client-side scripts to detect the user’s browser and address the peculiarities of that environment or they could simply ignore them with a disclaimer that “this site (works best when viewed in | was written for) browser X.” As time went by, developers were able to discontinue this annoying practice as browser features converged and a common, standardized platform emerged upon which all applications were able to be delivered to any popular browser without concern.

Then mobile phones appeared, and the user experience degraded again, this time driven by the relatively feeble processing power of the platform. Small screens aside, the memory and processing power available on a mobile phone were such that – when combined with a constrained networking environment – the delivery of increasingly chunky, graphics-heavy, interactive applications to mobile phones was simply a bad idea for both organizations and their visitors. Developers and web ops returned to the inspection of HTTP headers to determine whether or not a visitor was using a mobile platform, and began writing leaner, more compact interfaces specifically for those platforms. Enter tablets. Neither desktop nor phone, these admittedly mobile platforms are compact but nearly as powerful as their overweight tethered cousins without sacrificing the mobility of their anorexic form-factor brethren.

COMMON GROUND: MOBILE OPERATING SYSTEM

Tablets are certainly able to take advantage of emerging web application standards such as HTML5 to deliver a full, rich, interactive desktop-style experience on a mobile platform, but are rarely offered it. They are by default offered up a mobile experience because of a common element: the operating system. Even if the operating system were masked, developers would still find out by checking a second, lesser-known HTTP header: HTTP_X_WAP_PROFILE. And if not there, then potentially in HTTP_PROFILE. Unfortunately, this is not necessarily the fault of the developer. There are no standards prescribing what should or should not identify a mobile device, and thus manufacturers are left to their own devices (pun only somewhat intended). There is no reliable way, today, to identify a mobile device accurately. Developers are left writing complex scripts that strip apart user agents, profile strings, and whatever other contextual data they can extract from HTTP headers to determine how best to serve up content. That often means visitors are served content that is not wholly appropriate for their device. Even if they could, the market is so volatile at this point that it’s a sure bet that a new device will enter soon that requires modification to the application yet again. More code, more string manipulation, more latency in processing.

CONTEXT-AWARE MOBILE MEDIATION

It would be nice if developers could simply receive accurate inbound HTTP headers – headers that clearly identify the device not only from an operating system perspective, but from a form-factor and network perspective, in a standard way. But there are no standards, yet, and may never be.
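Concretely, the “complex scripts” described above tend to look something like the following minimal sketch, which guesses at device class from the User-Agent and WAP profile headers. The matching rules are deliberately simplistic and will misclassify new devices – which is precisely the problem.

```python
# A deliberately simplistic version of the device-detection scripts described above.
# It inspects the User-Agent and WAP profile headers and guesses -- and, like all
# such guesses, it will be wrong for devices it has never seen.

def classify_device(headers):
    ua = headers.get("User-Agent", "").lower()
    wap_profile = headers.get("X-Wap-Profile") or headers.get("Profile")

    if wap_profile:
        return "phone"            # presence of a WAP profile is treated as "mobile"
    if "ipad" in ua or ("android" in ua and "mobile" not in ua):
        return "tablet"           # Android tablets typically omit "Mobile" in the UA
    if "mobile" in ua or "iphone" in ua:
        return "phone"
    return "desktop"              # fall-through guess; every new device lands here


if __name__ == "__main__":
    tablet_headers = {
        "User-Agent": "Mozilla/5.0 (Linux; Android 4.0; Transformer) AppleWebKit/535.19",
    }
    # "tablet" -- until the next new device breaks the rules
    print(classify_device(tablet_headers))
```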
Thus a solution may be found in imposing standards upon inbound requests specifically for developers to better address the disparities in resolution, functionality, and performance between the various mobile device types. This requires some amount of pre-planning. Design, if you will, or architecture up front. It requires that developers and devops sit down together and determine a standard means of communicating information between infrastructure and the applications it supports. Consider the possibility of two custom HTTP headers, one identifying the network type and one specifying form-factor and device: HTTP_X_NETWORK = “WIFI | MOBILE | LAN” HTTP_X_DEVICE = “TABLET | PHONE | DESKTOP” If developers could rely on these two custom HTTP headers to exist for every inbound request, they could then develop applications based upon these characteristics that were more appropriate for the given device and network over which the device connected. Implementation requires only minimal inspection and insertion on a context-aware mediating device such as a network-side scripting capable application delivery controller. Because the application delivery controller is topologically positioned in a strategic point of control, it has visibility into the network, client, and server-side environments. This gives it the ability to better interpret and execute policies that govern the delivery of applications to optimize performance and assure availability, but it also provides the ability to extract and share pertinent data with applications and other infrastructure. This data can be shared in a number of ways, including modification of the payload, of the headers, insertion of new headers, removal of old headers, etc… The flexibility inherent in network-side scripting solutions, particularly those capable of side-band connectivity, allows devops and developers to design and develop a solution that works for them – in their environment. The advantage to such a solution lies not only in more accurate, actionable data to share with applications, but in its ability to easily be modified without negatively impacting the application. A second advantage is the ability of developers to also take into consideration the network characteristics of the mobile device, data generally not available or, if available, generally inaccurate. A mobile device today may be accessing an application via WiFi or a mobile network, and that piece of information is quite pertinent as the performance and capabilities of each network are quite different and have a significant impact on the end-user experience from a delivery perspective. Yet this data is not available by default to developers and it cannot reliably be inferred from device type. By leveraging a context-aware mediating solution, however, it becomes possible to share this data with developers such that they are able to take that information into consideration when putting together a response to a given request. While not a panacea, such a solution certainly provides a more consistent and overall accurate environment in which to deliver applications to the increasingly broad and diverse spectrum of mobile devices. Stack Overflow: How do detect Android Tablets in general. Useragent? 
Mobile Browsing Reaches All Time High The Magic of Mobile Cloud Understanding network-side scripting At the Intersection of Cloud and Control… Cloud-Tiered Architectural Models are Bad Except When They Aren’t WILS: WPO versus FEO Fire and Ice, Silk and Chrome, SPDY and HTTP Grokking the Goodness of MapReduce and SPDY

BFF: Complexity and Operational Risk
#adcfw The reason bars place bouncers at the door is that it’s easier and less risky to prevent entry than to root out troublemakers later.

No one ever said choosing a career in IT was going to be easy, but no one said it had to be so hard you’d be banging your head on the desk, either. One of the reasons IT practitioners end up with large, red welts on their foreheads is that data centers tend to become more, not less, complex, and along with complexity comes operational risk. Security, performance, availability. These three inseparable issues often stem not from vulnerabilities or poorly written applications but merely from the complexity of data center network architectures needed to support the varying needs of both the business and IT.

Unfortunately, it is often the case that as emerging technologies creep (and sometimes run headfirst) into the data center, the network is overlooked as a potential source of risk in supporting these new technologies. Traditionally, network readiness has entailed some load testing to ensure adequate bandwidth to support a new application, but rarely is it the case that we take a look at the actual architecture of the network and its services to determine if it is able to support new applications and initiatives. It’s the old “this is the way we do this” mantra that often ends up being the source of operational failure.

COMPLEXITY MEANS MULTIPLE POINTS of FAILURE

Consider the simple case of using SPAN ports to mirror traffic, a traditional network architecture technique that attempts to support the need for visibility into network traffic for security purposes without impeding performance. SPAN ports are used to clone all traffic, allowing it to traverse its intended path to an application service while simultaneously being examined for malicious and/or anomalous content. This architectural approach can inadvertently cause operational failure under heavy load – whether caused by an attack or a flash-mob of legitimate users.

“One of the problems with SPAN ports, which people tend to use because they’re cheap, is that you won’t get to keep it for your use all the time. Someone will come along and need that because there’s a limited number of them,” said John Kindervag, senior analyst with Forrester Research. “Whenever you are under attack and need that data, the switch is going to get saturated and the first port that quits functioning is the SPAN port so that it can have some extra compute capacity. So at the exact time that you need it, the whole system is designed not to get that data to you.” … Network traffic capture systems can perform media conversion, sending lower bandwidth data streams to these network security appliances. They can also load balance these data streams across multiple appliances.

-- Network traffic capture systems offer broader security visibility

Unfortunately, it isn’t only SPAN ports (and thus the systems that rely upon them) that fail under load. Firewalls, too, have consistently failed under the arduous network conditions that occur during an attack. The failure of these components is devastating, disruptive, and unacceptable, and is caused in part by architectural complexity. It isn’t, after all, the IPS or IDS that’s failing – it’s the port on the switch upon which it relies for data. The applications have not failed, but if the firewall melts down it really doesn’t matter – to the end user, that’s failure.
One of the ways in which we can redress this operational risk is to simplify – to reduce the number of potential points of failure and more intelligently route traffic through the network.

INSPECT at the EDGE

If you look at the cause of failures in network architectures, there are two distinct sources: connections and traffic. The latter is simple and it is also well understood. Too much traffic can overwhelm network components (and you can bet that if it’s overwhelming a network component, it can easily overwhelm an application), introducing errors, high latency, lost packets, and more. This trickles up to the application, potentially causing time-outs, unacceptably long response times, or worse – a crash. The answer to this problem is either (1) increase switching capacity or (2) decrease traffic. Both are valid approaches. The latter, however, is rarely the path taken, because it’s an accepted fact that traffic is going to increase over time and there are only so many optimizations you can make in the network architecture that will decrease traffic on the wire.

But there are architectural changes that can decrease traffic on the wire. It makes very little sense to expend compute and network processing power on traffic that is malicious in nature. The risk to application servers and network infrastructure is real, but avoidable. You need to check the traffic at the door and only let valid traffic in; otherwise you’re going to end up expending a lot more resources tracking it down and figuring out how to kick it out. The problem is that the current network bouncer, the firewall, isn’t able to adequately detect the fake IDs presented by application layer attacks. This traffic just slips through and ends up causing problems in the application tier.

The closer to the edge of the network you are able to detect – and subsequently reject – malicious traffic, the more operational risk you can mitigate. The more assurances you can provide that security infrastructure won’t end up being ineffective due to a SPAN port shutdown, the more operational risk you can mitigate. Leveraging a converged approach will provide that assurance, as well as the ability to sniff out at the door those fake IDs presented by application layer attacks. Leveraging a network component that is capable of both detecting inbound attacks and cloning traffic to ensure holistic inspection by security infrastructure will reduce complexity in the network architecture and improve the overall security posture. Detecting attacks at the very edge of the network – network and application-layer attacks – means less burden on supporting security network infrastructure (like switches with SPAN ports) because less traffic is getting through the door. And if the network component is also designed to manage connections at high scale, the risk of firewall failure from an overwhelming number of inbound connections that appear legitimate but are not is likewise reduced.

Emerging architectural models are based on the premise of leveraging strategic points of control within the network: those places where traffic and flows are naturally aggregated and disaggregated through the use of network virtualization. Leveraging these points of control is critical to ensuring the success of new architectural and operational deployment models in the data center that allow organizations to realize their benefits of cost savings and operational efficiency. The application delivery tier is a strategic point of control in the new data center paradigm.
It affords organizations a flexible, scalable tier of control that can efficiently address all three components of operational risk. Consolidating inbound security with application delivery at the edge of the network makes good operational sense. It reduces operational risk across a variety of components by eliminating the complexity in the underlying architecture. Simplification leads to fewer points of failure, because complexity and operational risk really are BFF – you can’t address one without addressing the other.

F5 Friday: The Art of Efficient Defense F5 Friday: When Firewalls Fail… F5 Friday: Performance, Throughput and DPS The Pythagorean Theorem of Operational Risk At the Intersection of Cloud and Control… When the Data Center is Under Siege Don’t Forget to Watch Under the Floor Challenging the Firewall Data Center Dogma What CIOs Can Learn from the Spartans What is a Strategic Point of Control Anyway? Server Virtualization versus Server Virtualization

F5 Friday: Latency, Logging, and Sprawl
#v11 Logging, necessary for a variety of reasons in the data center, can consume resources and introduce undesirable latency. Avoiding that latency improves application performance and in some cases, the quality of logs. Logging. It’s mandatory and, in some industries, critical. Logs are used not only for auditing and tracking but for debugging, for data mining and analysis, and in some tiers of the architecture, replication and synchronization of data. Logs are a critical component across the data center, of that there is no doubt. That’s why it’s particularly frustrating to know that the cost in terms of performance is also one of the highest, lagging only slightly behind graphics in terms of performance costs. Given that there is very little graphics-related processing that goes on in data center components, disk I/O leaps to the top of the stack when it comes to performance impeding operations. The latency introduced by writing to a log often impacts the overall performance experienced by end-users because of the consumption of resources on the component by the logging operations. While generally out-of-band and thus non-blocking today, the consumption of resources can negatively impede performance by draining memory and using CPU cycles to perform its required tasks. Increasingly, as components are deployed in pairs, triples and more – owing to both scaling out physically and virtually – these logs also introduce “log sprawl” that can increase the cost associated with administration and make it more difficult to troubleshoot. After all, if you aren’t sure through which instance of a device a specific request was sent, you can’t easily find it in the log file. For all these reasons, centralized and generally off-box logging for data center components is becoming more critical. Consider it “logging as a service” if you will. This is not a new concept; centralized syslog servers have long been leveraged to provide a centralized, easier to manage log service that can be leveraged by just about every data center component. For load balancing services, the need is to not only centralize web-related logs but to ensure that they are written as fast as possible, to keep up with today’s demanding application environments. BIG-IP is no stranger to the need for high-speed, off-device logging and with v11 brings an open application, high-speed logging engine to bear. BIG-IP HIGH-SPEED LOGGING ENGINE One of the benefits of a unified, internal architecture is the ability to share improvements in the underlying platform across all products ultimately deployed on that platform. This is the case with TMOS, F5’s core application delivery technology. By enabling TMOS with a high-speed logging engine capable of up to 200,000 UDP/TCP messages per second, all modules – LTM, GTM, APM, ASM, WA, etc… – deployed on the TMOS platform automatically gain the benefits. Support for both local and external (off-box) logging enables you to centralize the data in third-party logging engines and meet security and compliance requirements. That means you can, ostensibly, leverage the visibility of a strategic point of control in the network to perform logging of web requests (and responses if required) rather than spread the responsibility across what may be an unknown number of web servers. 
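The off-box pattern itself is easy to picture. The rough Python sketch below formats a W3C extended-style log entry and ships it over UDP to a remote collector instead of writing to local disk. It illustrates the concept only; the collector address and field selection are assumptions, and this is not the BIG-IP high-speed logging implementation.

```python
# Conceptual sketch of off-box logging: emit a W3C extended-style log line over UDP
# to a remote collector rather than writing to local disk. The collector address is
# hypothetical; this illustrates the pattern, not BIG-IP's HSL engine.

import socket
from datetime import datetime, timezone

LOG_COLLECTOR = ("192.0.2.10", 514)   # example/hypothetical syslog-style collector

# One long-lived UDP socket; no connection setup or teardown per log entry.
_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def format_w3c_line(client_ip, method, uri, status, bytes_sent):
    """Build a W3C extended log format style entry (fields chosen for illustration)."""
    now = datetime.now(timezone.utc)
    return (f"{now:%Y-%m-%d %H:%M:%S} {client_ip} {method} {uri} "
            f"{status} {bytes_sent}")


def send_log(line):
    """Fire-and-forget UDP send: no disk I/O and no blocking wait for an ack,
    which is what keeps this style of logging fast."""
    _sock.sendto(line.encode("utf-8"), LOG_COLLECTOR)


if __name__ == "__main__":
    entry = format_w3c_line("203.0.113.7", "GET", "/index.html", 200, 5120)
    send_log(entry)
    print("sent:", entry)
```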
Consider that in a highly virtualized or cloud computing-based architecture, the number of servers required to meet current demand is variable, which makes collection of web-server-written logs more difficult unless an off-server log service is leveraged. That's because virtualized servers often simply write logs to the local disk, which may or may not be persistent enough to meet compliance – or operational – demands. It's also the case that some upstream infrastructure may modify the request and/or response, leaving logs with incomplete information. This is the case when an external application delivery controller acts as a cookie gateway, a common function for adding security and consistency to web applications. Thus, logging at a strategic point, closest to the client, provides the most accurate picture of the request.

Consider, too, the impact on writing logs in the face of an attack. DDoS counts on the consumption of resources to drain server and network component capacity, and by increasing the number of requests a server has to handle, it gets an added consumption bonus from the need to write each of them to the log. The same is true of upstream network components, which compounds the impact and drains more resources than necessary. By enabling high-speed logging on upstream devices, offloading responsibility to a log service, and eliminating the need for web servers to also write to disk, the impact of a concerted DDoS attack can be more effectively managed. And if you're going to use an off-server log service, it is more efficient to do so at a point upstream from the web servers and gain the benefit of reduced resource consumption on the servers.

Eliminating the resource consumption required by logging on the web server can have a very positive impact on the performance and capacity of the web server, which, when combined with improvements in logging speed and reduced consumption on the BIG-IP, translates into faster web applications and a simplified log management strategy. High-speed logging (HSL) is configurable using the GUI (via the Request Logging Profile) and supports the W3C extended log format.
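If you haven't run into the W3C extended log format before, the sketch below shows roughly what it produces: a directive header naming the fields, followed by space-delimited records. The field selection here is illustrative only and is not meant to mirror the profile's default template:

# Illustrative sketch of W3C extended log format output; the field list
# below is an example, not a product default.
from datetime import datetime, timezone

FIELDS = ["date", "time", "c-ip", "cs-method", "cs-uri", "sc-status", "time-taken"]

def w3c_header() -> str:
    """Directives that describe the records which follow."""
    return "#Version: 1.0\n#Fields: " + " ".join(FIELDS)

def w3c_record(c_ip: str, method: str, uri: str, status: int, time_taken: float) -> str:
    """One space-delimited record matching the #Fields directive above."""
    now = datetime.now(timezone.utc)
    return " ".join([
        now.strftime("%Y-%m-%d"),
        now.strftime("%H:%M:%S"),
        c_ip, method, uri, str(status), f"{time_taken:.3f}",
    ])

print(w3c_header())
print(w3c_record("203.0.113.7", "GET", "/index.html", 200, 0.042))

Because every record follows the #Fields directive, downstream analysis tools can parse logs from any number of upstream devices consistently, no matter which instance emitted them.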
Happy Logging!

Strategic Trifecta: Access Management

#mobile A single, contextual point of control for access management can ease the pain of managing the explosion of client devices in enterprise environments.

Regardless of the approach to access management, ultimately any solution must include the concept of control. Control over data, over access to corporate resources, over processes, and over actions by users themselves. The latter requires a non-technological solution: education and clear communication of policies that promote a collaborative approach to security. As Michael Santarcangelo, a.k.a. The Security Catalyst, explains: "Our success depends on our ability to get closer to people, to work together to bridge the human paradox gap, to partner on how we protect information." (Why dropping the label of "users" improves how we practice security) This includes facets of security that simply cannot be effectively addressed through technology: don't share confidential information on social networks; be aware of where corporate data may be at rest and protect it with passwords and encryption if it resides on a personal device. Because of the nature of mobile devices, technology cannot seriously address these concerns without extensive assistance from service providers, who are unlikely to be willing to implement what would be customer-specific controls over data within their already stressed networks. This means education and clear communication will be imperative to successfully navigating the growing security chasm between IT and mobile devices.

The issues regarding control over access to corporate resources, however, can be addressed through the implementation of policies that govern access to resources in all their aspects. Like cloud, control is at the center of solving most policy enforcement issues that arise, and like cloud, control is likely to be difficult to obtain. That is increasingly true as the number of devices grows at a rate nearly commensurate with that of data. IT security pros are outnumbered, and attempts to continue manually configuring and deploying policies that govern access to corporate resources from myriad evolving clients will inevitably end with an undesirable result: failure resulting in a breach of policy.

Locating and leveraging strategic points of control within the data center architecture can be invaluable in reducing the effort required to manually codify policies, and it provides a means to uniformly enforce policies across devices and corporate resources. A strategic point of control offers both context and control, both of which are necessary to applying the right policy at the right time. A combination of user, location, device and resource must be considered when determining whether access should or should not be allowed, and it is at those points within the architecture where resources and users meet that policies can be enforced most efficiently.
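As a rough illustration (and emphatically not any particular product's policy engine), the sketch below captures the kind of contextual decision such an enforcement point makes: the verdict depends on the combination of user, device, location and requested resource. The policy table and attribute names are hypothetical:

# Minimal sketch of a context-based access decision at a strategic point of
# control; the policy table and attribute values are hypothetical.
from dataclasses import dataclass

@dataclass
class AccessContext:
    user: str
    device_type: str   # e.g. "managed-laptop", "personal-mobile"
    location: str      # e.g. "corporate-lan", "internet"
    resource: str      # e.g. "crm", "intranet-wiki"

# Hypothetical policy: which device types may reach each resource.
POLICY = {
    "crm":           {"managed-laptop"},                     # sensitive: managed devices only
    "intranet-wiki": {"managed-laptop", "personal-mobile"},  # low sensitivity
}

def allow(ctx: AccessContext) -> bool:
    """Return True only when the device and location are acceptable for the resource."""
    permitted_devices = POLICY.get(ctx.resource, set())
    if ctx.device_type not in permitted_devices:
        return False
    # In this sketch, personal devices are trusted only from the corporate network.
    if ctx.device_type == "personal-mobile" and ctx.location != "corporate-lan":
        return False
    return True

print(allow(AccessContext("alice", "personal-mobile", "internet", "crm")))            # False
print(allow(AccessContext("alice", "managed-laptop", "internet", "crm")))             # True
print(allow(AccessContext("bob", "personal-mobile", "corporate-lan", "intranet-wiki")))  # True

The value of evaluating that logic at a strategic point of control is that the same decision applies consistently no matter what kind of client shows up next.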
Consider those "security" concerns that involve access to applications from myriad endpoints. Each endpoint has its own set of capabilities, some more limited than others, for participating in authentication and authorization processes, processes which are necessary to protect applications and resources from illegitimate access and to ensure audit trails and access logs are properly maintained. Organizations that have standardized on Kerberos-based architectures to support single sign-on and centralize identity management find that new devices often cannot be supported.

Allowing users access from new devices lacking native support for Kerberos both impacts productivity and increases the operational burden by potentially requiring additional integration points to ensure consistent back-end authentication and authorization support. Leveraging a strategic point of control capable of transitioning between non-Kerberos authentication methods and a Kerberos-enabled infrastructure provides a centralized location at which the same corporate policies governing access can be applied. This has the added benefit of enabling single sign-on for new devices that would otherwise fall outside the realm of inclusion.

Aggregating access management at a single point within the architecture allows the same operational and security processes that govern access to be applied to new devices based on similar contextual clues. That single, strategic point of control affords organizations the ability to consistently apply policies governing access even in the face of new devices, because it simplifies the architecture and provides a single location at which those policies and processes are enforced and enabled. It also allows separation of client from resource, and encapsulates access services such that entirely new access management architectures can be deployed and leveraged without disruption. Perhaps a solution to the exploding mobile community in the enterprise is a secondary and separate AAA architecture. Leveraging a strategic point of control makes that possible by providing a service layer over the architectures and then leveraging the organizationally appropriate one based on context: on the device, user and location.

The key to a successful mobile security strategy is an agile infrastructure and architectural implementation. Security is a moving target, and mobile device management seems to make that movement a bit more frenetic at times. An agile infrastructure with a more services-oriented approach to policy enforcement, one that decouples clients from security infrastructure and processes, will go a long way toward enabling the control and flexibility necessary to meet the challenges of a fast-paced, consumerized client landscape.

Meet the Challenge of Consumerization by Managing Applications Instead of Clients
The Consumerization of IT: The OpsStore
This is Why We Can't Have Nice Things
Data Center Optimization is Like NASCAR without the Beer
What CIOs Can Learn from the Spartans
What is a Strategic Point of Control Anyway?
What is Network-based Application Virtualization and Why Do You Need It?
Solutions are Strategic. Technology is Tactical.