#infra2 #openflow #sdn #devops As cloud recedes, it reveals what it hid when the hype took it big: a focus on the network.
Is it making networking sexy again? Yes, insofar as we’re at least talking about networking again – which is about all you can expect, considering that the network is an integral component of all the other technologies and models that took the spotlight from it in the first place.
Considering recent commentary on SDN* and OpenFlow, it seems folks are still divided and are trying to figure out where the technology fits – or if it fits – in modern data center architectures.
Of course, many of the problems the SDN vendors cite – VM mobility, the limited range of VLAN IDs, the inability to move L2/L3 networking among data centers and the inflexibility of current networking command and control – are problems faced only by cloud providers and a handful of very large companies with big, global data centers. In other words: a rather small number of customers.
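That VLAN ID limitation is worth making concrete, because it’s simple arithmetic: the 802.1Q tag carries a 12-bit VLAN ID, which caps the number of isolated L2 segments a plain VLAN scheme can provide. A quick sketch:

```python
# The 802.1Q VLAN tag carries a 12-bit VLAN ID field.
VLAN_ID_BITS = 12
total_ids = 2 ** VLAN_ID_BITS   # 4096 possible values
usable_ids = total_ids - 2      # 0x000 and 0xFFF are reserved

print(usable_ids)  # 4094 usable VLANs at most

# One isolated L2 segment per tenant exhausts the space quickly
# at cloud-provider scale:
tenants = 10_000
print(tenants > usable_ids)  # True: plain VLANs can't isolate every tenant
```

Which is exactly why this pain is felt by cloud providers and the very largest enterprises, and barely registers for everyone else.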
I think Mike is pretty spot on with this prediction. Essentially, the majority of organizations will end up leveraging SDN for something more akin to network management in a hybrid network architecture, though not necessarily for the reasons he cites. It won’t necessarily be a lack of need so much as a lack of universal need – and the cost of such a massive disruption to the data center.
With that in mind, we need to spend some time thinking about where SDN fits in the overall data center architecture. Routing and switching are only one part of the puzzle that is the dynamic data center, after all, and while SDN’s target problems include the dynamism inherent in on-demand provisioning of resources, it alone cannot solve them. Its current focus lies most often on solving how to get from point A to point B through the network when point B is a moving target – and doing so dynamically, adjusting flows in a way that’s optimal given … well, given basic networking principles like shortest path routing. Not that it will remain that way, mind you, but for the nonce that’s the way it is.
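That “point A to point B” computation is, at heart, a shortest-path problem over the current topology. A minimal, self-contained sketch of what a controller does when it recomputes a path (this is plain Dijkstra over a toy topology, not any controller’s actual code):

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra's shortest path over a dict-of-dicts adjacency map."""
    dist = {src: 0}
    prev = {}
    queue = [(0, src)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(queue, (nd, neighbor))
    # Walk predecessors back from the destination to recover the path.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Toy topology: link costs between four switches.
net = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}
print(shortest_path(net, "A", "D"))  # ['A', 'B', 'C', 'D']
```

The hard part in a cloud isn’t the algorithm; it’s that “D” keeps moving, which forces the recomputation and redistribution of results constantly.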
Greg Ferro sums up and explains in his typical straight-to-the-point manner the core concepts behind OpenFlow and SDN in a recent post.
OpenFlow defines a standard for sending flow rules to network devices so that the Control Plane can add them to the forwarding table for the Data Plane. These flow rules contain fields for elements such as source & destination MAC, source & destination IP, source & destination TCP ports, VLAN, QoS and MPLS tags and more. The flow rules are then added to the existing forwarding table in the network device.
The forwarding table is what all routers and switches use to dispatch frames and packets to their egress ports.
OpenFlow’s value is realised in the Controller, and the most interesting changes arise because the Controller gains new capabilities and truly granular control of the traffic flows.
Therefore, OpenFlow is neither routing nor switching; it’s about forwarding.
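Greg’s description of a flow rule – match fields plus an action, added to the forwarding table – can be sketched in a few lines. This is a deliberately simplified, hypothetical model of the shape the spec describes, not a real controller API:

```python
from dataclasses import dataclass, field

@dataclass
class FlowRule:
    # Match fields as described above: eth_src, ipv4_dst, tcp_dst,
    # vlan_id, etc. Any field left unset acts as a wildcard.
    match: dict = field(default_factory=dict)
    action: str = "DROP"        # e.g. "OUTPUT:3", "DROP"
    priority: int = 0

def lookup(flow_table, packet):
    """Return the action of the highest-priority rule whose match
    fields all agree with the packet; default to flooding, as a
    learning switch would for an unknown destination."""
    best = None
    for rule in flow_table:
        if all(packet.get(k) == v for k, v in rule.match.items()):
            if best is None or rule.priority > best.priority:
                best = rule
    return best.action if best else "FLOOD"

table = [
    FlowRule({"ipv4_dst": "10.0.0.5", "tcp_dst": 80}, "OUTPUT:3", priority=10),
    FlowRule({"vlan_id": 42}, "DROP", priority=5),
]
pkt = {"ipv4_dst": "10.0.0.5", "tcp_dst": 80, "vlan_id": 42}
print(lookup(table, pkt))  # OUTPUT:3 (higher priority wins)
```

The point of the sketch: the device itself only matches and forwards. All the intelligence about *which* rules exist lives in the Controller.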
This simple statement is central to the “big picture” when you step back and try to put SDN and OpenFlow into perspective within an existing, modern data center architecture, because OpenFlow is designed to solve specific problems, not necessarily replace the entire network (if you’re starting from scratch, that’s likely a different story). It’s about forwarding and, in particular, forwarding in a dynamic, volatile environment such as exists in cloud computing models. Where SDN and OpenFlow appear to offer the most value to existing data centers experiencing this problem is in the network pockets that must deal with volatility inside the data center at the application infrastructure (server) tiers, where resource lifecycle management in large installations is likely to cause the most disruption.
The application delivery tier already includes the notion of separating the control plane from the data plane. That’s the way it’s been for many years, though the terminology did not always exist to describe it as such. That separation has always been necessary to abstract the notion of an “application” or “service” from its implementation, and to allow reliability and availability strategies – implemented through technology like load balancing and failover – to be transparent to the consumer. The end-point in the application delivery tier is static; not rigid, but static, because there’s no need for it to change dynamically. What was dynamic – and has become even more dynamic today – are the resources that comprise the abstracted application: the various application instances (virtual machines) that make up the “application”. Elasticity is implemented in the application delivery tier by seamlessly ensuring that consumers are able to access resources whether demand is high or low.
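That separation – a static endpoint abstracting a dynamic pool of instances – can be illustrated in a few lines. The class and names below are purely illustrative (no vendor’s API is implied); the point is that the VIP consumers connect to never changes while the pool behind it grows and shrinks:

```python
import itertools

class VirtualServer:
    """Sketch of the application delivery abstraction: a static virtual
    endpoint (VIP) fronting a dynamic pool of application instances."""

    def __init__(self, vip):
        self.vip = vip        # static endpoint seen by consumers
        self.pool = []        # dynamic set of instance addresses
        self._rr = None

    def add_member(self, addr):
        self.pool.append(addr)
        self._rr = itertools.cycle(self.pool)

    def remove_member(self, addr):
        self.pool.remove(addr)
        self._rr = itertools.cycle(self.pool) if self.pool else None

    def pick(self):
        """Round-robin selection: the control-plane decision that maps
        the static 'application' onto whichever instances exist now."""
        return next(self._rr)

app = VirtualServer("203.0.113.10:443")
app.add_member("10.0.0.11:8443")
app.add_member("10.0.0.12:8443")   # elastic scale-out; the VIP is unchanged
print(app.pick())
print(app.pick())
```

Scale-out and scale-in change only the pool; the consumer-facing address stays put. That is precisely the transparency the application delivery tier has provided for years.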
In modern data center models, the virtualization management systems – orchestration, automation, provisioning – are part of the equation, ensuring elasticity is possible by managing the capacity of resources in a manner commensurate with demand seen at the application delivery tier.
As resources in the application infrastructure tier are launched and shut down, as they move from physical location to physical location across the network, there is chaos. The diseconomy of scale that has long been mentioned in conjunction with virtualization and cloud computing happens here, inside the bowels of the data center. It is the network that connects the application delivery tier to the application infrastructure tier that is constantly in motion in large installations and private cloud computing environments, and it is here that SDN and OpenFlow show the most promise to achieve the operational efficiencies needed to contain costs and reduce potential errors due to overwhelmingly high volumes of changes in network configurations.
What’s missing is how that might happen. While the mechanisms and protocols used to update forwarding and routing tables on switches and routers are well-discussed, the impetus for such updates and changes is not. From where do such changes originate?
In a fully automated, self-aware data center (one that does not exist and may never do so) the mere act of provisioning a virtual machine (application) would trigger such changes. In more evolutionary data centers (the more likely case) such changes will be initiated by provisioning system events, whether triggered automatically or at the behest of a user (in IT as a Service scenarios) – perhaps through data or options contained in existing network discovery protocols, or through integration between the virtualization management systems and the SDN management plane. Because one of the core value propositions of SDN and OpenFlow is centralized control, one assumes that such functionality would be realized via integration between the two rather than through modification and extension of existing protocols (although both methods would be capable, if we’re careful, of maintaining compatibility with non-SDN enabled networking components).
This is being referred to in some cases as the “northbound” API, while the connectivity between the controller and the network components is referred to as the “southbound” API.
OpenFlow, the southbound API between the controller and the switch, is getting most of the attention in the current SDN hype-fest, but the northbound API, between the controller and the data center automation system (orchestration) will yield the biggest impact for users. SDN has the potential to be extremely powerful because it provides a platform to develop new, higher level abstractions. The right abstraction can free operators from having to deal with layers of implementation detail that are not scaling well as networks increasingly need to support “Hyper-Scale” data centers.
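What that northbound integration might look like can be sketched as an event loop: the orchestration system notifies the controller when a VM is provisioned, migrated, or torn down, and the controller reprograms forwarding southbound in response. Everything here is invented for illustration – a real deployment would use a concrete controller’s northbound API rather than these stand-in classes:

```python
class Controller:
    """Hypothetical SDN controller receiving orchestration events
    (northbound) and emitting flow updates to switches (southbound)."""

    def __init__(self):
        self.location = {}   # vm_mac -> (switch, port)

    def on_vm_event(self, event):
        # Northbound: orchestration tells us what changed and where.
        if event["type"] in ("provisioned", "migrated"):
            self.location[event["mac"]] = (event["switch"], event["port"])
            self.push_flow(event["mac"], event["switch"], event["port"])
        elif event["type"] == "deprovisioned":
            self.location.pop(event["mac"], None)

    def push_flow(self, mac, switch, port):
        # Southbound: stand-in for an OpenFlow flow-mod to the switch.
        print(f"flow-mod to {switch}: eth_dst={mac} -> output:{port}")

ctl = Controller()
ctl.on_vm_event({"type": "provisioned", "mac": "aa:bb:cc:00:00:01",
                 "switch": "tor-3", "port": 7})
# The VM moves; orchestration fires another event and forwarding follows it.
ctl.on_vm_event({"type": "migrated", "mac": "aa:bb:cc:00:00:01",
                 "switch": "tor-5", "port": 2})
```

The volume of such events in a large installation is exactly the diseconomy of scale discussed above – and automating their translation into forwarding changes is where the operational payoff lies.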
In this way, SDN and OpenFlow provide the means by which the diseconomy of scale and volatility inherent in cloud computing and optimized resource utilization models can be brought under control and even reversed.
Isn’t that the problem Infrastructure 2.0 has, in part, been trying to address? Early on we turned to a similar, centralized model in which IF-MAP provided the controller necessary to manage changes in the underlying network. An SDN-OpenFlow based model simply replaces that central controller with another, and distributes the impact of the scale of change across all network devices by delegating responsibility for implementation to individual components upon an “event” that changes the network topology.
Dynamic infrastructure [aka Infrastructure 2.0] is an evolution of traditional network and application network solutions to be more adaptable, support integration with its environment and other foundational technologies, and to be aware of context (connectivity intelligence).
What some SDN players are focusing on is a more complete architecture – one that’s entirely SDN and unfortunately only likely to happen in green field environments, or over time. That model, too, is interesting in that traditional data center tiers will still “exist” but would not necessarily be hierarchical, and would instead use the programmable nature of the network to ensure proper forwarding within the data center. Which is why this is going to ultimately fall into the realm of expertise owned by devops.
But all this is conjecture, at this point, with the only implementations truly out there still housed in academia. Whether it will make it into the data center depends on how disruptive and difficult it will be to integrate with existing network architectures. Because just like cloud, enterprises don’t rip and replace – they move cautiously in a desired direction. As in the case of cloud computing, strategies will likely revolve around hybrid architectures enabled by infrastructure integration and collaboration.
Which is what infrastructure 2.0 has been about from the beginning.
* Everyone right now is fleshing out definitions of SDN and jockeying for position, each to their own benefit of course. How it plays out remains to be seen, but I’m guessing we’re going to see a cloud-like evolution – in other words, a chaos of definitions. I don’t have a clear one as of yet, so I’m content (for now) to take at face value the definition offered by ONS and pursue how – and where – that might benefit the enterprise. I don’t see that it has to be all or nothing, obviously.