Data Center Feng Shui: Fault Tolerance and Fault Isolation
The difference between fault isolation and fault tolerance is not necessarily intuitive, and like most architectural decisions, the two goals are not mutually exclusive. The differences, though subtle, are profound and have a substantial impact on data center architecture.

Fault tolerance is an attribute of systems and architectures that allows them to continue performing their tasks in the event of a component failure. Fault tolerance in servers, for example, is achieved through redundancy in power supplies, hard drives, and network cards. In an architecture, fault tolerance is likewise achieved through redundancy by deploying two of everything: two servers, two load balancers, two switches, two firewalls, two Internet connections. A fault-tolerant architecture includes no single point of failure, no component that can fail and cause a disruption in service. Load balancing, for example, is a fault tolerance-based strategy that leverages multiple application instances to ensure that the failure of one instance does not impact the availability of the application.

Fault isolation, on the other hand, is an attribute of systems and architectures that contains the impact of a failure so that only a single system, application, or component is affected. Fault isolation allows a component to fail as long as the failure does not impact the overall system. That sounds like a paradox, but it's not. Many intermediary devices employ a "fail open" strategy as a method of fault isolation. When a network device is required to intercept data in order to perform its task – a common web application firewall configuration – it becomes a single point of failure in the data path. To mitigate a potential failure of the device, if something causes the system to crash it "fails open" and acts like a simple network bridge, forwarding packets on to the next device in the chain without performing any processing. If the same component were deployed in a fault-tolerant architecture, two devices would be deployed, ideally leveraging non-network-based failover mechanisms. Similarly, application infrastructure components are often isolated through a contained deployment model (such as sandboxes) that prevents a failure – whether an outright crash or a sudden, massive consumption of resources – from impacting other applications.

Fault isolation is of increasing interest as it relates to cloud computing environments, as part of a strategy to minimize the perceived negative impact of shared network, application delivery network, and server infrastructure.
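The "fail open" pattern described above is easy to reason about in code. Below is a minimal, hypothetical sketch (not any particular vendor's implementation) of an inspecting intermediary that degrades to a plain pass-through when its inspection engine fails, isolating the fault instead of breaking the data path. Note the design choice: a policy block is honored, but an engine fault is not allowed to take down the service.

```python
def inspect(payload: bytes) -> bytes:
    """Hypothetical inspection step, e.g. a web application firewall rule set.
    May raise unexpectedly if the inspection engine itself is broken."""
    if b"<script>" in payload:
        raise ValueError("blocked: suspicious content")
    return payload


def fail_open_intermediary(payload: bytes, forward) -> None:
    """Forward traffic after inspection; if the inspection engine crashes,
    fall back to acting like a simple bridge and forward the payload untouched."""
    try:
        checked = inspect(payload)
    except ValueError:
        # A policy decision (block) is not a fault -- honor it.
        return
    except Exception:
        # An engine failure is a fault -- fail open and keep the data path alive.
        checked = payload
    forward(checked)


if __name__ == "__main__":
    fail_open_intermediary(b"GET / HTTP/1.1", lambda p: print("forwarded", len(p), "bytes"))
```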
Dispelling the New SSL Myth

Claiming SSL is not computationally expensive is like saying gas is not expensive when you don't have to drive to work every day.

My car is eight years old this year. It has less than 30,000 miles on it. Yes, you heard that right, less than 30,000 miles. I don't drive my car very often because, well, my commute is a short trip down two flights of stairs. I don't need to go very far when I do drive; it's only ten miles or so round trip to the grocery store. So from my perspective, gas isn't really very expensive. I may use a tank of gas a month, which works out to … well, it's really not even worth mentioning the cost. But for someone who commutes every day – especially someone who commutes a long distance every day – gas is expensive. It's a significant expense every month, and they would certainly dispute my assertion that the cost of gas isn't a big deal. My youngest daughter, for example, would say gas is very expensive – but she's got a smaller pool of cash from which to buy gas, so relatively speaking, we're both right.

The same is true for anyone claiming that SSL is not computationally expensive. The way in which SSL is used – the ciphers, the certificate key lengths, the scale – has a profound impact on whether "computationally expensive" is an accurate statement. And as usual, it's not just about speed – it's also about the costs associated with achieving that performance. It's about efficiency, and leveraging resources in a way that enables scalability. It's not the cost of gas alone that's problematic, it's the cost of driving, which also has to take into consideration factors such as insurance, maintenance, tires, parking fees, and other driving-related expenses.

MYTH: SSL is NOT COMPUTATIONALLY EXPENSIVE

Today, SSL is still computationally expensive. Improvements in processor speeds have, in some circumstances, made that expense less impactful. Circumstances are changing. Commoditized x86 hardware can in fact handle SSL a lot better today than it ever could before – when you're using 1024-bit keys and "easy" ciphers like RC4. Under such parameters it is true that commodity hardware may perform efficiently and scale up better than ever when supporting SSL. Unfortunately for proponents of SSL-on-the-server, 1024-bit keys are no longer the preferred option, and security professionals are likely well aware that "easy" ciphers are also "easy" pickings for miscreants.

In January 2011, NIST recommendations regarding the deployment of SSL went into effect. While NIST is not a standards body that can require compliance from commercial organizations, it can and does force government and military compliance, and it has shown its influence with commercial certificate authorities. All commercial certificate authorities now issue only 2048-bit keys. This increase has a huge impact on the capacity of a server to process SSL and renders the claim that SSL is no longer computationally expensive completely inaccurate. A typical server that could support 1500 TPS using 1024-bit keys will only support 1/5 of that (around 300 TPS) when supporting modern best practices, i.e. 2048-bit keys. Also of note is that NIST recommends ephemeral Diffie-Hellman – not RSA – for key exchange, and, per the TLS 1.0 specification, AES or 3DES-EDE-CBC, not RC4. These are much less "easy" ciphers than RC4, but unfortunately they are also more computationally intense, which also has an impact on overall performance.
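The capacity drop when moving from 1024-bit to 2048-bit keys is easy to verify on commodity hardware. The sketch below, which assumes the third-party cryptography package is installed, times the RSA private-key operation that dominates the server side of a classic RSA handshake; the absolute numbers will vary by CPU, but the ratio between the two key sizes tells the story.

```python
import time

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa


def sign_rate(key_size: int, iterations: int = 200) -> float:
    """Approximate RSA private-key operations per second for a given key size."""
    key = rsa.generate_private_key(public_exponent=65537, key_size=key_size)
    message = b"x" * 32  # stand-in for the handshake digest
    start = time.perf_counter()
    for _ in range(iterations):
        key.sign(message, padding.PKCS1v15(), hashes.SHA256())
    return iterations / (time.perf_counter() - start)


if __name__ == "__main__":
    r1024 = sign_rate(1024)
    r2048 = sign_rate(2048)
    print(f"1024-bit: {r1024:,.0f} ops/s")
    print(f"2048-bit: {r2048:,.0f} ops/s")
    print(f"2048-bit keys deliver roughly {r1024 / r2048:.1f}x fewer operations per second")
```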
Key length and ciphers become important to the performance and capacity of SSL not just during the handshaking process, but in bulk-encryption rates. It is one thing to say a standard server deployed to support SSL can handle X handshakes (connections) and quite another for it to simultaneously perform bulk encryption on subsequent data responses. The size and number of those responses have a huge impact on the rate at which resources are consumed by SSL-related functions, and thus on the server's overall capacity. Larger data sets require more cryptographic attention, which can drag down the rate of encryption – that means slower response times for users and higher resource consumption on servers, which decreases the resources available for handshaking and server processing and cascades throughout the entire system, resulting in reduced capacity and poor performance.

Tweaked configurations, poorly crafted performance tests, and a failure to consider basic mathematical relationships may seem to indicate SSL is "not" computationally expensive, yet this contradicts most experience with deploying SSL on the server. Consider this question and answer in the SSL FAQ for the Apache web server:

Why does my webserver have a higher load, now that it serves SSL encrypted traffic? SSL uses strong cryptographic encryption, which necessitates a lot of number crunching. When you request a webpage via HTTPS, everything (even the images) is encrypted before it is transferred. So increased HTTPS traffic leads to load increases.

This is not myth, this is a well-understood fact: SSL requires a higher computational load, which translates into higher consumption of resources. That consumption of resources increases with load. Having more resources does not change the consumption of SSL; it simply means that, from a mathematical point of view, the consumption rates relative to the total appear to be different. The "amount" of resources consumed by SSL (which is really the amount of resources consumed by cryptographic operations) is proportional to the total system resources available. The additional consumption of resources from SSL is highly dependent on the type and size of data being encrypted, the load on the server from both processing SSL and application requests, and the volume of requests.

Interestingly enough, the same improvements in capacity and performance of SSL associated with "modern" processors and architecture are also applicable to intermediate SSL-managing devices. Both their specialized hardware (if applicable) and general-purpose CPUs significantly increase the capacity and performance of SSL/TLS encrypted traffic on such solutions, making their economy of scale much greater than that of server-side deployed SSL solutions.

THE SSL-SERVER DEPLOYED DISECONOMY of SCALE

Certainly, if you have only one or even two servers supporting an application for which you want to enable SSL, the costs are going to be significantly different than for an organization that may have ten or more servers comprising such a farm. It is not just the computational costs that make SSL deployed on servers problematic, it is also the associated impact on infrastructure and the cost of management. Reports that fail to factor in the associated performance and financial costs of maintaining valid certificates on each and every server – and the management and creation of SSL certificates for ephemeral virtual machines – are misleading.
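One of those rarely counted costs is simply keeping track of which certificate on which server expires when. A minimal sketch of that kind of audit, using only Python's standard library and hypothetical host names, is shown below; in a farm of dozens of servers and ephemeral virtual machines, this becomes a process that has to run constantly rather than a one-time check.

```python
import socket
import ssl
from datetime import datetime, timezone

# Hypothetical server farm; in practice this list comes from inventory or orchestration.
SERVERS = ["app1.example.com", "app2.example.com", "api.example.com"]


def days_until_expiry(host: str, port: int = 443) -> int:
    """Return how many days remain before the server's certificate expires."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc)
    return (expires - datetime.now(timezone.utc)).days


if __name__ == "__main__":
    for host in SERVERS:
        try:
            remaining = days_until_expiry(host)
            flag = "RENEW SOON" if remaining < 30 else "ok"
            print(f"{host}: {remaining} days left ({flag})")
        except (OSError, ssl.SSLError) as exc:
            print(f"{host}: check failed ({exc})")
```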
Such server-deployed SSL solutions assume a static environment and deep pockets, or perhaps less-than-ethical business practices. Such tactics attempt to reduce the capital expense associated with external SSL intermediaries by increasing the operational expense of purchasing and managing large numbers of SSL certificates – including having a ready store that can be used for virtual machine instances. As the number of services for which you want to provide SSL-secured communication increases, and as the scale of those services increases, it becomes more costly to manage the required environment. Like IP address management in an increasingly dynamic environment, there is a diseconomy of scale that becomes evident as you attempt to scale the systems and processes involved.

DISECONOMY of SCALE #1: CERTIFICATE MANAGEMENT

Obviously, the more servers you have, the more certificates you need to deploy. The costs associated with management of those certificates – especially in dynamic environments – continue to rise, and the possibility of missing an expiring certificate increases with the number of servers on which certificates are deployed. The promise of virtualization and cloud computing is to address the diseconomy of scale; the ability to provision a ready-to-function server, complete with the appropriate web or application stack serving up an application for purposes of scale, assumes that everything is ready. You cannot achieve this with a server-deployed SSL strategy unless SSL certificates are properly provisioned on every image. Each virtual image upon which a certificate is deployed must be pre-configured with the appropriate certificate and keys, and you can't launch the same one twice. This has the result of negating the benefits of a dynamically provisioned, scalable application environment and unnecessarily increases storage requirements, because images aren't small. Failure to recognize and address the management overhead and its resulting impact on other areas of infrastructure (such as storage and scalability processes) means ignoring completely the actual real-world costs of a server-deployed SSL strategy.

It is always interesting to note the inability of web servers to support SSL for multiple hosts on the same server, i.e. virtual hosts:

Why can't I use SSL with name-based/non-IP-based virtual hosts? The reason is very technical, and a somewhat "chicken and egg" problem. The SSL protocol layer stays below the HTTP protocol layer and encapsulates HTTP. When an SSL connection (HTTPS) is established Apache/mod_ssl has to negotiate the SSL protocol parameters with the client. For this, mod_ssl has to consult the configuration of the virtual server (for instance it has to look for the cipher suite, the server certificate, etc.). But in order to go to the correct virtual server Apache has to know the Host HTTP header field. To do this, the HTTP request header has to be read. This cannot be done before the SSL handshake is finished, but the information is needed in order to complete the SSL handshake phase.

Bingo! Because an intermediary terminates the SSL session and then determines where to route the requests, a variety of architectures can be more easily supported without the hassle of configuring each and every web server – which must be bound to an IP address to support SSL in a virtual host environment. This isn't just a problem for hosting and cloud computing providers; it is a common issue faced by organizations supporting different "hosts" across the domain for tracking, for routing, and for architectural control.
For example, api.example.com and www.example.com often end up on the same web server, but use different "hosts" for a variety of reasons. Each requires its own certificate and SSL configuration – and each must be bound to an IP address – making scalability, particularly auto-scalability, more challenging and more prone to the introduction of human error. The OpEx savings in a single year from SSL certificate costs alone could easily provide an ROI justification for the CapEx of deploying an SSL device, before even considering the costs associated with managing such an environment. CapEx is a one-time expense, while OpEx is recurring and expensive.

DISECONOMY of SCALE #2: CERTIFICATE/KEY SECURITY

The simplistic nature of the argument also fails to take into account the sensitive nature of keys and certificates and the regulatory compliance issues that may require hardware-based storage and management of those keys regardless of where they are deployed (FIPS 140-2 level 2 and above). While there are secure and compliant HSMs (Hardware Security Modules) that can be deployed on each server, this requires serious attention and an increase in management effort and skills to deploy. The alternative is to fail to meet compliance (not acceptable for some) or simply deploy the keys and certificates on commoditized hardware (which increases the risk of theft and could lead to far more impactful breaches). For some IT organizations to meet business requirements, they will have to rely on some form of hardware-based solution for certificate and key management, such as an HSM or FIPS 140-2 compliant hardware. The choices are to deploy on every server (which may become very problematic when trying to support virtual machines) or to deploy on a single intermediary that can support all servers at the same time and scale without requiring additional hardware or software.

DISECONOMY of SCALE #3: LOSS of VISIBILITY / SECURITY / AGILITY

SSL "all the way to the server" has a profound impact on the rest of the infrastructure, too, and on the scalability of services. Encrypted traffic cannot be evaluated, scanned, or routed based on content by any upstream device. IDS, IPS, and even so-called "deep packet inspection" devices upstream of the server cannot perform their tasks on the traffic because it is encrypted. The solution is to deploy the certificates from every machine on those devices so that they can decrypt and re-encrypt the traffic. Obviously this introduces unacceptable amounts of latency into the exchange of data, but the alternative is to not scan or inspect the traffic, leaving the organization open to potential compromise.

It is also important to note that encrypting "bad" traffic – malicious code, malware, phishing links, and the like – does not change the nature of that traffic. It's still bad; it's also now "hidden" from every piece of security infrastructure that was designed and deployed to detect and stop it. A server-deployed SSL strategy eliminates visibility and control and the ability to rapidly address both technical and business-related concerns. Security is particularly negatively impacted. Emerging threats, such as a new worm or virus for which AV signatures have not yet been updated, can be immediately addressed by an intelligent intermediary – whether as a long-term solution or a stop-gap measure.
Vulnerabilities in security protocols themselves, such as the TLS man-in-the-middle attack, can be immediately addressed by an intelligent, flexible intermediary long before the actual solutions providing the service can be patched and upgraded.

A purely technical approach to architectural decisions regarding the deployment of SSL or any other technology is simply unacceptable in an IT organization that is actively trying to support and align itself with the business. Architectural decisions of this nature can have a profound impact on the ability of IT to subsequently design, deploy, and manage business-related applications and solutions, and should not be made in a technical or business vacuum, without a full understanding of the ramifications.

The Anatomy of an SSL Handshake [Network Computing]
Get Ready for the Impact of 2048-bit RSA Keys [Network Computing]
SSL handshake latency and HTTPS optimizations [semicomplete.com]
Black Hat: PKI Hack Demonstrates Flaws in Digital Certificate Technology [DarkReading]
SSL/TLS Strong Encryption: FAQ [apache.org]
The Open Performance Testing Initiative
The Order of (Network) Operations
Congratulations! You do no nothing faster than anyone else!
Data Center Feng Shui: SSL
WILS: SSL TPS versus HTTP TPS over SSL
F5 Friday: The 2048-bit Keys to the Kingdom
TLS Man-in-the-Middle Attack Disclosed Yesterday Solved Today with Network-Side Scripting
Cloud Infrastructure Integration Model: Bridging

Examining architectures on which hybrid clouds are based…

IT professionals, in general, appear to consider themselves well along the path toward IT as a Service, with a significant plurality of them engaged in implementing many of the building blocks necessary to support the effort. IaaS, PaaS, and hybrid cloud computing models are essential for IT to realize an environment in which (manageable) IT as a Service can become reality. That IT professionals – 65% of them, to be exact – note their organization is in progress with or has already completed a hybrid cloud implementation is telling, as it indicates a desire to leverage resources from a public cloud provider. What the simple "hybrid cloud" moniker doesn't illuminate is how IT organizations are implementing such a beast. To be sure, integration is always a rough road, and integrating not just resources but their supporting infrastructure must certainly be a non-trivial task. That's especially true given that there exists no "standard" or even "best practices" means of integrating the infrastructure between a cloud and a corporate data center. Specifications designed to address this gap are emerging, and there are a number of commercial solutions available that provide the capability to transparently bridge cloud-hosted resources with the corporate data center. Without diving into the mechanism – standards-based or product solution – we can still examine the integration model from the perspective of its architectural goals, its advantages, and its disadvantages.

THE BRIDGED CLOUD INTEGRATION ARCHITECTURE

The basic premise of a bridged-cloud integration architecture is to transparently enable communication with and use of cloud-deployed resources. While the most common type of resources to be integrated will be compute, these resources may also be network or storage focused. A bridged-cloud integration architecture provides a seamless view of those resources. Infrastructure and applications deployed within the data center are able to communicate in an environment-agnostic manner, with no requirement for awareness of location. This is the premise of the network-oriented standards emerging as a solution: they portend the ability to extend the corporate data center network into a public cloud (or other geographically disparate location) network and make the two appear as a single, logical network. Because of the reliance of infrastructure components on network topology, this is an important capability. Infrastructure within the data center, and the services it provides, are able to interact with and continue to enforce or apply policies to the resources located external to the data center. The resources can be treated as being "on" the local network by infrastructure and applications without modification. Basically, bridging normalizes the IP address space across disparate environments.

Obviously this approach affords IT a greater measure of control over cloud-deployed resources than would otherwise be available. Resources and applications in "the cloud" can be integrated with corporate-deployed services in a way that is far less disruptive to the end-user. For example, a load balancing service can easily extend its pool of resources into the cloud to scale an application without the need to adjust its network configuration (VLANs, routing, ACLs, etc…) because all resources are available on existing logical networks.
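A minimal sketch of that last point, using hypothetical addresses: because the bridge makes cloud instances appear on the same logical network as local servers, adding a cloud-hosted member to a load balancing pool is just another address on an already-known subnet, with no change to VLANs, routes, or ACLs.

```python
from ipaddress import ip_address, ip_network

# The data center's existing logical network, now stretched into the cloud by the bridge.
LOGICAL_NETWORK = ip_network("10.1.2.0/24")

# Pool members: the first two are local servers; the third (added below) is a
# cloud-hosted instance that received an address on the same subnet via the bridge.
pool = ["10.1.2.11", "10.1.2.12"]


def add_member(pool: list[str], address: str) -> None:
    """Add a member only if it sits on the existing logical network;
    anything else would require the topology changes the bridged model avoids."""
    if ip_address(address) not in LOGICAL_NETWORK:
        raise ValueError(f"{address} is not on {LOGICAL_NETWORK}; bridging not in effect")
    pool.append(address)


if __name__ == "__main__":
    add_member(pool, "10.1.2.57")  # cloud-hosted instance, reachable like any local server
    print("pool members:", pool)
```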
The bridged approach has the added benefit of maintaining operational consistency, especially from a security perspective, as existing access and application security controls are applied inline.

All is not rosy in bridging land, however, as there are negatives to this approach. The most obvious is the impact on performance. Latency across the Internet, implied by the integration of cloud-based resources, must be considered when determining the uses to which those remote resources should be put. Scaling applications that are highly latency-sensitive using remote resources in a bridged architecture may incur too high a performance penalty. Alternatively, applications integrated using out-of-band processing – i.e. an application that periodically polls for new data and processes it in bulk, behind the scenes – may be well-suited to such an architecture, as latency is not usually an issue.

The bridging model also does not address the need for fault tolerance. If you're relying on remote resources to ensure scalability, and failure may result without them, you run the risk of a connectivity issue causing an outage. It may be necessary to employ a tertiary provider, which could increase complexity in the network and require infrastructure changes to support. Next time we'll examine a second approach to cloud infrastructure integration: virtualization.

Live Migration versus Pre-Positioning in the Cloud
Cloud is an Exercise in Infrastructure Integration
IT as a Service: A Stateless Infrastructure Architecture Model
Cloud is the How not the What
Cloud-Tiered Architectural Models are Bad Except When They Aren't
Cloud Chemistry 101
"Lights Out" in the Cloud
The cost of bad cloud-based application performance
Cloud is not Rocket Science but it is Computer Science
Cloud Bursting: Gateway Drug for Hybrid Cloud

The first hit's cheap, kid …

Recently Ben Kepes started a very interesting discussion on cloud bursting by asking whether or not it was real. This led to Christofer Hoff pointing out that "true" cloud bursting requires routing based on business parameters. That needs to be extended to operational parameters, but in general, Hoff's on the mark in my opinion. The core of the issue with cloud bursting, however, is not that requests must be magically routed to the cloud in an overflow situation (that seems to be universally accepted as part of the definition), but the presumption that the content must also be dynamically pushed to the cloud as part of the process, i.e. live migration. If we accept that presumption, then cloud bursting is nowhere near reality. Not because live migration can't be done, but because the time required to do so prohibits a successful "just in time" bursting approach. There is already a requirement that provisioning of resources in the cloud, as preparation for a bursting event, happen well before the event; it's a predictive, proactive process, not a reactionary one, and the inclusion of live migration as part of the process would likely result in false provisioning events (where content is migrated prematurely based on historical trending that fails to continue and therefore does not result in an overflow situation).

So this leaves us with cloud bursting as a viable architectural solution to scale on demand only if we pre-position content in the cloud, on the assumption that provisioning is a less time-intensive process than migration plus provisioning. This results in a more permanent, hybrid cloud architecture.

THE ROAD to HYBRID

The constraints on the network today force organizations that wish to address their seasonal or periodic need for "overflow" capacity to pre-position the content in demand at a cloud provider. This isn't as simple as dropping a virtual machine in EC2; it also requires DNS modifications and the implementation of the policy that will ultimately trigger the routing to the cloud campus. Equally important – actually, perhaps more important – is having the process in place that will actually provision the application at the cloud campus. In other words, the organization is building out the foundation for a hybrid cloud architecture. But in terms of real usage, the cloud-deployed resources may only be used when overflow capacity is required. So they're only used periodically. But as the user base grows, so does the need for that capacity, and organizations will see those resources provisioned more and more often, until they're virtually always on. There's obviously an inflection point at which the use of cloud-based resources moves out of the realm of "overflow capacity" and into the realm of "capacity", period. At that point, the organization is in possession of a full, hybrid cloud implementation.

LIMITATIONS IMPOSE the MODEL

Some might argue – and I'd almost certainly concede the point – that a cloud bursting model that requires pre-positioning in the first place is a hybrid cloud model and not the original intent of cloud bursting. The only substantive counterargument I could provide is that cloud bursting focuses more on the use of the resources and not the model by which they are used. It's the on-again, off-again nature of the resources deployed at the cloud campus that makes it cloud bursting, not the underlying model.
Regardless, existing limitations on bandwidth force the organization's hand; there's virtually no way to avoid implementing what is a foundation for hybrid cloud as a means to execute on a cloud bursting strategy (which is probably a more accurate description of the concept than tying it to a technical implementation, but I'm getting off on a tangent now). The decision to embark on a cloud bursting initiative, therefore, should be made with the foresight that it requires essentially the same effort and investment as a hybrid cloud strategy. Recognizing that up front enables a broader set of options for using those cloud campus resources, particularly the ability to leverage them as true "utility" computing rather than an application-specific (i.e. dedicated) set of resources. Because of the requirement to integrate and automate to achieve either model, organizations can architect both with an eye toward future integration needs – such as those surrounding identity management, which continues to balloon as a source of concern for those focusing in on SaaS and PaaS integration.

Whether or not we'll solve the issues with live migration as a barrier to "true" cloud bursting remains to be seen. As we've never managed to adequately solve the database replication issue (aside from accepting eventual consistency as reality), however, it seems likely that a "true" cloud bursting implementation may never be possible for organizations who aren't mainlining the Internet backbone.
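To make the trigger policy described earlier concrete, here is a minimal, hypothetical sketch of the decision logic: the cloud campus is pre-positioned and sits idle until local utilization crosses a threshold, at which point new requests start being steered there. The threshold, capacity figures, and site names are illustrative assumptions, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    active_connections: int
    capacity: int          # maximum connections the site can comfortably serve
    provisioned: bool      # content is pre-positioned, but the site may be powered down

    @property
    def utilization(self) -> float:
        return self.active_connections / self.capacity


BURST_THRESHOLD = 0.80  # start overflowing at 80% local utilization (illustrative)


def choose_site(local: Site, cloud: Site) -> Site:
    """Steer a new request: stay local until the burst threshold is crossed,
    then spill to the pre-positioned cloud campus (activating it if needed)."""
    if local.utilization < BURST_THRESHOLD:
        return local
    if not cloud.provisioned:
        cloud.provisioned = True  # in practice: call the provider's provisioning API
    return cloud


if __name__ == "__main__":
    dc = Site("corporate-dc", active_connections=850, capacity=1000, provisioned=True)
    cloud = Site("cloud-campus", active_connections=0, capacity=500, provisioned=False)
    print("next request goes to:", choose_site(dc, cloud).name)
```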
What Do Database Connectivity Standards and the Pirate's Code Have in Common?

A: They're both more what you'd call "guidelines" than actual rules.

An almost irrefutable fact of application design today is the need for a database, or at a minimum a data store – i.e. a place to store the data generated and manipulated by the application. A second reality is that despite the existence of database access "standards", no two database solutions support exactly the same syntax and protocols. Connectivity standards like JDBC and ODBC exist, yes, but like SQL they are variable, resulting in implementations just different enough to effectively cause vendor lock-in at the database layer. You simply can't take an application developed to use an Oracle database and point it at a Microsoft or IBM database and expect it to work. Life's like that in the development world. Database connectivity "standards" are a lot like the pirate's Code, described well by Captain Barbossa in Pirates of the Caribbean as "more what you'd call 'guidelines' than actual rules."

It shouldn't be a surprise, then, to see the rise of solutions that address this problem, especially in light of an increasing awareness of (in)compatibility at the database layer and its impact on interoperability, particularly as it relates to cloud computing. Forrester analyst Noel Yuhanna recently penned a report on what is being called Database Compatibility Layers (DCL). The focus of DCL at the moment is on migration across database platforms because, as Noel points out, such migrations are complex, time-consuming, and very costly.

Database migrations have always been complex, time-consuming, and costly due to proprietary data structures and data types, SQL extensions, and procedural languages. It can take up to several months to migrate a database, depending on database size, complexity, and usage of these proprietary features. A new technology has recently emerged for solving this problem: the database compatibility layer, a database access layer that supports another database management system's (DBMS's) proprietary extensions natively, allowing existing applications to access the new database transparently. -- Simpler Database Migrations Have Arrived (Forrester Research Report)

Anecdotally, having been on the implementation end of such a migration, I can't disagree with the assessment. Whether the right answer is to sit down and force some common standards on database connectivity or to build a compatibility layer is a debate for another day. Suffice to say that right now the former is unlikely given the penetration and pervasiveness of existing database connectivity, so the latter is probably the most efficient and cost-effective solution. After all, any change in the core connectivity would require the same level of application modification as a migration; not an inexpensive proposition at all.

According to Forrester, a Database Compatibility Layer (DCL) is a "database layer that supports another DBMS's proprietary SQL extensions, data types, and data structures natively. Existing applications can transparently access the newly migrated database with zero or minimal changes." By extension, this should also mean that an application could easily access one database and a completely different one using the same code base (assuming zero changes, of course). For the sake of discussion, let's assume that a DCL exists that exhibits just that characteristic – complete interoperability at the connectivity layer. Not just for migration, which is of course the desired use, but for day-to-day use.
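The DB-API already shared by Python database drivers makes a handy stand-in for the idea: the call surface is common, but the SQL dialect and parameter style are not, which is precisely the gap a DCL aims to close. A minimal sketch, using only the standard library's sqlite3 driver:

```python
import sqlite3

# PEP 249 (the Python DB-API) is the "pirate's code" of this world: every driver
# exposes connect(), cursor(), execute(), fetchall() ... but the paramstyle and the
# SQL dialect still vary, which is exactly what a Database Compatibility Layer
# would normalize at the SQL level rather than merely at the call-surface level.

def fetch_recent_orders(conn, since):
    cur = conn.cursor()
    # sqlite3 uses the 'qmark' paramstyle; psycopg2 would want %s and cx_Oracle :1,
    # so the call surface is shared while the dialect is not.
    cur.execute("SELECT id, total FROM orders WHERE placed_at >= ?", (since,))
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, total REAL, placed_at TEXT)")
    conn.execute("INSERT INTO orders VALUES (1, 19.99, '2011-05-01')")
    print(fetch_recent_orders(conn, "2011-01-01"))
```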
What would that mean for cloud computing providers – both internal and external?

ENABLING IT as a SERVICE

Based on our assumption that a DCL exists and is implemented by multiple database solution vendors, a veritable cornucopia of options becomes available for moving enterprise architectures toward IT as a Service that might not at first be obvious. Consider that applications have variable needs in terms of performance, redundancy, disaster recovery, and scalability. Some applications require higher performance, others just need a nightly or even weekly backup, and some, well, some are not so important that you can't rely on general IT operations backups to restore them if something goes wrong. In some cases the applications might have varying needs based on the business unit deploying them. The same application used by finance, for example, might have different requirements than the same one used by developers. How could that be? Because the developers may only be using that application for integration or testing while finance is using it for realz. It happens.

What's more interesting, however, is how a DCL could enable a more flexible, service-oriented buffet of database choices, especially if the organization used different database solutions to support different transactional, availability, and performance goals. If a universal DCL (or near-universal, at least) existed, business stakeholders – together with their IT counterparts – could pick and choose the database "service" they wished to employ based not only on the technical characteristics and operational support but also on the costs and business requirements. It would also allow them to "migrate" over time as applications became more critical, without requiring a massive investment in upgrading or modifying the application to support a different back-end database. Obviously I'm picking just a few examples that may or may not be applicable to every organization. The bigger thing here, I think, is the flexibility in architecture and design that is afforded by such a model, one that balances costs with operational characteristics. Monitoring of database resource availability, too, could be greatly simplified by such a layer, providing solutions that are natively supported by upstream devices responsible for availability at the application layer – which ultimately depends on the database but is often an ignored component because of the complexity currently inherent in supporting such a varied set of connectivity standards.

It should also be obvious that this model would work for a PaaS-style provider that is not tied to any given database technology. A PaaS-style vendor today must either invest effort in developing and maintaining a services layer for database connectivity or restrict customers to a single database service. The latter is fine if you're creating a single-stack environment such as Microsoft Azure, but not so fine if you're trying to build a more flexible set of offerings to attract a wider customer base. Again, same note as above: providers would have a much more flexible set of options if they could rely upon what is effectively a single database interface regardless of the specific database implementation. More important for providers, perhaps, is the migration capability noted by the Forrester report in the first place, as one of the inhibitors of moving existing applications to a cloud computing provider is support for the same database across both enterprise and cloud computing environments.
While services layers are certainly a means to the same end, such layers are not universally supported. There's no "standard" for them, not even a set of best-practice guidelines, and the resulting application code suffers exactly the same issue as the use of proprietary database connectivity: lock-in. You can't pick one up and move it to the cloud, or to another database, without changing some code. Granted, a services layer is more efficient in this sense, as it serves as an architectural strategic point of control at which connectivity is aggregated and thus database implementation specifics are abstracted from the application. That means the database can be changed without impacting end-user applications; only the services layer need be modified. But even that approach is problematic for packaged applications that rely upon database connectivity directly and do not support such service layers. A DCL, ostensibly, would support packaged and custom applications alike if it were implemented properly in all commercial database offerings.

CONNECTIVITY CARTEL

And therein lies the problem – if it were implemented properly in all commercial database offerings. There is a risk here of a connectivity cartel arising, where database vendors form alliances with other database vendors to support a DCL while "locking out" vendors whom they have decided do not belong. Because the DCL depends on supporting "proprietary SQL extensions, data types, and data structures natively", there may be a need for database vendors to collaborate in order to properly support those proprietary features. If collaboration is required, it is possible to deny that collaboration as a means to control who plays in the market. It's also possible for a vendor to slightly change some proprietary feature in order to "break" the others' support. And of course the sheer volume of work necessary for a database vendor to support all other database vendors could overwhelm smaller database vendors, leaving them with no real way to support everyone else.

The idea of a DCL is an interesting one, and it has its appeal as a means to forward compatibility for migration – both temporary and permanent. Will it gain in popularity? For the latter, perhaps, but for the former? Less likely. The inherent difficulties and scope of supporting such a wide variety of databases natively will certainly inhibit any such efforts. Solutions such as a RESTful interface, a la PHP REST SQL, or a JSON-HTTP based solution like DBSlayer may be more appropriate in the long run if they were to be standardized. And by standardized I mean standardized with industry-wide, agreed-upon specifications. Not more of the "more what you'd call 'guidelines' than actual rules" that we already have.

Database Migrations are Finally Becoming Simpler
Enterprise Information Integration | Data Without Borders
Review: EII Suites | Don't Fear the Data
The Database Tier is Not Elastic
Infrastructure Scalability Pattern: Sharding Sessions
F5 Friday: THE Database Gets Some Love
The Impossibility of CAP and Cloud
Sessions, Sessions Everywhere
Cloud-Tiered Architectural Models are Bad Except When They Aren't
Cloud Infrastructure Integration Model: Virtualization

Examining architectures on which hybrid clouds are based…

IT professionals, in general, appear to consider themselves well along the path toward IT as a Service, with a significant plurality of them engaged in implementing many of the building blocks necessary to support the effort. IaaS, PaaS, and hybrid cloud computing models are essential for IT to realize an environment in which (manageable) IT as a Service can become reality. That IT professionals – 65% of them, to be exact – note their organization is in progress with or has already completed a hybrid cloud implementation is telling, as it indicates a desire to leverage resources from a public cloud provider. What the simple "hybrid cloud" moniker doesn't illuminate is how IT organizations are implementing such a beast. To be sure, integration is always a rough road, and integrating not just resources but their supporting infrastructure must certainly be a non-trivial task. That's especially true given that there exists no "standard" or even "best practices" means of integrating the infrastructure between a cloud and a corporate data center. Existing standards and best practices with respect to network and site-level virtualization provide an alternative to a bridged integration model. Without diving into the mechanism – standards-based or product solution – we can still examine the integration model from the perspective of its architectural goals, its advantages, and its disadvantages.

THE VIRTUALIZATION CLOUD INTEGRATION ARCHITECTURE

The basic premise of a virtualization-based cloud integration architecture is to transparently enable communication with and use of cloud-deployed resources. While the most common type of resources to be integrated will be applications, these resources may also be storage or even solution focused. A virtualization-based cloud integration architecture provides for transparent run-time utilization of those resources as a means to enable on-demand scalability and/or improve performance for a highly dispersed end-user base. Sometimes referred to as cloud bursting, a virtualized cloud integration architecture presents a single view of an application or site regardless of how many physical implementations there may be. This model is based on existing GSLB (global server load balancing) concepts and leverages existing best practices around those concepts to integrate physically disparate resources into a single application "instance". This allows organizations to leverage commoditized compute in cloud computing environments either to provide greater performance – by moving the application closer to both the Internet backbone and the end-user – or to enhance scalability by extending the resources available to the application into external, potentially temporary, environments.

A global application delivery service is responsible for monitoring the overall availability and performance of the application and directing end-users to the appropriate location based on configurable variables such as location, performance, costs, and capacity. This model has the added benefit of providing a higher level of fault tolerance because, should either site fail, the global application delivery service simply directs end-users to the available instance. Redundancy is an integral component of fault-tolerant architectures, and two or more sites fulfill that need.
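A hedged sketch of the kind of decision such a global application delivery service makes: each site reports availability, measured response time, and remaining capacity, and the service scores them for a given user. The weights and site data here are illustrative assumptions, not tuning guidance.

```python
from dataclasses import dataclass


@dataclass
class SiteMetrics:
    name: str
    available: bool
    response_ms: float      # measured application response time for this user's region
    capacity_free: float    # fraction of site capacity still unused (0.0 - 1.0)
    cost_per_hour: float    # relative cost of serving from this site


def pick_site(sites: list[SiteMetrics]) -> SiteMetrics:
    """Score candidate sites on performance, headroom, and cost; lower is better.
    Unavailable sites are never considered, mirroring GSLB health checking."""
    candidates = [s for s in sites if s.available]
    if not candidates:
        raise RuntimeError("no site available")
    return min(
        candidates,
        key=lambda s: 0.6 * s.response_ms + 30.0 * (1.0 - s.capacity_free) + 5.0 * s.cost_per_hour,
    )


if __name__ == "__main__":
    sites = [
        SiteMetrics("corporate-dc", True, response_ms=40.0, capacity_free=0.15, cost_per_hour=1.0),
        SiteMetrics("cloud-east", True, response_ms=55.0, capacity_free=0.80, cost_per_hour=2.5),
        SiteMetrics("cloud-west", False, response_ms=30.0, capacity_free=0.90, cost_per_hour=2.5),
    ]
    print("direct this user to:", pick_site(sites).name)
```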
Performance is generally improved by leveraging the ability of global application delivery services to compare end-user location, network conditions, and application performance and determine which site will provide the best performance for the given user. Because this model does not rely upon a WAN or tunnel, as a bridged model does, performance is also improved because it eliminates much of the back-end communication overhead inherent in a bridged deployment.

There are negatives, however, that can prevent these benefits from being realized. Inconsistent architectural components may inhibit accurate monitoring, which impedes some routing decisions. Best-practice models for global application delivery imply a local application delivery service responsible for load balancing. If a heterogeneous model of local application delivery is used (two different load balancing services), then it may be the case that monitoring and measurements are not consistently available across the disparate sites. This may result in decisions being made by the global application delivery service that are less able to meet service-level requirements than would be the case when using operationally consistent architectural components.

This lack of architectural consistency can also result in a reduced security posture if access and control policies cannot be replicated in the cloud-hosted environment. This is particularly troubling in a model in which application data from the cloud-hosted instances may be reintroduced into corporate data stores. If data in the cloud is corrupted, it can be introduced into the corporate data store and potentially wreak havoc on applications, systems, and end-users that later access that tainted data. Because of the level of reliance on architectural parity across environments, this model requires more preparation to ensure consistency in security policy enforcement, as well as to ensure the proper variables can be leveraged to make the best-fit decision with respect to end-user access.
Location-Aware Load Balancing

No, it's not global server load balancing or GeoLocation. It's something more… because knowing location is only half the battle, and the other half requires the ability to make on-demand decisions based on context.

In most cases today, global application delivery bases the decision of which location should service a given client on the location of the user, the availability of the application at each deployment location, and, if the user is lucky, some form of performance-related service-level agreement. With the advent of concepts like cloud bursting and migratory applications that can be deployed at any number of locations at any given time based on demand, the ability to accurately determine not just the user's location but the physical location of the application as well is becoming increasingly important for addressing concerns regarding regulatory compliance. Making the equation more difficult is that these regulations vary from country to country, and the focus of each varies greatly. In the European Union the focus is on privacy for the consumer, while in the United States the primary focus is on a combination of application location (export laws) and user location (access restrictions). These issues become problematic not just for application providers who want to tap into the global market, but for organizations whose employee and customer base span the globe.

Many of the benefits of cloud computing are based on the ability to tap into cloud providers' inexpensive resources not just at any time it's needed for capacity (cloud bursting) but at any time that costs can be minimized (cloud balancing). These benefits are appealing, but they can quickly run organizations afoul of regulations governing data and application location. In order to maximize benefits, maintain compliance with regulations relating to the physical location of data and applications, and ensure availability and performance levels acceptable to both the organization and the end-user, some level of awareness must be present in the application delivery architecture. Awareness of location provides a flexible application delivery infrastructure with the ability to make on-demand decisions regarding where to route any given application request based on all the variables required; based on the context. Because of the flexible nature of deployment (or at least the presumed flexibility of application deployment), it would be a poor choice to hard-code such decisions so that users in location X are always directed to the application at location Y. Real-time performance and availability data must also be taken into consideration, as well as the capacity of each location.
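A minimal sketch of what such a context-aware decision might look like: first filter candidate sites by where the application and its data are allowed to serve a given user, then pick the best performer among what remains. The jurisdiction rules and site list are invented for illustration; real policies come from counsel, not code comments.

```python
# Hypothetical deployment locations for the same application.
SITES = [
    {"name": "fra-cloud", "region": "EU", "healthy": True, "rtt_ms": 35},
    {"name": "chi-dc",    "region": "US", "healthy": True, "rtt_ms": 90},
    {"name": "sin-cloud", "region": "APAC", "healthy": False, "rtt_ms": 60},
]

# Illustrative-only policy: which application locations may serve users in each jurisdiction.
ALLOWED = {
    "EU": {"EU"},             # e.g. consumer privacy rules keep EU user data in the EU
    "US": {"US", "EU"},
    "APAC": {"APAC", "US"},
}


def route(user_region: str) -> str:
    """Return the best compliant, healthy site for a user, by measured round-trip time."""
    candidates = [
        s for s in SITES
        if s["healthy"] and s["region"] in ALLOWED.get(user_region, set())
    ]
    if not candidates:
        raise RuntimeError(f"no compliant site available for {user_region}")
    return min(candidates, key=lambda s: s["rtt_ms"])["name"]


if __name__ == "__main__":
    for region in ("EU", "US", "APAC"):
        print(region, "->", route(region))
```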
The Skeleton in the Global Server Load Balancing Closet

Like urban legends, every few years this one rears its head and makes its rounds.

It is certainly true that everyone who has an e-mail address has received some message claiming that something bad is going on, or that someone said something they didn't, or that someone influential wrote a letter that turns out to be wishful thinking. I often point the propagators of such urban legends to Snopes, because the folks who run Snopes are dedicated to hunting down the truth regarding these tidbits that make their way to the status of urban legend. It would be nice, wouldn't it, if there were such a thing for technical issues: a technology-focused Snopes, if you will. But there isn't, and every few years a technical urban legend rears its head again and sends some folks into a panic. And we, as an industry, have to respond and provide some answers. This is certainly the case with global server load balancing (GSLB) and Round-Robin DNS (RR DNS).

CLAIM: DNS Based Global Server Load Balancing (GSLB) Doesn't Work
STATUS: Inaccurate

ORIGINS

The origin of this skeleton in GSLB's closet is a 2004 paper written by Pete Tenereillo, "Why DNS Based Global Server Load Balancing (GSLB) Doesn't Work." It is important to note that at the time of the writing Pete was not only very experienced with these technologies but was also considered an industry expert. Pete was intimately involved in the early days of load balancing and global server load balancing, being an instrumental part of projects at both Cisco and Alteon (Nortel, now Radware). So his perspective on the subject certainly came from experience and even "inside" knowledge about the way in which GSLB worked and was actually deployed in the "real" world. The premise upon which Pete bases his conclusion, i.e. GSLB doesn't work, is that the features and functionality over and above those offered by standard DNS servers are inherently flawed: they sound good in theory but don't work. His ultimate conclusion is that the only way to implement true global high availability is to use multiple A records, which are already a standard function of DNS servers.

DNS based Global Server Load Balancing (GSLB) solutions exist to provide features and functionality over and above what is available in standard DNS servers. This paper explains the pitfalls in using such features for the most common Internet services, including HTTP, HTTPS, FTP, streaming media, and any other application or protocol that relies on browser based client access. … An Axiom: The only way to achieve high-availability GSLB for browser based clients is to include the use of multiple A records.

It would be easy to dismiss Pete's concerns based on the fact that his commentary is nearly seven years old at this point. But the basic principles upon which DNS and GSLB are implemented today are still based on the same theories with which Pete takes issue. What Pete missed in 2004, and what remains missing from this treatise, is twofold. First, GSLB implementations at that time, and today, do in fact support returning multiple A records, a.k.a. Round-Robin DNS. Second, the features and functionality provided over and above standard DNS do, in fact, address the issues raised, and these features and functionality have, in fact, evolved over the past seven years.

Is returning multiple A records to the LDNS the only way of achieving high availability? How is advanced health checking an important component of providing high availability?
Many people misuse the term "high availability" by indicating that it only equates to whether a site is up or down. This type of binary thinking is misguided and purely technical in focus. Our customers have all indicated that high availability also includes the performance of the application or site. The reason is that, by business definitions, if a site or application is too slow, it is unavailable. Poor performance directly impacts productivity, one of the key performance indicators used to measure the effectiveness of business employees and processes. As a result, high availability can be achieved in a number of different ways. Intelligent GSLB solutions, through advanced monitoring and statistical correlation, take into account not only whether the site is up or down, but such details as hop count, packet loss, round-trip time, and data center capacity, to name a few. These metrics then transparently provide users with the most efficient and intelligent way of steering traffic and achieving high availability. Geolocation is another means of steering traffic to the appropriate service location, as are any number of client- and business-specific variables. Context is important to application delivery in general, but it is a critical component of GSLB in maintaining availability – including performance.

The round-robin handling of A records by the Local DNS (LDNS) is a well-known problem in the industry. When multiple A records are handed back to the LDNS for an address resolution, the LDNS shuffles the list and returns the A records to the client without honoring the order in which it received them. The next time the client requests an address, the LDNS responds with a differently ordered list of A records. This LDNS behavior makes it very difficult to predict the order in which A records are returned to the client. In order to overcome this problem, many prefer to configure a GSLB solution to send back one A record to the LDNS. Compared with a "plain" DNS server that would send back any one of the site addresses with a TTL value, an intelligent GSLB sends back the address of the best-performing site, based on the metrics that are important to the business, and sets the TTL value. The majority of LDNS implementations that are RFC-compliant will honor the TTL value and resolve again after it has expired. The GSLB performs advanced health checking and sends back the address of the best-performing site, taking into account metrics like application availability, site capacity, round-trip time, hops, and completion rate, hence providing the best user experience and meeting applicable business service-level agreements.

In the event of a site failure (when the link is down or because of a catastrophic event), existing clients would connect to the unavailable site for a period of time equal to the TTL value. The GSLB sets a TTL value of 30 seconds when returning an A record to the LDNS. As soon as the 30-second time period expires, the LDNS resolves again, and the GSLB uses its advanced health checking capability to determine that one of the multiple sites is unavailable. The GSLB then starts to transparently direct users to the best-performing site by returning the address of that site to the LDNS. A flexible GSLB also provides a Manual Resume option that lets administrators keep the unavailable site down to mitigate the well-known back-end database synchronization problem.
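The interplay between TTL and failover described above can be sketched in a few lines. The resolver below caches the GSLB's answer and honors a 30-second TTL, so when a site fails, the worst case is that clients keep connecting to it until the cached record expires. Everything here (addresses, health states) is illustrative, not a model of any particular product.

```python
import time

TTL_SECONDS = 30  # TTL the GSLB attaches to each A record (illustrative)

# Health state the GSLB sees via its monitors; flips when a site fails.
site_health = {"203.0.113.10": True, "198.51.100.20": True}


def gslb_answer() -> tuple[str, int]:
    """Return the address of a healthy site (here: the first healthy one) plus the TTL."""
    for address, healthy in site_health.items():
        if healthy:
            return address, TTL_SECONDS
    raise RuntimeError("no healthy site")


class CachingResolver:
    """Stands in for an RFC-compliant LDNS: reuse the answer until the TTL expires."""
    def __init__(self):
        self._cached = None
        self._expires = 0.0

    def resolve(self) -> str:
        now = time.monotonic()
        if self._cached is None or now >= self._expires:
            self._cached, ttl = gslb_answer()
            self._expires = now + ttl
        return self._cached


if __name__ == "__main__":
    ldns = CachingResolver()
    print("client connects to", ldns.resolve())
    site_health["203.0.113.10"] = False        # primary site goes dark
    print("during the TTL window, still", ldns.resolve())
    ldns._expires = 0.0                        # simulate the 30 seconds elapsing
    print("after the TTL expires,", ldns.resolve())
```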
An intelligent GSLB also has the option of sending multiple A records, which allows delivery of content from the best-performing sites. For example, let's say an enterprise wants to deliver its content using 10 sites and wants to provide high availability. Using sophisticated health checking, the GSLB can determine the two best-performing sites and return their addresses to the users. The GSLB would track each site's performance and send back the best sites based on current network and site conditions (context) for every resolution. Slow sites, or sites that are down, would never be sent back to the user.

What about the issues with DNS browser caching?

Of all the issues raised by Pete in his seminal work of 2004, this is likely the one that is still relevant. Browser technology has evolved, yes, but the underlying network stack and functionality have not, mainly because DNS itself has not changed much in the past ten years. Most modern browsers may or may not honor DNS TTL (evidence is conflicting and documentation nailing it down is scant), but they have, at least, reduced the caching interval on the client side. This may or may not – depending on timing – result in a slight delay in the event of a failure while resolution catches up, but it does not have nearly the dramatic negative impact it once had. In the early days, a delay of 15 minutes could be expected. Today that delay can generally be counted in seconds. It is, admittedly, still a delay, but one that is certainly more acceptable to most business stakeholders and customers alike.

And yet while the issue of DNS browser caching is still technically accurate, it is not all that relevant; the same solution Pete suggests to address the issue – RR DNS – has always been available as an option for GSLB. Any technology, when not configured to address the issue for which it was implemented, can be considered a failure. This is not a strike against the technology, but against the particular implementation. The instances of browser caching impacting site availability and performance are minimal in most cases, and for those organizations for which such instances would be completely unacceptable, it is simply a matter of mitigation using the proper policies and configuration.

SUMMARY

What it comes down to is that Pete, in his paper, is pushing for the use of Round-Robin DNS (RR DNS). Modern Global Server Load Balancing (GSLB) solutions fully support this option today and, generally speaking, always have. However, the focus on the technical aspects completely ignores the impact of business requirements and agreements and does not take into consideration the functions and features over and above standard DNS that assist in supporting those requirements. For example, health checking has come a long way since 2004 and includes not only simple up-down indicators and performance-based indicators but a full range of contextual variables. Location, client type, client network, data center network, capacity… all these parameters can be leveraged to perform "health" checks that enable a more accurate and ultimately adaptable decision. Interestingly, standard DNS servers leveraged to implement a GSLB solution are not capable of such health checks, nor do they provide the means by which they can be implemented. Such "health monitoring" is, however, a standard offering for GSLB solutions.
NEW FACTORS to CONSIDER

Given the dynamism inherent not only to local data centers but also to global implementations, and the inclusion of cloud computing and virtualization, GSLB must also provide the means by which management, maintenance, and process automation can be accomplished. Traditional DNS solutions like BIND do not provide such means of control, nor are they able to participate in the collaborative processes necessary to automate the migration and capacity fulfillment functions for which virtualization and cloud computing are and will be used. Thus a simple RR DNS implementation may be desirable, but the solution through which it is implemented must be more modern and capable of addressing management and business concerns as well. These are the “functions and features” over and above standard DNS servers that provide value to organizations regardless of the technical details of the algorithms and methods used to distribute DNS records.

Additionally, traditional DNS solutions – while supporting new security initiatives like DNSSEC – are less able to handle such initiatives in a dynamic environment. A GSLB must be able to dynamically sign the records it returns if it is to support DNSSEC while still performing global server load balancing. DNSSEC introduces a variety of challenges for GSLB that cannot be easily or efficiently addressed by standard DNS services. Modern GSLB solutions can and do address these challenges while enabling integration and support for other emerging data center models that make use of cloud computing and virtualization.

This skeleton is sure to creep out of the closet yet again in a few years, primarily because DNS itself is not changing. Extensions such as DNSSEC occasionally crop up to address issues that arise over time, but the core principles upon which DNS has always operated are still true and are likely to remain true for some time. What has changed are the data center architectures, technology, and business requirements that IT organizations must support, in part through the use of DNS and GSLB. The fact is that GSLB does work, and modern GSLB solutions do provide the means by which both technical and business requirements can be met while simultaneously addressing new and emerging challenges associated with the steady march of technology toward a dynamic data center.

So You Put an Application in the Cloud. Now what?
F5 Friday: Hyperlocalize Applications for Everyone
Location-Aware Load Balancing
Cloud Needs Context-Aware Provisioning
What is a Strategic Point of Control Anyway?
German DPA Issues Legal Opinion on Cloud Computing
As Cloud Computing Goes International, Whose Laws Matter?
Load Balancing in a Cloud
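To see why DNSSEC is awkward here: a conventional DNSSEC zone is signed offline, with every RRset and its RRSIG generated ahead of time, while a GSLB composes each answer per query and so must produce the signature at answer time. The sketch below only illustrates that ordering of operations; sign_rrset() is a hypothetical stand-in for real RSA/ECDSA signing with the zone’s private key, not a DNSSEC implementation, and the selection logic is deliberately trivial.

```python
# Why GSLB complicates DNSSEC: the answer is chosen per query, so the signature
# covering it cannot be pre-computed the way an offline-signed zone's can.
# sign_rrset() is a hypothetical placeholder, NOT real DNSSEC signing.
import hashlib
import time

ZONE_KEY = b"hypothetical-private-key-material"

def sign_rrset(name, rtype, rdata, ttl):
    """Placeholder for on-the-fly RRSIG generation (illustrative only)."""
    payload = f"{name}|{rtype}|{rdata}|{ttl}|{int(time.time())}".encode()
    return hashlib.sha256(ZONE_KEY + payload).hexdigest()[:32]

def answer_query(name, healthy_sites):
    """Pick the 'best' site for this query, then sign that specific answer."""
    best_ip = healthy_sites[0]  # stand-in for metric-driven selection
    ttl = 30
    return {"name": name, "type": "A", "rdata": best_ip, "ttl": ttl,
            "rrsig": sign_rrset(name, "A", best_ip, ttl)}

if __name__ == "__main__":
    # Two queries may legitimately receive different, individually signed answers.
    print(answer_query("www.example.com", ["192.0.2.10", "198.51.100.10"]))
    print(answer_query("www.example.com", ["198.51.100.10", "192.0.2.10"]))
```

The point is the select-then-sign-per-query sequence, which offline zone signing cannot accommodate and which a DNSSEC-capable GSLB must therefore perform efficiently at query time.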
F5 Friday: Lessons from (IT) Geese

Birds migrate in flocks, which means every individual has the support of others. IT often migrates alone – but it doesn’t have to.

“Lessons from Geese” has been around a long time. It is often cited and referenced, particularly with respect to teamwork and collaboration. The very first “lesson” learned from geese migrations and applied to human collaboration is this:

Fact #1: As each goose flaps its wings, it creates an uplift for the others behind it. By flying in a "V" formation, the whole flock adds 71% greater flying range than if each bird flew alone.

Lesson: People who share a common direction and sense of community can get where they are going quicker and easier because they are traveling on the thrust of another.

That’s probably not surprising at all, and the basic lesson is one we’re all familiar with, no doubt.

Fact #3: When the lead goose tires, it rotates back into the formation and another goose flies to the point position.

Lesson: It pays to take turns doing the hard tasks and sharing leadership; as with geese, people are interdependent on each other’s skills, capabilities and unique arrangement of gifts, talents or resources.

This lesson works well if every goose is similarly talented at flying. But within IT there are myriad skill sets that must come together to migrate implementations from one version to another. It’s not just software – it’s data stores, identity stores, switches, and application delivery systems. A lot of different skills are required to successfully migrate large, business-critical systems. And we can’t just pick a random goose to lead when it comes to migrating specific subsets and components; we need experts in various systems to assist. And sometimes we don’t have the right goose. So we have to find one.

“A Plan-Net survey found that 87% of organizations are currently using Exchange 2003 or earlier. There has been a reluctance to adopt the 2007 version, often considered to be the Vista of the server platform — faulty and dispensable.” -- 10 reasons to migrate to Exchange 2010

This doesn’t explain a reluctance to move to Exchange 2010. With larger mailboxes, virtualization support, voicemail transcription, and higher availability, what’s not to like? Significant changes in the underlying architecture – which cascade into the infrastructure – may be one of them. Upgrading a business-critical service like Exchange requires more planning and forethought than upgrading to the latest version of Angry Birds, after all. Continuity of service is required even as the new version is put in place. And while there are plenty of experts who can help with the migration of Exchange, there are fewer who can help with the migration of its supporting infrastructure services.

F5 has an answer for that: a skilled goose, if you will, who can take the lead and keep the organization on track.

Introducing: F5 Architecture Design for Microsoft Exchange Service

The F5 Architecture Design for Microsoft Exchange service comprises an intense three days of discussion, information gathering, analysis and knowledge-sharing of network considerations for the optimal deployment of Microsoft Exchange in an F5 network environment. F5 Professional Services consultants with Exchange expertise conduct assessments during which they review your current network and future needs to streamline your new implementation, upgrade or migration to your preferred version of Microsoft Exchange.
Plan

During the project kick-off call, F5 Professional Services consultants make sure to understand your overall project goals, flag dependencies, and validate that all questionnaires and information requirements have been addressed prior to the initiation of the engagement.

Analyze

The F5 Architecture Design for Microsoft Exchange Service facilitates the discussion, analysis and development of the network architecture requirements that best support your Exchange deployment. The engagement starts with an overview and whiteboard discussion of F5 technology, focusing on topics of high availability, scalability, security and performance. Next, the consultants engage in conversations about mail deployment for legacy mail systems or new deployments, touching on sizing, security and service-level agreements. Finally, they review the architectural components specific to your environment, including network flows, client access, unified messaging, and considerations of single- vs. multi-site deployments.

Design and Report

The F5 Professional Services consultants consolidate the results from the analysis phase and deliver a Proposed Microsoft Exchange Network Architecture and a Proposed Network Migration Plan report detailing the recommendations. F5 consultants intimately understand F5 BIG-IP® systems and their operation, and can draw on the F5 Solutions for Microsoft Exchange Server. You can be assured of the thoroughness and relevance of their recommendations. The consultants’ reports provide you with the blueprint for flexible and cost-effective communication and collaboration in your organization.

For more information about the F5 Architecture Design for Microsoft Exchange service, use the search function on f5.com or contact consulting@f5.com.

Additional Resources:
Microsoft Exchange 2010: HELO New Architecture
Deploying F5 with Microsoft Exchange 2010
F5 solution for Microsoft Exchange
F5 Friday: BIG-IP Solutions for Microsoft Private Cloud
Webcast - BIG-IP v11 and Microsoft Technologies
Social Forums - F5/Microsoft Solutions
Eliminating Data Center Vertigo with F5 and Microsoft
F5 Friday: Microsoft and F5 Lync Up on Unified Communications
F5 Friday: Playing in the Infrastructure Orchestra(tion)
The Future of Cloud: Infrastructure as a Platform

Cloud needs to become a platform, and that means its comprising infrastructure must also embrace the platform paradigm.

There’s been a spate of articles, blogs, and mentions of OpenFlow in the past few months. IBM was the latest entry into the OpenFlow game, releasing an OpenFlow-enabled RackSwitch G8264, an update of the 64-port, 10 Gigabit Ethernet switch IBM put out a year ago. Interest in the specification appears to be growing, and not just because it’s got the prefix-du-jour as part of its name, implying everything to everyone – free, extensible, interoperable, etc. While all those modifiers are indeed interesting and, to some, a highly important facet of the would-be standard, there’s something else about it that is driving its popularity. That something else can be summed up with the statement: “infrastructure as a platform.”

THE WEB 2.0 LESSON. AGAIN.

The importance of turning infrastructure into a platform can be evidenced by noting commentary on Web 2.0, a.k.a. social networking, applications and their failure/success to garner mind-share. Recently, a high-profile engineer at Google mistakenly posted a lengthy and refreshingly blunt commentary on what he views as Google’s failure to recognize the importance of platform to successful offerings in today’s demanding marketplace. To Google’s credit, once the erroneous posting was discovered, it decided to “let it stand,” and thus we are able to glean some insight about the importance of platform to today’s successful offerings:

While Yegge doesn’t have a lot of good things to say about Amazon and its founder Jeff Bezos, he does note that Bezos – unlike Google – understands that it’s not just about developing interesting products, but that it takes a platform to create a great product. -- SiliconFilter, “Google Engineer: “Google+ is a Prime Example of Our Complete Failure to Understand Platforms”

This insight is not restricted to software developers and engineers at all; the rising interest in PaaS (Platform as a Service) and the continued siren’s song that it will dominate the cloud landscape in the future are tied to the same premise: it is the availability of a robust platform that makes or breaks solutions today, not features or functions or price. It is the ability to be successful by building, as Yegge says in his post, “an entire constellation of products by allowing other people to do the work.” Lest you think this concept applicable only to software, let me remind you of Nokia CEO Stephen Elop’s somewhat blunt assessment of his company’s failure to recognize this truth:

The battle of devices has now become a war of ecosystems, where ecosystems include not only the hardware and software of the device, but developers, applications, ecommerce, advertising, search, social applications, location-based services, unified communications and many other things. Our competitors aren’t taking our market share with devices; they are taking our market share with an entire ecosystem. This means we’re going to have to decide how we either build, catalyse or join an ecosystem. -- DevCentral F5 Friday, “A War of Ecosystems”

Interestingly, 47% of respondents surveyed by Zenoss/Cloud.com for its Cloud Computing Outlook 2011 indicated use of PaaS in 2011. Like SaaS, PaaS has some wiggle room in its definition, but its general popularity seems to indicate that yes, indeed, platform is an important factor.
OpenFlow essentially provides this capability, turning infrastructure into a platform and enabling extensibility and customization that could not be achieved otherwise. It basically turns a piece of infrastructure into a giant backplane for new functions, features, and services. It introduces, allegedly, dynamism into what is typically a static network. It is what IaaS had the promise to be but has, as of yet, failed to achieve.

CLOUD as a PLATFORM

The takeaway for cloud and infrastructure providers is that organizations want platforms. Developers want platforms. Operations wants platforms (see Puppet and Chef as examples of operational platforms). It’s about enabling an ecosystem that encourages innovation, i.e. new features, functions, and services, without requiring the wheel to be reinvented. It’s about drag and drop, figuratively speaking, in the realm of infrastructure: bringing the ability to deploy new services atop a platform that provides the basics. OpenFlow promises just such capabilities for infrastructure, much in the same way Facebook provides these basics for game and application developers. Mobile platforms offer the same for devices and operating systems. It’s about enabling an ecosystem in which organizations can focus not on the core infrastructure but on the custom functionality and process automation that delivers efficiency to IT across operations and development alike.

“The beauty of this is it gives more flexibility and control to the network,” said Shaughnessy [marketing manager for system networking at IBM], “so you could actually adjust the way the traffic flows go through your network dynamically based on what’s going on with your applications.” -- IBM releases OpenFlow-enabled switch

It enables flexibility in the network and the means to deploy more dynamism in traffic policy enforcement and shaping, and it ties back to cloud with its ability to impart multi-tenant capabilities to infrastructure without completely modifying the internal architecture of components – a major obstacle for many network-focused devices. (A minimal sketch of this kind of application-driven flow steering follows the related links below.) OpenFlow is not a panacea; there are myriad reasons why it may not be appropriate as the basis for architecting the cloud platform foundation required to support future initiatives. But it is a prime example of the kind of platform-focused capabilities organizations desire to move ahead in their journey to IT as a Service. The cloud on which organizations will be able to build their future data center architecture will be a platform, and that means from the bottom (infrastructure) to the middle (development) to the top (operations). What cloud and infrastructure providers must do is simulate the Facebook experience at the infrastructure layer. Infrastructure as a platform is the next step in the evolution of cloud computing.

IT Services: Creating Commodities out of Complexity
IBM releases OpenFlow-enabled switch
The Cloud Configuration Management Conundrum
IT as a Service: A Stateless Infrastructure Architecture Model
If a Network Can’t Go Virtual Then Virtual Must Come to the Network
You Can’t Have IT as a Service Until IT Has Infrastructure as a Service
This is Why We Can’t Have Nice Things
WILS: Automation versus Orchestration
The Infrastructure Turk: Lessons in Services
Putting the Cloud Before the Horse
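As a concrete illustration of “adjusting the way traffic flows go through your network dynamically based on what’s going on with your applications,” here is a minimal controller sketch. It assumes the open-source Ryu framework and OpenFlow 1.3, neither of which is implied by the article or the IBM switch it references, and the virtual IP, port numbers, and priority are invented for the example.

```python
# Minimal OpenFlow 1.3 sketch using the Ryu controller framework: when a switch
# connects, install a flow that steers HTTP traffic for one virtual IP out a
# chosen port. IPs, ports, and priorities here are illustrative assumptions.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class AppAwareSteering(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    PREFERRED_PORT = 2  # e.g. the uplink toward the least-loaded application pool

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def on_switch_connect(self, ev):
        dp = ev.msg.datapath
        ofp = dp.ofproto
        parser = dp.ofproto_parser

        # Match TCP/80 traffic destined for the application's virtual IP.
        match = parser.OFPMatch(eth_type=0x0800, ip_proto=6,
                                ipv4_dst="203.0.113.10", tcp_dst=80)
        actions = [parser.OFPActionOutput(self.PREFERRED_PORT)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]

        # A later, higher-priority flow-mod could re-steer this traffic
        # as application conditions change.
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))
```

Run with ryu-manager against an OpenFlow 1.3 switch (Open vSwitch is a convenient test target); re-issuing the flow-mod with a different output port as application conditions change is the dynamic adjustment the quote describes.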