IT as a Service: A Stateless Infrastructure Architecture Model
The dynamic data center of the future, enabled by IT as a Service, is stateless. One of the core concepts associated with SOA – and one that failed to really take hold, unfortunately – was the ability to bind, i.e. invoke, a service at run-time. WSDL was designed to loosely couple services to clients, whether they were systems, applications or users, in a way that was dynamic. The information contained in the WSDL provided everything necessary to interface with a service on-demand without requiring hard-coded integration techniques used in the past. The theory was you’d find an appropriate service, hopefully in a registry (UDDI-based), grab the WSDL, set up the call, and then invoke the service. In this way, the service could “migrate” because its location and invocation specific meta-data was in the WSDL, not hard-coded in the client, and the client could “reconfigure”, as it were, on the fly. There are myriad reasons why this failed to really take hold (notably that IT culture inhibited the enforcement of a strong and consistent governance strategy) but the idea was and remains sound. The goal of a “stateless” architecture, as it were, remains a key characteristic of what is increasingly being called IT as a Service – or “private” cloud computing . TODAY: STATEFUL INFRASTRUCTURE ARCHITECTURE The reason the concept of a “stateless” infrastructure architecture is so vital to a successful IT as a Service initiative is the volatility inherent in both the application and network infrastructure needed to support such an agile ecosystem. IP addresses, often used to bypass the latency induced by resolution of host names at run-time from DNS calls, tightly couple systems together – including network services. Routing and layer 3 switching use IP addresses to create a virtual topology of the architecture and ensure the flow of data from one component to the next, based on policy or pre-determine routes as meets the needs of the IT organization. It is those policies that in many cases can be eliminated; replaced with a more service-oriented approach that provisions resources on-demand, in real-time. This eliminates the “state” of an application architecture by removing delivery dependencies on myriad policies hard-coded throughout the network. Policies are inexorably tied to configurations, which are the infrastructure equivalent of state in the infrastructure architecture. Because of the reliance on IP addresses imposed by the very nature of network and Internet architectural design, we’ll likely never reach full independence from IP addresses. But we can move closer to a “stateless” run-time infrastructure architecture inside the data center by considering those policies that can be eliminated and instead invoked at run-time. Not only would such an architecture remove the tight coupling between policies and infrastructure, but also between applications and the infrastructure tasked with delivering them. In this way, applications could more easily be migrated across environments, because they are not tightly bound to the networking and security policies deployed on infrastructure components across the data center. The pre-positioning of policies across the infrastructure requires codifying topological and architectural meta-data in a configuration. That configuration requires management; it requires resources on the infrastructure – storage and memory – while the device is active. It is an extra step in the operational process of deploying, migrating and generally managing an application. It is “state” and it can be reduced – though not eliminated – in such a way as to make the run-time environment, at least, stateless and thus more motile. TOMORROW: STATELESS INFRASTRUCTURE ARCHITECTURE What’s needed to move from a state-dependent infrastructure architecture to one that is more stateless is to start viewing infrastructure functions as services. Services can be invoked, they are loosely coupled, they are independent of solution and product. Much in the same way that stateless application architectures address the problems associated with persistence and impede real-time migration of applications across disparate environments, so too does stateless infrastructure architectures address the same issues inherent in policy-based networking – policy persistence. While standardized APIs and common meta-data models can alleviate much of the pain associated with migration of architectures between environments, they still assume the existence of specific types of components (unless, of course, a truly service-oriented model in which services, not product functions, are encapsulated). Such a model extends the coupling between components and in fact can “break” if said service does not exist. Conversely, a stateless architecture assumes nothing; it does not assume the existence of any specific component but merely indicates the need for a particular service as part of the application session flow that can be fulfilled by any appropriate infrastructure providing such a service. This allows the provider more flexibility as they can implement the service without exposing the underlying implementation – exactly as a service-oriented architecture intended. It further allows providers – and customers – to move fluidly between implementations without concern as only the service need exist. The difficulty is determining what services can be de-coupled from infrastructure components and invoked on-demand, at run-time. This is not just an application concern, it becomes an infrastructure component concern, as well, as each component in the flow might invoke an upstream – or downstream – service depending on the context of the request or response being processed. Assuming that such services exist and can be invoked dynamically through a component and implementation-agnostic mechanism, it is then possible to eliminate many of the pre-positioned, hard-coded policies across the infrastructure and instead invoke them dynamically. Doing so reduces the configuration management required to maintain such policies, as well as eliminating complexity in the provisioning process which must, necessarily, include policy configuration across the infrastructure in a well-established and integrated enterprise-class architecture. Assuming as well that providers have implemented support for similar services, one can begin to see the migratory issues are more easily redressed and the complications caused by needed to pre-provision services and address policy persistence during migration mostly eliminated. SERVICE-ORIENTED THINKING One way of accomplishing such a major transformation in the data center – from policy to service-oriented architecture – is to shift our thinking from functions to services. It is not necessarily efficient to simply transplant a software service-oriented approach to infrastructure because the demands on performance and aversion to latency makes a dynamic, run-time binding to services unappealing. It also requires a radical change in infrastructure architecture by adding the components and services necessary to support such a model – registries and the ability of infrastructure components to take advantage of them. An in-line, transparent invocation method for infrastructure services offers the same flexibility and motility for applications and infrastructure without imposing performance or additional dependency constraints on implementers. But to achieve a stateless infrastructure architectural model, one must first shift their thinking from functions to services and begin to visualize a data center in which application requests and responses communicate the need for particular downstream and upstream services with them, rather than completely in hard-coded policies stored in component configurations. It is unlikely that in the near-term we can completely eliminate the need for hard-coded configuration, we’re just no where near that level of dynamism and may never be. But for many services – particularly those associated with run-time delivery of applications, we can achieve the stateless architecture necessary to realize a more mobile and dynamic data center. Now Witness the Power of this Fully Operational Feedback Loop Cloud is the How not the What Challenging the Firewall Data Center Dogma Cloud-Tiered Architectural Models are Bad Except When They Aren’t Cloud Chemistry 101 You Can’t Have IT as a Service Until IT Has Infrastructure as a Service Let’s Face It: PaaS is Just SOA for Platforms Without the Baggage The New Distribution of The 3-Tiered Architecture Changes Everything497Views0likes1CommentHTML5 Web Sockets Changes the Scalability Game
#HTML5 Web Sockets are poised to completely change scalability models … again. Using Web Sockets instead of XMLHTTPRequest and AJAX polling methods will dramatically reduce the number of connections required by servers and thus has a positive impact on performance. But that reliance on a single connection also changes the scalability game, at least in terms of architecture. Here comes the (computer) science… If you aren’t familiar with what is sure to be a disruptive web technology you should be. Web Sockets, while not broadly in use (it is only a specification, and a non-stable one at that) today is getting a lot of attention based on its core precepts and model. Web Sockets Defined in the Communications section of the HTML5 specification, HTML5 Web Sockets represents the next evolution of web communications—a full-duplex, bidirectional communications channel that operates through a single socket over the Web. HTML5 Web Sockets provides a true standard that you can use to build scalable, real-time web applications. In addition, since it provides a socket that is native to the browser, it eliminates many of the problems Comet solutions are prone to. Web Sockets removes the overhead and dramatically reduces complexity. - HTML5 Web Sockets: A Quantum Leap in Scalability for the Web So far, so good. The premise upon which the improvements in scalability coming from Web Sockets are based is the elimination of HTTP headers (reduces bandwidth dramatically) and session management overhead that can be incurred by the closing and opening of TCP connections. There’s only one connection required between the client and server over which much smaller data segments can be sent without necessarily requiring a request and a response pair. That communication pattern is definitely more scalable from a performance perspective, and also has a positive impact of reducing the number of connections per client required on the server. Similar techniques have long been used in application delivery (TCP multiplexing) to achieve the same results – a more scalable application. So far, so good. Where the scalability model ends up having a significant impact on infrastructure and architectures is the longevity of that single connection: Unlike regular HTTP traffic, which uses a request/response protocol, WebSocket connections can remain open for a long time. - How HTML5 Web Sockets Interact With Proxy Servers This single, persistent connection combined with a lot of, shall we say, interesting commentary on the interaction with intermediate proxies such as load balancers. But ignoring that for the nonce, let’s focus on the “remain open for a long time.” A given application instance has a limit on the number of concurrent connections it can theoretically and operationally manage before it reaches the threshold at which performance begins to dramatically degrade. That’s the price paid for TCP session management in general by every device and server that manages TCP-based connections. But Lori, you’re thinking, HTTP 1.1 connections are persistent, too. In fact, you don’t even have to tell an HTTP 1.1 server to keep-alive the connection! This really isn’t a big change. Whoa there hoss, yes it is. While you’d be right in that HTTP connections are also persistent, they generally have very short connection timeout settings. For example, the default connection timeout for Apache 2.0 is 15 seconds and for Apache 2.2 a mere 5 seconds. A well-tuned web server, in fact, will have thresholds that closely match the interaction patterns of the application it is hosting. This is because it’s a recognized truism that long and often idle connections tie up server processes or threads that negatively impact overall capacity and performance. Thus the introduction of connections that remain open for a long time changes the capacity of the server and introduces potential performance issues when that same server is also tasked with managing other short-lived, connection-oriented requests. Why this Changes the Game… One of the most common inhibitors of scale and high-performance for web applications today is the deployment of both near-real-time communication functions (AJAX) and traditional web content functions on the same server. That’s because web servers do not support a per-application HTTP profile. That is to say, the configuration for a web server is global; every communication exchange uses the same configuration values such as connection timeouts. That means configuring the web server for exchanges that would benefit from a longer time out end up with a lot of hanging connections doing absolutely nothing because they were used to grab standard dynamic or static content and then ignored. Conversely, configuring for quick bursts of requests necessarily sets timeout values too low for near or real-time exchanges and can cause performance issues as a client continually opens and re-opens connections. Remember, an idle connection is a drain on resources that directly impacts the performance and capacity of applications. So it’s a Very Bad Thing™. One of the solutions to this somewhat frustrating conundrum, made more feasible by the advent of cloud computing and virtualization, is to deploy specialized servers in a scalability domain-based architecture using infrastructure scalability patterns. Another approach to ensuring scalability is to offload responsibility for performance and connection management to an appropriately capable intermediary. Now, one would hope that a web server implementing support for both HTTP and Web Sockets would support separately configurable values for communication settings on at least the protocol level. Today there are very few web servers that support both HTTP and Web Sockets. It’s a nascent and still evolving standard so many of the servers are “pure” Web Sockets servers, many implemented in familiar scripting languages like PHP and Python. Which means two separate sets of servers that must be managed and scaled. Which should sound a lot like … specialized servers in a scalability domain-based architecture. The more things change, the more they stay the same. The second impact on scalability architectures centers on the premise that Web Sockets keep one connection open over which message bits can be exchanged. This ties up resources, but it also requires that clients maintain a connection to a specific server instance. This means infrastructure (like load balancers and web/application servers) will need to support persistence (not the same as persistent, you can read about the difference here if you’re so inclined). That’s because once connected to a Web Socket service the performance benefits are only realized if you stay connected to that same service. If you don’t and end up opening a second (or Heaven-forbid a third or more) connection, the first connection may remain open until it times out. Given that the premise of the Web Socket is to stay open – even through potentially longer idle intervals – it may remain open, with no client, until the configured time out. That means completely useless resources tied up by … nothing. Persistence-based load balancing is a common feature of next-generation load balancers (application delivery controllers) and even most cloud-based load balancing services. It is also commonly implemented in application server clustering offerings, where you’ll find it called server-affinity. It is worth noting that persistence-based load balancing is not without its own set of gotchas when it comes to performance and capacity. THE ANSWER: ARCHITECTURE The reason that these two ramifications of Web Sockets impacts the scalability game is it requires an broader architectural approach to scalability. It can’t necessarily be achieved simply by duplicating services and distributing the load across them. Persistence requires collaboration with the load distribution mechanism and there are protocol-based security constraints with respect to incorporating even intra-domain content in a single page/application. While these security constraints are addressable through configuration, the same caveats with regards to the lack of granularity in configuration at the infrastructure (web/application server) layer must be made. Careful consideration of what may be accidentally allowed and/or disallowed is necessary to prevent unintended consequences. And that’s not even starting to consider the potential use of Web Sockets as an attack vector, particularly in the realm of DDoS. The long-lived nature of a Web Socket connection is bound to be exploited at some point in the future, which will engender another round of evaluating how to best address application-layer DDoS attacks. A service-focused, distributed (and collaborative) approach to scalability is likely to garner the highest levels of success when employing Web Socket-based functionality within a broader web application, as opposed to the popular cookie-cutter cloning approach made exceedingly easy by virtualization. Infrastructure Scalability Pattern: Partition by Function or Type Infrastructure Scalability Pattern: Sharding Sessions Amazon Makes the Cloud Sticky Load Balancing Fu: Beware the Algorithm and Sticky Sessions Et Tu, Browser? Forget Hyper-Scale. Think Hyper-Local Scale. Infrastructure Scalability Pattern: Sharding Streams Infrastructure Architecture: Whitelisting with JSON and API Keys Does This Application Make My Browser Look Fat? HTTP Now Serving … Everything641Views0likes5CommentsThe Challenges of SQL Load Balancing
#infosec #iam load balancing databases is fraught with many operational and business challenges. While cloud computing has brought to the forefront of our attention the ability to scale through duplication, i.e. horizontal scaling or “scale out” strategies, this strategy tends to run into challenges the deeper into the application architecture you go. Working well at the web and application tiers, a duplicative strategy tends to fall on its face when applied to the database tier. Concerns over consistency abound, with many simply choosing to throw out the concept of consistency and adopting instead an “eventually consistent” stance in which it is assumed that data in a distributed database system will eventually become consistent and cause minimal disruption to application and business processes. Some argue that eventual consistency is not “good enough” and cite additional concerns with respect to the failure of such strategies to adequately address failures. Thus there are a number of vendors, open source groups, and pundits who spend time attempting to address both components. The result is database load balancing solutions. For the most part such solutions are effective. They leverage master-slave deployments – typically used to address failure and which can automatically replicate data between instances (with varying levels of success when distributed across the Internet) – and attempt to intelligently distribute SQL-bound queries across two or more database systems. The most successful of these architectures is the read-write separation strategy, in which all SQL transactions deemed “read-only” are routed to one database while all “write” focused transactions are distributed to another. Such foundational separation allows for higher-layer architectures to be implemented, such as geographic based read distribution, in which read-only transactions are further distributed by geographically dispersed database instances, all of which act ultimately as “slaves” to the single, master database which processes all write-focused transactions. This results in an eventually consistent architecture, but one which manages to mitigate the disruptive aspects of eventually consistent architectures by ensuring the most important transactions – write operations – are, in fact, consistent. Even so, there are issues, particularly with respect to security. MEDIATION inside the APPLICATION TIERS Generally speaking mediating solutions are a good thing – when they’re external to the application infrastructure itself, i.e. the traditional three tiers of an application. The problem with mediation inside the application tiers, particularly at the data layer, is the same for infrastructure as it is for software solutions: credential management. See, databases maintain their own set of users, roles, and permissions. Even as applications have been able to move toward a more shared set of identity stores, databases have not. This is in part due to the nature of data security and the need for granular permission structures down to the cell, in some cases, and including transactional security that allows some to update, delete, or insert while others may be granted a different subset of permissions. But more difficult to overcome is the tight-coupling of identity to connection for databases. With web protocols like HTTP, identity is carried along at the protocol level. This means it can be transient across connections because it is often stuffed into an HTTP header via a cookie or stored server-side in a session – again, not tied to connection but to identifying information. At the database layer, identity is tightly-coupled to the connection. The connection itself carries along the credentials with which it was opened. This gives rise to problems for mediating solutions. Not just load balancers but software solutions such as ESB (enterprise service bus) and EII (enterprise information integration) styled solutions. Any device or software which attempts to aggregate database access for any purpose eventually runs into the same problem: credential management. This is particularly challenging for load balancing when applied to databases. LOAD BALANCING SQL To understand the challenges with load balancing SQL you need to remember that there are essentially two models of load balancing: transport and application layer. At the transport layer, i.e. TCP, connections are only temporarily managed by the load balancing device. The initial connection is “caught” by the Load balancer and a decision is made based on transport layer variables where it should be directed. Thereafter, for the most part, there is no interaction at the load balancer with the connection, other than to forward it on to the previously selected node. At the application layer the load balancing device terminates the connection and interacts with every exchange. This affords the load balancing device the opportunity to inspect the actual data or application layer protocol metadata in order to determine where the request should be sent. Load balancing SQL at the transport layer is less problematic than at the application layer, yet it is at the application layer that the most value is derived from database load balancing implementations. That’s because it is at the application layer where distribution based on “read” or “write” operations can be made. But to accomplish this requires that the SQL be inline, that is that the SQL being executed is actually included in the code and then executed via a connection to the database. If your application uses stored procedures, then this method will not work for you. It is important to note that many packaged enterprise applications rely upon stored procedures, and are thus not able to leverage load balancing as a scaling option. Depending on your app or how your organization has agreed to protect your data will determine which of these methods are used to access your databases. The use of inline SQL affords the developer greater freedom at the cost of security, increased programming(to prevent the inherent security risks), difficulty in optimizing data and indices to adapt to changes in volume of data, and deployment burdens. However there is lively debate on the values of both access methods and how to overcome the inherent risks. The OWASP group has identified the injection attacks as the easiest exploitation with the most damaging impact. This also requires that the load balancing service parse MySQL or T-SQL (the Microsoft Transact Structured Query Language). Databases, of course, are designed to parse these string-based commands and are optimized to do so. Load balancing services are generally not designed to parse these languages and depending on the implementation of their underlying parsing capabilities, may actually incur significant performance penalties to do so. Regardless of those issues, still there are an increasing number of organizations who view SQL load balancing as a means to achieve a more scalable data tier. Which brings us back to the challenge of managing credentials. MANAGING CREDENTIALS Many solutions attempt to address the issue of credential management by simply duplicating credentials locally; that is, they create a local identity store that can be used to authenticate requests against the database. Ostensibly the credentials match those in the database (or identity store used by the database such as can be configured for MSSQL) and are kept in sync. This obviously poses an operational challenge similar to that of any distributed system: synchronization and replication. Such processes are not easily (if at all) automated, and rarely is the same level of security and permissions available on the local identity store as are available in the database. What you generally end up with is a very loose “allow/deny” set of permissions on the load balancing device that actually open the door for exploitation as well as caching of credentials that can lead to unauthorized access to the data source. This also leads to potential security risks from attempting to apply some of the same optimization techniques to SQL connections as is offered by application delivery solutions for TCP connections. For example, TCP multiplexing (sharing connections) is a common means of reusing web and application server connections to reduce latency (by eliminating the overhead associated with opening and closing TCP connections). Similar techniques at the database layer have been used by application servers for many years; connection pooling is not uncommon and is essentially duplicated at the application delivery tier through features like SQL multiplexing. Both connection pooling and SQL multiplexing incur security risks, as shared connections require shared credentials. So either every access to the database uses the same credentials (a significant negative when considering the loss of an audit trail) or we return to managing duplicate sets of credentials – one set at the application delivery tier and another at the database, which as noted earlier incurs additional management and security risks. YOU CAN’T WIN FOR LOSING Ultimately the decision to load balance SQL must be a combination of business and operational requirements. Many organizations successfully leverage load balancing of SQL as a means to achieve very high scale. Generally speaking the resulting solutions – such as those often touted by e-Bay - are based on sound architectural principles such as sharding and are designed as a strategic solution, not a tactical response to operational failures and they rarely involve inspection of inline SQL commands. Rather they are based on the ability to discern which database should be accessed given the function being invoked or type of data being accessed and then use a traditional database connection to connect to the appropriate database. This does not preclude the use of application delivery solutions as part of such an architecture, but rather indicates a need to collaborate across the various application delivery and infrastructure tiers to determine a strategy most likely to maintain high-availability, scalability, and security across the entire architecture. Load balancing SQL can be an effective means of addressing database scalability, but it should be approached with an eye toward its potential impact on security and operational management. What are the pros and cons to keeping SQL in Stored Procs versus Code Mission Impossible: Stateful Cloud Failover Infrastructure Scalability Pattern: Sharding Streams The Real News is Not that Facebook Serves Up 1 Trillion Pages a Month… SQL injection – past, present and future True DDoS Stories: SSL Connection Flood Why Layer 7 Load Balancing Doesn’t Suck Web App Performance: Think 1990s.2.2KViews0likes1CommentF5 Friday: HP Cloud Maps Help Navigate Server Flexing with BIG-IP
The economy of scale realized in enterprise cloud computing deployments is as much (if not more) about process as it is products. HP Cloud Maps simplify the former by automating the latter. When the notion of “private” or “enterprise” cloud computing first appeared, it was dismissed as being a non-viable model due to the fact that the economy of scale necessary to realize the true benefits were simply not present in the data center. What was ignored in those arguments was that the economy of scale desired by enterprises large and small was not necessarily that of technical resources, but of people. The widening gap between people and budgets and data center components was a primary cause of data center inefficiency. Enterprise cloud computing promised to relieve the increasing burden on people by moving it back to technology through automation and orchestration. As a means to achieve such a feat – and it is a non-trivial feat – required an ecosystem. No single vendor could hope to achieve the automation necessary to relieve the administrative and operational burden on enterprise IT staff because no data center is ever comprised of components provided by a single vendor. Partnerships – technological and practical partnerships – were necessary to enable the automation of processes spanning multiple data center components and achieve the economy of scale promised by enterprise cloud computing models. HP, while providing a wide variety of data center components itself, has nurtured such an ecosystem of partners. Combined with its HP Operations Orchestration, such technologically-focused partnerships have built out an ecosystem enabling the automation of common operational processes, effectively shifting the burden from people to technology, resulting in a more responsive IT organization. HP CLOUD MAPS One of the ways in which HP enables customers to take advantage of such automation capabilities is through Cloud Maps. Cloud Maps are similar in nature to F5’s Application Ready Solutions: a package of configuration templates, guides and scripts that enable repeatable architectures and deployments. Cloud Maps, according to HP’s description: HP Cloud Maps are an easy-to-use navigation system which can save you days or weeks of time architecting infrastructure for applications and services. HP Cloud Maps accelerate automation of business applications on the BladeSystem Matrix so you can reliably and consistently fast- track the implementation of service catalogs. HP Cloud Maps enable practitioners to navigate the complex operational tasks that must be accomplished to achieve even what seems like the simplest of tasks: server provisioning. It enables automation of incident resolution, change orchestration and routine maintenance tasks in the data center, providing the consistency necessary to enable more predictable and repeatable deployments and responses to data center incidents. Key components of HP Cloud Maps include: Templates for hardware and software configuration that can be imported directly into BladeSystem Matrix Tools to help guide planning Workflows and scripts designed to automate installation more quickly and in a repeatable fashion Reference whitepapers to help customize Cloud Maps for specific implementation HP CLOUD MAPS for F5 NETWORKS The partnership between F5 and HP has resulted in many data center solutions and architectures. HP’s Cloud Maps for F5 Networks today focuses on what HP calls server flexing – the automation of server provisioning and de-provisioning on-demand in the data center. It is designed specifically to work with F5 BIG-IP Local Traffic Manager (LTM) and provides the necessary configuration and deployment templates, scripts and guides necessary to implement server flexing in the data center. The Cloud Map for F5 Networks can be downloaded free of charge from HP and comprises: The F5 Networks BIG-IP reference template to be imported into HP Matrix infrastructure orchestration Workflow to be imported into HP Operations Orchestration (OO) XSL file to be installed on the Matrix CMS (Central Management Server) Perl configuration script for BIG-IP White papers with specific instructions on importing reference templates, workflows and configuring BIG-IP LTM are also available from the same site. The result is an automation providing server flexing capabilities that greatly reduces the manual intervention necessary to auto-scale and respond to capacity-induced events within the data center. Happy Flexing! Server Flexing with F5 BIG-IP and HP BladeSystem Matrix HP Cloud Maps for F5 Networks F5 Friday: The Dynamic Control Plane F5 Friday: The Evolution of Reference Architectures to Repeatable Architectures All F5 Friday Posts on DevCentral Infrastructure 2.0 + Cloud + IT as a Service = An Architectural Parfait What is a Strategic Point of Control Anyway? The F5 Dynamic Services Model Unleashing the True Potential of On-Demand IT307Views0likes1CommentData Center Feng Shui: Fault Tolerance and Fault Isolation
Like most architectural decisions the two goals do not require mutually exclusive decisions. The difference between fault isolation and fault tolerance is not necessarily intuitive. The differences, though subtle, are profound and have a substantial impact on data center architecture. Fault tolerance is an attribute of systems and architecture that allow it to continue performing its tasks in the event of a component failure. Fault tolerance of servers, for example, is achieved through the use of redundancy in power-supplies, in hard-drives, and in network cards. In an architecture, fault tolerance is also achieved through redundancy by deploying two of everything: two servers, two load balancers, two switches, two firewalls, two Internet connections. The fault tolerant architecture includes no single point of failure; no component that can fail and cause a disruption in service. load balancing, for example, is a fault tolerant-based strategy that leverages multiple application instances to ensure that failure of one instance does not impact the availability of the application. Fault isolation on the other hand is an attribute of systems and architectures that isolates the impact of a failure such that only a single system, application, or component is impacted. Fault isolation allows that a component may fail as long as it does not impact the overall system. That sounds like a paradox, but it’s not. Many intermediary devices employ a “fail open” strategy as a method of fault isolation. When a network device is required to intercept data in order to perform its task – a common web application firewall configuration – it becomes a single point of failure in the data path. To mitigate the potential failure of the device, if something should fail and cause the system to crash it “fails open” and acts like a simple network bridge by simply forwarding packets on to the next device in the chain without performing any processing. If the same component were deployed in a fault-tolerant architecture, there would be deployed two devices and hopefully leveraging non-network based failover mechanisms. Similarly, application infrastructure components are often isolated through a contained deployment model (like sandboxes) that prevent a failure – whether an outright crash or sudden massive consumption of resources – from impacting other applications. Fault isolation is of increasing interest as it relates to cloud computing environments as part of a strategy to minimize the perceived negative impact of shared network, application delivery network, and server infrastructure.398Views0likes2CommentsF5 Friday: Gracefully Scaling Down
What goes up, must come down. The question is how much it hurts (the user). An oft ignored side of elasticity is scaling down. Everyone associates scaling out/up with elasticity of cloud computing but the other side of the coin is just as important, maybe more so. After all, what goes up must come down. The trick is to scale down gracefully, i.e. to do it in such a way as to prevent the disruption of service to existing users while simultaneously trying to scale back down after a spike in demand. The ramifications of not scaling down are real in terms of utilization and therefore cost. Scaling up with the means to scale back down means higher costs, and simply shutting down an instance that is currently in use can result in angry users as service is disrupted. What’s necessary is to be able to gracefully scale down; to indicate somehow to the load balancing solution that a particular instance is no longer necessary and begin preparation for eventually shutting it down. Doing so gracefully requires that you are somehow able to quiesce or bleed off the connections. You want to continue to service those users who are currently connected to the instance while not accepting any new connections. This is one of the benefits of leveraging an application-aware application delivery controller versus a simple Load balancer: the ability to receive instruction in-process to begin preparation for shut down without interrupting existing connections. SERVING UP ACTIONABLE DATA BIG-IP users have always had the ability to specify whether disabling a particular “node” or “member” results in the rejection of all connections (including existing ones) or if it results in refusing new connections while allowing old ones to continue to completion. The latter technique is often used in preparation for maintenance on a particular server for applications (and businesses) that are sensitive to downtime. This method maintains availability while accommodating necessary maintenance. In version 10.2 of the core BIG-IP platform a new option was introduced that more easily enables the process of draining a server/application’s connections in preparation for being taken offline. Whether the purpose is maintenance or simply the scaling down side of elastic scalability is really irrelevant; the process is much the same. Being able to direct a load balancing service in the way in which connections are handled from the application is an increasingly important capability, especially in a public cloud computing environment because you are unlikely to have the direct access to the load balancing system necessary to manually engage this process. By providing the means by which an application can not only report but direct the load balancing service, some measure of customer control over the deployment environment is re-established without introducing the complexity of requiring the provider to manage the thousands (or more) credentials that would otherwise be required to allow this level of control over the load balancer’s behavior. HOW IT WORKS For specific types of monitors in LTM (Local Traffic Manager) – HTTP, HTTPS, TCP, and UDP – there is a new option called “Receive Disable String.” This “string” is just that, a string that is found within the content returned from the application as a result of the health check. In phase one we have three instances of an application (physical or virtual, doesn’t matter) that are all active. They all have active connections and are all receiving new connections. In phase two a health check on one server returns a response that includes the string “DISABLE ME.” BIG-IP sees this and, because of its configuration, knows that this means the instance of the application needs to gracefully go offline. LTM therefore continues to direct existing connections (sessions) with that instance to the right application (phase 3), but subsequently directs all new connection requests to the other instances in the pool (farm, cluster). When there are no more existing connections the instance can be taken offline or shut down with zero impact to users. The combination of “receive string” and “receive disable string” impacts the way in which BIG-IP interprets the instruction. A “receive string” typically describes the content received that indicates an available and properly executing application. This can be as simple as “HTTP 200 OK” or as complex as looking for a specific string in the response. Similarly the “receive disable” string indicates a particular string of text that indicates a desire to disable the node and begin the process of bleeding off connections. This could be as simple as “DISABLE” as indicated in the above diagram or it could just as easily be based solely on HTTP status codes. If an application instance starts returning 50x errors because it’s at capacity, the load balancing policy might include a live disable of the instance to allow it time to cool down – maintaining existing connections while not allowing new ones. Because action is based on matching a specific string, the possibilities are pretty much wide open. The following table describes the possible interactions between the two receive string types: LEVERAGING as a PROVIDER One of the ways in which a provider could leverage this functionality to provide differentiated value-added cloud services (as Randy Bias calls them) would be to define an application health monitoring API of sorts that allows customers to add to their application a specific set of URIs that are used solely for monitoring and can thus control the behavior of the load balancer without requiring per-customer access to the infrastructure itself. That’s a win-win, by the way. The customer gets control but so does the provider. Consider an health monitoring API that is a single URI: http://$APPLICATION_INSTANCE_HOSTNAME/health/check. Now provide a set of three options for customers to return (these are likely oversimplified for illustration purposes, but not by much): ENABLE QUIESCE DISABLE For all application instances the BIG-IP will automatically use an HTTP-derived monitor that calls $APP_INSTANCE/health/check and examines the result. The monitor would use “ENABLE” as the “receive string” and “QUIESCE” as the “receive disable” string. Based on the string returned by the application, the BIG-IP takes the appropriate action (as defined by the table above). Of course this can also easily be accomplished by providing a button on the cloud management interface to do the same via iControl, but this option is more able to be programmatically defined by customers and thus is more dynamic and allows for automation. And of course such an implementation isn’t relegated only to service providers; IT organizations in any environment can take advantage of such an implementation, especially if they’re working toward an automated data center and/or self-service provisioning/management of IT services. That is infrastructure as a service. Yes, this means modification to the application being deployed. No, I don’t think that’s a problem – cloud and Infrastructure as a Service (IaaS), at least real IaaS is going to necessarily require modifications to existing applications and new applications will need to include this type of integration in the future if we are to take advantage of the benefits afforded by a more application aware infrastructure and, conversely, a more infrastructure-aware application architecture. Related Posts728Views0likes1CommentEvent-Driven Platforms Critical for Next-Gen Network
#SDN #devops #Node.js is awesomesauce for the network, too. I recently stumbled across an article discussing node.js and devops in the same breath. Yes, I was in heaven, why do you ask? In any case, the article notes two primary reasons why node.js is an excellent choice for devops tools: Node.js has a couple distinct advantages for infrastructure maintenance: 1. It has a small footprint 2. The event-based nature of Node.js helps keep servers from getting bogged down handling time consuming tasks. -- Why So Many DevOps Tools are Written in Node.js While I'm not discounting the value of the first point, I think the second is much more important not just to devops but to the data center (and in particular management of) in general, especially given the trend toward centralized management planes across the entire network stack (think SDN). The funny thing is that if you take a step back and look at DevOps and SDN you'll see that we've had this discussion under the Infrastructure 2.0 umbrella in the past. One of the core components of the idea of Infrastructure 2.0 (which is really the pre-cursor to SDN, if not its parent) was that some centralized controller (sound familiar?), mediator, whatever you want to call it, would be responsible for accepting and distributing events across the network infrastructure stack. It grew out of concerns around IPAM (IP Address Management) and the increasingly volatile rate of change occurring there thanks to virtualization and cloud computing. But these controllers would be software, necessarily, and thus concerns regarding scalability and performance quickly become a sticking point. It should be no surprise that one of the primary arguments against OpenFlow-enabled architectures is that the controller does not scale or ultimately perform well. Clustering is an attempt to address both issues in large-scale networks, but this is just rehashing the same "how do we scale web applications" that's been going on since the dot com era began. BLOCKING versus NON-BLOCKING. SYNC versus ASYNC. That's because traditional models of development were thread-based, synchronous and ultimately blocking. Modern scalable frameworks are non-blocking and support asynchronous modes. The modern non-blocking, asynchronous model is geared toward long sessions with intermittent communication between client and server. Which is exactly the kind of model laid out to support a more dynamic, event-driven network architecture. Network components need to know right now when an event happens, but they can't waste cycles opening and closing connections to "check" every x-seconds. That mode of operation doesn't scale and the overhead from TCP session management on both the component and the server would be devastating. So asynchronous, high-capacity is what's necessary and node.js fulfills those requirements. Can you write async, non-blocking servers in other languages and on other platforms? Yes. Java and Erlang and a variety of other languages have frameworks that support such a model. But they are ultimately still encumbered by the underlying platform (application server). The impact on performance and scale still exists, albeit it occurs at a much higher threshold than traditional applications. Node.js eliminates the platform impact, thus providing DevOps (and developers, for that matter) with a high-capacity, high-performing, highly programmable platform upon which to deploy... well, whatever they want. And because so many devops-related tasks that need to be automated are driven by events (provision, launch, shut-down, add, remove, stop, start) it starts to make a lot of sense to look at node as a platform for DevOps upon which they can build tools. (Etsy has a great blog on the use of node.js for devops tools, if you're interested) The gravitational pull toward node.js is, likely, because it is inherently event-driven. It isn't that you can do event-driven, it is. That's it. With Java you can write event-driven daemons, yes, but it's not the primary model for Java (or its underlying platforms) and it takes more work than writing a traditional application. The network (and by "network" I mean the whole stack from L2 to L7) is driven by events - packet in, packet out, connection made, response received, IP address assigned, IP address released - and thus a lightweight, highly scalable event-driven framework fits naturally as a platform upon which to build tools that help manage the network.173Views0likes0CommentsCloud Brokers: The eHarmony of Cloud Infrastructure?
A long-lived match made in the cloud will require more detail than providers and brokers are currently offering Buried in blogs around the Internets are references to a research survey conducted by research firm ESG earlier this year on behalf of VMware. While obtaining the full contents require more than my meager pockets contain, the summary data held several nuggets of gold, among them this one on compatibility across cloud and on-premise infrastructure: Importance of compatibility: 78% of respondents reported that it was also important that their cloud service providers’ infrastructure technologies were compatible with their internal private cloud/virtualized datacenter. -- New Study Shows Growth of Infrastructure-as-a-Service (IaaS) Adoption I'm guessing that if they'd used a term more common to networking / infrastructure domains, say interoperability, they may have gotten more ping on this statistic. And yet, I don't think interoperability is exactly what this statistic refers to. Interoperability and compatibility imply some subtle differences. Compatibility implies a level of mutual understanding at the data level which, for infrastructure, means control-plane compatibility. Policy sharing, as an example. Interoperability implies a lower-level of exchange at the data plane, protocol processing and such. Assuming that what these 614 global respondents – all IT managers with budget responsibility – were considering important is really mobility of infrastructure service application, i.e. operational consistency, across infrastructure, this leads to an interesting question. How does one know whether a cloud offers such compatibility or not? Cloud Brokers: Catalogs Most discussions on cloud brokers today continue to focus on establishing matches between consumers of cloud and providers via characteristics like price and location and sometimes performance. But rarely do we see a requirement for matching infrastructure service capabilities (and thus compatibility) across environments. Most cloud providers do offer catalogs of a sort from which services can be selected and provisioned, but these are not very shareable, if you will. They aren't in a neat, publicly accessible list that can be compared against other providers' lists. Much in the same way registries were central to the notion of enabling the success of service-oriented architectures, cloud catalogs will be central to the success of cloud broker services' ability to compare and contrast offerings. Like profiles in on-line match making services, cloud catalogs will enable the process of matching consumers with providers based not only on simple characteristics like price and performance, but on deeper more critical capabilities like security and data integrity services, acceleration and optimization, and application-layer networking. But it's not just about listing out services. To really get to the heart of compatibility, if what we're desiring is operational consistency, we need not only a more standard method of describing infrastructure policies (rule sets, processing directives, etc…) but the means to determine whether a given infrastructure service is capable of not only accepting but creating such a policy. Such policies must be abstract, they cannot be specific to any given environment. We need a way to describe the rules used to configure Amazon Security Groups, for example, such that they can be consumed and implemented by Rackspace, or BlueLock or an OpenStack-based private cloud framework. While there are certainly efforts around describing aspects of cloud – virtual machines, applications, and even layer 2-3 networking – in a standards-based format, there's very little in the way of efforts to do so at the infrastructure service layers. Organizations for whom infrastructure compatibility is an important factor in the provider decision making process need exactly this kind of information to aid in their transition from private to hybrid and public architectures. Cloud brokers could provide this level of metadata, if providers recognize the importance of disseminating that information in a way that's easily consumable and are willing to offer the data organizations need to make their decisions. Making a compatible match between two people requires a lot more detail than just age, gender, and hobbies. It's also going to take a lot more detail than just price and location to make a compatible match between cloud consumers and providers when critical business functions are on the line.220Views0likes0CommentsThe Limits of Cloud: NICS and Nets
#Cloud is great at many things. At other things, not so much. Understanding the limitations of cloud will better enable a successful migration strategy. You might have noticed that in general, enterprise-grade networking solutions aren't always available for general deployment in public cloud environments. You might also have noticed that when you provision a compute instance in a public cloud environment you get one public (and usually one private) IP address. I'll stop for a moment and let you consider the relationship between these two facts. Many mature enterprise-grade networking solutions require at least two network interfaces – one for traffic (data plane) and one for management (control plane) and often suggest a third for optimal, best-practice deployment. It's been a long time since I've seen mature networking solutions that don't employ segregated management networks. Those solutions that sit inline and that are in the line of fire, as you will, from concentrated network and application-layer attacks, absolutely need segregated management as a means to control the solution and mitigate an in-progress attack or sudden spike in utilization that might be overwhelming the primary network. The use of a separate management network also ensures that the control plane is secured from general access. Generally speaking, it's been considered a best-practice to use a separate, secured management network for critical network components since the web exploded. Unfortunately, most cloud environments don't support this capability for customers. While certainly cloud is the largest example of control-data plane separation, such environments are designed to transport control of provider functions and service-control, not customer-deployed solutions. Thus the instances provisioned by the customer are expected to exist on the data plane because management functions (control plane) are handled through the provider's framework / API. That means vendors of mature, enterprise-grade networking solutions have few options for cloud-enabling their solutions when NICs (and networks) are limited. Amazon EC2 is one such environment; it currently does not support multiple IP addresses per instance. Simply AMI-enabling a networking solution that requires a separate management network is not going to be enough. That's why you see enterprise-class networking solutions becoming available for Amazon AWS, but only in its Virtual Private Cloud (VPC) environment. In the VPC environment instances are able to take advantage of more advanced networking capabilities familiar to enterprise operations such as control over IP address ranges, creation of subnets, and configuration of routes and network gateways. Rackspace, on the other hand, is moving toward enabling multiple networks capable of broadcasting and multicasting through its evolving support for OpenStack. Such capabilities will enable customers to take advantage of mature networking solutions within its environment. There are restrictions, of course, as will be the case with any provider, but in general such a move toward enabling advanced networking within open cloud environments is a positive one. What this all means is that when considering cloud providers and migration of applications (and their supporting infrastructure) it is critical to seek out and understand what advanced networking capabilities are – or aren't – available for each provider you are evaluating. Infrastructure support is a key component for many enterprise-class applications now being considered for migration to the cloud and not all clouds will be able to equally support the advanced networking services necessary.215Views0likes0CommentsSimplifying Application Architecture in a Dynamic Data Center through Virtualization
Application architecture has never really been easy, but the introduction of virtualization may make it even less easy – unless you plan ahead Most applications today maintain at least two if not three or more "tiers" within their architecture. Web (presentation), application server (business logic), and database (data) are the three most common "tiers" to an application, web-based or otherwise, with the presence of middleware (queues and buses) being an optional fourth tier, depending on the application and its integration needs. It's never been an easy task to ensure that the web servers know where the application servers know where the middleware know where the databases are during a deployment. Most of this information is manually configured as pools of connections defined by IP addresses. IP addresses that, with the introduction of virtualization and cloud computing , may be dynamic. Certainly not as dynamic as one would expect in a public cloud environment, but dynamic nonetheless. A high frequency of change is really no more disruptive to such an IP-dependent environment than a single change, as it requires specific configuration modifications, additions or, in the case of elasticity, removals. It is only somewhat ironic that the same technology that introduces the potential for problems is the same technology that offers a simple solution: virtualization. Only this virtualization is not the one-to-many server-style virtualization, but rather the many-to-one network virtualization that has existed since the advent of proxy-based solutions. VIRTUAL SERVICE ENDPOINTS If considered before deployment – such as during design and implementation – then the use of virtual service endpoints as implemented through network virtualization techniques can dramatically improve the ability of application architectures to scale up and down and deal with any mobility within the infrastructure that might occur. Rather than directing any given tier to the next directly, all that is necessary is to direct the tier to a virtual service endpoint (an IP-port combination) on the appropriate infrastructure and voila! Instant scalability. The endpoint on the infrastructure, such as an application delivery controller, can ensure the scalability and availability (as well as security and other operationally necessary functions) by acting on that single "virtual service endpoint". Each tier scales elastically of its own accord, based on demand, without disrupting the connectivity and availability of other tiers. No other tier need care or even be aware of changes in the IP address assignments of other tiers, because it sees the entire tier as being the "virtual service endpoint" all the time. It stabilizes integration challenges by eliminating the problems associated with changing IP addresses in each tier of the architecture. It also has the benefit of eliminating service-IP sprawl that is often associated with clustering or other proxy-based solutions that must also scale along with demand and consume more and more (increasingly) valuable IP addresses. This decoupling is also an excellent form of abstraction that enables versioning and upgrades to occur without disruption, and can be used to simultaneously support multiple versions of the same interface simply by applying some application (page level) routing at the point of ingress (the virtual service endpoint) rather than using redirects or rewrites in the application itself. The ability to leverage network virtualization to create virtual service endpoints through which application architectures can be simplified and scaled should be seriously considered during the architectural design phase to ensure applications are taking full advantage of its benefits. Once virtual service endpoints are employed, there are a variety of other functions and capabilities that may be able to further simplify or extend application architectures such as two-factor authentication and OTP (one time password) options. The future is decoupled – from internal application design to application integration to the network. Employing a decoupling-oriented approach to enterprise architecture should be a focus for architects to enable IT to take advantage of the benefits and eliminate potential operational challenges.200Views0likes0Comments