dynamic infrastructure
The Full-Proxy Data Center Architecture
Why a full-proxy architecture is important to both infrastructure and data centers.

In the early days of load balancing and application delivery there was a lot of confusion about proxy-based architectures and in particular the definition of a full-proxy architecture. Understanding what a full-proxy is will be increasingly important as we continue to re-architect the data center to support a more mobile, virtualized infrastructure in the quest to realize IT as a Service.

THE FULL-PROXY PLATFORM

The reason there is a distinction made between “proxy” and “full-proxy” stems from the handling of connections as they flow through the device. All proxies sit between two entities – in the Internet age almost always “client” and “server” – and mediate connections. While all full-proxies are proxies, the converse is not true. Not all proxies are full-proxies and it is this distinction that needs to be made when making decisions that will impact the data center architecture.

A full-proxy maintains two separate session tables – one on the client-side, one on the server-side. There is effectively an “air gap” isolation layer between the two internal to the proxy, one that enables focused profiles to be applied specifically to address issues peculiar to each “side” of the proxy. Clients often experience higher latency because of lower bandwidth connections while the servers are generally low latency because they’re connected via a high-speed LAN. The optimizations and acceleration techniques used on the client side are far different than those on the LAN side because the issues that give rise to performance and availability challenges are vastly different.

A full-proxy, with separate connection handling on either side of the “air gap”, can address these challenges. A proxy, which may be a full-proxy but more often than not simply uses a buffer-and-stitch methodology to perform connection management, cannot optimally do so.
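To make the "two separate connections" idea concrete, here is a minimal sketch in Python using asyncio. It is not how any particular product implements a full-proxy; the function names and the two buffer sizes are assumptions chosen only to show that each side of the proxy has its own connection and can be tuned independently.

```python
import asyncio

# Assumed per-side "profiles": small reads toward slow clients,
# large reads on the fast LAN side. The values are illustrative only.
CLIENT_SIDE_BUFFER = 1024
SERVER_SIDE_BUFFER = 65536


async def relay(reader, writer, chunk_size):
    """Copy bytes from one connection to the other until EOF, then close."""
    try:
        while True:
            data = await reader.read(chunk_size)
            if not data:
                break
            # A full proxy could inspect or rewrite `data` at this point.
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


def make_full_proxy(backend_host, backend_port):
    """Build a handler that terminates the client connection and opens a
    second, independent connection to the backend (hypothetical helper)."""
    async def handle(client_reader, client_writer):
        server_reader, server_writer = await asyncio.open_connection(
            backend_host, backend_port)
        await asyncio.gather(
            relay(client_reader, server_writer, CLIENT_SIDE_BUFFER),
            relay(server_reader, client_writer, SERVER_SIDE_BUFFER),
        )
    return handle
```

The handler can be passed to `asyncio.start_server(make_full_proxy(host, port), listen_host, listen_port)`. Because the client never talks to the backend socket directly, everything crossing the gap passes through `relay`, which is where per-side policy would live.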
A typical proxy buffers a connection, often through the TCP handshake process and potentially into the first few packets of application data, but then “stitches” a connection to a given server on the back-end using either layer 4 or layer 7 data, perhaps both. The connection is a single flow from end to end, so the proxy must choose which characteristics of the connection to focus on – client or server – because it cannot simultaneously optimize for both.

The second advantage of a full-proxy is its ability to perform more tasks on the data being exchanged over the connection as it is flowing through the component. Because specific action must be taken to “match up” the connection as it’s flowing through the full-proxy, the component can inspect, manipulate, and otherwise modify the data before sending it on its way on the server-side. This is what enables termination of SSL, enforcement of security policies, and performance-related services to be applied on a per-client, per-application basis.

This capability translates to broader usage in data center architecture by enabling the implementation of an application delivery tier in which operational risk can be addressed through the enforcement of various policies. In effect, we’ve created a full-proxy data center architecture in which the application delivery tier as a whole serves as the “full proxy” that mediates between the clients and the applications.

THE FULL-PROXY DATA CENTER ARCHITECTURE

A full-proxy data center architecture installs a digital “air gap” between the client and applications by serving as the aggregation (and conversely disaggregation) point for services. Because all communication is funneled through virtualized applications and services at the application delivery tier, it serves as a strategic point of control at which delivery policies addressing operational risk (performance, availability, security) can be enforced.
A full-proxy data center architecture further has the advantage of isolating end-users from the volatility inherent in highly virtualized and dynamic environments such as cloud computing. It also enables solutions that overcome limitations of virtualization technology, such as the pod-architectural constraints encountered in VMware View deployments. Traditional access management technologies, for example, are tightly coupled to host names and IP addresses. In a highly virtualized or cloud computing environment, this constraint may spell disaster for either performance or ability to function, or both. By implementing access management in the application delivery tier – on a full-proxy device – volatility is managed through virtualization of the resources, allowing the application delivery controller to worry about details such as IP address and VLAN segments, freeing the access management solution to concern itself with determining whether this user on this device from that location is allowed to access a given resource.

Basically, we’re taking the concept of a full-proxy and expanding it outward to the architecture. Inserting an “application delivery tier” allows for an agile, flexible architecture more supportive of the rapid changes today’s IT organizations must deal with.

Such a tier also provides an effective means to combat modern attacks. Because of its ability to isolate applications, services, and even infrastructure resources, an application delivery tier improves an organization’s capability to withstand the onslaught of a concerted DDoS attack. The magnitude of difference between the connection capacity of an application delivery controller and most infrastructure (and all servers) gives the entire architecture a higher resiliency in the face of overwhelming connections.
This ensures better availability and, when coupled with virtual infrastructure that can scale on demand, can also maintain performance levels required by business concerns. A full-proxy data center architecture is an invaluable asset to IT organizations in meeting the challenges of volatility both inside and outside the data center.

Infrastructure Architecture: Whitelisting with JSON and API Keys
Application delivery infrastructure can be a valuable partner in architecting solutions…

AJAX and JSON have changed the way in which we architect applications, especially with respect to their ascendancy to rule the realm of integration, i.e. the API. Policies are generally focused on the URI, which has effectively become the exposed interface to any given application function. It’s REST-ful, it’s service-oriented, and it works well. Because we’ve taken to leveraging the URI as a basic building block, as the entry-point into an application, it affords the opportunity to optimize architectures and make more efficient use of the compute power available for processing. This is an increasingly important point, as capacity has become a focal point around which cost and efficiency are measured. By offloading functions to other systems when possible, we are able to increase the useful processing capacity of a given application instance and ensure a higher ratio of valuable processing to resources is achieved.

The ability of application delivery infrastructure to intercept, inspect, and manipulate the exchange of data between client and server should not be underestimated. A full-proxy based infrastructure component can provide valuable services to the application architect that can enhance the performance and reliability of applications while abstracting functionality in a way that alleviates the need to modify applications to support new initiatives.

AN EXAMPLE

Consider, for example, a business requirement specifying that only certain authorized partners (in the integration sense) are allowed to retrieve certain dynamic content via an exposed application API. There are myriad ways in which such a requirement could be implemented, including requiring authentication and subsequent tokens to authorize access – likely the most common means of providing such access management in conjunction with an API.
Most of these options require several steps, however, and interaction directly with the application to examine credentials and determine authorization to requested resources. This consumes valuable compute that could otherwise be used to serve requests.

An alternative approach would be to provide authorized consumers with a more standards-based method of access that includes, in the request, the very means by which authorization can be determined. Taking a lesson from the credit card industry, for example, an algorithm can be used to determine the validity of a particular customer ID or authorization token: an API key, if you will, that is not stored in a database (which would require a lookup) but rather is algorithmic and therefore able to be verified as valid without needing a specific lookup at run-time. Assuming such a token or API key were embedded in the URI, the application delivery service can then extract the key, verify its authenticity using an algorithm, and subsequently allow or deny access based on the result.

This architecture is based on the premise that the application delivery service is capable of responding with the appropriate JSON in the event that the API key is determined to be invalid. Such a service must therefore be network-side scripting capable. Assuming such a platform exists, one can easily implement this architecture and enjoy the improved capacity and resulting performance boost from the offload of authorization and access management functions to the infrastructure.

1. A request is received by the application delivery service.
2. The application delivery service extracts the API key from the URI and determines validity.
3. If the API key is not legitimate, a JSON-encoded response is returned.
4. If the API key is valid, the request is passed on to the appropriate web/application server for processing.

Such an approach can also be used to enable or disable functionality within an application, including live-streams.
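The four-step flow above might look like the following in outline. This is a sketch under stated assumptions, not the article's implementation: the URI layout `/api/<key>/...`, the function names, and the error body are hypothetical, and the Luhn checksum (the check-digit scheme used for credit card numbers) stands in for whatever algorithm a real deployment would choose. Luhn only catches malformed keys; a production design would more likely use a keyed signature.

```python
import json
from urllib.parse import urlparse


def luhn_valid(key: str) -> bool:
    """Return True if `key` is a digit string that passes the Luhn checksum."""
    if not (key.isascii() and key.isdigit()):
        return False
    total = 0
    for i, ch in enumerate(reversed(key)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def handle_request(uri: str):
    """Decide what to do with a request without touching the application.

    Assumed URI layout (hypothetical): /api/<key>/<resource...>
    Returns ("forward", uri) or ("respond", json_body).
    """
    parts = urlparse(uri).path.strip("/").split("/")
    key = parts[1] if len(parts) > 1 and parts[0] == "api" else ""
    if not luhn_valid(key):
        return "respond", json.dumps({"error": "invalid API key"})
    return "forward", uri
```

A valid key is passed through untouched, while an invalid one is answered directly with JSON, so the application server never spends cycles on it.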
Assume a site that serves up streaming content, but only to authorized (registered) users. When requests for that content arrive, the application delivery service can dynamically determine, using an embedded key or some portion of the URI, whether to serve up the content or not. If it deems the request invalid, it can return a JSON response that effectively “turns off” the streaming content, thereby eliminating the ability of non-registered (or non-paying) customers to access live content.

Such an approach could also be useful in the event of a service failure; if content is not available, the application delivery service can easily turn off and/or respond to the request, providing feedback to the user that is valuable in reducing their frustration with AJAX-enabled sites that too often simply “stop working” without any kind of feedback or message to the end user. The application delivery service could, of course, perform other actions based on the in/validity of the request, such as directing the request be fulfilled by a service generating older or non-dynamic streaming content, using its ability to perform application level routing. The possibilities are quite extensive and implementation depends entirely on goals and requirements to be met.

Such features become more appealing when they are, through their capabilities, able to intelligently make use of resources in various locations. Cloud-hosted services may be more or less desirable for use in an application, and thus leveraging application delivery services to either enable or reduce the traffic sent to such services may be financially and operationally beneficial.

ARCHITECTURE is KEY

The core principle to remember here is that ultimately infrastructure architecture plays (or can and should play) a vital role in designing and deploying applications today.
With the increasing interest in and use of cloud computing and APIs, it is rapidly becoming necessary to leverage resources and services external to the application as a means to rapidly deploy new functionality and support for new features. The abstraction offered by application delivery services provides an effective, cross-site and cross-application means of enabling what were once application-only services within the infrastructure. This abstraction and service-oriented approach reduces the burden on the application as well as its developers.

The application delivery service is almost always the first service in the oft-times lengthy chain of services required to respond to a client’s request. Leveraging its capabilities to inspect and manipulate as well as route and respond to those requests allows architects to formulate new strategies and ways to provide their own services, as well as leveraging existing and integrated resources for maximum efficiency, with minimal effort.

Back to Basics: Health Monitors and Load Balancing
#webperf #ado Because every connection counts

One of the truisms of architecting highly available systems is that you never, ever want to load balance a request to a system that is down. Therefore, some sort of health (status) monitoring is required. For applications, that means not just pinging the network interface or opening a TCP connection, it means querying the application and verifying that the response is valid.

This, obviously, requires the application to respond. And respond often. Best practices suggest determining availability every 5 seconds or so. That means every few seconds the load balancing service is going to open up a connection to the application and make a request. Just like a user would do.

That adds load to the application. It consumes network, transport, application and (possibly) database resources. Resources that cannot be used to service customers. While the impact on a single application may appear trivial, it's not. Remember, as load increases performance decreases. And no matter how trivial it may appear, health monitoring is adding load to what may be an already heavily loaded application.

But Lori, you may be thinking, you expound on the importance of monitoring and visibility all the time! Are you saying we shouldn't be monitoring applications? Nope, not at all. Visibility is paramount, providing the actionable data necessary to enable highly dynamic, automated operations such as elasticity. Visibility through health-monitoring is a critical means of ensuring availability at both the local and global level. What we may need to do, however, is move from active to passive monitoring.

PASSIVE MONITORING

Passive monitoring, as the modifier suggests, is not an active process. The load balancer does not open up connections nor query an application itself. Instead, it snoops on responses being returned to clients and from that infers the current status of the application.
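As a rough illustration of inferring status from responses that are already being relayed, the sketch below counts recent failures in a sliding time window instead of sending probes of its own. The class name, the thresholds, and the choice to treat HTTP 5xx codes and slow responses as failures are all assumptions made for the example, not BIG-IP behavior.

```python
from collections import deque


class PassiveMonitor:
    """Infers application health from responses already flowing to clients."""

    def __init__(self, max_failures=3, failure_interval=30.0, max_response_time=2.0):
        self.max_failures = max_failures            # failures tolerated per interval
        self.failure_interval = failure_interval    # window length in seconds
        self.max_response_time = max_response_time  # slowest acceptable response
        self._failures = deque()                    # timestamps of recent failures
        self.up = True

    def observe(self, status_code, response_time, now):
        """Record one relayed response; return True while the app is considered up."""
        if status_code >= 500 or response_time > self.max_response_time:
            self._failures.append(now)
        # Forget failures that fall outside the sliding interval.
        while self._failures and now - self._failures[0] > self.failure_interval:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self.up = False  # a real device might now switch to active probing
        return self.up
```

Calling `observe()` once per relayed response costs the application nothing extra, which is the point of the passive approach: the monitor only looks at traffic that would have existed anyway.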
For example, if a request for content results in an HTTP error message, the load balancer can determine whether or not the application is available and capable of processing subsequent requests. If the load balancer is a BIG-IP, it can mark the service as "down" and invoke an active monitor to probe the application status as well as retry the request to another available instance – ensuring end-users do not see an error.

Passive (inband) monitors are not binary. That is, they aren't simply "on" or "off" based on HTTP status codes. Such monitors can be configured to track the number of failures and evaluate failure rates against a configurable failure interval. When such thresholds are exceeded, the application can then be marked as "down".

Passive monitors aren't restricted to availability status, either. They can also monitor for performance (response time). Failure to meet response time expectations results in a failure, and the application continues to be watched for subsequent failures.

Passive monitors are, like most inline/inband technologies, transparent. They quietly monitor traffic and act upon that traffic without adding overhead to the process.

Passive monitoring gives operations the visibility necessary to enable predictable performance and to meet or exceed user expectations with respect to uptime, without negatively impacting performance or capacity of the applications it is monitoring.

F5 Friday: Load Balancing MySQL with F5 BIG-IP
Scaling MySQL just got a whole lot easier.

Load balancing MySQL – any database, really – is not a trivial task. Generally speaking one does not simply round-robin one’s way through a cluster of MySQL databases as a means to achieve scalability. It is databases, in fact, that have driven a wide variety of scalability patterns such as sharding and partitioning to achieve the ultimate goal of high-performance and scalability simultaneously.

Unfortunately, most folks don’t architect their applications with scalability in mind. A single database is all that’s necessary at first, and because of the way in which the application interacts with the database, it doesn’t make sense to code in support for multiple database instances, such as is often implemented with a MySQL master-slave cluster. That’s because the application has to actually open a connection to the database in question. If you’re only starting with one database, you really can’t code in a connection to a separate instance.

Eventually that application’s usage grows and the demands upon the database require a more scalable approach. Enter the MySQL master/slave relationship. A typical configuration is to maintain the master as the “write” database, i.e. all updates and/or inserts must use the master, while the slave instance is used as a “read only” instance. Obviously this means the application code must be changed to support this kind of functional sharding. Unless you leverage network server virtualization from a load balancing service capable of acting as a full-proxy at layer 7 (application) like BIG-IP.

This solution leverages iRules to implement database load balancing. While this specific example is designed to perform the common functional sharding pattern of read-write separation for a master-slave MySQL cluster, the flexibility of iRules is such that other architectural solutions can easily be designed using the same basic functions.
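The read-write separation pattern itself is simple enough to sketch outside of iRules. The Python below is only an illustration of the routing decision, not the MySQL Proxy iRule: the class name, the set of verbs treated as reads, and the server names in the usage example are assumptions, and a real router would also need to handle transactions and session state.

```python
import itertools

# Statements assumed safe to send to a read-only instance.
READ_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


class ReadWriteRouter:
    """Send writes to the master and spread reads across read-only instances."""

    def __init__(self, master, read_only):
        self.master = master
        # Round-robin over the read-only instances, if there are any.
        self._read_only = itertools.cycle(read_only) if read_only else None

    def route(self, sql):
        """Return the name of the instance that should run `sql`."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Locking reads (SELECT ... FOR UPDATE) must go to the master.
        locking_read = "FOR UPDATE" in " ".join(words).upper()
        if verb in READ_VERBS and not locking_read and self._read_only:
            return next(self._read_only)
        return self.master
```

For example, `ReadWriteRouter("db-master", ["db-ro-1", "db-ro-2"]).route("SELECT * FROM orders")` picks a read-only instance, while any `INSERT`, `UPDATE`, or unrecognized statement falls through to the master, which is the safe default.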
Location based sharding is another popular means of scaling databases, and using the GeoLocation capabilities of BIG-IP along with iRules to inspect and route database requests, it should be a fairly trivial architectural task to implement. The ability to further extend sharding or other distribution methodologies for scaling databases without modifying the application itself is a huge bonus for both developers and operations. By decoupling the application from the database, it provides a more flexible set of scalability domains in which technology-targeted scalability strategies can be leveraged independent of the other layers. This is an important facet of agile infrastructure architecture and should not be underestimated as a benefit of network server virtualization.

MySQL Load Balancing Resources:
MySQL Proxy iRule
MySQL Proxy iApp (deployment package for BIG-IP v11)

The Challenges of SQL Load Balancing
#infosec #iam load balancing databases is fraught with many operational and business challenges. While cloud computing has brought to the forefront of our attention the ability to scale through duplication, i.e. horizontal scaling or “scale out” strategies, this strategy tends to run into challenges the deeper into the application architecture you go. Working well at the web and application tiers, a duplicative strategy tends to fall on its face when applied to the database tier. Concerns over consistency abound, with many simply choosing to throw out the concept of consistency and adopting instead an “eventually consistent” stance in which it is assumed that data in a distributed database system will eventually become consistent and cause minimal disruption to application and business processes. Some argue that eventual consistency is not “good enough” and cite additional concerns with respect to the failure of such strategies to adequately address failures. Thus there are a number of vendors, open source groups, and pundits who spend time attempting to address both components. The result is database load balancing solutions. For the most part such solutions are effective. They leverage master-slave deployments – typically used to address failure and which can automatically replicate data between instances (with varying levels of success when distributed across the Internet) – and attempt to intelligently distribute SQL-bound queries across two or more database systems. The most successful of these architectures is the read-write separation strategy, in which all SQL transactions deemed “read-only” are routed to one database while all “write” focused transactions are distributed to another. 
Such foundational separation allows for higher-layer architectures to be implemented, such as geographic based read distribution, in which read-only transactions are further distributed by geographically dispersed database instances, all of which act ultimately as “slaves” to the single, master database which processes all write-focused transactions. This results in an eventually consistent architecture, but one which manages to mitigate the disruptive aspects of eventually consistent architectures by ensuring the most important transactions – write operations – are, in fact, consistent. Even so, there are issues, particularly with respect to security.

MEDIATION inside the APPLICATION TIERS

Generally speaking mediating solutions are a good thing – when they’re external to the application infrastructure itself, i.e. the traditional three tiers of an application. The problem with mediation inside the application tiers, particularly at the data layer, is the same for infrastructure as it is for software solutions: credential management.

See, databases maintain their own set of users, roles, and permissions. Even as applications have been able to move toward a more shared set of identity stores, databases have not. This is in part due to the nature of data security and the need for granular permission structures down to the cell, in some cases, and including transactional security that allows some to update, delete, or insert while others may be granted a different subset of permissions. But more difficult to overcome is the tight-coupling of identity to connection for databases. With web protocols like HTTP, identity is carried along at the protocol level. This means it can be transient across connections because it is often stuffed into an HTTP header via a cookie or stored server-side in a session – again, not tied to connection but to identifying information. At the database layer, identity is tightly-coupled to the connection.
The connection itself carries along the credentials with which it was opened. This gives rise to problems for mediating solutions. Not just load balancers but software solutions such as ESB (enterprise service bus) and EII (enterprise information integration) styled solutions. Any device or software which attempts to aggregate database access for any purpose eventually runs into the same problem: credential management. This is particularly challenging for load balancing when applied to databases.

LOAD BALANCING SQL

To understand the challenges with load balancing SQL you need to remember that there are essentially two models of load balancing: transport and application layer. At the transport layer, i.e. TCP, connections are only temporarily managed by the load balancing device. The initial connection is “caught” by the load balancer and a decision is made based on transport layer variables where it should be directed. Thereafter, for the most part, there is no interaction at the load balancer with the connection, other than to forward it on to the previously selected node. At the application layer the load balancing device terminates the connection and interacts with every exchange. This affords the load balancing device the opportunity to inspect the actual data or application layer protocol metadata in order to determine where the request should be sent.

Load balancing SQL at the transport layer is less problematic than at the application layer, yet it is at the application layer that the most value is derived from database load balancing implementations. That’s because it is at the application layer where distribution based on “read” or “write” operations can be made. But to accomplish this requires that the SQL be inline, that is that the SQL being executed is actually included in the code and then executed via a connection to the database. If your application uses stored procedures, then this method will not work for you.
It is important to note that many packaged enterprise applications rely upon stored procedures, and are thus not able to leverage load balancing as a scaling option. Your application, or how your organization has agreed to protect your data, will determine which of these methods is used to access your databases. The use of inline SQL affords the developer greater freedom at the cost of security, increased programming (to prevent the inherent security risks), difficulty in optimizing data and indices to adapt to changes in volume of data, and deployment burdens. However, there is lively debate on the value of both access methods and how to overcome the inherent risks. OWASP has identified injection attacks as the easiest to exploit, with the most damaging impact.

Application layer load balancing also requires that the load balancing service parse MySQL or T-SQL (Microsoft’s Transact-SQL). Databases, of course, are designed to parse these string-based commands and are optimized to do so. Load balancing services are generally not designed to parse these languages and, depending on the implementation of their underlying parsing capabilities, may actually incur significant performance penalties to do so.

Regardless of those issues, there is still an increasing number of organizations that view SQL load balancing as a means to achieve a more scalable data tier. Which brings us back to the challenge of managing credentials.

MANAGING CREDENTIALS

Many solutions attempt to address the issue of credential management by simply duplicating credentials locally; that is, they create a local identity store that can be used to authenticate requests against the database. Ostensibly the credentials match those in the database (or identity store used by the database such as can be configured for MSSQL) and are kept in sync. This obviously poses an operational challenge similar to that of any distributed system: synchronization and replication.
Such processes are not easily (if at all) automated, and rarely is the same level of security and permissions available on the local identity store as are available in the database. What you generally end up with is a very loose “allow/deny” set of permissions on the load balancing device that actually opens the door for exploitation, as well as caching of credentials that can lead to unauthorized access to the data source.

This also leads to potential security risks from attempting to apply some of the same optimization techniques to SQL connections as are offered by application delivery solutions for TCP connections. For example, TCP multiplexing (sharing connections) is a common means of reusing web and application server connections to reduce latency (by eliminating the overhead associated with opening and closing TCP connections). Similar techniques at the database layer have been used by application servers for many years; connection pooling is not uncommon and is essentially duplicated at the application delivery tier through features like SQL multiplexing. Both connection pooling and SQL multiplexing incur security risks, as shared connections require shared credentials. So either every access to the database uses the same credentials (a significant negative when considering the loss of an audit trail) or we return to managing duplicate sets of credentials – one set at the application delivery tier and another at the database, which as noted earlier incurs additional management and security risks.

YOU CAN’T WIN FOR LOSING

Ultimately the decision to load balance SQL must be a combination of business and operational requirements. Many organizations successfully leverage load balancing of SQL as a means to achieve very high scale.
Generally speaking the resulting solutions – such as those often touted by eBay – are based on sound architectural principles such as sharding and are designed as a strategic solution, not a tactical response to operational failures, and they rarely involve inspection of inline SQL commands. Rather they are based on the ability to discern which database should be accessed given the function being invoked or type of data being accessed and then use a traditional database connection to connect to the appropriate database. This does not preclude the use of application delivery solutions as part of such an architecture, but rather indicates a need to collaborate across the various application delivery and infrastructure tiers to determine a strategy most likely to maintain high-availability, scalability, and security across the entire architecture.

Load balancing SQL can be an effective means of addressing database scalability, but it should be approached with an eye toward its potential impact on security and operational management.

Reactive, Proactive, Predictive: SDN Models
#SDN #openflow A session at #interop sheds some light on SDN operational models

One of the downsides of speaking at conferences is that your session inevitably conflicts with another session that you'd really like to attend. Interop NY was no exception, except I was lucky enough to catch the tail end of a session I was interested in after finishing my own. I jumped into "OpenFlow and Software Defined Networks: What Are They and Why Do You Care?" just as discussion about an SDN implementation at CERN labs was going on, and was quite happy to sit through the presentation.

CERN labs has implemented an SDN, focusing on the use of OpenFlow to manage the network. They partner with HP for the control plane, and use a mix of OpenFlow-enabled switches for their very large switching fabric. All that's interesting, but what was really interesting (to me anyway) was the answer to my question with respect to the rate of change and how it's handled. We know, after all, that there are currently limitations on the number of inserts per second into OpenFlow-enabled switches and CERN's environment is generally considered pretty volatile. The response became a discussion of SDN models for handling change. The speaker presented three approaches that essentially describe SDN models for OpenFlow-based networks:

Reactive

Reactive models are those we generally associate with SDN and OpenFlow. Reactive models are constantly adjusting and are in flux as changes are made immediately as a reaction to current network conditions. This is the base volatility management model in which there is a high rate of change in the location of end-points (usually virtual machines) and OpenFlow is used to continually update the location and path through the network to each end-point. The speaker noted that this model is not scalable for any organization and certainly not CERN.
Proactive Proactive models anticipate issues in the network and attempt to address them before they become a real problem (which would require reaction). Proactive models can be based on details such as increasing utilization in specific parts of the network, indicating potential forthcoming bottlenecks. Making changes to the routing of data through the network before utilization becomes too high can mitigate potential performance problems. CERN takes advantage of sFlow and Netflow to gather this data. Predictive A predictive approach uses historical data regarding the performance of the network to adjust routes and flows periodically. This approach is less disruptive as it occurs with less frequency than a reactive model but still allows for trends in flow and data volume to inform appropriate routes. CERN uses a combination of proactive and predictive methods for managing its network and indicated satisfaction with current outcomes. I walked out with two takeaways. First was validation that a reactive, real-time network operational model based on OpenFlow was inadequate for managing high rates of change. Second was that using OpenFlow as more of an operational management toolset than an automated, real-time self-routing network system is certainly a realistic option to address the operational complexity introduced by virtualization, cloud and even very large traditional networks.
The Future of Cloud: Infrastructure as a Platform
SDN, OpenFlow, and Infrastructure 2.0
Applying ‘Centralized Control, Decentralized Execution’ to Network Architecture
Integration Topologies and SDN
SDN is Network Control. ADN is Application Control.
The Next IT Killer Is… Not SDN
How SDN Is Defined Within ADN Architectures
The Limits of Cloud: Gratuitous ARP and Failover
#Cloud is great at many things. At other things, not so much. Understanding the limitations of cloud will better enable a successful migration strategy. One of the truisms of technology is that it takes a few years of adoption before folks really start figuring out what it excels at – and conversely what it doesn't. That's generally because early adoption is focused on lab-style experimentation that rarely extends beyond basic needs. It's when adoption reaches critical mass and folks start trying to use the technology to implement more advanced architectures that the "gotchas" start to be discovered. Cloud is no exception. A few of the things we've learned over the past years of adoption are that cloud is always on, it's simple to manage, and it makes applications and infrastructure services easy to scale. Some of the things we're learning now are that cloud isn't so great at supporting application mobility, monitoring of deployed services and at providing advanced networking capabilities. The reason that last part is so important is that a variety of enterprise-class capabilities we've come to rely upon are ultimately enabled by some of the advanced networking techniques cloud simply does not support. Take gratuitous ARP, for example. Most cloud providers do not allow or support this feature which ultimately means an inability to take advantage of higher-level functions traditionally taken for granted in the enterprise – like failover. GRATUITOUS ARP and ITS IMPLICATIONS For those unfamiliar with gratuitous ARP let's get you familiar with it quickly. A gratuitous ARP is an unsolicited ARP request made by a network element (host, switch, device, etc… ) to resolve its own IP address. The source and destination IP address are identical to the source IP address assigned to the network element. The destination MAC is a broadcast address. Gratuitous ARP is used for a variety of reasons. 
For example, if there is an ARP reply to the request, it means there exists an IP conflict. When a system first boots up, it will often send a gratuitous ARP to indicate it is "up" and available. And finally, it is used as the basis for load balancing failover. To ensure availability of load balancing services, two load balancers will share an IP address (often referred to as a floating IP). Upstream devices recognize the "primary" device by means of a simple ARP entry associating the floating IP with the active device. If the active device fails, the secondary immediately notices (due to heartbeat monitoring between the two) and will send out a gratuitous ARP indicating it is now associated with the IP address and won't the rest of the network please send subsequent traffic to it rather than the failed primary. VRRP and HSRP may also use gratuitous ARP to implement router failover. Most cloud environments do not allow broadcast traffic of this nature. After all, it's practically guaranteed that you are sharing a network segment with other tenants, and thus broadcasting traffic could certainly disrupt other tenants' traffic. Additionally, as security minded folks will be eager to remind us, it is fairly well-established that the default for accepting gratuitous ARPs on the network should be "don't do it". The astute observer will realize the reason for this: there is no security, no ability to verify, no authentication, nothing. A network element configured to accept gratuitous ARPs does so at the risk of being tricked into trusting, explicitly, every gratuitous ARP – even those that may be attempting to fool the network into believing it is a device it is not supposed to be. That, in essence, is ARP poisoning, and it's one of the security risks associated with the use of gratuitous ARP. 
Granted, someone needs to be physically on the network to pull this off, but in a cloud environment that's not nearly as difficult as it might be on a locked down corporate network. Gratuitous ARP can further be used to execute denial of service, man in the middle and MAC flooding attacks. None of which have particularly pleasant outcomes, especially in a cloud environment where such attacks would be against shared infrastructure, potentially impacting many tenants. Thus cloud providers are understandably leery about allowing network elements to willy-nilly announce their own IP addresses. That said, most enterprise-class network elements have implemented protections against these attacks precisely because of the reliance on gratuitous ARP for various infrastructure services. Most of these protections use a technique that will tentatively accept a gratuitous ARP, but not enter it in the ARP cache unless it has a valid IP-to-MAC mapping, as defined by the device configuration. Validation can take the form of matching against DHCP-assigned addresses or existence in a trusted database. Obviously these techniques would put an undue burden on a cloud provider's network given that any IP address on a network segment might be assigned to a very large set of MAC addresses. Simply put, gratuitous ARP is not cloud-friendly, and thus you will be hard pressed to find a cloud provider that supports it. What does that mean? That means, ultimately, that failover mechanisms in the cloud cannot be based on traditional techniques unless a means to replicate gratuitous ARP functionality without its negative implications can be designed. Which means, unfortunately, that traditional failover architectures – even using enterprise-class load balancers in cloud environments – cannot really be implemented today. 
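For readers who want to see what the frame described earlier actually looks like, here is a minimal Python sketch. It builds a gratuitous ARP request (sender and target IP identical, broadcast destination MAC) and applies the kind of "only cache it if the IP-to-MAC mapping is already trusted" check mentioned above. The function names, the trusted table and the example addresses are assumptions made for illustration; this is not how any particular vendor implements the protection, and the code only builds and inspects bytes, it does not send anything.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request."""
    sender_mac = mac_to_bytes(mac)
    sender_ip = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC as source, ARP ethertype.
    ethernet = struct.pack("!6s6sH", BROADCAST_MAC, sender_mac, ETHERTYPE_ARP)
    # ARP body: Ethernet/IPv4, opcode 1 (request), target IP == sender IP.
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1, 0x0800, 6, 4, 1,
        sender_mac, sender_ip,
        b"\x00" * 6, sender_ip,
    )
    return ethernet + arp


def is_gratuitous(frame: bytes) -> bool:
    """True if the frame is ARP and its sender IP equals its target IP."""
    return frame[12:14] == b"\x08\x06" and frame[28:32] == frame[38:42]


def accept_into_cache(frame: bytes, trusted: dict) -> bool:
    """Accept only if the announced IP-to-MAC pair matches a trusted table
    (hypothetical stand-in for DHCP bindings or a trusted database)."""
    sender_mac = frame[22:28]
    sender_ip = socket.inet_ntoa(frame[28:32])
    expected = trusted.get(sender_ip)
    return (
        is_gratuitous(frame)
        and expected is not None
        and mac_to_bytes(expected) == sender_mac
    )
```

The trusted table is the expensive part: in a multi-tenant segment it would have to track every legitimate IP-to-MAC pairing, which is exactly the burden described above.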
What that means for IT preparing to migrate business critical applications and services to cloud environments is a careful review of their requirements and of the cloud environment's capabilities to determine whether availability and uptime goals can – or cannot – be met using a combination of cloud and traditional load balancing services.
Quantifying Reputation Loss From a Breach
#infosec #security Putting a value on reputation is not as hard as you might think… It’s really easy to quantify some of the costs associated with a security breach. Number of customers impacted times the cost of a first class stamp plus the cost of a sheet of paper plus the cost of ink divided by … you get the picture. Some of the costs are easier than others to calculate. Some of them are not, and others appear downright impossible. One of the “costs” often cited but rarely quantified is the cost to an organization’s reputation. How does one calculate that? Well, if folks sat down with the business people more often (the ones that live on the other side of the Myers-Briggs Mountain) we’d find it’s not really as difficult to calculate as one might think. While IT folks analyze flows and packet traces, business folks analyze market trends and impacts – such as those arising from poor customer service. And if a breach of security isn’t interpreted by the general populace as “poor customer service” then I’m not sure what is. While traditionally customer service is how one treats the customer, increasingly that’s expanding to include how one treats the customer’s data. And that means security. This question “how much does it really cost” is one Jeremiah Grossman asks fairly directly in a recent blog, “Indirect Hard Losses”: As stated by InformationWeek regarding a Ponemon Institute study on the Cost of a Data Breach, “Customers, it seems, lose faith in organizations that can't keep data safe and take their business elsewhere.” The next logical question is how much? Jeremiah goes on to focus on revenue lost from web transactions after a breach and that’s certainly part of the calculation, but what about those losses that might have been but now will never be? How can we measure not only the loss of revenue (meaning a decrease in first-order customers) but the potential loss of revenue? 
That’s harder, but just as important as it more accurately represents the “reputation loss” often mentioned in passing but never assigned a concrete value (at least not publicly; some industries discreetly share such data with trusted members of the same industry, but seeing these numbers in the wild? Good luck!). HERE COMES the ALMOST SCIENCE 20% of the businesses that lost data lost customers as a direct result. The impacts were most severe for companies with more than 100 employees. Almost half of them lost sales. Rubicon Survey One of the first things we have to calculate is influence, as that directly impacts reputation. It is the ability of even a single customer to influence a given number of others (negatively or positively) that makes up reputation. It’s word of mouth, what people say about you, after all. If we turn to studies that focus more on marketing and sales and businessy things, we can find a lot of this data. It’s a well-studied area. One study [1] indicates that a single dissatisfied customer will tell approximately 8-16 people. Each of those people has a circle of influence of about 250, with 25 of those being within an organization's primary target audience. Of all those told, 2% (1 in 50) will defect or avoid an organization upon hearing of the victim's dissatisfaction. That works out to 8 * 250 * 2% = 40 at the low end and 16 * 250 * 2% = 80 at the high end, so for every angry customer, the reputation impact is a loss of anywhere from 40-80 customers, existing and future. So much for thinking 100 records stolen in a breach is small potatoes, eh? The loss of thousands of existing and potential customers is nothing to sneeze at. Now, here’s where it gets a little harder, because you’re going to have to talk to the businessy folks to get some values to attach to those losses. See, there are two numbers you still need: customer lifetime value (CLV) and the cost to replace a customer (which is higher than the cost to acquire a customer, but don’t ask me why, I’m not a businessy folk). Customer values are highly dependent upon industry. 
For example, based on 2010 FDIC data, the industry average annual customer value for a banking customer is $209 [2]. Facebook’s annual revenue per user (ARPU) is estimated at $2.00 [3]. Estimates claim Google makes $9.85 annually off each Android user [4]. And Zynga’s ARPU is estimated at $3.96 (based on a reported $0.33 monthly per user revenue) [5]. This is why you actually have to talk to the businessy guys, they know what these values are and you’ll need them to plug in to the influence calculation to come up with an at-least-it’s-closer-than-guessing value. You also need to ask what the average customer lifetime is, so you can calculate the loss from dissatisfied and defecting customers. Then you just need to start plugging in the numbers. Remember, too, that it’s a model; an estimate. It’s not a perfect valuation system, but it should give you some kind of idea of what the reputational impact from a breach would be, which is more than most folks have today. Even if you can’t obtain the cost to replace value, try the model without it. Try a small breach, just for fun, say of 100 records. Let’s use $4.00 as an annual customer value and a lifetime of ten years as an example, along with the low-end influence figure of 40 lost customers per affected customer.
Affected Customer Loss: 100 * ($4 * 10) = $4,000
Influenced Customer Loss: (100 * 40) * ($4 * 10) = 4,000 * $40 = $160,000
Total Reputation Cost: $164,000
Adding in the cost to replace can only make this larger and serves very little purpose except to show that even what many consider a relatively small breach (in terms of records lost) can be costly. WHY is THIS VALUABLE? The reason this is valuable is two-fold. First, it serves as the basis for a very logical and highly motivating business case for security solutions designed to prevent breaches. The problem with much of security is it’s intangible and incalculable. It is harder to put monetary value to risk than it is to put monetary value on solutions. 
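If you want to play with the numbers yourself, the 100-record example above can be sketched in a few lines of Python. This is only the article's back-of-the-envelope model written down as code; the function and parameter names are made up for illustration, the defaults (8 people told, a circle of 250, 1 in 50 defecting) are the low-end figures from the study cited earlier, and the cost to replace a customer is deliberately left out.

```python
def reputation_loss(records, annual_value, lifetime_years,
                    people_told=8, circle_of_influence=250, defect_one_in=50):
    """Estimate reputation cost of a breach using the influence model.

    Returns (affected_loss, influenced_loss, total) in the same currency
    unit as annual_value. All inputs are estimates, so is the result.
    """
    # Lifetime value of one customer, e.g. $4 per year * 10 years = $40.
    lifetime_value = annual_value * lifetime_years
    # Customers lost through word of mouth per affected customer:
    # 8 * 250 = 2,000 people reached, 1 in 50 of whom defect = 40.
    lost_per_customer = people_told * circle_of_influence // defect_one_in
    affected_loss = records * lifetime_value
    influenced_loss = records * lost_per_customer * lifetime_value
    return affected_loss, influenced_loss, affected_loss + influenced_loss


if __name__ == "__main__":
    affected, influenced, total = reputation_loss(100, 4, 10)
    print(f"Affected: ${affected:,}  Influenced: ${influenced:,}  Total: ${total:,}")
```

Swapping in `people_told=16` gives the high end of the 40-80 range, which is a quick way to show the business a spread rather than a single number.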
Thus, performing a cost-benefit analysis that is based in part on “reputation loss” is difficult for security professionals and IT in general. The business needs to be able to justify investments, and to do that they need hard numbers that they can balance against. It is the security professionals who so often are called upon to explain the “risk” of a breach and loss of data to the business. Providing them tangible data based on accepted business metrics and behavior offers them a more concrete view of the costs – in money – of a breach. That gives IT the leverage, the justification, for investing in solutions such as web application firewalls and vulnerability scanning services that are designed to detect and ultimately prevent such breaches from occurring. It gives infosec some firm ground upon which to stand and talk in terms the business understands: dollar signs.
[1] PUTTING A PRICE TAG ON A LOST CUSTOMER
[2] Free Checking and Debit Incentives Post-Durbin
[3] Facebook’s Annual Revenue Per User
[4] Each Android User Will Make Google $9.85 per Year in 2012
[5] Zynga Doubled ARPU From Last Year Even as Facebook Platform Changes Slowed Growth
Cloud bursting, the hybrid cloud, and why cloud-agnostic load balancers matter
Cloud Bursting and the Hybrid Cloud
When researching cloud bursting, there are many directions Google may take you. Perhaps you come across services for airplanes that attempt to turn cloudy wedding days into memorable events. Perhaps you'd rather opt for a service that helps your IT organization avoid rainy days. Enter cloud bursting ... yes, the one involving computers and networks instead of airplanes. Cloud bursting is a term that has been around in the tech realm for quite a few years. It, in essence, is the ability to allocate resources across various public and private clouds as an organization's needs change. These needs could be economic drivers such as Cloud 2 having lower cost than Cloud 1, or perhaps capacity drivers where additional resources are needed during business hours to handle traffic. For intelligent applications, other interesting things are possible with cloud bursting where, for example, demand in a geographical region suddenly needs capacity that is not local to the primary, private cloud. Here, one can spin up resources to locally serve the demand and provide a better user experience. Nathan Pearce summarizes some of the aspects of cloud bursting in this minute-long video, which is a great resource to remind oneself of some of the nuances of this architecture. While Cloud Bursting is a term that is generally accepted by the industry as an "on-demand capacity burst," Lori MacVittie points out that this architectural solution eventually leads to a Hybrid Cloud where multiple compute centers are employed to serve demand across both private and public resources, or clouds, all the time. 
The primary driver for this: practically speaking, there are limitations around how fast data that is critical to one's application (think databases, for example) can be replicated across the internet to different data centers. Thus, the promises of "on-demand" cloud bursting scenarios may be short lived, eventually leaning in favor of multiple "always-on compute capacity centers" as loads increase for a given application. In any case, it is important to understand that multiple locations, across multiple clouds, will ultimately be serving application content in the not-too-distant future. An example hybrid cloud architecture where services are deployed across multiple clouds. The "application stack" remains the same, using LineRate in each cloud to balance the local application, while a BIG-IP Local Traffic Manager balances application requests across all of the clouds.
Advantages of cloud-agnostic Load Balancing
As one might conclude from the Cloud Bursting and Hybrid Cloud discussion above, having multiple clouds running an application creates a need for user requests to be distributed among the resources and for automated systems to be able to control application access and flow. In order to provide the best control over how one's application behaves, it is optimal to use a load balancer to serve requests. No DNS or network routing changes need to be made and clients continue using the application as they always did as resources come online or go offline; many times, too, these load balancers offer advanced functionality alongside the load balancing service that provides additional value to the application. Having a load balancer that operates the same way no matter where it is deployed becomes important when resources are distributed among many locations. Understanding expectations around configuration, management, reporting, and behavior of a system limits issues for application deployments and discrepancies between how one platform behaves versus another. 
With a load balancer like F5's LineRate product line, anyone can programmatically manage the servers providing an application to users. Leveraging this programmatic control, application providers have an easy way to spin up and down capacity in any arbitrary cloud, retain a familiar yet powerful feature-set for their load balancer, ultimately redistribute resources for an application, and provide a seamless experience back to the user. No matter where the load balancer deployment is, LineRate can work hand-in-hand with any web service provider, whether considered a cloud or not. Your data, and perhaps more importantly cost-centers, are no longer locked down to one vendor or one location. With the right application logic paired with LineRate Precision's scripting engine, an application can dynamically react to take advantage of market pricing or general capacity needs. Consider the following scenarios where cloud-agnostic load balancers have advantages over vendor-specific ones:
Economic Drivers
Time-dependent instance pricing: Spot instances with much lower cost becoming available at night. Example: my startup's billing system can take advantage of better pricing per unit of work in the public cloud at night versus the private datacenter.
Multiple vendor instance pricing: Cloud 2 just dropped their high-memory instance pricing lower than Cloud 1's. Example: Useful for your workload during normal business hours; my application's primary workload is migrated to Cloud 2 with a simple config change.
Competition: Having multiple cloud deployments simultaneously increases competition, and thus your organization's negotiated pricing contracts become more attractive over time.
Computational Drivers
Traffic Spikes: Someone in marketing just tweeted about our new product. All of a sudden, the web servers that traditionally handled all the loads thrown at them just fine are getting slashdotted by people all around North America placing orders. 
Instead of having humans react to the load and spin up new instances to handle the load - or even worse: doing nothing - your LineRate system and application worked hand-in-hand to spin up a few instances in Microsoft Azure's Texas location and a few more in Amazon's Virginia region. This helps you distribute requests from geographically diverse locations: your existing datacenter in Oregon, the central US Microsoft Cloud, and the east-coast based Amazon Cloud. Orders continue to pour in without any system downtime, or worse: lost customers.
Compute Orchestration: A mission-critical application in your organization's private cloud unexpectedly needs extra compute power, but needs to stay internal for compliance reasons. Fortunately, your application can spin up public cloud instances and migrate traffic out of the private datacenter without affecting any users or data integrity. Your LineRate instance reaches out to Amazon to boot instances and migrate important data. More importantly, application developers and system administrators don't even realize the application has migrated since everything behaves exactly the same in the cloud location. Once the cloud systems boot, alerts are made to F5's LTM and LineRate instances that migrate traffic to the new servers, allowing the mission-critical app to compute away. You just saved the day!
A cloud-agnostic load balancing solution for connecting users with an organization's applications not only provides a unified user experience, but also provides a powerful, unified way of controlling the application for its administrators. If all of a sudden an application needs to be moved from, say, a private datacenter with a 100 Mbps connection to a public cloud with a GigE connection, this can easily be done without having to relearn a new load balancing solution. 
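To make the economic and traffic-spike scenarios a little more concrete, here is a rough Python sketch of the decision an orchestration layer might make: work out how much demand overflows the private datacenter, then pick whichever cloud covers that overflow for the lowest hourly cost. Everything here is hypothetical, including the function name, the cloud names, the prices (in cents per instance-hour) and the per-instance capacities. It is not LineRate's scripting API, just the decision logic on its own.

```python
import math


def plan_burst(demand_rps, private_capacity_rps, clouds):
    """Decide where to burst and how many instances to boot.

    clouds maps a cloud name to (price_cents_per_hour, rps_per_instance).
    Returns (cloud_name, instance_count, hourly_cost_cents), or None when
    the private datacenter can absorb the demand on its own.
    """
    overflow = demand_rps - private_capacity_rps
    if overflow <= 0:
        return None

    def hourly_cost(item):
        _, (price, rps_per_instance) = item
        return math.ceil(overflow / rps_per_instance) * price

    # Choose the cloud with the lowest total cost to cover the overflow.
    name, (price, rps_per_instance) = min(clouds.items(), key=hourly_cost)
    count = math.ceil(overflow / rps_per_instance)
    return name, count, count * price


if __name__ == "__main__":
    clouds = {"cloud1": (20, 100), "cloud2": (15, 100)}
    print(plan_burst(1500, 1000, clouds))
```

Because the load balancer behaves the same in every location, the output of a decision like this can be acted on without caring which provider won.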
F5's LineRate product is available for bare-metal deployments on x86 hardware, for virtual machine deployments, and, more recently, as an Amazon Machine Image (AMI). All of these deployment types leverage the same familiar, powerful tools that LineRate offers: lightweight and scalable load balancing, modern management through its intuitive GUI or the industry-standard CLI, and automated control via its comprehensive REST API. LineRate Point Load Balancer provides hardened, enterprise-grade load balancing and availability services whereas LineRate Precision Load Balancer adds powerful Node.js programmability, enabling developers and DevOps teams to leverage thousands of Node.js modules to easily create custom controls for application network traffic. Learn about some of LineRate's advanced scripting and functionality here, or try it out for free to see if LineRate is the right cloud-agnostic load balancing solution for your organization.
Architecting Scalable Infrastructures: CPS versus DPS
#webperf As we continue to find new ways to make connections more efficient, capacity planning must look to other metrics to ensure scalability without compromising performance. Infrastructure metrics have always been focused on speeds and feeds. Throughput, packets per second, connections per second, etc… These metrics have been used to evaluate and compare network infrastructure for years, ultimately being used as a critical component in data center design. This makes sense. After all, it's not rocket science to figure out that a firewall capable of handling 10,000 connections per second (CPS) will overwhelm a next hop (load balancer, A/V scanner, etc… ) device only capable of 5,000 CPS. Or will it? The problem with old skool performance metrics is they focus on ingress, not egress capacity. With SDN pushing a new focus on both northbound and southbound capabilities, it makes sense to revisit the metrics upon which we evaluate infrastructure and design data centers. CONNECTIONS versus DECISIONS As we've progressed from focusing on packets to sessions, from IP addresses to users, from servers to applications, we've necessarily seen an evolution in the intelligence of network components. It's not just application delivery that's gotten smarter, it's everything. Security, access control, bandwidth management, even routing (think NAC), has become much more intelligent. But that intelligence comes at a price: processing. That processing turns into latency as each device takes a certain amount of time to inspect, evaluate and ultimately decide what to do with the data. And therein lies the key to our conundrum: it makes a decision. That decision might be routing based or security based or even logging based. What the decision is is not as important as the fact that it must be made. SDN necessarily brings this key differentiator between legacy and next-generation infrastructure to the fore, as it's not just software-defined but software-deciding networking. 
When a switch doesn't know what to do with a packet in SDN it asks the controller, which evaluates and makes a decision. The capacity of SDN – and of any modern infrastructure – is at least partially determined by how fast it can make decisions. Examples of decisions:
URI-based routing (load balancers, application delivery controllers)
Virus-scanning
SPAM scanning
Traffic anomaly scanning (IPS/IDS)
SQLi / XSS inspection (web application firewalls)
SYN flood protection (firewalls)
BYOD policy enforcement (access control systems)
Content scrubbing (web application firewalls)
The DPS capacity of a system is not the same as its connection capacity, which is merely the measure of how many new connections a second can be established (and in many cases how many connections can be simultaneously sustained). Such a measure is merely determining how optimized the networking stack of any given solution might be, as connections – whether TCP or UDP or SMTP – are protocol oriented and it is the networking stack that determines how well connections are managed. The CPS rate of any given device tells us nothing about how well it will actually perform its appointed tasks. That's what the Decisions Per Second (DPS) metric tells us. CONSIDERING BOTH CPS and DPS Reality is that most systems will have a higher CPS than DPS. That's not necessarily bad, as evaluating data as it flows through a device requires processing, and processing necessarily takes time. Using both CPS and DPS merely recognizes this truth and forces it to the fore, where it can be used to better design the network. A combined metric helps design the network by offering insight into the real capacity of a given device, rather than a marketing capacity. When we look only at CPS, for example, we might feel perfectly comfortable with a topological design with a flow of similar CPS capacities. 
But what we really want is to make sure that DPS -> CPS (and vice-versa) capabilities are matched up correctly, lest we introduce more latency than is necessary into a given flow. What we don't want is to end up with a device with a high DPS rate feeding into a device with a lower CPS rate. We also don't want to design a flow in which DPS rates successively decline. Doing so means we're adding more and more latency into the equation. The DPS rate is a much better indicator of capacity than CPS for designing high-performance networks because it is a realistic measure of performance, and yet a high DPS coupled with a low CPS would be disastrous. Luckily, it is almost always the case that a mismatch in CPS and DPS will favor CPS, with DPS being the lower of the two metrics in almost all cases. What we want to see is as close a CPS:DPS ratio as possible. The ideal is 1:1, of course, but given the nature of inspecting data it is unrealistic to expect such a tight ratio. Still, if the ratio becomes too high, it indicates a potential bottleneck in the network that must be addressed. For example, assume an extreme case of a CPS:DPS of 2:1. The device can establish 10,000 CPS, but only process at a rate of 5,000 DPS, leading to increasing latency or other undesirable performance issues as connections queue up waiting to be processed. Obviously there's more at play than just new CPS and DPS (concurrent connection capability is also a factor) but the new CPS and DPS relationship is a good general indicator of potential issues. Knowing the DPS of a device enables architects to properly scale out the infrastructure to remediate potential bottlenecks. This is particularly true when TCP multiplexing is in play, because it necessarily reduces CPS to the target systems but in no way impacts the DPS. 
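The 2:1 example above is easy to turn into a quick planning aid. The Python sketch below computes the CPS:DPS ratio, estimates how many connections would be waiting on a decision after a period of sustained load, and walks a chain of devices to find the first one that cannot keep up. It is a deliberately naive model: it assumes one decision per new connection and a steady offered load, and it ignores concurrent connection limits. The function names and device figures are invented for illustration.

```python
def cps_dps_ratio(cps, dps):
    """Ratio of connection capacity to decision capacity (1.0 is ideal)."""
    return cps / dps


def backlog_after(cps, dps, seconds):
    """Connections left waiting for a decision after `seconds` at full CPS."""
    return max(0, cps - dps) * seconds


def first_bottleneck(chain, offered_cps):
    """Return the name of the first device that cannot keep up, else None.

    chain is an ordered list of (name, cps, dps) tuples describing the
    devices a new connection passes through.
    """
    for name, cps, dps in chain:
        # A device is limited by whichever rating is lower.
        if offered_cps > min(cps, dps):
            return name
    return None


if __name__ == "__main__":
    print(cps_dps_ratio(10_000, 5_000))       # the extreme 2:1 case
    print(backlog_after(10_000, 5_000, 10))   # queue growth over 10 seconds
    chain = [("firewall", 10_000, 10_000), ("waf", 10_000, 5_000)]
    print(first_bottleneck(chain, 8_000))
```

Testing each device for its real DPS figure is what makes a model like this useful; with only datasheet CPS numbers, the chain above would look perfectly balanced.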
On the ingress, too, are emerging protocols like SPDY that make more efficient use of TCP connections, making CPS an unreliable measure of capacity, especially if DPS is significantly lower than the CPS rating of the system. Relying upon CPS alone – particularly when using TCP connection management technologies – as a means to achieve scalability can negatively impact performance. Testing systems to understand their DPS rate is paramount to designing a scalable infrastructure with consistent performance.
The Need for (HTML5) Speed
SPDY versus HTML5 WebSockets
Y U No Support SPDY Yet?
Curing the Cloud Performance Arrhythmia
F5 Friday: Performance, Throughput and DPS
Data Center Feng Shui: Architecting for Predictable Performance
On Cloud, Integration and Performance