The Next Cloud Battleground: PaaS
#cloud #PaaS #devops Private will win this one because it already exists in theory

Back in the day - when the Internets were exploding and I was still coding - I worked in enterprise architecture. Enterprise architecture, for the record, is generally not the same as application development. When an organization grows beyond a certain point, it becomes necessary to start designing a common framework upon which applications can be rapidly developed and deployed. Architects design and implement this framework, and application developers then code their applications for deployment on that architecture.

If that sounds a lot like PaaS, it should, because deep down, it is. The difference with PaaS is its focus on self-service and operationalization of the platform through automation and orchestration. Traditional enterprise architectures scaled through traditional mechanisms, while PaaS enables a far more fluid and elastic model for scalability and a more service-oriented, API-driven method of management.

A 2012 Engine Yard survey found that it is the operational benefits that are driving interest in PaaS. The "cost-savings" argument typically associated with cloud solutions? A distant third in benefits attributed to this "new" model.

Interestingly, folks seem positively enamored of public models of cloud computing, including PaaS, and are ignoring the ginormous potential within the data center, inside the enterprise walls. It's far less of a leap to get enterprise architects and developers migrating to a PaaS model in the enterprise than it is to get server and network administrators and operators to move to a service-based model for infrastructure. That's because the architects and developers are familiar with the paradigm; they've been "doing it" already, and all that's really left is the operationalization of the underlying infrastructure upon which their architectural frameworks (and thus applications) have been deployed.

At the end of the day (or the end of the hype cycle, as it were), PaaS is not all that different from what enterprise architects have been building out for years. What they need now is operationalization of the platforms to enable the scalability and reliability of the application infrastructure upon which they've built their frameworks.
Programmability in the Network: Your Errors, Do not Show Them to Me

#devops Errors happen, but your users should never see them. Ever.

Every once in a while things happen, like errors. They are as inevitable as winter in Wisconsin, rain in Seattle, and that today someone will post a picture of a cat that shows up on your Facebook news feed. Admit it, you looked, didn't you? The inevitability of 404 errors launched an entire "best practice" of web design to include a fun or amusing error page to present to users. Because looking at a standard 404 error page is really pretty ... boring.

We should, of course, be building systems to fail - or more precisely, to handle failure gracefully. But at some point, the system is going to fail so spectacularly that it's going to cascade back toward the user. At some point, the application can't address an error because, well, it is the cause of the error. At that point, it becomes the responsibility of the network - of the infrastructure - to handle the error gracefully. That means without splashing a really ugly, text-based 503 error on the screen that's going to confuse 99% of a site or application's users.

The ability to "catch" errors on the egress (outbound) data path is not new, nor is the ability to programmatically deal with that error in order to either (a) reactively attempt to redress the situation or (b) present the user with something that makes sense (translated: something that does not include the raw HTTP error or error codes). Even something as simple as providing as much information as possible about the failed transaction can be valuable, but requires the ability to recognize and subsequently act upon the error. That means visibility and programmability somewhere in the network; a sketch of the pattern follows below.

This is where devops lives: in the network, but at the application layers. This is where the value of devops is illustrated outside of monthly reports and charts indicating number of releases and length of time to deploy. It is in the network at the application layer where devops is able to bridge the gap between operations and applications and provide the insight, information, and services necessary to deliver applications smoothly. That means, in part, addressing errors that inevitably occur in a meaningful way. Logs, notifications, and presenting consumable information about the state of the application to the user are all part and parcel of dealing with failure. There is no reason a user should ever be presented with raw error messages emanating from a misbehaving application given the breadth and depth of programmable services available today - unless, of course, a solution which does not provide such capabilities has been made the foundation for an application's delivery infrastructure.
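To make that concrete, here is a minimal sketch of the pattern in plain Python/WSGI - not any vendor's API; the page markup and logger name are invented for illustration. A middleware on the egress path inspects the status code, logs the raw failure for operations, and substitutes a consumable page for the user:

```python
# A minimal sketch of egress error handling, assuming a WSGI stack.
# Not a product implementation -- the point is only where the duty lives.
import logging

log = logging.getLogger("egress")

FRIENDLY_BODY = (b"<html><body><h1>Sorry, something went wrong.</h1>"
                 b"<p>We've been notified and are looking into it.</p></body></html>")

class ErrorScrubber:
    """WSGI middleware that hides raw 5xx responses from users."""
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Defer the real start_response until the status is inspected.
            captured["status"], captured["headers"] = status, headers
            return lambda body: None  # sketch ignores the legacy write() path

        body = b"".join(self.app(environ, capture))
        status = captured["status"]
        if status.startswith("5"):
            # Log everything ops needs; show the user nothing raw.
            log.error("upstream failure %s for %s", status, environ.get("PATH_INFO"))
            start_response("503 Service Unavailable",
                           [("Content-Type", "text/html"),
                            ("Content-Length", str(len(FRIENDLY_BODY)))])
            return [FRIENDLY_BODY]
        start_response(status, captured["headers"])
        return [body]
```

In production this duty typically lives in the proxy or application delivery tier rather than inside the application process, which is precisely the point: the infrastructure catches what the misbehaving application cannot.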
Enterprise PaaS is about Operations

#PaaS #devops The notion that PaaS exists solely "in the cloud" as a discrete environment of developer services is hampering the maturation of enterprise PaaS

The three most common answers to "give me an example of PaaS" are: Force.com, Azure, Google. I didn't even need to do an unscientific Internet survey to nail that one down. These are certainly fine examples of PaaS, but they are not necessarily examples of enterprise PaaS solutions. While off-premise PaaS offerings do address many of the same challenges being faced by enterprise operations today, they do so in a way that makes integration and control - not to mention the measurement and monitoring required by developers - nearly impossible.

"Our core competency is application development; we are not a technical operations or cloud operations team," Fischer said.
-- PaaS technology saving time, money, changing application development

Indeed, an Engine Yard "State of PaaS" survey in 2012 showed that respondents see the biggest benefits of PaaS as being operational in nature. What the Engine Yard survey does not provide insight into is whether this explosive growth in PaaS will be on- or off-premise. While the benefits remain the same (with the exception, perhaps, of capital cost reduction), the difference between the two models is significant. On the one hand, an off-premise PaaS solution means no worries about anything infrastructure for anyone within the organization. On the other hand, an off-premise PaaS solution comes with the same baggage as off-premise IaaS offerings: a lack of control and visibility into the infrastructure - visibility and control that are often considered critical to maintaining performance of applications and applying appropriate security and access control policies.

On-premise PaaS, of course, comes with a price tag and a longer time to implement, both of which may be show-stoppers for those organizations that cannot risk capital or time. But for more forward-looking organizations capable of long-term investment, on-premise PaaS will ultimately offer both the reduction in costs (over time) as well as the highly desirable operational benefits without compromising on visibility or control. That is in part because as a component of the PaaS infrastructure, visibility and control mechanisms can be accounted for and architected into the solution. This is exactly what Margaret Dawson, Vice President of Product Marketing and Cloud Evangelist for HP and CloudNOW Top Ten Women in Cloud recipient, noted at DeployCon this past spring: "Most enterprises are going to move to a hybrid world. What we don't have yet is a common management and monitoring layer. For enterprises, to be able to audit and control everything in one interface is going to become critical." [emphasis mine]

It is common management that enables both control and visibility, as well as operationalizing scalability - a key capability desired by application developers and DevOps adoptees alike. Control and visibility are critical to addressing a key application concern - performance. As noted by a recent F5 worldwide survey, performance of web applications remains at the top of respondents' "to address" list, closely followed by controlling costs and mobile-user performance. Controlling infrastructure costs can be in part helped by visibility and more intelligent distribution of load that adapts more rapidly to demand and ensures that resources are consumed in a manner that optimizes utilization without compromising on performance. That means application-appropriate load balancing algorithms (read: round robin is almost never appropriate for modern applications) as well as an integrated feedback loop between the application and the scalability service (i.e. the load balancing service through which elasticity is achieved).
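As a toy illustration of that last point - not any product's implementation, with names and numbers invented - here is the difference between a blind round-robin rotation and a pick that closes the feedback loop by acting on load the application reports about itself:

```python
# Round robin ignores what the application reports about itself;
# a least-load pick closes the feedback loop.
import itertools, random

class Node:
    def __init__(self, name):
        self.name = name
        self.reported_load = 0.0  # fed by the app via a health/metrics API

nodes = [Node("app-1"), Node("app-2"), Node("app-3")]
rr = itertools.cycle(nodes)

def pick_round_robin():
    return next(rr)  # blind rotation: equal weight to unequal workloads

def pick_least_load():
    # The "integrated feedback loop": choose based on what the app says.
    return min(nodes, key=lambda n: n.reported_load)

# Simulate apps reporting their own load back to the scalability service.
for n in nodes:
    n.reported_load = random.random()
print("round robin picked:", pick_round_robin().name)
print("least load picked: ", pick_least_load().name)
```

The design point is simply that the selection function consumes a signal originating in the application, which is what makes the scalability service "application-appropriate."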
Enterprise PaaS, therefore, will focus more on the implementation of a common management and operational platform that provides control, visibility, and scalability to enterprise developers and devops that can meet both performance and cost requirements. Enterprise PaaS is, ultimately, about operations and how operations can enable development and devops to address the concerns typically associated with application delivery. There are signs that application server platform vendors like RedHat are beginning to address these concerns with new and expanded offerings. But as with many other application delivery capabilities that started out in the application server tier, these will likely not remain there long as other tiers within the data center begin to offer more capable and robust application-specific solutions.
What Applications Want

#SDN #API #devops What is it that applications want, and more importantly, what of those desires can the network fulfill?

That's one of the questions SDN has to answer in order to make SDN relevant in the big picture that is the software-defined data center. What, other than forwarding packets, routing between hops, and adding a little QoS here and there, can the network offer to applications? Consider the response of Robert Sherwood, CTO of Big Switch Networks and head of the ONF's Architecture and Framework Working Group, responsible in part for the standardizing of SDN controller northbound APIs, to Network World Editor in Chief John Dix's question regarding the role of the northbound API in the SDN architecture:

"So the northbound API is how that business application [e.g. Hadoop, OpenStack Nova] talks to the controller to explicitly describe its requirements: I am OpenStack. I want this VM field to talk to this other VM but no other VMs can talk to them, etc. But also give me a view of how loaded the network is so I can make an informed decision on where to put new VMs. So those are two examples of northbound APIs that I think are meaningful for people."
-- Clarifying the role of software-defined networking northbound APIs

These are two powerful examples of visibility (monitoring of load and conditions) and security (access control, essentially) that are lacking in today's architectures. While people (ops) clearly have visibility, this data is often shuttled off to an APM (application performance monitoring) system, never to be seen again except in the weekly operations report. Security, of course, is something applications and devops have traditionally accomplished through the use of IP access control lists in the operating system or using application-specific methods to enable/disable access from specific IP addresses and/or ranges. This, of course, is simply not a sustainable method of managing access in a modern, volatile environment. Such models were designed for fixed, static networks wherein application servers and systems were assigned an IP address at deployment - and they stayed put. Virtualization and cloud computing models break that model and introduce volatility, particularly when elasticity is desired.

Also of importance is the ability to segment out network traffic, to isolate tenants in the parlance of modern cloud architectures. VLAN assignment has traditionally been a very manual process, requiring updates to multiple pieces of network infrastructure along the data path. By enabling a more dynamic and automatic assignment process, tenant traffic can then be assigned specific network performance profiles that aid in meeting service level agreements, as well as routing to services specific to the application such as those providing security at multiple layers of the network stack. This is the concept behind service chaining: dynamically routing traffic through a set of services to provide valuable infrastructure functions on the inbound and outbound data path. What this implies is not that the controller or the controller "applications" are necessarily providing higher order functions. The controller applications can also be responsible for routing traffic to the appropriate services that provide those higher order functions. The SDN controller and its applications become the primary means of orchestrating traffic through the network, delegating to services hosted in the network those functions that are appropriate for the application.
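Rendered as code, the two "wants" in Sherwood's example might look something like this - a purely invented API for illustration, not the ONF's or any real controller's:

```python
# A toy rendering of the northbound calls in Sherwood's example.
# Invented API and stand-in telemetry values throughout.
class ToyNorthbound:
    def __init__(self):
        self.allowed = set()  # (src, dst) pairs permitted to talk
        self.link_load = {"rack1": 0.31, "rack2": 0.78}  # stand-in load data

    def allow(self, src, dst):
        """'This VM can talk to this other VM, but no other VMs can.'"""
        self.allowed.add((src, dst))
        self.allowed.add((dst, src))

    def permitted(self, src, dst):
        return (src, dst) in self.allowed

    def least_loaded_rack(self):
        """'Give me a view of how loaded the network is' for VM placement."""
        return min(self.link_load, key=self.link_load.get)

nb = ToyNorthbound()
nb.allow("vm-app", "vm-db")
print(nb.permitted("vm-app", "vm-db"))   # True
print(nb.permitted("vm-app", "vm-web"))  # False
print(nb.least_loaded_rack())            # rack1
```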
BUT THAT'S NOT WHAT APPLICATIONS WANT

What's interesting is that VLAN and default gateway configurations are not really application concerns. They are operating system concerns, network device concerns, but they are not, as is becoming the vernacular, domain concerns that are or even should be something the "application" wants. Oh, certainly the application needs an IP address, and security policies may dictate that it exchange data only with certain other systems, but that's not what the application wants. That's what it needs. To really start addressing what applications want, we must start evaluating domain concerns that are specific to the application.

An example of this is moving the functionality provided by WCCP (Web Cache Communication Protocol) to an SDN controller application. The cache application on the SDN controller would not necessarily provide the caching service itself, but rather offers the ability to determine if application requests destined for a specific application should be redirected to a caching service which is deployed atop an SDN-enabled (managed) network fabric. The way in which a router today uses WCCP to redirect and route network traffic to a stand-alone web cache translates to an SDN application. In the SDN model, using the northbound API, an application can inform the network it desires the services of a caching system. The SDN controller might then orchestrate the flow of traffic appropriately, chaining services to ensure the inclusion of the cache in the data path.

The interesting thing to watch in the coming months (and probably years, considering the maturation level of SDN in general) will be discovering what "wants" an application has that might be fulfilled using this model. Is it the case that an application will be able to inform an SDN controller it "wants" web application firewall protection for a set of URIs, and that from that information the SDN controller will be able to orchestrate (chain) the appropriate services as well as its configuration? Only time will tell whether this model will mature and turn out to be "the one", but what seems obvious is that success of this model depends entirely on just how application (domain) aware the model will be. Because what applications want are application (domain) services that reside far higher in the stack than what today's SDN models propose to provide and support. Service chaining in conjunction with a robust northbound API seems a feasible means to address that; a toy sketch follows.
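Here is a minimal sketch of that cache-and-WAF scenario - hypothetical names and addresses throughout, since no standard northbound API for this exists yet. An application declares the services it wants for matching traffic, and the controller answers with the chain a given request would traverse:

```python
# A toy service-chaining controller. Service names, addresses, and the
# API shape are all invented for illustration.
SERVICE_CATALOG = {"cache": "10.0.0.20", "waf": "10.0.0.30"}  # made-up hops

class ToyController:
    def __init__(self):
        self.chains = {}  # app -> ordered list of (service, address, match)

    def request_service(self, app, service, match):
        """Northbound call: 'app' wants 'service' for traffic matching 'match'."""
        if service not in SERVICE_CATALOG:
            raise ValueError("no such service: " + service)
        self.chains.setdefault(app, []).append(
            (service, SERVICE_CATALOG[service], match))

    def chain_for(self, app, uri):
        # What the data path would consult: which services does this request visit?
        return [hop for hop in self.chains.get(app, []) if uri.startswith(hop[2])]

ctrl = ToyController()
ctrl.request_service("storefront", "cache", match="/images/")
ctrl.request_service("storefront", "waf", match="/")
print(ctrl.chain_for("storefront", "/images/logo.png"))
# -> both the cache and the WAF are in the chain for this URI
```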
The Polyglot PaaS Platform

#PaaS @rishidot What does that mean? It means operationalizing the platform.

Krishnan Subramanian recently posted slides he created for Cloud Connect India on the topic of PaaS. Krish covers drivers and where PaaS is headed and includes in his list of characteristics of next-generation platforms the criterion of "polyglot platforms." This simple phrase encapsulates so much more than just the notion of platforms capable of supporting multiple development languages. It comprises the notion of an operationalized polyglot platform, one that brings standardization to operations while providing flexibility for developers to choose the right tool (language) for the job (application).

To understand why this is so important (and game-changing) you have to understand traditional enterprise application platforms. They are largely single-language platforms (think .NET and Java EE) that are operationalized only in the sense that scripting languages are capable of remotely modifying platform configurations and restarting the daemon. Enterprise organizations have long standardized on a few key platforms as a means to constrain the costs associated with licensing (OSS gained traction this way) and, more importantly, administrative overhead. Very little in the platform is focused on networky things like elasticity (scale) and performance monitoring or application domain-layer concerns. Add-on modules address some application domain-layer concerns, but bring with them additional compatibility issues and long-term overhead.

The need at the second layer of the cloud pyramid (remember that one? IaaS, PaaS, SaaS) for operational consistency and orchestration is as great as that of the first layer, IaaS. That's because at the PaaS layer there are all sorts of interesting, application domain-specific things happening under the covers. Acceleration, optimization, application routing and a variety of other application network services often end up being implemented in the application server as a means to provide what essentially can be considered support for multi-tenancy, i.e. the ability to tailor network and application network services to specific applications. The thing is that being able to offer "IT as a Service" or consistently deploy services is made more difficult by the need to support 2 or 3 or more application platforms at the operational layer. Thus, if a single, polyglot-capable platform were available, it would provide operations with a way to enable streamlined provisioning and management without overly restricting the choice of languages.

Note that a polyglot platform isn't necessarily one that can run PHP and node and RoR at the same time. Oh, maybe it could, but you really wouldn't want to do that. The value of a polyglot platform is in the platform; in the operational consistency and sameness of the underlying platform across applications implemented using different languages. It's in standardizing the platform layer to achieve economy of scale in the deployment domain, in much the same manner that it is standardization at the infrastructure and network layers comprising cloud that enables the benefits of efficiency and agility. It is about standardization at the platform layer without placing undesirable constraints on the development environment.
The idea that you can create an agile ops environment by standardizing on a platform without impacting development is a pretty powerful one, because it remains flexible (for the consumers, the developers) while reining in control of the platform in a way that ultimately leads to the benefits afforded through economy of scale - at the application infrastructure layer, just as standardization delivers them in the network infrastructure layer.
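A sketch of what that "operational sameness" might look like, with invented runtime names and helper functions - the point is only that the language choice is a single lookup, while provisioning, monitoring, and load balancer registration are one code path for every application:

```python
# Illustrative only: the runtime varies per application, but the
# operational steps are identical across languages.
RUNTIMES = {"php": "php-fpm", "ruby": "puma", "node": "node"}  # invented names

def provision(app, runtime, i): print(f"provision {app}-{i} on {runtime}")
def register_monitoring(app, i): print(f"monitor {app}-{i}")
def register_with_lb(app, n): print(f"lb pool {app}: {n} members")

def deploy(app_name, language, instances=2):
    runtime = RUNTIMES[language]           # the only language-specific step
    for i in range(instances):
        provision(app_name, runtime, i)    # identical across languages
        register_monitoring(app_name, i)   # identical across languages
    register_with_lb(app_name, instances)  # identical across languages

deploy("storefront", "php")
deploy("api", "node", instances=3)
```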
Knapsacks, Shopping Carts and Application Workloads

#geek Don't laugh or next week we'll talk fractals and tessellations and nature and tie that back to ERP. Somehow. Don't tempt me. I'm not kidding.

The other day I was shopping (on-line of course, because you know, Internets) as I often do: I was putting things into my shopping cart and occasionally pruning it back to fit within this imaginary total I was willing to spend. That's time consuming (albeit fun and sometimes agonizing), so I decided I should write an application that, when presented with X number of items and prices, could arrange those items in multiple carts, each under budget, so I had all the possible combinations represented (a brute-force sketch of exactly that appears at the end of this post). That, of course, immediately evoked thoughts of "How would you write that?" which immediately turned to the thought that it distilled down to a very basic knapsack problem. Which in turn - you knew it was going there - led to the realization that really, capacity planning is as NP-complete as provisioning of compute in the cloud. It's as difficult a process as manually (or automatically) rearranging my shopping cart, because applications are not workloads, and vice-versa. Not today's applications, anyway.

WORKLOAD != APPLICATION

An application today is actually comprised of multiple workload types, each of which exerts its own demands on compute and I/O that impact total capacity. We tend to base capacity planning around the notion of requests per second, or concurrent connections, or concurrent users, but those metrics are only vaguely related to actual compute and I/O consumption. The consumption profile of an application request that accesses a database is different than one performing analysis, is different from one parsing and filtering data. These are the metrics upon which we would optimally base capacity, and yet we can't because they're all mixed up together in the same application, and it is the application we are scaling, not individual workload types.

The knapsack problem is NP-complete primarily because the objects being put into the knapsack are non-equivalent. They are all different, like the prices of clothing and various pairs of shoes I put in my shopping cart. And when you have a target you're trying to hit and are using variably sized objects to hit it, you end up with a knapsack-like problem. In an ideal, workload-driven architecture, we'd be able to recognize that X compute is needed for every request for a specific workload, and thus if we want to support Y users or connections of that workload concurrently, we need Z compute. Determining how many "servers" are needed then becomes simple mathematics, and it's no longer NP-Complete or NP-Hard or NP-anything. If we could focus on selecting a pool of resources based on one constraint - I/O or compute - and match that to "users" needing that same constraint, we'd be making much more efficient use of resources all around. It would aid in sizing of "servers" because we'd match the RAM and CPU available to the demands placed on the hardware by the services being served.

There's a model that enables this, and it isn't the composite application model we're used to. Ultimately, the API model will provide the means by which we can more efficiently provision and scale "applications" by lessening the requirement to consider multiple variables. Stay tuned for a more in-depth post on that topic....
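As promised, the shopping-cart version of the problem as a minimal brute-force sketch (item names and prices invented). Enumerating every affordable combination is exponential in the number of items, which is exactly the NP-complete flavor the knapsack comparison is getting at:

```python
# Enumerate every combination of items that fits under a budget.
# Brute force: 2^n subsets, which is why this doesn't scale.
from itertools import combinations

def carts_under_budget(items, budget):
    """items: list of (name, price). Yields every affordable combination."""
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            if sum(price for _, price in combo) <= budget:
                yield combo

items = [("shoes", 55.0), ("jacket", 80.0), ("hat", 20.0), ("belt", 25.0)]
for cart in carts_under_budget(items, budget=100.0):
    print([name for name, _ in cart])
```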
Does Cloud Solve or Increase the 'Four Pillars' Problem?

It has long been said - often by this author - that there are four pillars to application performance:

- Memory
- CPU
- Network
- Storage

As soon as you resolve one in response to application response times, another becomes the bottleneck, even if you are not hitting that bottleneck yet. For a bit more detail:

- "Memory consumption" - because this impacts swapping in modern Operating Systems.
- "CPU utilization" - because regardless of OS, there is a magic line after which performance degrades radically.
- "Network throughput" - because applications have to communicate over the network, and blocking or not (almost all coding for networks today is), the information requested over the network is necessary and will eventually block code from continuing to execute.
- "Storage" - because IOPS matter when writing/reading to/from disk (or the OS swaps memory out/back in).

These four have long been relatively easy to track. The relationship is pretty easy to spot: when you resolve one problem, one of the others becomes the "most dangerous" to application performance. But historically, you've always had access to the hardware. Even in highly virtualized environments, these items could be considered both at the Host and Guest level - because both individual VMs and the entire system matter.

When moving to the cloud, the four pillars become much less manageable. How much less depends a lot upon your cloud provider, and how you define "cloud". Put in simple terms, if you are suddenly struck blind, that does not change what's in front of you, only your ability to perceive it. In the PaaS world, you have only the tools the provider offers to measure these things, and are urged not to think of the impact that host machines may have on your app. But they do have an impact. In an IaaS world you have somewhat more insight, but as others have pointed out, less control than in your datacenter.

Picture courtesy of Stanley Rabinowitz, Math Pro Press.

In the SaaS world, assuming you include that in "cloud", you have zero control and very little insight. If your app is not performing, you'll have to talk to the vendor's staff to (hopefully) get them to resolve issues. But is the problem any worse in the cloud than in the datacenter? I would have to argue no. Your ability to touch and feel the bits is reduced, but the actual problems are not. In a pure-play public cloud deployment, the performance of an application is heavily dependent upon your vendor, but the top-tier vendors (Amazon springs to mind) can spin up copies as needed to reduce workload. This is not a far cry from one common performance trick used in highly virtualized environments - bring up another VM on another server and add them to load balancing.

If the app is poorly designed, the net result is not that you're buying servers to host instances, it is instead that you're buying instances directly. This has implications for IT. The reduced up-front cost of using an inefficient app - no matter which of the four pillars it is inefficient in - means that IT shops are more likely to tolerate inefficiency, even though in the long run the cost of paying monthly may be far more than the cost of purchasing a new server was, simply because the budget pain is reduced.

There are a lot of companies out there offering information about cloud deployments that can help you to see if you feel blind. Fair disclosure: F5 is one of them, I work for F5. That's all you're going to hear on that topic in this blog.
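For the guest-side view, here is a minimal sketch of what tracking the four pillars from inside a single (virtual) machine might look like, using the third-party psutil library. In a public cloud this is the guest's perception only - the host-side numbers stay with the provider, which is exactly the reduced-visibility point above:

```python
# Guest-side view of the four pillars, assuming psutil is installed
# (pip install psutil). Host-side numbers are invisible from here.
import psutil

def four_pillars():
    mem = psutil.virtual_memory()
    cpu = psutil.cpu_percent(interval=1)        # sample over one second
    net = psutil.net_io_counters()
    disk = psutil.disk_io_counters()            # may be None on some platforms
    return {
        "memory_pct": mem.percent,              # pillar 1: memory consumption
        "cpu_pct": cpu,                         # pillar 2: CPU utilization
        "net_bytes": net.bytes_sent + net.bytes_recv,  # pillar 3: network
        "disk_ops": disk.read_count + disk.write_count,  # pillar 4: storage
    }

print(four_pillars())
```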
While knowing does not always directly correlate to taking action, and there is some information that only the cloud provider could offer you, knowing where performance bottlenecks are does at least give some level of decision-making back to IT staff. If an application is performing poorly, looking into what appears to be happening (you can tell network bandwidth, VM CPU usage, VM IOPS, etc., but not what's happening on the physical hardware) can inform decision-making about how to contain the OpEx costs of cloud.

Internal cloud is a much easier play; you still have access to all the information you had before cloud came along, and generally the investigation is similar to that used in a highly virtualized environment. From a troubleshooting performance problems perspective, it's much the same. The key with both virtualization and internal (private) clouds is that you're aiming for maximum utilization of resources, so you will have to watch for the bottlenecks more closely - you're "closer to the edge" of performance problems, because you designed it that way. A comprehensive logging and monitoring environment can go a long way in all cloud and virtualization environments to keeping on top of issues that crop up - particularly in a large datacenter with many apps running. And developer education on how not to be a resource hog is helpful for internally developed apps. For externally developed apps the best you can do is ask for sizing information and then test their assumptions before buying.

Sometimes, cloud simply is the right choice. If network bandwidth is the prime limiting factor, and your organization can accept the perceived security/compliance risks, for example, the cloud is an easy solution - bandwidth in the cloud is either not limited, or limited by your willingness to write a monthly check to cover usage. Either way, it's not an Internet connection upgrade, which can be dastardly expensive not just at install, but month after month. Keep rocking it. Get the visibility you need, don't worry about what you don't need.

Related Articles and Blogs:

- Don MacVittie - Load Balancing For Developers
- Advanced Load Balancing For Developers. The Network Dev Tool
- Load Balancers for Developers – ADCs Wan Optimization ...
- Intro to Load Balancing for Developers – How they work
- Intro to Load Balancing for Developers – The Gotchas
- Intro to Load Balancing for Developers – The Algorithms
- Load Balancing For Developers: Security and TCP Optimizations
- Advanced Load Balancers for Developers: ADCs - The Code
- Advanced Load Balancing For Developers: Virtual Benefits
- Don MacVittie - ADCs for Developers
- Devops Proverb: Process Practice Makes Perfect
- Devops is Not All About Automation
- 1024 Words: Why Devops is Hard
- Will DevOps Fork?
- DevOps. It's in the Culture, Not Tech.
- Lori MacVittie - Development and General
- Devops: Controlling Application Release Cycles to Avoid the ...
- An Aristotlean Approach to Devops and Infrastructure Integration
- How to Build a Silo Faster: Not Enough Ops in your Devops
Cloud Computing Goes Back to College

The University of Washington adds a cloud computing certificate program to its curriculum

It's not unusual to find cloud computing in a college environment. My oldest son was writing papers on cloud computing years ago in college, before "cloud" was a marketing term thrown about by any and everyone pushing solutions and products hosted on the Internet. But what isn't often seen is a focus on cloud computing on its own; as its own "area of study" within the larger context of computer science. That could be because when you get down to it, cloud computing is merely an amalgamation of computer science topics, and is more about applying processes and technology to modern data center problems than it is a specific technology itself. But it is a topic of interest, and it is a complex subject (from the perspective of someone building out a cloud or even architecting solutions that take advantage of the cloud), so a program of study may in fact be appropriate to provide a firmer foundation in the concepts and technologies underpinning the nebulous "cloud" umbrella.

The University of Washington recently announced the addition of a cloud computing certificate program to its curriculum. This three-course program of study is intended to explore cloud computing across a broad spectrum of concerns, from IaaS to PaaS to SaaS, with what appears to be a focus on IaaS in later courses. The courses and instructors are approved by the UW Department of Computer Science, and are designed for college-level and career professionals. They are non-credit courses that will set you back approximately $859 per course. Those of us not in close proximity may want to explore the online option, if you're interested in such a certificate to hang upon your wall. This is one of the first certificates available, so it will be interesting to see whether it's something the market is seeking or whether it's just a novelty.

In general, the winter course appears to really get into the meat and serves up a filling curriculum. While I'm not dismissing the first course offered in the fall, it does appear light on the computer science and heavy on the market, which, in general, seems more appropriate for an MBA-style program than one tied to computer science. The spring selection looks fascinating - but may be crossing too many IT concerns at one time. There are very few folks who are as comfortable on a switch command line as they are with the programmatic intricacies of data-related topics like Hadoop, HIVE, MapReduce and NoSQL. My guess is that the network and storage network topics will be a light touch given the requirement for programming experience and the implicit focus on developer-related topics. The focus on databases and the lack of a topic specifically addressing scalability models of applications is also interesting, though given the inherent difficulties and limitations on scaling "big data" in general, it may be necessary to focus more on the data tier and less on the application tiers. Of course I'm also delighted beyond words to see the load testing component in the winter session, as it cannot be stressed enough that load testing is an imperative when building any highly scalable system, and it's rarely a topic discussed in computer science degree programs.

The program is broken down into a trimester-style course of study, with offerings in the fall, winter and spring.
Fall: Introduction to Cloud Computing
- Overview of cloud (IaaS/PaaS/SaaS, major vendors, market overview)
- Cloud Misconceptions
- Cloud Economics
- Fundamentals of distributed systems
- Data center design
- Cloud at startup
- Cloud in the Enterprise
- Future Trends

Winter: Cloud Computing in Action
- Basic Cloud Application
- Building Instances
- Flexible persistent storage (aka EBS)
- Hosted SQL
- Load testing
- Operations (Monitoring, version control, deployment, backup)
- Small Scaling
- Autoscaling
- Continued Operations
- Advanced Topics (Query optimization, NoSQL solutions, memory caching, fault tolerance, disaster recovery)

Spring: Scalable and Data-Intensive Computing in the Cloud
- Components of scalable computing
- Cloud building topics (VLAN, NAS, SAN, Network switches, VMotion)
- Consistency models for large-scale distributed systems
- MapReduce/Big Data/NoSQL Systems
- Programming Big Data (Hadoop, HIVE, Pig, etc)
- Database-as-a-Service (SQL Azure, RDS, Database.com)

Apposite to the view that cloud computing is a computer science-related topic, not necessarily a business-focused technology, are the requirements for the course: programming experience, a fundamental understanding of protocols and networking, and the ability to remotely connect to Linux instances via SSH are expected to be among the skill set of applicants. The requirement for programming experience is an interesting one, as it seems to assume the intended users are or will be developers, not operators. The question becomes: is scripting, as often leveraged by operators and admins to manage infrastructure, considered "programming experience"? Looking deeper into the courses, the program later appears to focus on operations and networking, diving into NAS, SAN, VLAN and switching concerns; a focus in IT which is unusual for developers.

That's interesting because in general computer science as a field of study tends to be highly focused on system design and programming, with some degree programs across the country offering more tightly focused areas of expertise in security or networking. But primarily "computer science" degrees focus more on programmatic concerns and less on protocols, networking and storage. Cloud computing, however, appears poised to change that - with developers needing more operational and networking fu and vice-versa. A focus of devops has been on adopting programmatic methodologies such as agile and applying them to operations as a means to create repeatable deployment patterns within production environments. Thus, a broad overview of all the relevant technologies required for "cloud computing" seems appropriate, though it remains to be seen whether such an approach will provide the fundamentals really necessary for its attendees to successfully take advantage of cloud computing in the Real World™.

Regardless, it's a step forward for cloud computing to be recognized as valuable enough to warrant a year of study, let alone a certificate, and it will be interesting to hear what students of the course think of it after earning a certificate. You can learn more about the certificate program at the University of Washington's web site.

- Cloud is not Rocket Science but it is Computer Science
- The Database Tier is Not Elastic
- Certificate in Cloud Computing UW
- The Impossibility of CAP and Cloud
- Brewer's CAP Theorem
- Joe Weinman – Cloud Computing is NP-Complete Proof
- Greedy (IT) Algorithms
- Not all application requests are created equal
Web 2.0 Killed the Middleware Star

Pondering the impact of cloud and Web 2.0 on traditional middleware messaging-based architectures and PaaS.

It started out innocently enough with a simple question, "What exactly *is* the model for PaaS services scalability? If based on HTTP/REST API integration, fairly easy. If native middleware… input?" You'll forgive the odd phrasing – Twitter's limitations sometimes make conversations of this nature … interesting. The discussion culminated in what appeared to be the sentiment that middleware was mostly obsolete with respect to PaaS.

THE OLD WAY

Very briefly, for those of you who are more infrastructure/network minded than application architecture fluent, let's review the traditional middleware-based application architecture. Generally speaking, middleware – a.k.a. JMS, MQ Series and most recently, ESB – is leveraged as a means to enable a publish-subscribe model of sharing data. Basically, it's an integration pattern, but no one really likes to admit that because of the connotations associated with the evil word "integration" in the enterprise. But that's what it's used for – to integrate applications by sharing data. It's more efficient than a point-to-point integration model, especially when one application might need to share data with two, three or more other applications. One application puts data into a queue and other applications pull it out. If the target of the "messages" is multiple applications or people, then the queue keeps the message for a specified period of time; otherwise, it deletes or archives (or both) the message after the intended recipient receives it.

That pattern is probably very familiar to even those who aren't entrenched in enterprise application architecture because it's similar to most social networking software in action today. One person writes a status update, the message, and it's distributed to all the applications and users who have subscribed (followed, put in a circle, friended, etc… ). The difference between Web-based social networking and traditional enterprise applications is two-fold:

First – web-based applications were not, until the advent of Web 2.0 and specifically AJAX, well-suited to "polling for" or "subscribing to" messages (updates, statuses, etc…); thus the use of traditional pub-sub architectures for web applications never much gained traction.

Second – middleware has never scaled well using traditional scalability models (up or out). Web-based applications generally require higher capacity and transaction rates than traditional applications taking advantage of middleware, making middleware's inability to scale problematic. It is unsuited to use in social networking and other high-volume data sharing systems, where rapidity of response is vital to success.

Moreover, as James Urquhart noted earlier in the conversation, although cloud computing and virtualization appear capable of addressing the scalability issue by scaling middleware at the VM layer – which is certainly viable and makes the solution scalable in terms of volume – this introduces issues with consistency, a.k.a. CAP, because persistence is not addressed and thus the consistency of messages across queues shared by users is always in question. Basically, we end up – as pointed out by James Saull – with a model that basically kicks the problem to another tier: the scalable persistence service.
Generally that means a database-based solution, even if we use the power of virtualization and cloud computing to address the innate challenges associated with scaling messaging middleware. Mind you, that doesn't mean an RDBMS is involved, but a data store of some kind, and all data stores introduce similar architectural and technological issues with consistency, reliability and scalability.

THE NEW WAY

Now this is not meant to say the concept of queuing, of pub-sub, is absent in web applications and social networking. Quite the contrary, in fact. The concept is seen in just about any social networking site today that bases itself on interaction (integration) of people. What's absent is the traditional middleware as a means to manage the messages across those people (and applications). See, scaling middleware ran into the same issues as stateful applications – they required persistence or a shared-nothing architecture to ensure proper behavior. The problem as you added middleware servers became the same as other persistence-based issues seen in web applications, digital shopping carts and even today's VDI implementations. How do you ensure, having subscribed to a particular topic, that you actually manage to get the messages when the load balancing solution arbitrarily directs you to the next available server? You can't. Hence the use of persistent stores to enable scalability of middleware.

What you end up with is essentially 4 tiers – web, application, middleware and database. You might at this point begin to recognize that one of these tiers is redundant and, given the web-based constraints above, unnecessary. Three guesses which one it is, and the first two do not count. Right. The middleware tier. Web 2.0 applications don't generally use a middleware tier to facilitate messaging across users or applications. They use APIs and web-based database access methods to go directly to the source. Same concept, more scalable implementation, less complexity.
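Stripped to its bones, that middleware-free pattern looks something like this - a deliberately bare sketch using SQLite as a stand-in for the persistent store, with invented table and topic names. Publishers write straight to the store, and subscribers poll it the way an AJAX client polls an API endpoint:

```python
# Pub-sub with no middleware tier: publish straight to the persistent
# store, subscribe by polling it. SQLite stands in for the data tier.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, topic TEXT, body TEXT)")

def publish(topic, body):
    db.execute("INSERT INTO messages (topic, body) VALUES (?, ?)", (topic, body))

def poll(topic, since_id):
    """What an AJAX/API client does: 'give me everything newer than what I have'."""
    return db.execute(
        "SELECT id, body FROM messages WHERE topic = ? AND id > ? ORDER BY id",
        (topic, since_id)).fetchall()

publish("status", "deploy started")
publish("status", "deploy finished")
last_seen = 0
for msg_id, body in poll("status", last_seen):
    print(msg_id, body)
    last_seen = msg_id
```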
As James put it later in the conversation, "the 'proven' architecture seems to be Web2, which has its limitations."

BRINGING IT BACK to PaaS

So how does this relate to PaaS? Well, PaaS is Platform as a Service, which is really a nebulous way of describing developer services delivered in a cloud computing environment. Data, messaging, session management, libraries; the entire application development ecosystem. One of those components is messaging and, so it would seem, traditional middleware (as a service, of course). But the scalability issues with middleware really haven't been solved, and the persistence issues remain. Adding pressure is the web development paradigm in which middleware has traditionally been excluded. Most younger developers have not had the experience (and they should count themselves lucky in this regard) of dealing with queuing systems and traditional pub-sub implementations. They're a three-tier generation of developers who implement the concept of messaging by leveraging database connectivity directly and, most recently, polling via AJAX and APIs. Queueing may be involved, but if it is, it's implemented in conjunction with the database – the persistent store – and clients access it via the application tier directly, not through middleware. The ease with which web and application tiers are scaled in a cloud computing environment, moreover, meets the higher concurrent user and transaction volume requirements (not to mention performance) associated with highly integrated web applications today.

Scaling middleware services at the virtualization layer, as noted above, is possible, but reintroduces the necessity of a persistent store. And if we're going to use a persistent store, why add a layer of complexity (and cost) to the architecture when Web 2.0 has shown us it is not only viable but inherently more scalable to go directly to that source? At the end of the day, it certainly appears, between cloud computing models and Web 2.0 having been forced to solve the shared messaging concept without middleware – and having done so successfully – that middleware as a service is obsolete. Not dead, mind you, as some will find a use case in which it is a vital component, but those will be few and far between. The scalability and associated persistence issues have been solved by some providers – take RabbitMQ for example – but that ignores the underlying reality that Web 2.0 forced a solution that did not require middleware, and that nearly all web-based applications eschew middleware as the mechanism for implementing pub-sub and other similar architectural patterns. We've gotten along just fine on the web without it; why reintroduce what is simply another layer of complexity and costs in the cloud unless there's good reason? Viva la evolution.

- The Inevitable Eventual Consistency of Cloud Computing
- Let's Face It: PaaS is Just SOA for Platforms Without the Baggage
- Cloud-Tiered Architectural Models are Bad Except When They Aren't
- The Database Tier is Not Elastic
- The New Distribution of The 3-Tiered Architecture Changes Everything
- The Great Client-Server Architecture Myth
- Infrastructure Scalability Pattern: Sharding Sessions
- Infrastructure Scalability Pattern: Partition by Function or Type
- Applying Scalability Patterns to Infrastructure Architecture
- Sessions, Sessions Everywhere
The Stealthy Ascendancy of JSON

While everyone was focused on cloud, JSON has slowly but surely been taking over the application development world

It looks like the debate between XML and JSON may be coming to a close, with JSON poised to take the title of preferred format for web applications. If you don't consider these statistics to be impressive, consider that ProgrammableWeb indicated that its "own statistics on ProgrammableWeb show a significant increase in the number of JSON APIs over 2009/2010. During 2009 there were only 191 JSON APIs registered. So far in 2010 [August] there are already 223!" Today there are 1262 JSON APIs registered, which means a growth rate of 565% in the past eight months, nearly catching up to XML, which currently lists 2162 APIs. At this rate, JSON will likely overtake XML as the preferred format by the end of 2011.

This is significant to both infrastructure vendors and cloud computing providers alike, because it indicates a preference for a programmatic model that must be accounted for when developing services, particularly those in the PaaS (Platform as a Service) domain. PaaS has yet to grab developers' mindshare, and it may be that support for JSON will be one of the ways in which that mindshare is attracted. Consider the results of the "State of Web Development 2010" survey from Web Directions in which developers were asked about their cloud computing usage; only 22% responded in the affirmative to utilizing cloud computing. But of those 22% that do leverage cloud computing, the providers they use are telling: PaaS represents a mere 7.35% of developers' use of cloud computing, with storage (Amazon S3) and IaaS (Infrastructure as a Service) garnering 26.89% of responses.

Google App Engine is the dominant PaaS platform at the moment, most likely owing to the fact that it is primarily focused on JavaScript, UI, and other utility-style services as opposed to Azure's middleware and definitely more enterprise-class focused services. SaaS, too, is failing to recognize the demand from developers and the growing ascendancy of JSON. Consider this exchange on the Salesforce.com forums regarding JSON:

"Come on salesforce lets get this done. We need to integrate, we need this [JSON]."

If JSON continues its steady rise into ascendancy, PaaS and SaaS providers alike should be ready to support JSON-style integration, as its growth pattern indicates it is not going away but is instead picking up steam. Providers able to support JSON for PaaS and SaaS will have a competitive advantage over those that do not, especially as they vie for the hearts and minds of developers which are, after all, their core constituency.

THE IMPACT

What the steady rise of JSON should trigger for providers and vendors alike is a need to support JSON as the means by which services are integrated, invoked, and data exchanged. Application delivery, service-provider and Infrastructure 2.0 focused solutions need to provide APIs that are JSON compatible and which are capable of handling the format to provide core infrastructure services such as firewalling and data scrubbing duties. The increasing use of JSON-based APIs to integrate with external, third-party services continues to grow, and the demand for enterprise-class services to support JSON as well will continue to rise.
There are drawbacks, and this steady movement toward JSON has in some cases a profound impact on the infrastructure and architectural choices made by IT organizations, especially in terms of providing for consistency of services across what is likely a very mixed-format environment. Identity and access management and security services may not be prepared to handle JSON APIs nor provide the same services as they have for XML, which through long-established usage and efforts comes with its own set of standards. Including social networking "streams" in applications and web sites is now as common as including images, but changes to APIs may make basic security chores difficult. Consider that Twitter – very quietly – has moved to supporting JSON only for its Streaming API. Organizations that were, as well they should, scrubbing such streams to prevent both embarrassing as well as malicious code from being integrated unknowingly into their sites, may have suddenly found that infrastructure providing such services no longer worked:

"API providers and developers are making their choice quite clear when it comes to choosing between XML and JSON. A nearly unanimous choice seems to be JSON. Several API providers, including Twitter, have either stopped supporting the XML format or are even introducing newer versions of their API with only JSON support. In our ProgrammableWeb API directory, JSON seems to be the winner. A couple of items are of interest this week in the XML versus JSON debate. We had earlier reported that come early December, Twitter plans to stop support for XML in its Streaming API."
-- JSON Continues its Winning Streak Over XML, ProgrammableWeb (Dec 2010)

Similarly, caching and acceleration services may be confused by a change from XML to JSON; from a format that was well understood and for which solutions were enabled with parsing capabilities to one that is not.

IT'S THE DATA, NOT the API

The fight between JSON and XML is one we continue to see in a general sense. See, it isn't necessarily the API that matters in the end, but the data format (the semantics) used to exchange that data. XML is considered unstructured, though in practice it's far more structured than JSON in the sense that there are meta-data standards for XML that constrain security, identity, and even application formats. JSON, however, although having been included natively in ECMA v5 (JSON data interchange format gets ECMA standards blessing), has very few standards aside from those imposed by frameworks and toolkits such as JQuery. This will make it challenging for infrastructure vendors supporting services that target application data – data scrubbing, web application firewall, IDS, IPS, caching, advanced routing – to continue to effectively deliver those services without recognizing JSON as an option.

The API has become little more than a set of URIs, and nearly all infrastructure directly related to application delivery is more than capable of handling them. It is the data, however, that presents a challenge, and which makes the developers' choice of formats so important in the big picture. It isn't just the application and integration that is impacted, it's the entire infrastructure and architecture that must adapt to support the data format. The World Doesn't Care About APIs – but it does care about the data, about the model. Right now, it appears that model is more than likely going to be presented in a JSON-encoded format.
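As a simplified illustration of the scrubbing duty described above - the field names are invented, not any real API's schema - a whitelist-plus-escape pass over a JSON message before it's embedded in a page might look like this:

```python
# Scrubbing a JSON stream before embedding it in a page: keep only
# whitelisted fields and neutralize embedded markup in string values.
import html, json

ALLOWED_FIELDS = {"user", "text", "created_at"}  # whitelist, not blacklist

def scrub(raw):
    msg = json.loads(raw)
    clean = {}
    for key in ALLOWED_FIELDS & msg.keys():
        value = msg[key]
        if isinstance(value, str):
            value = html.escape(value)  # neutralize embedded markup/script
        clean[key] = value
    return clean

raw = '{"user": "eve", "text": "<script>alert(1)</script>", "extra": "dropped"}'
print(scrub(raw))
# -> user and text survive (text escaped); the unexpected "extra" is dropped
```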
- JSON data interchange format gets ECMA standards blessing
- JSON Continues its Winning Streak Over XML
- JSON versus XML: Your Choice Matters More Than You Think
- I am in your HTTP headers, attacking your application
- The Web 2.0 API: From collaborating to compromised
- Would you risk $31,000 for milliseconds of application response time?
- Stop brute force listing of HTTP OPTIONS with network-side scripting
- The New Distribution of The 3-Tiered Architecture Changes Everything
- Are You Scrubbing the Twitter Stream on Your Web Site?