In the first post of this series, I mentioned that we are at the advent of a new era of glorious distributed computing of many kinds, and therefore infrastructure has to be reimagined to meet the evolving needs and challenges. In this article, we will explore how computing infrastructure has become over-centralized with SaaS and IaaS cloud services, and a new need and opportunity is emerging to broadly distribute computing to solve some needs and cater to new opportunities.
In the early days of computing, except for a few government institutions, no enterprise had its own data center. Educational and research institutions and businesses used computers hosted by large service providers such as IBM. Thomas J. Watson, IBM’s longtime chief, is supposed to have remarked that the entire worldwide market for computers was limited to just five (yes, you read that right, just five!) large installations. We all know how that turned out! It is a different matter that today’s IaaS clouds are consolidating into about five large installations. It is not the same thing, though: each cloud has millions of servers far more powerful than the supercomputers of that era!
As software solutions started to emerge to provide specialized application capabilities, and computers started to come down in price due to advances in electronics, every enterprise became able to afford a data center of its own. The advent of mini-computers and personal computers/servers then led to the widespread deployment of computers and application installations everywhere.
Slowly, capital expenses (Capex) started to be outstripped by operational expenses (Opex).
On top of that, the cost of professional services to customize application software far outstripped license costs, to the tune of five times the license expense! That’s when SaaS burst onto the scene, exploiting the internet and the web to offer quick and easy, pay-as-you-go consumption of application capabilities with simple click-and-done customizations.
Although some services were offered by ASPs (application service providers) before the dot-com bubble burst, they were still single-instance, heavily customized software, with just the license and data center being owned by the provider, largely following the same old heavyweight installation model. It was as if the total cost of ownership was just being amortized over a longer period, rather than this being real multi-tenant, pay-as-you-go, easy-to-use SaaS.
I think Salesforce.com can be said to be the true pioneer of SaaS as we know it today. “No Software” was their motto, meaning don’t buy, rent; don’t install, use. Kudos to them! SaaS has become so mainstream that these days everyone looks askance if an enterprise buys and installs software. This trend centralized application services, but not all applications, of course.
Enterprises still have lots of custom applications, particularly when the use of the software is intrinsic (such as eBiz) or crucial to business advantage (e.g., large financial services firms).
About a decade or so ago, Amazon.com brought forth a new revolution: Infrastructure as a Service (IaaS) in the form of Amazon Web Services (AWS). In part 2 of this series, we explored the dramatic impact it had on application development and deployment, and consequently on the speed of innovation.
As enterprises, particularly e-Businesses, saw the advantages IaaS provided, they started to adopt IaaS with increasing comfort as AWS began to provide more and better services. This led to the centralization of deployment of even custom applications. Interestingly, some enterprises have even started to dismantle data centers entirely and have even gone to the extent of re-deploying their packaged vendor software on IaaS clouds. It has been an ongoing transition which has been accelerating recently.
However, application delivery owners were confronted with three major challenges: end-user experience, data gravity, and data sovereignty.
These challenges are driving SaaS providers, e-Businesses and enterprises to reconsider whether the centralized deployment of applications in their data centers (or co-location centers) or on IaaS public clouds should be complemented by distributed edge locations. The advent of 5G telecom services is adding weight to such considerations. Let us now elaborate on each of these issues.
Many SaaS providers use a few co-located data centers, since SaaS pre-dated IaaS, and have only lately adopted IaaS to complement or (perhaps in the distant future) replace their data centers.
Even though public clouds have global deployments, they are built for scale and hence tend to have concentrated deployments in major locations. One major issue that has been observed is that the resulting end-user experience for such centralized apps is poor, especially in remote branches or for mobile/home-office workers.
Although many solutions have been tried, end-user experience from cloud locations (albeit spread around the globe) and from SaaS/E-Biz data centers has been uneven at best and unacceptable at worst, sometimes even triggering service-level penalties or outright cancelations (also known in the SaaS world by the dreaded term, churn!).
Not only that, it is widely accepted that end-user response times directly raise or lower revenues for businesses that rely wholly or largely on online activities for revenue generation (such as e-commerce, social networks, etc.). Lost shopping carts and lost advertising revenue haunt the nightmares of application delivery executives of such businesses.
While there are numerous factors that can improve or degrade end-user response times, two factors dominate the time it takes a request or a response to traverse the network. One is quite simply the speed of light. As some smart aleck once quipped, 300,000 km per second is not just a good idea; it is the law, the ultimate universal speed limit!
Because of this limit, the farther a user is from the application location, the slower the response time; conversely, the closer an application is to the end-user, the faster the interaction can be.
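To put rough numbers on this limit, here is a minimal sketch that computes the floor that propagation delay alone puts on a round trip. The distances are made-up examples, and the two-thirds factor is an assumption that the signal travels through optical fiber, where light moves noticeably slower than in a vacuum. Real response times will be higher, since routing, queuing and processing all add to this floor.

```python
# Illustrative constants (assumptions for this sketch, not measurements).
SPEED_OF_LIGHT_KM_PER_S = 300_000  # the vacuum figure quoted in the text
FIBER_FACTOR = 2 / 3               # assumed: light in fiber travels at about 2/3 of c


def min_round_trip_ms(distance_km: float, medium_factor: float = FIBER_FACTOR) -> float:
    """Lower bound on round-trip time (ms) from propagation delay alone."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * medium_factor)
    return 2 * one_way_s * 1000


# Hypothetical user-to-application distances.
for label, km in [("nearby edge site", 50),
                  ("regional cloud", 1_500),
                  ("another continent", 10_000)]:
    print(f"{label:>18}: at least {min_round_trip_ms(km):.1f} ms per round trip")
```

Under these assumptions, a user 10,000 km away pays on the order of 100 ms per round trip before the application does any work at all, and an interaction that needs several round trips pays that cost several times over.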
The upcoming deployment of 5G mobile tech is supposed to reduce the latency between a connected device and the first possible packet-processing unit to just two milliseconds.
This is revolutionary! Applications are already being devised to exploit this ultra-low latency, such as multi-player gaming and Augmented/Virtual Reality (AR/VR) apps. But to exploit this low latency, applications need to be close to the location where the 5G “last-mile” connection enters the telecom provider’s network.
Otherwise, the vagaries of the “mid-mile” links will overwhelm the advantages brought by 5G.
The second factor is network connectivity and path lengths. This is how a request or response is routed through the network, the number of router hops and the path length between each hop.
Due to vagaries of service provider interconnections (aka peering), routing protocols and preferences, sometimes network packets can end up taking quite roundabout paths, with more hops than strictly needed and/or much greater path lengths. This can result in momentary or lasting poor application performance. This issue also can potentially be addressed by network optimization services and application acceleration services deployed closer to the end-user, in various edge locations.
Such locations could include 5G service delivery clusters provided by telecom service providers, which would drastically cut latencies, making it possible to deliver blindingly fast, high bandwidth services such as multi-player gaming, AR/VR experiences, etc.
A variant of end-user experience is end “thing” experience. This occurs in Internet of Things (IoT) installations where a decision needs to be made nearly instantaneously (such as when a monitored aluminum smelter is over-heating) or while disconnected from the cloud (such as at a really remote oil rig). In both cases, it would be imperative for the application component (micro-service?!) to be located as close as possible to the “things”, ideally in the same local monitoring network.
We all know about gravity! The more massive an object is, the more gravity weighs things down, and it takes a lot more effort to lift them.
“Data gravity” is, however, not a term in common parlance, and probably needs a bit of explanation and context. It can be defined, somewhat simplistically, as “the amount of data that is needed by a particular application task at any given time”. Going by this definition, if an application task requires just one byte of data, it has very low data gravity. If, however, an application task has to churn through a huge amount of data, it can be said to have high data gravity.
As opposed to real gravity, which gets weaker as distance increases (per Newton’s law of universal gravitation), the strange and annoying thing about data gravity is that its effect gets worse as the “time distance” between the application and the location of the data increases! The effects of data gravity can also be exacerbated by limited bandwidth for transferring data between the storage location and the compute location (this applies to high-speed data-center links, high-speed system hardware backplanes, and even the links between processors and main memory). Modern data-crunching analytics and machine learning apps have very high data gravity.
The natural conclusion one can reach is that if an application task’s data gravity is high, then the closer it is executed to the data it processes, and the larger the data-access bandwidth, the better it will perform. There are several situations where executing an application task in the cloud makes it suffer from data gravity.
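A back-of-the-envelope model can make this concrete. The sketch below assumes the time to get data to a task is simply the latency cost plus the data size divided by the bandwidth. The function name, the 1 TB dataset and the link speeds are all hypothetical numbers picked for illustration, and the model ignores compute time, protocol overhead and parallel transfers.

```python
def data_access_time_s(data_bytes: float, bandwidth_bytes_per_s: float,
                       round_trip_s: float, round_trips: int = 1) -> float:
    """Rough time to fetch data: latency ("time distance") cost plus transfer cost."""
    return round_trips * round_trip_s + data_bytes / bandwidth_bytes_per_s


DATASET_BYTES = 1e12  # hypothetical 1 TB of data needed by the task

# Assumed links: a 1 Gbit/s WAN path to a distant cloud (125 MB/s, 80 ms round trip)
# versus a 10 GB/s path inside the site that holds the data (0.5 ms round trip).
remote_s = data_access_time_s(DATASET_BYTES, bandwidth_bytes_per_s=125e6, round_trip_s=0.080)
local_s = data_access_time_s(DATASET_BYTES, bandwidth_bytes_per_s=10e9, round_trip_s=0.0005)

print(f"task far from the data: about {remote_s / 3600:.1f} hours")
print(f"task next to the data:  about {local_s / 60:.1f} minutes")
```

With a one-byte task, the same formula is dominated by the round-trip term, which is why low data gravity tasks care mostly about latency while high data gravity tasks care mostly about bandwidth.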
One such scenario is when a large amount of pre-existing data needs to be processed along with new data (for instance, for machine learning), and it is not practical or secure to copy the data to the cloud. (Some financial services companies refuse to upload sensitive data to the cloud and hence are firmly wedded to a data-center-only or, at best, a hybrid-cloud deployment model for this reason.)
Another scenario is when a huge volume of data is continuously produced (such as in many Internet of Things installations, where thousands of devices are instrumented and a large volume of data is collected). In such cases, it often does not make sense to transport all of the data to the cloud; it may make sense to transmit only when some monitored item changes.
An example is when someone is monitoring temperature: it may be collected every second, but if it has not changed, why transmit it?! To accomplish such change detection or some other aggregation functions, we will need to locate processing close to where data is generated. As a discerning reader, you may have noticed that this is similar to the IoT case we discussed for end “thing” experience. Kudos to you!
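The temperature example can be sketched as a small filter that runs next to the sensor and forwards a reading only when it differs from the last one that was sent. The function name `changes_only` and the `tolerance` parameter are illustrative choices for this sketch, not part of any particular IoT platform.

```python
from typing import Iterable, Iterator, Optional, Tuple

Reading = Tuple[int, float]  # (timestamp in seconds, temperature)


def changes_only(readings: Iterable[Reading], tolerance: float = 0.0) -> Iterator[Reading]:
    """Yield a reading only if it differs from the last transmitted value by more than tolerance."""
    last_sent: Optional[float] = None
    for timestamp, value in readings:
        if last_sent is None or abs(value - last_sent) > tolerance:
            last_sent = value
            yield timestamp, value


# Made-up readings collected once per second at the edge.
collected = [(0, 21.0), (1, 21.0), (2, 21.0), (3, 21.5), (4, 21.5), (5, 22.0)]

to_transmit = list(changes_only(collected))
print(f"collected {len(collected)} readings, transmitting {len(to_transmit)}: {to_transmit}")
```

Here only three of the six readings would cross the network. Raising `tolerance` trades precision for even less traffic, and the same pattern extends to other aggregations such as per-minute averages.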
Therefore, we can surmise that high data gravity “attracts” application tasks to be placed as close as possible to the data.
Recently you may have heard the term “data sovereignty” a lot. For those of you who may not be familiar with what it means, I would like to explain it briefly as context for how it matters for distributed compute infrastructure. Sovereignty in this context means that data needs to stay in the location where it is generated. In other words, the location of data generation has sovereignty over that piece of data, just like a nation has sovereignty over its citizens who reside there.
Several countries (e.g., China), groups of countries (e.g., European Union) and even smaller domains (e.g., the state of California in the USA) have enacted data sovereignty laws.
India recently started enforcing strict data sovereignty laws, which forced international companies to scramble to implement stop-gap solutions to be compliant and avoid hefty fines or even possible business downtime. The EU’s General Data Protection Regulation (GDPR) has caused enterprises to think and work ever harder to devise ways to remain compliant. Many SaaS companies that have customers in domains with such sovereignty laws also have to devise ways to comply.
All of this leads to one plausible answer: to paraphrase the well-known aphorism, if the (data) mountain cannot go (be moved) to the miner (app), then the miner has to go to the mountain (much like the data gravity situation). That is, if the data legally cannot be exported, move the application that processes the data to the location of the data.
Moving an entire application is easier said than done, of course, but it may be enough to move the part of the application that can process the data and extract and return relevant compliant results (such as AI models) that are needed elsewhere, or take any other action on the data locally (including actions such as “forget me” required by data privacy laws).
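As a rough illustration of this idea, the sketch below shows an application component meant to run inside the region where the data lives. It returns only an aggregate summary to the rest of the application and handles a “forget me” request locally. The record fields and function names are hypothetical, and whether a given aggregate actually counts as compliant is a legal question this sketch simply assumes away.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    user_id: str     # personal data: assumed not allowed to leave the region
    region: str
    purchase: float


def summarize_in_region(records: List[Record], region: str) -> Dict[str, object]:
    """Runs next to the data; only this aggregate (no user_id) is sent elsewhere."""
    amounts = [r.purchase for r in records if r.region == region]
    return {
        "region": region,
        "count": len(amounts),
        "mean_purchase": mean(amounts) if amounts else 0.0,
    }


def forget_user(records: List[Record], user_id: str) -> List[Record]:
    """Handle a "forget me" request locally by dropping that user's records."""
    return [r for r in records if r.user_id != user_id]


# Made-up records held in one region's local store.
store = [Record("u1", "EU", 10.0), Record("u2", "EU", 20.0), Record("u1", "EU", 30.0)]

print(summarize_in_region(store, "EU"))
store = forget_user(store, "u1")
print(summarize_in_region(store, "EU"))
```

The design choice here is that the raw records never appear in a return value that crosses the region boundary; the central application only ever sees the summary dictionary.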
As we can see from the description above, there are compelling reasons not to limit applications to be deployed only at central cloud locations or data centers, but to distribute them (at least parts of them) as close as possible to users, things and data. This calls for a distributed edge compute infrastructure for applications and/or acceleration services.
In the next article, we will explore enterprise blockchains, a different, application-level trend that is distributing trust across the participants.