Workload design for Cloud
Traditional approaches to creating and maintaining workloads in the enterprise often include contributions from many teams and individuals. They also include a combination of manual and scripted activities. I sometimes refer to this as an assembly line. The workloads that come off the assembly line, like cars, have odometers and require regular maintenance. Others refer to these workloads as pets. When moving workloads to the cloud, it’s helpful to visualize how this typically happens on premises and how it will change in the cloud.
The illustration above shows the traditional approach in the top left, including some of the teams that participate in the preparation of the workload.
- Infrastructure Build is the team responsible for the bare metal servers and virtualization platform (e.g. VMware and OpenStack are common).
- Infrastructure Config is the team responsible for installing an operating system, attaching storage and sometimes installing certain packages. This is often the team responsible for ongoing patching.
- Platform Setup is handled by a team responsible for the runtime environment. Components often include desired programming languages and libraries. In some cases it also includes platform components (e.g. Tomcat).
- Application and Application Config are often handled by another team that is responsible for bundling and distributing the source code or compiled executables to the hosts. This team is often also responsible for application config, such as database credentials.
In many cases, enterprises have a single “golden” image that is used as the basis for every workload. Because that image is generic, all of the work shown in the upper left of the diagram must be repeated for every workload. When moving toward cloud, whether to containers or VMs, it is desirable to codify infrastructure so that all the “assembly line” work is packaged into the image. This is depicted on the bottom right.
Components of Cloud Infrastructure
When done right, three primary components make up the new delivery approach. First is the image that encapsulates all the work previously done on the “assembly line”. This means none of that work needs to happen when the instance is launched. This delivers several benefits, including reliable scale and heal actions as well as immutable workload definitions that reduce risk during deployments.
Second, persistent storage is managed outside the life-cycle of the instances or the images. Depending on the deployment platform, there are various ways to facilitate this externalized storage. It’s most important to recognize the need to manage the life-cycle of persistent storage outside of the instances. This can feel like a big shift away from traditional workloads that are expected to exist for long periods of time with local storage expected to be durable.
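To make the idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Instance` class, the `state.json` file name and the use of a temporary directory to stand in for an externally managed volume are all hypothetical, not part of any particular platform.

```python
import json
import tempfile
from pathlib import Path


class Instance:
    """A disposable workload instance; it keeps no durable state of its own."""

    def __init__(self, data_dir: Path) -> None:
        # data_dir stands in for externally managed storage (hypothetical).
        self.state_file = data_dir / "state.json"

    def record_order(self, order_id: str) -> None:
        orders = self.list_orders()
        orders.append(order_id)
        self.state_file.write_text(json.dumps(orders))

    def list_orders(self) -> list:
        if not self.state_file.exists():
            return []
        return json.loads(self.state_file.read_text())


# The "volume" is created and removed independently of any instance.
with tempfile.TemporaryDirectory() as volume:
    first = Instance(Path(volume))
    first.record_order("A-1")
    del first  # simulate the instance being terminated

    replacement = Instance(Path(volume))
    print(replacement.list_orders())  # ['A-1']
```

The replacement instance sees the data written by the first one because the storage outlives both of them. On a real platform the directory would be whatever external storage mechanism that platform provides.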
Third, the delivery of configuration data is external to the image. This should be injected at runtime, and possibly updated during runtime in some cases. This is important because it makes it possible to use the same immutable image in Development, Staging and Production environments, without modification. In other words, the same workload that you validated in Development and Staging is exactly what is delivered to Production.
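One common way to inject configuration at runtime is through environment variables. The sketch below assumes that approach; the variable names (`APP_ENV`, `DB_HOST`, `DB_USER`, `DB_PASSWORD`) and the `AppConfig` type are hypothetical and chosen only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    environment: str
    db_host: str
    db_user: str
    db_password: str


# Hypothetical variable names; nothing here is baked into the image.
REQUIRED = ("DB_HOST", "DB_USER", "DB_PASSWORD")


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build the configuration from values injected when the instance starts."""
    missing = [key for key in REQUIRED if key not in env]
    if missing:
        # Fail fast so a misconfigured instance never starts serving traffic.
        raise RuntimeError("missing required configuration: " + ", ".join(missing))
    return AppConfig(
        environment=env.get("APP_ENV", "development"),
        db_host=env["DB_HOST"],
        db_user=env["DB_USER"],
        db_password=env["DB_PASSWORD"],
    )
```

The same image can then run in Staging with `APP_ENV=staging` and one set of database credentials, and in Production with another, without being rebuilt.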
Parameters
Some important parameters to consider when choosing whether to move to VMs or containers include:
- How much work will it take to implement?
- How much manual work can be eliminated from the process?
- What is the cost to run the workloads?
- Will the target state achieve the required recovery time objective (RTO), recovery point objective (RPO) and service level objective (SLO)?
- Will the target state deliver required performance?
These parameters will often be evaluated on a case-by-case basis, focusing on the workload. In some cases it may be appropriate to standardize on a single target state, even if that introduces some inefficiencies for individual workloads. Other times it will be better to let the workload dictate the right target state.
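One way to make a case-by-case evaluation repeatable is a simple weighted score per workload. The sketch below is an assumption of mine rather than a prescribed method: the parameter keys, the weights and the 1 to 5 rating scale (higher is better) are all hypothetical and should be replaced with whatever your organization agrees on.

```python
from typing import Dict

# Hypothetical weights, one per parameter in the list above.
WEIGHTS: Dict[str, float] = {
    "implementation_effort": 0.15,
    "manual_work_eliminated": 0.25,
    "run_cost": 0.20,
    "recovery_objectives": 0.20,
    "performance": 0.20,
}


def score(ratings: Dict[str, float]) -> float:
    """Weighted sum of 1-5 ratings, where a higher rating is always better."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError("missing ratings: " + ", ".join(sorted(missing)))
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def best_target(candidates: Dict[str, Dict[str, float]]) -> str:
    """Return the name of the candidate target state with the highest score."""
    return max(candidates, key=lambda name: score(candidates[name]))
```

For a given workload you would rate each candidate target state (for example `"vm"` and `"container"`) against every parameter and pass the ratings to `best_target`. The numbers only structure the discussion; they do not replace judgment about the workload.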
Variations of this approach
I previously discussed how cloud native maps onto traditional management of a Software Development Life Cycle. This view goes deeper into how workload management and design can accommodate the incremental steps that organizations take toward automating their infrastructure. It’s also important to note that, while these patterns are most accessible in Kubernetes, they work in non-containerized cloud environments too. This includes the obvious AWS, Azure and Google clouds, but also VMware and OpenStack.
The next step beyond this is to ensure that the images are managed through an automated pipeline. I’ll cover that in another post. When I do, it will become clearer how managing many images, as shown in the bottom right of the above diagram, is easier than it might first appear.