Kubernetes is getting a lot of attention recently, and there is good reason for that. Docker containers alone are little more than a developer convenience. Orchestration moves containers from laptop into the datacenter. Kubernetes does that in a way that simplifies development and operations. Unfortunately I struggled to find easy to understand high level descriptions of how kubernetes worked, so I made the diagram below.
While I don’t show the operator specifically (usually someone in IT, or a managed offering like GKE), everything in the yellow box would be managed by the operator. The nodes host pods. The master nodes facilitate orchestration. The gateway facilitates incoming traffic. A bastion host is often used to manage members of the cluster, but isn’t shown above.
Persistent storage, in the form of block storage, databases, etc. is also not directly part of the cluster. There are kubernetes resources, like Persistent Volumes, that can be used to associate external persistent storage with pods, but the management of the storage solution is outside the scope of managing kubernetes.
The Developer performs two primary activities: Create docker images and tell kubernetes to deploy them. Kubernetes is not opinionated when it comes to container image design. This means a developer can choose to manage ports, volumes, libraries, runtimes, and so on in any way that suits him. Kubernetes also has no opinion on container registry, as long as it is compliant with the docker registry v2 spec.
There are a few ways to deploy a container image to kubernetes. The diagram above shows two, one based on an application definition (usually in YAML format) and the other using shortcut commands. Both are typically triggered using the kubectl CLI. In either case, the developer gives kubernetes a desired state that includes details about which container image, how many replicas and exposed ports, for example. Kubernetes then assumes the job of ensuring that the desired state is the actual state. When nodes in a cluster change, or containers fail, kubernetes acts in realtime to to what is necessary to get back to the desired state.
The consumer doesn’t need to know anything about the kubernetes cluster, it’s members or even that kubernetes is being used. The gateway shown might be an nginx reverse proxy or HAProxy. The important point is that the gateway needs to be able to route to the pods, which are generally managed on a flannel or calico type network. It is possible to create redundant gateways and place them behind a load balancer.
Services are used to expose pods (and deployments). Usually the service type of LoadBalancer will trigger an automatic reconfiguration of the gateway to route traffic. Since the gateway is a single host, each port can be used only once. To get around this limitation, it is possible to use an ingress controller to provide name based routing.
Kubernetes definitely has its share of complexity. Depending on your role, it can be very approachable. Cluster installation is by far the most difficult part, but after that, the learning curve is quite small.
IT general controls are important for various reasons, such as business continuity and regulatory compliance. Traditionally, controls have focused on the infrastructure itself. In the context of long running servers in fixed locations, this was often an effective approach. As virtualization and container technologies become more prevalent, especially in public cloud, infrastructure focused IT controls can start to get in the way of realizing the following benefits:
- Just in time provisioning
- Workload migration
- Network isolation
- Tight capacity management
- Automated deployments
- Automated remediation
One way to maintain strong IT controls can still get the above benefits is to shift the focus of those controls away from the infrastructure and instead focus on routing (traffic management).
As shown above, a focus on routing ensures that IT can control where production traffic is routed, including production data. Engineering teams are free to deploy as needed and automation can be used freely. Since infrastructure is replaced with each deployment, rather than updated, there is no need to maintain rigid controls around any specific server, VM or container.
In the diagram shown, a gateway is used to facilitate routing. Other mechanisms, like segregated container image repositories and deployment environments may also be appropriate.
I found this article on serverwatch today: http://www.serverwatch.com/server-trends/why-kubernetes-is-all-conquering.html
It’s not technically deep, but it does highlight the groundswell of interest for and adoption of kubernetes. It’s also worth noting that GCE and Azure will now both have a native, fully managed kubernetes offering. I haven’t found a fully managed docker datacenter offering, but I’m sure there is one. It would be interesting to compare the two from a public cloud offering perspective.
I’ve worked a lot with OpenStack for on premises clouds. This naturally leads to the idea of using OpenStack as a platform for container orchestration platforms (yes, I just layered platforms). As of today, the process of standing up Docker Datacenter or kubernetes still needs to mature. Last month eBay mentioned that it created its own kubernetes deployment tool on top of openstack: http://www.zdnet.com/article/ebay-builds-its-own-tool-to-integrate-kubernetes-and-openstack/. While it does plan to open source the new tool, it’s not available today.
One OpenStack Vendor, Mirantis, provides support for kubernetes through Murano as their preferred container solution: https://www.mirantis.com/solutions/container-technologies/. I’m not sure how reliable Murano is for long term management of kubernetes. For organizations that have an OpenStack vendor, support like this could streamline the evaluation and adoption of containers in the enterprise.
I did find a number of demo, PoC, kick the tires examples of Docker datacenter on OpenStack, but not much automation or production support. I still love the idea of using the Docker trusted registry. I know that kubernetes provides a private registry component (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/registry), but it’s not as sophisticated as Docker Trusted Registry in terms of signing, scanning, etc. However, this functionality is quickly making its way into kubernetes, with some functionality already available in alpha: https://github.com/kubernetes/kubernetes/issues/22888
On the whole, I’m more drawn to kubernetes from a wholistic point of view, but Docker is effectively keying into some real enterprise concerns. Given the open source community and vendor investment in kubernetes, I expect the enterprise gap (like a trusted registry for kubernetes) will close this year.
Container orchestration is at the heart of a successful container architecture. Orchestration takes as input a definition of how a deployed application should look. This usually includes how many containers for a certain image are needed, volumes for persistent data, networking for communication between containers and awareness of various discovery mechanisms. Discovery may include such things as identifying other containers which are also participating with the application or how to access services required by the running containers. Here’s a high level view.
Containers need infrastructure to run. Both virtual and physical infrastructure can be used to host containers. Some argue that it’s better to run containers directly on physical servers to get the maximum performance. While there are performance benefits, there is also more operational overhead in standing up and maintaining physical servers. Automation available in virtual environments often makes it easier to provision, monitor and remediate servers. Using virtual infrastructure also makes it possible to share capacity between different types of workloads, where some may not be optimized for containers. Tools like Docker cloud (formerly Tutum) and Rancher can streamline operations for virtual environments.
If all workloads will be containerized and top performance is critical, favor a physical deployment. If some applications will still require IaaS and capacity will be shared between various types of workloads, choose virtual.
Orchestration is the process by which containers are managed to ensure that a predefined application configuration is maintained. These often require a plain text definition (usually YAML) of which container images are wanted, networking between those containers, mounted volumes, etc. The orchestration tool is then given this definition, which it uses to pull the necessary images and create containers, setup networking and mount storage.
Kubernetes (http://kubernetes.io/) is was originally contributed to the open source community by Google and was based on their decade old container technology Borg. It aims to be a comprehensive container management platform providing everything from orchestration to monitoring to service and discovery and more. It abstracts the container technology in what it calls a pod, making it possible to use Docker or rkt or any other technology that comes around in the future. For many people the appeal of this platform is that it has no direct tie back to a commercial vendor, so investment is more likely to be driven by the community.
Docker Swarm is Docker’s orchestration layer. It is designed to integrate seamlessly with other Docker tools, including the Docker daemon and registry tools. Some of the appeal to Swarm has to do with simplicity. Swarn is more narrowly focused than kubernetes, which may suggest better focus and more flexibility in choosing the right solutions for each container management need, althought it’s optimal to stick with Docker solutions.
As containers continue to grow in prominence, some PaaS solutions, such as cloudfoundry, are reworking their narrative to position themselves as container management systems. It is true that the current version of cloudfoundry supports direct deployment of Docker container images and provides platform components, like routing, health management and scaling. Some drawbacks to using a PaaS for container orchestration is that deployments become more prescriptive and it provides less granular control over container deployment and interactions.
Container images can be created in several ways, including using a mechanism like Dockerfile, or using other automation tools. Container images should never contain credentials or other sensitive data (see Discovery below). In some cases it may be appropriate to host an internal container registry. External registry options that provide private images may provide sufficient protection for some applications.
Another aspect of image security has to do with vulnerabilities. Some registry solutions provide image scanning tools that can detect vulnerabilities or out of date packages. When external images are used as a base for internal application images, these should be carefully curated and confirmed to be safe before using them to derive application images.
One motivation behind containerization is that it better accommodates Continuous Integration (CI) and Continuous Delivery (CD). When building CI/CD pipelines, it’s important that the orchestration layer make it easy to automate to lifecycle of containers for unittests, functional tests, load tests and other automatic verification of the current state of an applicaiton. The CI/CD pipeline may be responsible for both triggering container creation as well as creating the container image.
Two way communication with CI/CD tooling is important so that the end result of testing and validation can be reported and possibly acted on by the CI/CD tool affecting later stages.
Discovery is the process by which a container identifies other containers and services or registers itself to be found by other containers with which it participates in order to function. Discovery may include scenarios such as finding a database or static file storage with data necessary to run, or identifying other containers across which requests are distributed in order to accommodate synchronization.
Two common solutions for Discovery include a distributed key/value store and DNS. A distributed key/value store, such as etcd, ensures that each physical node hosting containers has a synchronized set of key/value data. In this scenario, the orchestration tool can add details about newly created containers to the key/value store so that existing containers are aware of them. New containers can query the key/value store to identify related containers and services.
DNS based discovery (a popular tools is Consul) is very similar, except that DNS is used to manage resolution of services and containers based on URLs. In this way, new containers can simply call the predetermined URL and trust that the request will be routed to the appropriate container or resource. As containers change, DNS is updated in realtime so that no changes are required on individual containers.
I hear a lot of people talking about cloud native applications these days. This includes technologists and business managers. I have found that there really is a spectrum of meaning for the term cloud native and that two people rarely mean the same thing when they say cloud native.
At one end of the spectrum would be running a traditional workload on a virtual machine. In this scenario the virtual host may have been manually provisioned, manually configured, manually deployed, etc. It’s cloudiness comes from the fact that it’s a virtual machine running in the cloud.
I tend to think of cloud native at the other end and propose the following definition:
The ability to provision and configure infrastructure, stage and deploy an application and address the scale and health needs of the application in an automated and deterministic way without human interaction
The activities necessary to accomplish the above are:
- Build and Test
- Scale and Heal
Provision and Configure
The following diagram illustrates some of the workflow involved in provisioning and configuring resources for a cloud native application.
You’ll notice that there are some abstractions listed, including HEAT for openstack, CloudFormation for AWS and even Terraform, which can provision against both openstack and AWS. You’ll also notice that I include a provision flow that produces an image rather than an actual running resource. This can be helpful when using IaaS directly, but becomes essential when using containers. The management of that image creation process should include a CI/CD pipeline and a versioned image registry (more about that another time).
Build, Test, Deploy
With provisioning defined it’s time to look at the application Build, Test and Deploy steps. These are depicted in the following figure:
The color of the “Prepare Infrastructure” activity should hint that in this process it represents the workflow shown above under Provision and Configure. For clarity, various steps have been grouped under the heading “Application Staging Process”. While these can occur independently (and unfortunately sometimes testing never happens), it’s helpful to think of those four steps as necessary to validate any potential release. It should be possible to fully automate the staging of an application.
The discovery step is often still done in a manual way using configuration files or even manual edits after deploy. Discovery could include making sure application components know how to reach a database or how a load balancer knows to which application servers it should direct traffic. In a cloud native application, this discovery should be fully automated. When using containers it will be essential and very fluid. Some mechanisms that accommodate discovery include system level tools like etcd and DNS based tools like consul.
Monitor and Heal or Scale
There are loads of monitoring tools available today. A cloud native application requires monitoring to be close to real time and needs to be able to act on monitoring outputs. This may involve creating new resources, destroying unhealthy resources and even shifting workloads around based on latency or other metrics.
Tools and Patterns
There are many tools to establish the workflows shown above. The provision step will almost always be provider specific and based on their API. Some tools, such as terraform, attempt to abstract this away from the provider with mixed results. The configure step might include Ansible or a similar tool. The build, test and deploy process will likely use a tool like Jenkins to accomplish automation. In some cases the above process may include multiple providers, all integrated by your application.
Regardless of the tools you choose, the most important characteristic of a cloud native application is that all of the activities listed are automated and deterministic.