VMworld EU 2018 – Deep Dive: The Value of Running Kubernetes on vSphere

Frank Denneman, Chief Technologist, VMware

Michael Gasch, Customer Success Architect – Application Platforms, VMware

The first session at VMworld I managed to capture for this blog is one I was looking forward to and that a number of people had recommended: Frank Denneman and Michael Gasch covering Kubernetes on vSphere. Their session set out to answer why you would run Kubernetes on vSphere rather than on bare metal.

Michael starts with a timeline of Kubernetes, going all the way back to Google’s problems scaling to support distributed systems like Google Search.

Google’s answer was to build Borg as a cluster manager which would take over all of the tasks in managing infrastructure for distributed systems. This was around 2003. Google still run on Borg.

In 2007, Google helped write what became cgroups, to help with isolation and process resource management for applications sharing the same hardware.

In 2013, Docker created a common language for deploying applications using containers.

Michael took a moment to describe the difference between containers and virtual machines, and to point out that containers are not always better. Ultimately, the host OS has no knowledge of the container.

This all led to managing containers across multiple hosts and the birth of Kubernetes in 2015.

Michael then ran through a high-level architecture of Kubernetes.

Customer scenario

Next, to help illustrate the advantages of using vSphere for Kubernetes, Michael described a fictional company called ABC Inc, whose Linux team had decided to use bare metal for Kubernetes. ABC Inc has siloed teams, is 90% on vSphere and is in the middle of a digital transformation project.

Frank stepped in to describe how going bare metal means going back in time, as you lose capabilities we now pretty much take for granted, such as vMotion.

To use ABC Inc as a case study and illustrate the advantages vSphere brings, the rest of the presentation was broken down into four stages:

  1. Day 0; Planning
  2. Day 1; First Deployments
  3. Day 2; Container Sprawl
  4. Day 3; Maintenance and Availability

For each stage, Michael described the challenges of the bare metal approach and Frank then described how vSphere helps.

Day 0; Planning and Installing on bare metal

Provision hardware and set up lots of software. Build a custom cloud provider. Manage lots of external integrations. That is a lot for a team to set up, manage and integrate.

How can vSphere help

Frank jumps in to talk about how slow it is to provision physical hardware. In the US it takes an average of 86 days to get a physical server deployed.

Bare metal means managing hardware compatibility yourself, whereas vSphere abstracts and standardises the hardware, with support based on VMware’s HCL (Hardware Compatibility List).

With vSphere, patching hardware does not mean taking workloads offline, as you can use vMotion to move VMs off a host first. With 7 out of 10 workloads being stateful (according to Datadog), this is an overlooked consideration for managing container workloads.

In addition, vSphere provides a flexible security boundary, as you can wrap right-sized VMs around a given set of containers.

Day 1; Experiences with first deployments

Michael takes a step back to describe what an app will do when running on top of a physical host. He states that many people consume a service to get VMs and have forgotten or never experienced why we implemented virtualisation in the first place.

Kubernetes on bare metal helps with scheduling decisions for applications, rather than running them all together in the underlying host. However, resources can be wasted, runtimes are not aware they are sharing with other pods and it can be difficult to tune per workload.

How can vSphere help

Frank stepped back in to share his deep knowledge of the hypervisor. He talked about how vSphere is aware of Hyper-Threading and NUMA, so it can make intelligent decisions to optimise performance for container hosts.
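To make the NUMA point a little more concrete, here is a minimal Python sketch of my own, not something shown in the session and not how the ESXi scheduler actually works. It only checks whether a container host VM would fit inside a single NUMA node; the `NumaNode` class, the function name and the host sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaNode:
    """One NUMA node: the cores and memory local to a single socket."""
    cores: int
    memory_gb: int


def fits_single_node(vcpus: int, memory_gb: int, node: NumaNode) -> bool:
    """Return True if the VM's vCPUs and memory both fit within one NUMA node."""
    return vcpus <= node.cores and memory_gb <= node.memory_gb


# Hypothetical host: each NUMA node has 12 cores and 192 GB of local memory.
node = NumaNode(cores=12, memory_gb=192)

print(fits_single_node(8, 64, node))   # True: all memory access can stay local
print(fits_single_node(16, 64, node))  # False: the VM has to span NUMA nodes
```

A VM that does not fit has to be spread across nodes, which is where a NUMA-aware scheduler earns its keep.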

Day 2; Container and Cluster Sprawl

Once Kubernetes becomes successful, other teams will want more containers, and more clusters. Clusters for staging, test, prod etc.

How can vSphere help

vSphere provides multi-tenancy so multiple clusters can be created and separated effectively, and DRS can distribute load.
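As a rough illustration of what "distributing load" means here, the sketch below places node VMs from several clusters across hosts using a simple greedy rule: biggest VM first, onto whichever host is currently least loaded. This is my own simplification and not the DRS algorithm; the VM names, host names and vCPU demands are made up.

```python
import heapq


def place_vms(vm_demands: dict[str, int], hosts: list[str]) -> dict[str, list[str]]:
    """Greedy placement: largest VM first, always onto the least-loaded host."""
    # Min-heap of (current load, host name) so the least-loaded host pops first.
    heap = [(0, host) for host in hosts]
    heapq.heapify(heap)
    placement: dict[str, list[str]] = {host: [] for host in hosts}

    for vm, demand in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
        load, host = heapq.heappop(heap)
        placement[host].append(vm)
        heapq.heappush(heap, (load + demand, host))
    return placement


# Hypothetical node VMs from prod, staging and test clusters (demand in vCPUs).
vms = {
    "prod-worker-1": 8,
    "prod-worker-2": 8,
    "staging-worker-1": 4,
    "test-worker-1": 2,
    "test-worker-2": 2,
}
for host, placed in place_vms(vms, ["esx-a", "esx-b"]).items():
    print(host, placed, sum(vms[vm] for vm in placed))
```

With these example numbers both hosts end up carrying 12 vCPUs of demand, even though the VMs belong to three different clusters.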

Day 3; Maintenance and Availability

What happens when you have a bare metal master node failure? There are no alternative hosts for services to restart on, and there is no admission control to stop contention when failures occur.

For worker nodes, Kubernetes by default waits 5 minutes before starting pods on a new host and 6 minutes before reattaching volumes from failed hosts.
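To put those two defaults into perspective, here is a small Python sketch of my own (not from the session). It treats the 5 and 6 minute figures quoted above as plain constants rather than reading them from a cluster. Whether the two timers overlap or run back to back is an assumption I have left as a parameter, and the function name is purely illustrative.

```python
from datetime import timedelta

# Figures quoted in the session, used here as plain inputs.
POD_RESCHEDULE_WAIT = timedelta(minutes=5)
VOLUME_REATTACH_WAIT = timedelta(minutes=6)


def recovery_wait(stateful: bool, sequential: bool = False) -> timedelta:
    """Rough wait before a pod from a failed worker can run on another host.

    stateful:   the pod needs a volume that was attached to the failed host.
    sequential: assume the volume timer only starts after the pod timer ends
                (an assumption; set to False to treat the timers as overlapping).
    """
    if not stateful:
        return POD_RESCHEDULE_WAIT
    if sequential:
        return POD_RESCHEDULE_WAIT + VOLUME_REATTACH_WAIT
    return max(POD_RESCHEDULE_WAIT, VOLUME_REATTACH_WAIT)


print(recovery_wait(stateful=False))                  # 0:05:00
print(recovery_wait(stateful=True))                   # 0:06:00
print(recovery_wait(stateful=True, sequential=True))  # 0:11:00
```

Either way, a stateful workload is looking at several minutes of downtime before anything restarts, which is a long time compared with a VM restart on another host.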

How can vSphere help

vSphere can provide HA restart priorities and HA dependencies. These could be used to make sure the control plane starts before the worker nodes. DRS affinity rules can be used to separate master nodes across different hosts, or to group them on one site in a stretched cluster. Proactive HA can help move VMs before a host fails, or prevent scheduling on a host that is suffering hardware degradation.
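The idea of restart dependencies can be sketched as a dependency graph: each VM lists the VMs that must be up before it. The Python below is my own illustration using the standard library's `graphlib` (Python 3.9+), not the vSphere API, and the VM names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical VM names. Each key maps to the VMs that must be running first.
MASTERS = {"k8s-master-1", "k8s-master-2", "k8s-master-3"}
depends_on = {
    "k8s-master-1": set(),
    "k8s-master-2": set(),
    "k8s-master-3": set(),
    "k8s-worker-1": MASTERS,
    "k8s-worker-2": MASTERS,
}


def restart_order(deps: dict[str, set[str]]) -> list[str]:
    """Return one valid restart order in which dependencies always come first."""
    return list(TopologicalSorter(deps).static_order())


order = restart_order(depends_on)
print(order)  # the three masters, in some order, followed by the workers
```

The exact order within the masters (or within the workers) is not fixed; the only guarantee is that no worker is restarted before the control plane VMs it depends on.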

And with that, Frank and Michael quickly wrapped up as they had run out of time!
