Can MicroCloud kill K8S?

As highlighted in a recent blog post, Canonical just unveiled their latest product, “MicroCloud”: an interesting, lightweight approach to a VM (and container) high-availability cluster, based on LXD, Ceph and OVN.

It promises to be easy to deploy, manage and grow. So how does it overlap with Kubernetes? Are they competing in any way? Or can they work together nicely?

MicroCloud and K8S look similar

MicroCloud is the sum of all the parts in Canonical's “micro” line of products, creating a unified product around:

  • LXD
  • MicroCeph
  • MicroOVN

Its goal is to deploy, in a single command and a few seconds, a high-availability cluster on which you can deploy LXC containers and VMs. The Ceph and OVN layers allow for VM migrations and automated hot-restarts on a different node, should a node go down.
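As a rough sketch of what that looks like in practice (assuming the standard snaps from Canonical's documentation, installed on every node of the future cluster):

    # On every node, install the components:
    snap install lxd microceph microovn microcloud

    # On one node, start the interactive bootstrap; it discovers the other
    # nodes and configures LXD, Ceph and OVN across the cluster:
    microcloud init

    # Verify that all nodes have joined:
    lxc cluster list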

This really is close to the K8S promise: “Set up a cluster of physical nodes, and K8S will spread your workloads over them (as “pods”). Should a node go down, your services are kept up by the other nodes.”
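For comparison, here is a minimal sketch of that promise on the K8S side (the Deployment name, image and replica count are illustrative):

    # Declare the desired state; K8S schedules the replicas across nodes
    # and reschedules them if a node fails:
    kubectl apply -f - <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3
      selector:
        matchLabels: { app: web }
      template:
        metadata:
          labels: { app: web }
        spec:
          containers:
          - name: web
            image: nginx:1.25
    EOF

    # Check which node each pod landed on:
    kubectl get pods -o wide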

Different use-cases

K8S is really great for hosting mass-consumed products, such as databases, ML frameworks, blogs, etc. It makes their deployment easy and their maintenance light. Most of these products require databases (sometimes several of them), depend on other software, and call for serious configuration and networking preparation. Imagine having to set all that up by hand (and manage the underlying OS as well).

With K8S, it is often a matter of pulling an image, setting a handful of variables and managing ingress. Not that trivial, but a huge gain of time and effort compared with the “bare-VM” plan.
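For instance, a typical Helm-based deployment might look like this (chart and values taken from the public Bitnami WordPress chart; the hostname and credentials are illustrative):

    # One command deploys the blog, its database and its ingress rules:
    helm repo add bitnami https://charts.bitnami.com/bitnami
    helm install my-blog bitnami/wordpress \
      --set wordpressUsername=admin \
      --set ingress.enabled=true \
      --set ingress.hostname=blog.example.com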

K8S is great for scaling (in public clouds). A.I. and M.L. are typical consumers of huge amounts of resources, for a limited amount of time. Need to train a new model? You need a large number of workload units for a few hours; then you can switch them all off and proceed to analysis and redesign. You do not want to manage servers, OSes, or anything else. You just need “compute power as a service”, which is basically what public clouds offer (and are good at).
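In K8S terms, that elasticity is essentially a one-liner (the deployment name and replica counts are illustrative):

    # Scale the training workers up for the run...
    kubectl scale deployment training-workers --replicas=50

    # ...and back down to zero once it is done, so the cloud
    # autoscaler can release the underlying nodes:
    kubectl scale deployment training-workers --replicas=0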

LXD is best used with proprietary/internal software. Not all companies on this planet use exclusively “off-the-shelf” software; many develop their own. And many of these companies do not work in A.I. and actually need stable, constant and easy-to-use infrastructure. They have no use for managing an extra layer of virtualization for their apps, i.e. creating and maintaining Dockerfiles and external dependencies.

With LXD, you get a fresh, autonomous, “works like your server”, portable machine in seconds. No need to spend time on Dockerfiles, registries and all that jazz. Of course, you could always pull an Ubuntu image on Docker and launch an instance, but there are extra configurations, limitations and specifics that make it less easy. Docker is simply not made for this, and it will bite you sooner or later.
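For instance (the image alias and instance names are illustrative):

    # A full Ubuntu system container, up in seconds:
    lxc launch ubuntu:24.04 app-server

    # Or a real VM, with the exact same workflow:
    lxc launch ubuntu:24.04 app-vm --vm

    # Get a root shell, just as on any server:
    lxc exec app-server -- bash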

LXD is great for private clouds and bare metal. We have already established that public clouds are great for some use cases. But they can also prove expensive, and hard to monitor, set up and debug. If your company has a recurrent or constant need for some resources, public clouds are never the right choice, even with three-year commitment plans and 70% discounts. And what if you change your plans (up or down) within those three years? In these cases, LXD makes the most sense.

Can they still be friends?

At Garaia, when looking at resilience done simply in IT infrastructures, our recommendation rarely points to K8S, because the potential gains rarely outweigh the costs (often the cost of upskilling the IT team, or the budget, to integrate K8S proficiency). That does not mean we disregard K8S or would not implement it.

As discussed, K8S and LXD have different goals, and they are not incompatible.

As many K8S deployments are currently based either on bare metal or on individual VMs, availability really relies on the K8S HA layer. So why not deploy your K8S cluster on top of clustered LXD VMs (and probably, soon, containers)? If a bare-metal node went down, it would not stay down for long: LXD would soon spin the VM back up, in the state it was in, on another machine. You would then have two layers of high availability on your infra. Is “hyper-available” even a thing?
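A rough sketch of that layered setup (instance and node names are illustrative, and MicroK8s stands in for whichever K8S distribution you prefer):

    # Three VMs on the LXD cluster, pinned to different physical nodes:
    for i in 1 2 3; do
      lxc launch ubuntu:24.04 k8s-$i --vm --target node0$i
    done

    # Inside each VM, install a K8S node:
    lxc exec k8s-1 -- snap install microk8s --classic

    # Then join the nodes into one cluster with 'microk8s add-node'
    # and 'microk8s join', as usual.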

Clearly, adding a layer of availability adds a layer of complexity, so this is not something we would recommend except for very specific and highly justified use cases.
