Kubernetes vs Docker: The Backbone of Modern Backend Technologies

The two most popular technologies to date for managing and deploying containerized applications are Docker and Kubernetes.

Docker is a tool that enables developers to create, deploy, and run applications in self-contained units known as containers, making it easier to package, distribute, and manage software on individual machines.

Kubernetes, on the other hand, is an orchestration tool that automates the deployment, scaling, and management of containerized applications in a cluster environment.

Although different, these two technologies often work in tandem and offer several advantages, such as portability and scalability, while enabling faster and more reliable software delivery.

In this article, we'll explore the similarities and core differences between Docker and Kubernetes, and discuss scenarios where one technology might be preferred over the other.

[#overview-of-docker] An overview of Docker [#overview-of-docker]

Docker is an open-source software running on Linux, macOS, and Windows that enables developers to automate the development, deployment, and maintenance of applications by packaging them and their dependencies together into self-sufficient units called containers.

It allows developers to manage every step of a container's lifecycle including building container images, running containers, interacting with running containers, and publishing images to a hosted registry.

It relies on the concept of containerization: making applications self-contained, isolated, and portable so that they run consistently and reliably regardless of the underlying operating system.

[#core-concepts] Core Docker concepts [#core-concepts]

Docker revolves around two core concepts: containers and images.

A Docker container is an executable software package running in an isolated environment, with its own filesystem, virtual network, and process namespaces, that bundles an application and its dependencies.

A Docker image, on the other hand, is the template from which containers are created. It includes the base operating system files, application code, tools, libraries, dependencies, and configurations required to create a runnable container instance.

[#images-and-dockerfiles] Docker images and Dockerfiles [#images-and-dockerfiles]

A Docker image is an immutable, stand-alone, and executable package that contains everything needed to run a piece of software, including the runtime, filesystem, system tools, application's code, libraries, and settings.

An image is composed of a base image, which typically includes a minimal operating system, a runtime environment, and some essential tools required to run software applications, and additional layers that represent modifications and additions made to that base image.

It serves as a template for creating Docker containers.

Both base and custom images are usually stored on and downloaded from servers called Docker registries, like the official Docker Hub registry.

Dockerfiles

A Dockerfile is a plain text file that contains all the necessary instructions to create a Docker image.

It serves as a template that describes how to build a custom image from a base image, step by step, and usually includes instructions for copying data and application files, defining environment variables, installing packages and tools, and more.

When you build the Docker image using the [.inline-code]docker build[.inline-code] command, Docker takes the base image and applies the instructions present in the Dockerfile to create the final image.

For example, the following Dockerfile contains all the necessary instructions to build an image that packages an Nginx server running on the latest version of Ubuntu:

FROM ubuntu:latest

RUN apt-get update && apt-get install -y nginx

COPY test_html /var/www/html

RUN ln -sf /dev/stdout /var/log/nginx/access.log

EXPOSE 80
CMD ["/usr/sbin/nginx", "-g", "daemon off;"]

Where:

  • [.inline-code]FROM[.inline-code] is used to define the base image.
  • [.inline-code]RUN[.inline-code] is used to execute arbitrary commands.
  • [.inline-code]COPY[.inline-code] is used to copy files from the local environment into the image.
  • [.inline-code]EXPOSE[.inline-code] is used to document the ports used by the container.
  • [.inline-code]CMD[.inline-code] is used to define the default command that will be executed by the container launched from this image.
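
Assuming this Dockerfile and the [.inline-code]test_html[.inline-code] directory live in the current directory, you could build the image and run a container from it as follows (the [.inline-code]my-nginx[.inline-code] tag is an arbitrary name chosen for this example):

# "my-nginx" is a placeholder tag for this example
$ docker build -t my-nginx .
$ docker run -d -p 8080:80 my-nginx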

You can learn more about Dockerfiles by visiting the official Dockerfile Reference page.

[#containers] Docker containers [#containers]

A Docker container is a runnable instance of a Docker image.

More specifically, it is a lightweight, standalone, and executable package that bundles a complete application, including its configuration and dependencies, making it easy to run in a consistent and predictable manner on different environments, such as development machines or production servers, regardless of the underlying infrastructure.

Although containers use the host operating system's kernel and resources, they are isolated from it through the use of namespaces, which prevents any unwanted interference between containers or with the host system itself. This isolation is what enables multiple containers to run concurrently on a single host without conflicts or security risks.

Moreover, containers have their own separate filesystem, virtual network, and processes, allowing them to run their applications autonomously.

Just like any other process, containers can be created, started, stopped, and removed. They can also be customized: you can define how many resources are allocated to each container, what their environment looks like, how they store their data, and so on.
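
As a quick sketch of this lifecycle, the following commands launch an Nginx container with custom resource limits, stop it, restart it, and finally remove it (the [.inline-code]web[.inline-code] name is arbitrary):

# create a container named "web" (arbitrary name) with resource limits
$ docker run -d --name web --memory 512m --cpus 1.5 -p 8080:80 nginx:latest
$ docker stop web
$ docker start web
$ docker rm -f web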

[#compose] Docker Compose [#compose]

Docker Compose is a tool that simplifies the management of multi-container Docker applications by enabling users to define the entire application stack and its configuration into a single, easy-to-read YAML file.

For instance, a typical web app might include a web server, a database, and maybe other containers like a backend service or a queue. These containers often need various settings, such as environment variables, port forwarding, network configurations, and so on.

Managing all these separate settings with numerous Docker commands is neither efficient nor easy to share with others.

An overview of Docker Compose files

Docker Compose files are written in the YAML format and are, by default, named [.inline-code]compose.yaml[.inline-code].

They usually contain a list of:

  • Services: the containers, their configurations, and their relationships.
  • Networks: the virtual networks that facilitate communication between the different services.
  • Volumes: the persistent data storage used by the services.
  • Configs: the service configuration data stored separately from the containers' filesystems.
  • Secrets: the sensitive data used by the services, such as passwords or tokens.

For example:

services:
  web_server:
    image: nginx:latest
    ports:
      - "80:80"
    networks:
      - my_network
    volumes:
      - ./nginx_config:/etc/nginx/conf.d
  
  website:
    build:
      context: ./website
      dockerfile: Dockerfile
    ports:
      - "8080:80"
    networks:
      - my_network
  
  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    ports:
      - "5000:5000"
    networks:
      - my_network

networks:
  my_network:
    driver: bridge

This Compose file defines three services: an Nginx web server, a custom server running a website, and a custom server running a web API.
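
With this file in place, the entire stack can be started, inspected, and torn down with single commands:

$ docker compose up -d
$ docker compose ps
$ docker compose down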

You can learn more about Compose by visiting the official Docker Compose page.

[#overview-of-kubernetes] An overview of Kubernetes [#overview-of-kubernetes]

Kubernetes is an open-source container orchestration tool, built on top of a container runtime such as containerd (the runtime that also powers Docker), that is designed to automate the deployment, scaling, and management of containerized applications, ensuring high availability and resilience.

It has several notable features such as automatic load balancing, self-healing capabilities, and automated rollouts and rollbacks.

Furthermore, Kubernetes offers a declarative approach to defining the desired state of your application, allowing for automatic adjustments as necessary, thus reducing the operational overhead.

[#core-kubernetes-concepts] Core Kubernetes concepts [#core-kubernetes-concepts]

Kubernetes revolves around three core concepts: Pods, Nodes, and clusters.

A Pod is the smallest deployable unit of computing in Kubernetes. It runs on a Node and hosts one or more tightly coupled containers that share storage and network resources.

A Node, also called Worker Node, is an individual server within a cluster that can run multiple Pods, providing the resources and runtime environment for containers.

A cluster is a collection of Nodes that are managed by the control plane and that work together to run containerized applications by providing the entire computing infrastructure.

In short, Pods are created in a declarative manner using configuration files called specifications, which describe their desired state in the cluster. Based on these specifications, the control plane schedules the Pods onto the various Nodes of the cluster, which are responsible for running the Pods' containers. Kubernetes then automatically handles events such as crashes or reboots to make sure that the actual state of the cluster keeps reflecting the user-defined specifications.

[#control-plane] The Kubernetes control plane [#control-plane]

The control plane is the central management component that administers the entire Kubernetes cluster.

It is in charge of monitoring the cluster and automatically reconciling the actual cluster state with the desired state defined in the specifications, by scheduling, adding, or removing Pods to and from the various cluster Nodes.

It is composed of several components, including:

  • The API server, which is responsible for processing RESTful API requests and serving as the gateway to the cluster's internal workings.
  • etcd, which is a database used by Kubernetes to store and manage the cluster's configuration data.
  • The scheduler, which is responsible for placing Pods onto suitable Nodes within the cluster.
  • The controller manager, which regulates the state of the system to ensure it matches the desired configurations specified by users.

[#worker-nodes] Kubernetes Worker Nodes [#worker-nodes]

Worker Nodes are the servers within a Kubernetes cluster that are responsible for running the containers in Pods and managing the associated resources, such as CPU, memory, and so on.

Nodes are composed of several components, including:

  • The Kubelet, which communicates with the control plane, is responsible for starting, stopping, and maintaining containers within the Pods.
  • The container runtime, which is the actual engine (e.g., containerd) responsible for running the containers.
  • The kube-proxy, which is responsible for generating the virtual network between all the nodes of the cluster, allowing them to communicate with each other and external services.
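
As an illustration, assuming kubectl is configured to talk to your cluster, you can list its Nodes and inspect their available resources with the following commands, where [.inline-code]<node-name>[.inline-code] is the name of one of your Nodes:

$ kubectl get nodes -o wide
$ kubectl describe node <node-name>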

[#pods] Kubernetes Pods [#pods]

In Kubernetes, a Pod is the smallest deployable unit, in which containers run. Each Pod can configure, run, and manage multiple containers and their associated resources, such as their image, environment, ports, volumes, and so on.

Containers running within the same Pod share the same network namespace and IP address, which allows them to communicate with each other through localhost. Additionally, they share the same storage volumes, which enables them to easily and efficiently share data.

Pods are declared in specifications files containing metadata about the Pods, such as their name or labels, and configuration about the containers running within the Pods, such as their name, image, ports, and so on.

For example, this specification defines a Pod named [.inline-code]nginx-k8s[.inline-code] that runs a single container named [.inline-code]mysite[.inline-code], which uses the latest [.inline-code]nginx[.inline-code] image and gives its port 80 the name [.inline-code]web[.inline-code]:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-k8s
  labels:
    app: frontend
spec:
  containers:
  - name: mysite
    image: nginx:latest
    ports:
    - containerPort: 80
      name: web
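
Assuming this specification is saved in a file named [.inline-code]nginx-pod.yaml[.inline-code] (an arbitrary name chosen for this example), you can create and inspect the Pod with kubectl:

# "nginx-pod.yaml" is a placeholder file name for the spec above
$ kubectl apply -f nginx-pod.yaml
$ kubectl get pods -l app=frontend
$ kubectl logs nginx-k8s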

[#controllers] Kubernetes Controllers [#controllers]

While Pods serve as the fundamental unit for deploying containers, they are often managed by higher-level abstractions called Controllers (such as Deployments, StatefulSets, and DaemonSets), which provide advanced features that Pods themselves don't have, like scaling, rolling updates, and self-healing.

[#using-docker-and-kubernetes] Using Docker and Kubernetes [#using-docker-and-kubernetes]

While both Docker and Kubernetes can be used to manage containers, they serve different purposes.

Docker is great for tasks related to the development of containerized applications, such as creating images and testing containers, as it offers rapid deployment, local file mounting, and easy image iteration.

Kubernetes, on the other hand, is designed for running containers in production environments. It excels in orchestrating containers across multiple servers and allows fine-grained configuration of security and access.

Although in certain cases Docker can be suitable for production, Kubernetes is better equipped for performing complex tasks, such as controlling access to specific container ports, managing permissions on public and private networks, implementing role-based access control, and handling scaling and replication across nodes.

Associated challenges

Getting started with Kubernetes can be challenging due to its focus on complex production requirements, resulting in numerous resource types and settings.

In contrast, Docker offers a more straightforward and predictable experience because it includes everything you need "out of the box."

While Kubernetes's modularity is a strength, it can also be quite confusing at first. Different distributions may have various network and storage plugins, DNS management methods, and firewall configurations.

While this flexibility accommodates diverse setups, understanding the core resource types and how these plugins and container interfaces work can be overwhelming and introduce too much overhead for simpler applications.

However, there are user-friendly Kubernetes distributions, such as minikube, Colima, or k3s, designed for development purposes.
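
For instance, assuming minikube is installed on your machine, spinning up a local development cluster takes a single command:

$ minikube start
$ kubectl get nodes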

Although they provide an easier way to get started, keep in mind that transitioning to Kubernetes in a production environment is a more extensive journey compared to the simplicity of Docker.

[#docker-swarm-vs-kubernetes] Docker Swarm vs Kubernetes [#docker-swarm-vs-kubernetes]

Docker Swarm is Docker's open-source container orchestration solution that allows developers to manage containers across multiple nodes of a cluster, called a swarm.

Like Kubernetes, it provides features such as network proxying, load balancing, service discovery, scaling, and high availability.

However, unlike Kubernetes, it is by design much simpler to set up and use, as it follows the Docker philosophy of favoring commands over configuration, whereas Kubernetes uses verbose and sometimes complex YAML configuration files.

For example, in Docker Swarm, to create a new service with 3 replicas and have them all accessible through a single network exposed port, you can use the following command:

$ docker service create --replicas 3 --name web-server -p 8111:80 nginx:latest

While, in Kubernetes, you need to create two YAML resources, a Deployment and a Service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
  labels:
    app: web-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-server
  template:
    metadata:
      labels:
        app: web-server
    spec:
      containers:
      - name: web-server-container
        image: nginx:latest
        ports:
        - containerPort: 80
          name: web
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: web-server
  name: web-server-service
spec:
  ports:
  - name: web
    nodePort: 30080
    port: 8111
    protocol: TCP
    targetPort: 80
  selector:
    app: web-server
  type: NodePort

And then apply this YAML file using the following command:

$ kubectl apply -f webserver.yaml

It's also worth noting that Kubernetes provides more advanced strategies for rolling updates and deployments, while Docker Swarm's rolling update mechanism is simpler but less flexible.
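
For day-to-day operations like scaling, however, both tools offer comparable one-liners, using the [.inline-code]web-server[.inline-code] service and Deployment defined above:

$ docker service scale web-server=5
$ kubectl scale deployment web-server --replicas=5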

In conclusion, Docker Swarm is a simpler and more lightweight container orchestration solution that may be a better fit for smaller projects or when Docker is the primary container runtime.

On the other hand, Kubernetes is more complex to work with as it offers a more extensive feature set, scalability, and low-level control over each component, making it suitable for complex, large-scale applications.

The choice between Docker Swarm and Kubernetes therefore depends on the specific needs and scale of your containerized application. 

[#real-life-example] A real-life scenario: Automating the deployment of an application with Docker and Kubernetes [#real-life-example]

Nowadays, the deployment of containerized applications in development or production environments often involves multiple technologies, tools, and steps, which are usually automated using a CI/CD (Continuous Integration / Continuous Deployment) pipeline.

In this example, we'll give you a high-level overview of a possible deployment process of a simple Node.js application involving Docker and Kubernetes.

Step 1: Developing the application

The first step before containerization is to develop the application on your local machine, which involves writing the code, configuring the dependencies, and testing the application itself to make sure everything is working as expected.

Step 2: Creating a Dockerfile

The second step is to write a Dockerfile that instructs Docker on how to package the application into a suitable production-ready image. Among other steps, this usually includes selecting a base image, setting up the environment, copying the application's code into the image, installing only the production dependencies, defining the start command that launches the application, and so on.

For example:

FROM node:18

WORKDIR /app

COPY package*.json ./

RUN npm install --omit=dev

COPY . .

EXPOSE 3000

CMD ["npm", "start"]

This Dockerfile sets the base image to Node version 18, sets the container's working directory to [.inline-code]/app[.inline-code], copies the [.inline-code]package.json[.inline-code] and [.inline-code]package-lock.json[.inline-code] files, installs the production dependencies only, copies the rest of the application files, and sets the [.inline-code]npm start[.inline-code] command as the default command to be executed on container startup.

Step 3: Building the Docker image

The third step is to build the Docker image using the previously defined Dockerfile and tag it appropriately with a name and version number, making it easier to track the different versions of the application.

$ docker build -t node-server:1.1 .

Step 4: Testing the Docker image

The fourth step is to test the packaged application by first launching a container based on the previously built image, and then verifying that the application is up and running, reachable over the network, and so on.

$ docker run -p 8080:3000 node-server:1.1
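
Since the container maps port 8080 on the host to the application's port 3000, a quick smoke test could be as simple as:

$ curl http://localhost:8080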

Step 5: Pushing the Docker image to a registry

The fifth step is to upload (i.e., publish) the image to a public or private Docker registry, such as Docker Hub, so that it can be downloaded on the development or production environment in a consistent and predictable manner. Note that images pushed to Docker Hub must first be tagged with your username (or organization) as a prefix:

$ docker tag node-server:1.1 <username>/node-server:1.1
$ docker push <username>/node-server:1.1

Step 6: Setting up a Kubernetes cluster

The sixth step is to set up a Kubernetes cluster where the application will be deployed after each build, by first connecting to the remote server over SSH:

$ ssh <username>@<host>

Where [.inline-code]username[.inline-code] is your login user name and [.inline-code]host[.inline-code] is the IP address or domain name of the remote server.

Installing the [.inline-code]k3s[.inline-code] package, which is an easy-to-use Kubernetes distribution:

$ curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--tls-san <host>" sh -s -

Where [.inline-code]host[.inline-code] is the IP address or domain name of the remote server instance.

And retrieving the kubeconfig file, which contains the TLS credentials required to connect to the cluster, while replacing the default [.inline-code]127.0.0.1[.inline-code] address with the actual public address of the host (save the output on the machine that will run kubectl):

$ sudo cat /etc/rancher/k3s/k3s.yaml | sed "s/127.0.0.1/<public address>/"

Note that the remote server must be reachable by the CI platform on the default Kubernetes API port, 6443.

Step 7: Creating a Kubernetes Deployment and Service configuration

The seventh step is to create a Kubernetes Deployment and Service configuration, where the Deployment is in charge of Pod replication and scaling, while the Service acts as a load balancer responsible for exposing a single endpoint where traffic will be routed to one of the Pod replicas.

For example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-server-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: node-server
  template:
    metadata:
      labels:
        app: node-server
    spec:
      containers:
      - name: node-server
        image: repository/node-server:1.1
        ports:
        - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: node-server-service
spec:
  selector:
    app: node-server
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
      nodePort: 30030
  type: NodePort

This configuration file contains two resources.

A Deployment that defines 3 replicas of the Pod running the container based on the Docker image uploaded to the Docker registry at step 5.

A Service of type NodePort, which is a kind of Service that is accessible from outside the cluster's virtual network. It listens on port 80 inside the cluster and on port 30030 on the Node itself, and redirects the traffic reaching these ports to one of the Pod replicas labeled [.inline-code]app: node-server[.inline-code].

Step 8: Applying the Kubernetes configuration to the cluster

The eighth step is to apply the Kubernetes configuration to the cluster to test that the configuration is working.

$ kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml apply -f config.yaml
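
You can then verify that the replicas are running and that the Service responds on its NodePort, where [.inline-code]<host>[.inline-code] is the server's address from step 6:

$ kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml get pods,services
$ curl http://<host>:30030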

Step 9: Automating the process with CI/CD

The final step is to streamline and automate this process, from the publication of the image to the deployment of the application, by configuring a CI/CD tool such as CircleCI, Travis CI, or GitHub Actions. Most of these platforms allow you to define multiple pipeline stages for building, testing, scanning, and deploying your containerized applications in a consistent manner, regardless of the targeted environment.

For example:

name: Build and Deploy
on:
  push:
    branches:
      - staging

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout code
      uses: actions/checkout@v2

    - name: Login to Docker Hub
      run: echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin

    - name: Build and push Docker image
      run: |
        docker build -t ${{ secrets.DOCKER_USERNAME }}/node-server:${{ github.sha }} .
        docker push ${{ secrets.DOCKER_USERNAME }}/node-server:${{ github.sha }}
  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v2

    - name: Set up kubectl
      uses: azure/setup-kubectl@v1

    - name: Configure kubectl
      run: mkdir -p $HOME/.kube && echo "${{ secrets.KUBE_CONFIG_DATA }}" | base64 -d > $HOME/.kube/config

    - name: Replace image in Kubernetes manifest and deploy
      run: |
        sed -i 's|repository/node-server:1.1|${{ secrets.DOCKER_USERNAME }}/node-server:${{ github.sha }}|' config.yaml
        kubectl apply -f config.yaml

This GitHub Actions pipeline is composed of two jobs (i.e., tasks): [.inline-code]build-and-push[.inline-code] and [.inline-code]deploy[.inline-code].

Every time a new commit is pushed to the [.inline-code]staging[.inline-code] branch, it will first trigger the [.inline-code]build-and-push[.inline-code] job, which logs into Docker Hub, builds the Docker image, and pushes the image to the Docker registry.

Then, once completed, it will trigger the [.inline-code]deploy[.inline-code] job, which checks out the code, sets up kubectl, configures it with the kubeconfig stored in the [.inline-code]KUBE_CONFIG_DATA[.inline-code] secret, and applies the Kubernetes manifest to the cluster.

Looking to dive in?

Getting started with our Docker articles

Getting started with our Kubernetes articles