
Kubernetes

https://kubernetes.io

Docs - https://kubernetes.io/docs/home/

Definitions:

  • Distributed operating system.
  • Operating system for distributed computing.
  • Universal computing platform.
  • An open-source system for automating deployment, scaling, and management of containerized applications.

https://github.com/kubernetes

https://github.com/kubernetes-sigs

Copilot instructions - https://github.com/github/awesome-copilot/blob/main/instructions/kubernetes-deployment-best-practices.instructions.md

https://github.com/kelseyhightower/kubernetes-the-hard-way

https://github.com/dennyzhang/cheatsheet-kubernetes-A4

https://github.com/ramitsurana/awesome-kubernetes

OWASP Kubernetes Top 10 - https://github.com/OWASP/www-project-kubernetes-top-ten

Deploy a Production Ready Kubernetes Cluster - https://github.com/kubernetes-sigs/kubespray - https://kubespray.io

minikube - https://minikube.sigs.k8s.io/docs

Local Kubernetes Development - https://github.com/GoogleContainerTools/skaffold - https://skaffold.dev

Examples - https://github.com/AdminTurnedDevOps/kubernetes-examples

https://github.com/nigelpoulton/TheK8sBook

https://github.com/MichaelCade/90DaysOfDevOps#kubernetes

https://github.com/bregman-arie/devops-exercises/blob/master/topics/kubernetes/README.md

https://github.com/siderolabs/talos - https://www.talos.dev

https://readmedium.com/top-10-kubernetes-pod-concepts-that-confuse-beginners-8c0954021f3f

The History of Kubernetes on a Timeline - https://blog.risingstack.com/the-history-of-kubernetes

Kubernetes Distributions & Platforms - https://docs.google.com/spreadsheets/d/1uF9BoDzzisHSQemXHIKegMhuythuq_GL3N1mlUUK2h0/edit?usp=sharing

https://github.com/GoogleCloudPlatform/microservices-demo - Sample cloud-first application with 10 microservices showcasing Kubernetes, Istio, and gRPC

For the Love of God, Stop Using CPU Limits on Kubernetes - https://home.robusta.dev/blog/stop-using-cpu-limits

Does Kubernetes really give you multicloud portability? - https://medium.com/digital-mckinsey/does-kubernetes-really-give-you-multicloud-portability-476270a0acc7

Y tú, ¿odias o amas Kubernetes? (And you, do you hate or love Kubernetes?) - https://dev.to/aws-espanol/y-tu-odias-o-amas-kubernetes-ind - https://www.paradigmadigital.com/dev/odias-amas-kubernetes

Kubernetes has been given so much flexibility that it can run any workload. In principle this seems good, but the fact that something can run does not mean it is the most optimal choice, even less so if we want to evolve. A clear example is databases on Kubernetes. It is possible to run a database on Kubernetes, but it makes no sense. In the end you are not containerizing a microservice, you are containerizing an entire database server.

Another terrible example is the famous "Lift and Shift to Kubernetes": what sense does it make to go from a virtualized server to a pod in Kubernetes? It is possible to do it, but we are only creating problems and using container technology for something that is not its purpose.

The problem is not that Kubernetes can run these workloads; the problem is that it is a bad use case that is becoming far too widespread.

It is very common to start by setting up a Kubernetes cluster to run our future workloads without taking the workloads themselves into account. First we set up the cluster and only then do we define the workloads. There is also the variant of developing directly for Kubernetes without considering whether it will be the most optimal option.

It is 2023; the division between infrastructure and development is a thing of the past. We should think about the workload we are going to build and choose the most suitable place to run it.

Although ECS, EKS, and Kubernetes allow persistent disks to be mounted in pods, it is not recommended; in fact, it should be avoided as much as possible.

Validators / linters / vulnerabilities

https://github.com/stackrox/kube-linter

https://github.com/datreeio/datree

Static analysis to find misconfigurations and vulnerabilities - https://www.checkov.io - https://github.com/bridgecrewio/checkov

Security risk analysis for Kubernetes resources - https://kubesec.io - https://github.com/controlplaneio/kubesec

https://github.com/aquasecurity/trivy - https://trivy.dev/latest/tutorials/kubernetes/cluster-scanning/

What is Kubernetes?

https://kubernetes.io/docs/concepts/overview/what-is-kubernetes

Kubernetes comprises a set of independent, composable control processes that continuously drive the current state towards the provided desired state.

https://www.redhat.com/en/topics/containers/what-is-kubernetes

https://cloud.google.com/learn/what-is-kubernetes

History: https://cloud.google.com/blog/products/containers-kubernetes/from-google-to-the-world-the-kubernetes-origin-story

Benefits

  • Automatic scaling management
  • Secrets and configuration management
  • Service discovery (DNS service for internal communication)
  • Load balancing
  • Container health checks and automatic replacement. Self-healing, high availability
  • Rolling updates and rollbacks. Zero downtime deployment
  • Persistent storage
  • Network management
  • Efficient cluster utilization
  • Workload balance across servers
  • Open source. Large community
  • Extensible

https://jessitron.com/2022/10/02/why-we-use-kubernetes

What is Kubernetes? - https://www.youtube.com/watch?v=a2gfpZE8vXY

Concepts and components

https://kubernetes.io/docs/concepts/overview/components

https://kubernetes.io/docs/reference/glossary/?fundamental=true

https://www.redhat.com/en/topics/containers/kubernetes-architecture

https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-concepts.html

Kubernetes – Architecture and main components overview

  • Cluster: a set of worker machines (nodes).
  • Node: a worker machine.
    • Can be virtual or physical.
    • Each node has a container runtime (e.g. Docker, containerd, CRI-O).
  • Pod: a set of running containers.
    • https://kubernetes.io/docs/concepts/workloads/pods
    • The smallest deployable unit of computing that you can create and manage in Kubernetes.
    • A pod can have 1 or more containers (e.g. application, logging...).
    • Pods are replicated across multiple nodes, providing high availability.
    • Pods are disposable and replaceable (ephemeral, nonpermanent, not persistent), and can be created and terminated by the control plane.
    • All containers in a pod share an IP address, IPC, hostname, and other resources. (source)
  • Service: An abstract way to expose an application running on a set of Pods as a network service
  • Volume: a directory containing data, accessible to the containers in a Pod.
    • Since pods are ephemeral, volumes provide a persistent way to store data.
  • Namespace: a virtual grouping of objects.
    • Kubernetes resources are either namespaced or cluster-scoped (non-namespaced).
      • Namespaced: Pod, ReplicaSet, Deployment, StatefulSet, DaemonSet, Service, Ingress, ConfigMap, Secret, PersistentVolumeClaim...
      • Non-namespaced: Namespace, Node, PersistentVolume, ClusterRole, ClusterRoleBinding, IngressClass, StorageClass, CustomResourceDefinition...
      • Use kubectl api-resources --namespaced=true and kubectl api-resources --namespaced=false to list namespaced and cluster-scoped resources, respectively.
    • Built-in namespaces: default, kube-system, kube-public, kube-node-lease.
    • Names of resources need to be unique within a namespace, but not across namespaces.
Kubernetes objects
Source: AWS Experience

Hierarchy

  • A cluster has many nodes
  • A node has many pods
  • A pod has many containers

Glossary

https://kubernetes.io/docs/reference/glossary/?all=true

From https://kubectl.docs.kubernetes.io/guides/introduction/resources_controllers:

  • Resource Config: declarative files with resources that are written to a cluster.
  • Resources: instances of Kubernetes objects, which are declared as json or yaml and applied to a cluster. For example: deployment, services, namespaces, etc.
    • Resources are uniquely identified by:
      • apiVersion: API Type Group and Version
      • kind: API Type Name
      • metadata.namespace: Instance namespace
      • metadata.name: Instance name
  • Controllers: actuate Kubernetes APIs. They observe the state of the system and look for changes either to desired state of Resources (create, update, delete) or the system (Pod or Node dies).
  • Workloads: resources which run containers. For example: Deployments, StatefulSets, Jobs, CronJobs and DaemonSets.
Workload API

  • Deployments: Stateless Applications (replication + rollouts)
  • StatefulSets: Stateful Applications (replication + rollouts + persistent storage + identity)
  • Jobs: Batch Work (run to completion)
  • CronJobs: Scheduled Batch Work (scheduled run to completion)
  • DaemonSets: Per-Machine (per-Node scheduling)

Control plane

A cluster is managed by the control plane (formerly called the master), which exposes an API that lets you, for example, interact with the scheduler.

The control plane is responsible for maintaining the desired state of the cluster, such as which applications are running and which container images they use. (source)

Kubernetes components
Source: AWS Experience

Components:

  • kube-apiserver: exposes the Kubernetes REST API used to connect to Kubernetes and deploy workloads.
  • etcd: key-value store for all cluster data. Database for non-ephemeral data.
  • kube-scheduler: watches for newly created Pods with no assigned node, and selects a worker node for them to run on.
  • kube-controller-manager: runs controller processes, which ensure that the current state matches the desired state for all running workloads.
  • cloud-controller-manager (optional): embeds cloud-specific control logic. Lets you link your cluster into your cloud provider's API.

See https://kubernetes.io/docs/concepts/architecture/#control-plane-components and https://kubernetes.io/docs/concepts/overview/components/#control-plane-components.

You want a minimum of 3 control plane nodes, since etcd uses the Raft consensus algorithm, which requires a quorum for leader election. One of them acts as the leader at any given time.

API Server

https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/

What you use to interact with Kubernetes.

What you will be working with the most, since the operations you do with kubectl interact with this API. For example, when you run kubectl apply -f manifest.yaml, you are doing a POST request that sends the manifest.yaml to the API server. And when you run kubectl get pods you are doing a GET request.
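A quick way to see this, assuming kubectl is already configured against a cluster (flags below are standard kubectl; the namespace is illustrative):

kubectl get pods -v=8                                # raises verbosity and prints the underlying HTTP requests and responses
kubectl get --raw /api/v1/namespaces/default/pods    # calls the REST API path directly through kubectl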

Worker Nodes

Also called data plane.

Components:

  • kubelet: agent running on each node; makes sure the containers described in PodSpecs are running and healthy.
  • kube-proxy (optional): maintains network rules on the node so Services can route traffic to Pods.
  • Container runtime: the software that actually runs the containers (e.g. containerd, CRI-O).

See https://kubernetes.io/docs/concepts/architecture/#node-components and https://kubernetes.io/docs/concepts/overview/components/#node-components.

The recommended number of worker nodes is between 3 and 5. The data plane needs high availability and spare capacity; otherwise, pods have nowhere to be rescheduled when a worker node fails.

Addons

https://kubernetes.io/docs/concepts/overview/components/#addons

Kustomize

https://kustomize.io

https://www.eksworkshop.com/docs/introduction/kustomize/
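A minimal kustomization.yaml sketch (the file names, namespace and prefix are illustrative), showing how a base set of manifests can be reused with environment-specific settings:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: dev                 # override the namespace of all resources
namePrefix: dev-               # prepend an environment prefix to resource names
resources:
  - deployment.yaml
  - service.yaml

It can be applied with kubectl apply -k <directory>, since kustomize is built into kubectl.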

Port forward

Expose an internal pod locally.

kubectl port-forward <pod-name> 6000:80 → We can access the service locally at http://localhost:6000, and requests are forwarded to port 80 on the pod in our cluster

kubectl port-forward <pod-name> [<local-port>:]<pod-port>

Other

Show all events: kubectl get events -w

Show component status (deprecated in 1.19): kubectl get componentstatuses

Check the rollout status: kubectl rollout status deployment/simple-flask-deployment

Get external IP address: kubectl get services <service-name> -o wide

kubectl plugins

Plugin manager - https://krew.sigs.k8s.io - https://github.com/kubernetes-sigs/krew

Pod

https://kubernetes.io/docs/concepts/workloads/pods/

Lifecycle - https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/

Container probes

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

  • Readiness probe: the container is ready to accept requests; while it fails, the Pod is removed from Service endpoints.
  • Liveness probe: the container is still healthy; if it keeps failing, the container is restarted.
  • Startup probe: the application inside the container has started; the other probes are held off until it succeeds. (See the sketch below.)
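A sketch of how the three probes might be declared on a single container (the Pod name, image, port and timings are illustrative; any HTTP server that answers on / works for the example):

apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
      startupProbe:            # the other probes are held off until this succeeds
        httpGet:
          path: /
          port: 80
        failureThreshold: 30
        periodSeconds: 5
      readinessProbe:          # while failing, the Pod is removed from Service endpoints
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
      livenessProbe:           # if it keeps failing, the container is restarted
        httpGet:
          path: /
          port: 80
        periodSeconds: 10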

Resource limits

https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

A container can request a specific amount of CPU and memory so the Kubernetes scheduler places its Pod on a node with enough available resources; limits cap how much it may consume (see the sketch below).
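A minimal sketch of requests and limits (Pod name, image and values are illustrative; the CPU limit is left out on purpose, following the "stop using CPU limits" article linked above):

apiVersion: v1
kind: Pod
metadata:
  name: resources-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: 250m            # 0.25 CPU; what the scheduler reserves on a node
          memory: 256Mi
        limits:
          memory: 512Mi        # the container is OOM-killed if it exceeds this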

Deployment

https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

Manages one or more replicas of a Pod, allowing it to scale horizontally. A Deployment manages ReplicaSets automatically.

Provides update strategies: Recreate or RollingUpdate.
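A minimal Deployment sketch (name, labels and image are illustrative) that keeps 3 replicas of a Pod and replaces them gradually on updates:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate        # the default; Recreate kills all old Pods before starting new ones
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web               # must match the selector above
    spec:
      containers:
        - name: web
          image: nginx:1.25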

Namespace

To group objects and avoid name collisions.

Deleting a namespace deletes all its objects.

The Namespaces are a logical grouping of the resources for each microservice and also act as a soft isolation boundary, which can be used to effectively implement controls using Kubernetes RBAC and Network Policies. source
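A Namespace itself is a tiny manifest (the name is illustrative); other resources are placed in it through metadata.namespace or kubectl -n:

apiVersion: v1
kind: Namespace
metadata:
  name: team-a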

Labels

Recommended Labels - https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/

Service

https://kubernetes.io/docs/concepts/services-networking/service/

The IP address of a pod is not stable, for example, it changes when a pod is restarted. A service load balances a set of pods matching labels, and exposes them over a network. Allows an application running as a set of pods to be called by other components inside the Kubernetes cluster. Each service is given its own virtual IP and DNS entry.

Transport layer (4): TCP, UDP and SCTP.

Types:

  • ClusterIP: internal, not accessible from outside the cluster.
  • NodePort: accessible from outside the cluster. For development.
  • LoadBalancer: external load balancer, of a cloud provider.

ClusterIP services are internal to the cluster, so we cannot access them from the Internet or even the VPC. However, we can use exec to access an existing pod in the EKS cluster to check the catalog API is working (source):

kubectl -n catalog exec -i \
deployment/catalog -- curl catalog.catalog.svc/catalog/products | jq .
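A minimal ClusterIP Service sketch (name, labels and ports are illustrative) selecting a set of Pods by label and exposing them on a stable virtual IP and DNS name:

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP              # the default; NodePort or LoadBalancer expose it outside the cluster
  selector:
    app: web                   # matches the Pod labels, not a Deployment name
  ports:
    - port: 80                 # port on the Service's virtual IP / DNS name
      targetPort: 80           # port on the Pods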

https://github.com/kubernetes-sigs/aws-load-balancer-controller

  • AWS Application Load Balancer → Kubernetes Ingress
  • AWS Network Load Balancer → Kubernetes Service

Ingress

https://kubernetes.io/docs/concepts/services-networking/ingress/

Sits in front of a service. Application layer (7): HTTP and HTTPS.

Acts as the entry point for your cluster. Lets you consolidate your routing rules into a single resource, so that you can expose multiple components of your workload, running separately in your cluster, behind a single listener.
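A hedged Ingress sketch (host, paths and backend Service names are illustrative) consolidating two routes behind one entry point; it assumes an ingress controller such as ingress-nginx is installed in the cluster:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  ingressClassName: nginx      # assumes an nginx ingress controller
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web      # e.g. the Service sketched above
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: catalog  # illustrative second Service
                port:
                  number: 80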

To be replaced by the Gateway API.

https://www.f5.com/products/nginx/nginx-ingress-controller

Gateway API

The Gateway API is going to replace the Ingress in the long term.

https://gateway-api.sigs.k8s.io

Network Policy

https://kubernetes.io/docs/concepts/services-networking/network-policies/

https://github.com/ahmetb/kubernetes-network-policy-recipes

https://cilium.io
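A common starting point, sketched after the recipes repo linked above, is a default-deny policy for all inbound traffic in a namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}              # empty selector = all Pods in the namespace
  policyTypes:
    - Ingress                  # no ingress rules are listed, so all inbound traffic is denied

Note that NetworkPolicies are only enforced if the cluster's CNI plugin supports them (for example Cilium or Calico).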

DaemonSet

https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/

Ensures that a set of worker nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them.

Use cases: logging agents, node monitoring daemons, etc.

StatefulSet

https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

Runs a group of Pods, and maintains a sticky identity for each of those Pods. Guarantees a unique network ID and startup order.

Used to manage stateful workloads, for example a MySQL database that runs inside a Kubernetes cluster.

Job

https://kubernetes.io/docs/concepts/workloads/controllers/job/

One-off tasks that run to completion and then stop. Can run in parallel.
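A minimal Job sketch (name, image and command are illustrative) that runs once to completion and retries on failure:

apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task
spec:
  completions: 1
  backoffLimit: 3              # retries before the Job is marked failed
  template:
    spec:
      restartPolicy: Never     # Jobs require Never or OnFailure
      containers:
        - name: task
          image: busybox:1.36
          command: ["sh", "-c", "echo processing && sleep 5"]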

CronJob

https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/

For performing regular scheduled actions such as backups, report generation, etc.
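A CronJob wraps a Job template with a cron schedule; a sketch of a nightly run (name, schedule, image and command are illustrative):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup
spec:
  schedule: "0 3 * * *"        # standard cron syntax: every day at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: busybox:1.36
              command: ["sh", "-c", "echo running backup"]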

ServiceAccount

https://kubernetes.io/docs/concepts/security/service-accounts/

ConfigMap

https://kubernetes.io/docs/concepts/configuration/configmap/

Used to store non-confidential data in key-value pairs, and expose it to a pod.
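A sketch of a ConfigMap (name and keys are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
  GREETING: hello

A container can then load all keys as environment variables with envFrom / configMapRef, or mount the ConfigMap as a volume.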

Secrets

https://kubernetes.io/docs/concepts/configuration/secret/

To avoid including confidential data in application code or a container image.
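A sketch of a Secret (name and key are illustrative). stringData is a write-time convenience; values end up base64-encoded under data, which is encoding, not encryption, so encryption at rest in etcd must be configured separately:

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  DB_PASSWORD: change-me       # placeholder value for illustration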

Volume

https://kubernetes.io/docs/concepts/storage/volumes/

A directory accessible to the containers in a pod. Volumes provide data persistence and shared storage for pods.

  • Static provisioning: done by the cluster administrator, who explicitly creates PersistentVolumes backed by physical storage.
  • Dynamic provisioning: triggered by the application developer through a PersistentVolumeClaim. Requires a StorageClass.

PersistentVolume

https://kubernetes.io/docs/concepts/storage/persistent-volumes/

A persistentVolumeClaim volume is used to mount a PersistentVolume into a Pod. PersistentVolumeClaims are a way for users to "claim" durable storage (such as an iSCSI volume) without knowing the details of the particular cloud environment.
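A sketch of a PersistentVolumeClaim using dynamic provisioning (the claim name is illustrative, and the storageClassName depends on the cluster, e.g. an EBS-backed class on EKS):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
    - ReadWriteOnce            # mountable read-write by a single node
  storageClassName: gp3        # cluster-specific; illustrative
  resources:
    requests:
      storage: 10Gi

The Pod then references it under spec.volumes as persistentVolumeClaim: claimName: data and mounts it with volumeMounts.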

RBAC

https://kubernetes.io/docs/reference/access-authn-authz/rbac/

ClusterRole is a non-namespaced resource; it applies across all namespaces (a Role, by contrast, is namespaced).
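A hedged sketch of namespaced RBAC: a Role granting read access to Pods, bound to a ServiceAccount (all names are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
  - apiGroups: [""]            # "" is the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
  - kind: ServiceAccount
    name: app                  # illustrative ServiceAccount
    namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io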

Tools

https://collabnix.github.io/kubetools

https://velero.io - Backup and migrate Kubernetes resources and persistent volumes

Lens (GUI) - https://k8slens.dev - https://www.mirantis.com/blog/getting-started-with-lens

Secrets management - https://external-secrets.io/latest

TLS certificates management - https://cert-manager.io

https://github.com/stern/stern - Logs

Dashboard

https://github.com/kubernetes/dashboard

https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/

Learn

Security

https://github.com/controlplaneio/simulator

Node.js

https://blog.platformatic.dev/the-myths-and-costs-of-running-nodejs-on-kubernetes

https://www.linkedin.com/posts/matteocollina_last-month-a-cto-friend-called-me-in-a-panic-activity-7375937934878912512-4LtP/

Node.js applications are (usually!!) single-threaded and event-driven, while Kubernetes was built for heavyweight Java applications.

We're using CPU metrics when we should watch event loop lag.

One team I know switched their scaling metrics to event loop lag and cut response times in half. Another reduced their cloud bill by 60 percent just by understanding how V8 uses memory versus what Kubernetes assumes.

The companies winning at this have stopped trying to force Node.js to behave like Java. They scale on metrics that actually matter for event-driven architectures. They've stopped blindly trusting Kubernetes defaults.

Terraform

https://medium.com/devops-mojo/terraform-provision-amazon-eks-cluster-using-terraform-deploy-create-aws-eks-kubernetes-cluster-tf-4134ab22c594

TODO: review