Containers & Orchestration

Containers package an application and all its dependencies into a single, portable unit that runs consistently on any host with a compatible kernel and container runtime. Container orchestrators then manage the scheduling, scaling, networking, and health of those containers across many machines. Together, they are the de facto deployment model for modern distributed systems.

Docker

Docker is the standard toolchain for building and running containers. A container is a lightweight, isolated process that shares the host OS kernel but has its own filesystem, network, and process space. Unlike a virtual machine, a container has no guest OS to boot, so it usually starts in a fraction of a second and adds little runtime overhead.

Images and containers

Image: a read-only, layered snapshot of the filesystem: OS base, runtime, dependencies, and application code. Built from a Dockerfile; stored in a registry (Docker Hub, Amazon ECR, GitHub Container Registry).
Container: a running instance of an image. You can run many containers from the same image simultaneously. Containers are ephemeral: a stopped container keeps its writable layer, but removing the container discards any state written inside it. Data that must persist belongs in a volume.

Dockerfile

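A Dockerfile is a recipe that defines how to build an image. Instructions that change the filesystem (RUN, COPY, ADD) each create a new layer; the other instructions only record metadata. Layers are cached and reused, but a change to one instruction invalidates the cache for that layer and for every layer after it. Keep layers small and order instructions from least to most frequently changed to maximise cache hits.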
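The effect of instruction order on caching can be illustrated with a small simulation. This is a toy model, not Docker's actual implementation: it assumes each layer's cache key is a hash of the parent key plus the instruction text, and it stands in for file checksums with a "content hash" comment on the COPY lines. The Dockerfile lines and function names are hypothetical.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per instruction, chained to the previous key."""
    keys = []
    parent = ""
    for instruction in instructions:
        # The key depends on the parent key, so a change early in the file
        # changes every key that follows it.
        digest = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(digest[:12])
        parent = digest
    return keys


def cache_hits(old, new):
    """Count how many leading layers of `new` can be reused from `old`."""
    hits = 0
    for old_key, new_key in zip(layer_keys(old), layer_keys(new)):
        if old_key != new_key:
            break
        hits += 1
    return hits


# Dependencies are installed before the source is copied (rarely changed first).
good_order = [
    "FROM python:3.12-slim",
    "COPY requirements.txt .",
    "RUN pip install -r requirements.txt",
    "COPY src/ ./src  # content hash: v1",
]
good_order_v2 = good_order[:3] + ["COPY src/ ./src  # content hash: v2"]

# The whole source tree is copied before dependencies are installed.
bad_order = [
    "FROM python:3.12-slim",
    "COPY . .  # content hash: v1",
    "RUN pip install -r requirements.txt",
]
bad_order_v2 = [bad_order[0], "COPY . .  # content hash: v2", bad_order[2]]

print(cache_hits(good_order, good_order_v2))  # 3
print(cache_hits(bad_order, bad_order_v2))    # 1
```

In the first ordering a source-only change reuses the dependency layer; in the second, the same change forces the pip install step to run again.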

Key concepts

Registry: a repository for images. docker push uploads an image; docker pull downloads it. CI/CD pipelines typically build and push an image on every merge.
Volumes: mounts that persist data outside the container lifecycle. Used for databases, file uploads, or any state that must survive a container being removed or replaced.
Networking: containers on the same host communicate over a virtual bridge network. Docker assigns each container a private IP; on user-defined networks, including the one Docker Compose creates for a project, containers resolve each other by name.
Docker Compose: a tool for defining and running multi-container applications. A single docker-compose.yml (compose.yaml in current releases) specifies all services, their images, environment variables, volumes, and networks. Well suited to local development and single-host setups; multi-host production deployments usually rely on an orchestrator instead.
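To make the shape of a Compose file concrete, the sketch below builds one as a Python dictionary and runs two basic consistency checks on it. The service names, image tags, and password are placeholders, and the helper functions are hypothetical, not part of any Docker tooling. The dictionary is printed as JSON, which is a subset of YAML, so the output should be loadable as a Compose file, though that is worth verifying against your Compose version.

```python
import json

# Hypothetical two-service application; every value here is a placeholder.
compose = {
    "services": {
        "web": {
            "image": "example/web:1.0",
            "ports": ["8080:8080"],
            # The database is reached by its service name, "db".
            "environment": {"DATABASE_HOST": "db"},
            "depends_on": ["db"],
        },
        "db": {
            "image": "postgres:16",
            "environment": {"POSTGRES_PASSWORD": "example"},
            # Named volume so the data outlives the container.
            "volumes": ["db-data:/var/lib/postgresql/data"],
        },
    },
    "volumes": {"db-data": {}},
}


def undeclared_volumes(spec):
    """Return named volumes that services mount but the file never declares."""
    declared = set(spec.get("volumes", {}))
    missing = set()
    for service in spec["services"].values():
        for mount in service.get("volumes", []):
            source = mount.split(":")[0]
            # Sources starting with ".", "/" or "~" are paths, not named volumes.
            if not source.startswith((".", "/", "~")) and source not in declared:
                missing.add(source)
    return sorted(missing)


def unknown_dependencies(spec):
    """Return depends_on targets that are not defined as services."""
    services = spec["services"]
    return sorted(
        dep
        for service in services.values()
        for dep in service.get("depends_on", [])
        if dep not in services
    )


print(undeclared_volumes(compose))    # []
print(unknown_dependencies(compose))  # []
print(json.dumps(compose, indent=2))
```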

Kubernetes

Kubernetes (K8s) is an open-source container orchestrator originally built by Google. It automates deployment, scaling, self-healing, and rolling updates of containerised workloads across a cluster of machines.

Architecture

Control plane: the brain of the cluster. Runs the API server (all communication goes through it), the scheduler (assigns pods to nodes), the controller manager (reconciles desired vs actual state), and etcd (the distributed key-value store that holds all cluster state).
Worker nodes: machines that run your workloads. Each node runs a kubelet (the agent that communicates with the control plane), a container runtime (containerd or CRI-O; Docker Engine needs the cri-dockerd adapter since Kubernetes 1.24 removed dockershim), and kube-proxy (manages network rules).
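The scheduler's job of assigning pods to nodes can be pictured as a filter step followed by a scoring step. The sketch below is a deliberately simplified toy, assuming only CPU and memory requests matter and using a made-up "most free CPU remaining" scoring rule; the real kube-scheduler considers many more constraints. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


@dataclass
class PodRequest:
    name: str
    cpu_millicores: int
    memory_mib: int


def schedule(pod, nodes):
    """Pick a node for `pod` and reserve its resources, or return None."""
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [
        node
        for node in nodes
        if node.free_cpu_millicores >= pod.cpu_millicores
        and node.free_memory_mib >= pod.memory_mib
    ]
    if not feasible:
        return None  # the pod would stay pending

    # Score: toy policy that prefers the node with the most CPU left over.
    best = max(feasible, key=lambda node: node.free_cpu_millicores - pod.cpu_millicores)

    # Bind: reserve the requested resources on the chosen node.
    best.free_cpu_millicores -= pod.cpu_millicores
    best.free_memory_mib -= pod.memory_mib
    return best.name


nodes = [Node("node-a", 1000, 2048), Node("node-b", 4000, 8192)]
print(schedule(PodRequest("web", 500, 1024), nodes))    # node-b
print(schedule(PodRequest("huge", 8000, 1024), nodes))  # None
```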

Core objects

Pod: the smallest deployable unit. A pod wraps one or more containers that share a network namespace and storage volumes. Pods are ephemeral; they are created and destroyed rather than updated in place.
Deployment: declares the desired state for a set of identical pods (replica count, image version). The deployment controller continuously reconciles actual state to match. Rolling updates and rollbacks are built in.
Service: a stable network endpoint (DNS name + virtual IP) that load-balances traffic across the pods matching a label selector. Pods come and go; the Service's name and IP stay the same for as long as the Service exists.
Ingress: an HTTP/HTTPS routing layer that sits in front of Services. Routes external traffic to the right Service based on hostname or URL path and can terminate TLS. An Ingress resource only takes effect when an ingress controller is running in the cluster.
ConfigMap / Secret: externalise configuration and credentials from the container image. Injected as environment variables or mounted as files at runtime. Secret values are only base64-encoded by default, so access to them should be restricted.
Horizontal Pod Autoscaler (HPA): automatically scales the replica count of a Deployment (or another scalable workload) based on CPU, memory, or custom metrics.
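As a worked example of how the HPA picks a replica count, the Kubernetes documentation describes the core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below reproduces that arithmetic only; the function name and the min/max defaults are illustrative assumptions, and real HPA behaviour also involves stabilisation windows and pod readiness handling.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Apply the documented HPA ratio formula, then clamp to the bounds."""
    ratio = current_metric / target_metric
    # Within the tolerance band no scaling happens, which avoids flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas at 90% average CPU with a 60% target: scale up.
print(desired_replicas(4, 90, 60))   # 6
# 4 replicas at 20% average CPU with a 60% target: scale down.
print(desired_replicas(4, 20, 60))   # 2
# 63% against a 60% target is inside the 10% tolerance: no change.
print(desired_replicas(4, 63, 60))   # 4
# A large spike is capped by max_replicas.
print(desired_replicas(8, 200, 50))  # 10
```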

Self-healing

Kubernetes continuously compares desired state (defined in manifests) to actual state (what is running). If a container crashes, the kubelet on that node restarts it according to the pod's restart policy. If a node dies, the controllers create replacement pods and the scheduler places them on healthy nodes. This reconciliation loop is the core of how Kubernetes achieves reliability without manual intervention.
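The reconciliation idea can be sketched as a function that takes desired and observed state and returns the corrected state. This is a minimal illustration, assuming pods are tracked as a name-to-node mapping and replacements go to the least-loaded healthy node; the pod names, node names, and the reconcile function are hypothetical and do not mirror any actual controller code.

```python
import itertools


def reconcile(desired_replicas, pods, healthy_nodes, ids):
    """Run one reconciliation pass and return the new pod-to-node mapping."""
    # Observe: pods on failed nodes are treated as lost.
    actual = {name: node for name, node in pods.items() if node in healthy_nodes}

    # Act: create replacements, each on the least-loaded healthy node.
    while len(actual) < desired_replicas and healthy_nodes:
        load = {node: 0 for node in healthy_nodes}
        for node in actual.values():
            load[node] += 1
        target = min(sorted(load), key=load.get)
        actual[f"web-{next(ids)}"] = target

    # Act: remove surplus pods if more are running than desired.
    for name in sorted(actual)[desired_replicas:]:
        del actual[name]
    return actual


ids = itertools.count(4)  # next free pod number
pods = {"web-1": "node-a", "web-2": "node-b", "web-3": "node-b"}
healthy = {"node-a", "node-c"}  # node-b has failed

state = reconcile(3, pods, healthy, ids)
print(state)  # {'web-1': 'node-a', 'web-4': 'node-c', 'web-5': 'node-a'}

# A second pass finds nothing to fix, so the state is unchanged.
print(reconcile(3, state, healthy, ids) == state)  # True
```

The second call shows the property that matters most: once actual state matches desired state, running the loop again changes nothing.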

Managed Kubernetes

Running the control plane yourself is operationally expensive. Every major cloud provider offers a managed Kubernetes service where the control plane is their responsibility:

Amazon EKS: Elastic Kubernetes Service
Google GKE: Google Kubernetes Engine (the longest-running of the three; Kubernetes itself originated at Google)
Azure AKS: Azure Kubernetes Service

Amazon ECS

Amazon Elastic Container Service (ECS) is AWS's proprietary container orchestrator. It is simpler than Kubernetes — fewer abstractions, less configuration — and integrates tightly with the AWS ecosystem (IAM, ALB, CloudWatch, ECR). The trade-off is that it is AWS-only; there is no portable open standard underneath.

Core concepts

Task definition: a blueprint for a container (or group of containers): which image to use, CPU/memory allocation, environment variables, IAM role, port mappings. Roughly equivalent to a Kubernetes Pod spec.
Task: a running instance of a task definition. Roughly equivalent to a Kubernetes Pod.
Service: maintains a desired count of running tasks, can register them with an Application Load Balancer for traffic distribution, and replaces failed tasks automatically. Roughly equivalent to a Kubernetes Deployment + Service.
Cluster: a logical grouping of tasks and services, together with the EC2 instances that host them when the EC2 launch type is used.
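A task definition is ultimately a JSON document, so its structure can be shown as a Python dictionary. The sketch below assumes a single-container Fargate task; the account ID, region, role names, and image URI are placeholders, and the exposed_ports helper is hypothetical. The field names follow the ECS task definition format, but the full set of required fields depends on launch type and should be checked against the AWS documentation.

```python
import json

# Placeholder values throughout: account ID, region, roles and image are made up.
task_definition = {
    "family": "web",
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",
    # Task-level CPU units and memory (MiB) are given as strings.
    "cpu": "256",
    "memory": "512",
    # Role the application code assumes at runtime.
    "taskRoleArn": "arn:aws:iam::123456789012:role/web-task-role",
    # Role ECS uses to pull the image and ship logs.
    "executionRoleArn": "arn:aws:iam::123456789012:role/web-execution-role",
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/web:1.0",
            "essential": True,
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            "environment": [{"name": "LOG_LEVEL", "value": "info"}],
        }
    ],
}


def exposed_ports(definition):
    """List (container name, port) pairs declared in a task definition."""
    return [
        (container["name"], mapping["containerPort"])
        for container in definition["containerDefinitions"]
        for mapping in container.get("portMappings", [])
    ]


print(exposed_ports(task_definition))  # [('web', 8080)]
print(json.dumps(task_definition, indent=2))
```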

EC2 launch type vs Fargate

Aspect	EC2 launch type	Fargate
Infrastructure	You manage EC2 instances in the cluster	AWS manages the underlying infrastructure
Billing	Pay for EC2 instances regardless of utilisation	Pay for the vCPU and memory each task requests, billed per second
Control	Full control over instance type, AMI, storage	No access to underlying host
Best for	Predictable, sustained workloads; GPU tasks	Variable workloads; teams that want no server management
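The billing difference in the table can be turned into a rough break-even calculation. The prices below are invented purely for illustration and are not AWS list prices; real rates vary by region and change over time. The sketch also assumes the EC2 instance runs all month and is large enough to host the tasks being compared.

```python
HOURS_PER_MONTH = 720  # 30-day month, for illustration

# Hypothetical prices in USD, NOT real AWS prices.
FARGATE_PRICE_PER_VCPU_HOUR = 0.04
FARGATE_PRICE_PER_GB_HOUR = 0.004
EC2_PRICE_PER_INSTANCE_HOUR = 0.06


def fargate_task_hourly(vcpu, memory_gb):
    """Hourly cost of one task, charged for requested vCPU and memory."""
    return vcpu * FARGATE_PRICE_PER_VCPU_HOUR + memory_gb * FARGATE_PRICE_PER_GB_HOUR


def fargate_monthly_cost(tasks, vcpu, memory_gb, hours_running):
    """Fargate charges only for the hours the tasks actually run."""
    return tasks * fargate_task_hourly(vcpu, memory_gb) * hours_running


def ec2_monthly_cost(instances):
    """EC2 instances are charged for the whole month, busy or idle."""
    return instances * EC2_PRICE_PER_INSTANCE_HOUR * HOURS_PER_MONTH


def break_even_hours(tasks, vcpu, memory_gb, instances):
    """Hours per task per month at which both options cost the same."""
    return ec2_monthly_cost(instances) / (tasks * fargate_task_hourly(vcpu, memory_gb))


# Two 1 vCPU / 2 GB tasks versus one instance able to host both.
print(round(break_even_hours(tasks=2, vcpu=1, memory_gb=2, instances=1)))
print(fargate_monthly_cost(2, 1, 2, hours_running=200) < ec2_monthly_cost(1))
```

With these made-up prices the break-even point is roughly 450 hours per task per month: tasks that run less than that are cheaper on Fargate, while tasks that run around the clock favour the EC2 instance.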

ECS vs Kubernetes

Aspect	Amazon ECS	Kubernetes
Learning curve	Low: fewer concepts, AWS console driven	High: many abstractions, YAML-heavy
Portability	AWS only	Any cloud or on-premises
Ecosystem	Deep AWS integration (IAM, ALB, CloudWatch)	Large open-source ecosystem (Helm, Istio, Argo)
Flexibility	Opinionated: simpler but less customisable	Highly extensible via CRDs and operators
Best for	AWS-native teams wanting simplicity	Multi-cloud, large teams, complex workloads