Containers & Orchestration
Containers package an application and its dependencies into a single, portable unit that runs consistently wherever a compatible container runtime is available. Container orchestrators then manage the scheduling, scaling, networking, and health of those containers at scale. Together, they are the de facto deployment model for modern distributed systems.
Docker
Docker is the standard toolchain for building and running containers. A container is a lightweight, isolated process that shares the host OS kernel but has its own filesystem, network stack, and process space. Unlike a virtual machine, a container has no guest OS, so it typically starts in well under a second and adds little runtime overhead.
Images and containers
- Image — a read-only, layered snapshot of the filesystem: OS base, runtime, dependencies, and application code. Built from a `Dockerfile`; stored in a registry (Docker Hub, Amazon ECR, GitHub Container Registry).
- Container — a running instance of an image. You can run many containers from the same image simultaneously. Containers are ephemeral: removing one discards anything written to its writable layer, so persistent data must live in a mounted volume.
Dockerfile
A Dockerfile is a recipe that defines how to build an image. Each instruction creates a new layer; layers are cached and reused, so only changed layers are rebuilt. Keep layers small and order instructions from least to most frequently changed to maximise cache hits.
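The effect of instruction order on cache hits can be illustrated with a toy model. The sketch below is not how Docker computes its cache internally; it only assumes that a layer's cache key depends on its own instruction plus everything before it. The build steps and the `@v1`/`@v2` markers (standing in for changed source files) are hypothetical.

```python
import hashlib

def layer_keys(instructions):
    """Return one cache key per instruction; each key depends on all earlier instructions."""
    keys = []
    parent = ""
    for instruction in instructions:
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys

def rebuilt_layers(old, new):
    """Count the layers of `new` that cannot be served from the cache left by `old`."""
    cached = set(layer_keys(old))
    return sum(1 for key in layer_keys(new) if key not in cached)

def bump(steps):
    """Simulate a source-code change by moving every '@v1' marker to '@v2'."""
    return [step.replace("@v1", "@v2") for step in steps]

# Hypothetical build steps; "COPY src@v1" stands for application code at version 1.
code_last = ["FROM python:3.12-slim", "COPY requirements.txt", "RUN pip install", "COPY src@v1"]
code_first = ["FROM python:3.12-slim", "COPY src@v1", "COPY requirements.txt", "RUN pip install"]

print(rebuilt_layers(code_last, bump(code_last)))    # 1
print(rebuilt_layers(code_first, bump(code_first)))  # 3
```

In this model, copying the frequently changing source code last means a code change rebuilds one layer; copying it early also invalidates the dependency installation that follows it.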
Key concepts
- Registry — a repository for images. `docker push` uploads an image; `docker pull` downloads it. CI/CD pipelines typically build and push on every merge.
- Volumes — mounts that persist data outside the container lifecycle. Used for databases, file uploads, or any state that must survive the container being removed or replaced.
- Networking — containers on the same host communicate over a virtual bridge network. Docker assigns each container a private IP; services discover each other by name in Docker Compose.
- Docker Compose — a tool for defining and running multi-container applications locally. A single `docker-compose.yml` specifies all services, their images, environment variables, volumes, and network links. Essential for local development; in production it is usually replaced by an orchestrator.
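
As a rough illustration of what a Compose file describes, the sketch below builds the structure as a Python dictionary and prints it as JSON. JSON is a subset of YAML, so the output should be loadable as a Compose file, but this has not been verified against Docker here. The service names, image tags, and credentials are hypothetical placeholders.

```python
import json

# Hypothetical two-service application: a web API plus a PostgreSQL database.
compose = {
    "services": {
        "web": {
            "image": "example/web:1.0",              # hypothetical image name
            "ports": ["8000:8000"],                  # host port : container port
            "environment": {"DATABASE_HOST": "db"},  # reach the database by its service name
            "depends_on": ["db"],                    # start the database first
        },
        "db": {
            "image": "postgres:16",
            "environment": {"POSTGRES_PASSWORD": "example"},  # placeholder credential
            "volumes": ["db-data:/var/lib/postgresql/data"],  # named volume outlives the container
        },
    },
    "volumes": {"db-data": {}},  # declare the named volume used above
}

print(json.dumps(compose, indent=2))
```

The `web` service finds the database through the hostname `db`, which is the name-based discovery described in the Networking item, and the named volume keeps the database files when the `db` container is recreated.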
Kubernetes
Kubernetes (K8s) is an open-source container orchestrator originally built by Google. It automates deployment, scaling, self-healing, and rolling updates of containerised workloads across a cluster of machines.
Architecture
- Control plane — the brain of the cluster. Runs the API server (all communication goes through it), the scheduler (assigns pods to nodes), the controller manager (reconciles desired vs actual state), and etcd (the distributed key-value store that holds all cluster state).
- Worker nodes — machines that run your workloads. Each node runs a kubelet (agent that communicates with the control plane), a container runtime (containerd or CRI-O; built-in Docker Engine support was removed in Kubernetes 1.24), and kube-proxy (manages network rules).
Core objects
- Pod — the smallest deployable unit. A pod wraps one or more containers that share a network namespace and storage volumes. Pods are ephemeral; they are created and destroyed, never updated in place.
- Deployment — declares the desired state for a set of identical pods (replica count, image version). The deployment controller continuously reconciles actual state to match. Rolling updates and rollbacks are built in.
- Service — a stable network endpoint (DNS name + virtual IP) that load-balances traffic across the pods matching a label selector. Pods come and go; the Service IP never changes.
- Ingress — an HTTP/HTTPS routing layer that sits in front of Services. Routes external traffic to the right Service based on hostname or URL path; handles TLS termination.
- ConfigMap / Secret — externalise configuration and credentials from the container image. Injected as environment variables or mounted as files at runtime.
- Horizontal Pod Autoscaler (HPA) — automatically scales the replica count of a Deployment based on CPU, memory, or custom metrics.
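
Two of the mechanisms above are easy to sketch in a few lines: how a Service's label selector picks its backing pods, and how the HPA derives a replica count. The scaling rule follows the formula given in the Kubernetes documentation, `ceil(currentReplicas * currentMetric / targetMetric)`; the sketch leaves out the tolerance band and the min/max replica bounds that the real controller applies. Pod names and labels are hypothetical.

```python
import math

def matches(selector, labels):
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())

def desired_replicas(current_replicas, current_metric, target_metric):
    """Basic HPA scaling rule, without tolerance or min/max bounds."""
    return math.ceil(current_replicas * current_metric / target_metric)

# Hypothetical pods with their labels.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "labels": {"app": "db"}},
]

# A Service with selector app=web would route to these pods.
backends = [pod["name"] for pod in pods if matches({"app": "web"}, pod["labels"])]
print(backends)  # ['web-1', 'web-2']

# 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))  # 6
```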
Self-healing
Kubernetes continuously compares desired state (defined in manifests) to actual state (what is running). If a container crashes, the kubelet restarts it. If a pod is lost or a node dies, the owning controller creates replacement pods, which the scheduler places on healthy nodes. This reconciliation loop is the core of how Kubernetes achieves reliability without manual intervention.
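The reconciliation idea can be reduced to a small loop. This is a minimal sketch of the pattern, not Kubernetes code: the `reconcile` function and the pod naming scheme are hypothetical, and pods are represented as plain strings.

```python
import itertools

_ids = itertools.count(1)  # source of unique pod names for this sketch

def reconcile(desired_replicas, running):
    """Return a pod list that matches the desired replica count."""
    pods = list(running)
    while len(pods) < desired_replicas:   # too few: create replacements
        pods.append(f"pod-{next(_ids)}")
    while len(pods) > desired_replicas:   # too many: remove the surplus
        pods.pop()
    return pods

pods = reconcile(3, [])      # initial rollout creates pod-1, pod-2, pod-3
pods.remove("pod-2")         # simulate a crash
pods = reconcile(3, pods)    # the next pass restores the desired count
print(pods)  # ['pod-1', 'pod-3', 'pod-4']
```

The lost pod is not repaired; a new one with a new identity takes its place, which is why workloads should not depend on a particular pod surviving.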
Managed Kubernetes
Running the control plane yourself is operationally expensive. Every major cloud provider offers a managed Kubernetes service where the control plane is their responsibility:
- Amazon EKS — Elastic Kubernetes Service
- Google GKE — Google Kubernetes Engine (the longest-running of the three; Kubernetes originated at Google)
- Azure AKS — Azure Kubernetes Service
Amazon ECS
Amazon Elastic Container Service (ECS) is AWS's proprietary container orchestrator. It is simpler than Kubernetes — fewer abstractions, less configuration — and integrates tightly with the AWS ecosystem (IAM, ALB, CloudWatch, ECR). The trade-off is that it is AWS-only; there is no portable open standard underneath.
Core concepts
- Task definition — a blueprint for a container (or group of containers): which image to use, CPU/memory allocation, environment variables, IAM role, port mappings. Equivalent to a Kubernetes Pod spec.
- Task — a running instance of a task definition. Equivalent to a Kubernetes Pod.
- Service — maintains a desired count of running tasks, integrates with an Application Load Balancer for traffic distribution, and replaces failed tasks automatically. Equivalent to a Kubernetes Deployment + Service.
- Cluster — a logical grouping of tasks and services.
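
To make the task definition concrete, the sketch below assembles one as a Python dictionary and prints it as JSON. The field names follow the ECS task definition format; the family name, account ID, image URI, role ARN, and environment variable are placeholders. A real Fargate task that pulls from ECR would normally also need an execution role, which is omitted here.

```python
import json

task_definition = {
    "family": "web-api",                        # hypothetical task definition name
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",
    "cpu": "256",                               # CPU units; 256 is 0.25 vCPU
    "memory": "512",                            # MiB
    "taskRoleArn": "arn:aws:iam::123456789012:role/web-api-task",  # placeholder ARN
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-api:1.0",  # placeholder URI
            "essential": True,
            "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
            "environment": [{"name": "LOG_LEVEL", "value": "info"}],
        }
    ],
}

print(json.dumps(task_definition, indent=2))
```

Each item in the Task definition bullet maps to a field: image, CPU/memory allocation, environment variables, IAM role, and port mappings.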
EC2 launch type vs Fargate
| | EC2 launch type | Fargate |
|---|---|---|
| Infrastructure | You manage EC2 instances in the cluster | AWS manages the underlying infrastructure |
| Billing | Pay for EC2 instances regardless of utilisation | Pay per task CPU/memory per second |
| Control | Full control over instance type, AMI, storage | No access to underlying host |
| Best for | Predictable, sustained workloads; GPU tasks | Variable workloads; teams that want no server management |
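
The billing row implies a break-even point: an always-on instance costs the same whether it is busy or idle, while Fargate charges only while tasks run. The arithmetic can be sketched as follows. The prices are made-up placeholders, not real AWS rates, and the model ignores storage, data transfer, and discount plans.

```python
HOURS_PER_MONTH = 720

def ec2_monthly_cost(instance_hourly):
    """An EC2 instance is billed for every hour it exists, busy or idle."""
    return instance_hourly * HOURS_PER_MONTH

def fargate_monthly_cost(task_hourly, tasks, utilisation):
    """Fargate bills only while tasks run; utilisation is the fraction of the month they do."""
    return task_hourly * tasks * HOURS_PER_MONTH * utilisation

def break_even_utilisation(instance_hourly, task_hourly, tasks_per_instance):
    """Utilisation above which the always-on instance becomes the cheaper option."""
    return instance_hourly / (task_hourly * tasks_per_instance)

# Placeholder prices, NOT real AWS rates.
INSTANCE_HOURLY = 0.10   # one instance with room for 4 tasks
TASK_HOURLY = 0.04       # one Fargate task of comparable size
TASKS = 4

threshold = break_even_utilisation(INSTANCE_HOURLY, TASK_HOURLY, TASKS)
print(f"break-even utilisation: {threshold:.1%}")

for utilisation in (0.25, 0.90):
    fargate = fargate_monthly_cost(TASK_HOURLY, TASKS, utilisation)
    cheaper = "Fargate" if fargate < ec2_monthly_cost(INSTANCE_HOURLY) else "EC2"
    print(f"{utilisation:.0%} busy: {cheaper} is cheaper")
```

With these placeholder numbers the threshold is 62.5%: workloads that run less than that fraction of the time favour Fargate, and sustained workloads favour EC2, which matches the "Best for" row.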
ECS vs Kubernetes
| | Amazon ECS | Kubernetes |
|---|---|---|
| Learning curve | Low — fewer concepts, AWS console driven | High — many abstractions, YAML-heavy |
| Portability | AWS only | Any cloud or on-premise |
| Ecosystem | Deep AWS integration (IAM, ALB, CloudWatch) | Large open-source ecosystem (Helm, Istio, Argo) |
| Flexibility | Opinionated — simpler but less customisable | Highly extensible via CRDs and operators |
| Best for | AWS-native teams wanting simplicity | Multi-cloud, large teams, complex workloads |