Kubernetes 2 Pods
Kubernetes Pods Note

Basic Definition

apiVersion: v1                                 # string: API version for Pod resource
kind: Pod                                      # string: Kind of Kubernetes object
metadata:
  name: my-app-pod                             # string: Name of the pod
  namespace: production                        # string: Namespace where the pod will be created
  labels:                                      # map[string]string: Labels for selection/grouping
    app: my-app
    tier: backend

spec:
  securityContext:                             # object: Pod-level security context
    runAsUser: 1000                            # int: Run all containers as this user
    runAsGroup: 3000                           # int: Primary group ID for all container processes
    fsGroup: 2000                              # int: Supplemental group ID; also applied to files in supported mounted volumes

  containers:                                  # array: List of containers in this Pod
    - name: web-container                      # string: Container name
      image: nginx:1.21                        # string: Image name with tag
      ports:
        - name: http                           # string: Port name
          containerPort: 80                    # int: Container port

    - name: app-container
      image: myregistry.com/app:2.0.1
      imagePullPolicy: IfNotPresent            # enum: Always | Never | IfNotPresent
      command: ["/bin/app"]                    # array[string]: Command to run
      args: ["--env", "prod", "--debug"]       # array[string]: Arguments to pass
      workingDir: /usr/src/app                 # string: Working directory inside container

      volumeMounts:
        - name: app-config
          mountPath: /etc/config
          readOnly: true                       # bool: Mount as read-only

      ports:
        - name: grpc
          containerPort: 50051
          hostPort: 30051                      # int: Host port to expose
          protocol: TCP                        # string: Protocol (TCP, UDP, SCTP)
        - name: metrics
          containerPort: 9090
          hostPort: 39090
          protocol: TCP

      env:
        - name: LOG_LEVEL
          value: "info"                        # string: Environment variable value
        - name: DB_HOST
          value: "postgres.prod.local"

      resources:                               # object: CPU/memory requirements
        limits:                                # object: Maximum resources allowed
          cpu: "500m"                          # string: CPU limit (e.g., 500m = 0.5 core)
          memory: "1Gi"                        # string: Memory limit
        requests:                              # object: Minimum guaranteed resources
          cpu: "200m"
          memory: "512Mi"

      lifecycle:                               # object: Hook for lifecycle events
        postStart:
          exec:
            command: ["/bin/sh", "-c", "echo Post-start hook triggered"]
        preStop:
          exec:
            command: ["/bin/sh", "-c", "echo Pre-stop hook triggered"]

  restartPolicy: Always                        # enum: Always | OnFailure | Never
  nodeName: node-123.internal.cluster.local    # string: Place this pod directly on a specific node, bypassing the scheduler (optional)

  nodeSelector:                                # map[string]string: Schedule pod to specific node(s)
    disktype: ssd

  imagePullSecrets:
    - name: regcred                            # string: Secret for pulling private images

  hostNetwork: false                           # bool: If true, pod uses host's network namespace

  volumes:                                     # array: Volumes available to containers
    - name: temp-storage
      emptyDir: {}                             # object: Temporary storage, erased when the pod is removed from the node

    - name: host-logs
      hostPath:
        path: /var/log/app                     # string: Host path
        type: Directory                        # string (optional): Type of host path

    - name: app-secret
      secret:
        secretName: app-secret                 # string: Secret resource name
        items:
          - key: password                      # string: Secret key
            path: db-password.txt              # string: File name in container

    - name: app-config
      configMap:
        name: app-config                       # string: ConfigMap resource name
        items:
          - key: config.yaml                   # string: Key in ConfigMap
            path: config.yaml                  # string: File name in container

Pod Lifecycle

Pod Status Phases

Official Docs

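  • Pending: pod is accepted by the cluster, but its containers are not all running yet (waiting to be scheduled, pulling images, or running init containers)
  • Running: pod is bound to a node and at least one main container is running, starting, or restarting. In this phase postStart hooks fire when each container starts, and livenessProbe and readinessProbe run periodically
  • Succeeded: all containers exited successfully and will not be restarted
  • Failed: all containers have terminated, and at least one exited with an error
  • Unknown: pod state can't be determined, usually because the node is unreachable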

Pod Creation Process

  1. User sends request to apiServer
  2. apiServer stores pod data in etcd and responds (async)
  3. Scheduler watches for unscheduled pods and picks a node
  4. kubelet on selected node creates the pod
  5. kubelet reports pod status back to apiServer, which stores it in etcd

Pod Deletion Process

  1. User sends delete request
  2. Pod status changes to Terminating
  3. kubelet initiates shutdown
  4. Runs preStop hook (if defined)
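  5. Containers receive SIGTERM and are stopped gracefully
  6. If the grace period (terminationGracePeriodSeconds, 30 seconds by default) is exceeded, remaining containers are force killed with SIGKILL
  7. Once all containers have stopped, the pod object is removed from apiServer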
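
The shutdown steps above can be tuned in the pod spec. The following is a minimal sketch, assuming a hypothetical pod named graceful-demo; the 45-second grace period and the 5-second sleep are illustrative values, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-demo                      # hypothetical name, for illustration only
spec:
  terminationGracePeriodSeconds: 45        # int: total time allowed for preStop + graceful stop (default 30)
  containers:
    - name: web
      image: nginx:1.21
      lifecycle:
        preStop:
          exec:
            # Runs before the container receives SIGTERM;
            # its run time counts against the grace period above.
            command: ["/bin/sh", "-c", "sleep 5"]
```

A short sleep in preStop is one common way to let load balancers stop sending traffic before the process shuts down. The grace period can also be overridden per request, e.g. kubectl delete pod graceful-demo --grace-period=10.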

Pod Start Process

  • Init containers start one by one (sequentially)
  • Main containers start in parallel
  • If a container fails, restart is controlled by restartPolicy
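
The start order described above can be sketched as follows. The pod name, the busybox image tag, and the /work path are assumptions for illustration; the DB host is reused from the basic definition.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-demo                          # hypothetical name
spec:
  restartPolicy: Always                    # with Never, a failed init container fails the whole pod
  initContainers:                          # run one at a time, in the order listed
    - name: wait-for-db
      image: busybox:1.36
      # Blocks until the DB hostname resolves; the next init container waits for this one
      command: ["sh", "-c", "until nslookup postgres.prod.local; do sleep 2; done"]
    - name: prepare-config
      image: busybox:1.36
      command: ["sh", "-c", "echo ready > /work/status"]
      volumeMounts:
        - name: work
          mountPath: /work
  containers:                              # started only after every init container has exited with 0
    - name: app
      image: nginx:1.21
      volumeMounts:
        - name: work
          mountPath: /work
  volumes:
    - name: work
      emptyDir: {}                         # shared scratch space between init and main containers
```

While the init containers are still running, the pod stays in Pending and kubectl get pod shows a status such as Init:0/2.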

Lifecycle Hooks & Probes

  • postStart: runs after container starts
  • preStop: runs before container is stopped
  • livenessProbe: checks if container is alive
  • readinessProbe: checks if container is ready to receive traffic

Note: Probes are not technically hooks.
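
The basic definition earlier shows the two hooks but no probes. A minimal sketch of both probes on one container, assuming nginx answers on / at port 80; the pod name and the timing values are illustrative only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo                         # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.21
      ports:
        - name: http
          containerPort: 80
      livenessProbe:                       # on repeated failure, kubelet restarts the container
        httpGet:
          path: /
          port: http                       # refers to the named port above
        initialDelaySeconds: 5             # wait before the first check
        periodSeconds: 10                  # check every 10 seconds
        failureThreshold: 3                # restart after 3 consecutive failures
      readinessProbe:                      # on failure, pod is removed from Service endpoints (no restart)
        httpGet:
          path: /
          port: http
        periodSeconds: 5
```

The key difference is the consequence of failure: a failing liveness probe restarts the container, while a failing readiness probe only stops traffic from being routed to the pod.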

Container Lifecycle

  • Container lifecycle is tied to its main process (PID 1)
  • If that process exits, the container stops
  • Kubernetes will restart containers based on restartPolicy

Scheduling, Preemption & Eviction

Scheduling: Filtering & Scoring

Filtering (Hard Requirements)

  • nodeName: bind directly to a specific node (this skips the scheduler entirely)
  • nodeSelector: match node labels

    If no node matches, the pod remains in Pending

Scoring (Soft Preferences)

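  • Scheduler ranks suitable nodes using the "preferred" rules of nodeAffinity, podAffinity, and podAntiAffinity
  • The "required" variants of the same rules act as hard filters instead

Affinity

  • Encourages pods to run together (e.g. on the same node or in the same zone)
  • Use case: improved performance

Anti-Affinity

  • Spreads pods across nodes
  • Use case: high availability (HA)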
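
Affinity rules come in a hard form (requiredDuringSchedulingIgnoredDuringExecution, applied while filtering) and a soft form (preferredDuringSchedulingIgnoredDuringExecution, applied while scoring). A minimal sketch combining one of each, assuming nodes carry the disktype label from the basic definition; the pod name and weight are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo                      # hypothetical name
  labels:
    app: my-app
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: only nodes labeled disktype=ssd pass filtering
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd"]
    podAntiAffinity:
      # Soft preference: score nodes higher if they do not already run an app=my-app pod
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100                      # int 1-100: relative importance during scoring
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: my-app
            topologyKey: kubernetes.io/hostname   # "together/apart" is judged per node
  containers:
    - name: web
      image: nginx:1.21
```

Because the anti-affinity rule is only preferred, replicas still schedule when there are fewer nodes than pods; making it required would leave the extra pods in Pending.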

Taints & Tolerations

Taints (on Nodes)

  • NoSchedule: never schedule unless tolerated
  • PreferNoSchedule: try to avoid, but allow
  • NoExecute: evict running pods unless tolerated

Tolerations (on Pods)

  • Allow pods to be scheduled onto nodes with matching taints (a toleration permits placement, it does not force it)

Useful for isolating workloads, e.g. GPU nodes, critical workloads, etc.
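
A minimal sketch of the GPU-node case, assuming a hypothetical node gpu-node-1 that has been both tainted and labeled with dedicated=gpu; the key, value, and names are placeholders.

```yaml
# Taint applied to the node beforehand, e.g.:
#   kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo                    # hypothetical name
spec:
  tolerations:
    - key: "dedicated"                     # must match the taint key
      operator: "Equal"                    # Equal: key and value must match | Exists: key only
      value: "gpu"
      effect: "NoSchedule"                 # must match the taint effect
  nodeSelector:
    dedicated: gpu                         # assumed node label; attracts the pod to the tainted nodes
  containers:
    - name: app
      image: nginx:1.21
```

The toleration keeps other workloads off the node, and the nodeSelector keeps this pod from landing elsewhere; both are needed for true isolation.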