You’ve learned about Pods, ReplicaSets, and Services. You understand that Kubernetes orchestrates containers across nodes. But here’s what happens next in most interviews: the interviewer asks, “How would you handle database storage in Kubernetes?” or “How do you manage application secrets?” and suddenly, you’re stuck.
Most engineers can explain what a Pod is. Far fewer can explain when you actually need a StatefulSet instead of a Deployment, or why you’d use a DaemonSet over a ReplicaSet. These aren’t academic distinctions—they reflect real production decisions that separate junior engineers from those ready for senior roles.
In production systems at companies like Uber or Netflix, you’ll encounter scenarios where the basic building blocks from Part 1 aren’t enough. You need databases that persist data even when Pods restart. You need configuration that changes without rebuilding images. You need to run monitoring agents on every single node. That’s what we’re covering here.
## Why Configuration and Storage Matter in Real Systems
When you first start with Kubernetes, everything feels like it should be stateless. Deploy a web server? Easy—just throw it in a Deployment. But real applications have state. They need to store files. They need credentials. They need to remember things between restarts.
I’ve seen teams struggle for weeks because they treated configuration as an afterthought. They hardcoded database passwords in container images. They lost customer data because they didn’t understand volume lifecycles. They built systems that couldn’t update configuration without downtime.
Here’s the truth: Kubernetes gives you tools for every scenario, but you need to know which tool to use and why. Using a Deployment when you need a StatefulSet isn’t just a minor mistake—it’s the difference between a system that works and one that loses data.
## What Interviewers Are Really Testing
When an interviewer asks about Secrets or StatefulSets, they’re not testing memorization. They’re checking if you understand:
- State management: Can you distinguish stateless from stateful workloads?
- Production readiness: Do you know how to handle credentials securely?
- Trade-offs: Can you explain when complexity is justified?
- Real-world thinking: Have you actually thought about what happens when a database Pod restarts?
A junior engineer might say, “Use a StatefulSet for databases.” A senior engineer explains, “StatefulSets provide stable network identities and ordered deployment, which matters for distributed databases like Cassandra where nodes need to know each other’s identity. But for a simple Postgres instance, you might just need a Deployment with a PersistentVolumeClaim—the extra guarantees aren’t always necessary.”
## ConfigMaps and Secrets: Managing Configuration
Let me tell you what happens in most startups: someone hardcodes an API key in the code. It works. They ship it. Six months later, they need to rotate that key, and suddenly they’re rebuilding containers and redeploying everything. I’ve watched teams take down production during key rotation because they never set up proper configuration management.
### ConfigMaps: Environment-Specific Settings
ConfigMaps store non-sensitive configuration data as key-value pairs. Think of them as external configuration files that you can inject into your Pods without baking them into your container images.
Here’s a real scenario: You’re building a microservice that needs different database connection strings for dev, staging, and production. Without ConfigMaps, you’d either:
- Build three different images (terrible idea)
- Pass everything as environment variables in your deployment YAML (messy and hard to manage)
- Mount configuration files from somewhere (but where?)
ConfigMaps solve this cleanly:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: production
data:
  database_host: "prod-db.example.com"
  database_port: "5432"
  cache_ttl: "300"
  log_level: "info"
  feature_flags: |
    {
      "new_ui": true,
      "experimental_api": false
    }
```
Now inject this into your Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:1.0
    env:
    - name: DATABASE_HOST
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: database_host
    - name: DATABASE_PORT
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: database_port
    volumeMounts:
    - name: config
      mountPath: /etc/config
  volumes:
  - name: config
    configMap:
      name: app-config
```
What’s happening here? The Pod gets `DATABASE_HOST` as an environment variable, but it can also read the entire ConfigMap as files in `/etc/config`. This gives you flexibility: use environment variables for simple values, and mount files for complex configuration like JSON or YAML.
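To make the two consumption paths concrete, here's a hedged sketch of what the application code inside that Pod might do. The function name, defaults, and the temp-directory simulation of the mount path are all illustrative, not part of any real API:

```python
import json
import os
import tempfile

def load_config(config_dir):
    """Read simple values from env vars and complex config from mounted files,
    mirroring the two ways the Pod above consumes the ConfigMap."""
    config = {
        # Injected via configMapKeyRef in the Pod spec
        "database_host": os.environ.get("DATABASE_HOST", "localhost"),
        "database_port": int(os.environ.get("DATABASE_PORT", "5432")),
    }
    # Each ConfigMap key becomes a file under the mount path (e.g. /etc/config)
    with open(os.path.join(config_dir, "feature_flags")) as f:
        config["feature_flags"] = json.load(f)
    return config

# Simulate the mounted ConfigMap with a temp directory
os.environ["DATABASE_HOST"] = "prod-db.example.com"
os.environ["DATABASE_PORT"] = "5432"
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "feature_flags"), "w") as f:
        f.write('{"new_ui": true, "experimental_api": false}')
    cfg = load_config(d)

print(cfg["database_host"])            # prod-db.example.com
print(cfg["feature_flags"]["new_ui"])  # True
```

The split mirrors the YAML: scalar settings arrive as env vars, while the JSON blob is read from a file because multi-line values are awkward as environment variables.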
⚠️ Common Mistake: ConfigMaps aren’t automatically updated in running Pods when you change them. If you modify a ConfigMap, you typically need to restart your Pods. Some teams use tools like Reloader to automate this, but in interviews, know that ConfigMap updates don’t trigger Pod restarts by default.
### Secrets: Handling Sensitive Data
Secrets work almost identically to ConfigMaps, but with a crucial difference: they’re designed for sensitive data like passwords, tokens, and SSH keys. Kubernetes stores them base64-encoded (note: not encrypted by default) and provides additional access controls.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: cG9zdGdyZXM=    # base64-encoded "postgres"
  password: c3VwZXJzZWNyZXQ=  # base64-encoded "supersecret"
```
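It's worth proving to yourself that base64 is encoding, not encryption. Anyone who can read the Secret object can decode its values in one line:

```python
import base64

# Encoding "postgres" yields exactly the username value in the Secret above
encoded = base64.b64encode(b"postgres").decode()
print(encoded)  # cG9zdGdyZXM=

# And decoding the password value recovers the plaintext instantly
decoded = base64.b64decode("c3VwZXJzZWNyZXQ=").decode()
print(decoded)  # supersecret
```

This is why RBAC on Secret objects and encryption at rest for etcd matter: base64 provides zero confidentiality on its own.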
In production systems, you’d typically use external secret management systems like AWS Secrets Manager, HashiCorp Vault, or Google Secret Manager, and sync them into Kubernetes using operators. But understanding Kubernetes Secrets is fundamental.
💡 Pro Insight: At scale, teams don’t manually create Secret objects. They use tools like External Secrets Operator or sealed-secrets to sync from external sources. But in interviews, explaining the basic Secret object shows you understand the foundation.
## Volumes and Persistent Storage: Where State Lives
Here’s a scenario that breaks many designs: You deploy a database as a regular Deployment. It works great. Then a node dies, Kubernetes reschedules the Pod to another node, and… all your data is gone. Why? Because container filesystems are ephemeral by default.
### Understanding Volume Types
Kubernetes offers multiple volume types, but let’s focus on what matters in real systems:
```mermaid
graph TD
    A[Pod Needs Storage] --> B{Data Lifetime?}
    B -->|Pod Lifetime| C[emptyDir]
    B -->|Survives Pod Restarts| D[PersistentVolume]
    D --> E{Cloud Provider?}
    E -->|AWS| F[EBS Volume]
    E -->|GCP| G[GCE Persistent Disk]
    E -->|Azure| H[Azure Disk]
    E -->|On-Prem| I[NFS/Ceph/Local]
    C --> J[Temporary Data<br/>Caches, Scratch Space]
    F --> K[Long-term Storage<br/>Databases, Files]
    G --> K
    H --> K
    I --> K
```
**emptyDir**: Created when a Pod starts, deleted when it stops. Perfect for caches or temporary processing. If your Pod has multiple containers that need to share files, emptyDir is your friend.

```yaml
volumes:
- name: cache
  emptyDir: {}
```
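The multi-container sharing case looks like this in practice. This is an illustrative sketch; the image names and mount path are hypothetical:

```yaml
# Hypothetical sketch: an app container writes scratch files that a
# sidecar reads, via one emptyDir that lives only as long as the Pod.
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch
spec:
  containers:
  - name: app
    image: my-app:1.0
    volumeMounts:
    - name: cache
      mountPath: /scratch
  - name: sidecar
    image: my-sidecar:1.0
    volumeMounts:
    - name: cache
      mountPath: /scratch
      readOnly: true
  volumes:
  - name: cache
    emptyDir: {}
```

Both containers see the same files under `/scratch`; when the Pod is deleted, the directory and its contents go with it.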
**PersistentVolumes (PV) and PersistentVolumeClaims (PVC)**: This is where it gets interesting. In real systems, you don’t directly create PersistentVolumes—you create PersistentVolumeClaims that describe what you need, and Kubernetes provisions the actual storage.
Think of it like ordering from a restaurant: You (the Pod) place an order (PVC) for what you want. The kitchen (StorageClass) prepares it. The waiter (Kubernetes) delivers the food (PV) to your table.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: fast-ssd
```
This PVC requests 20Gi of storage from the `fast-ssd` StorageClass. On AWS, this might provision an EBS volume. On GCP, a Persistent Disk. The beauty is your application doesn’t care—it just gets a filesystem.
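For reference, a StorageClass like the one the PVC names might look like this on AWS with the EBS CSI driver. The class is a sketch under stated assumptions: the name `fast-ssd` comes from the PVC above, and `gp3` is just one reasonable SSD-backed choice:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com    # AWS EBS CSI driver
parameters:
  type: gp3                     # SSD-backed EBS volume type
volumeBindingMode: WaitForFirstConsumer
```

`WaitForFirstConsumer` delays provisioning until a Pod actually uses the claim, so the volume lands in the same availability zone as the Pod.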
🎯 Interview Tip: When discussing storage, mention access modes: ReadWriteOnce (single node), ReadOnlyMany (multiple nodes for reading), ReadWriteMany (multiple nodes for writing). Most candidates forget that not all storage supports all modes. AWS EBS, for example, only supports ReadWriteOnce.
## StatefulSets: When Order and Identity Matter
Here’s where most engineers get confused. When do you actually need a StatefulSet instead of a Deployment?
StatefulSets are for workloads where:
- Each Pod needs a stable identity (predictable names like app-0, app-1, app-2)
- Startup/shutdown order matters (app-0 must start before app-1)
- Each Pod needs its own persistent storage (app-0 and app-1 have different data)
Real-world examples where this matters:
- Cassandra clusters: Each node needs a stable identity to participate in the ring
- Kafka: Brokers need stable network identities for partition leadership
- MySQL primary-replica: You need to start the primary before replicas
- Elasticsearch: Nodes need stable identities to maintain cluster state
Here’s a StatefulSet for PostgreSQL:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:14
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 20Gi
```
The volumeClaimTemplates section is key. Unlike a Deployment where all Pods share the same PVC, a StatefulSet creates a unique PVC for each Pod. If postgres-0 restarts, it reattaches to the same PVC—your data survives.
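The PVCs a StatefulSet creates follow a fixed naming convention: `<template-name>-<statefulset-name>-<ordinal>`. A tiny sketch makes the pattern (and why reattachment is deterministic) obvious:

```python
def statefulset_pvc_names(template, statefulset, replicas):
    """PVC names derived from a volumeClaimTemplate:
    <template-name>-<statefulset-name>-<ordinal>."""
    return [f"{template}-{statefulset}-{i}" for i in range(replicas)]

# The "data" template on the "postgres" StatefulSet, scaled to 3 replicas
print(statefulset_pvc_names("data", "postgres", 3))
# ['data-postgres-0', 'data-postgres-1', 'data-postgres-2']
```

Because `postgres-0` always claims `data-postgres-0`, a restarted Pod finds its old volume by name—there is no ambiguity about which data belongs to which replica.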
```mermaid
sequenceDiagram
    participant K as Kubernetes
    participant SS as StatefulSet
    participant P0 as postgres-0
    participant PVC0 as data-postgres-0
    participant P1 as postgres-1
    participant PVC1 as data-postgres-1
    K->>SS: Create StatefulSet
    SS->>P0: Create postgres-0
    SS->>PVC0: Create PVC for postgres-0
    P0->>PVC0: Attach volume
    Note over P0: postgres-0 fully running
    SS->>P1: Create postgres-1
    SS->>PVC1: Create PVC for postgres-1
    P1->>PVC1: Attach volume
    Note over P1: postgres-1 fully running
```
Notice the order: postgres-0 must be running before postgres-1 starts. This ordered startup is automatic with StatefulSets.
⚠️ Common Mistake: Many candidates think StatefulSets automatically handle database replication or clustering. They don’t. A StatefulSet gives you stable identities and ordered deployment, but you still need to configure your database to replicate. The StatefulSet just provides the infrastructure foundation.
## Ingress: External Access to Your Services
You’ve deployed your application. It’s running in Pods behind a Service. Now how do users actually reach it from the internet?
A Service gives you internal networking—Pods can talk to each other. But external access requires either:
- LoadBalancer Service: Creates a cloud load balancer (expensive—one per service)
- NodePort: Exposes a port on every node (awkward port numbers, hard to manage)
- Ingress: HTTP/HTTPS routing with a single entry point (this is what you want)
### How Ingress Works
Think of Ingress as a smart reverse proxy sitting at the edge of your cluster. It routes incoming requests to different Services based on hostnames and paths.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /users
        pathType: Prefix
        backend:
          service:
            name: user-service
            port:
              number: 80
      - path: /orders
        pathType: Prefix
        backend:
          service:
            name: order-service
            port:
              number: 80
  - host: admin.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: admin-service
            port:
              number: 80
```
This Ingress routes:
- `api.example.com/users/*` → `user-service`
- `api.example.com/orders/*` → `order-service`
- `admin.example.com/*` → `admin-service`
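The routing decision an Ingress controller makes can be modeled as host lookup plus longest-prefix matching on the path. This is a toy sketch of that logic, not how any real controller is implemented:

```python
def route(host, path, rules):
    """Toy router: pick the longest matching path prefix for the host,
    mirroring pathType: Prefix semantics in the Ingress above."""
    # Check the most specific (longest) prefix first
    for prefix, service in sorted(rules.get(host, []), key=lambda r: -len(r[0])):
        norm = prefix.rstrip("/")
        # Match the prefix exactly or at a path-segment boundary
        if path == prefix or path == norm or path.startswith(norm + "/"):
            return service
    return None  # no rule matched; controller returns its default backend

RULES = {
    "api.example.com": [("/users", "user-service"), ("/orders", "order-service")],
    "admin.example.com": [("/", "admin-service")],
}

print(route("api.example.com", "/users/42", RULES))    # user-service
print(route("api.example.com", "/orders/7", RULES))    # order-service
print(route("admin.example.com", "/anything", RULES))  # admin-service
```

Note the segment-boundary check: with `pathType: Prefix`, `/users` matches `/users/42` but not `/userstats`—a detail real controllers enforce too.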
```mermaid
graph LR
    User[User] -->|api.example.com/users| Ingress[Ingress Controller]
    Ingress -->|Route /users| US[user-service]
    Ingress -->|Route /orders| OS[order-service]
    Ingress -->|admin.example.com| AS[admin-service]
    US --> UP1[User Pod 1]
    US --> UP2[User Pod 2]
    OS --> OP1[Order Pod 1]
    AS --> AP1[Admin Pod 1]
```
💡 Pro Insight: Ingress is just the configuration. You need an Ingress Controller (like nginx-ingress, Traefik, or cloud-specific controllers) actually running in your cluster to implement these rules. This confuses many beginners—the Ingress object is declarative config, the controller is the implementation.
## DaemonSets: One Pod Per Node
Sometimes you need exactly one Pod running on every node. Not two. Not zero. One per node. This is what DaemonSets do.
Real-world use cases:
- Log collectors: Fluentd or Filebeat collecting logs from every node
- Monitoring agents: Prometheus node exporter on each node
- Network plugins: CNI plugins that need to run on every node
- Storage daemons: Ceph or GlusterFS storage agents
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd:latest
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
```
The key difference from a Deployment: Kubernetes automatically schedules one Pod on each node, including new nodes that join the cluster. Add a node, get a DaemonSet Pod. Remove a node, the DaemonSet Pod goes with it.
🎯 Interview Tip: Explain that DaemonSets ignore the normal scheduler logic about spreading Pods across nodes. They specifically want one per node. If an interviewer asks, “How would you run monitoring on all nodes?” and you suggest a Deployment with replicas, you’ve missed the point.
## Jobs and CronJobs: Batch Processing
Not everything needs to run forever. Sometimes you need to:
- Run a database migration
- Process a batch of files
- Generate a daily report
- Clean up old data
### Jobs: Run Once Until Complete
A Job runs a Pod until it successfully completes (exits with code 0). If the Pod fails, the Job creates a new Pod and tries again.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: database-migration
spec:
  template:
    spec:
      containers:
      - name: migration
        image: my-app-migrations:v2
        command: ["python", "migrate.py"]
      restartPolicy: Never
  backoffLimit: 3
```
This Job runs a database migration. If it fails, Kubernetes retries up to 3 times (backoffLimit). Once it succeeds, the Job is complete and the Pod remains for you to check logs.
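The retry semantics are worth internalizing: the Job keeps creating replacement Pods until one succeeds or the failure count exceeds `backoffLimit` (with exponentially increasing delays between attempts, which this sketch omits). A simplified simulation of that decision loop:

```python
def run_job(attempt_results, backoff_limit=3):
    """Simulate Job retry semantics: retry failed Pods until one
    succeeds or failures exceed backoffLimit.
    attempt_results: whether each successive Pod attempt succeeds."""
    failures = 0
    for succeeded in attempt_results:
        if succeeded:
            return ("Complete", failures)
        failures += 1
        if failures > backoff_limit:
            return ("Failed", failures)
    return ("Failed", failures)

print(run_job([False, False, True]))          # ('Complete', 2)
print(run_job([False, False, False, False]))  # ('Failed', 4)
```

This also shows why `restartPolicy: Never` is common for Jobs: each failure produces a fresh Pod you can inspect, rather than restarting containers in place and hiding the failure history.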
### CronJobs: Scheduled Execution
CronJobs are Jobs that run on a schedule, using the same cron syntax you’d use in Linux.
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-database
spec:
  schedule: "0 2 * * *"  # every day at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: postgres:14
            command:
            - /bin/sh
            - -c
            - pg_dump -h $DB_HOST -U postgres mydb > /backup/dump.sql
            env:
            - name: DB_HOST
              value: postgres-service
          restartPolicy: OnFailure
```
This backs up a database every night at 2 AM. Kubernetes creates a new Job (which creates a new Pod) on schedule.
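If cron syntax is rusty, the five fields read minute, hour, day-of-month, month, day-of-week. A quick helper (illustrative only) names the fields of the schedule used above:

```python
def describe_cron(expr):
    """Split a five-field cron expression into its named fields."""
    fields = ["minute", "hour", "day_of_month", "month", "day_of_week"]
    return dict(zip(fields, expr.split()))

# "0 2 * * *" = minute 0 of hour 2, every day, every month -> 2:00 AM daily
print(describe_cron("0 2 * * *"))
```

So `"0 2 * * *"` fires once per day; `"*/15 * * * *"` would fire every 15 minutes, and `"0 2 * * 0"` only on Sundays at 2 AM.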
⚠️ Common Mistake: CronJobs don’t guarantee exactly-once execution. If the cluster is busy or a node fails, a scheduled run might be missed. For critical workflows, you need external job scheduling systems or at least monitoring to detect missed runs.
## Putting It All Together: A Real Production Example
Let’s walk through a realistic scenario. You’re building a content management system with:
- Web frontend (stateless)
- API backend (stateless)
- PostgreSQL database (stateful)
- Redis cache (can be stateful or not)
- Background workers for image processing
- Daily cleanup job
Here’s how you’d architect this:
```mermaid
graph TD
    Internet[Internet] --> Ingress[Ingress Controller]
    Ingress -->|cms.example.com| Frontend[Frontend Service]
    Ingress -->|api.cms.example.com| API[API Service]
    Frontend --> FP1[Frontend Pod]
    Frontend --> FP2[Frontend Pod]
    API --> AP1[API Pod]
    API --> AP2[API Pod]
    API --> DB[(PostgreSQL<br/>StatefulSet)]
    API --> Redis[(Redis<br/>StatefulSet)]
    AP1 --> Queue[Message Queue]
    AP2 --> Queue
    Queue --> Worker1[Worker Pod]
    Queue --> Worker2[Worker Pod]
    Worker1 --> S3[S3 Storage]
    Worker2 --> S3
    CronJob[Daily Cleanup<br/>CronJob] -.->|2 AM| DB
```
**Frontend and API**: Deployments with 2+ replicas each. They’re stateless, so scaling and updates are simple.

**PostgreSQL**: StatefulSet with persistent storage. Even if the Pod restarts, data persists.

**Redis**: Could be a Deployment with a PVC for persistence, or a StatefulSet if you need Redis clustering.

**Workers**: Deployment that scales based on queue depth. These process images and upload to S3.

**Cleanup Job**: CronJob that runs nightly to delete old data.

**Ingress**: Single entry point routing cms.example.com to the frontend and api.cms.example.com to the API.
This architecture uses every concept we’ve covered:
- ConfigMaps for environment-specific settings
- Secrets for database passwords and API keys
- PersistentVolumeClaims for database storage
- StatefulSet for the database
- Deployments for stateless services
- Ingress for external access
- CronJob for scheduled maintenance
## How to Talk About This in Interviews
When an interviewer asks, “How would you deploy a database in Kubernetes?” don’t just say “StatefulSet.” Walk through the reasoning:
“For a database, I’d use a StatefulSet rather than a Deployment because we need stable Pod identities and persistent storage that survives Pod restarts. Each Pod in a StatefulSet gets its own PersistentVolumeClaim, so if postgres-0 crashes and restarts, it reattaches to the same volume—no data loss.
I’d configure the StatefulSet with volumeClaimTemplates to automatically provision storage using a StorageClass that maps to our cloud provider’s block storage—AWS EBS or GCP Persistent Disks. I’d set the access mode to ReadWriteOnce since most databases can’t be safely accessed from multiple nodes simultaneously.
For configuration like connection credentials, I’d use Secrets rather than hardcoding them in the container image. This lets us rotate credentials without rebuilding images.
That said, for production databases, many teams actually run databases outside Kubernetes—either managed services like RDS or dedicated database servers. StatefulSets work, but they add operational complexity. You need to handle backups, replication, and failover yourself. It’s a trade-off between operational flexibility and complexity.”
Notice what this answer demonstrates:
- Understanding of StatefulSets vs Deployments
- Knowledge of storage concepts
- Security awareness (Secrets)
- Real-world pragmatism (maybe databases shouldn’t be in Kubernetes)
- Trade-off analysis
That’s senior-level thinking.
## Wrapping Up
You now understand the infrastructure beyond basic Pods and Services. You know how to:
- Manage configuration with ConfigMaps and Secrets
- Handle persistent data with volumes and StatefulSets
- Route external traffic with Ingress
- Run system-level services with DaemonSets
- Execute batch work with Jobs and CronJobs
The real skill isn’t memorizing these objects—it’s knowing when to use each one. A Deployment for stateless apps. A StatefulSet when identity matters. A DaemonSet when you need one per node. A Job for one-time tasks. A CronJob for scheduled work.
In your next interview, when someone asks about Kubernetes, you won’t just list objects. You’ll explain the problems they solve and the trade-offs they involve. That’s the difference between reciting definitions and demonstrating understanding.
Practice explaining these concepts out loud. Draw the diagrams. Write the YAML. Build the mental model of how these pieces connect. When you can explain why Netflix uses StatefulSets for Cassandra but Deployments for their API gateways, you’re ready for the interview.