The Secret Life of Kubernetes: A Tale of Two Clusters and the Quest for Secure Secrets

The Problem We All Face (But Pretend We Don’t)

Picture this: It’s 10 PM in the DevOps Den. Laptop screens glow under table lamps, and half the team is connected via Teams from home. We’re staring at Kubernetes clusters and YAML files scattered across multiple screens. We’ve been using ConfigMaps and Kubernetes Secrets for months now, but the workflow is painful—manually collecting credentials from developers, base64-encoding them, hardcoding values into YAML files, applying them to clusters. Rinse, repeat. Every new application, every credential rotation, every new environment means more manual work.

The truth? ConfigMaps and Secrets work, but they don’t scale. Secrets scattered across clusters, no centralized management, no rotation strategy, no audit trail, and definitely no peace of mind when security audit emails land in our inbox. There had to be a better way—something simpler, more scalable, more secure.

This is the story of how our team built a production-grade secret management system using HashiCorp Vault’s sidecar injection pattern. What started as a learning exercise with dev mode evolved into a full-blown production HA Vault setup with multi-cluster authentication, cross-cluster secret injection, and enough YAML files to make even the most seasoned DevOps engineer’s eye twitch.

Grab your coffee (or your beverage of choice - no judgment here), because this is going to be a ride through two DigitalOcean Kubernetes clusters, production Vault deployments, late nights in the DevOps Den, and the moment we realized that unsealing Vault is actually kind of nerve-wracking.

Act I: The Evolution from Dev to Production

The Humble Beginning: Dev Mode

Every great journey starts with a simple step. Before tackling production, we started with dev mode to understand the basics.

# Add the HashiCorp Helm repository
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update

# The "I'm just testing" installation
helm install vault hashicorp/vault \
  --set "server.dev.enabled=true" \
  --set "server.dev.devRootToken=toor123" \
  -n vault01 --create-namespace

Dev mode is beautiful in its simplicity:

  • Pre-unsealed (no unseal key juggling)
  • In-memory storage (ephemeral - restart and poof, it’s gone)
  • Single node (no HA complexity)
  • Root token available immediately

It’s perfect for learning, terrible for production, and absolutely not something you want to run when real secrets are at stake.

We spent a day testing secret injection, authentication, and policies. Impressive, but clearly just a warm-up.

The “Oh Wait, This Needs to Be Real” Moment

Two days later, the urgency hit. “We need this in production. Like, yesterday.”

Time to do it right. Production means HA, persistence, proper unsealing. No shortcuts.

Option 1: Production Single Node

helm install vault hashicorp/vault -n vault --create-namespace

This gives you a proper Vault installation with persistent storage, but it’s still a single point of failure. Better, but not quite there.

Option 2: Production HA with Raft (The One We Chose… Eventually)

helm install vault hashicorp/vault \
  --set "server.ha.enabled=true" \
  --set "server.ha.replicas=2" \
  --set "server.ha.raft.enabled=true" \
  -n vault --create-namespace

Now we’re talking. Two Vault pods running in HA mode using the Raft consensus protocol. Two nodes is the smallest footprint that exercises the real HA workflow (a production cluster wants three or five so Raft keeps quorum when a node dies, more on that later). This is what production starts to look like.

But wait. Let me tell you about the Raft rabbit hole we fell into.

The Great Raft Documentation Hunt (Or: Where Is Everything?)

Here’s the thing about Raft in Vault: it exists, it’s well-documented somewhere, but finding that documentation when you need it is like finding a specific grain of sand on a beach.

We spent hours Googling, reading docs, trying commands. The information exists but is scattered across multiple sources.

The official HashiCorp docs do cover the pieces: Raft storage, HA mode, the Helm chart, each on its own page.

But here’s what they don’t prominently tell you (or at least, not in a way that jumps out when you’re frantically Googling at midnight):

  1. vault-1 won’t automatically join vault-0. You need to run vault operator raft join on each follower node yourself. The Helm chart creates the StatefulSet, but the clustering? That’s on you.

  2. The internal service name matters. It’s vault-0.vault-internal:8200, not vault-0:8200, not vault:8200, not vault-0.vault.svc.cluster.local:8200. Get it wrong, and you’ll see cryptic “failed to join raft cluster” errors.

  3. Each node needs to be unsealed individually. vault-0? Sealed until you unseal it. vault-1? Also sealed. Pod restarts? Everything’s sealed again. Node failure? You’re unsealing again. Hope you saved those keys!

  4. The order matters. Initialize vault-0, unseal it, then join vault-1, then unseal vault-1. Do it out of order and you’ll get errors that make you question your life choices.

  5. GitHub is full of issues about this exact pain point.

We spent hours piecing together information from:

  • Three different HashiCorp tutorials (each covering one piece of the puzzle)
  • GitHub issues (where the real truth lives)
  • Random blog posts from people who’ve been there
  • The Helm chart’s values.yaml comments (surprisingly informative, actually)

The information exists, but it’s scattered like puzzle pieces across the internet. No single “Raft in Kubernetes: Here’s Everything You Need to Know” guide existed.

The moment it clicked: Late evening in the DevOps Den, someone looked up from their laptop. “Wait… Raft is a consensus algorithm, not magic clustering?”

Exactly. The Helm chart creates the infrastructure, but you create the cluster by explicitly telling each node to join. The mental model finally made sense.

So yes, we now have two Vault pods running in HA mode using Raft. But getting here required archaeological-level documentation excavation.

The Initialization Ritual

When you install Vault in production mode, it starts sealed. Think of it as a safe that needs to be unlocked before you can use it. Initialization is a one-time operation, and if you lose the keys it produces, you’re locked out for good.

# Initialize Vault and save the keys (KEEP THESE SAFE!)
kubectl exec vault-0 -- vault operator init \
  -key-shares=1 \
  -key-threshold=1 \
  -format=json > keys.json

# Extract the unseal key
VAULT_UNSEAL_KEY=$(cat keys.json | jq -r ".unseal_keys_b64[]")
echo $VAULT_UNSEAL_KEY

# Extract the root token
VAULT_ROOT_KEY=$(cat keys.json | jq -r ".root_token")
echo $VAULT_ROOT_KEY

CRITICAL WARNING: In a real production environment, you’d use 5 key shares with a threshold of 3, and distribute them to different trusted individuals. You’d also enable auto-unseal using a cloud KMS. But for this learning journey, we’re keeping it simple with one key.

That keys.json file? Treat it like the nuclear launch codes. Seriously.
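
For reference, a production-grade init would look something like this: same command, stricter parameters, and each key share handed to a different person. A hedged sketch only, since init runs exactly once per cluster:

# The production variant - don't mix this with the single-key init above
kubectl exec vault-0 -- vault operator init \
  -key-shares=5 \
  -key-threshold=3 \
  -format=json > prod-keys.json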

Unsealing the Vault (Literally)

# Unseal vault-0
kubectl exec vault-0 -- vault operator unseal $VAULT_UNSEAL_KEY

# Login to verify
kubectl exec vault-0 -- vault login $VAULT_ROOT_KEY

# Join vault-1 to the Raft cluster
kubectl exec -ti vault-1 -- vault operator raft join http://vault-0.vault-internal:8200

# Unseal vault-1
kubectl exec vault-1 -- vault operator unseal $VAULT_UNSEAL_KEY

# Login to vault-1
kubectl exec vault-1 -- vault login $VAULT_ROOT_KEY

And just like that, you have a two-node HA Vault cluster. Both nodes are part of the Raft cluster, one is the leader, and they’re ready to serve secrets.
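
To see the Raft membership for yourself (this needs a token, so it assumes the vault login above has already happened):

# Both nodes should show up, one as leader and one as follower
kubectl exec vault-0 -- vault operator raft list-peers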

Making Vault Accessible

We set up two ways to access Vault:

1. NodePort Service (For Direct IP Access)

apiVersion: v1
kind: Service
metadata:
  name: vault-node
  namespace: vault
spec:
  type: NodePort
  publishNotReadyAddresses: true
  ports:
    - name: http
      port: 8200
      targetPort: 8200
      nodePort: 32000
    - name: https-internal
      port: 8201
      targetPort: 8201
  selector:
    app.kubernetes.io/name: vault
    app.kubernetes.io/instance: vault
    component: server

Now we can access Vault at any node’s IP on port 32000. In our case: http://64.227.181.21:32000

2. Ingress (For Fancy DNS Names)

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vault-ingress
  namespace: vault
spec:
  ingressClassName: nginx
  rules:
  - host: vault2.skill-mine.com
    http:
      paths:
      - backend:
          service:
            name: vault-internal
            port:
              number: 8200
        pathType: ImplementationSpecific

Much better: https://vault2.skill-mine.com

Quick test to make sure it’s working:

# Get a node IP
kubectl get no -o wide

# Test with the root token
curl -H "X-Vault-Token: $VAULT_ROOT_KEY" \
  -X GET \
  http://64.227.181.21:32000/v1/sys/seal-status | jq

If you see "sealed": false, you’re in business.

Act II: Setting Up the Secret Architecture

Enabling Authentication

Jump into vault-0 and let’s configure things:

kubectl exec -it vault-0 -- sh

# Inside the Vault container
vault auth list
vault auth enable kubernetes

The Secret Hierarchy

Instead of just dumping secrets anywhere, we designed a hierarchical structure:

# Enable the KV-v2 secrets engine at a custom path
vault secrets enable -path=smtc kv-v2

# Store a secret
vault kv put smtc/project1/subproject1/env01 \
  username="username1" \
  password="pass1"

# Verify it worked
vault kv get smtc/project1/subproject1/env01

# List secrets in the path
vault kv list smtc/project1/

The path structure smtc/project1/subproject1/env01 means:

  • smtc - The root path for all project secrets
  • project1 - Specific project
  • subproject1 - Subproject or microservice
  • env01 - Environment (dev, staging, prod, etc.)

This scales beautifully as your organization grows.
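
Adding a new environment or project is just another path, no re-architecting needed. A hypothetical example (env02 and project2 don’t exist in this setup, they’re only here to show the shape):

# A new environment under the same subproject
vault kv put smtc/project1/subproject1/env02 \
  username="username2" \
  password="pass2"

# A second project follows the same convention
vault kv put smtc/project2/service1/env01 \
  api_key="key123"

# Browse what exists at any level
vault kv list smtc/project1/subproject1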

Configuring Kubernetes Authentication

This is where we teach Vault to trust our Kubernetes cluster:

# Check current config (should be empty)
vault read auth/kubernetes/config

# Configure it
vault write auth/kubernetes/config \
  token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
  kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443" \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# Verify it's configured
vault read auth/kubernetes/config

Creating Policies (The Gatekeeper)

Policies are how Vault controls access. Think of them as fine-grained permissions:

# List existing policies
vault policy list

# Create a policy for our project
vault policy write smtc-policy - <<EOF
path "smtc/data/project1/subproject1/env01" {
   capabilities = ["read"]
}
EOF

# Verify it was created
vault policy list

Notice the /data/ in the path? That’s a KV-v2 quirk. The actual API path has /data/ inserted between the mount path and your secret path.

Creating Vault Roles

Roles bind Kubernetes service accounts to Vault policies:

# List existing roles (probably empty)
vault list auth/kubernetes/role

# Create a role for the local cluster's vault namespace
vault write auth/kubernetes/role/smtc-role \
  bound_service_account_names=vault \
  bound_service_account_namespaces=vault \
  policies=smtc-policy \
  ttl=24h

# Verify
vault read auth/kubernetes/role/smtc-role

# List all roles
vault list auth/kubernetes/role

What this says: “If a pod uses the service account named vault in the namespace vault, allow it to authenticate and apply the smtc-policy.”

We’ll also create a role for another namespace that we’ll use later:

vault write auth/kubernetes/role/local01-smtc-role \
  bound_service_account_names=vault \
  bound_service_account_namespaces=local01 \
  policies=smtc-policy \
  ttl=24h

Act III: The Magic of Sidecar Injection

Deploying a Test Application

Let’s deploy a simple nginx application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginxtest1
  labels:
    app: nginxtest1
spec:
  selector:
    matchLabels:
      app: nginxtest1
  replicas: 1
  template:
    metadata:
      labels:
        app: nginxtest1
    spec:
      serviceAccountName: vault
      containers:
        - name: nginxtest1
          image: nginx

Nothing special - just a plain nginx pod using the vault service account.

kubectl apply -f 1-deployment-nginx1.yaml

The First Patch: Basic Injection

Now let’s inject secrets with a simple annotation patch:

spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "smtc-role"
        vault.hashicorp.com/agent-inject-secret-database-config.txt: "smtc/data/project1/subproject1/env01"
kubectl patch deployment nginxtest1 \
  --patch "$(cat 3a-smtc-patch-inject-secrets.yaml)"

What happens?

  1. The Vault Agent Injector (a mutating webhook) intercepts the pod creation
  2. It injects an init container that authenticates to Vault and fetches secrets
  3. It injects a sidecar container that keeps the secrets up to date
  4. It mounts the secrets at /vault/secrets/database-config.txt

The secrets are written in raw format. Let’s check:

kubectl exec $(kubectl get pod -l app=nginxtest1 -o jsonpath="{.items[0].metadata.name}") \
  --container nginxtest1 -- cat /vault/secrets/database-config.txt

You’ll see something like:

data: map[password:pass1 username:username1]

That’s… not ideal for most applications.
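
Before fixing the formatting, it’s worth confirming the webhook really did mutate the pod. It should now carry a vault-agent-init init container and a vault-agent sidecar next to nginx; something like this will show them:

# Expect vault-agent-init on the first line, nginxtest1 and vault-agent on the second
kubectl get pod -l app=nginxtest1 \
  -o jsonpath='{.items[0].spec.initContainers[*].name}{"\n"}{.items[0].spec.containers[*].name}{"\n"}'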

The Second Patch: Template Magic

Let’s make it actually useful with templates:

spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/agent-inject-status: "update"
        vault.hashicorp.com/role: "smtc-role"
        vault.hashicorp.com/agent-inject-secret-database-config.txt: "smtc/data/project1/subproject1/env01"
        vault.hashicorp.com/agent-inject-template-database-config.txt: |
          {{- with secret "smtc/data/project1/subproject1/env01" -}}
          postgresql://{{ .Data.data.username }}:{{ .Data.data.password }}@postgres:5432/wizard
          {{- end -}}
kubectl patch deployment nginxtest1 \
  --patch "$(cat 3b-smtc-patch-inject-secrets-as-template01pg.yaml)"

Now check the secret:

kubectl exec $(kubectl get pod -l app=nginxtest1 -o jsonpath="{.items[0].metadata.name}") \
  --container nginxtest1 -- cat /vault/secrets/database-config.txt

Output:

postgresql://username1:pass1@postgres:5432/wizard

Beautiful! A ready-to-use PostgreSQL connection string. Your application just reads a file and connects - it never needs to know about Vault.
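
How the application consumes it is entirely up to the app. A minimal sketch from inside the container (psql isn’t in the nginx image, this is only to illustrate the idea):

# The connection string is just a file - read it and use it
DB_URL=$(cat /vault/secrets/database-config.txt)
psql "$DB_URL" -c 'SELECT 1;'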

Act IV: The Multi-Cluster Plot Twist

Here’s where things get spicy. We have two DigitalOcean Kubernetes clusters:

  1. Cluster 1 (do-blr1-testk8s-1): The Vault server cluster
  2. Cluster 2 (do-blr1-testk8s-2): The client cluster that needs secrets

The challenge: How do pods in Cluster 2 authenticate to Vault in Cluster 1 and fetch secrets?

On the Vault Server (Cluster 1)

First, we need to create a separate authentication backend for the remote cluster:

kubectl exec -it vault-0 -- sh

# Create a dedicated auth mount for the remote cluster
vault auth enable --path=remote01-cluster kubernetes

Why a separate path? Because each Kubernetes cluster has its own CA certificate and API server. We need to configure them separately.

On the Client Cluster (Cluster 2)

Switch your kubectl context to the second cluster and install the Vault Agent Injector:

# Option 1: Using direct IP
helm upgrade --install remote01 hashicorp/vault \
  --set "global.externalVaultAddr=http://64.227.181.21:32000" \
  -n vault --create-namespace

# Option 2: Using DNS (cleaner)
helm upgrade --install vault hashicorp/vault \
  --set "global.externalVaultAddr=https://vault2.skill-mine.com" \
  -n vault --create-namespace

This installs only the Vault Agent Injector webhook - not the Vault server itself. The injector will configure pods to connect to the external Vault.
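
A quick sanity check (assuming the remote01 release name from Option 1): the namespace should contain only the injector, with no vault-0 StatefulSet in sight.

# Expect a single *-vault-agent-injector pod, prefixed with your release name
kubectl get pods -n vault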

Creating the Service Endpoint

We need to tell Kubernetes about the external Vault server. There are two approaches:

Approach 1: Service with Endpoints (For IP Addresses)

apiVersion: v1
kind: Service
metadata:
  name: vault-server
  namespace: vault
spec:
  ports:
  - protocol: TCP
    port: 32000
    targetPort: 32000
---
apiVersion: v1
kind: Endpoints
metadata:
  name: vault-server
  namespace: vault
subsets:
  - addresses:
      - ip: 64.227.181.21
    ports:
      - port: 32000

Approach 2: ExternalName Service (For DNS Names)

apiVersion: v1
kind: Service
metadata:
  name: external-vault
  namespace: vault
spec:
  type: ExternalName
  externalName: vault2.skill-mine.com

We used both in our testing. ExternalName is cleaner if you have DNS set up.

kubectl apply -f 4-external-vault-svc-endpoint.yaml

The Service Account Token Dance

In Kubernetes 1.24+, service account tokens are no longer automatically created as secrets. We need to explicitly request them:

apiVersion: v1
kind: Secret
metadata:
  name: vault-token-seeecretttt
  annotations:
    kubernetes.io/service-account.name: vault
type: kubernetes.io/service-account-token
kubectl apply -f 5-vault-secret.yaml

Extracting Authentication Credentials

Now we extract the credentials that Vault needs to validate tokens from this cluster:

# Get the secret name
VAULT_HELM_SECRET_NAME=$(kubectl get secrets --output=json | \
  jq -r '.items[].metadata | select(.name|startswith("vault-token-")).name')

echo $VAULT_HELM_SECRET_NAME

# Extract the JWT token
TOKEN_REVIEW_JWT=$(kubectl get secret $VAULT_HELM_SECRET_NAME \
  --output='go-template={{ .data.token }}' | base64 --decode)

echo $TOKEN_REVIEW_JWT

# Extract the CA certificate
KUBE_CA_CERT=$(kubectl config view --raw --minify --flatten \
  --output='jsonpath={.clusters[].cluster.certificate-authority-data}' | base64 --decode)

echo $KUBE_CA_CERT

# Get the Kubernetes API endpoint
KUBE_HOST=$(kubectl config view --raw --minify --flatten \
  --output='jsonpath={.clusters[].cluster.server}')

echo $KUBE_HOST

Generating the Vault Configuration Command

Create a script with the configuration command:

cat > 7a-vault-auth << EOF
vault write auth/remote01-cluster/config \
     token_reviewer_jwt="$TOKEN_REVIEW_JWT" \
     kubernetes_host="$KUBE_HOST" \
     kubernetes_ca_cert="$KUBE_CA_CERT" \
     issuer="https://kubernetes.default.svc.cluster.local"
EOF

IMPORTANT: Copy the contents of 7a-vault-auth and run it inside the Vault server in Cluster 1.
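
If you’d rather not copy-paste, you can also pipe the script straight into vault-0 after switching your kubectl context back to Cluster 1. This assumes the vault login from Act I is still in effect inside the pod (the CLI caches the token in the container):

# Run the generated commands inside vault-0 without opening a shell
kubectl exec -i vault-0 -- sh < 7a-vault-auth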

Back on the Vault Server (Cluster 1)

Switch back to your Cluster 1 context and configure the remote cluster authentication:

kubectl exec -it vault-0 -- sh

# Paste and run the vault write command from 7a-vault-auth

# Create a role for the remote cluster
vault write auth/remote01-cluster/role/vault-smtc-role \
  bound_service_account_names=vault \
  bound_service_account_namespaces=vault \
  policies=smtc-policy \
  ttl=24h

# Verify it was created
vault list auth/remote01-cluster/role

Notice we’re using the same smtc-policy - no need to duplicate policies.

Back on the Client Cluster (Cluster 2): Deploy and Test

Now for the moment of truth:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginxtest1
  namespace: vault
  labels:
    app: nginxtest1
spec:
  selector:
    matchLabels:
      app: nginxtest1
  replicas: 1
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/role: 'vault-smtc-role'
        vault.hashicorp.com/auth-path: 'auth/remote01-cluster'
        vault.hashicorp.com/agent-inject-secret-credentials.txt: 'smtc/data/project1/subproject1/env01'
      labels:
        app: nginxtest1
    spec:
      serviceAccountName: vault
      containers:
        - name: nginxtest1
          image: nginx

Key points:

  • vault.hashicorp.com/auth-path: 'auth/remote01-cluster' - This tells the Vault Agent to authenticate using our custom auth path
  • vault.hashicorp.com/role: 'vault-smtc-role' - The role we created earlier
kubectl apply -f 6-deployment-nginx-remote01.yaml

Checking if It Works

# Check the logs (look for successful authentication)
kubectl logs $(kubectl get pod -l app=nginxtest1 -o jsonpath="{.items[0].metadata.name}")

# Check pod details
kubectl describe po $(kubectl get pod -l app=nginxtest1 -o jsonpath="{.items[0].metadata.name}")

# Check vault-agent-init container logs specifically
kubectl logs $(kubectl get pod -l app=nginxtest1 -o jsonpath="{.items[0].metadata.name}") \
  -c vault-agent-init

If you see “authentication successful” in the logs, congratulations! You’ve just set up cross-cluster Vault authentication.
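
The real proof is the rendered file itself (add -n vault if that isn’t your current namespace):

kubectl exec $(kubectl get pod -l app=nginxtest1 -o jsonpath="{.items[0].metadata.name}") \
  --container nginxtest1 -- cat /vault/secrets/credentials.txt

Since this deployment doesn’t use a template, expect the same raw map-style output as in Act III.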

Scaling to Multiple Namespaces

Want to use this in multiple namespaces? Just create more roles:

# Create a namespace
kubectl create ns remote01
kubectl -n remote01 create sa vault

# On the Vault server
kubectl exec -it vault-0 -- sh

vault write auth/remote01-cluster/role/remote01-smtc-role \
  bound_service_account_names=vault \
  bound_service_account_namespaces=remote01 \
  policies=smtc-policy \
  ttl=24h

Then deploy your application to that namespace, and it just works.
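
Concretely, the only changes from 6-deployment-nginx-remote01.yaml are the namespace and the role annotation; roughly:

metadata:
  namespace: remote01
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/role: 'remote01-smtc-role'
        vault.hashicorp.com/auth-path: 'auth/remote01-cluster'
        vault.hashicorp.com/agent-inject-secret-credentials.txt: 'smtc/data/project1/subproject1/env01'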

Act V: Manual Testing and Debugging

Testing Authentication Manually

Sometimes you need to understand exactly what’s happening. Let’s manually authenticate from inside a pod:

# Get inside a pod
kubectl exec -it $(kubectl get pod -l app=nginxtest1 -o jsonpath="{.items[0].metadata.name}") \
  -c nginxtest1 -- bash

# Inside the pod
jwt_token=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)

# Install curl and jq
apt update && apt install -y curl jq

# Test authentication with the local auth path
curl --request POST \
  --data '{"jwt": "'$jwt_token'", "role": "smtc-role"}' \
  http://64.227.181.21:32000/v1/auth/kubernetes/login | jq .

# Test authentication with the remote cluster auth path
curl --request POST \
  --data '{"jwt": "'$jwt_token'", "role": "vault-smtc-role"}' \
  http://64.227.181.21:32000/v1/auth/remote01-cluster/login | jq .

This returns a Vault token in the response. You can use that token to manually fetch secrets:

# Extract the token from the login response
VAULT_TOKEN="<token from previous command>"

# Fetch the secret
curl -H "X-Vault-Token: $VAULT_TOKEN" \
  -X GET \
  http://64.227.181.21:32000/v1/smtc/data/project1/subproject1/env01 | jq .
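
Or skip the copy-paste and capture the token straight from the login response (it lives at .auth.client_token):

VAULT_TOKEN=$(curl -s --request POST \
  --data '{"jwt": "'$jwt_token'", "role": "vault-smtc-role"}' \
  http://64.227.181.21:32000/v1/auth/remote01-cluster/login | jq -r .auth.client_token)

echo $VAULT_TOKEN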

Creating Manual Tokens (For Troubleshooting)

Sometimes you need a token for debugging:

kubectl exec -it vault-0 -- sh

# Create a token with specific policy
vault token create -policy=smtc-policy

# List token accessors (useful for revoking)
vault list auth/token/accessors
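
And when you’re done debugging, clean up after yourself:

# Revoke a token by its accessor - you don't need the token itself
vault token revoke -accessor <accessor-from-the-list-above>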

The Architecture: Putting It All Together

Let me paint the complete picture:

┌─────────────────────────────────────────────────┐
│         Cluster 1 (Vault Server)                │
│                                                 │
│  ┌─────────────────────────────────┐           │
│  │  Vault HA Cluster (Raft)        │           │
│  │  ┌─────────┐     ┌─────────┐   │           │
│  │  │vault-0  │────▶│vault-1  │   │           │
│  │  │(leader) │     │(follower)│  │           │
│  │  └─────────┘     └─────────┘   │           │
│  │                                 │           │
│  │  Auth Backends:                 │           │
│  │  • auth/kubernetes              │           │
│  │  • auth/remote01-cluster        │           │
│  │                                 │           │
│  │  Secrets:                       │           │
│  │  • smtc/project1/subproject1/   │           │
│  └─────────────────────────────────┘           │
│                                                 │
│  Exposed via:                                   │
│  • NodePort (32000)                             │
│  • Ingress (vault2.skill-mine.com)             │
└─────────────────────────────────────────────────┘
                      │ JWT Token
                      │ Validation
┌─────────────────────────────────────────────────┐
│         Cluster 2 (Client Cluster)              │
│                                                 │
│  ┌─────────────────────────────────┐           │
│  │  Vault Agent Injector           │           │
│  │  (Mutating Webhook)              │           │
│  └─────────────────────────────────┘           │
│                                                 │
│  ┌─────────────────────────────────┐           │
│  │  Application Pods                │           │
│  │                                 │           │
│  │  ┌──────────────┐               │           │
│  │  │ Init:        │               │           │
│  │  │ vault-agent  │───┐           │           │
│  │  └──────────────┘   │           │           │
│  │                     │ Fetch     │           │
│  │  ┌──────────────┐   │ Secrets   │           │
│  │  │ Sidecar:     │◀──┘           │           │
│  │  │ vault-agent  │               │           │
│  │  └──────────────┘               │           │
│  │                                 │           │
│  │  ┌──────────────┐               │           │
│  │  │ App:         │               │           │
│  │  │ nginx        │               │           │
│  │  │              │               │           │
│  │  │ Reads:       │               │           │
│  │  │ /vault/      │               │           │
│  │  │  secrets/    │               │           │
│  │  └──────────────┘               │           │
│  └─────────────────────────────────┘           │
└─────────────────────────────────────────────────┘

Key Learnings and “Aha!” Moments

1. Dev Mode is a Trap (A Comfortable One)

Dev mode is so easy that you’ll be tempted to use it everywhere. Don’t. We learned this the hard way when our dev Vault restarted and all our test secrets vanished into the ether. Dev mode uses in-memory storage - no persistence.

2. Unsealing is Serious Business

In production, you’ll have 5 unseal keys split among trusted people. If Vault restarts, you need 3 of those 5 keys to unseal it. This isn’t paranoia - it’s security. Better still, enable auto-unseal with AWS KMS, Azure Key Vault, or GCP Cloud KMS so a restart doesn’t mean rounding up key holders.

3. The /data/ Path Gotcha

With KV-v2, the path in your policy must include /data/:

  • Secret path: smtc/project1/subproject1/env01
  • Policy path: smtc/data/project1/subproject1/env01
  • API path: v1/smtc/data/project1/subproject1/env01

We spent an embarrassing amount of time troubleshooting “permission denied” errors before we figured this out.

4. Templates Are Your Best Friend

Don’t just inject raw JSON. Use templates to format secrets exactly as your application expects:

# Bad: Raw JSON dump
vault.hashicorp.com/agent-inject-secret-config: "smtc/data/project1/subproject1/env01"

# Good: Formatted for your app
vault.hashicorp.com/agent-inject-template-config: |
  {{- with secret "smtc/data/project1/subproject1/env01" -}}
  export DB_USER="{{ .Data.data.username }}"
  export DB_PASS="{{ .Data.data.password }}"
  {{- end -}}

5. Multi-Cluster Auth Paths

Each Kubernetes cluster needs its own auth path because:

  • Different CA certificates
  • Different API servers
  • Different service account token issuers

Don’t try to share auth paths between clusters. It won’t work and you’ll waste hours debugging.

6. Service Account Tokens Changed in K8s 1.24+

If you’re wondering why service accounts don’t automatically get token secrets anymore, it’s because Kubernetes 1.24 stopped auto-generating them (the LegacyServiceAccountTokenNoAutoGeneration change); pods now mount short-lived projected tokens instead. If you need a long-lived token, you have to create the secret explicitly:

apiVersion: v1
kind: Secret
metadata:
  name: vault-token-seeecretttt
  annotations:
    kubernetes.io/service-account.name: vault
type: kubernetes.io/service-account-token

7. NodePort + Ingress = Best of Both Worlds

We used both:

  • NodePort for direct access during troubleshooting
  • Ingress for clean DNS names in production

Having both options saved us during debugging sessions.

8. Logging is Your Friend

When things go wrong (and they will), check these logs in order:

# 1. The vault-agent-init container (authentication)
kubectl logs <pod> -c vault-agent-init

# 2. The vault-agent sidecar (secret fetching/renewal)
kubectl logs <pod> -c vault-agent

# 3. The pod description (injection worked?)
kubectl describe pod <pod>

# 4. The Vault server logs
kubectl logs -n vault vault-0

9. Test Authentication Manually

Don’t trust the sidecar to work magically. Test the authentication flow manually first:

# Extract your service account token
jwt_token=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)

# Try to authenticate
curl --request POST \
  --data '{"jwt": "'$jwt_token'", "role": "your-role"}' \
  http://vault-server:8200/v1/auth/kubernetes/login

If this fails, the sidecar will fail too.

10. Policies Are Finicky

A policy that doesn’t work:

path "smtc/project1/subproject1/env01" {
   capabilities = ["read"]
}

A policy that works:

path "smtc/data/project1/subproject1/env01" {
   capabilities = ["read"]
}

Notice the data in the path? That’s KV-v2’s way of saying “actual secret data” vs “metadata.”
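
A related gotcha: vault kv list needs the list capability on the metadata/ path, not data/. A hypothetical broader policy that allows both reading and listing everything under project1 would look like this:

path "smtc/data/project1/*" {
   capabilities = ["read"]
}

path "smtc/metadata/project1/*" {
   capabilities = ["list"]
}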

Production Considerations (Or: Don’t Get Paged at 3 AM)

High Availability

Our setup with 2 Raft nodes demonstrates the workflow, but a two-node Raft cluster loses quorum the moment either node dies. For real production (a Helm sketch follows this list):

  • Use 3 or 5 nodes (odd numbers for Raft quorum)
  • Spread nodes across availability zones
  • Monitor Raft cluster health
  • Have runbooks for node failures
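
The three-node version is the same chart with one more replica; each extra node still needs its raft join and unseal, exactly as in Act I:

helm install vault hashicorp/vault \
  --set "server.ha.enabled=true" \
  --set "server.ha.replicas=3" \
  --set "server.ha.raft.enabled=true" \
  -n vault --create-namespace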

Auto-Unseal

Manual unsealing doesn’t scale. Use cloud KMS:

# Example seal stanza for AWS KMS - this is Vault server HCL config,
# which the Helm chart accepts as a string (e.g. under server.ha.raft.config)
seal "awskms" {
  region     = "us-west-2"
  kms_key_id = "alias/vault-unseal"
}

Backup Strategy

Vault data is precious. For Raft storage:

# Take a snapshot
kubectl exec vault-0 -- vault operator raft snapshot save /tmp/snapshot.snap

# Restore from a snapshot
kubectl exec vault-0 -- vault operator raft snapshot restore /tmp/snapshot.snap

Automate this. Schedule it. Test restores regularly.
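
One way to automate it: a nightly CronJob that runs the snapshot command against the active node. This is a sketch, not our actual setup; vault-snapshot-token and vault-snapshots are hypothetical, and a real job would also ship the file off-cluster:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-raft-snapshot
  namespace: vault
spec:
  schedule: "0 2 * * *"                # every night at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: snapshot
              image: hashicorp/vault:1.15
              command: ["/bin/sh", "-c"]
              args:
                - vault operator raft snapshot save /snapshots/vault.snap
              env:
                - name: VAULT_ADDR
                  value: "http://vault-active.vault.svc:8200"   # the chart's active-node service
                - name: VAULT_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: vault-snapshot-token                # hypothetical secret with a suitable token
                      key: token
              volumeMounts:
                - name: snapshots
                  mountPath: /snapshots
          volumes:
            - name: snapshots
              persistentVolumeClaim:
                claimName: vault-snapshots                      # hypothetical PVC for snapshot storage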

Monitoring and Alerting

Key metrics to watch:

  • Seal status (is Vault unsealed?)
  • Raft cluster health (are all nodes active?)
  • Authentication failures (someone trying something fishy?)
  • Token expiration rates (are applications renewing properly?)
  • Secret access patterns (unusual access patterns?)
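
You don’t need a full monitoring stack to start; both of these are plain HTTP endpoints on the server we already exposed:

# Seal status is unauthenticated - ideal for a simple alert
curl -s http://64.227.181.21:32000/v1/sys/seal-status | jq '.sealed'

# Prometheus-format metrics (needs a token unless you allow unauthenticated telemetry access)
curl -s -H "X-Vault-Token: $VAULT_ROOT_KEY" \
  "http://64.227.181.21:32000/v1/sys/metrics?format=prometheus" | head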

Network Policies

Lock down network access:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vault-access
  namespace: vault
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: vault
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          vault-access: "true"

Only pods in namespaces with the vault-access label can reach Vault.

Audit Logging

Enable audit logging to track everything:

kubectl exec -it vault-0 -- sh

vault audit enable file file_path=/vault/audit/audit.log

Ship these logs to your centralized logging system (ELK, Splunk, etc.).

Secret Rotation

Secrets should rotate. Period. Implement a rotation strategy:

  • Database credentials: Vault can generate dynamic credentials
  • API keys: Rotate quarterly (or more frequently)
  • Certificates: Use Vault’s PKI engine with automatic rotation

Resource Limits

The Vault Agent sidecar uses resources. Plan accordingly:

resources:
  requests:
    memory: "64Mi"
    cpu: "50m"
  limits:
    memory: "128Mi"
    cpu: "100m"

For 100 pods, that’s 6.4 GB of requested memory (and up to 12.8 GB at the limits) just for Vault sidecars.
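
The injector lets you tune these per pod with annotations, so noisy workloads don’t all inherit the defaults (values below mirror the example above):

vault.hashicorp.com/agent-requests-cpu: "50m"
vault.hashicorp.com/agent-requests-mem: "64Mi"
vault.hashicorp.com/agent-limits-cpu: "100m"
vault.hashicorp.com/agent-limits-mem: "128Mi"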

Troubleshooting Guide (Because You’ll Need It)

Problem: “Permission Denied” When Accessing Secrets

Symptoms: Vault Agent logs show permission denied

Solution:

  1. Check your policy includes /data/ in the path:

    
    path "smtc/data/project1/subproject1/env01" {
       capabilities = ["read"]
    }
    
  2. Verify the role has the correct policy:

    
    vault read auth/kubernetes/role/your-role
    
  3. Check if the service account matches:

    
    kubectl get pod <pod> -o yaml | grep serviceAccountName
    

Problem: “Authentication Failed”

Symptoms: Vault Agent init container fails with authentication error

Solution:

  1. Verify Kubernetes auth is configured:

    
    vault read auth/kubernetes/config
    
  2. Check the role exists:

    
    vault list auth/kubernetes/role
    
  3. Test authentication manually (see earlier section)

Problem: Pods Stuck in Init

Symptoms: Pods stuck with vault-agent-init container running

Solution:

  1. Check logs:

    
    kubectl logs <pod> -c vault-agent-init
    
  2. Verify Vault is accessible from the pod:

    
    kubectl exec <pod> -c vault-agent-init -- curl http://vault:8200/v1/sys/health
    
  3. Check the external Vault address:

    
    helm get values <vault-release> -n vault
    

Problem: Secrets Not Updating

Symptoms: Secrets in /vault/secrets/ are stale

Solution:

  1. Check vault-agent sidecar logs:

    
    kubectl logs <pod> -c vault-agent
    
  2. Verify the template includes the update annotation:

    
    vault.hashicorp.com/agent-inject-status: "update"
    
  3. Check token TTL - might need renewal:

    
    vault read auth/kubernetes/role/your-role
    

Problem: “Vault is Sealed”

Symptoms: All Vault operations fail with “vault is sealed”

Solution:

  1. Check seal status:

    
    kubectl exec vault-0 -- vault status
    
  2. Unseal Vault:

    
    kubectl exec vault-0 -- vault operator unseal $VAULT_UNSEAL_KEY
    
  3. If you have multiple nodes, unseal each:

    
    kubectl exec vault-1 -- vault operator unseal $VAULT_UNSEAL_KEY
    

Problem: Cross-Cluster Authentication Fails

Symptoms: Remote cluster pods can’t authenticate to Vault

Solution:

  1. Verify the auth path is correct:

    
    vault.hashicorp.com/auth-path: 'auth/remote01-cluster'
    
  2. Check the remote auth backend exists:

    
    vault auth list
    
  3. Verify the remote cluster config:

    
    vault read auth/remote01-cluster/config
    
  4. Test with manual authentication (see earlier section)

What We’d Do Differently Next Time

1. Start with Production Setup

We spent too much time in dev mode. Starting over, we’d go straight to the HA production setup from day one. The unseal dance isn’t that complicated, and it teaches you the real workflow.

2. Use Auto-Unseal Immediately

Cloud KMS auto-unseal should be the default, not an afterthought. It’s easier, more secure, and saves you from the “where did we put the unseal keys?” panic—a lesson learned the hard way during a pod restart drill.

3. Document as You Go

We had to reverse-engineer our own setup multiple times. Write runbooks as you go. Your future self (and your team) will thank you.

4. Set Up Monitoring First

Don’t deploy to production without monitoring. Set up alerts for:

  • Vault seal status
  • Authentication failures
  • Raft cluster health
  • High token expiration rates

5. Test Disaster Recovery Early

Take snapshots, test restores, document the process. Don’t wait until you actually need to restore from backup.

6. Namespace Organization

We used the vault namespace for everything. In hindsight, we should have used:

  • vault-system - For Vault infrastructure
  • vault-injector - For the webhook
  • Application namespaces - For actual workloads

7. Use Terraform for Vault Configuration

We did everything manually (vault write, vault policy write, etc.). For production, use Terraform:

resource "vault_auth_backend" "kubernetes" {
  type = "kubernetes"
  path = "kubernetes"
}

resource "vault_policy" "smtc" {
  name = "smtc-policy"
  policy = file("policies/smtc.hcl")
}

This makes configuration reproducible and version-controlled.

The Unexpected Benefits

1. Zero Code Changes

The most beautiful part? Applications don’t need any Vault-specific code. They just read files from /vault/secrets/. Want to migrate from ConfigMaps to Vault? Just change the annotations. The app code stays the same.

2. Audit Trail for Free

Every secret access is logged in Vault’s audit log. Who accessed what secret, when, from which pod. Security team loves this.

3. Secret Versioning

KV-v2 stores versions of secrets. Accidentally rotated a password and broke everything? The old version is still there:

vault kv get -version=1 smtc/project1/subproject1/env01
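
Reading the old version is only half the job; to actually restore it, roll the secret back (this writes a new version containing the old data):

vault kv rollback -version=1 smtc/project1/subproject1/env01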

4. Dynamic Secrets

While we used static secrets in this journey, Vault can generate dynamic database credentials, AWS IAM credentials, and more. These expire automatically - no manual rotation needed.

5. Multi-Cloud Secrets

One Vault instance can serve multiple Kubernetes clusters across different cloud providers. Unified secret management across your entire infrastructure.

The Journey Ends (But the Road Continues)

What started as “I need to manage secrets in Kubernetes” turned into a deep dive spanning:

  • Production HA Vault with Raft
  • Multiple authentication backends
  • Cross-cluster secret injection
  • Template-based secret formatting
  • Manual unsealing procedures
  • Multi-cluster architecture

We’ve gone from hardcoding secrets (we’ve all been there) to a production-grade secret management system that:

  • Scales across multiple clusters
  • Provides audit logs
  • Supports secret versioning
  • Requires no application code changes
  • Follows security best practices

Was it complex? Yes. Was it worth it? Absolutely.

The next time someone asks us “how do you manage secrets in Kubernetes?”, we can confidently answer: “Let me tell you about HashiCorp Vault…”

What’s Next?

This setup is just the foundation. Here’s what you can explore next:

Dynamic Database Credentials

Instead of static passwords, have Vault generate temporary credentials:

vault write database/roles/my-role \
    db_name=postgres \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';" \
    default_ttl="1h" \
    max_ttl="24h"

Every pod gets its own database credentials that expire automatically.
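
Getting there means enabling the database secrets engine and registering your database first; a hedged sketch with hypothetical connection details:

# Enable the engine and register a PostgreSQL connection
vault secrets enable database

vault write database/config/postgres \
    plugin_name=postgresql-database-plugin \
    connection_url="postgresql://{{username}}:{{password}}@postgres:5432/wizard?sslmode=disable" \
    allowed_roles="my-role" \
    username="vault-admin" \
    password="rotate-me-immediately"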

PKI and Certificate Management

Use Vault as an internal Certificate Authority:

vault secrets enable pki
vault write pki/root/generate/internal \
    common_name="example.com" \
    ttl=87600h

Applications can request certificates on-demand, and Vault handles rotation.
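
A role constrains what the CA is allowed to sign, and then certificates are a single write away (names here are illustrative):

# Define what this CA may issue
vault write pki/roles/example-dot-com \
    allowed_domains="example.com" \
    allow_subdomains=true \
    max_ttl="72h"

# Issue a certificate on demand
vault write pki/issue/example-dot-com \
    common_name="app.example.com" ttl="24h"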

Encryption as a Service

Use Vault’s Transit engine for application-level encryption:

vault secrets enable transit
vault write -f transit/keys/my-key

Your app sends plaintext to Vault, gets ciphertext back. No encryption keys to manage.
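
In practice that’s two API calls; Vault expects the plaintext base64-encoded (my-key matches the key created above):

# Encrypt
vault write transit/encrypt/my-key \
    plaintext=$(echo -n "my secret data" | base64)

# Decrypt (paste the ciphertext returned by the encrypt call)
vault write transit/decrypt/my-key \
    ciphertext="vault:v1:<ciphertext-from-the-encrypt-call>"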

Multi-Region Vault

For global deployments, set up Vault replication:

  • Performance replication (read replicas in multiple regions)
  • Disaster recovery replication (failover clusters)

GitOps Integration

Combine Vault with ArgoCD or Flux for true GitOps:

  • Store secret paths in Git (not the secrets themselves)
  • Let the Vault Agent fetch actual values
  • Change secrets without Git commits

Resources That Saved Us

These resources were invaluable during implementation: the official HashiCorp tutorials, the Helm chart’s values.yaml comments, and the GitHub issues where the real truth lives.

Final Thoughts

Secret management isn’t glamorous. It doesn’t make for impressive demos. No one will ooh and ahh over your Vault setup at a conference.

But it’s critical. Every production outage caused by leaked credentials, every security breach from hardcoded passwords, every compliance audit failure - they all point back to poor secret management.

The sidecar injection pattern is elegant. Your application code stays clean. The Vault Agent handles all the complexity. Secrets are fetched securely, rotated automatically, and audited completely.

Is it more complex than hardcoding credentials or using ConfigMaps? Yes.

Is it worth it? Ask yourself this: What’s the cost of a security breach?

Your 3 AM self will thank you when:

  • Credentials leak and you can rotate them in seconds
  • Audit asks “who accessed production database passwords” and you have logs
  • A pod is compromised but can only access its specific secrets
  • Secrets rotate automatically and applications keep working

That’s when you’ll know the journey was worth it.

Kudos to the Mavericks at the DevOps Den. Proud of you all.


Built with curiosity, debugged with persistence, secured with Vault—fueled by late-night coffee in the DevOps Den.

Repository

All the configuration files, deployment manifests, patches, and scripts from this adventure are available in this repository:

  • 1-deployment-*.yaml - Application deployments
  • 2-services-nodeport.yaml - NodePort service for Vault
  • 2-vault2-ingress.yaml - Ingress configuration
  • 3a-smtc-patch-inject-secrets.yaml - Basic injection patch
  • 3b-smtc-patch-inject-secrets-as-template01pg.yaml - Template-based injection
  • 4-external-vault-svc-endpoint.yaml - External Vault service endpoint
  • 4a-external-vault2-svc-externalname.yaml - ExternalName service
  • 5-vault-secret.yaml - Service account token secret
  • 6-deployment-nginx-remote01*.yaml - Remote cluster deployments
  • README20240520.md - Our working notes (with all the wrong turns included)

Feel free to use these as templates for your own Vault journey. Just remember: change the passwords. Seriously. Please.