Picture this: It’s 10 PM in the DevOps Den. Laptop screens glow under table lamps, and half the team is connected via Teams from home. We’re staring at Kubernetes clusters and YAML files scattered across multiple screens. We’ve been using ConfigMaps and Kubernetes Secrets for months now, but the workflow is painful—manually collecting credentials from developers, base64-encoding them, hardcoding values into YAML files, applying them to clusters. Rinse, repeat. Every new application, every credential rotation, every new environment means more manual work.
The truth? ConfigMaps and Secrets work, but they don’t scale. Secrets scattered across clusters, no centralized management, no rotation strategy, no audit trail, and definitely no peace of mind when security audit emails land in our inbox. There had to be a better way—something simpler, more scalable, more secure.
This is the story of how our team built a production-grade secret management system using HashiCorp Vault’s sidecar injection pattern. What started as a learning exercise with dev mode evolved into a full-blown production HA Vault setup with multi-cluster authentication, cross-cluster secret injection, and enough YAML files to make even the most seasoned DevOps engineer’s eye twitch. Through late nights in the DevOps Den, debugging sessions, and those breakthrough moments when everything clicked—we built something that actually works.
Act I: The Evolution from Dev to Production
The Humble Beginning: Dev Mode
Every great journey starts with a simple step. Before tackling production, we started with dev mode to understand the basics.
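A minimal sketch of what that looked like with the official Helm chart (the namespace is our choice):

```bash
# Add the HashiCorp Helm repo and install Vault in dev mode
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update
helm install vault hashicorp/vault \
  --namespace vault --create-namespace \
  --set "server.dev.enabled=true"
```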
Dev mode is beautiful in its simplicity:
- Pre-unsealed (no unseal key juggling)
- In-memory storage (ephemeral - restart and poof, it’s gone)
- Single node (no HA complexity)
- Root token available immediately
It’s perfect for learning, terrible for production, and absolutely not something you want to run when real secrets are at stake.
We spent a day testing secret injection, authentication, and policies. Impressive, but clearly just a warm-up.
The “Oh Wait, This Needs to Be Real” Moment
Two days later, the urgency hit. “We need this in production. Like, yesterday.”
Time to do it right. Production means HA, persistence, proper unsealing. No shortcuts.
Option 1: Production Single Node
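Roughly, using the chart's standalone mode (the storage size is an arbitrary example):

```bash
# Standalone mode: one server pod backed by a PersistentVolumeClaim
helm install vault hashicorp/vault \
  --namespace vault --create-namespace \
  --set "server.standalone.enabled=true" \
  --set "server.dataStorage.size=10Gi"
```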
This gives you a proper Vault installation with persistent storage, but it’s still a single point of failure. Better, but not quite there.
Option 2: Production HA with Raft (The One We Chose… Eventually)
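A sketch of the HA install, sized at the two replicas described below:

```bash
# HA mode with integrated Raft storage and two server pods
helm install vault hashicorp/vault \
  --namespace vault --create-namespace \
  --set "server.ha.enabled=true" \
  --set "server.ha.raft.enabled=true" \
  --set "server.ha.replicas=2"
```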
Now we’re talking. Two Vault pods running in HA mode using the Raft consensus protocol. If one goes down, the other keeps serving secrets. This is what production looks like.
But wait. Let me tell you about the Raft rabbit hole we fell into.
The Great Raft Documentation Hunt (Or: Where Is Everything?)
Here’s the thing about Raft in Vault: it exists, it’s well-documented somewhere, but finding that documentation when you need it is like finding a specific grain of sand on a beach.
We spent hours Googling, reading docs, trying commands. The information exists but is scattered across multiple sources.
The official HashiCorp docs have pages on:
- Integrated Storage (Raft) configuration
- Raft internals
- Kubernetes deployment guide
- HA with Raft examples
But here’s what they don’t prominently tell you (or at least, not in a way that jumps out when you’re frantically Googling at midnight):
- vault-1 won’t automatically join vault-0. You need to manually run vault operator raft join on each follower node. The Helm chart creates the StatefulSet, but the clustering? That’s on you.
- The internal service name matters. It’s vault-0.vault-internal:8200, not vault-0:8200, not vault:8200, and not vault-0.vault.svc.cluster.local:8200. Get it wrong, and you’ll see cryptic “failed to join raft cluster” errors.
- Each node needs to be unsealed individually. Unseal vault-0 and vault-1 is still sealed. Pod restarts? Everything’s sealed again. Node failure? You’re unsealing again. Hope you saved those keys!
- The order matters. Initialize vault-0, unseal it, then join vault-1, then unseal vault-1. Do it out of order and you’ll get errors that make you question your life choices.
GitHub is full of issues about this exact pain point.
We spent hours piecing together information from:
- Three different HashiCorp tutorials (each covering one piece of the puzzle)
- GitHub issues (where the real truth lives)
- Random blog posts from people who’ve been there
- The Helm chart's values.yaml comments (surprisingly informative, actually)
The moment it clicked: Late evening in the DevOps Den, someone looked up from their laptop. “Wait… Raft is a consensus algorithm, not magic clustering?”
Exactly. The Helm chart creates the infrastructure, but you create the cluster by explicitly telling each node to join. The mental model finally made sense.
The Initialization Ritual
When you install Vault in production mode, it starts sealed. Think of it as a safe that needs to be unlocked before you can use it. Initialization is a one-time operation; lose the keys it produces, and you're locked out forever.
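The initialization itself looks roughly like this, with the single key share described below:

```bash
# Initialize vault-0 with one key share (learning setup only;
# production wants -key-shares=5 -key-threshold=3)
kubectl exec -n vault vault-0 -- vault operator init \
  -key-shares=1 -key-threshold=1 -format=json > keys.json
```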
CRITICAL WARNING: In a real production environment, you’d use 5 key shares with a threshold of 3, and distribute them to different trusted individuals. You’d also enable auto-unseal using a cloud KMS. But for this learning journey, we’re keeping it simple with one key.
That keys.json file? Treat it like the nuclear launch codes. Seriously.
Unsealing the Vault (Literally)
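The unseal-and-join sequence, sketched out (jq is assumed for pulling the key out of keys.json):

```bash
# Unseal the first node
VAULT_UNSEAL_KEY=$(jq -r '.unseal_keys_b64[0]' keys.json)
kubectl exec -n vault vault-0 -- vault operator unseal "$VAULT_UNSEAL_KEY"

# Tell the second node to join the Raft cluster, then unseal it too
kubectl exec -n vault vault-1 -- vault operator raft join \
  http://vault-0.vault-internal:8200
kubectl exec -n vault vault-1 -- vault operator unseal "$VAULT_UNSEAL_KEY"
```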
And just like that, you have a two-node HA Vault cluster. Both nodes are part of the Raft cluster, one is the leader, and they’re ready to serve secrets.
Making Vault Accessible
We set up two ways to access Vault:
1. NodePort Service (For Direct IP Access)
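A sketch of the NodePort service (the selector labels follow the Helm chart's defaults):

```bash
# Expose Vault on port 32000 of every node
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: vault-nodeport
  namespace: vault
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: vault
    component: server
  ports:
  - name: http
    port: 8200
    targetPort: 8200
    nodePort: 32000
EOF
```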
Now we can access Vault at any node's IP on port 32000.
2. Ingress (For Fancy DNS Names)
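And the Ingress, assuming an nginx ingress controller with TLS handled at the controller:

```bash
# Route vault2.skill-mine.com to the chart's vault service
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vault-ingress
  namespace: vault
spec:
  ingressClassName: nginx
  rules:
  - host: vault2.skill-mine.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: vault
            port:
              number: 8200
EOF
```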
Much better: https://vault2.skill-mine.com
Act II: Setting Up the Secret Architecture
Enabling Authentication
Jump into vault-0 and let’s configure things:
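Something along these lines, from a shell inside vault-0:

```bash
kubectl exec -n vault -it vault-0 -- /bin/sh

# Inside the pod: authenticate, then enable what we need
vault login                            # paste the root token from keys.json
vault secrets enable -path=smtc kv-v2  # KV version 2 at the smtc mount
vault auth enable kubernetes           # Kubernetes auth for this cluster
```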
The Secret Hierarchy
Instead of just dumping secrets anywhere, we designed a hierarchical structure:
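Writing a secret into that hierarchy looks like this (the key/value pairs are illustrative):

```bash
# Store database credentials at the env01 leaf
vault kv put smtc/project1/subproject1/env01 \
  username="postgres" \
  password="s3cr3t" \
  host="db.example.com"
```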
The path structure smtc/project1/subproject1/env01 means:
- smtc - The root path for all project secrets
- project1 - The specific project
- subproject1 - The subproject or microservice
- env01 - The environment (dev, staging, prod, etc.)
This scales beautifully as your organization grows.
Configuring Kubernetes Authentication
This is where we teach Vault to trust our Kubernetes cluster:
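Run from inside the vault-0 pod, so we can reuse the pod's own service account token and CA certificate:

```bash
# Point the kubernetes auth method at this cluster's API server
vault write auth/kubernetes/config \
  kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443" \
  token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
```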
Creating Policies (The Gatekeeper)
Policies are how Vault controls access. Think of them as fine-grained permissions:
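A minimal read-only policy for our path (note the /data/ segment, explained just below):

```bash
# smtc-policy: read-only access to one KV-v2 path
vault policy write smtc-policy - <<'EOF'
path "smtc/data/project1/subproject1/env01" {
  capabilities = ["read"]
}
EOF
```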
Notice the /data/ in the path? That’s a KV-v2 quirk. The actual API path has /data/ inserted between the mount path and your secret path.
Creating Vault Roles
Roles bind Kubernetes service accounts to Vault policies:
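A sketch of the role (the 24h token TTL is an arbitrary choice):

```bash
# Bind the vault service account in the vault namespace to smtc-policy
vault write auth/kubernetes/role/vault-smtc-role \
  bound_service_account_names=vault \
  bound_service_account_namespaces=vault \
  policies=smtc-policy \
  ttl=24h
```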
What this says: “If a pod uses the service account named vault in the namespace vault, allow it to authenticate and apply the smtc-policy.”
Act III: The Magic of Sidecar Injection
Deploying a Test Application
Let’s deploy a simple nginx application:
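A minimal deployment along these lines (the name and image tag are our own):

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-vault-demo
  namespace: vault
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-vault-demo
  template:
    metadata:
      labels:
        app: nginx-vault-demo
    spec:
      serviceAccountName: vault   # the account bound in vault-smtc-role
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
EOF
```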
Nothing special - just a plain nginx pod using the vault service account.
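Confirm it's running, with a single container and no sidecars yet:

```bash
kubectl get pods -n vault -l app=nginx-vault-demo
```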
The First Patch: Basic Injection
Now let’s inject secrets with a simple annotation patch:
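A sketch of the patch, using the injector's standard annotations:

```bash
# Annotate the pod template so the mutating webhook kicks in
cat <<'EOF' > patch-inject.yaml
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/role: 'vault-smtc-role'
        vault.hashicorp.com/agent-inject-secret-database-config.txt: 'smtc/data/project1/subproject1/env01'
EOF
kubectl patch deployment nginx-vault-demo -n vault --patch-file patch-inject.yaml
```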
What happens?
- The Vault Agent Injector (a mutating webhook) intercepts the pod creation
- It injects an init container that authenticates to Vault and fetches secrets
- It injects a sidecar container that keeps the secrets up to date
- It mounts the secrets at /vault/secrets/database-config.txt
The Second Patch: Template Magic
Let’s make it actually useful with templates:
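The template version; the agent renders the KV-v2 response through Go templating (the connection-string format is illustrative):

```bash
cat <<'EOF' > patch-template.yaml
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/role: 'vault-smtc-role'
        vault.hashicorp.com/agent-inject-secret-database-config.txt: 'smtc/data/project1/subproject1/env01'
        vault.hashicorp.com/agent-inject-template-database-config.txt: |
          {{- with secret "smtc/data/project1/subproject1/env01" -}}
          postgresql://{{ .Data.data.username }}:{{ .Data.data.password }}@{{ .Data.data.host }}:5432/appdb
          {{- end -}}
EOF
kubectl patch deployment nginx-vault-demo -n vault --patch-file patch-template.yaml
```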
Now check the secret:
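Using the demo deployment from the sketch above:

```bash
kubectl exec -n vault deploy/nginx-vault-demo -c nginx -- \
  cat /vault/secrets/database-config.txt
```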
Output:
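With the illustrative values from earlier, something like:

```
postgresql://postgres:s3cr3t@db.example.com:5432/appdb
```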
Beautiful! A ready-to-use PostgreSQL connection string. Your application just reads a file and connects - it never needs to know about Vault.
Act IV: The Multi-Cluster Plot Twist
Here's where things get spicy. We have two DigitalOcean Kubernetes clusters:
- Cluster 1 (do-blr1-testk8s-1): The Vault server cluster
- Cluster 2 (do-blr1-testk8s-2): The client cluster that needs secrets
The challenge: How do pods in Cluster 2 authenticate to Vault in Cluster 1 and fetch secrets?
On the Vault Server (Cluster 1)
First, we need to create a separate authentication backend for the remote cluster:
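That's a second instance of the kubernetes auth method, mounted at its own path:

```bash
# A dedicated auth mount for the second cluster
vault auth enable -path=remote01-cluster kubernetes
```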
Why a separate path? Because each Kubernetes cluster has its own CA certificate and API server. We need to configure them separately.
On the Client Cluster (Cluster 2)
Switch your kubectl context to the second cluster and install the Vault Agent Injector:
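The injector-only install, pointed at the external Vault (the address matches our Ingress from earlier):

```bash
# Setting global.externalVaultAddr makes the chart skip the server
# and deploy only the injector, wired to the external Vault
helm install vault hashicorp/vault \
  --namespace vault --create-namespace \
  --set "global.externalVaultAddr=https://vault2.skill-mine.com"
```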
This installs only the Vault Agent Injector webhook - not the Vault server itself. The injector will configure pods to connect to the external Vault.
The Service Account Token Dance
In Kubernetes 1.24+, service account tokens are no longer automatically created as secrets. We need to explicitly request them:
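A sketch of that, mirroring the service account name we used in Cluster 1 (the secret name is our own):

```bash
# A service account for workloads, plus an explicitly requested
# long-lived token secret bound to it (auto-created before 1.24)
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vault
  namespace: vault
---
apiVersion: v1
kind: Secret
metadata:
  name: vault-token
  namespace: vault
  annotations:
    kubernetes.io/service-account.name: vault
type: kubernetes.io/service-account-token
EOF
```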
Extracting Authentication Credentials
Now we extract the credentials that Vault needs to validate tokens from this cluster:
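Along these lines, with jsonpath queries against the token secret and your kubeconfig:

```bash
# JWT that Vault will use to call Cluster 2's TokenReview API
TOKEN_REVIEW_JWT=$(kubectl get secret vault-token -n vault \
  -o jsonpath='{.data.token}' | base64 --decode)

# Cluster 2's CA certificate and API server address
KUBE_CA_CERT=$(kubectl config view --raw --minify --flatten \
  -o jsonpath='{.clusters[].cluster.certificate-authority-data}' | base64 --decode)
KUBE_HOST=$(kubectl config view --raw --minify --flatten \
  -o jsonpath='{.clusters[].cluster.server}')
```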
Generating the Vault Configuration Command
Create a script with the configuration command:
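A sketch that bakes the extracted values into a ready-to-run command:

```bash
# Write the fully expanded vault command into 7a-vault-auth
cat > 7a-vault-auth <<EOF
vault write auth/remote01-cluster/config \\
  token_reviewer_jwt="$TOKEN_REVIEW_JWT" \\
  kubernetes_host="$KUBE_HOST" \\
  kubernetes_ca_cert="$KUBE_CA_CERT"
EOF
```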
IMPORTANT: Copy the contents of 7a-vault-auth and run it inside the Vault server in Cluster 1.
Back on the Vault Server (Cluster 1)
Switch back to your Cluster 1 context and configure the remote cluster authentication:
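Run the generated command, then create a role on the new auth path, mirroring the one from Cluster 1:

```bash
# The command from 7a-vault-auth, with Cluster 2's values filled in
vault write auth/remote01-cluster/config \
  token_reviewer_jwt="<JWT from cluster 2>" \
  kubernetes_host="<cluster 2 API server URL>" \
  kubernetes_ca_cert="<cluster 2 CA certificate>"

# Same role name and policy, on the remote01-cluster auth path
vault write auth/remote01-cluster/role/vault-smtc-role \
  bound_service_account_names=vault \
  bound_service_account_namespaces=vault \
  policies=smtc-policy \
  ttl=24h
```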
Notice we’re using the same smtc-policy - no need to duplicate policies.
Back on the Client Cluster (Cluster 2): Deploy and Test
Now for the moment of truth:
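A test deployment along these lines (names are our own; the annotations are what matter, as noted below):

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-remote-demo
  namespace: vault
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-remote-demo
  template:
    metadata:
      labels:
        app: nginx-remote-demo
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/auth-path: 'auth/remote01-cluster'
        vault.hashicorp.com/role: 'vault-smtc-role'
        vault.hashicorp.com/agent-inject-secret-database-config.txt: 'smtc/data/project1/subproject1/env01'
    spec:
      serviceAccountName: vault
      containers:
      - name: nginx
        image: nginx:1.25
EOF
```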
Key points:
- vault.hashicorp.com/auth-path: 'auth/remote01-cluster' - This tells the Vault Agent to authenticate using our custom auth path
- vault.hashicorp.com/role: 'vault-smtc-role' - The role we created earlier
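If everything lines up, the secret appears exactly as it did in Cluster 1:

```bash
kubectl exec -n vault deploy/nginx-remote-demo -c nginx -- \
  cat /vault/secrets/database-config.txt
```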
Key Learnings and “Aha!” Moments
1. Dev Mode is a Trap (A Comfortable One)
Dev mode is so easy that you'll be tempted to use it everywhere. Don't. We learned this the hard way when our dev Vault restarted and all our test secrets vanished into the ether.
2. Unsealing is Serious Business
In production, you’ll have 5 unseal keys split among trusted people. If Vault restarts, you need 3 of those 5 keys to unseal it. This isn’t paranoia - it’s security.
3. The /data/ Path Gotcha
With KV-v2, the path in your policy must include /data/:
- Secret path: smtc/project1/subproject1/env01
- Policy path: smtc/data/project1/subproject1/env01
- API path: v1/smtc/data/project1/subproject1/env01
We spent an embarrassing amount of time troubleshooting "permission denied" errors before figuring this out.
4. Templates Are Your Best Friend
Don’t just inject raw JSON. Use templates to format secrets exactly as your application expects:
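For example, a hypothetical properties-file template (the annotation names are the injector's own; the keys and format are illustrative):

```bash
cat <<'EOF' > patch-properties.yaml
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject-secret-app.properties: 'smtc/data/project1/subproject1/env01'
        vault.hashicorp.com/agent-inject-template-app.properties: |
          {{- with secret "smtc/data/project1/subproject1/env01" -}}
          db.username={{ .Data.data.username }}
          db.password={{ .Data.data.password }}
          {{- end -}}
EOF
kubectl patch deployment nginx-vault-demo -n vault --patch-file patch-properties.yaml
```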
5. Multi-Cluster Auth Paths
Each Kubernetes cluster needs its own auth path because:
- Different CA certificates
- Different API servers
- Different service account token issuers
Don’t try to share auth paths between clusters. It won’t work and you’ll waste hours debugging.
Production Considerations
High Availability
Our setup with 2 Raft nodes is the bare minimum for HA. For production:
- Use 3 or 5 nodes (odd numbers for Raft quorum)
- Spread nodes across availability zones
- Monitor Raft cluster health
- Have runbooks for node failures
Auto-Unseal
Manual unsealing doesn’t scale. Use cloud KMS:
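A sketch of an AWS KMS seal stanza (region and key are placeholders; GCP and Azure have equivalent stanzas):

```bash
# Append an awskms seal stanza to the Vault server config; with the
# Helm chart, this HCL lives inside the server.ha.config value
cat <<'EOF' >> vault-server-config.hcl
seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal"
}
EOF
```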
Backup Strategy
Vault data is precious. For Raft storage:
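A minimal snapshot routine, sketched (assumes the vault CLI in the pod is authenticated with a token allowed to take snapshots):

```bash
# Take a Raft snapshot on a node, then copy it off the cluster
kubectl exec -n vault vault-0 -- \
  vault operator raft snapshot save /tmp/vault.snap
kubectl cp vault/vault-0:/tmp/vault.snap "./vault-$(date +%F).snap"
```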
Automate this. Schedule it. Test restores regularly.
Monitoring and Alerting
Key metrics to watch:
- Seal status (is Vault unsealed?)
- Raft cluster health (are all nodes active?)
- Authentication failures (someone trying something fishy?)
- Token expiration rates (are applications renewing properly?)
- Secret access patterns (unusual access patterns?)
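A quick way to watch the first two from the outside is Vault's health endpoint, which encodes status in the HTTP response code (the address assumes our Ingress):

```bash
# 200 = active, 429 = standby, 501 = uninitialized, 503 = sealed
curl -s -o /dev/null -w '%{http_code}\n' \
  https://vault2.skill-mine.com/v1/sys/health
```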
Troubleshooting Guide
Problem: “Permission Denied” When Accessing Secrets
Symptoms: Vault Agent logs show permission denied
Solution:
- Check that your policy includes /data/ in the path
- Verify the role has the correct policy
- Check if the service account matches
Problem: “Authentication Failed”
Symptoms: Vault Agent init container fails with authentication error
Solution:
- Verify Kubernetes auth is configured: vault read auth/kubernetes/config
- Check the role exists: vault list auth/kubernetes/role
- Test authentication manually (see the sketch below)
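A sketch of the manual test, assuming a service account JWT in $SA_JWT (a hypothetical variable):

```bash
# Exchange a Kubernetes service account JWT for a Vault token
vault write auth/kubernetes/login role=vault-smtc-role jwt="$SA_JWT"
```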
Problem: Pods Stuck in Init
Symptoms: Pods stuck with vault-agent-init container running
Solution:
- Check logs: kubectl logs <pod> -c vault-agent-init
- Verify Vault is accessible from the pod
- Check the external Vault address
Problem: Cross-Cluster Authentication Fails
Symptoms: Remote cluster pods can’t authenticate to Vault
Solution:
- Verify the auth path is correct in annotations
- Check the remote auth backend exists: vault auth list
- Verify the remote cluster config: vault read auth/remote01-cluster/config
The Unexpected Benefits
1. Zero Code Changes
The most beautiful part? Applications don’t need any Vault-specific code. They just read files from /vault/secrets/. Want to migrate from ConfigMaps to Vault? Just change the annotations. The app code stays the same.
2. Audit Trail for Free
Every secret access is logged in Vault's audit log: who accessed what secret, when, and from which pod. The security team loves this.
3. Secret Versioning
KV-v2 stores versions of secrets. Accidentally rotated a password and broke everything? Roll back:
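For example (the version number is illustrative):

```bash
# Roll the secret back to version 2
vault kv rollback -version=2 smtc/project1/subproject1/env01
```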
4. Multi-Cloud Secrets
One Vault instance can serve multiple Kubernetes clusters across different cloud providers. Unified secret management across your entire infrastructure.
What’s Next?
This setup is just the foundation. Here’s what you can explore next:
Dynamic Database Credentials
Instead of static passwords, have Vault generate temporary credentials that expire automatically.
PKI and Certificate Management
Use Vault as an internal Certificate Authority for service-to-service mTLS.
Encryption as a Service
Use Vault’s Transit engine for application-level encryption without managing keys.
GitOps Integration
Combine Vault with ArgoCD or Flux:
- Store secret paths in Git (not the secrets themselves)
- Let the Vault Agent fetch actual values
- Change secrets without Git commits
Resources That Saved Us
These resources were invaluable during implementation:
- DevOps Cube - Vault in Kubernetes
- DevOps Cube - Vault Agent Injector Tutorial
- HashiCorp - Kubernetes Raft Deployment Guide
- HashiCorp - External Vault Tutorial
Final Thoughts
Secret management isn’t glamorous. It doesn’t make for impressive demos. No one will ooh and ahh over your Vault setup at a conference.
But it’s critical. Every production outage caused by leaked credentials, every security breach from hardcoded passwords, every compliance audit failure - they all point back to poor secret management.
The sidecar injection pattern is elegant. Your application code stays clean. The Vault Agent handles all the complexity. Secrets are fetched securely, rotated automatically, and audited completely.
Is it more complex than hardcoding credentials or using ConfigMaps? Yes.
Is it worth it? Ask yourself this: What’s the cost of a security breach?
Your 3 AM self will thank you when:
- Credentials leak and you can rotate them in seconds
- Audit asks “who accessed production database passwords” and you have logs
- A pod is compromised but can only access its specific secrets
- Secrets rotate automatically and applications keep working
That’s when you’ll know the journey was worth it.
Kudos to the Mavericks at the DevOps Den. Proud of you all.
Built with curiosity, debugged with persistence, secured with Vault—fueled by late-night coffee in the DevOps Den.
