Amazon DocumentDB has long been a powerful, scalable, and fully managed document database service for teams running MongoDB-compatible workloads. But as more organizations adopt Kubernetes as their core platform, the need to deploy, manage, and scale DocumentDB-like environments directly inside Kubernetes has grown.
The new open-source DocumentDB Operator now makes this possible. It introduces a Kubernetes-native way to run DocumentDB-compatible workloads by leveraging Custom Resource Definitions (CRDs), declarative configuration, and the reconciliation patterns common in modern Operators.
This article walks you step by step through deploying, managing, and scaling DocumentDB on Kubernetes using the new operator. It includes architectural insights, real-world best practices, and code examples. If you’re looking for a Kubernetes-native workflow for DocumentDB that does not rely solely on AWS-managed services, this is your in-depth guide.
What the New DocumentDB Operator Provides
The DocumentDB Operator introduces a cloud-native method for running DocumentDB clusters on any Kubernetes environment. Operator capabilities include:
- Declarative cluster creation using YAML manifests
- Automated provisioning of StatefulSets, secrets, services, and PVCs
- Built-in failover and self-healing
- Automated rolling upgrades
- Horizontal and vertical autoscaling
- Backup orchestration (if configured)
- Consistent configuration enforcement
This aligns DocumentDB cluster management with the GitOps model, improving repeatability and operational consistency.
Why Run DocumentDB on Kubernetes?
Running DocumentDB-like workloads on Kubernetes provides:
- Infrastructure independence (on-prem, cloud, hybrid)
- GitOps-friendly workflows
- Native autoscaling mechanisms
- Unified monitoring and logging pipelines
- Development parity with production
- Lower cost in self-managed environments
For teams already deeply invested in Kubernetes, the operator reduces operational overhead and integrates DocumentDB into the platform’s automation ecosystem.
Architecture Overview of the DocumentDB Operator
The operator architecture follows the standard Kubernetes Operator pattern:
Custom Resource Definitions (CRDs)
The operator introduces CRDs such as:
- DocumentDBCluster: describes the cluster instances, storage, and configuration
- DocumentDBUser: manages users and credentials
- DocumentDBBackup: defines backup schedules and retention policies
Kubernetes treats these as first-class API objects.
Reconciliation Loop
The operator continuously ensures that the desired state described in YAML manifests matches the actual state of the running cluster. If something drifts (e.g., a pod dies), the operator repairs it automatically.
StatefulSets and Persistent Volume Claims
DocumentDB storage is managed using Kubernetes PVCs. StatefulSets ensure stable network identities for each member.
Services for Internal Connectivity
The operator provisions services such as:
- A headless service for pod DNS resolution
- A cluster endpoint service for client connections
Scaling Controllers
The operator supports:
- Vertical scaling (changing CPU/memory)
- Horizontal scaling (adding/removing cluster members)
Prerequisites
Before working through the examples, ensure you have:
- A Kubernetes cluster (min. v1.22)
- kubectl installed
- StorageClass configured in the cluster
- (Optional) Cert-manager for TLS
- (Optional) Kustomize for GitOps workflows
Installing the DocumentDB Operator
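The operator can be installed either by applying its published manifest with kubectl or through its Helm chart. Use the manifest URL or chart reference given in the operator’s own installation instructions.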
After installation, verify the CRDs:
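A minimal check, assuming the operator’s CRD names contain the string `documentdb` (an assumption based on the resource kinds used in this article, not a documented naming scheme):

```shell
# List every CRD registered in the cluster and keep only the names that
# mention "documentdb" (case-insensitive). The match pattern is an assumption.
kubectl get crds -o name | grep -i documentdb
```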
You should see entries for the three resource kinds described earlier: DocumentDBCluster, DocumentDBUser, and DocumentDBBackup.
Deploying a DocumentDB Cluster
Now let’s create a fully functional cluster.
Define the DocumentDBCluster Manifest
Create a file named documentdb-cluster.yaml:
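A minimal sketch of such a manifest, written from the shell with a heredoc. The API group `documentdb.example.com/v1alpha1`, the resource name `my-documentdb`, the StorageClass name, and every field under `spec` are placeholders chosen for illustration; the operator’s CRD reference defines the real schema.

```shell
# Write an illustrative DocumentDBCluster manifest to documentdb-cluster.yaml.
# NOTE: apiVersion and all spec fields below are hypothetical placeholders.
cat > documentdb-cluster.yaml <<'EOF'
apiVersion: documentdb.example.com/v1alpha1   # placeholder API group/version
kind: DocumentDBCluster
metadata:
  name: my-documentdb
spec:
  instances: 3                  # number of cluster members (assumed field)
  storage:
    size: 20Gi                  # size of each member's PVC (assumed field)
    storageClassName: standard  # must name a StorageClass that exists in your cluster
  resources:                    # assumed to follow the usual requests/limits shape
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      cpu: "1"
      memory: 2Gi
EOF
```

Keeping the manifest in a file rather than piping it straight to kubectl makes it easy to commit to Git later.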
Apply it:
Check cluster health:
Check the pods:
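The apply and verification steps might look like this. The resource name `my-documentdb` is a placeholder, and the label selector is a hypothetical example; the labels the operator actually sets on its pods may differ.

```shell
# Submit the manifest; the operator reconciles it into pods, services and PVCs.
kubectl apply -f documentdb-cluster.yaml

# Inspect the custom resource (assumes the CRD's kind is DocumentDBCluster).
kubectl get documentdbcluster my-documentdb

# Watch the pods come up; press Ctrl-C to stop watching.
# The label key and value are assumptions.
kubectl get pods -l app.kubernetes.io/instance=my-documentdb --watch
```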
When all pods show Running, the cluster is ready.
Creating Database Users
The operator manages users declaratively.
Create a documentdb-user.yaml file:
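A sketch of what the user manifest could contain. As before, the API group and all `spec` fields (`clusterRef`, `username`, `passwordSecretRef`, `roles`) are hypothetical names used for illustration; only the `readWrite` role name is a standard MongoDB built-in role.

```shell
# Write an illustrative DocumentDBUser manifest to documentdb-user.yaml.
# NOTE: apiVersion and all spec fields below are hypothetical placeholders.
cat > documentdb-user.yaml <<'EOF'
apiVersion: documentdb.example.com/v1alpha1   # placeholder API group/version
kind: DocumentDBUser
metadata:
  name: app-user
spec:
  clusterRef:
    name: my-documentdb         # cluster the user belongs to (assumed field)
  username: appuser
  passwordSecretRef:            # Secret and key holding the password (assumed field)
    name: app-user-password
    key: password
  roles:
    - db: appdb
      role: readWrite
EOF
```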
Create the password secret:
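One way to do this is to generate a random password and write a standard `v1` Secret manifest. The Secret name `app-user-password` and the key `password` are placeholders; they must match whatever the user manifest references.

```shell
# Generate a random password: 24 random bytes rendered as 48 hex characters,
# which avoids characters that would need escaping in YAML or in a URI.
DB_PASSWORD="$(head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n')"

# Write a core/v1 Secret. stringData accepts plain text and the API server
# stores it base64-encoded under .data.
cat > documentdb-user-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: app-user-password
type: Opaque
stringData:
  password: "${DB_PASSWORD}"
EOF

# The file contains a credential: restrict its permissions and keep it out of Git.
chmod 600 documentdb-user-secret.yaml
```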
Apply the user manifest:
To verify user creation:
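A sketch of the apply and verify steps, assuming the Secret manifest was saved under the hypothetical file name `documentdb-user-secret.yaml` and the user resource is called `app-user`. Skip the first command if the Secret already exists in the cluster.

```shell
# Create the Secret first so the operator can read the password
# when it reconciles the user.
kubectl apply -f documentdb-user-secret.yaml
kubectl apply -f documentdb-user.yaml

# List the user resource, then show its details and recent events
# (assumes the CRD's kind is DocumentDBUser).
kubectl get documentdbuser app-user
kubectl describe documentdbuser app-user
```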
Connecting to the Cluster
First get the service name:
Forward a local port for testing:
Then connect with a MongoDB-compatible client:
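The three steps could be sketched as follows. The Service name `my-documentdb`, port `27017`, user `appuser`, and Secret `app-user-password` are all assumptions; substitute the values from your own cluster. The example uses `mongosh`, but any MongoDB-compatible client should work the same way.

```shell
# Find the client-facing Service created by the operator.
kubectl get svc

# Forward local port 27017 to the Service, in the background.
# Service name and port are assumptions; use the values listed by the command above.
kubectl port-forward svc/my-documentdb 27017:27017 &
PF_PID=$!
sleep 2   # give the tunnel a moment to come up

# Read the password back from the Secret (Secret name and key are assumptions).
DB_PASSWORD="$(kubectl get secret app-user-password -o jsonpath='{.data.password}' | base64 --decode)"

# Open an interactive session through the tunnel.
mongosh "mongodb://appuser:${DB_PASSWORD}@localhost:27017/?directConnection=true"

# Stop the port-forward once the session ends.
kill "$PF_PID"
```

If the password contains characters such as `@`, `/`, or `:`, it has to be percent-encoded before it is placed in the URI. If the cluster enforces TLS, the connection string also needs the matching TLS options.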
Scaling DocumentDB with the Operator
The operator supports both horizontal scaling (changing replica count) and vertical scaling (changing CPU/memory).
Horizontal Scaling (Add/Remove Instances)
Update the cluster manifest:
Apply the update:
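As an illustration, assuming the member count is controlled by a hypothetical `spec.instances` field currently set to 3, a change to 5 can be captured in a small merge-patch file:

```shell
# Write a merge patch that raises the member count from 3 to 5.
# NOTE: spec.instances is an assumed field name.
cat > scale-out-patch.yaml <<'EOF'
spec:
  instances: 5
EOF
```

The same edit can be made directly in `documentdb-cluster.yaml` and re-applied with `kubectl apply -f documentdb-cluster.yaml`. Alternatively, the patch can be sent to the live object with `kubectl patch documentdbcluster my-documentdb --type merge --patch-file scale-out-patch.yaml`. Scaling down is the same change with a smaller number.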
The operator then reconciles the change and adds the two new replicas.
To scale down:
Vertical Scaling (Adjust CPU/Memory)
Modify:
Apply:
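A sketch of a vertical-scaling change, assuming the custom resource exposes a `spec.resources` block shaped like the standard Kubernetes container resources (an assumption; the field may be named or nested differently):

```shell
# Write a merge patch that raises CPU and memory for every member.
# NOTE: the spec.resources path is an assumed field name.
cat > scale-up-patch.yaml <<'EOF'
spec:
  resources:
    requests:
      cpu: "2"
      memory: 4Gi
    limits:
      cpu: "2"
      memory: 4Gi
EOF
```

It could be applied with `kubectl patch documentdbcluster my-documentdb --type merge --patch-file scale-up-patch.yaml`, where the resource name is again a placeholder. Setting requests equal to limits gives the pods the Guaranteed QoS class, which is a common choice for databases.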
The operator performs a safe rolling restart.
Enabling Auto-Scaling
Horizontal Pod Autoscaler Example
Apply:
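A sketch of an HPA that targets the custom resource directly. It assumes the DocumentDBCluster CRD enables the `scale` subresource (without it an HPA cannot resize a custom resource), that a metrics source such as metrics-server is installed, and that the placeholder API group and name match your cluster. `autoscaling/v2` is stable from Kubernetes 1.23; on 1.22 the equivalent API is `autoscaling/v2beta2`.

```shell
# Write an illustrative HorizontalPodAutoscaler to documentdb-hpa.yaml.
cat > documentdb-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-documentdb
spec:
  scaleTargetRef:
    apiVersion: documentdb.example.com/v1alpha1   # placeholder API group/version
    kind: DocumentDBCluster
    name: my-documentdb
  minReplicas: 3
  maxReplicas: 7
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add members when average CPU exceeds 70%
EOF
```

The file would then be applied with `kubectl apply -f documentdb-hpa.yaml`.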
Managing Backups
If backups are enabled in the operator, you can manage them declaratively:
Create a Backup
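A sketch of a backup resource with a schedule and a retention policy, the two settings DocumentDBBackup is described as carrying. The API group and the field names `clusterRef`, `schedule`, and `retention` are hypothetical; the schedule uses standard cron syntax for 02:00 every day.

```shell
# Write an illustrative DocumentDBBackup manifest to documentdb-backup.yaml.
# NOTE: apiVersion and all spec fields below are hypothetical placeholders.
cat > documentdb-backup.yaml <<'EOF'
apiVersion: documentdb.example.com/v1alpha1   # placeholder API group/version
kind: DocumentDBBackup
metadata:
  name: my-documentdb-nightly
spec:
  clusterRef:
    name: my-documentdb     # cluster to back up (assumed field)
  schedule: "0 2 * * *"     # cron expression: every day at 02:00
  retention: 7d             # keep backups for seven days (assumed format)
EOF
```

It would be applied like any other manifest, with `kubectl apply -f documentdb-backup.yaml`.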
Rolling Updates and Version Upgrades
To upgrade DocumentDB, change:
Apply:
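A sketch of the version change as a merge patch, assuming a hypothetical `spec.version` field. The target version is deliberately left as a variable, because the valid version strings depend on the operator release you run.

```shell
# Take the target version from the environment; REPLACE_ME is only a marker
# and must be replaced with a real version before the patch is applied.
TARGET_VERSION="${TARGET_VERSION:-REPLACE_ME}"

# NOTE: spec.version is an assumed field name.
cat > upgrade-patch.yaml <<EOF
spec:
  version: "${TARGET_VERSION}"
EOF
```

After setting `TARGET_VERSION`, the patch could be applied with `kubectl patch documentdbcluster my-documentdb --type merge --patch-file upgrade-patch.yaml`, or the same field can be edited in the cluster manifest and re-applied.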
The operator performs a zero-downtime rolling update:
- Drains a member
- Upgrades it
- Rejoins it to the cluster
- Moves on to the next member
Monitoring and Logging
DocumentDB pods expose metrics through Prometheus endpoints.
Prometheus Integration Example
Add these annotations to the pod template.
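A sketch of the annotations, using the widely used `prometheus.io/*` convention. These annotations only take effect if your Prometheus scrape configuration contains relabeling rules that honor them. The port value is a placeholder, and where the pod template lives inside the custom resource is not specified here.

```shell
# Metrics port of the DocumentDB pods; 9090 is a placeholder, not a known default.
METRICS_PORT="${METRICS_PORT:-9090}"

# Write the annotation fragment to merge into the pod template's metadata.
cat > metrics-annotations.yaml <<EOF
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "${METRICS_PORT}"
    prometheus.io/path: /metrics
EOF
```

Annotation values must be strings, which is why `true` and the port number are quoted.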
ELK, Loki, or OpenSearch can collect logs via standard stdout/stderr.
Securing DocumentDB
Security best practices include:
- Using TLS certificates (cert-manager integration supported)
- Using Kubernetes Secrets for credentials
- Enforcing network policies
- Denying pod exec for production workloads
- Using encryption-at-rest via StorageClass
Example NetworkPolicy
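A sketch of a policy that admits only labeled client pods on the database port, while still letting cluster members talk to each other. The pod labels and port `27017` are assumptions; check the labels the operator puts on its pods, and add a rule for the operator itself if it needs to reach the database pods.

```shell
# Write an illustrative NetworkPolicy to documentdb-networkpolicy.yaml.
cat > documentdb-networkpolicy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: documentdb-allow-clients
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/instance: my-documentdb   # assumed label on the database pods
  policyTypes:
    - Ingress
  ingress:
    # Rule 1: application pods carrying this label may connect on the database port.
    - from:
        - podSelector:
            matchLabels:
              documentdb-client: "true"           # hypothetical client label
      ports:
        - protocol: TCP
          port: 27017                             # assumed database port
    # Rule 2: members of the same cluster may reach each other on any port.
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/instance: my-documentdb
EOF
```

Both `podSelector` entries match pods in the policy’s own namespace only. A NetworkPolicy is enforced only when the cluster’s network plugin supports it.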
GitOps Deployment With Kustomize (Optional)
Example structure:
kustomization.yaml:
Apply:
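A minimal sketch of a layout and kustomization file, assuming the cluster and user manifests are stored next to it under the file names used in this article. The directory name `documentdb-gitops/base` is a placeholder.

```shell
# Assumed layout:
#   documentdb-gitops/
#     base/
#       kustomization.yaml
#       documentdb-cluster.yaml
#       documentdb-user.yaml
mkdir -p documentdb-gitops/base

# List the manifests Kustomize should build; paths are relative to this file.
cat > documentdb-gitops/base/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - documentdb-cluster.yaml
  - documentdb-user.yaml
EOF
```

Once the listed manifests are in that directory, `kubectl apply -k documentdb-gitops/base` builds and applies them in one step. Secrets holding credentials are usually kept out of the repository or stored in encrypted form.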
Conclusion
The new open-source DocumentDB Operator brings the power of DocumentDB-like databases directly into Kubernetes clusters with a fully declarative, self-managing, and automation-friendly workflow. By embracing Kubernetes-native patterns such as CRDs and reconciliation loops, the operator simplifies every aspect of managing document database workloads—from initial deployment and scaling to user management, backups, and rolling upgrades.
Organizations that are deeply aligned with Kubernetes can now enjoy the same operational efficiency for DocumentDB as they do for other containerized applications. This includes the convenience of GitOps deployments, native autoscaling, unified observability, and easier disaster recovery workflows. Instead of relying on external managed DocumentDB services, teams can run clusters on their own infrastructure, gain more control, and customize every layer of deployment.
With robust features like horizontal and vertical scaling, TLS integration, backup orchestration, and the ability to declaratively manage everything through Kubernetes manifests, the DocumentDB Operator empowers engineering teams to treat DocumentDB as a first-class citizen in their platform ecosystem.
As Kubernetes continues to evolve as the universal substrate for application delivery, the DocumentDB Operator represents a meaningful step forward—ensuring that running a scalable, resilient, and secure document database is just as straightforward as managing any other Kubernetes resource. By adopting the operator, you unlock a consistent, automated, and production-grade approach to managing document database workloads at any scale.
If your organization is embracing cloud-native principles, the DocumentDB Operator is not just a useful tool — it is an essential component in building a unified, Kubernetes-native data platform.