Amazon DocumentDB has long been a scalable, fully managed document database service for teams building MongoDB-compatible applications. But as more organizations adopt Kubernetes as their core platform, the need to deploy, manage, and scale DocumentDB-like environments directly inside Kubernetes has grown.

The new open-source DocumentDB Operator now makes this possible. It introduces a Kubernetes-native way to run DocumentDB-compatible workloads by leveraging Custom Resource Definitions (CRDs), declarative configuration, and the reconciliation patterns common in modern Operators.

This article walks step by step through deploying, managing, and scaling DocumentDB on Kubernetes using the new operator, with architectural notes, best practices, and example manifests. The URLs (example.com) and the documentdb.myorg.io API group used in the examples are placeholders; substitute the values from the operator release you install. If you want a Kubernetes-native workflow for DocumentDB that does not rely solely on AWS-managed services, this guide is for you.

What the New DocumentDB Operator Provides

The DocumentDB Operator introduces a cloud-native method for running DocumentDB clusters on any Kubernetes environment. Operator capabilities include:

  • Declarative cluster creation using YAML manifests

  • Automated provisioning of StatefulSets, secrets, services, and PVCs

  • Built-in failover and self-healing

  • Automated rolling upgrades

  • Horizontal and vertical scaling, with optional autoscaling

  • Backup orchestration (if configured)

  • Consistent configuration enforcement

This aligns DocumentDB cluster management with the GitOps model, improving repeatability and operational consistency.

Why Run DocumentDB on Kubernetes?

Running DocumentDB-like workloads on Kubernetes provides:

  • Infrastructure independence (on-prem, cloud, hybrid)

  • GitOps-friendly workflows

  • Native autoscaling mechanisms

  • Unified monitoring and logging pipelines

  • Development parity with production

  • Potentially lower cost in self-managed environments

For teams already deeply invested in Kubernetes, the operator reduces operational overhead and integrates DocumentDB into the platform’s automation ecosystem.

Architecture Overview of the DocumentDB Operator

The operator architecture follows the standard Kubernetes Operator pattern:

Custom Resource Definitions (CRDs)

The operator introduces CRDs such as:

  • DocumentDBCluster – describes the cluster instances, storage, and configuration

  • DocumentDBUser – manages users and credentials

  • DocumentDBBackup – defines backup schedules and retention policies

Kubernetes treats these as first-class API objects.

Reconciliation Loop

The operator continuously ensures that the desired state described in YAML manifests matches the actual state of the running cluster. If something drifts (e.g., a pod dies), the operator repairs it automatically.
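To make the idea concrete, here is a toy model of a single reconciliation step in plain shell. It is only an illustration of the compare-and-correct pattern; the function name is invented for this sketch, and the operator's real logic is not shown in this article.

```shell
# Toy model of one reconciliation step: compare the desired and observed
# member counts and print the corrective action. Illustrative only.
reconcile_step() {
  desired=$1
  observed=$2
  if [ "$observed" -lt "$desired" ]; then
    echo "create $((desired - observed))"
  elif [ "$observed" -gt "$desired" ]; then
    echo "delete $((observed - desired))"
  else
    echo "in-sync"
  fi
}

# Example: three members desired, one pod has died.
reconcile_step 3 2   # prints "create 1"
```

A real controller runs this comparison repeatedly, triggered by watch events, and covers far more than replica counts (storage, configuration, credentials).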

StatefulSets and Persistent Volume Claims

DocumentDB storage is managed using Kubernetes PVCs. StatefulSets ensure stable network identities for each member.

Services for Internal Connectivity

The operator provisions services such as:

  • A headless service for pod DNS resolution

  • A cluster endpoint service for client connections

Scaling Controllers

The operator supports:

  • Vertical scaling (changing CPU/memory)

  • Horizontal scaling (adding/removing cluster members)

Prerequisites

Before working through the examples, ensure you have:

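  • A Kubernetes cluster (v1.22 or later; the autoscaling/v2 HPA example below requires v1.23 or later)

  • kubectl installed

  • A StorageClass configured in the cluster

  • (Optional) Helm, if you install the operator from its chart

  • (Optional) metrics-server, required for the CPU-based HPA example

  • (Optional) cert-manager for TLS

  • (Optional) Kustomize for GitOps workflows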

Installing the DocumentDB Operator

The operator is installed through a standard manifest apply. For example:

kubectl apply -f https://example.com/documentdb-operator.yaml

Or using Helm:

helm repo add documentdb https://example.com/helm
helm install documentdb-operator documentdb/operator

After installation, verify the CRDs:

kubectl get crds | grep documentdb

You should see entries like:

documentdbclusters.documentdb.myorg.io
documentdbusers.documentdb.myorg.io
documentdbbackups.documentdb.myorg.io
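In a script or CI job you may prefer to block until the CRDs are usable rather than grep for them. The sketch below wraps kubectl wait in a helper; the function name is made up for this example, and the CRD names are the placeholder ones shown above.

```shell
# Block until the three CRDs listed above are registered and usable.
# "Established" is the standard CRD readiness condition. The CRD names
# are the placeholders used in this article and may differ in a real release.
wait_for_crds() {
  for crd in documentdbclusters documentdbusers documentdbbackups; do
    kubectl wait --for=condition=Established \
      "crd/${crd}.documentdb.myorg.io" --timeout=60s || return 1
  done
}
```

Call wait_for_crds right after the install step; it returns non-zero if any CRD is not established within 60 seconds.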

Deploying a DocumentDB Cluster

Now let’s create a fully functional cluster.

Define the DocumentDBCluster Manifest

Create a file named documentdb-cluster.yaml:

apiVersion: documentdb.myorg.io/v1alpha1
kind: DocumentDBCluster
metadata:
  name: docdb-prod
spec:
  replicas: 3
  version: "5.0"
  storage:
    size: 20Gi
    className: standard
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1"
      memory: "2Gi"
  networking:
    serviceType: ClusterIP
  auth:
    createAdminUser: true

Apply it:

kubectl apply -f documentdb-cluster.yaml

Check cluster health:

kubectl get documentdbclusters

Check the pods:

kubectl get pods -l app=docdb-prod

When all pods show Running and report every container ready (for example 1/1 in the READY column), the cluster is ready.
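Running is only a pod phase; it does not by itself mean the readiness probes pass. A small helper around kubectl wait can check the Ready condition instead. This is a sketch that assumes the app=<cluster name> label used in the command above; the function name is invented here.

```shell
# Wait until every pod of the cluster reports the Ready condition.
# $1 = cluster name (used as the value of the "app" label, as above)
# $2 = optional timeout, default 300s
wait_for_cluster() {
  name=$1
  timeout=${2:-300s}
  kubectl wait --for=condition=Ready pod -l "app=${name}" --timeout="$timeout"
}
```

For example, wait_for_cluster docdb-prod 600s gives a larger cluster ten minutes to come up.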

Creating Database Users

The operator manages users declaratively.

Create a documentdb-user.yaml file:

apiVersion: documentdb.myorg.io/v1alpha1
kind: DocumentDBUser
metadata:
  name: app-user
spec:
  clusterRef: docdb-prod
  username: appuser
  passwordSecretRef:
    name: appuser-password

Create the password secret:

kubectl create secret generic appuser-password \
  --from-literal=password='MySecurePass123!'
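Outside of a quick test, you may not want a literal password in your shell history. One option is to generate it at creation time, as in the sketch below. The helper names are invented for this example; if you create the Secret this way, read the password back from the Secret when you connect rather than using the literal shown above.

```shell
# Generate a 24-character alphanumeric password from the kernel's random
# source. Alphanumeric output also avoids characters that would need
# percent-encoding in a connection string.
gen_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom 2>/dev/null | head -c 24
}

# Create the Secret that the DocumentDBUser manifest refers to.
create_user_secret() {
  kubectl create secret generic appuser-password \
    --from-literal=password="$(gen_password)"
}
```

Running create_user_secret replaces the kubectl create secret command above.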

Apply the user manifest:

kubectl apply -f documentdb-user.yaml

To verify user creation:

kubectl describe documentdbuser app-user

Connecting to the Cluster

First get the service name:

kubectl get svc | grep docdb-prod

Forward a local port for testing:

kubectl port-forward svc/docdb-prod 27017:27017

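Then, from a second terminal (the port-forward keeps running in the first), connect with a MongoDB-compatible client such as mongosh; the legacy mongo shell accepts the same URI. Single quotes keep an interactive shell from treating the ! as history expansion:

mongosh 'mongodb://appuser:MySecurePass123!@localhost:27017/admin'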
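Rather than typing the password into the URI, you can read it from the Secret and percent-encode it, since characters such as @, : and / are not allowed unescaped in the userinfo part of a connection string. The helpers below are a sketch for ASCII passwords; both function names are made up for this example, and the Secret name and key match the ones created earlier.

```shell
# Percent-encode every character outside the RFC 3986 "unreserved" set.
# Intended for ASCII input.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

# Read the password from the Secret and print a connection string for
# the port-forwarded endpoint.
docdb_uri() {
  pw=$(kubectl get secret appuser-password \
        -o jsonpath='{.data.password}' | base64 -d) || return 1
  printf 'mongodb://appuser:%s@localhost:27017/admin\n' "$(urlencode "$pw")"
}
```

With the port-forward running, mongosh "$(docdb_uri)" connects without the password ever appearing in the command you type.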

Scaling DocumentDB with the Operator

The operator supports both horizontal scaling (changing replica count) and vertical scaling (changing CPU/memory).

Horizontal Scaling (Add/Remove Instances)

Update the cluster manifest:

spec:
  replicas: 5

Apply the update:

kubectl apply -f documentdb-cluster.yaml

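The operator then adds two new members.

To scale down, lower the count the same way. A two-member cluster tolerates fewer failures than a three-member one, so confirm the minimum size your deployment supports before going below three:

spec:
  replicas: 2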
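Before applying a replica change in either direction, it can help to preview what the API server would actually change. kubectl diff does this, but it exits with status 1 when it finds differences, which scripts often mistake for a failure. The wrapper below (the function name is invented for this sketch) treats only exit codes above 1 as errors.

```shell
# Show what would change on the server before applying a manifest.
# kubectl diff exits 0 for "no differences", 1 for "differences found",
# and a higher code on errors, so 1 is not treated as a failure here.
preview_change() {
  manifest=$1
  rc=0
  kubectl diff -f "$manifest" || rc=$?
  if [ "$rc" -gt 1 ]; then
    echo "kubectl diff failed (exit $rc)" >&2
    return "$rc"
  fi
  return 0
}
```

For example: preview_change documentdb-cluster.yaml && kubectl apply -f documentdb-cluster.yaml.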

Vertical Scaling (Adjust CPU/Memory)

Modify:

spec:
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

Apply:

kubectl apply -f documentdb-cluster.yaml

The operator applies the new resource settings with a rolling restart, one member at a time.

Enabling Auto-Scaling

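Horizontal Pod Autoscaler Example

Save the following as docdb-hpa.yaml. An HPA can target a custom resource only if its CRD exposes the scale subresource, and CPU-based scaling needs a metrics source such as metrics-server. Once the HPA owns the replica count, avoid also pinning spec.replicas in the cluster manifest, or the two will overwrite each other.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: docdb-prod-hpa
spec:
  scaleTargetRef:
    apiVersion: documentdb.myorg.io/v1alpha1
    kind: DocumentDBCluster
    name: docdb-prod
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65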

Apply:

kubectl apply -f docdb-hpa.yaml
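To get a feel for how the 65% target translates into replica counts, the sketch below reproduces the core of the HPA calculation, desired = ceil(current replicas × current utilization / target utilization), clamped to the min/max bounds. It deliberately ignores the controller's tolerance band and stabilization windows, and the function name is invented for this example.

```shell
# Approximate the HPA's core calculation with integer arithmetic:
#   desired = ceil(current_replicas * current_utilization / target)
# then clamp the result to [min, max].
# Arguments: current_replicas current_util% target_util% min max
hpa_desired() {
  current=$1 util=$2 target=$3 min=$4 max=$5
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Three members averaging 90% CPU against the 65% target:
hpa_desired 3 90 65 3 10   # prints 5
```

With the bounds above, a quiet cluster never drops below 3 members and a saturated one stops growing at 10.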

Managing Backups

If backups are enabled in the operator, you can manage them declaratively:

Create a Backup

apiVersion: documentdb.myorg.io/v1alpha1
kind: DocumentDBBackup
metadata:
  name: docdb-backup-daily
spec:
  clusterRef: docdb-prod
  schedule: "0 2 * * *"
  retentionDays: 7

Rolling Updates and Version Upgrades

To upgrade DocumentDB, change:

spec:
  version: "5.1"

Apply:

kubectl apply -f documentdb-cluster.yaml

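The operator performs a rolling update, one member at a time, so the cluster stays available as long as clients tolerate brief reconnects:

  1. Drains a member

  2. Upgrades it

  3. Rejoins it to the cluster

  4. Moves on to the next member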
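While the upgrade is in progress, you can watch it move through the cluster by listing each member pod with its container image. The helper below is a sketch: the function name is invented, and it assumes the app=<cluster name> label used earlier and that the database runs in the first container of each pod.

```shell
# Print "<pod name><TAB><image>" for every member pod of a cluster.
# $1 = cluster name (value of the "app" label)
member_images() {
  kubectl get pods -l "app=$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'
}
```

Running member_images docdb-prod a few times during the rollout shows the members switching to the new image one by one.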

Monitoring and Logging

DocumentDB pods expose metrics on a Prometheus-compatible HTTP endpoint.

Prometheus Integration Example

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/path: "/metrics"
  prometheus.io/port: "9216"

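Add these annotations to the pod template. They are a convention rather than a built-in feature: they take effect only if your Prometheus scrape configuration discovers pods by annotation.

Log pipelines such as ELK, Loki, or OpenSearch can collect the pods' logs from standard stdout/stderr.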

Securing DocumentDB

Security best practices include:

  • Using TLS certificates (cert-manager integration supported)

  • Using Kubernetes Secrets for credentials

  • Enforcing network policies

  • Denying pod exec for production workloads

  • Using encryption-at-rest via StorageClass

Example NetworkPolicy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-only
spec:
  podSelector:
    matchLabels:
      app: docdb-prod
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: application
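Note that a policy like the one above, once it selects the database pods, also blocks traffic between the cluster members themselves. The sketch below writes a tighter variant that limits ingress to the database port and additionally admits the members' own pods. The file name and policy name are arbitrary, and it assumes members talk to each other on port 27017 and carry the app: docdb-prod label.

```shell
# Write a NetworkPolicy that allows ingress on TCP 27017 only, from
# application pods and from the cluster's own members.
cat > docdb-networkpolicy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-only-27017
spec:
  podSelector:
    matchLabels:
      app: docdb-prod
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: application
        - podSelector:
            matchLabels:
              app: docdb-prod
      ports:
        - protocol: TCP
          port: 27017
EOF
```

Apply it with kubectl apply -f docdb-networkpolicy.yaml. If Prometheus scrapes the pods directly, it needs its own rule for the metrics port as well.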

GitOps Deployment With Kustomize (Optional)

Example structure:

/
├── base/
│   ├── documentdb-cluster.yaml
│   └── kustomization.yaml
└── overlays/
    ├── dev/
    └── prod/

kustomization.yaml:

resources:
  - documentdb-cluster.yaml

Apply an overlay (each overlay directory needs its own kustomization.yaml that references the base):

kubectl apply -k overlays/prod/
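The overlay directories in the tree are shown empty, so here is one way a prod overlay might look. The sketch creates overlays/prod/kustomization.yaml, reuses the base, and overrides the replica count with a JSON 6902 patch; the value 5 is only an example.

```shell
# Create a prod overlay that reuses the base and overrides spec.replicas.
# Run from the repository root shown in the tree above.
mkdir -p overlays/prod
cat > overlays/prod/kustomization.yaml <<'EOF'
resources:
  - ../../base
patches:
  - target:
      kind: DocumentDBCluster
      name: docdb-prod
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 5
EOF
```

You can inspect the rendered result with kubectl kustomize overlays/prod/ before applying it.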

Conclusion

The new open-source DocumentDB Operator brings DocumentDB-like databases into Kubernetes clusters with a declarative, automation-friendly workflow. By building on Kubernetes-native patterns such as CRDs and reconciliation loops, the operator simplifies managing document database workloads, from initial deployment and scaling to user management, backups, and rolling upgrades.

Organizations that are deeply aligned with Kubernetes can now enjoy the same operational efficiency for DocumentDB as they do for other containerized applications. This includes the convenience of GitOps deployments, native autoscaling, unified observability, and easier disaster recovery workflows. Instead of relying on external managed DocumentDB services, teams can run clusters on their own infrastructure, gain more control, and customize every layer of deployment.

With robust features like horizontal and vertical scaling, TLS integration, backup orchestration, and the ability to declaratively manage everything through Kubernetes manifests, the DocumentDB Operator empowers engineering teams to treat DocumentDB as a first-class citizen in their platform ecosystem.

As Kubernetes continues to evolve as a common substrate for application delivery, the DocumentDB Operator is a meaningful step toward making a scalable, resilient, and secure document database manageable in the same way as other Kubernetes resources. Adopting it gives you a consistent, automated approach to running document database workloads.

If your organization is embracing cloud-native principles, the DocumentDB Operator is worth evaluating as a building block for a unified, Kubernetes-native data platform.