How to Automate Backup MongoDB Using Kubernetes

ben sassi mohammed
4 min read · May 15, 2021
In this blog post, we will walk you step by step through using Kubernetes to back up and restore MongoDB databases running in a Kubernetes environment.

MongoDB is an open-source NoSQL database. It stores JSON-like documents with optional schemas, can easily hold large amounts of data, and scales very well.

Understanding the Basics

Before continuing with this article, some basic understanding of the matter is needed. If you have experience with popular relational database systems such as MySQL, you will find some similarities when working with MongoDB.

The first thing you should know is that MongoDB uses the JSON and BSON (binary JSON) formats for storing its information. JSON is a human-readable format that is perfect for exporting and, eventually, importing your data. You can further manage your exported data with any tool that supports JSON, including a simple text editor.

An example json document looks like this:

{"address":[
{"building":"1007", "street":"Park Ave"},
{"building":"1008", "street":"New Ave"}
]}

JSON is very convenient to work with, but it does not support all the data types available in BSON. This means there will be a so-called 'loss of fidelity' of the information if you use JSON. For backing up and restoring, it is better to use binary BSON.
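To make the difference concrete, here is a sketch of the two export styles, assuming a local mongod on the default port and a hypothetical database named mydb with an address collection:

```shell
# JSON export: human-readable, but BSON-specific types (dates,
# ObjectIds, ...) are converted, so some fidelity is lost.
mongoexport --uri="mongodb://localhost:27017/mydb" \
  --collection=address --out=address.json

# BSON dump: preserves all types, suited for full backups.
mongodump --uri="mongodb://localhost:27017/mydb" --out=/backup/mydb
```

The exported address.json can be opened in any text editor; the BSON dump can only be read back with mongorestore.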

Second, you don’t have to worry about explicitly creating a MongoDB database. If the database you specify for import doesn’t already exist, it is created automatically. The same goes for the structure of collections (the equivalent of database tables): in contrast to other database engines, in MongoDB the structure is created automatically upon the first document (database row) insert.
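For example, assuming a local mongod, neither the database nor the collection below needs to exist beforehand (the names newdb and people are hypothetical):

```shell
# Both the database "newdb" and the collection "people"
# are created automatically on this first insert.
mongosh "mongodb://localhost:27017/newdb" \
  --eval 'db.people.insertOne({ name: "Ada", building: "1007" })'
```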

Third, in MongoDB, reading or inserting large amounts of data, such as for the tasks of this article, can be resource-intensive and consume a lot of CPU, memory, and disk space. This is critical considering that MongoDB is frequently used for large databases and Big Data. The simplest mitigation is to run exports and backups at night or during off-peak hours.

Fourth, information consistency can be problematic if you have a busy MongoDB server where the data changes during the export or backup process. There is no simple solution to this problem, but at the end of this article you will find recommendations for further reading about replication.

While you can use the import and export functions to back up and restore your data, there are better ways to ensure the full integrity of your MongoDB databases. To back up your data, use the command mongodump. For restoring, use mongorestore. Let’s see how they work.
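A minimal local sketch of the pair, assuming a reachable mongod and a hypothetical database mydb (paths are illustrative):

```shell
# Dump every collection of mydb as BSON files under /backup/mydb/mydb/.
mongodump --uri="mongodb://localhost:27017/mydb" --out=/backup/mydb

# Restore the dump back into the server; --drop replaces any existing
# collections with the backed-up versions instead of merging into them.
mongorestore --uri="mongodb://localhost:27017" --drop /backup/mydb
```

The same two commands are what the container scripts below wrap, with the connection string and paths supplied by Kubernetes.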

Source: https://www.digitalocean.com/community/tutorials/how-to-back-up-restore-and-migrate-a-mongodb-database-on-ubuntu-14-04

Step 1: Create a Base Container

Backing Up a MongoDB Database: Creating a dump file

dump.sh

#!/bin/bash

echo "******************************************************"
echo "Starting-BACKUP"
echo "******************************************************"

NOW="$(date +"%F")-$(date +"%T")"

FILE="$DB_NAME-$NOW"

mongodump --uri="$MONGODB_URI" --out="/mongodump/db/$FILE"

echo "End-BACKUP"
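The script builds one timestamped output directory per run from the DB_NAME and MONGODB_URI environment variables (injected by Kubernetes in Step 3). A quick illustration of the naming scheme, with a hypothetical database name:

```shell
# Hypothetical values, just to show the resulting dump path:
DB_NAME=microfunctions
NOW="$(date +"%F")-$(date +"%T")"   # e.g. 2021-05-15-13:35:07
FILE="$DB_NAME-$NOW"
echo "/mongodump/db/$FILE"
```

Because each run gets its own directory, successive backups never overwrite each other on the volume.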

Restoring a MongoDB Database: Creating a restore file

restore.sh

#!/bin/bash

echo "******************************************************"
echo "Starting-RESTORE"
echo "******************************************************"

# Name of the dump directory to restore, passed as the first argument,
# e.g.: ./restore.sh microfunctions-2021-05-15-13:35:07
FILE="$1"

mongorestore --uri="$MONGODB_URI" --drop "/mongodump/db/$FILE"

echo "End-RESTORE"

Writing the Dockerfile

Dockerfile

FROM mongo
# Create app directory
WORKDIR /usr/src/configs

# Copy the backup and restore scripts and make them executable
COPY dump.sh .
RUN chmod +x dump.sh
COPY restore.sh .
RUN chmod +x restore.sh

Pushing a Docker container image to Docker Hub

You can use my Docker image:

https://hub.docker.com/r/microfunctions/microfunctions-mongodump

https://github.com/microfunctionsio/microfunctions-mongodump
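If you prefer to build and publish your own image instead, the usual sequence looks like this (replace <your-dockerhub-user> with your Docker Hub account; the image name mongodump-backup is hypothetical):

```shell
docker build -t <your-dockerhub-user>/mongodump-backup .
docker login
docker push <your-dockerhub-user>/mongodump-backup
```

Whichever image you use, reference it later in the CronJob's image field.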

Step 2: Add a PersistentVolumeClaim

Understanding the basics: https://kubernetes.io/docs/concepts/storage/persistent-volumes/

You must have an existing volume available in your cluster, which you can create via a PersistentVolumeClaim (PVC). For the purposes of this tutorial, assume we have already created a PVC by calling kubectl create -f your_pvc_file.yaml with a YAML file that looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-backup
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: hostpath
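Assuming the manifest above is saved as mongodb-backup-pvc.yaml (a hypothetical filename), you would create and verify the claim like this:

```shell
kubectl create -f mongodb-backup-pvc.yaml
kubectl get pvc mongodb-backup   # STATUS should become Bound
```

Note that storageClassName hostpath is typical of local clusters such as Docker Desktop; on a cloud cluster you would use the storage class your provider offers.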

Step 3: Create a CronJob

You can use CronJobs for cluster tasks that need to be executed on a predefined schedule. As the documentation explains, they are useful for periodic and recurring tasks, like running backups, sending emails, or scheduling individual tasks for a specific time, such as when your cluster is likely to be idle.

As with Jobs, you can create CronJobs via a definition file. Following is a snippet of the CronJob file cron-mongodump-backup.yaml. Use this file to create an example CronJob:

---
apiVersion: batch/v1beta1 # use batch/v1 on Kubernetes 1.21+
kind: CronJob
metadata:
  name: mongodump-backup
spec:
  schedule: "0 */6 * * *" # run every 6 hours
  startingDeadlineSeconds: 60
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 2
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: mongodump-backup
              image: microfunctions/microfunctions-mongodump
              imagePullPolicy: "IfNotPresent"
              env:
                - name: DB_NAME
                  value: "microfunctions"
                - name: MONGODB_URI
                  value: mongodb://microfunctions:UxXKmC9EAn@host-mongodb:27017/microfunctions
              volumeMounts:
                - mountPath: "/mongodump"
                  name: mongodump-volume
              command: ["sh", "-c", "./dump.sh"]
          restartPolicy: OnFailure
          volumes:
            - name: mongodump-volume
              persistentVolumeClaim:
                claimName: mongodb-backup

Apply the CronJob to your cluster:

kubectl apply -f cron-mongodump-backup.yaml
cronjob.batch/mongodump-backup created
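You do not have to wait for the schedule to fire: you can trigger a run manually from the CronJob (the job name manual-backup-1 is arbitrary):

```shell
kubectl create job --from=cronjob/mongodump-backup manual-backup-1
kubectl get pods -l job-name=manual-backup-1
```

This is a convenient way to verify the image, credentials, and volume mount before trusting the schedule.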

Verify that the CronJob was created with the schedule in the definition file:

kubectl get cronjob mongodump-backup
NAME               SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
mongodump-backup   */5 * * * *   False     0        3m36s           8m47s

Show the jobs:

kubectl get job -n microfunctions
NAME                          COMPLETIONS   DURATION   AGE
mongodump-backup-1621085400   1/1           63s        10m
mongodump-backup-1621085700   1/1           34s        5m19s

Show a job's logs:

kubectl logs mongodump-backup-1621085700-k9hz5
Starting-BACKUP
2021-05-15T13:35:07.234+0000  writing microfunctions.statushists to /mongodump/db/microfunctions-2021-05-15-13:35:07/microfunctions/statushists.bson
2021-05-15T13:35:07.245+0000  done dumping microfunctions.statushists (5 documents)
2021-05-15T13:35:07.246+0000  writing microfunctions.clusters to /mongodump/db/microfunctions-2021-05-15-13:35:07/microfunctions/clusters.bson
2021-05-15T13:35:07.247+0000  done dumping microfunctions.clusters (1 document)
2021-05-15T13:35:07.248+0000  writing microfunctions.users to /mongodump/db/microfunctions-2021-05-15-13:35:07/microfunctions/users.bson
2021-05-15T13:35:07.248+0000  writing microfunctions.functions to /mongodump/db/microfunctions-2021-05-15-13:35:07/microfunctions/functions.bson
2021-05-15T13:35:07.249+0000  writing microfunctions.sourcecodes to /mongodump/db/microfunctions-2021-05-15-13:35:07/microfunctions/sourcecodes.bson
2021-05-15T13:35:07.249+0000  writing microfunctions.namespaces to /mongodump/db/microfunctions-2021-05-15-13:35:07/microfunctions/namespaces.bson
2021-05-15T13:35:07.250+0000  done dumping microfunctions.functions (1 document)
2021-05-15T13:35:07.250+0000  done dumping microfunctions.users (1 document)
2021-05-15T13:35:07.251+0000  done dumping microfunctions.sourcecodes (1 document)
2021-05-15T13:35:07.252+0000  done dumping microfunctions.namespaces (1 document)
End-BACKUP
