For some months now I have been working with MongoDB and running a lot of operations against it. In my opinion it is a great database technology because it starts with a small footprint (a few hundred megabytes of binaries) and can grow into a large, horizontally scaled database (using sharding). High availability has been part of the engine since its creation, which makes setting up replication very easy.
However, independently of the database engine, every DBA has the big duty to protect the data in case of a disaster, a human error, or an attack on a security vulnerability. For these scenarios you always need a backup of your data. With MongoDB you have a free choice of backup tools, but fewer options in terms of backup strategy. Let's go into detail.
Physical backup
A simple copy of your data files to another location
Tool | Output format | Pros | Cons |
File system snapshots | 1:1 copy plus journal | – No external tool needed, just built-in file system capabilities – Immediate and fast procedure | – Can lead to data corruption because of dirty buffers that had not yet been moved from memory to the journal (disk) – Stays on the same server and won't be accessible during a disaster recovery |
WiredTiger snapshots | 1:1 copy plus journal | – Incremental backup strategy – Fast and efficient from a storage perspective | – Requires MongoDB Ops Manager or Cloud Manager, which requires an enterprise license |
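One common way to reduce the dirty-buffer risk of a file system snapshot is to flush and lock the instance with MongoDB's `fsync` command before the snapshot and release it with `fsyncUnlock` afterwards. The following is only a minimal sketch of that idea: `client` is assumed to be a `pymongo.MongoClient` (pymongo is not part of the standard library), and `take_snapshot` is a hypothetical callable that triggers whatever snapshot mechanism your storage offers.

```python
from contextlib import contextmanager


@contextmanager
def fsync_locked(client):
    """Flush pending writes to disk and block new writes while the body runs.

    Assumption: `client` behaves like pymongo.MongoClient, i.e. it exposes
    `client.admin.command(...)`.
    """
    # Flush dirty pages to disk, then lock the instance against writes.
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the snapshot fails.
        client.admin.command("fsyncUnlock")


def snapshot_with_lock(client, take_snapshot):
    """Run the hypothetical `take_snapshot` callable while writes are locked."""
    with fsync_locked(client):
        return take_snapshot()
```

Writes are blocked for as long as the lock is held, so the snapshot itself should be quick. On a replica set it is usually kinder to do this on a secondary.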
Logical backup
An export of your data into specific files
Tool | Output format | Pros | Cons |
mongoexport | JSON / CSV | – Can be used for hot backups and keeps your mongod process online – Output is either JSON or CSV, which makes it usable for other systems too | – Restore takes time because the data must be converted back to BSON – Some value attributes could be lost, for example because of restrictions on decimals |
mongodump | BSON (plus JSON metadata) | – Can be used for hot backups and keeps your mongod process online – Saves storage because BSON is more efficient to store than JSON – Runs very fast and has a multi-threaded architecture – A lot of additional options, like including the oplog changes that happened during the backup | – Must be installed separately with the MongoDB Database Tools |
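To make the mongodump options from the table a bit more concrete, here is a small sketch that only assembles the command lines for a compressed archive dump with oplog capture and for the matching restore. The connection string, the archive path and the helper names are placeholders of mine. The flags (`--uri`, `--archive`, `--gzip`, `--oplog`, `--numParallelCollections`, `--oplogReplay`) are the ones shipped with the MongoDB Database Tools. Note that `--oplog` only works against a replica set member and only for full-instance dumps.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_dump_command(uri, archive_path, with_oplog=True, parallel_collections=4):
    """Assemble a mongodump call that writes one gzip-compressed archive file."""
    cmd = [
        "mongodump",
        f"--uri={uri}",
        f"--archive={archive_path}",
        "--gzip",
        f"--numParallelCollections={parallel_collections}",
    ]
    if with_oplog:
        # Also capture oplog entries written while the dump is running.
        cmd.append("--oplog")
    return cmd


def build_restore_command(uri, archive_path, replay_oplog=True):
    """Assemble the matching mongorestore call for such an archive."""
    cmd = ["mongorestore", f"--uri={uri}", f"--archive={archive_path}", "--gzip"]
    if replay_oplog:
        # Replay the captured oplog entries after the data is restored.
        cmd.append("--oplogReplay")
    return cmd


def run_dump(uri, backup_dir):
    """Run mongodump into a timestamped archive and return the archive path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = Path(backup_dir) / f"mongodump-{stamp}.archive.gz"
    subprocess.run(build_dump_command(uri, archive), check=True)
    return archive
```

Calling `run_dump("mongodb://localhost:27017", "/var/backups/mongo")` would then produce one archive per run. Both the URI and the directory are just example values.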
Ultimately, “mongodump” offers a good balance between being a free tool and performance. It has a lot of features and is fast for both backup and restore. The tool is part of the so-called “mongodb-database-tools” suite and must be downloaded separately from the official MongoDB website. When you use it, please take care to store the dumps outside of the database server, for example in an S3 bucket or on a NAS share, never only on the same server.
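As a rough illustration of getting a dump off the server, the sketch below copies an archive to a second location and verifies the copy with a checksum. It assumes the NAS share is already mounted as a normal directory. The function names and paths are hypothetical, and an S3 upload would need an extra client library that is not shown here.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ship_off_server(archive, remote_dir):
    """Copy a dump archive to an off-server directory and verify the copy.

    Assumption: `remote_dir` is a mounted NAS share (or similar) that lives
    on different hardware than the database server.
    """
    remote_dir = Path(remote_dir)
    remote_dir.mkdir(parents=True, exist_ok=True)
    target = remote_dir / Path(archive).name
    shutil.copy2(archive, target)
    if sha256_of(archive) != sha256_of(target):
        # Do not keep a copy that differs from the original dump.
        target.unlink()
        raise OSError(f"checksum mismatch while copying {archive}")
    return target
```

Only after the copy has been verified would I consider removing the local archive.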
I wrote a script that can be used as part of a backup strategy. It is available in my GitHub repo here.