PWO:Working Item Page: Database Backup
Working Item
From Core site & URLs:
- Database Backup
- Construct a reliable database backup/snapshot system so Docker containers and/or the instance may be restarted without losing data, as well as changing the `db_data` volume name from `ubuntu_db_data` to `paperworld-wiki_db_data` (see the volume-copy sketch below).
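Docker has no in-place volume rename, so a minimal sketch of the rename is to copy the data into a fresh, correctly named volume while the stack is stopped; the Alpine helper container is just a convenience and any small image would do:

```bash
# Sketch only: copy the old volume's contents into a new, correctly named one.
# Assumes the DB container is stopped first so the data is quiescent.
docker volume create paperworld-wiki_db_data

docker run --rm \
  -v ubuntu_db_data:/from:ro \
  -v paperworld-wiki_db_data:/to \
  alpine sh -c 'cp -a /from/. /to/'

# Point docker-compose.yml at the new volume (or rely on the new project
# name), bring the stack back up, verify, then remove the old volume:
# docker volume rm ubuntu_db_data
```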
Backup Structure
- Filename: `paperworld-wiki-$(date +%F_%H%M)`
- Backup Rotation:
- Manual On-demand Backups stored within the host at `/var/backups/paperworld-wiki/manual/`
- While an update is in development for the service, backups will be stored here and restored as needed.
- Completely separate from scheduled backup chain
- 3 daily backups stored within the host at `/var/backups/paperworld-wiki/`
- 7 daily backups stored within AWS S3 Standard Buckets; 7-day expiration
- Total daily backup range of 10 days
- 4 weekly backups stored within AWS S3 Standard Buckets; 4-week expiration
- 12 monthly backups stored within AWS S3 Glacier Deep Archive; 12-month expiration
The Plan
- Identify What Needs Backing Up
- Databases (MariaDB/MySQL for MediaWiki content, users, configs)
- Uploads/static files (MediaWiki images/ dir, favicon, robots.txt, etc.)
- Configs/code (often in Git already, but if configs are generated on the host, back them up too)
- For this wiki, primary target = DB dumps, plus optionally static/ and images/ in the same process (see the file-backup sketch below).
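A quick sketch of the non-DB half; the host path /srv/paperworld-wiki and the location of images/ and LocalSettings.php are assumptions, so adjust to the real compose bind mounts:

```bash
# Tar the uploads and host-side config next to the DB dumps.
# /srv/paperworld-wiki is a placeholder for the real wiki root on the host.
tar -czf "/var/backups/paperworld-wiki/files-$(date +%F_%H%M).tar.gz" \
  -C /srv/paperworld-wiki images LocalSettings.php
```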
- Choose Backup Types
- Logical (mysqldump): portable, human-readable SQL, slower for restore on huge DBs but fine for wikis.
- Physical (mariabackup): byte-for-byte, faster restores but bigger files.
- ~ You’ll almost always do logical backups for a wiki unless you’re hitting 50GB+.
- ~ For this case → `mysqldump --single-transaction` is the standard (see the example below).
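For the manual/on-demand case, a minimal sketch; the container name mediawiki-db matches the dump job later on this page, while the database name my_wiki and the DB_ROOT_PASS variable are assumptions:

```bash
# One-off logical dump into the manual backup directory.
docker exec -e MYSQL_PWD="$DB_ROOT_PASS" mediawiki-db \
  mysqldump --single-transaction --quick -u root my_wiki \
  > "/var/backups/paperworld-wiki/manual/paperworld-wiki-$(date +%F_%H%M).sql"
```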
- Decide Locations & Permissions
- Manual local dir: /var/backups/paperworld-wiki/manual/ (on-demand only)
- Scheduled local dir: /var/backups/paperworld-wiki/
- Remote: AWS S3 Standard (daily+weekly) and Glacier Deep Archive (monthly)
- ~ Permissions: only the backup user can write; backups are chmod 600; S3 bucket policies block deletion except by the backup process (see the sketch below).
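A sketch of the permission setup, assuming a dedicated wikibackup system user on the host, a bucket named paperworld-wiki-backups, and an IAM user paperworld-backup used only for uploads (all three names are placeholders):

```bash
# Local dirs: owned by the backup user, not world-readable.
sudo install -d -m 700 -o wikibackup -g wikibackup \
  /var/backups/paperworld-wiki /var/backups/paperworld-wiki/manual

# Offsite: the prod host's IAM user may only upload, never read or delete.
aws iam put-user-policy --user-name paperworld-backup \
  --policy-name backup-put-only \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::paperworld-wiki-backups/*"
    }]
  }'
```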
- Automate the Process
- Usually broken into three jobs (tied together in the script sketch after this list):
- Dump job: `docker exec mediawiki-db mysqldump ... > /var/backups/paperworld-wiki/...`
- Compress with zstd or gzip (zstd is faster).
- Rotation job (local)
- Keep only last 3 local daily dumps.
- Sync job (offsite)
- Use rclone or AWS CLI to upload new dumps.
- Apply lifecycle policies in S3 to expire old objects automatically.
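A sketch tying the three jobs together, assuming the mediawiki-db container from the dump job plus a few placeholders (my_wiki database, a wikibackup DB user with its password in /root/.wikibackup-pass, bucket paperworld-wiki-backups); run it daily from cron or a systemd timer:

```bash
#!/usr/bin/env bash
# Dump -> compress -> rotate locally -> sync offsite. Sketch only.
set -euo pipefail

BACKUP_DIR=/var/backups/paperworld-wiki
STAMP="paperworld-wiki-$(date +%F_%H%M)"
DB_PASS="$(cat /root/.wikibackup-pass)"
BUCKET=s3://paperworld-wiki-backups

# 1. Dump job: logical dump out of the container, compressed with zstd.
docker exec -e MYSQL_PWD="$DB_PASS" mediawiki-db \
  mysqldump --single-transaction --quick -u wikibackup my_wiki \
  | zstd -q -c > "$BACKUP_DIR/$STAMP.sql.zst"
chmod 600 "$BACKUP_DIR/$STAMP.sql.zst"

# 2. Rotation job (local): keep only the 3 newest daily dumps.
ls -1t "$BACKUP_DIR"/paperworld-wiki-*.sql.zst | tail -n +4 | xargs -r rm -f

# 3. Sync job (offsite): lifecycle rules handle expiry per prefix.
aws s3 cp "$BACKUP_DIR/$STAMP.sql.zst" "$BUCKET/daily/"
if [ "$(date +%u)" = 7 ]; then                  # Sundays -> weekly copy
  aws s3 cp "$BACKUP_DIR/$STAMP.sql.zst" "$BUCKET/weekly/"
fi
if [ "$(date +%d)" = 01 ]; then                 # 1st of month -> Deep Archive
  aws s3 cp "$BACKUP_DIR/$STAMP.sql.zst" "$BUCKET/monthly/" \
    --storage-class DEEP_ARCHIVE
fi
```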
- Rotation & Retention
- Implemented via:
- Local: find + -mtime or systemd-tmpfiles to purge >N days.
- S3: Lifecycle rules for 7-day expiry (daily, Standard), 4-week expiry (weekly, Standard), and 12-month expiry (monthly, Glacier Deep Archive); see the example below.
- Don't manually delete offsite copies unless needed; let the lifecycle rules do it.
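A sketch of both sides of the retention, assuming the same paperworld-wiki-backups bucket and daily/weekly/monthly prefixes used by the sync job above:

```bash
# Local alternative to the ls-based rotation: find + -mtime.
find /var/backups/paperworld-wiki -maxdepth 1 \
  -name 'paperworld-wiki-*.sql.zst' -mtime +3 -delete

# One S3 lifecycle rule per prefix; the monthly objects are already uploaded
# with --storage-class DEEP_ARCHIVE, so only expiry is needed here.
aws s3api put-bucket-lifecycle-configuration \
  --bucket paperworld-wiki-backups \
  --lifecycle-configuration '{
    "Rules": [
      {"ID": "daily",   "Filter": {"Prefix": "daily/"},   "Status": "Enabled", "Expiration": {"Days": 7}},
      {"ID": "weekly",  "Filter": {"Prefix": "weekly/"},  "Status": "Enabled", "Expiration": {"Days": 28}},
      {"ID": "monthly", "Filter": {"Prefix": "monthly/"}, "Status": "Enabled", "Expiration": {"Days": 365}}
    ]
  }'
```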
- Test Restores
- On a schedule (monthly at least), take a random backup and restore to a staging DB.
- Check MediaWiki loads as expected.
- This step is often skipped in prod, and it’s where a lot of teams get burned.
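A sketch of the restore drill against a throwaway container, assuming the mariadb:10.11 image and the my_wiki database/credential placeholders used above (MediaWiki table names may carry a $wgDBprefix):

```bash
# Restore the newest dump into a disposable staging DB and sanity-check it.
DUMP="$(ls -1t /var/backups/paperworld-wiki/paperworld-wiki-*.sql.zst | head -n 1)"

docker run -d --name restore-test \
  -e MARIADB_ROOT_PASSWORD=restore-only \
  -e MARIADB_DATABASE=my_wiki \
  mariadb:10.11
sleep 30   # crude wait for the server to accept connections

zstd -dc "$DUMP" | docker exec -i restore-test mariadb -uroot -prestore-only my_wiki

# The page table should exist and be non-empty if the dump is healthy;
# for a full check, point a staging MediaWiki at this DB and load a page.
docker exec restore-test mariadb -uroot -prestore-only my_wiki \
  -e 'SELECT COUNT(*) FROM page;'

docker rm -f restore-test
```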
- Secure the Pipeline
- Backup DB user = read-only (no write or DDL rights); see the GRANT sketch below.
- Encrypt backups in-flight (TLS to S3) and optionally at rest (S3 SSE or GPG locally).
- Make offsite destinations write-only from prod host’s perspective.
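A sketch of the read-only dump user and the at-rest encryption options; the user/database names, password handling, and file names are placeholders:

```bash
# Grant only what mysqldump needs: no INSERT/UPDATE/DELETE, no DDL.
docker exec -i mediawiki-db mariadb -uroot -p"$DB_ROOT_PASS" <<'SQL'
CREATE USER IF NOT EXISTS 'wikibackup'@'%' IDENTIFIED BY 'change-me';
GRANT SELECT, LOCK TABLES, SHOW VIEW, TRIGGER ON my_wiki.* TO 'wikibackup'@'%';
FLUSH PRIVILEGES;
SQL

# At rest: server-side encryption on upload...
aws s3 cp dump.sql.zst s3://paperworld-wiki-backups/daily/ --sse AES256

# ...or GPG locally before upload (passphrase file is a placeholder).
gpg --batch --pinentry-mode loopback --passphrase-file /root/.backup-gpg-pass \
  --symmetric --cipher-algo AES256 -o dump.sql.zst.gpg dump.sql.zst
```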
- Document & Monitor
- Keep a README in /var/backups/paperworld-wiki/ describing restore commands.
- Add logging & monitoring (e.g., send an alert if backup fails or size changes unexpectedly).
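A sketch of a post-backup check that could run right after the nightly script; the alert endpoint is hypothetical (swap in mail, Slack, a healthchecks.io ping, etc.):

```bash
# Alert if the newest dump is missing/stale or shrank suspiciously.
ALERT_URL="https://example.com/backup-alert"   # hypothetical endpoint
BACKUP_DIR=/var/backups/paperworld-wiki

LATEST="$(ls -1t "$BACKUP_DIR"/paperworld-wiki-*.sql.zst 2>/dev/null | head -n 1)"
PREV="$(ls -1t "$BACKUP_DIR"/paperworld-wiki-*.sql.zst 2>/dev/null | sed -n 2p)"

# No dump at all, or newest dump older than ~25 hours.
if [ -z "$LATEST" ] || [ -n "$(find "$LATEST" -mmin +1500)" ]; then
  curl -fsS -m 10 "$ALERT_URL?msg=backup-missing" || true
fi

# Newest dump less than half the size of the previous one.
if [ -n "$LATEST" ] && [ -n "$PREV" ]; then
  NEW=$(stat -c %s "$LATEST"); OLD=$(stat -c %s "$PREV")
  if [ "$NEW" -lt $((OLD / 2)) ]; then
    curl -fsS -m 10 "$ALERT_URL?msg=backup-shrank" || true
  fi
fi
```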