PWO:Working Item Page: Database Backup

Working Item

From Core site & URLs:

Database Backup
Construct a reliable database backup/snapshot system so the Docker containers and/or instance can be restarted without losing data, and rename the db_data volume from ubuntu_db_data to paperworld-wiki_db_data.

Backup Structure

  • Filename: paperworld-wiki-$(date +%F_%H%M) (see the path sketch after this list)
  • Backup Rotation:
    • Manual on-demand backups stored within the host at /var/backups/paperworld-wiki/manual/
      • While an update is in development for the service, backups will be stored here and restored as needed.
      • Completely separate from scheduled backup chain
    • 3 daily backups stored within the host at /var/backups/paperworld-wiki/
    • 7 daily backups stored within AWS S3 Standard Buckets; 7-day expiration
      • Total daily backup range of 10 days
    • 4 weekly backups stored within AWS S3 Standard Buckets; 4-week expiration
    • 12 monthly backups stored within AWS S3 Glacier Deep Archive Vault; 12-month expiration
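
As a concrete illustration of the naming scheme above (a minimal sketch; the .sql.zst suffix assumes the zstd compression chosen in step 4 of the plan):

  STAMP=$(date +%F_%H%M)                      # e.g. 2025-01-12_0200
  SCHEDULED=/var/backups/paperworld-wiki/paperworld-wiki-$STAMP.sql.zst
  MANUAL=/var/backups/paperworld-wiki/manual/paperworld-wiki-$STAMP.sql.zst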

The Plan

  1. Identify What Needs Backing Up
    • Databases (MariaDB/MySQL for MediaWiki content, users, configs)
    • Uploads/static files (MediaWiki images/ dir, favicon, robots.txt, etc.)
    • Configs/code (often in Git already, but if configs are generated on the host, back them up too)
    • For you, primary target = DB dumps, plus optionally static/ and images/ in the same process.
  2. Choose Backup Types
    • Logical (mysqldump): portable, human-readable SQL, slower for restore on huge DBs but fine for wikis.
    • Physical (mariabackup): byte-for-byte, faster restores but bigger files.
    ~ You’ll almost always do logical backups for a wiki unless you’re hitting 50GB+.
    ~ For your case → mysqldump --single-transaction is the standard.
  3. Decide Locations & Permissions
    • Manual local dir: /var/backups/paperworld-wiki/manual/ (on-demand only)
    • Scheduled local dir: /var/backups/paperworld-wiki/
    • Remote: AWS S3 Standard (daily+weekly) and Glacier Deep Archive (monthly)
    ~ Permissions: only backup user can write; backups are chmod 600; S3 bucket policies block deletion except by backup process.
  4. Automate the Process
    Usually broken into three jobs (example scripts for each are sketched under Example Command Sketches below):
    1. Dump job
      docker exec mediawiki-db mysqldump ... > /var/backups/paperworld-wiki/...
      • Compress with zstd or gzip (zstd is faster).
    2. Rotation job (local)
      • Keep only last 3 local daily dumps.
    3. Sync job (offsite)
      • Use rclone or AWS CLI to upload new dumps.
      • Apply lifecycle policies in S3 to expire old objects automatically.
  5. Rotation & Retention
    • Implemented via:
      • Local: find with -mtime (or systemd-tmpfiles) to purge dumps older than N days.
      • S3: Lifecycle rules for 7-day expiry (Standard), 4-week expiry (Standard), and 12-month expiry (Glacier Deep Archive); see the lifecycle sketch after this list.
    • Don't manually delete offsite copies unless needed; let the lifecycle rules do it.
  6. Test Restores
    • On a schedule (monthly at least), take a random backup and restore it to a staging DB (a restore drill is sketched after this list).
    • Check that MediaWiki loads as expected.
    This step is often skipped in prod, and it’s where a lot of teams get burned.
  7. Secure the Pipeline
    • Backup DB user = read-only (no DDL/DML rights); see the grant sketch after this list.
    • Encrypt backups in-flight (TLS to S3) and optionally at rest (S3 SSE or GPG locally).
    • Make offsite destinations write-only from prod host’s perspective.
  8. Document & Monitor
    • Keep a README in /var/backups/paperworld-wiki/ describing restore commands.
    • Add logging & monitoring (e.g., send an alert if a backup fails or its size changes unexpectedly); a minimal cron sketch appears after this list.
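
Example Command Sketches

The snippets below are sketches under stated assumptions, not the final implementation. The container name mediawiki-db and the /var/backups/paperworld-wiki paths come from the plan above; the database name (my_wiki), the backup user and its credential file, the S3 bucket name, and all script paths are placeholders.

Dump job (step 4.1), using the mysqldump --single-transaction approach from step 2:

  #!/usr/bin/env bash
  # Dump job sketch: consistent logical dump of the wiki DB, compressed with zstd.
  set -euo pipefail
  umask 077                                   # new files come out mode 600 (step 3)

  BACKUP_DIR=/var/backups/paperworld-wiki
  STAMP=$(date +%F_%H%M)
  DUMP_FILE="$BACKUP_DIR/paperworld-wiki-$STAMP.sql.zst"
  mkdir -p "$BACKUP_DIR"

  # --single-transaction takes a consistent InnoDB snapshot without locking the wiki.
  docker exec mediawiki-db \
    mysqldump --single-transaction --quick \
    -u backup -p"$(cat /etc/paperworld-wiki/backup-db-pass)" my_wiki \
    | zstd -q > "$DUMP_FILE"

  logger -t paperworld-wiki-backup "wrote $DUMP_FILE"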
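
Rotation and sync jobs (steps 4.2 and 4.3). -maxdepth 1 keeps manual/ out of the scheduled rotation; offsite expiry is left to the S3 lifecycle rules sketched next:

  #!/usr/bin/env bash
  # Rotation + sync sketch; the bucket name and daily/ prefix are placeholders.
  set -euo pipefail
  BACKUP_DIR=/var/backups/paperworld-wiki

  # Local rotation: purge scheduled dumps older than ~3 days (manual/ is untouched).
  find "$BACKUP_DIR" -maxdepth 1 -name 'paperworld-wiki-*.sql.zst' -mtime +3 -delete

  # Offsite sync: push the newest dump to the daily chain in S3 Standard.
  LATEST=$(ls -1t "$BACKUP_DIR"/paperworld-wiki-*.sql.zst | head -n 1)
  aws s3 cp "$LATEST" "s3://paperworld-wiki-backups/daily/$(basename "$LATEST")"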
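
Lifecycle rules (step 5), applied once per bucket with the AWS CLI. Prefixes and day counts mirror the retention table above (the monthly tier is expressed as a transition to the Deep Archive storage class rather than a separate Glacier vault); how dumps get copied under weekly/ and monthly/ (e.g. extra weekly/monthly cron jobs) is left open here:

  # Lifecycle sketch: 7-day daily expiry, 28-day weekly expiry, monthly copies
  # transition to Deep Archive immediately and expire after ~12 months.
  aws s3api put-bucket-lifecycle-configuration \
    --bucket paperworld-wiki-backups \
    --lifecycle-configuration '{
      "Rules": [
        { "ID": "daily-expire", "Status": "Enabled",
          "Filter": { "Prefix": "daily/" },
          "Expiration": { "Days": 7 } },
        { "ID": "weekly-expire", "Status": "Enabled",
          "Filter": { "Prefix": "weekly/" },
          "Expiration": { "Days": 28 } },
        { "ID": "monthly-archive", "Status": "Enabled",
          "Filter": { "Prefix": "monthly/" },
          "Transitions": [ { "Days": 0, "StorageClass": "DEEP_ARCHIVE" } ],
          "Expiration": { "Days": 365 } }
      ]
    }'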
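
Restore drill (step 6): load a randomly chosen dump into a throwaway MariaDB container and sanity-check it. The image tag, root password, and database name are assumptions:

  #!/usr/bin/env bash
  # Restore-test sketch: restore a random backup into a scratch container, then clean up.
  set -euo pipefail
  BACKUP_DIR=/var/backups/paperworld-wiki
  DUMP=$(ls -1 "$BACKUP_DIR"/paperworld-wiki-*.sql.zst | shuf -n 1)   # pick a random backup

  docker run -d --name restore-test -e MARIADB_ROOT_PASSWORD=throwaway mariadb:10.11
  sleep 30                                    # crude wait; a mysqladmin ping loop is nicer

  docker exec restore-test mysql -uroot -pthrowaway -e 'CREATE DATABASE my_wiki'
  zstd -dc "$DUMP" | docker exec -i restore-test mysql -uroot -pthrowaway my_wiki

  # Cheap sanity check before pointing a staging MediaWiki at the restored DB.
  docker exec restore-test mysql -uroot -pthrowaway -e 'SELECT COUNT(*) FROM my_wiki.page'
  docker rm -f restore-test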
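
Read-only backup user (step 7), created once against the live DB container. The user name, host pattern, password, and database name are placeholders; the grants cover what mysqldump --single-transaction needs without allowing any writes:

  # Grant sketch; mysql prompts for the root password.
  docker exec -it mediawiki-db mysql -uroot -p -e \
    "CREATE USER 'backup'@'%' IDENTIFIED BY 'change-me';
     GRANT SELECT, SHOW VIEW, TRIGGER, LOCK TABLES ON my_wiki.* TO 'backup'@'%';"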
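
Scheduling and a crude failure signal (steps 4 and 8): cron mails any job output to MAILTO, so error output from a failed dump or sync reaches the admin. The address and the installed script paths (where the dump and rotation/sync sketches above might live) are placeholders; size checks and success heartbeats would sit on top of this:

  # /etc/cron.d/paperworld-wiki-backup (sketch)
  MAILTO=admin@example.org
  0 2 * * *  root  /usr/local/sbin/paperworld-wiki-dump.sh && /usr/local/sbin/paperworld-wiki-rotate-sync.sh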