I’m a big fan of single-tenant databases, one for each customer. With MySQL and InnoDB, a single server can scale to tens of thousands of individual databases.

As long as you’re deliberate with your schema design up front - here, I’m a big fan of explicit key/value tables that let you attach arbitrary metadata to otherwise-normalised objects - you buy yourself incredibly simple isolation: once you’ve identified your tenant, you select the appropriate database (using PHP’s select_db(), for example) and every query after that is scoped automatically. You no longer have to rely on query scoping in Eloquent or another ORM, or, much worse, remember to scope things yourself when writing manual queries.

One negative is that any task touching every tenant means iterating through thousands of databases, which can be slow. Generally, these tasks aren’t time-sensitive, so this doesn’t cause a problem in practice. Remember - everything is a trade-off and a decision - there is no right or wrong answer, and it entirely depends on your requirements, environment, application, and everything else.
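For instance, a schema migration has to be applied to each tenant database in turn. A minimal sketch of what that looks like (the tenant_ prefix, the settings table, and the new column are all hypothetical, and credentials are assumed to come from a client option file):

for db in $(mysql -N -e 'SHOW DATABASES' | grep '^tenant_'); do
  # Hypothetical migration applied to every tenant database in turn
  mysql "$db" -e 'ALTER TABLE settings ADD COLUMN locale VARCHAR(8)'
done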

Let’s make a bash script to iterate through every database on the server, export it, compress it, and give us a reasonable progress report and estimated time remaining.


#!/bin/bash

# Fail the pipeline if any command in it fails, so a mysqldump
# error isn't masked by gzip exiting successfully
set -o pipefail

# Variables
MYSQL_USER="your_user"
MYSQL_PASS="your_password"
MYSQL_HOST="your_host"
BACKUP_DIR="/path/to/your/backup/directory"
LOG_FILE="/path/to/your/backup/directory/backup_log.txt"

# Create backup directory if it doesn't exist already
mkdir -p "$BACKUP_DIR"

# Clear the log file before we start
> "$LOG_FILE"

# Pull the list of databases on the server, skipping the header row
# and the system schemas we don't want to dump
databases=$(mysql -u "$MYSQL_USER" -p"$MYSQL_PASS" -h "$MYSQL_HOST" -e 'SHOW DATABASES;' \
  | tail -n +2 \
  | grep -Ev '^(information_schema|performance_schema|mysql|sys)$')

# Count the number of databases
total_databases=$(echo "$databases" | wc -l)
current_database=0

# Start time
start_time=$(date +%s)

# Read and iterate the databases list
for database in $databases
do
  current_database=$((current_database + 1))
  echo "[$(date +'%Y-%m-%d %H:%M:%S')] Starting backup for database: $database ($current_database of $total_databases)"

  # Dump the database + GZIP it; thanks to pipefail, $? reflects a
  # failure in either mysqldump or gzip
  start_db_time=$(date +%s)
  mysqldump -u "$MYSQL_USER" -p"$MYSQL_PASS" -h "$MYSQL_HOST" --databases "$database" | gzip > "$BACKUP_DIR/$database.sql.gz"
  if [ $? -eq 0 ]; then
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] Backup completed for database: $database" >> "$LOG_FILE"
  else
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] Backup failed for database: $database" >> "$LOG_FILE"
  fi
  end_db_time=$(date +%s)
  db_duration=$((end_db_time - start_db_time))
  echo "[$(date +'%Y-%m-%d %H:%M:%S')] Backup for database $database completed in $db_duration seconds"

  # Calculate and display progress
  elapsed_time=$((end_db_time - start_time))
  avg_time_per_db=$((elapsed_time / current_database))
  remaining_dbs=$((total_databases - current_database))
  estimated_remaining_time=$((avg_time_per_db * remaining_dbs))

  echo "[$(date +'%Y-%m-%d %H:%M:%S')] Progress: $current_database/$total_databases databases completed"
  echo "[$(date +'%Y-%m-%d %H:%M:%S')] Estimated remaining time: $(date -u -d @$estimated_remaining_time +'%H:%M:%S')"
done

# End time
end_time=$(date +%s)
total_duration=$((end_time - start_time))
echo "[$(date +'%Y-%m-%d %H:%M:%S')] All backups completed in $(date -u -d @$total_duration +'%H:%M:%S')"

Edit the variables at the top with your MySQL credentials and host, the directory to store the exported files in, and the log file path.
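As an aside, if you’d rather keep the password out of the script (and out of the process list), the mysql and mysqldump clients can read credentials from an option file instead - something like the following in ~/.my.cnf, with the -u/-p/-h flags then dropped from the script:

# ~/.my.cnf - picked up automatically by mysql and mysqldump
[client]
user=your_user
password=your_password
host=your_host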

Then make the script executable (chmod +x export.sh) and start a named screen session (screen -S export). Once you’re ready to go, run the script we just created (./export.sh). Once it’s running and outputting successful progress, you can detach from the session (Ctrl-a d) and it will carry on running in the background.
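Putting those steps together (assuming you saved the script as export.sh):

chmod +x export.sh   # make the script executable
screen -S export     # start a named screen session
./export.sh          # run the export inside the session
# ...watch the first few databases complete, then detach with Ctrl-a d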

As it works through the databases, the estimated time remaining adjusts itself - it takes the average time per database so far and multiplies it by the number of databases left. For example, if the first 500 of 2,000 databases took 1,000 seconds, that’s two seconds per database on average, so the remaining 1,500 give an estimate of 3,000 seconds.

My databases aren’t particularly large, so this process took only a few hours to complete. If yours are pretty large, you could look at parallelising the dumps across multiple processes - mostly an exercise for you, but there’s a starting point below!
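A minimal, untested sketch of one approach, using GNU xargs to run several dumps at once - it assumes the same MYSQL_* and BACKUP_DIR variables as the script above and would replace the for loop, with -P 4 capping it at four concurrent dumps:

export MYSQL_USER MYSQL_PASS MYSQL_HOST BACKUP_DIR
echo "$databases" | xargs -P 4 -I {} sh -c '
  # Each invocation dumps and compresses a single database,
  # passed in as $1 to avoid quoting surprises
  mysqldump -u "$MYSQL_USER" -p"$MYSQL_PASS" -h "$MYSQL_HOST" --databases "$1" \
    | gzip > "$BACKUP_DIR/$1.sql.gz"
' _ {}

You lose the neat sequential progress output this way, so you’d want to lean on the log file instead.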

Whenever you want to check in, you can tail the log file (tail -f /path/to/your/backup/directory/backup_log.txt), or reattach to the session with screen -r export to see the estimated remaining time. Don’t forget that when you reattach, you need to detach again with Ctrl-a d - if you exit the session instead, the script will stop and you’ll have to restart!