Recently, I had a client that had a rack-mounted server that went into a hard lock on a Saturday morning. Monday morning, when it could be rebooted, MySQL received the dreaded “Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’.”
Of course, with the server locking up on Saturday morning and the expire_logs_days* variable set for two days, the binary logs were gone. Once I validated the logs were indeed gone, the next step was to recreate the replica. This would entail storing the database from the source, gathering the master status information and restoring it on the replica.
At this point, my best option was to use mysqldump on the source. The normal procedure would be to execute a FLUSH TABLE WITH READ LOCK and SHOW MASTER STATUS on the source which would block write statements to ensure data integrity. But in this instance, the source was being highly utilized and I couldn’t perform a read lock.
In this situation, I also couldn’t do the usual mysqldump and then compress the data since there was limited space on the source. I started the mysqldump and sent it directly to a compressed file using gzip.
Since I couldn’t issue the FLUSH TABLE WITH READ LOCK and the SHOW MASTER STATUS command to get the binary log coordinates, I had to use the option master-data=2.
mysqldump --skip-lock-tables --single-transaction --flush-logs --master-data=2 -A -u root -p --all-databases --triggers --routines --events | gzip > dump.sql.gz
The variable master-data=2** will add a line in the mysqldump as a comment. This will provide the CHANGE MASTER information along with the binary log coordinates such as the file name and position. The mysqldump file will have a line similar to the below example for option two.
Once I had the dump completed, I used scp to copy the dump file to the replica. I then executed the change master to master_host command on the replica.
change master to master_host='10.0.0.10',master_user='replica', master_password='password', master_log_file='mysql-bin.000001', master_log_pos=120;
The replica started right up and was replicating correctly.
1. Always prepare for replication recovery by setting your binary log file removal to a sufficient time frame (long/holiday weekend plus one day). That way, in the event of an issue on Friday evening, your logs will be protected until you are able to resolve the situation.
2. Also, important, and I cannot stress this enough, is to have space available to dump your database.
*The variable expire_log_days has been deprecated as of MySQL 8.0. The replacement for this variable in MySQL 8.0 is binlog_expire_logs_seconds.
**The option master-data has been deprecated as of MySQL 5.7.5.
For more information on the change master command visit: https://dev.mysql.com/doc/refman/5.6/en/change-master-to.html
If you have any questions, please feel free to contact us!