MySQL Asynchronous Replication Error code: 1032

SUMMARY:

Database administrators can resolve the common MySQL asynchronous replication error “Error_code: 1032 Could not execute Update_rows event on table; Can’t find record” by meticulously analyzing the source’s binary logs to identify the missing data and inserting the correct record into the replica, bypassing the substantial time required for a full rebuild.

Error_code: 1032 typically indicates a data mismatch between the primary (source) and secondary (replica) databases, preventing the replica from executing an update operation on a non-existent record.
The essential first step is using the mysqlbinlog command and the error log’s end position to uncover the specific UPDATE SQL statement that the source attempted to run.
To fix the mismatch, database administrators must use the mysqldump command to securely grab the missing record’s INSERT statement from the source and execute it directly on the replica database.
This “two-minute quick swat” method allows replication to commence successfully via the START REPLICA command, effectively saving 8 to 12 hours that a full replica database rebuild would require.

This simple yet effective method enables the immediate resumption of MySQL replication, maintaining a healthy and synchronized database environment while avoiding lengthy rebuild processes.

SUMMARY:

Video version also available here.

Introduction: Facing a MySQL Asynchronous Replication Nightmare

MySQL asynchronous replication is crucial for high availability and data redundancy. However, encountering a MySQL replication error can quickly turn into a significant headache for database administrators. One particular and common error is “Error_code: 1032 Could not execute Update_rows event on table; Can’t find record”:

2023-07-27T16:07:00.219561Z 5584 [ERROR] Slave SQL for channel '': Worker 2 failed executing transaction '85efe374-fb22-11ec-a3fa-0050569581bc:6082304586' at master log mysql-bin.002825, end_log_pos 98361775; Could not execute Update_rows event on table ismsgng.msgdta; Can't find record in 'msgdta', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.002825, end_log_pos 98361775, Error_code: 1032

This error typically indicates a data mismatchbetween your primary (source) database and your secondary (replica) database.

While a full rebuild of the replica database might be a guaranteed fix, it’s a time-consuming process that could take anywhere from 8 to 12 hours for large databases. This post will walk you through a “two-minute quick swat” that often resolves this specific replication error, saving you a substantial amount of time and effort, as it did for a client with a 400-500 gigabyte database.

Understanding the Specific “Can’t Find Record” Error

When you encounter this specific replication error, you’ll see a message indicating “Could not execute Update_rows event on this particular table”. The key phrase to pay attention to is “Can’t find record in this table”. The error message also provides critical information such as the source log bin file name and the end log position where the error occurred.

This error essentially means that the replica attempted to perform an update operation based on an event from the source, but the record it was intended to update didn’t exist on the replica.

The Two-Minute Quick Swat: A Step-by-Step Fix

Here’s how to perform the quick fix:

Step 1: Unearthing the Source’s Binary Log The error log provides the specific binary log file and position where the replication got stuck. To understand what the source was trying to do, you’ll need to mine the Source’s binary logs.

Use the mysqlbinlog command to convert the binary log file (e.g., binlog.002825) into a human-readable format. You can pipe the output to a file, for instance, mysqlbinlog --base64-output=DECODE-ROWS --verbose master-bin.002825 > a.a.
Next, use a text editor (like vi) to open the created file (a.a) and search for the “end position” (98361775) that was provided in your error log.
At this position, you will find the exact SQL statement that the source executed, which the replica failed to replicate. For this specific error, it will be an UPDATE statement.

UPDATE ismsgng.msgdta
WHERE
@1=368593979
SET
@7=1690474020

Insight: In binary log files, column names are often represented by their ordinal positions; for example, “column one” typically refers to the primary key.

Step 2: Pinpointing the Data Mismatch Once you have the UPDATE statement and the ID of the record from the binary log, you need to verify the data on both the source and replica:

Check the Source: Execute a SELECT * query on the source database for the specific table and record ID identified in the binary log (e.g., SELECT * FROM ismsgng.msgdta WHERE id = 368593979). You should find that the record exists in the source.
Check the Replica: Perform the exact same SELECT * query on the replica database. In this scenario, you will find that the record does not exist in the replica. This absence on the replica is precisely why the replication failed when it tried to perform an update on a non-existent record.

Step 3: Injecting the Missing Piece into the replica The solution is to insert the missing record from the source into the replica.

Use the mysqldump command to grab the INSERT command for that specific record from the source. You can specify the database name, table name, and use the --where clause option to target just that one record (e.g., mysqldump ismsgng msgdta --where='id=368593979' --skip-add-drop-table --no-create-info > a.sql).
This will create a small SQL file (a.sql) containing the INSERT statement for the missing record.
Finally, execute this INSERT command directly on the replica database to add the missing record.

Step 4: Resuming Replication and Celebrating Success With the missing record now present on the replica, the update operation that previously failed can now complete successfully.

Simply execute the command START REPLICA (or START SLAVE in older MySQL versions).
Replication should commence successfully, saving you the arduous task of a full rebuild.

Why This Quick Fix Matters

This “two-minute effort” offers a massive return on investment. The alternative, a full database rebuild, involves backing up the source, restoring it to the replica, and reconfiguring everything, which can easily take 8 to 12 hours. This quick swat is definitely worth trying, especially for large databases, as it has proven successful for clients even with hundreds of gigabytes of data.

Conclusion: Saving Hours (and Headaches)

The “Error_code: 1032 Could not execute Update_rows event on table; Can’t find record” is a common headache in MySQL asynchronous replication, typically caused by a data mismatch. By meticulously analyzing the binary logs, identifying the missing record, inserting it into the replica, and then restarting replication, you can often resolve the issue in minutes. This simple yet effective method enables you to resume replication quickly, avoiding lengthy rebuild processes and maintaining a healthy and synchronized MySQL environment.