innovation to day: Fixing Replication with Replication

A couple of days ago I ran into a Tungsten Replicator case where several MySQL tables became corrupted on slaves and needed to be restored from the master. We identified the tables that had problems fairly quickly using Tungsten Replicator's consistency checks. However, that led to another problem: how to restore the slave tables efficiently from the master. The MySQL server in question processes around 10M tranactions per day--there is virtually no downtime. Though the tables were not large, we could not be sure whether they were in use.

Fortunately, you can use a simple MySQL trick to get all the rows of a table to replicate through to slaves. The idea is to dump the table, delete the rows, then reload it again. The delete and subsequent reload replicate out to slaves, after which everything is consistent again. Let's say we have a table called tpcb.history that needs to be fixed. Login with mysql and run the following commands:

BEGIN;
SELECT * FROM tpcb.history 
  INTO OUTFILE '/tmp/tpcb.history.dmp' FOR UPDATE;
DELETE FROM tpcb.history;
LOAD DATA INFILE '/tmp/tpcb.history.dmp' REPLACE
  INTO TABLE tpcb.history FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n';
COMMIT;

You can do the reload several ways in MySQL, but this particular code has some advantages over other approaches, such as using LOCK TABLES. First, it uses a transaction, so if something goes wrong the changes roll back and you do not lose your data. Second, the SELECT ... FOR UPDATE locks your data and ensures serialization. You can run this while applications are running without problems.

This seems useful enough that I put together a simple script called reload-table.sh with a README and checked them into the Tungsten Replicator codeline on SourgeForge.net. You can refresh the same table shown above using the following command:

[sudo] ./reload-table-b.sh -u tungsten -p secret -t tpcb.history

I tested the reload using Tungsten 1.3.1 on MySQL 5.1 with statement replication. However, it would work equally well with row replication. Moreover, you can do the same trick in MySQL replication, as this involves base replication capabilities that are directly equivalent. There are a few caveats: you need to use InnoDB (or another transactional engine), large tables may be a problem, and you would need to tread carefully in cases where tables contain referential constraints. Finally, it would be wise to save the master table somewhere else before running the script.

innovation to day

Blog Archive

Fixing Replication with Replication