
So I've had a problem where I need to compare data in two tables that live on two different servers. Now, I know MySQL supports CHECKSUM TABLE, but from my testing and understanding it's not reliable across server instances and versions.

So I created this query:

// Build the per-row checksum expression. For columns col1, col2 it expands to:
//   @CRC := MD5(CONCAT_WS('#', COALESCE(`col1`, "#NULL#"), COALESCE(`col2`, "#NULL#")))
// COALESCE turns NULLs into the "#NULL#" sentinel so they still affect the hash.
$part = '@CRC := MD5(CONCAT_WS(\'#\', COALESCE(`'.
    implode('`, "#NULL#"), COALESCE(`', $this->_columns).
    '`, "#NULL#")))';
// @CRC is assigned inside the first SUM() and reused by the later ones,
// relying on MySQL evaluating the select list left to right.
$sql1 = "SELECT COUNT(*) AS cnt, 
        SUM(CONV(SUBSTRING({$part}, 1, 4), 16, 10)) as a1, 
        SUM(CONV(SUBSTRING(@CRC, 5, 4), 16, 10)) as a2, 
        SUM(CONV(SUBSTRING(@CRC, 9, 4), 16, 10)) as a3, 
        SUM(CONV(SUBSTRING(@CRC, 13, 4), 16, 10)) as a4, 
        SUM(CONV(SUBSTRING(@CRC, 17, 4), 16, 10)) as a5, 
        SUM(CONV(SUBSTRING(@CRC, 21, 4), 16, 10)) as a6, 
        SUM(CONV(SUBSTRING(@CRC, 25, 4), 16, 10)) as a7, 
        SUM(CONV(SUBSTRING(@CRC, 29, 4), 16, 10)) as a8 
    FROM `dbname`.`tablename` 
    WHERE `id` >= $min AND `id` <= $max ";

So basically, it concatenates each row together (all the columns of each row, more specifically) and then MD5s the result. Then it walks through that MD5 four hex digits (16 bits) at a time and sums each chunk across all rows (using 16-bit chunks lets me handle huge tables without worrying about the sums overflowing). Then I just compare the result of this query on both tables to see if everything is the same.
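
To make that concrete, here's a rough sketch of the comparison step. ($prod, $qc, and fetchChecksum() are just illustrative names, not the actual code; assume they're PDO connections to the two servers and a tiny helper.)

// Run the aggregate query on both servers and compare the results.
// fetchChecksum() is a hypothetical helper; $prod and $qc are assumed
// PDO connections to the production and QC servers.
function fetchChecksum(PDO $db, $sql)
{
    return $db->query($sql)->fetch(PDO::FETCH_ASSOC); // cnt, a1 ... a8
}

$prodRow = fetchChecksum($prod, $sql1);
$qcRow   = fetchChecksum($qc, $sql1);

// The id range matches only if the row count and all eight partial
// sums are identical on both servers.
$rangeMatches = ($prodRow == $qcRow);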

By running this query over successively narrower id ranges (a binary search over the primary key), I am able to rather quickly narrow down where the changes are so that I can port them.
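
For illustration, the narrowing looks roughly like this; rangesMatch() stands in for building and running the query above on both servers for a given id range, and the 1000-row cutoff and $maxId are arbitrary placeholders:

// Recursively split a mismatching id range in half until the ranges are
// small enough to diff row by row. rangesMatch($min, $max) is assumed to
// build the query for that range, run it on both servers and compare.
function findMismatchedRanges($min, $max, array &$out)
{
    if (rangesMatch($min, $max)) {
        return; // identical on both servers, nothing to port
    }
    if ($max - $min <= 1000) {
        $out[] = array($min, $max); // small enough to inspect directly
        return;
    }
    $mid = (int) floor(($min + $max) / 2);
    findMismatchedRanges($min, $mid, $out);
    findMismatchedRanges($mid + 1, $max, $out);
}

$mismatches = array();
findMismatchedRanges(1, $maxId, $mismatches);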

It's actually reasonably efficient, so I'm not too concerned about that. What I am concerned about is whether this is even necessary. It's screaming "You're doing it wrong" at me, but I can't figure out any cleaner method around it...

What are your thoughts?

  • It may be overkill for your problem, but have you looked at MySQL Cluster to handle mirroring and distributing the data? It would offer other features that may be useful.
    – user60
    Commented Jan 19, 2011 at 21:37
  • They need to be synchronized lazily; basically we're pushing QC into Production. And the data volume makes it prohibitive (or at least expensive) to dump and restore the entire dataset each cycle...
    – ircmaxell
    Commented Jan 19, 2011 at 21:38

2 Answers


We use mk-table-checksum.

It works really well in a Master-Slave context, where it also lets you sync the differences in either direction, depending on what you choose.

Sadly, from what I've seen, most people use it for replication, and I can't provide any copy/pasteable output, but if you don't know it, it's definitely worth looking into. If you do know it, I'd like to hear why it doesn't work for you.

To get an overview across many tables you can use something like:

    mk-table-checksum host1 host2 | mk-checksum-filter

  • The only issue I have with that is that it'll be hard to write a binary search using it. It does have a chunk size, but it doesn't seem that I can specify the range of the PK as well, so all it would tell me is whether they differed. But looking there, the tool mk-table-sync would do it. I need some custom work wrapped around it, though, so calling an external program is a last resort at best. Thanks though!
    – ircmaxell
    Commented Jan 20, 2011 at 13:16

Simply run CHECKSUM TABLE `yourtable` in MySQL.

Or, for a PHP solution, read How to check MySQL Replication databases are in Sync.
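
A minimal sketch of that in PHP (assuming two hypothetical PDO connections, $prod and $qc; note the caveat from the question that CHECKSUM TABLE results may not be comparable across MySQL versions):

// CHECKSUM TABLE returns one row per table with `Table` and `Checksum` columns.
$sql = 'CHECKSUM TABLE `dbname`.`tablename`';

$prodSum = $prod->query($sql)->fetch(PDO::FETCH_ASSOC);
$qcSum   = $qc->query($sql)->fetch(PDO::FETCH_ASSOC);

if ($prodSum['Checksum'] !== $qcSum['Checksum']) {
    // The tables differ (or the two servers compute the checksum differently).
}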

