Row-Level Replication in MySQL 5.0 (by the application level)

Hi all,

eveyone who uses replication might at some point noticed the “replication lag” that may occur if a table on the slave should be updated at a time, where the table is locked by another ‘slow’ (but sometimes unavoidable) query that is still at work.

Sometimes even the replication lag occurs because of slow update/delete queries, as each master-query is executed in sequential order. So, of course we always try to avoid complex update/delete queries in general.

Since MySQL 5.1 there is this new “row level” replication. Did anyone of you used that? Do you know if there is more than one “replication thread” that executes the master queries? I think they could do that because the ‘order’ of the queries is not important anymore, or am I wrong at this point?

I’m asking, because actually we’re re-developing our existing application, and we’re thinking about coding in a way that helps MySQL (5.0) to keep the replication lag as small a possible.

Example: Instead of sending a short (but heavy) UPDATE/DELETE query to the master, e.g.

UPDATE tableX
SET A = B
WHERE insertDate <= ‘…’
AND status = ‘OK’

Assume that UPDATE requires a slow table scan, which lasts about 30 seconds. If the slave has to execute that query, the whole replication will hang for at least 30 seconds, because the slave also has to do the table scan.

We thought about splitting ALL updates/deletes on a per-application basis in that way:

SELECT primaryKey FROM tableX
WHERE insertDate <= ‘…’
AND status = ‘OK’

And than sending single UPDATES to the master

foreach($result){
UPDATE tabelX
SET A = B
WHERE primaryKey = $resultsPrimaryKey
}

This should help (and actually does in our production enviroment on specific, heavy update queries) avoiding the replication lag.

What do you think about that? Is that a good idea to generally split every update on the application level? Or should we hope that MySQL 5.1 is getting stable soon? And does row-level-replication mean that each slave has multplie replication threads so that even a slow UPDATE query will not cause the replication to hang ?

Hoping for your ideas and comments. )

Greetings
Christian

Hi,

As far as I know mysql row level replication is not any help to this problem. It still uses single thread for execution.

So yes if you need to reduce the lag you need to split your update queries to be short which is rather inconvenient.

Dear Peter,

thanks for your fast response. What do you think? Do you know any larger company that uses some kind of that approach? It would not be too inconvenient, because once the application layer supports it, it’s no additional development work.

Do you know why the guys at MySQL don’t try to come up with that problem (replication lag)?

Does anybody know if the other existing RDBMs use a different kind of replication (MSSQL, DB2, Oracle)?

It seems to me that replication is not the ‘holy cow’ (I thought that when I started using replication two years ago). )

Guess, we better partion our data vertically (as suggested in your performance presentation [URL=“Database Performance Blog – Percona”] http://www.mysqlperformanceblog.com/files/presentations/OSDB[/URL] CON2006-MySQL-Performance-Optimization.pdf). Replication would then be used for backup purposes only.

)
Greetings
Christian

It is typical approach if replication lag is critical.

Regarding other databases - replication is a piece which implemented very differently in different products.

Regarding partition - I spoke mostly about horizontal partitioning. Vertical is when you split so some columns are in one table and other are in another.