What Happens on the Read/Write Node When a Query Locks Tables on the Read-Only Node?

geek_prophet · December 4, 2021, 7:59pm

Greetings,

We are considering switching to XtraDB Cluster as our new MySQL HA solution. We would deploy clusters consisting of 1 read-write node (select/insert/update/delete), one read node (select only), and 1 abbreviated node strictly for quorum. To avoid potential deadlocks, all write operations would be directed to the read-write node, whereas read (SELECT) operations could be directed to the either node. That said, sometimes even relatively simple SELECT operations can result in tables being locked. If that happens on the read-only node, what happens on the read-write node? Would users on that node potentially freeze because apps are unable to complete queries, due to a table lock on the other node?

matthewb · December 5, 2021, 7:30pm

InnoDB uses row-level locking, not table locking. Only two situations can lock rows when reading: SELECT .. FOR UPDATE and SELECT on a table with FKs (which is not recommended in PXC anyways).

Any SELECT .. FOR UPDATE should be executed on your read-write node. All other SELECT can be safely executed on read-onlys.

geek_prophet · December 5, 2021, 9:14pm

Understood, but we’ve seen cases where queries queued up waiting for a lock to be released that was caused by an earlier query, even when all the tables involved were InnoDB. This seems to happen relatively often with IBM’s Cognos product. So the question remains. If a lock condition occurs on the read-only node, does that condition propagate to the read-write node?

matthewb · December 6, 2021, 5:23am

Can you share specifics around this? This should not happen in InnoDB. Readers never block writers except the situations I described previously.

No. Nothing propagates between nodes regarding read operations. Any/all locking always and only happens locally.

geek_prophet · December 6, 2021, 2:59pm

The next time it happens, I can paste the query that caused it.

My wording may be imprecise. What I mean is, if a lock causes queries to queue up on node B, can queries issued to node A stall because node A is waiting for a condition to be resolved on B?

matthewb · December 6, 2021, 5:46pm

No, because nothing propagates between nodes regarding read operations. If you start a read query on B, there is no mechanism that says “hey ‘A’, I started a locking read on ‘B’”. Nothing is transmitted to other nodes during reads.

geek_prophet · December 6, 2021, 8:08pm

I’m talking about write queries (update, insert, delete) on node A. If the Toys and Elves tables on B are locked by some weird interaction of SELECT statements (we’ve seen this occur with IBM Cognos), and someone tries to update those tables on A, what happens when A tries to replicate the changes to B?

matthewb · December 6, 2021, 8:28pm

I know what you’re talking about, and I’m going to repeat myself for a 3rd time by saying that there is no mechanism that relays information regarding SELECT nor any table-locking statements.

To elaborate using your example:

If the Toys and Elves tables on B are locked by some weird interaction of SELECT statements

This should never happen in InnoDB unless you have no/bad indexing, or are using FKs. This is not IBM Cognos, so what you see there is irrelevant to this topic as that is not how InnoDB behaves.

and someone tries to update those tables on A, what happens when A tries to replicate the changes to B?

If someone tries updating a row on A which is read-locked (again, InnoDB does not lock tables, only rows) on B (A has absolutely no idea anything is locked on B btw), the replication will succeed, A will commit, and B will wait for the lock to be released (if necessary, depending on the rows being updated) during the apply phase on B. If A updates row where X = 4 and B has row write-locks on X = 6 and X = 8, the update will replicate, certify, and commit without waiting for 6 & 8 to unlock.

geek_prophet · December 6, 2021, 8:40pm

Thanks for your patience, I do understand that there is no mechanism that relays information regarding SELECT statements from one node to the other. I get that.

I’m not talking about what I see in IBM Cognos, but rather what I see when I issue a show full processlist in MySQL. First I get a phone call from users saying that their reports are timing out. I go into MySQL and I see a bunch of queries stacked up (selects, inserts, updates) waiting for one SELECT query to complete. All the tables are InnoDB. I often have to kill the SELECT query to break the logjam. I only mentioned IBM Cognos to illustrate that the queries are coming from a reputable product. However, make no mistake, the system freezes because of what is happening in MySQL at that moment, not because if Cognos.

There it is. Thanks,

matthewb · December 6, 2021, 8:42pm

Can you share this SELECT and the associated table schema in a new post?

geek_prophet · December 6, 2021, 8:46pm

I sure will, as soon as I can find an example. It’s been a few weeks since the last time it happened because of the holidays and because we’ve been training users to run those reports during off hours.

bob · December 8, 2021, 1:10pm

Readers never block writers except the situations I described previously [ SELECT .. FOR UPDATE and SELECT on a table with FKs].

This is generally not correct. A shared (read) lock on a row (or index) will block acquisition of an exclusive (write) lock.

However, specifically, if such shared lock is on the read-only replica slave (S), as mentioned it won’t block the writable instance master (M) from acquiring an exclusive lock locally on M anymore, but it is fair to ask if it will block S to apply the replicated statement/row which is a write operation locally on S.

I don’t have the answer but I wanted to correct the misleading answer about shared lock not blocking exclusive locks.

matthewb · December 8, 2021, 4:16pm

Hi @bob,

What you said, and what I said are the same thing and what I said is in fact correct, there was nothing misleading.

Let me explain further. If you run a plain SELECT on an InnoDB table, no locks are created, thus this SELECT will not block any other SELECT nor will it block any writer. This is how InnoDB has operated since inception. InnoDB uses “views” internally based on transaction-id (MVCC) that allows SELECTs to not block other readers/writers.

If, as I said above, and as you quoted me, you run a SELECT … FOR UPDATE or your table has FKs, in those two situations, a SELECT will create (S)hared locks which indeed can block writers from acquiring (X)clusive locks.

So, yes, it is generally correct that a simple SELECT will not block other readers/writers due to the fundamental way in which InnoDB operates. When a SELECT creates shared locks, yes, those can block.

bob · December 8, 2021, 7:41pm

I’m not saying the same thing as you are.
I’m saying that even a simple select with a shared lock can block a statement trying to get an exclusive lock.
It’s not just the “select for update” that could block those statements.

I will add, to be clear, that the whole discussion is involving transactions. Without Txn, this is not even close to be interesting…

As I said. that is false. Even an insert can block another statement.

Shared locks are taken (and they can be taken on unpredictable indices btw, as range locks, as gap locks, etc… ).
A mere select can block another “SELECT …for update” and any other statement trying to acquire an exclusive lock while the shared lock is taken. It 's more complex than anyone can put in a sentence so I suggest you simply rollback…

The complicated truth about locks is at (assuming 5.7 for now)
https://dev.mysql.com/doc/refman/5.7/en/innodb-locking-transaction-model.html

Meanwhile, there is the isolation level on top of it all, to mess with what anyone think they know about locks.

Commit.

matthewb · December 8, 2021, 8:47pm

That’s exactly what I said above. Could you please re-read what I said?

(S)hared locks are only created in 2 situations: 1) SELECT FOR UPDATE, and 2) FKs.

What I think may be the issue here is that your understanding is that a simple SELECT will alway create shared locks. This is absolutely not true. A plain SELECT will NEVER create S locks due to MVCC.

You can verify this is correct. Run the following pseudocode:

START TRANSACTION;
SELECT * FROM table WHERE id = 4;
SHOW ENGINE INNODB STATUS;

Notice that this transaction has 0 locks. This is because simple SELECTs do not create any locks due to MVCC. Let me clear, there are no shared locks; no locks of any kind at this point. Further evidence, if you open another connection and update this row, it will succeed because readers do not block writers. Continue:

COMMIT; START TRANSACTION;
SELECT * FROM table WHERE id = 4 FOR UPDATE;
SHOW ENGINE INNODB STATUS;

You can now see this txn hold (S) locks. In another terminal, attempting to update this row will block because this (S) lock will block writers. This is the only case where a SELECT blocks a writer, when (S) locks are involved, and as I’ve now said 3 or 4 times, (S) locks are only created in 2 situations, noted in my previous comments.

Everything in InnoDB is a transaction. Even if you just do a simple SELECT, InnoDB still wrappers that inside an internal read-only transaction. You can’t not have txns in InnoDB. (Sorry for double negative)

Ref: https://dev.mysql.com/doc/refman/5.7/en/innodb-locking-reads.html

bob · December 8, 2021, 10:52pm

I ran your samples, and you are right, this one doesn’t block.

Here is why I am convinced that a select (some sort of select anyway) does take a shared lock:

Every month or so I’m spending time solving deadlocks in poor code in mysql5.7 because a client stack (jpa) is always reading an entity (row) before the code can hope to update it (managed entity). Thread 1 takes a S lock, followed by an attempt to escalate it to X lock upon update. The thing is, another thread 2 was attempting an update before thread 1, and thread 2 is waiting for the X lock but can’t get it. Now Thread 1 wants the lock too but it is after thread 2, yet holding the S lock. Deadlock. Yes, SHOW ENGINE INNODB STATUS shows the last detected deadlock well with the situation (pasted below) among to update on a “thing” row.

(FYI, the default isolation is repeatable-read, in all this.)

So, I have a hard time believing that millions of developers and users of hibernate didn’t figure out how to avoid locking for nothing. Something is missing. All I know is that the code asked to “find” the “thing” (by PK mind you) and then updates it. (My usual fix is to “select …for update” since it’s obviously the intent).

For such deadlock to occur, the S lock wasn’t taken “just before” the X lock. A window of time had to expose the txn to collisions. So that’s a hint that the jpa “find” issued a statement (we know it’s a select) on the “thing” table.

At this point, since there is NOTHING in the select SQL to explain this shared lock, I’m hoping for a magical mysql “mode” that nobody (in mysql nor jpa community) talks much about. (FYI, it’s on AWS RDS, but I get those on a local mysql 5.7 too).

*** (1) TRANSACTION:
TRANSACTION 175274419, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 8 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 627088, OS thread handle 22952098592512, query id 365172765 192.168.1.100 mydatabase updating
update thing set needs_checkup=0 where id=1129
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 361 page no 25 n bits 144 index PRIMARY of table `mydatabase`.`thing` trx id 175274419 lock_mode X locks rec but not gap waiting

*** (2) TRANSACTION:
TRANSACTION 175274420, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
8 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 627077, OS thread handle 22952240928512, query id 365172766 192.168.1.100 mydatabase updating
update thing set needs_checkup=0 where id=1129
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 361 page no 25 n bits 144 index PRIMARY of table `mydatabase`.`thing` trx id 175274420 lock mode S locks rec but not gap

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 361 page no 25 n bits 144 index PRIMARY of table `mydatabase`.`thing` trx id 175274420 lock_mode X locks rec but not gap waiting

*** WE ROLL BACK TRANSACTION (2)

I think this hijacks the thread as much as I could… LOL
Wasn’t my intention. But it lends itself to a deeper analysis of the lock, and hopefully a bit about replication in all that.

matthewb · December 9, 2021, 3:50pm

Hey @bob,
Feel free to open another topic on your deadlock issue and we can discuss.

geek_prophet · December 9, 2021, 4:19pm

Most of that was above my head, but thanks for your input. With all that said, where does it leave the original question? Ultimately, what I want to know is, can there be conditions on the slave (i.e., conditions created by any combination of SELECT statements, however hairy) that could prevent production operations from proceeding without interruption on the master? I just don’t want users of the master to freeze because some query on the slave went haywire. And by “master” and “slave,” of course I mean the administratively assigned read/write and read-only members of an XDC.

bob · December 13, 2021, 3:25pm

For the curious, I’ve created

matthewb · December 13, 2021, 6:42pm

@geek_prophet I answered your original question above, and you even commented on it giving me a ‘Thanks’.

can there be conditions on the slave … without interruption on the master

Now we have a new problem. You are using the terms “master” and “slave”, neither which exist within the scope of PXC. Be clear when speaking about PXC if you mean “writer” and “reader” node because in PXC, any node can be both reader and writer at any given time.

As stated above, no, there is no action which can occur on a reader node that will affect your ability to write on a write node. (For the Xth time now…) No locking information, of any kind, is ever relayed back to the writer node. All locks are 100% independent of each node.

Topic		Replies	Views
Simple select causing flow control pause of whole cluster Percona Server for MySQL 5.7 mysql , percona	13	151	December 11, 2024
Cluster hang with wsrep: initiating replication for write set Percona XtraDB Cluster 5.x	8	6929	November 14, 2017
3 node outage scenario plus long query block question Percona XtraDB Cluster 5.x	4	578	September 10, 2018
Replication Problem where Cluster locks up completely Percona XtraDB Cluster 5.x mysql , percona	12	1849	February 7, 2022
Row-level locking and XtraDB Cluster Percona XtraDB Cluster 5.x	1	1177	September 28, 2014

What Happens on the Read/Write Node When a Query Locks Tables on the Read-Only Node?

Related topics