Exploring The Scope Of SERIALIZABLE Transaction Row-Locking In Lucee CFML 5.3.7.47
Earlier this week, I looked at using SERIALIZABLE
transactions to enforce unique naming constraints in MySQL and Lucee CFML. Transaction isolation isn't a topic that I'm super comfortable with. As such, I wanted to continue exploring the way in which transaction locking affects concurrent access to a given database table. More specifically, I wanted to better understand the scope of row-locking that is applied in a SERIALIZABLE
transaction in MySQL 5.7.32 and Lucee CFML 5.3.7.47.
To quote from the MySQL documentation on isolation levels, this is the behavior that we get when we wrap a collection of CFQuery
tags in a SERIALIZABLE
transaction:
This level is like
REPEATABLE READ
, but InnoDB implicitly converts all plainSELECT
statements toSELECT ... LOCK IN SHARE MODE
ifautocommit
is disabled.... UsingLOCK IN SHARE MODE
sets a shared lock that permits other transactions to read the examined rows but not to update or delete them.... InnoDB locks the index range scanned, using gap locks or next-key locks to block insertions by other sessions into the gaps covered by the range.
It's the last part - "block insertions by other sessions into the gaps covered by the range" - that I want to explore in this post. And to do so, we're going to bring back our widget
table:
CREATE TABLE `widget` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`userID` int(10) unsigned NOT NULL,
`name` varchar(100) NOT NULL,
`createdAt` datetime NOT NULL,
`updatedAt` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `IX_byUser` (`userID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The main things to note about this table are that each row is "owned" by the userID
. And, that we have an index on the table which allows us to quickly scan all rows for a given user. This is why the SERIALIZABLE
transaction in my previous post was able to enforce a unique naming constraint across all records owned by a given userID
.
But, I want to make sure that the locking there was only applied to the rows owned by the given userID
; and, that I wasn't accidentally locking all rows in the table (which would also allow for unique naming but at a much higher cost to performance).
To test this, we're going to touch multiples rows owned by two different users and then see if the execution times demonstrate a cross-contamination of locking. Given two users, we're going to perform these three operation in parallel:
- User One: Insert a new record.
- User One: Update an old record.
- User Two: Insert a new record.
If the row-locking is indeed user-specific, we should expect the first two operations to be serialized and the third operation to execute independently.
In the following ColdFusion code, each of the aforementioned operations will be executed inside an asynchronous CFThread
tag body. Each CFThread
contains a sleep()
command to help ensure that each thread has time to spawn and overlap in its execution of SQL queries. Our first SERIALIZABLE
transactions contains a sleep of 3-seconds mid-transaction in order to make sure that it is still executing when the other two threads finish their sleep of 1-second:
<cfscript>
include "./db-functions.cfm";
truncateWidgets();
// In order to examine the scope of row-locking in the SERIALIZABLE transaction,
// let's create two widgets, each one owned by a different user. This will become
// important because the SERIALIZABLE lock should be user-based due to the structure
// of the underlying TABLE INDEX (IX_byUser).
sarahID = 1;
sarahWidgetID = createWidget( sarahID, "Sarah's First Widget" );
johnID = 2;
johnWidgetID = createWidget( johnID, "John's First Widget" );
// ------------------------------------------------------------------------------- //
// ------------------------------------------------------------------------------- //
// SARAH - Insert new widget using SERIALIZABLE transaction.
thread name = "sarahInsertThread" {
transaction isolation = "serializable" {
var newWidgetName = "Sarah's Second Widget";
// Since we're using a SERIALIZABLE transaction, MySQL is going to implicitly
// convert all SELECT statements within this transaction block into LOCK IN
// SHARE MODE statements. This will use gap-locking / next-key-locking to
// lock the RANGE of rows scanned in the SELECT; which is going to be all
// rows owned by the given userID (Sarah).
if ( isWidgetWithNameExists( sarahID, newWidgetName ) ) {
throw( type = "NameAlreadyExists" );
}
sleep( 3000 );
createWidget( sarahID, newWidgetName );
}
// Take a snapshot of the elapsed time (needed for Lucee CFML incompatibility).
thread.executionTime = thread.elapsedTime;
}
// SARAH - Update existing widget - no explicit transaction.
thread name = "sarahUpdateThread" {
// Update the name of the first widget we created. While this update has no
// explicit transaction wrapped around it, since it is owned by Sarah, it will
// inherently get blocked by the SERIALIZABLE transaction that is locking all
// rows for the same userID.
sleep( 1000 );
renameWidget( sarahWidgetID, "Sarah's First Widget (renamed)" );
// Take a snapshot of the elapsed time (needed for Lucee CFML incompatibility).
thread.executionTime = thread.elapsedTime;
}
// ------------------------------------------------------------------------------- //
// ------------------------------------------------------------------------------- //
// JOHN - Insert new widget using SERIALIZABLE transaction.
thread name = "johnInsertThread" {
transaction isolation = "serializable" {
var newWidgetName = "John's Second Widget";
// Just as with the SERIALIZABLE above, this SELECT will implicitly lock all
// rows scanned. However, since we using a different userID value, this lock
// will not interact with the lock in the other transaction.
if ( isWidgetWithNameExists( johnID, newWidgetName ) ) {
throw( type = "NameAlreadyExists" );
}
sleep( 1000 );
createWidget( johnID, newWidgetName );
}
// Take a snapshot of the elapsed time (needed for Lucee CFML incompatibility).
thread.executionTime = thread.elapsedTime;
}
// ------------------------------------------------------------------------------- //
// ------------------------------------------------------------------------------- //
// Wait for all threads to re-join the parent thread and output execution times.
thread action = "join";
dump( var = cfthread.sarahInsertThread, show = "name,error,executionTime" );
dump( var = cfthread.sarahUpdateThread, show = "name,error,executionTime" );
dump( var = cfthread.johnInsertThread, show = "name,error,executionTime" );
// Output the current widgets.
dump( label = "Sarah Widgets", var = getWidgetsByUserID( sarahID ), metainfo = false );
dump( label = "John Widgets", var = getWidgetsByUserID( johnID ), metainfo = false );
</cfscript>
As you can see, the first CFThread
to insert a record for Sarah contains a SERIALIZABLE
transaction. We want to see how this transaction affects the second thread for Sarah as well as the third CFThread
for John. And, when we run this ColdFusion code, we get the following output:
As you can see from the execution times, the two queries for Sarah ran for about 3-seconds - the duration of the first SERIALIZABLE
transaction. The third query for John, however, executed in just about 1-second. This demonstrates that the third query for the different userID
value was not affected by the gap locks or next-key locks being applied by the first query.
To be clear, this is happening because of the underlying IX_byUser
index on userID
. If we were to go into the database schema and remove the IX_byUser
index, re-running the above ColdFusion code would result in a transaction deadlock because the two SERIALIZABLE
transactions - the one for Sarah and the one for John - would both be locking the same rows (essentially forcing a full-table scan). However, since the IX_byUser
index is used in the isWidgetWithNameExists()
user-defined function (UDF), it allows MySQL to limit the scope of the row-locking to a given userID
, therefor allowing Sarah's records and John's record to be updated independently.
Historically, I've used database transactions to enforce atomicity - that is, an all-or-nothing update. However, I'm starting to get more comfortable with the use of transactions to enforce cross-record constraints, such as unique naming constraints. And, it's good to see exactly how the underlying index structures affect the scope of row-locking.
Want to use code from this post? Check out the license.
Reader Comments
@All,
As I've continued to explore some of the
SERIALIZABLE
transaction scoping in MySQL 5.7.32 (the version I run my R&D against), I've run into some cases that I am having a hard time explaining. I think it relates to some other people's experiences:It seems like the index-based locking has trouble when there are no rows in that index yet at all. It looks like the scope of the lock may extend beyond the suggested index in that case, which could then cause unexpected deadlocks.
In this particular post, we don't run into this issue at all since both Sarah and John had an existing record before we then went to test the scope of the locking. As such, each segment of the index,
IX_byUser
, already had data in it, which I think is why there was no deadlock in my demo.I'll try to follow-up with something that shows the empty-vs-populated index outcome.
@All,
As per my comment above, on the change in row-locking based on the existing / non-existence of rows, here's my follow-up post:
www.bennadel.com/blog/4133-the-scope-of-serializable-transaction-row-locking-is-larger-when-rows-dont-yet-exist-in-mysql-5-7-32.htm
This appears to be related to the way in which MySQL has to lock something called
supermum
. I don't really understand what this means; but, I demonstrate the change in the above link.