You Can Use ORDER BY And LIMIT Within UPDATE And DELETE Statements In MySQL 5.6.37
Continuing with the database-theme of the week, I wanted to share one piece of wisdom that I learned from our former director of Data, Brad Brewer. He taught me that you can use the ORDER BY
and LIMIT
clauses inside both UPDATE
and DELETE
SQL statements within MySQL 5.6. While I wouldn't use this technique in my day-to-day CRUD (Create, Read, Update, Delete) code, I do use these options religiously when I am writing data clean-up and migration scripts (ie, those SQL scripts that you run outside of the main application).
The premise of the behavior is simple: you can use the ORDER BY
and LIMIT
clauses as a means to lock-down the scope of the resultant data mutation. When you are hand-crafting SQL statements that are about to alter production data, this added safety-guard can work wonders for your mental health and sense of safety.
To see this in action, let's look at an UPDATE
statement that will update the two most recently-added rows in a MySQL database able:
-- First, let's reset our test table for the demo.
TRUNCATE TABLE
limit_test
;
INSERT INTO
limit_test
VALUES
( 1, 'One' ),
( 2, 'Two' ),
( 3, 'Three' ),
( 4, 'Four' ),
( 5, 'Five' )
;
-- Now that we have our demo-data in place, let's UPDATE the LAST TWO records in the
-- table based on an ID-based sort.
UPDATE
limit_test
SET
name = CONCAT( name, ' ( Updated )' )
ORDER BY
id DESC
LIMIT
2
;
As you can see, we are updating the name
column of the test table; but, we are using the ORDER BY
and LIMIT
clauses to ensure that we only touch the last two rows. And, when we look at the table afterwards, we get the following output:
mysql> SELECT * FROM limit_test;
+----+------------------+
| id | name |
+----+------------------+
| 1 | One |
| 2 | Two |
| 3 | Three |
| 4 | Four ( Updated ) |
| 5 | Five ( Updated ) |
+----+------------------+
5 rows in set (0.00 sec)
As you can see, only the last two records were mutated.
The same things works with DELETE
statements. In the following query, we're going to delete the last two rows:
-- First, let's reset our test table for the demo.
TRUNCATE TABLE
limit_test
;
INSERT INTO
limit_test
VALUES
( 1, 'One' ),
( 2, 'Two' ),
( 3, 'Three' ),
( 4, 'Four' ),
( 5, 'Five' )
;
-- Now that we have our demo-data in place, let's DELETE the LAST TWO records in the
-- table based on an ID-based sort.
DELETE FROM
limit_test
ORDER BY
id DESC
LIMIT
2
;
As you can see, it's the exact same concept - we're using the ORDER BY
and LIMIT
clauses to limit the scope of the delete. And, when we look at the table afterwards, we get the following output:
mysql> SELECT * FROM limit_test;
+----+-------+
| id | name |
+----+-------+
| 1 | One |
| 2 | Two |
| 3 | Three |
+----+-------+
3 rows in set (0.00 sec)
As you can see, the last two rows in the table have been deleted.
Sometimes, if I am feeling extra nervous about deleting production data by hand, I'll even take this a step further and include several redundant columns in the WHERE
clause in addition to the LIMIT
clause:
DELETE FROM
critical_data
WHERE
id = 4 -- Primary key (this should be sufficient on its own).
AND
name = 'Mission statement' -- Redundant.
AND
startedAt = '2020-01-18' -- Redundant.
AND
endedAt IS NULL -- Redundant.
LIMIT
1
;
Here, I'm deleting a single row based on its primary key (4
). In reality, this should be a sufficient condition on its own. However, when it comes to production data and hand-crafted SQL statements, it just feels too easy to make a mistake. As such, matching several other known column values in the same row offers the peace-of-mind that prevent me from vomiting on my keyboard the moment after I hit the "Run Query" button.
LIMIT
+ UPDATE
Not Just For Peace-of-Mind
Including a LIMIT
clause within a SQL UPDATE
statement isn't just for the peace-of-mind; there are some types of workflows that can be powered by LIMIT
within an application. Imagine, for example, that you are iterating over a database table using multiple threads. To do this, you can use an UPDATE
/ LIMIT
combination to "claim" a chunk of rows to be processed.
In the follow SQL statement, a given worker thread (with a unique ID) is going to claim the next 100 unclaimed rows:
-- Claim the next 100 rows for this worker.
UPDATE
table_to_be_processed
SET
claimedBy = @workerID
WHERE
claimedBy IS NULL
ORDER BY
id ASC
LIMIT
100
;
-- Return the CLAIMED rows to be processed by this worker thread within the application.
SELECT
*
FROM
table_to_be_processed
WHERE
claimedBy = @workerID
ORDER BY
id ASC
;
This multi-statement SQL query first claims at-most 100 unclaimed rows from the head of the table; and then, returns whichever rows are currently claimed by the worker thread. Such a technique allows you to divide and conquer the processing of a table while also ensuring that two different worker threads aren't accidentally processing the same records.
In traditional CRUD-based application code, I wouldn't normally use ORDER BY
and LIMIT
clauses in my SQL queries. However, when hand-crafting SQL, I do find that such additional query constraints bring a much needed sense of safety and well-being.
Want to use code from this post? Check out the license.
Reader Comments
@All,
In this post, I mentioned that you could use the
ORDER BY
+LIMIT
+UPDATE
combination to drive a multi-thread migration. Since that's a fun little topic on its own, I wanted to follow-up with a more robust look at that concept:www.bennadel.com/blog/3847-using-update-order-by-limit-to-drive-multi-worker-migrations-in-lucee-cfml-5-3-6-61.htm
In this follow-up post, I use concurrent worker threads to safely traverse a single database table, claiming rows to be processed by a given worker.