You Can Use ORDER BY And LIMIT Within UPDATE And DELETE Statements In MySQL 5.6.37

By Ben Nadel

Published 2020-06-14 in SQL — Comments (1)

Continuing with the database-theme of the week, I wanted to share one piece of wisdom that I learned from our former director of Data, Brad Brewer. He taught me that you can use the ORDER BY and LIMIT clauses inside both UPDATE and DELETE SQL statements within MySQL 5.6. While I wouldn't use this technique in my day-to-day CRUD (Create, Read, Update, Delete) code, I do use these options religiously when I am writing data clean-up and migration scripts (ie, those SQL scripts that you run outside of the main application).

The premise of the behavior is simple: you can use the ORDER BY and LIMIT clauses as a means to lock-down the scope of the resultant data mutation. When you are hand-crafting SQL statements that are about to alter production data, this added safety-guard can work wonders for your mental health and sense of safety.

To see this in action, let's look at an UPDATE statement that will update the two most recently-added rows in a MySQL database able:

  
          -- First, let's reset our test table for the demo.
        
          TRUNCATE TABLE
        
          	limit_test
        
          ;
        
          INSERT INTO
        
          	limit_test
        
          VALUES
        
          	( 1, 'One' ),
        
          	( 2, 'Two' ),
        
          	( 3, 'Three' ),
        
          	( 4, 'Four' ),
        
          	( 5, 'Five' )
        
          ;
        
          -- Now that we have our demo-data in place, let's UPDATE the LAST TWO records in the
        
          -- table based on an ID-based sort.
        
          UPDATE
        
          	limit_test
        
          SET
        
          	name = CONCAT( name, ' ( Updated )' )
        
          ORDER BY
        
          	id DESC
        
          LIMIT
        
          	2
        
          ;

view raw update.sql hosted with ❤ by GitHub

As you can see, we are updating the name column of the test table; but, we are using the ORDER BY and LIMIT clauses to ensure that we only touch the last two rows. And, when we look at the table afterwards, we get the following output:

  
          mysql> SELECT * FROM limit_test;
        
          +----+------------------+
        
          | id | name             |
        
          +----+------------------+
        
          |  1 | One              |
        
          |  2 | Two              |
        
          |  3 | Three            |
        
          |  4 | Four ( Updated ) |
        
          |  5 | Five ( Updated ) |
        
          +----+------------------+
        
          5 rows in set (0.00 sec)

view raw update-result.txt hosted with ❤ by GitHub

As you can see, only the last two records were mutated.

The same things works with DELETE statements. In the following query, we're going to delete the last two rows:

  
          -- First, let's reset our test table for the demo.
        
          TRUNCATE TABLE
        
          	limit_test
        
          ;
        
          INSERT INTO
        
          	limit_test
        
          VALUES
        
          	( 1, 'One' ),
        
          	( 2, 'Two' ),
        
          	( 3, 'Three' ),
        
          	( 4, 'Four' ),
        
          	( 5, 'Five' )
        
          ;
        
          -- Now that we have our demo-data in place, let's DELETE the LAST TWO records in the
        
          -- table based on an ID-based sort.
        
          DELETE FROM
        
          	limit_test
        
          ORDER BY
        
          	id DESC
        
          LIMIT
        
          	2
        
          ;

view raw delete.sql hosted with ❤ by GitHub

As you can see, it's the exact same concept - we're using the ORDER BY and LIMIT clauses to limit the scope of the delete. And, when we look at the table afterwards, we get the following output:

  
          mysql> SELECT * FROM limit_test;
        
          +----+-------+
        
          | id | name  |
        
          +----+-------+
        
          |  1 | One   |
        
          |  2 | Two   |
        
          |  3 | Three |
        
          +----+-------+
        
          3 rows in set (0.00 sec)

view raw delete-result.txt hosted with ❤ by GitHub

As you can see, the last two rows in the table have been deleted.

Sometimes, if I am feeling extra nervous about deleting production data by hand, I'll even take this a step further and include several redundant columns in the WHERE clause in addition to the LIMIT clause:

  
          DELETE FROM
        
          	critical_data
        
          WHERE
        
          	id = 4 -- Primary key (this should be sufficient on its own).
        
          AND
        
          	name = 'Mission statement' -- Redundant.
        
          AND
        
          	startedAt = '2020-01-18' -- Redundant.
        
          AND
        
          	endedAt IS NULL -- Redundant.
        
          LIMIT
        
          	1
        
          ;

view raw redundant-delete.sql hosted with ❤ by GitHub

Here, I'm deleting a single row based on its primary key (4). In reality, this should be a sufficient condition on its own. However, when it comes to production data and hand-crafted SQL statements, it just feels too easy to make a mistake. As such, matching several other known column values in the same row offers the peace-of-mind that prevent me from vomiting on my keyboard the moment after I hit the "Run Query" button.

From Zoolander: I just threw up in my mouth a little bit.

`LIMIT` + `UPDATE` Not Just For Peace-of-Mind

Including a LIMIT clause within a SQL UPDATE statement isn't just for the peace-of-mind; there are some types of workflows that can be powered by LIMIT within an application. Imagine, for example, that you are iterating over a database table using multiple threads. To do this, you can use an UPDATE / LIMIT combination to "claim" a chunk of rows to be processed.

In the follow SQL statement, a given worker thread (with a unique ID) is going to claim the next 100 unclaimed rows:

  
          -- Claim the next 100 rows for this worker.
        
          UPDATE
        
          	table_to_be_processed
        
          SET
        
          	claimedBy = @workerID
        
          WHERE
        
          	claimedBy IS NULL
        
          ORDER BY
        
          	id ASC
        
          LIMIT
        
          	100
        
          ;
        
          -- Return the CLAIMED rows to be processed by this worker thread within the application.
        
          SELECT
        
          	*
        
          FROM
        
          	table_to_be_processed
        
          WHERE
        
          	claimedBy = @workerID
        
          ORDER BY
        
          	id ASC
        
          ;

view raw claim.sql hosted with ❤ by GitHub

This multi-statement SQL query first claims at-most 100 unclaimed rows from the head of the table; and then, returns whichever rows are currently claimed by the worker thread. Such a technique allows you to divide and conquer the processing of a table while also ensuring that two different worker threads aren't accidentally processing the same records.

In traditional CRUD-based application code, I wouldn't normally use ORDER BY and LIMIT clauses in my SQL queries. However, when hand-crafting SQL, I do find that such additional query constraints bring a much needed sense of safety and well-being.

Want to use code from this post? Check out the license.

Short link: https://bennadel.com/3846

Reader Comments

Ben Nadel Jun 16, 2020 at 7:39 AM

15,996 Comments

@All,

In this post, I mentioned that you could use the ORDER BY + LIMIT + UPDATE combination to drive a multi-thread migration. Since that's a fun little topic on its own, I wanted to follow-up with a more robust look at that concept:

www.bennadel.com/blog/3847-using-update-order-by-limit-to-drive-multi-worker-migrations-in-lucee-cfml-5-3-6-61.htm

In this follow-up post, I use concurrent worker threads to safely traverse a single database table, claiming rows to be processed by a given worker.

Oh my chickens, this post is old!

Hit me up on Twitter if you want to discuss it further.

	-- First, let's reset our test table for the demo.
	TRUNCATE TABLE
	limit_test
	;
	INSERT INTO
	limit_test
	VALUES
	( 1, 'One' ),
	( 2, 'Two' ),
	( 3, 'Three' ),
	( 4, 'Four' ),
	( 5, 'Five' )
	;

	-- Now that we have our demo-data in place, let's UPDATE the LAST TWO records in the
	-- table based on an ID-based sort.
	UPDATE
	limit_test
	SET
	name = CONCAT( name, ' ( Updated )' )
	ORDER BY
	id DESC
	LIMIT
	2
	;

	mysql> SELECT * FROM limit_test;
	+----+------------------+
	\| id \| name \|
	+----+------------------+
	\| 1 \| One \|
	\| 2 \| Two \|
	\| 3 \| Three \|
	\| 4 \| Four ( Updated ) \|
	\| 5 \| Five ( Updated ) \|
	+----+------------------+
	5 rows in set (0.00 sec)

	-- Claim the next 100 rows for this worker.
	UPDATE
	table_to_be_processed
	SET
	claimedBy = @workerID
	WHERE
	claimedBy IS NULL
	ORDER BY
	id ASC
	LIMIT
	100
	;

	-- Return the CLAIMED rows to be processed by this worker thread within the application.
	SELECT
	*
	FROM
	table_to_be_processed
	WHERE
	claimedBy = @workerID
	ORDER BY
	id ASC
	;

LIMIT + UPDATE Not Just For Peace-of-Mind

Reader Comments

`LIMIT` + `UPDATE` Not Just For Peace-of-Mind