Skip to main content
Ben Nadel at Scotch On The Rock (SOTR) 2010 (London) with: Kev McCabe
Ben Nadel at Scotch On The Rock (SOTR) 2010 (London) with: Kev McCabe

Conditionally Updating Columns When Using ON DUPLICATE KEY UPDATE In MySQL

By
Published in

A couple of weeks ago, I talked about the lessons I learned while sending 7M emails using ColdFusion. In that post, I mentioned that I needed to store a cache of email validation codes in order to avoid sending an email to any address that was known to be invalid. But, an invalid email address isn't the only reason to skip a given send. If a user explicitly unsubscribes from a broadcast stream, I need to omit the given user in the next send (or my SMTP provider—Postmark—will mark the email as bounced).

When a user explicitly unsubscribes from the broadcast stream, I update their email validation record to store the value, suppressed. But, of course, their email address is technically valid. Which means, if we ever run their email through the validation process again (such as via NeverBounce or Email List Verify), it may come back as ok or valid.

In order to treat the suppressed assignment as a one-way door, I needed to make sure that once an email validation was marked suppressed, it will never be marked as anything else. To do this, I had to update my SQL statement to conditionally store the new value if and only if the current value was not suppressed.

To explore this logic, let's first look at the database table structure:

CREATE TABLE `email_verification` (
	`email` varchar(255) NOT NULL,
	`textCode` varchar(50) NOT NULL,
	`updatedAt` datetime NOT NULL,
	PRIMARY KEY (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Here, we're using the email as the primary key and the textCode as the email verification value. Populating this table was going to be an iterative process; and, the verification codes were going to change over time. As such, my INSERT needed to handle updates on email duplication (ie, on primary key collision).

The SQL for such a multi-pass INSERT looks like this:

SET @email = 'ben@bennadel.com';
SET @textCode = 'ok';

INSERT INTO
	email_verification
SET
	email = @email,
	textCode = @textCode,
	updatedAt = UTC_TIMESTAMP()
ON DUPLICATE KEY UPDATE
	textCode = VALUES( textCode ),
	updatedAt = VALUES( updatedAt )
;

In this SQL statement, I'm using the ON DUPLICATE KEY UPDATE clause in order to update any existing row with a matching email address (which, remember, is being used as the primary key). In the context of the ON DUPLICATE KEY UPDATE clause, the VALUES() function returns the value that would have been used within the INSERT had there been no key-collision.

With this SQL, I can run the INSERT...ON DUPLICATE KEY UPDATE as many times as I like; and the textCode column will keep getting updated for the given email address:

Using INSERT..ON DUPLICATE KEY UPDATE to toggle a MySQL column value back and forth.

Of course, once the textCode column is designated as suppressed, I never want it to change. In order to do this, my UPDATE assignment has to look at the existing value and conditionally override it. Thankfully, the UPDATE clause can reference any existing column value within the existing row by name. Which means, I can use the IF() function to return the existing value if it is currently suppressed; or, to return the VALUES() value if it is anything else.

SET @email = 'ben@bennadel.com';
SET @textCode = 'ok';

INSERT INTO
	email_verification
SET
	email = @email,
	textCode = @textCode,
	updatedAt = UTC_TIMESTAMP()
ON DUPLICATE KEY UPDATE
	textCode = IF(
		( textCode = 'suppressed' ),
		textCode,          -- Return the existing value (suppressed)
		VALUES( textCode ) -- Return the new value.
	),
	updatedAt = VALUES( updatedAt )
;

To be clear, within the ON DUPLICATE KEY UPDATE clause, the expression textCode refers to the existing value within the existing row. But, the expression VALUES(textCode) refers to the new value that would have been inserted. And, once the textCode column is set to suppressed, it can never go back to anything else:

Using INSERT..ON DUPLICATE KEY UPDATE with an IF() function to lock-down a column value.

By using the IF() function within the ON DUPLICATE KEY UPDATE clause, I can ensure that once the textCode column is "suppressed", any subsequent execution of this query will become a no-op. That said, I'm electing to always update the updatedAt column on every execution as a means to provide internal feedback to our team.

I could have broken this whole algorithm up into two separate SQL queries: a SELECT query followed by a conditional INSERT or UPDATE query. However, this bit of workflow operates over millions of email address. And, if I can cut the number of SQL queries in half (from 2 to 1), this has a very meaningful impact on how fast the overall workflow can execute.

Want to use code from this post? Check out the license.

Reader Comments

Post A Comment — I'd Love To Hear From You!

Post a Comment

I believe in love. I believe in compassion. I believe in human rights. I believe that we can afford to give more of these gifts to the world around us because it costs us nothing to be decent and kind and understanding. And, I want you to know that when you land on this site, you are accepted for who you are, no matter how you identify, what truths you live, or whatever kind of goofy shit makes you feel alive! Rock on with your bad self!
Ben Nadel