Skip to main content
Ben Nadel at BFusion / BFLEX 2010 (Bloomington, Indiana) with: Matt Boles
Ben Nadel at BFusion / BFLEX 2010 (Bloomington, Indiana) with: Matt Boles

Using Regular Expressions (RLIKE) Pattern Matching In MySQL Queries

By
Published in

If you've known me for a while, you know that I'm absolutely in love with Regular Expressions. I use them in ColdFusion and JavaScript all the time. And, the other day, for the very first time, I used them in MySQL. As a celebration of this milestone, I wanted to put together a quick post. This is not intended to be a tutorial, but rather a beacon of hope for anyone who didn't realize that using Regular Expressions in MySQL was even a possibility.

The Regular Expression (REGEXP / RLIKE) support in MySQL is not as robust as it is in other programming languages. And, the Regular Expression engine can't leverage table indices; so, you're basically doing a full set scan for pattern matching. But, they are still very awesome and can make running data reports much easier.

To start with, I wanted to see what kind of performance hit the REGEXP / RLIKE operator had when compared to the less robust, less flexible LIKE operator. While not necessarily a scientific test, I tried to query for email addresses, in a 30-million record table, that ended with "@bennadel.com":

-- When using the LIKE operator, three attempts ran in: 34s, 22s, 22s.
SELECT
	u.id,
	u.email
FROM
	user u
WHERE
	u.email LIKE '%@bennadel.com'
;

-- When using the RLIKE operator, three attempts ran in: 144s, 141s, 140s.
SELECT
	u.id,
	u.email
FROM
	user u
WHERE
	u.email RLIKE '@bennadel\.com$'
;

While neither of these operations (suffix matching) could leverage table indices, the LIKE operator ran 5-6 times faster than the RLIKE / REGEXP operator to find the same data. I don't point this out to say that RLIKE is bad; merely, that it has some overhead and should be used when its the "right" tool for the job - not as a replacement for basic LIKE operations.

That said, here's what some of the possible pattern matching looks like:

SELECT
	-- Using basic character classes and not-operator.
	-- Result: 1
	( 'ben@bennadel.com' RLIKE '[^@]+@[^.](\.[^.]+)+' ) isEmail,

	-- Using optional character matching.
	-- Result: 1
	( 'data.xls' RLIKE '\.xlsx?' ) AS isXLS,

	-- Using digit-matching and variable-length patterns. Unfortunately, RLIKE
	-- doesn't support notations like \d and \s. Instead, you have to use special
	-- character class names.

	-- Result: 0 -- doesn't support \d
	( '212-555-3932' RLIKE '\d{3}-\d{3}-\d{4}' ) AS isTele,

	-- Result: 1
	( '212-555-3932' RLIKE '[[:digit:]]{3}-[[:digit:]]{3}-[[:digit:]]{4}' ) AS isTele2,

	-- Result: 1 -- using support for character ranges.
	( '212-555-3932' RLIKE '[0-9]{3}-[0-9]{3}-[0-9]{4}' ) AS isTele3,

	-- Testing word-boundaries. And, again, since RLIKE doesn't support notations
	-- like \b, you have to use special character class names.

	-- Result: 0 -- doesn't support \b
	( 'You have a warrant?' RLIKE '[[:<:]]war[[:>:]]' ) AS isWar,

	-- Result: 1
	( 'Yo, war is lame.' RLIKE '[[:<:]]war[[:>:]]' ) AS isWar2,

	-- RLIKE is not case sensitive, except when used with binary strings.
	-- Result: 1 (for both tests)
	( 'BEN' RLIKE 'ben' ) AS isBen,
	( 'BEN' RLIKE 'BEN' ) AS isBen2
;

As you can see, there is an unfortunate lack of support for core character class notation in the form of \w, \b, \s, etc. But, you can used named character classes like [:digit:] to make up for that. It's a bit more cumbersome, but it works just the same.

So there you are - Regular Expression power meets relation database power for much sexiness! And while this might not affect much of my day-to-day SQL authoring, I am quite sure that I'll be making use of this in my SQL reporting efforts.

Want to use code from this post? Check out the license.

Reader Comments

I believe in love. I believe in compassion. I believe in human rights. I believe that we can afford to give more of these gifts to the world around us because it costs us nothing to be decent and kind and understanding. And, I want you to know that when you land on this site, you are accepted for who you are, no matter how you identify, what truths you live, or whatever kind of goofy shit makes you feel alive! Rock on with your bad self!
Ben Nadel