Skip to main content
Ben Nadel at CFUNITED 2010 (Landsdown, VA) with: Chaz Chumley
Ben Nadel at CFUNITED 2010 (Landsdown, VA) with: Chaz Chumley

Playing With Window Functions In MySQL 8

By
Published in Comments (3)

Earlier this week, I went through the High Performance SQLite course by Aaron Francis. Among the many SQL topics that he covered, he included something that I've never tried before: Window Functions. These functions allow you to calculate data based on the current row. These functions behave somewhat like a subquery that is locked-down to the existing result-set. And, which can be further subdivided using a partition column.

To start exploring these Window Functions in MySQL, I'm going to use a common table expression (CTE) of email addresses. The derived table will be constructed using the super awesome VALUES/ROWS feature of MySQL:

WITH
	email_list ( email ) AS (

		VALUES
			ROW ( 'ben@alpha.example.com' ),
			ROW ( 'kit@alpha.example.com' ),
			ROW ( 'joanna@beta.example.com' ),
			ROW ( 'sam@charlie.example.com' ),
			ROW ( 'molly@charlie.example.com' ),
			ROW ( 'steve@charlie.example.com' ),
			ROW ( 'bob@charlie.example.com' ),
			ROW ( 'jan@beta.example.com' ),
			ROW ( 'laura@alpha.example.com' ),
			ROW ( 'hal@delta.example.com' )

	)

Now that we have our email addresses in the CTE table, email_list, let's add a generated id column. For this, we'll use the ROW_NUMBER() window function. ROW_NUMBER() returns the number of the current row (1-based) within the current partition. However, since we want this number to be unique across the entire list of emails, we'll omit a partition column and only provide an ORDER BY clause:

ROW_NUMBER() OVER ( ORDER BY email ASC )

The OVER keyword indicates that the ROW_NUMBER() is going to be calculated over a set of related rows (ie, that this is a window-based function). Then, the ORDER BY defines how each set of related rows should be sorted before the ROW_NUMBER() is applied.

For good measure, we're also going to calculate the email domain for use in a subsequent window function:

WITH
	email_list ( email ) AS (
		/* .. */
	),
	user AS (

		SELECT
			ROW_NUMBER() OVER ( ORDER BY email ASC ) AS id,
			email,
			SUBSTRING_INDEX( email, '@', -1 ) AS domain
		FROM
			email_list

	)
SELECT
	*
FROM
	user
;

When we SELECT from this intermediary table, we get the following output:

MySQL result-set showing the generated ID column and email domain column.

As you can see, the ROW_NUMBER() window function gave us an incrementing ID across the entire set of email addresses.

Now, let's use another window function to calculate the ID of each user locally within the relevant email domain. For this, we'll use the DENSE_RANK() window function:

DENSE_RANK() OVER ( PARTITION BY domain ORDER BY email ASC )

This is essentially like a ROW_NUMBER() function that resets within each partition. The PARTITION BY domain tells MySQL that within the result-set we're going to group rows by domain first. Then, within each group, we're going to first sort the rows by email ASC and then apply the DENSE_RANK() window function.

Aside: In fact, using the same PARTITION BY and ORDER BY, the ROW_NUMBER() window function would give us the same result as the DENSE_RANK() window function. But, it's more fun to try a different function.

WITH
	email_list ( email ) AS (
		/* .. */
	),
	user AS (
		/* .. */
	),
	ranked AS (

		SELECT
			u.id,
			u.email,
			u.domain,
			DENSE_RANK() OVER ( PARTITION BY domain ORDER BY email ASC ) AS domainID
		FROM
			user u

	)
SELECT
	r.domain,
	r.domainID,
	r.id,
	r.email
FROM
	ranked r
ORDER BY
	r.domain ASC,
	r.domainID
;

If we now SELECT from this ranked table, we get the following output:

MySQL result-set with generated domainID column.

As you can see, the DENSE_RANK() window function gave us a domain-specific incrementing ID for each partitioned set of users.

Window functions seem very powerful. I doubt that I'll use these function in my core application CRUD (Create, Read, Update, Delete) queries. But, I suspect that—much like the common table expression—they will be very helpful in one-off analytics and report-style queries.

Want to use code from this post? Check out the license.

Reader Comments

238 Comments

Hmmm, really glad you're delving deeper into this topic. I've never really grepped windows functions very well and my understanding was that they're most useful dealing with rolling averages and ranking search results.

15,848 Comments

@Chris,

Yeah, I read a little about the rolling average stuff in the MySQL docs. I think a lot of this stuff, too, is something that I would normally do in my application code after pulling the records back into the ColdFusion app. After all, you're still pulling back all the records - it's not like a traditional app where you're rolling the rows up into the aggregates; with the window functions you're still bring back all the rows; only, you're also asking the DB to perform some additional calculations.

I hear a lot of people say that you should ask the DB to do the stuff that it's good at - like crunching numbers. But, I don't think it's that simple. You only have 1 DB (usually); but, you might have many application instances. By moving the calculations into the application server, you are distributed the load.

As always, there's no one way to do it, just a different set of trade-offs.

238 Comments

@Ben Nadel,

Agreed, there are no perfect solutions...only trade-offs! I have had one business case for a windows function that would have been a lot more effort in the app code. I wish I could recall the circumstance, but alas...I do not :(

And if/when I ever need to do rolling averages again, I think I'll try to lean on these windows functions more...if only to understand them better.

Great article! 🙌

Post A Comment — I'd Love To Hear From You!

Post a Comment

I believe in love. I believe in compassion. I believe in human rights. I believe that we can afford to give more of these gifts to the world around us because it costs us nothing to be decent and kind and understanding. And, I want you to know that when you land on this site, you are accepted for who you are, no matter how you identify, what truths you live, or whatever kind of goofy shit makes you feel alive! Rock on with your bad self!
Ben Nadel