Using SELECT And UNION ALL To JOIN SQL Tables To CSV Data In MySQL
I just love SQL. I've been writing web-applications on top of relational databases (RDBMS) for a while; and, SQL always feels like a luxurious combination of structure and flexibility. Yesterday, when working with some data scientists, I had to merge a few columns from a database table into a CSV (Comma-Separated Values) file that the data scientists had created. The CSV file was fairly small; so, I ended up using the awesome power of multiple-cursors in SublimeText 3 in order to write a SQL query that converted the CSV data into a derived table that I then consumed in an INNER JOIN
in MySQL.
To be clear, this is not the only way to consume CSV data in a relational database. But, I was working with a database in which I had read only permissions. As such, I couldn't load the CSV file into a "real table". Given the relatively small size of the CSV payload, what follows just felt like the appropriate level of effort.
That said, let's paint a picture of what was going on. Imagine that I had a database table called friend
with the following structure:
SHOW CREATE TABLE friend;
-- Generates the following SQL statement:
CREATE TABLE `friend` (
`id` int(11) NOT NULL,
`name` varchar(50) NOT NULL,
`age` int(11) NOT NULL,
`catchPhrase` varchar(250) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Then, the data scientists, using an external data warehouse, generated a CSV file that contained the friend.id
value as the first column and a favorite meal as the second column:
id favorite_meal
1 "Pizza"
4 "Bella's Chicken Burritos"
5 "Vegan, Multi-Grain Pancakes"
Given this CSV file, they wanted to merge it with the friend
table so that they could see the name
value alongside the existing id
and favorite_meal
values. To do this, I took their CSV data and - using SublimeText 3 - translated it into series of SELECT
and UNION ALL
statements:
( SELECT 1 AS id, "Pizza" AS favoriteMeal ) UNION ALL
( SELECT 4 AS id, "Bella's Chicken Burrito" AS favoriteMeal ) UNION ALL
( SELECT 5 AS id, "Vegan, Multi-Grain Pancakes" AS favoriteMeal )
This SQL generates a derived result-set in which each SELECT
statement represents a unique row. One of the hottest features of SQL is that I can then use a derived result-set as input to another query. As such, I was able to take this series of SELECT
statements and consume it as part of an INNER JOIN
on the friend
table:
This left me with the following SQL:
SELECT
f.id,
f.name,
tmp.favoriteMeal
FROM
-- Here, we are using the SELECT / UNION ALL technique to crate a derived table
-- in which each CSV value becomes a column within the derived table. We can then
-- JOIN to this derived table in order to augment it.
(
( SELECT 1 AS id, "Pizza" AS favoriteMeal ) UNION ALL
( SELECT 4 AS id, "Bella's Chicken Burrito" AS favoriteMeal ) UNION ALL
( SELECT 5 AS id, "Vegan, Multi-Grain Pancakes" AS favoriteMeal )
) AS tmp
INNER JOIN
friend f
ON
f.id = tmp.id -- Joining `friend` to the derived table.
ORDER BY
f.id ASC
;
As you can see, I'm using my CSV-data-cum-temporary-table as the first part of my INNER JOIN
. This allows me to map the friend.id
column to the tmp.id
column, which subsequently gives me access to the friend.name
column in the results. And, when we run the above SQL in the MySQL database, we get the following output:
mysql> SELECT
-> f.id,
-> f.name,
-> tmp.favoriteMeal
-> FROM
-> (
->
-> ( SELECT 1 AS id, "Pizza" AS favoriteMeal ) UNION ALL
-> ( SELECT 4 AS id, "Bella's Chicken Burrito" AS favoriteMeal ) UNION ALL
-> ( SELECT 5 AS id, "Vegan, Multi-Grain Pancakes" AS favoriteMeal )
->
-> ) AS tmp
-> INNER JOIN
-> friend f
-> ON
-> f.id = tmp.id
-> ORDER BY
-> f.id ASC
-> ;
+----+-------+-----------------------------+
| id | name | favoriteMeal |
+----+-------+-----------------------------+
| 1 | Tina | Pizza |
| 4 | Ralph | Bella's Chicken Burrito |
| 5 | Sandi | Vegan, Multi-Grain Pancakes |
+----+-------+-----------------------------+
3 rows in set (0.00 sec)
Woot woot! As you can see, by merging the CSV data and the friend
table in a single SQL query, I was able to pull in the friend.name
column in order to augment the existing CSV column data.
Isn't SQL just thrilling?! I am sure there are good number of ways to accomplish what I demonstrated. But, in the "fog of war" on a Zoom call with several other people watching me live-type SQL, this felt like the right approach. But dang it, if SQL isn't just the cat's pajamas!
Want to use code from this post? Check out the license.
Reader Comments
As of MySQL 8.0.19, this whole concept gets even easier. There is now a
VALUES
/ROW
construct that is purpose built for providing static data as a derived table:www.bennadel.com/blog/4626-using-values-row-to-create-derived-table-from-static-values-in-mysql-8-0-19.htm
It removes all of the unnecessary noise and cruft.