Grouping The MySQL PROCESSLIST By IP Address To View Connection Counts
I'm a huge fan of SQL. But, I've never been a database administrator. At least, nothing like Brad Brewer - our DBA and general data architect. Lately, however, I've been trying to take a more active role in the monitoring of our database infrastructure; which means, learning more about MySQL features like the PROCESSLIST and SLOW_LOG tables. For example, today, we saw an odd increase in the number of connections coming into the master database. So, in an attempt to figure out which client machine was causing the unexpected increase, I had to figure out how to group the PROCESSLIST table by host IP address.
Now, I've known about the SHOW FULL PROCESSLIST command for a while. But, I had no idea that you could actually run queries against the PROCESSLIST table in MySQL. In the past, I've certainly tried to do this with no luck. Thankfully I came across a really helpful StackOverflow post today that taught me how to execute such a query by denoting the PROCESSLIST as a child of the INFORMATION_SCHEMA.
Once I was able to actually run queries against the PROCESSLIST table, all I had to do was strip the port off the host column and do a little grouping on the naked IP address:
SELECT
tmp.ipAddress,
-- Calculate how many connections are being held by this IP address.
COUNT( * ) AS ipAddressCount,
-- For each connection, the TIME column represent how many SECONDS it has been in
-- its current state. Running some aggregates will give us a fuzzy picture of what
-- the connections from this IP address is doing.
FLOOR( AVG( tmp.time ) ) AS timeAVG,
MAX( tmp.time ) AS timeMAX
FROM
-- Let's create an intermediary table that includes an additional column representing
-- the client IP address without the port.
(
SELECT
-- We don't actually need all of these columns for the demo. But, I'm
-- including them here to demonstrate what fields COULD be used in the
-- processlist system.
pl.id,
pl.user,
pl.host,
pl.db,
pl.command,
pl.time,
pl.state,
pl.info,
-- The host column is in the format of "IP:PORT". We want to strip off
-- the port number so that we can group the results by the IP alone.
LEFT( pl.host, ( LOCATE( ':', pl.host ) - 1 ) ) AS ipAddress
FROM
INFORMATION_SCHEMA.PROCESSLIST pl
) AS tmp
GROUP BY
tmp.ipAddress
ORDER BY
ipAddressCount DESC
This returns a list of client IP addresses along with the number of connections that the given client is holding.
Honestly, I can't tell you how long I've longed to be able to query against the PROCESSLIST table. Being able to do that now is a simple yet major breakthrough for me in my journey towards SQL mastery. It makes things like grouping the PROCESSLIST by IP address finally possible.
Want to use code from this post? Check out the license.
Reader Comments