Practical P_S: Finding which accounts fail to properly close connections

I’ve previously written about several problems which can benefit from additional visibility provided by PERFORMANCE_SCHEMA in MySQL 5.6, and it’s time to add to that list.  A very common problem involves connections which are not properly closed – they simply idle until they reach wait_timeout (or interactive_timeout, depending on the client flags set), and the server terminates the connection.  Who knows what the root cause is – perhaps the client terminated without cleaning up connections, or maybe there was just no load, or maybe the network cable was unplugged.  It’s something application developers – particularly those using persistent connections managed by a pool – run into frequently.

If you are a DBA rather than a developer, though, your only real clue that something is wrong may be a perpetually increasing Aborted_clients status variable counter. The manual has a page dedicated to solving such (and related) connection problems, and it references tools such as the general query log and error log.  The Aborted_clients status variable is useful to answer the question, “how many connections have been closed without an explicit quit request from the client?”  And prior to 5.6, that’s about as much information as you could expect to get:

mysql> show global status like 'aborted_clients';
| Variable_name   | Value |
| Aborted_clients | 5     |
1 row in set (0.00 sec)

With PERFORMANCE_SCHEMA in 5.6, we can isolate the problem to specific accounts, and we can calculate the percentage of client connections which were terminated without an explicit quit command from the client.  You can do that with the following query:

    performance_schema.events_statements_summary_by_account_by_event_name ess
    performance_schema.accounts a ON (ess.USER = a.USER AND ess.HOST = a.HOST)
    ess.EVENT_NAME = 'statement/com/Quit'

The easiest way to test this is to make a handful of connections, issue SET @@session.wait_timeout = 1 – here’s the result of the above query after doing so:

| USER | HOST      | not_closed | pct_not_closed |
| root | localhost |          4 |        44.4444 |
| ODBC | localhost |          1 |       100.0000 |
2 rows in set (0.00 sec)

Knowing which accounts are failing to properly close connections can help quickly spotlight where further investigation should be focused.  And with MySQL 5.6, DBAs can get that information without resorting to the general query log or via application logs.

Tagged , ,

2 thoughts on “Practical P_S: Finding which accounts fail to properly close connections

  1. Tom Ostberg says:

    Why do we care if a connection is closed improperly? I have a site where web-scripts mainly do not correctly close connections. Is there any adverse affect on the server?

    • Todd Farmer says:

      Good question, Tom! Failure to close a connection may cause problems in extreme cases for MySQL Server, or it may just be symptomatic of problems elsewhere (e.g., troublesome network). Each connection to MySQL Server has some resources allocated to it, in the form of memory and sockets. The connection may uses a lot of server-side resources such as prepared statements, user variables, temporary tables, etc. – these are resources that can’t be released until the connection is terminated. One immediate concern of connections being left open would be reaching max_connections (5.7 defaults to 151; earlier versions to 100). Each connection open counts towards that, and the server cannot distinguish between those which are “really” in use and those which are not. When web scripts don’t close connections, the server counts that connection towards max_connections until wait_timeout (or interactive_timeout, depending on the connection method used) is reached. These default to 8 hours, so a number of scripts which leave connections open can clearly cause problems over time – the end result being that MySQL Server refuses connections.

      Beyond that, seeing connections which are not being properly closed can be of interest for diagnosing other problems. As noted above, it might signal network problems. It could be caused by misconfigured applications (e.g., a JDBC connection pool which isn’t configured to maintain/check idle connections, or isn’t checking them often enough).

      A couple of connections which were not properly closed isn’t likely to cause massive problems, but it’s a good thing to track and try to understand. Such problems can be pointers to larger problems, or can become a much larger problem if left unaddressed over time.

      I hope that helps!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>