Tag Archives: prepared statements

Who’s leaking prepared statements?

In my last post, I described a specific problem with prepared statements into which PERFORMANCE_SCHEMA can give visibility.  That made me wonder whether PERFORMANCE_SCHEMA can also be used to identify other areas where prepared statements run into problems.  The most significant problem tends to be leakage of prepared statements.  This can inflate memory usage, both on the server and application side, and it’s not uncommon to find applications which fail to close prepared statements.

So the question is, what can PERFORMANCE_SCHEMA tell us about how connections close (or more importantly, fail to close) prepared statements?

At the most basic level, one can check the number of PREPARE statements executed compared to DEALLOCATE PREPARE, and you can do that using global status variables.  You’re shooting for general equality between (Com_prepare_sql + Com_stmt_prepare) and (Com_stmt_close + Com_dealloc_sql).  I’m using session status below to make it easy to follow, but obviously global status variables will be most interesting in any review of prepared statement leakage:

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'com_%prepare%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Com_prepare_sql    | 0     |
| Com_stmt_prepare   | 0     |
| Com_stmt_reprepare | 0     |
| Com_xa_prepare     | 0     |
+--------------------+-------+
4 rows in set (0.00 sec)

mysql> PREPARE stmt FROM 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SHOW SESSION STATUS LIKE 'com_%prepare%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Com_prepare_sql    | 1     |
| Com_stmt_prepare   | 1     |
| Com_stmt_reprepare | 0     |
| Com_xa_prepare     | 0     |
+--------------------+-------+
4 rows in set (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_stmt_close';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| Com_stmt_close | 0     |
+----------------+-------+
1 row in set (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_dealloc_sql';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Com_dealloc_sql | 0     |
+-----------------+-------+
1 row in set (0.00 sec)

mysql> DEALLOCATE PREPARE stmt;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_stmt_close';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| Com_stmt_close | 1     |
+----------------+-------+
1 row in set (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_dealloc_sql';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Com_dealloc_sql | 1     |
+-----------------+-------+
1 row in set (0.00 sec)

That’s useful as a start to identify whether a problem exists, but it doesn’t help isolate the source very well. PERFORMANCE_SCHEMA can take it the next step.

The PERFORMANCE_SCHEMA counters are smarter than the status variables, and we can count the number of prepared statements that have been opened (statement/sql/prepare_sql event for normal statements, statement/com/Prepare event for connections using COM_STMT_PREPARE protocol command) and the number closed (statement/sql/dealloc_sql event for normal statements, statement/com/Close stmt event for COM_STMT_CLOSE command).  Once can write a query which computes the difference between the prepares and closes:

SELECT sp.thread_id, t.processlist_user user, t.processlist_host host,
sp.count_star - sd.count_star open_com_query_ps,
cp.count_star - cc.count_star open_com_prepare_ps
FROM
( SELECT COUNT_STAR,
THREAD_ID
FROM events_statements_summary_by_thread_by_event_name
WHERE event_name = 'statement/sql/prepare_sql' ) sp
JOIN
( SELECT COUNT_STAR,
THREAD_ID
FROM events_statements_summary_by_thread_by_event_name
WHERE event_name = 'statement/com/Prepare' ) cp
ON (cp.THREAD_ID = sp.THREAD_ID)
JOIN
( SELECT COUNT_STAR,
THREAD_ID
FROM events_statements_summary_by_thread_by_event_name
WHERE event_name = 'statement/sql/dealloc_sql' ) sd
ON (sd.THREAD_ID = sp.THREAD_ID)
JOIN
( SELECT COUNT_STAR,
THREAD_ID
FROM events_statements_summary_by_thread_by_event_name
WHERE event_name = 'statement/com/Close stmt' ) cc
ON (cc.THREAD_ID = sp.THREAD_ID)
JOIN threads t ON (t.thread_id = sp.thread_id)
ORDER BY GREATEST(open_com_query_ps, open_com_prepare_ps) DESC;

That’s nice, and it gives us output like the following:

+-----------+------+-----------+-------------------+---------------------+
| thread_id | user | host      | open_com_query_ps | open_com_prepare_ps |
+-----------+------+-----------+-------------------+---------------------+
|        22 | root | localhost |                 3 |                   0 |
+-----------+------+-----------+-------------------+---------------------+
1 row in set (0.08 sec)

Of course, there are still ways in which this isn’t 100% accurate. For example, one can do stuff like this:

+-----------+------+-----------+-------------------+---------------------+
| thread_id | user | host      | open_com_query_ps | open_com_prepare_ps |
+-----------+------+-----------+-------------------+---------------------+
|        22 | root | localhost |                 3 |                   0 |
+-----------+------+-----------+-------------------+---------------------+
1 row in set (0.08 sec)

mysql> PREPARE stmt FROM 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> PREPARE stmt FROM 'SELECT 2';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SELECT sp.thread_id, t.processlist_user user, t.processlist_host hos
-> sp.count_star - sd.count_star open_com_query_ps,
-> cp.count_star - cc.count_star open_com_prepare_ps
-> FROM
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/sql/prepare_sql' ) sp
->  JOIN
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/com/Prepare' ) cp
->  ON (cp.THREAD_ID = sp.THREAD_ID)
->  JOIN
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/sql/dealloc_sql' ) sd
->  ON (sd.THREAD_ID = sp.THREAD_ID)
->  JOIN
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/com/Close stmt' ) cc
->  ON (cc.THREAD_ID = sp.THREAD_ID)
->  JOIN threads t ON (t.thread_id = sp.thread_id)
->  WHERE sp.count_star - sd.count_star <> 0
->   OR cp.count_star - cc.count_star <> 0
-> ORDER BY GREATEST(open_com_query_ps, open_com_prepare_ps) DESC;
+-----------+------+-----------+-------------------+---------------------+
| thread_id | user | host      | open_com_query_ps | open_com_prepare_ps |
+-----------+------+-----------+-------------------+---------------------+
|        22 | root | localhost |                 5 |                   0 |
+-----------+------+-----------+-------------------+---------------------+
1 row in set (0.09 sec)

In the above, I prepared the same statement handler twice without closing it.

I know Marc has some ideas for further PERFORMANCE_SCHEMA monitoring of prepared statements, but what’s implemented today is a good step forward. It’s easier to tie a potential prepared statement leak to specific accounts, and to isolate the leakage to protocol-based prepared statements (drivers using COM_PREPARE) or the more generic COM_QUERY interface used by the mysql command-line interface.

Prepared statement peculiarities (P_S to the rescue)

Prepared statements have been with MySQL since version 4.1, including the protocol plumbing that helps support it.  What I didn’t realize – until a recent expedition through a general query log – is that the mysql command-line interface doesn’t implement the protocol commands that support this explicitly.  I came to this realization after observing a byproduct of this behavior.

The initial observation that triggered this exploration was noting that PREPARE and EXECUTE statements, when issued from the mysql command-line interface, result in two entries per command in the general query log:

6 Query    PREPARE stmt FROM 'SELECT RAND()'
6 Prepare    SELECT RAND()
6 Query    EXECUTE stmt
6 Execute    SELECT RAND()

Contrast this behavior with what is seen when a client sends COM_PREPARE and COM_EXECUTE, such as below with Connector/J (and useServerPrepStmts=true):

14 Prepare    SELECT * FROM t1 WHERE a = ?
14 Execute    SELECT * FROM t1 WHERE a = 2

This was of interest to me because I had a script to take the contents of the general query log, separate them into per-connection files containing the SQL commands, to assist in diagnosing application behavior when developers can’t effectively isolate or describe problematic behavior.  By stripping out certain commands (connect/disconnect) and transforming others (Init DB to the original USE [db] command), this script could potentially help us build repeatable test cases faster.

Of course, that doesn’t work if prepared statement commands are duplicated sometimes.

The general query log helpfully records the type of command executed.  From the mysql cli, you can see “Query”, which corresponds to the COM_QUERY command in the protocol.  You also see the “Prepare” and “Execute”, the latter of which has the interpreted values.  Executed from Connector/J, though, you only see the “Prepare” and “Execute” because the driver is sending COM_STMT_PREPARE and COM_STMT_EXECUTE directly.  The protocol documentation describes these commands well.  That gives me two options, if I want to transform the general query log into a somewhat accurate series of SQL statements:

  1. Throw away all of the Prepare and Execute events, hoping that everybody executing prepared statements is using the cli (or COM_QUERY), and replay only the COM_QUERY.
  2. Parse and discard any COM_QUERY commands that invoke PREPARE or EXECUTE, and just use the generated SQL found in the “Execute” entries in the general query log.

I kind of understand why the mysql cli doesn’t implement the protocol commands for prepared statements, even if it does parse and transform other commands (e.g., “USE db”).  The server-side support allows prepared statements to be used at the application level even if the driver doesn’t support it at the protocol level.  And from a debugging perspective, it’s very nice to see the generated SQL in the general query log.  Because the general query log includes every command it gets, before execution is started, I also understand why the the original query received has to show there, unaltered.  I do wish, however, that there was a flag in the general query log to indicate that the Prepare or Execute being logged was internally-generated from a COM_QUERY.

The duplication of commands in the general query log did make me wonder about performance.  It seems pretty clear that there’s some overhead added when COM_QUERY is used to send PREPARE or EXECUTE commands to the server – at the very minimum, the original statement gets logged to the general query log and some parsing done to redirect execution to code paths handling preparation or execution of prepared statements.  For the fun of it, I thought I would benchmark what the performance difference is on my (admittedly slow) laptop, using Java.  Here’s what the code looks like:

public static void testPSPerformance() throws Exception {
 Class.forName("com.mysql.jdbc.Driver");
 Properties props = new Properties();
 props.setProperty("user", "root");
 props.setProperty("useServerPrepStmts", "true");
 Connection conn =
  DriverManager.getConnection("jdbc:mysql://localhost:3306/test", props);
 System.out.println("Connected!");
 PreparedStatement ps = conn.prepareStatement("SELECT RAND()");
 long start = System.currentTimeMillis();
 for(int i = 0; i < 1000000; i++){ ps.execute(); }
 long end = System.currentTimeMillis();
 System.out.println("Using COM_PREPARE:  " + (end - start));

 Statement stmt = conn.createStatement();
 stmt.execute("PREPARE stmt FROM 'SELECT RAND()'");
 final String ex = "EXECUTE stmt";
 start = System.currentTimeMillis();
 for(int i = 0; i < 1000000; i++){ stmt.execute(ex); }
 end = System.currentTimeMillis();
 System.out.println("Using COM_QUERY:  " + (end - start));
}

The end results show about 10% performance loss by using COM_QUERY, although the actual difference may be more or less depending on your driver and deployment environment.  For example, the Connector/J Java code path for using Statement objects is different than using server-side PreparedStatements, and overhead may be added or removed there.  Here are the results from my testing:

Using COM_PREPARE:  80516
Using COM_QUERY:  90109

Using COM_PREPARE:  80547
Using COM_QUERY:  87594

Using COM_PREPARE:  81344
Using COM_QUERY:  89781

As you might expect, the overall execution time as well as the performance difference increases when the general query log is enabled:

Using COM_PREPARE:  100031
Using COM_QUERY:  126485

Again, you won’t want to draw any solid conclusions from the above about whether use of prepared statements in the same way as the mysql cli does represents performance problems for you.  But it might be worth checking.  So, how can that be done?  Unfortunately, the relevant status variables don’t distinguish between when a statement is prepared using COM_PREPARE or COM_QUERY (UPDATE: You can evaluate whether COM_QUERY or COM_PREPARE is used by subtracting Com_prepare_sql from Com_stmt_prepare. The latter is always incremented for PREPARE statements, regardless of whether they are issued as part of a COM_QUERY or COM_PREPARE command, while the former is only incremented when COM_QUERY is used):

mysql> FLUSH STATUS;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SESSION STATUS LIKE '%prep%';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Com_prepare_sql     | 0     |
| Com_stmt_prepare    | 0     |
| Com_stmt_reprepare  | 0     |
| Com_xa_prepare      | 0     |
| Handler_prepare     | 0     |
| Prepared_stmt_count | 1     |
+---------------------+-------+
6 rows in set (0.00 sec)

mysql> prepare stmt from 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SHOW SESSION STATUS LIKE '%prep%';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Com_prepare_sql     | 1     |
| Com_stmt_prepare    | 1     |
| Com_stmt_reprepare  | 0     |
| Com_xa_prepare      | 0     |
| Handler_prepare     | 0     |
| Prepared_stmt_count | 1     |
+---------------------+-------+
6 rows in set (0.00 sec)

So, how can you determine whether your use of prepared statements uses COM_PREPARE or COM_QUERY?  Using PERFORMANCE_SCHEMA in 5.6, it’s easy!  Here’s the query:

SELECT
user,
host,
count_star,
IF(event_name = 'statement/sql/prepare_sql',
'COM_QUERY', 'COM_PREPARE') command
FROM events_statements_summary_by_account_by_event_name
WHERE EVENT_NAME IN
('statement/sql/prepare_sql', 'statement/com/Prepare');

Here it is in action:

mysql> SELECT
->  user,
->  host,
->  count_star,
->  IF(event_name = 'statement/sql/prepare_sql',
->   'COM_QUERY', 'COM_PREPARE') command
-> FROM events_statements_summary_by_account_by_event_name
-> WHERE EVENT_NAME IN
-> ('statement/sql/prepare_sql', 'statement/com/Prepare');
+------+-----------+------------+-------------+
| user | host      | count_star | command     |
+------+-----------+------------+-------------+
| NULL | NULL      |          0 | COM_QUERY   |
| NULL | NULL      |          0 | COM_PREPARE |
| root | localhost |          4 | COM_QUERY   |
| root | localhost |          3 | COM_PREPARE |
+------+-----------+------------+-------------+
4 rows in set (0.00 sec)

mysql> prepare stmt from 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SELECT
->  user,
->  host,
->  count_star,
->  IF(event_name = 'statement/sql/prepare_sql',
->   'COM_QUERY', 'COM_PREPARE') command
-> FROM events_statements_summary_by_account_by_event_name
-> WHERE EVENT_NAME IN
-> ('statement/sql/prepare_sql', 'statement/com/Prepare');
+------+-----------+------------+-------------+
| user | host      | count_star | command     |
+------+-----------+------------+-------------+
| NULL | NULL      |          0 | COM_QUERY   |
| NULL | NULL      |          0 | COM_PREPARE |
| root | localhost |          5 | COM_QUERY   |
| root | localhost |          3 | COM_PREPARE |
+------+-----------+------------+-------------+
4 rows in set (0.00 sec)

mysql> -- Issue prepared statement from Connector/J:
mysql> SELECT
->  user,
->  host,
->  count_star,
->  IF(event_name = 'statement/sql/prepare_sql',
->   'COM_QUERY', 'COM_PREPARE') command
-> FROM events_statements_summary_by_account_by_event_name
-> WHERE EVENT_NAME IN
-> ('statement/sql/prepare_sql', 'statement/com/Prepare');
+------+-----------+------------+-------------+
| user | host      | count_star | command     |
+------+-----------+------------+-------------+
| NULL | NULL      |          0 | COM_QUERY   |
| NULL | NULL      |          0 | COM_PREPARE |
| root | localhost |          5 | COM_QUERY   |
| root | localhost |          4 | COM_PREPARE |
+------+-----------+------------+-------------+
4 rows in set (0.00 sec)

The original problem may not be of much concern to you, but the power of PERFORMANCE_SCHEMA to dig deeper into server behavior is something that is applicable regardless of whether prepared statements are a concern for you or not.