Tag Archives: MySQL

Identifying and killing blocking transactions in InnoDB

The MySQL Server has a few options to help deal with transactions holding InnoDB locks for excessive periods of time.  The --innodb-lock-wait-timeout option is one such option, but it only affects statements waiting on locks already held by another transaction.  If you want to ensure your application isn’t holding locks for long periods of time in the first place, what options do you have?

Well, the manual has a useful example that leverages InnoDB tables in INFORMATION_SCHEMA to show which transactions are locking rows needed by other transactions.  That’s cool stuff, and you can take this a step further by looking for all connections with open transactions which exceed a given threshold or are blocking other transactions:

mysql> SET @threshold = 30;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT p.user,
->      LEFT(p.HOST, LOCATE(':', p.HOST) - 1) host, p.id,
->      TIMESTAMPDIFF(SECOND, t.TRX_STARTED, NOW()) duration,
->      COUNT(DISTINCT ot.REQUESTING_TRX_ID) waiting
->    FROM INFORMATION_SCHEMA.INNODB_TRX t
->    JOIN INFORMATION_SCHEMA.PROCESSLIST p
->      ON ( p.ID = t.TRX_MYSQL_THREAD_ID )
->    LEFT JOIN INFORMATION_SCHEMA.INNODB_LOCK_WAITS ot
->      ON ( ot.BLOCKING_TRX_ID = t.TRX_id )
->    WHERE t.TRX_STARTED + INTERVAL @threshold SECOND <= NOW()
->    GROUP BY LEFT(p.HOST, LOCATE(':', p.HOST) - 1), p.id, duration
->    HAVING duration >= @threshold OR waiting > 0;
+------+-----------+----+----------+---------+
| user | host      | id | duration | waiting |
+------+-----------+----+----------+---------+
| root | localhost |  4 |     1062 |       0 |
| ODBC | localhost |  5 |      276 |       0 |
+------+-----------+----+----------+---------+
2 rows in set (0.08 sec)

The next step you can take is to create an EVENT which logs this, or even issues KILL CONNECTION commands against the blocking transaction (sadly, there is no KILL TRANSACTION, and KILL QUERY doesn’t roll back the transaction).  You can also create control tables that allow you to fine-tune reporting or action thresholds for specific user/host combinations:

CREATE TABLE test.innodb_trx_kill_threshold (
 user VARCHAR(100) NOT NULL,
 host VARCHAR(100) NOT NULL,
 trx_kill_threshold INT UNSIGNED,
 PRIMARY KEY (user, host)
) ENGINE = InnoDB;

CREATE TABLE test.innodb_trx_kill_audit (
 id INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
 user VARCHAR(100) NOT NULL,
 host VARCHAR(100) NOT NULL,
 connection_id INT UNSIGNED NOT NULL,
 trx_duration INT UNSIGNED NOT NULL,
 other_trx_waiting INT UNSIGNED NOT NULL DEFAULT 0,
 kill_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
 UNIQUE KEY (user, host, connection_id)
) ENGINE = InnoDB;

CREATE OR REPLACE VIEW test.innodb_trx_past_threshhold AS
SELECT p.user, 
     LEFT(p.HOST, LOCATE(':', p.HOST) - 1) host, p.id, 
     TIMESTAMPDIFF(SECOND, t.TRX_STARTED, NOW()) duration,
     IFNULL(kt.trx_kill_threshold, 30) threshold, 
     (SELECT COUNT(ot.REQUESTING_TRX_ID) 
       FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS ot 
       WHERE ot.BLOCKING_TRX_ID = t.TRX_id) waiting
   FROM INFORMATION_SCHEMA.INNODB_TRX t 
   JOIN INFORMATION_SCHEMA.PROCESSLIST p 
     ON ( p.ID = t.TRX_MYSQL_THREAD_ID )
   LEFT JOIN test.innodb_trx_kill_threshold kt 
     ON ( p.USER = kt.user AND LEFT(p.HOST, LOCATE(':', p.HOST) - 1) = kt.host )
   WHERE t.TRX_STARTED + INTERVAL IFNULL(kt.trx_kill_threshold, 30) SECOND <= NOW()
      OR (SELECT COUNT(ot.REQUESTING_TRX_ID)
            FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS ot
           WHERE ot.BLOCKING_TRX_ID = t.TRX_ID) > 0;

DELIMITER ||
CREATE PROCEDURE kill_long_innodb_trx (IN kill_connections INT)
 BEGIN
 DECLARE done INT DEFAULT FALSE;
 DECLARE l_user, l_host VARCHAR(100);
 DECLARE l_timeout, l_duration, l_conn_id, l_other_trx_waiting INT UNSIGNED;
 DECLARE cur CURSOR FOR 
   SELECT user, host, id, duration, threshold, waiting 
   FROM test.innodb_trx_past_threshhold;
 DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

 OPEN cur;

 read_loop: LOOP
  FETCH cur INTO l_user, l_host, l_conn_id, l_duration, l_timeout, l_other_trx_waiting;
  IF done THEN
    LEAVE read_loop;
  END IF;
  INSERT INTO test.innodb_trx_kill_audit
    ( user, host, connection_id, trx_duration, other_trx_waiting)
      VALUES
    ( l_user, l_host, l_conn_id, l_duration, l_other_trx_waiting )
    ON DUPLICATE KEY UPDATE trx_duration = l_duration, 
      other_trx_waiting = GREATEST(other_trx_waiting, l_other_trx_waiting);
  IF kill_connections THEN
    SET @stmt = CONCAT('KILL CONNECTION ', l_conn_id);
    PREPARE stmt FROM @stmt;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END IF;

 END LOOP;

 CLOSE cur;

 END;
||
DELIMITER ;

CREATE EVENT innodb_trx_killer ON SCHEDULE EVERY 5 SECOND DO CALL kill_long_innodb_trx (1);
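One caveat worth calling out: events only run when the event scheduler is enabled, and it is OFF by default in these versions, so check it before relying on the event:

```sql
-- The event scheduler is OFF by default; without this, the
-- innodb_trx_killer event will never fire.
SHOW GLOBAL VARIABLES LIKE 'event_scheduler';
SET GLOBAL event_scheduler = ON;
```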

And now there’s an EVENT running every 5 seconds which will kill connections having InnoDB transactions older than the threshold (30s, unless configured differently for the user/host combination in the control table).  It will also log information to an audit table (test.innodb_trx_kill_audit) about the connections before they are killed.  One could easily tweak the stored procedure to simply record the audit details instead of killing connections.  Here’s an example of this in action:

mysql> BEGIN;
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO p VALUES (2, 3);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT SLEEP(35);
+-----------+
| SLEEP(35) |
+-----------+
|         0 |
+-----------+
1 row in set (35.00 sec)

mysql> COMMIT;
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql>

And the audit table:

mysql> SELECT * FROM test.innodb_trx_kill_audit\G
*************************** 1. row ***************************
               id: 1
             user: ODBC
             host: localhost
    connection_id: 157
     trx_duration: 31
other_trx_waiting: 0
        kill_time: 2012-08-28 13:08:07
1 row in set (0.00 sec)

Note:  It’s been pointed out to me that there are a few similar, earlier suggestions on how to do this, or something similar.

Understanding mysql_config_editor’s security aspects

The recent release of 5.6.6 includes a new utility, mysql_config_editor, which makes it easier to interact with MySQL clients securely.  At the same time, it’s easy to overstate the security benefits of using this new tool, and unfortunately, I think a couple of statements in the release notes and documentation go a step too far (something we’re in the process of correcting).  Ronald quoted one of them in his blog:

MySQL now provides a method for storing authentication credentials securely in an option file named .mylogin.cnf.

This enhancement really isn’t about securing passwords at a file-system level.  Don’t assume that the encryption used will prevent attackers from getting passwords stored in the .mylogin.cnf file if they have read access to it – it absolutely will not.  There are multiple ways to extract the password out of this file, some more complex than others.  Don’t make the mistake of believing that the password storage is safe just because the contents are encrypted.

So, if this enhancement isn’t about secure storing of passwords, what is it about?  It makes secure access via MySQL client applications easier to use.  For example, it helps eliminate the need to include a password as an argument to mysql or other command-line clients:

shell>  mysql -hproduction -uroot -pmypass

becomes

shell> mysql --login-path=production

(assuming we used mysql_config_editor to set up a “production” configuration)
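For reference, that “production” login path could be created with something like the following – mysql_config_editor prompts for the password rather than accepting it on the command line:

```
shell> mysql_config_editor set --login-path=production \
         --host=production --user=root --password
Enter password: ****
```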

That’s obviously more secure if somebody can view the running processes – you don’t want to expose your password like that.  Of course, you could always just include the -p argument and enter the password interactively (that doesn’t echo it to the screen or display it in any process list).  But that doesn’t help with scripts or batch jobs where interactive password entry isn’t an option, so this enhancement eliminates the need to include the password where it can’t be entered interactively.  That said, this mechanism has long existed in MySQL through the use of configuration files (and you can have as many different configuration files as you want, dedicated to various connections).  You can easily put the password under the [client] header and invoke clients like so:

shell> mysql --defaults-file=~/production.cnf

Again, this assumes a ~/production.cnf file with the needed connection attributes defined in the [client] section.
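Such a file is an ordinary option file; a minimal ~/production.cnf might look like this (hostname and credentials are placeholders, obviously):

```
[client]
host     = production
user     = root
password = mypass
```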

So, if it’s already possible to eliminate passwords for both interactive and batch use cases, why is this needed?  The most immediate answer is that this makes it easier to get password storage right.  In the above example with the production.cnf file, you have to create that file yourself and ensure that it has appropriate permissions assigned.  That’s not the case when you use mysql_config_editor.  Quoting from the manual:

The login file must be readable and writable to the current user, and inaccessible to other users. Otherwise, mysql_config_editor ignores it, and the file is not used by client programs, either.

So one of the principal benefits of mysql_config_editor is ease-of-use – you can rely on it to store your password in a file that is not readable by other users.  That’s not to say that you could not have done exactly the same thing yourself by setting appropriate file permissions on a .cnf file of your own making, but using the tool eliminates the possibility that it gets overlooked.  Note that I believe that this aspect of the feature may not be working properly for Windows at this stage, and I’m in the process of validating and reporting a bug on this.
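For the do-it-yourself route, “appropriate permissions” simply means owner-only read/write – the same mode mysql_config_editor enforces on .mylogin.cnf.  On a POSIX system that is a one-liner (a sketch using a file in the current directory):

```shell
# Restrict a hand-made option file to the owner, mirroring what
# mysql_config_editor does for .mylogin.cnf.
touch production.cnf
chmod 600 production.cnf
stat -c '%a %n' production.cnf
```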

Now you may be wondering:  If the file encryption doesn’t protect against attackers with file-system access, what does it protect against?  It eliminates a password storage location that might be misused by a user in a fit of urgency.  If a user can’t get at their own plain-text passwords in .mylogin.cnf, it adds one more hurdle against passwords being copied and pasted into other files (which would have the same need for file-system restrictions) or command-line parameters.  In short, it reduces the possibility that users may unintentionally use their own passwords in an insecure manner.

All that said, the feature does have some nice ease-of-use aspects that aren’t explicitly security-related.  For example, you don’t have to store passwords with the utility at all – you can use it as a collection of all your hard-to-remember server locations and still enter the password interactively:

shell> bin\mysql_config_editor print --all
[slave1]
host = some.really.obscure.slave.host
[slave2]
host = another.really.obscure.slave.host

shell> bin\mysql --login-path=slave1 --user=something -p
Enter password: ****

The options you specify as client application arguments will override whatever is stored in .mylogin.cnf:

shell> bin\mysql --login-path=slave1 --host=localhost --user=root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5

So, if you do choose to store passwords in your .mylogin.cnf file, and you use the same password for multiple accounts (look at me advocating security), you can have a single entry that remembers your password, and choose the host as an argument to the client.

This can be a useful enhancement for very specific security requirements and helps make it easier to be more secure in handling passwords as well as connections to multiple servers.  At the same time, it should not be confused with a mechanism that prevents attackers from acquiring passwords if they have file-system level access.


Bye-bye .MSI (Hello MySQL Installer)

Windows users of MySQL may find something missing when they first download MySQL 5.6.6.  The .MSI package of MySQL for Windows has been eliminated in favor of the MySQL Installer for Windows.  You can still download MySQL in .ZIP format (as well as a source code .ZIP for the GPL-licensed Community edition), but the quickest way to get MySQL running on Windows is the MySQL Installer.  You can download the version of MySQL Installer that includes MySQL 5.6.6 on dev.mysql.com.

The MySQL Installer has a number of improvements over the old .MSI installer and configuration wizard.  Running the 5.6.6 installer on my machine, it detected that ports 3306 and 3307 were already used by other MySQL installations and suggested port 3308 for the new installation.  It also recognized conflicts with existing Windows service names and suggested variants that don’t conflict (e.g., suggesting “MySQL56” when “MySQL” is already taken, and “MySQL56_1” when both are taken).  It smartly helps you set an appropriately strong password for the root account and gives you the option to add new user accounts during initial configuration.  Additionally, you can install (or update) tools that Windows users will find helpful, including MySQL Workbench and the new MySQL Notifier and MySQL for Excel products, as well as various connectors and documentation – all from one interface.

The new installer is roughly the same size as the MySQL Server .ZIP downloads (about 200MB).  The package includes MySQL Server and downloads other products on demand.  It will also help you check for and install recent updates to any of your installed products.  I ran into a few rough edges while testing (the Windows service created could not be started, and a datadir was created in two different locations), but on the whole, it’s a vast improvement over the previous .MSI packages, and continuous improvements are being made by the great engineering team here at Oracle.

Deprecated in MySQL 5.6 – YEAR(2)

Back in April, I issued a challenge to identify any compelling use case for the YEAR(2) data type.  Months later, I still don’t have such a use case, and the release of 5.6.6 brings deprecation of YEAR(2) support.  Trying to create a table with a YEAR(2) column in 5.6.6 or greater will result in conversion to YEAR(4) and a warning:

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.6.6-m9  |
+-----------+
1 row in set (0.00 sec)

mysql> CREATE TABLE y (y YEAR(2));
Query OK, 0 rows affected, 1 warning (0.13 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Warning
   Code: 1818
Message: YEAR(2) column type is deprecated. Creating YEAR(4) column instead.
1 row in set (0.00 sec)

The change logs for 5.6.6 highlight this change, and reference the excellent documentation of YEAR(2) limitations and migration strategies.  If you’re not sure whether you have YEAR(2) columns or not, the following command will identify any such columns:

SELECT table_schema, table_name, column_name, column_type
FROM information_schema.columns
WHERE data_type = 'YEAR'
AND column_type != 'year(4)';

Note that some queries (including this one) against INFORMATION_SCHEMA tables can be slow and may not be suitable for running against a busy production database server.  In such instances, you may want to execute this against a slave, backup, or development machine loaded with the current schema.


How will IPv6 changes in 5.6.6 affect you?

As stated in the 5.6.6 release notes, the default value of --bind-address has changed from the IPv4-specific “0.0.0.0” to “*” – a value that allows MySQL to bind to IPv6 interfaces by default.  There are a few implications to this change.

First, make sure you’ve hardened both your IPv4 and your IPv6 access points to your MySQL server.  Because MySQL now listens by default on both IPv4 and IPv6 interfaces, an installation that has only hardened IPv4 interfaces may find unaddressed vulnerabilities exposed via the IPv6 interface that is now used by default.  If you are running IPv4 interfaces only, you can also change the --bind-address value back to the previous default of “0.0.0.0”.
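If you want the pre-5.6.6 default back, a one-line entry in the server option file does it (assuming your configuration uses a [mysqld] section):

```
[mysqld]
bind-address = 0.0.0.0
```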

Second, the change in defaults can have an impact on startup speed for deployments where there is no IPv6 support.  On my Windows XP laptop, startup stalls for 45 seconds while MySQL lets Windows determine that IPv6 is not supported:

120813 10:17:16 [Note] Server hostname (bind-address): '*'; port: 3307
120813 10:18:01 [Note] IPv6 is not available.
120813 10:18:01 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
120813 10:18:01 [Note] Server socket created on IP: '0.0.0.0'.

Why Windows takes 45 seconds to figure this out is anybody’s guess, but explicitly setting --bind-address to 0.0.0.0 eliminates this behavior, and server start-up speed is restored:

120813 10:27:39 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3307
120813 10:27:39 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
120813 10:27:39 [Note] Server socket created on IP: '0.0.0.0'.

Another option is to make sure your OS has IPv6 support enabled (Microsoft documents this procedure for XP in KB article 2478747).
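Whether the OS stack supports IPv6 at all can also be probed generically.  Here’s a small sketch – not MySQL-specific, and only a rough analogue of what the server does at bind time:

```python
import socket

def ipv6_available():
    """Best-effort check: can we create and bind an IPv6 socket?"""
    if not socket.has_ipv6:
        return False  # Python built without IPv6 support
    try:
        s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        try:
            s.bind(("::1", 0))  # ephemeral port on the IPv6 loopback
        finally:
            s.close()
        return True
    except OSError:
        return False

print("IPv6 available:", ipv6_available())
```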

Whether you have IPv6 support enabled or not, make sure you consider how the change in behavior introduced in 5.6.6 will affect you.


Who’s leaking prepared statements?

In my last post, I described a specific problem with prepared statements into which PERFORMANCE_SCHEMA can give visibility.  That made me wonder whether PERFORMANCE_SCHEMA can also be used to identify other areas where prepared statements run into problems.  The most significant problem tends to be leakage of prepared statements: it’s not uncommon to find applications which fail to close prepared statements, and this can inflate memory usage on both the server and application side.

So the question is, what can PERFORMANCE_SCHEMA tell us about how connections close (or more importantly, fail to close) prepared statements?

At the most basic level, one can check the number of PREPARE statements executed compared to DEALLOCATE PREPARE, and you can do that using global status variables.  You’re shooting for general equality between (Com_prepare_sql + Com_stmt_prepare) and (Com_stmt_close + Com_dealloc_sql).  I’m using session status below to make it easy to follow, but obviously global status variables will be most interesting in any review of prepared statement leakage:

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'com_%prepare%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Com_prepare_sql    | 0     |
| Com_stmt_prepare   | 0     |
| Com_stmt_reprepare | 0     |
| Com_xa_prepare     | 0     |
+--------------------+-------+
4 rows in set (0.00 sec)

mysql> PREPARE stmt FROM 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SHOW SESSION STATUS LIKE 'com_%prepare%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Com_prepare_sql    | 1     |
| Com_stmt_prepare   | 1     |
| Com_stmt_reprepare | 0     |
| Com_xa_prepare     | 0     |
+--------------------+-------+
4 rows in set (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_stmt_close';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| Com_stmt_close | 0     |
+----------------+-------+
1 row in set (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_dealloc_sql';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Com_dealloc_sql | 0     |
+-----------------+-------+
1 row in set (0.00 sec)

mysql> DEALLOCATE PREPARE stmt;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_stmt_close';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| Com_stmt_close | 1     |
+----------------+-------+
1 row in set (0.00 sec)

mysql> SHOW SESSION STATUS LIKE 'Com_dealloc_sql';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Com_dealloc_sql | 1     |
+-----------------+-------+
1 row in set (0.00 sec)
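For the record, the same comparison can be collapsed into a single server-wide query against the 5.6-era INFORMATION_SCHEMA.GLOBAL_STATUS table (a sketch – VARIABLE_VALUE is a string column, so the SUMs rely on implicit casts):

```sql
SELECT
  SUM(IF(VARIABLE_NAME IN ('COM_PREPARE_SQL', 'COM_STMT_PREPARE'),
         VARIABLE_VALUE, 0)) AS prepared,
  SUM(IF(VARIABLE_NAME IN ('COM_STMT_CLOSE', 'COM_DEALLOC_SQL'),
         VARIABLE_VALUE, 0)) AS closed
FROM INFORMATION_SCHEMA.GLOBAL_STATUS
WHERE VARIABLE_NAME IN ('COM_PREPARE_SQL', 'COM_STMT_PREPARE',
                        'COM_STMT_CLOSE', 'COM_DEALLOC_SQL');
```

If prepared consistently outruns closed, something is leaking.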

That’s useful as a start to identify whether a problem exists, but it doesn’t isolate the source very well.  PERFORMANCE_SCHEMA can take it a step further.

The PERFORMANCE_SCHEMA counters are smarter than the status variables, and we can count the number of prepared statements that have been opened (the statement/sql/prepare_sql event for normal statements, the statement/com/Prepare event for connections using the COM_STMT_PREPARE protocol command) and the number closed (the statement/sql/dealloc_sql event for normal statements, the statement/com/Close stmt event for the COM_STMT_CLOSE command).  One can write a query which computes the difference between the prepares and closes:

SELECT sp.thread_id, t.processlist_user user, t.processlist_host host,
       sp.count_star - sd.count_star open_com_query_ps,
       cp.count_star - cc.count_star open_com_prepare_ps
  FROM
   ( SELECT COUNT_STAR, THREAD_ID
       FROM events_statements_summary_by_thread_by_event_name
      WHERE event_name = 'statement/sql/prepare_sql' ) sp
  JOIN
   ( SELECT COUNT_STAR, THREAD_ID
       FROM events_statements_summary_by_thread_by_event_name
      WHERE event_name = 'statement/com/Prepare' ) cp
    ON (cp.THREAD_ID = sp.THREAD_ID)
  JOIN
   ( SELECT COUNT_STAR, THREAD_ID
       FROM events_statements_summary_by_thread_by_event_name
      WHERE event_name = 'statement/sql/dealloc_sql' ) sd
    ON (sd.THREAD_ID = sp.THREAD_ID)
  JOIN
   ( SELECT COUNT_STAR, THREAD_ID
       FROM events_statements_summary_by_thread_by_event_name
      WHERE event_name = 'statement/com/Close stmt' ) cc
    ON (cc.THREAD_ID = sp.THREAD_ID)
  JOIN threads t ON (t.thread_id = sp.thread_id)
 ORDER BY GREATEST(open_com_query_ps, open_com_prepare_ps) DESC;

That’s nice, and it gives us output like the following:

+-----------+------+-----------+-------------------+---------------------+
| thread_id | user | host      | open_com_query_ps | open_com_prepare_ps |
+-----------+------+-----------+-------------------+---------------------+
|        22 | root | localhost |                 3 |                   0 |
+-----------+------+-----------+-------------------+---------------------+
1 row in set (0.08 sec)

Of course, there are still ways in which this isn’t 100% accurate. For example, one can do stuff like this:

+-----------+------+-----------+-------------------+---------------------+
| thread_id | user | host      | open_com_query_ps | open_com_prepare_ps |
+-----------+------+-----------+-------------------+---------------------+
|        22 | root | localhost |                 3 |                   0 |
+-----------+------+-----------+-------------------+---------------------+
1 row in set (0.08 sec)

mysql> PREPARE stmt FROM 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> PREPARE stmt FROM 'SELECT 2';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SELECT sp.thread_id, t.processlist_user user, t.processlist_host host,
-> sp.count_star - sd.count_star open_com_query_ps,
-> cp.count_star - cc.count_star open_com_prepare_ps
-> FROM
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/sql/prepare_sql' ) sp
->  JOIN
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/com/Prepare' ) cp
->  ON (cp.THREAD_ID = sp.THREAD_ID)
->  JOIN
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/sql/dealloc_sql' ) sd
->  ON (sd.THREAD_ID = sp.THREAD_ID)
->  JOIN
->  ( SELECT COUNT_STAR,
->     THREAD_ID
->    FROM events_statements_summary_by_thread_by_event_name
->    WHERE event_name = 'statement/com/Close stmt' ) cc
->  ON (cc.THREAD_ID = sp.THREAD_ID)
->  JOIN threads t ON (t.thread_id = sp.thread_id)
->  WHERE sp.count_star - sd.count_star <> 0
->   OR cp.count_star - cc.count_star <> 0
-> ORDER BY GREATEST(open_com_query_ps, open_com_prepare_ps) DESC;
+-----------+------+-----------+-------------------+---------------------+
| thread_id | user | host      | open_com_query_ps | open_com_prepare_ps |
+-----------+------+-----------+-------------------+---------------------+
|        22 | root | localhost |                 5 |                   0 |
+-----------+------+-----------+-------------------+---------------------+
1 row in set (0.09 sec)

In the above, I prepared the same statement handler twice without closing it.

I know Marc has some ideas for further PERFORMANCE_SCHEMA monitoring of prepared statements, but what’s implemented today is a good step forward. It’s easier to tie a potential prepared statement leak to specific accounts, and to isolate the leakage to protocol-based prepared statements (drivers using COM_PREPARE) or the more generic COM_QUERY interface used by the mysql command-line interface.

Prepared statement peculiarities (P_S to the rescue)

Prepared statements have been with MySQL since version 4.1, including the protocol plumbing that helps support it.  What I didn’t realize – until a recent expedition through a general query log – is that the mysql command-line interface doesn’t implement the protocol commands that support this explicitly.  I came to this realization after observing a byproduct of this behavior.

The initial observation that triggered this exploration was noting that PREPARE and EXECUTE statements, when issued from the mysql command-line interface, result in two entries per command in the general query log:

6 Query    PREPARE stmt FROM 'SELECT RAND()'
6 Prepare    SELECT RAND()
6 Query    EXECUTE stmt
6 Execute    SELECT RAND()

Contrast this behavior with what is seen when a client sends COM_PREPARE and COM_EXECUTE, such as below with Connector/J (and useServerPrepStmts=true):

14 Prepare    SELECT * FROM t1 WHERE a = ?
14 Execute    SELECT * FROM t1 WHERE a = 2

This was of interest to me because I had a script that takes the contents of the general query log and separates them into per-connection files containing the SQL commands, to assist in diagnosing application behavior when developers can’t effectively isolate or describe problematic behavior.  By stripping out certain commands (connect/disconnect) and transforming others (Init DB to the original USE [db] command), this script could potentially help us build repeatable test cases faster.
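A stripped-down sketch of that idea in Python – hypothetical code, not the actual script – assuming entries of the form “thread-id, command, argument”:

```python
import re
from collections import defaultdict

# Matches general query log entries like "6 Query  PREPARE stmt FROM ..."
ENTRY = re.compile(r"^\s*(\d+)\s+(\w+)\s+(.*)$")

def split_by_connection(lines):
    """Group logged commands per connection, dropping connect/disconnect
    noise and turning Init DB back into the original USE statement."""
    per_conn = defaultdict(list)
    for line in lines:
        m = ENTRY.match(line)
        if not m:
            continue  # headers and continuation lines are skipped here
        conn_id, command, arg = m.groups()
        if command in ("Connect", "Quit"):
            continue
        if command == "Init":  # logged as "Init DB <schema>"
            per_conn[conn_id].append("USE " + arg.split(None, 1)[1])
        else:
            per_conn[conn_id].append(arg)
    return dict(per_conn)
```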

Of course, that doesn’t work if prepared statement commands are duplicated sometimes.

The general query log helpfully records the type of command executed.  From the mysql cli, you see “Query”, which corresponds to the COM_QUERY command in the protocol, as well as “Prepare” and “Execute” entries, the latter of which contain the interpreted values.  Executed from Connector/J, though, you only see “Prepare” and “Execute”, because the driver is sending COM_STMT_PREPARE and COM_STMT_EXECUTE directly.  The protocol documentation describes these commands well.  That gives me two options if I want to transform the general query log into a somewhat accurate series of SQL statements:

  1. Throw away all of the Prepare and Execute events, hoping that everybody executing prepared statements is using the cli (or COM_QUERY), and replay only the COM_QUERY.
  2. Parse and discard any COM_QUERY commands that invoke PREPARE or EXECUTE, and just use the generated SQL found in the “Execute” entries in the general query log.

I kind of understand why the mysql cli doesn’t implement the protocol commands for prepared statements, even if it does parse and transform other commands (e.g., “USE db”).  The server-side support allows prepared statements to be used at the application level even if the driver doesn’t support it at the protocol level.  And from a debugging perspective, it’s very nice to see the generated SQL in the general query log.  Because the general query log includes every command it gets, before execution is started, I also understand why the original query received has to show there, unaltered.  I do wish, however, that there were a flag in the general query log to indicate that the Prepare or Execute being logged was internally generated from a COM_QUERY.

The duplication of commands in the general query log did make me wonder about performance.  It seems pretty clear that there’s some overhead added when COM_QUERY is used to send PREPARE or EXECUTE commands to the server – at the very minimum, the original statement gets logged to the general query log and some parsing done to redirect execution to code paths handling preparation or execution of prepared statements.  For the fun of it, I thought I would benchmark what the performance difference is on my (admittedly slow) laptop, using Java.  Here’s what the code looks like:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;
import java.util.Properties;

public static void testPSPerformance() throws Exception {
 Class.forName("com.mysql.jdbc.Driver");
 Properties props = new Properties();
 props.setProperty("user", "root");
 props.setProperty("useServerPrepStmts", "true");
 Connection conn =
  DriverManager.getConnection("jdbc:mysql://localhost:3306/test", props);
 System.out.println("Connected!");

 // Server-side prepared statement (COM_STMT_PREPARE / COM_STMT_EXECUTE):
 PreparedStatement ps = conn.prepareStatement("SELECT RAND()");
 long start = System.currentTimeMillis();
 for(int i = 0; i < 1000000; i++){ ps.execute(); }
 long end = System.currentTimeMillis();
 System.out.println("Using COM_PREPARE:  " + (end - start));

 // PREPARE/EXECUTE sent as plain statements over COM_QUERY:
 Statement stmt = conn.createStatement();
 stmt.execute("PREPARE stmt FROM 'SELECT RAND()'");
 final String ex = "EXECUTE stmt";
 start = System.currentTimeMillis();
 for(int i = 0; i < 1000000; i++){ stmt.execute(ex); }
 end = System.currentTimeMillis();
 System.out.println("Using COM_QUERY:  " + (end - start));
}

The end results show about 10% performance loss by using COM_QUERY, although the actual difference may be more or less depending on your driver and deployment environment.  For example, the Connector/J Java code path for using Statement objects is different than using server-side PreparedStatements, and overhead may be added or removed there.  Here are the results from my testing:

Using COM_PREPARE:  80516
Using COM_QUERY:  90109

Using COM_PREPARE:  80547
Using COM_QUERY:  87594

Using COM_PREPARE:  81344
Using COM_QUERY:  89781

As you might expect, the overall execution time as well as the performance difference increases when the general query log is enabled:

Using COM_PREPARE:  100031
Using COM_QUERY:  126485

Again, you won’t want to draw any solid conclusions from the above about whether using prepared statements the way the mysql cli does represents a performance problem for you.  But it might be worth checking.  So, how can that be done?  Unfortunately, the relevant status variables don’t distinguish between a statement prepared using COM_PREPARE and one prepared using COM_QUERY (UPDATE: You can evaluate whether COM_QUERY or COM_PREPARE is used by subtracting Com_prepare_sql from Com_stmt_prepare.  The latter is always incremented for PREPARE statements, regardless of whether they are issued as part of a COM_QUERY or COM_PREPARE command, while the former is only incremented when COM_QUERY is used):

mysql> FLUSH STATUS;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SESSION STATUS LIKE '%prep%';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Com_prepare_sql     | 0     |
| Com_stmt_prepare    | 0     |
| Com_stmt_reprepare  | 0     |
| Com_xa_prepare      | 0     |
| Handler_prepare     | 0     |
| Prepared_stmt_count | 1     |
+---------------------+-------+
6 rows in set (0.00 sec)

mysql> prepare stmt from 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SHOW SESSION STATUS LIKE '%prep%';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Com_prepare_sql     | 1     |
| Com_stmt_prepare    | 1     |
| Com_stmt_reprepare  | 0     |
| Com_xa_prepare      | 0     |
| Handler_prepare     | 0     |
| Prepared_stmt_count | 1     |
+---------------------+-------+
6 rows in set (0.00 sec)

So, how can you determine whether your use of prepared statements uses COM_PREPARE or COM_QUERY?  Using PERFORMANCE_SCHEMA in 5.6, it’s easy!  Here’s the query:

SELECT user, host, count_star,
       IF(event_name = 'statement/sql/prepare_sql',
          'COM_QUERY', 'COM_PREPARE') command
  FROM events_statements_summary_by_account_by_event_name
 WHERE EVENT_NAME IN
       ('statement/sql/prepare_sql', 'statement/com/Prepare');

Here it is in action:

mysql> SELECT
->  user,
->  host,
->  count_star,
->  IF(event_name = 'statement/sql/prepare_sql',
->   'COM_QUERY', 'COM_PREPARE') command
-> FROM events_statements_summary_by_account_by_event_name
-> WHERE EVENT_NAME IN
-> ('statement/sql/prepare_sql', 'statement/com/Prepare');
+------+-----------+------------+-------------+
| user | host      | count_star | command     |
+------+-----------+------------+-------------+
| NULL | NULL      |          0 | COM_QUERY   |
| NULL | NULL      |          0 | COM_PREPARE |
| root | localhost |          4 | COM_QUERY   |
| root | localhost |          3 | COM_PREPARE |
+------+-----------+------------+-------------+
4 rows in set (0.00 sec)

mysql> prepare stmt from 'SELECT 1';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SELECT
->  user,
->  host,
->  count_star,
->  IF(event_name = 'statement/sql/prepare_sql',
->   'COM_QUERY', 'COM_PREPARE') command
-> FROM events_statements_summary_by_account_by_event_name
-> WHERE EVENT_NAME IN
-> ('statement/sql/prepare_sql', 'statement/com/Prepare');
+------+-----------+------------+-------------+
| user | host      | count_star | command     |
+------+-----------+------------+-------------+
| NULL | NULL      |          0 | COM_QUERY   |
| NULL | NULL      |          0 | COM_PREPARE |
| root | localhost |          5 | COM_QUERY   |
| root | localhost |          3 | COM_PREPARE |
+------+-----------+------------+-------------+
4 rows in set (0.00 sec)

mysql> -- Issue prepared statement from Connector/J:
mysql> SELECT
->  user,
->  host,
->  count_star,
->  IF(event_name = 'statement/sql/prepare_sql',
->   'COM_QUERY', 'COM_PREPARE') command
-> FROM events_statements_summary_by_account_by_event_name
-> WHERE EVENT_NAME IN
-> ('statement/sql/prepare_sql', 'statement/com/Prepare');
+------+-----------+------------+-------------+
| user | host      | count_star | command     |
+------+-----------+------------+-------------+
| NULL | NULL      |          0 | COM_QUERY   |
| NULL | NULL      |          0 | COM_PREPARE |
| root | localhost |          5 | COM_QUERY   |
| root | localhost |          4 | COM_PREPARE |
+------+-----------+------------+-------------+
4 rows in set (0.00 sec)

The original problem may not be of much concern to you, but PERFORMANCE_SCHEMA’s ability to dig deeper into server behavior is valuable regardless of whether prepared statements are on your radar.

See you at MySQL Innovation Day!

I’m excited to attend the MySQL Innovation Day at Oracle headquarters in Redwood Shores next Tuesday. There are some great sessions planned on MySQL advances, including:

  • InnoDB online operations
  • Global transaction IDs for replication and failover
  • NoSQL interfaces to InnoDB and NDB
  • PERFORMANCE_SCHEMA improvements
  • Optimizer enhancements

This is a great opportunity for MySQL users to get details straight from MySQL developers on upcoming features.  While the focus is entirely on what is being done in Engineering to drive MySQL forward, I’ll be around to chat about MySQL Support at the community reception in the afternoon.  If you’re a MySQL Support user, or just interested in learning more about the value MySQL Support can offer, stop by and say hi!

Registration for Innovation Day on-site remains open, so if you can make time this Tuesday, join us!  And if you can’t join us in person, there’s also the live webcast option!

Overlooked MySQL 5.6 New Features – TIMESTAMP and DATETIME improvements

Continuing my little tour of possibly overlooked new MySQL 5.6 features, I came across this oft-requested gem.  For eons, MySQL has supported TIMESTAMP columns that are automatically populated at row insertion and/or row update time.  It’s a handy feature, but it has suffered from a significant limitation: only one TIMESTAMP column per table could be automatically maintained, forcing you to choose whether to leverage this functionality for record-creation time or last-updated time, or to bypass the MySQL feature entirely and maintain a TIMESTAMP or other temporal column in your application code.  Well, all that has changed with 5.6.5, and you can now do this:

mysql> CREATE TABLE v (
->  a INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
->  b INT,
->  created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
->  updated TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
->  );
Query OK, 0 rows affected (0.22 sec)

mysql> INSERT INTO v (a, b) VALUES (NULL, 1);
Query OK, 1 row affected (0.05 sec)

mysql> SELECT * FROM v;
+---+------+---------------------+---------------------+
| a | b    | created             | updated             |
+---+------+---------------------+---------------------+
| 1 |    1 | 2012-05-29 11:25:20 | 0000-00-00 00:00:00 |
+---+------+---------------------+---------------------+
1 row in set (0.02 sec)

mysql> UPDATE v SET b = b+1;
Query OK, 1 row affected (0.05 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT * FROM v;
+---+------+---------------------+---------------------+
| a | b    | created             | updated             |
+---+------+---------------------+---------------------+
| 1 |    2 | 2012-05-29 11:25:20 | 2012-05-29 11:25:43 |
+---+------+---------------------+---------------------+
1 row in set (0.00 sec)

For the record, here’s the error message you get when you try to create the same table in 5.5:

mysql> CREATE TABLE v (
    ->  a INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->  b INT,
    ->  created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    ->  updated TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    ->  );
ERROR 1293 (HY000): Incorrect table definition; there can be only one TIMESTAMP
column with CURRENT_TIMESTAMP in DEFAULT or ON UPDATE clause

In addition, 5.6.5 extends automatic population to DATETIME columns:

mysql> DROP TABLE v;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE v (
->  a INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
->  b INT,
->  created DATETIME DEFAULT CURRENT_TIMESTAMP,
->  updated DATETIME DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP
->  );
Query OK, 0 rows affected (0.02 sec)

mysql>
mysql> INSERT INTO v (a, b) VALUES (NULL, 1);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM v;
+---+------+---------------------+---------+
| a | b    | created             | updated |
+---+------+---------------------+---------+
| 1 |    1 | 2012-05-29 11:39:41 | NULL    |
+---+------+---------------------+---------+
1 row in set (0.00 sec)

Many thanks to the MySQL Engineering team in general, and Martin in particular, for implementing this much-needed feature!

You can read more about this important change in Martin’s blog, the 5.6.5 change logs, and the 5.6 documentation.

Overlooked MySQL 5.6 New Features – WL#5217

There are a lot of great new features in the MySQL 5.6 DMRs – almost too many to keep track of.  And while a lot of (justified) attention goes to the headline-grabbing features of 5.6 (memcached APIs! global transaction IDs! improved PERFORMANCE_SCHEMA!), I’m often curious about the new features that don’t make as big a splash.  I thought I would look at one such feature – WorkLog #5217.  I’m not telling you what this WorkLog is yet; test your knowledge of 5.6 features by seeing if you can figure it out from the following scenario.  Imagine the following table data:

mysql> SELECT * FROM p;
+------+------+
| a    | b    |
+------+------+
|    1 |    1 |
|   11 |    1 |
|   21 |    1 |
+------+------+
3 rows in set (0.00 sec)

OK, try to write an UPDATE statement that increments b for the row where a = 11.  Easy, right?

UPDATE p SET b = b+1 WHERE a = 11;

OK, now do it without a WHERE clause.  Maybe that made you think for a minute, but perhaps you came up with something like this:

UPDATE p SET b = b+1 ORDER BY a = 11 DESC LIMIT 1;

Nice!  Now do it without ORDER BY or LIMIT.  Can you do it?

There’s a hint in the table structure:

mysql> SHOW CREATE TABLE p\G
*************************** 1. row ***************************
Table: p
Create Table: CREATE TABLE `p` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (a)
(PARTITION p0 VALUES LESS THAN (10) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (20) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (30) ENGINE = InnoDB) */
1 row in set (0.00 sec)

You’ve probably figured out that 5.6 adds the ability to explicitly restrict statements to specific partitions, even if you haven’t yet found or reviewed the excellent documentation in the manual.  In 5.6, the following works:

mysql> UPDATE p PARTITION(p1) SET b = b+1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT * FROM p;
+------+------+
| a    | b    |
+------+------+
|    1 |    1 |
|   11 |    2 |
|   21 |    1 |
+------+------+
3 rows in set (0.00 sec)

The documentation on this is very good, providing useful examples of where explicit partition selection might come in handy and showing that it works not only with UPDATE statements, but also with SELECT, INSERT, DELETE, REPLACE, and LOAD DATA/LOAD XML.  There’s also the WorkLog and associated bugs.mysql.com feature requests.  The documentation says the following about UPDATE statements:

UPDATE statements using explicit partition selection behave in the same way; only rows in the partitions referenced by the PARTITION option are considered when determining the rows to be updated, as can be seen by executing the following statements.

It then provides an example similar to the solution for the problem posed above.  I still had one question, though – what happens if an UPDATE changes a partition key column so that a row moves from one partition to another?  It turns out you get an error:

mysql> UPDATE p PARTITION(p1) SET a = 22 WHERE a = 11;
ERROR 1745 (HY000): Found a row not matching the given partition set

But if I include both the source partition and the target partition, all’s well:

mysql> UPDATE p PARTITION(p1,p2) SET a = 22 WHERE a = 11;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

How do I know the data was actually moved from p1 to p2, and that mysqld isn’t just pulling my leg?  Easy:

mysql> SELECT * FROM p PARTITION(p2);
+------+------+
| a    | b    |
+------+------+
|   21 |    1 |
|   22 |    1 |
+------+------+
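As a side note, the routing logic behind that earlier error is easy to model. Here’s a toy sketch (not MySQL code) of how RANGE partitioning assigns rows, using the boundaries from the SHOW CREATE TABLE output; it shows why moving a from 11 to 22 requires p2 in the partition set:

```python
# VALUES LESS THAN boundaries for p0, p1, p2, per SHOW CREATE TABLE p
BOUNDARIES = [("p0", 10), ("p1", 20), ("p2", 30)]

def route(a):
    """Return the first partition whose upper bound exceeds a,
    mirroring PARTITION BY RANGE (VALUES LESS THAN) semantics."""
    for name, upper in BOUNDARIES:
        if a < upper:
            return name
    raise ValueError(f"no partition holds {a!r}")

assert route(11) == "p1"  # the row starts in p1
assert route(22) == "p2"  # after SET a = 22 it must land in p2,
                          # hence PARTITION(p1, p2) is required
```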

Awesome.  You might want to download the 5.6.5 Development Milestone Release yourself and try this and other new 5.6 features (you’ll need to click on the “Development Releases” tab to access it, something which took me longer to figure out than I care to admit).