SF MySQL Meetup Presentation: Changes in MySQL 5.7
Last Wednesday, I spoke at the San Francisco MySQL Meetup on the topic of changes coming in MySQL 5.7 (and later). We actually went through two different slide decks; the first on features being considered for deprecation in MySQL 5.7 (or later), and the second set providing a brief overview of the new features and benefits already introduced in MySQL 5.7 via the development milestone releases (DMRs) published to date. A big thanks to the entire SF Meetup group, and in particular the organizers (Erin, Mike and Darren), for having me. The event was streamed and recorded, and you can view the full presentation on YouTube. The slide deck can be found here.
The discussion around proposed deprecation was good, and this blog serves to document my own notes about what was said – giving others an opportunity to provide additional feedback. Feel free to comment either to reinforce or offer alternative perspectives on the feedback noted. There’s also some post-presentation clarification mixed in:
The proposal to change replication defaults to improve durability (slide 11) were well-received – there were no objections to the proposal to set:
- sync_binlog = 1
- master-info-repository = TABLE
- relay-log-info-repository = TABLE
SHOW ENGINE INNODB MUTEX (and P_S)
The proposal to deprecate SHOW ENGINE INNODB MUTEX (slides 12 and 13) produced some lively discussion, most of which centered generally around PERFORMANCE_SCHEMA in general, rather than the specifics of SHOW ENGINEE INNODB MUTEX (which only a couple of people indicated they had used). The concerns were primarily around the overhead of enabling PERFORMANCE_SCHEMA in order to obtain the same level of visibility into InnoDB mutex usage that exists today without it. It was suggested that existing mutex information could be displayed in PERFORMANCE_SCHEMA without otherwise enabling PERFORMANCE_SCHEMA (or having associated overhead).
This topic prompted a more general discussion on acceptable overhead for PERFORMANCE_SCHEMA, and there were a variety of opinions voiced on this. One important note is that very few attendees had yet used PERFORMANCE_SCHEMA in MySQL 5.6, so the discussion was very generic rather than dealing with specific performance problems observed by users. In general, consensus was that low single-digit performance overhead (<5%, some said 1-3%) was the acceptable threshold. There was some discussion about how different deployments will have different thresholds, and it was noted that PERFORMANCE_SCHEMA instrumentation can be selectively enabled/disabled today in order to help users eliminate collection overhead for instrumentation deemed too expensive or not useful. For my part, I think the value proposition of PERFORMANCE_SCHEMA will change once people experience the benefits of it.
InnoDB Monitor Tables
A proposal to deprecate the InnoDB monitor tables (slides 14 and 15) in favor of INFORMATION_SCHEMA tables was generally supported, although again only a handful of attendees indicated they used these. There were two use cases highlighted that may not be well met by INFORMATION_SCHEMA alternatives, both of which are associated with the fact that enabling this feature directs output to the server error log:
- Diagnostic information can be collected even without establishing a connection to the database. At least one person has found this useful in the past when the MySQL server refused new inbound connections.
- Diagnostic information can be collected and retained over a period of time in the error log. Providing a means to retain diagnostic information over time might prove useful.
Those raising the above concerns indicated that they weren’t severe enough to derail a plan to deprecate the existing InnoDB monitor table implementation, but that they had found this behavior useful previously.
ALTER IGNORE TABLE
The primary use case which conflicts with deprecation of ALTER IGNORE TABLE syntax (slides 16 and 17) is when you want to add a unique constraint to an existing column which has duplicate values, and you want to eliminate all but one duplicate, but you don’t care which one is retained. This is something which can be accomplished through a series of SQL commands, but ALTER IGNORE TABLE is syntactic sugar which makes it all easy. We should consider whether there are other ways to easily accomplish the same task (non-deterministic duplicate elimination).
Simplify SQL_MODE options
Presented on slides 18 through 20, this proposal to simplify SQL_MODE options produced some very interesting feedback. The most interesting points can best be summarized as:
- Give us behavior as close to SQL standards as possible by default
- Make applications dependent upon legacy behavior (and there was widespread recognition that there are many) explicitly enable non-compliant (legacy) behavior.
- Requiring applications to set a SQL mode for legacy, non-compliant behavior – where they don’t do that today – isn’t a major concern.
Instead of trying to imply NO_ZERO_DATE with STRICT_TRANS_TABLES, it was suggested that we turn things around and have SQL mode flags such as ALLOW_ZERO_DATE, NO_ERROR_FOR_DIVISION_BY_ZERO, ALLOW_ZERO_IN_DATE.
In particular, I’d love to get feedback from the larger community on this proposal. We’re obviously eager to avoid negatively impacting developers, yet we want to move MySQL towards a default experience that’s as much compatible with SQL standards as can be. The changes to make the default experience different is relatively simple technically, but if the cost is hypothetically breaking any of the many widely-deployed applications when installed using MySQL 5.7 (note: Morgan is explicitly testing WordPress, with some success), we need to plan changes carefully, here.
Nobody really voiced concerns over the plan to remove flags for EXPLAIN PARTITIONS and EXPLAIN EXTENDED. Since a new column for partition information is returned as of 5.7.3 (output comparable to EXPLAIN PARTITIONS), any tooling using EXPLAIN output should consider using FORMAT = JSON, which is meant to provide a format more consumable by tooling (and already has partitioning information included).
This is more clarification of my comments at the meeting; this slide (number 23) simply highlights that the query cache was disabled by default in MySQL 5.6 (there was general applause for this idea), and that while we have ideas on more useful implementations of query caches, nothing is being done in MySQL 5.7 here.
Deprecation of NULL synonym \N
No real comments here (slides 24 and 25) – as long as LOAD DATA INFILE continues to support \N, everybody was happy.
It was interesting to hear nobody advocate for FEDERATED (slide 26) at this meeting. There are some valid use cases for it, and we think we have at least some of those use cases covered in 5.7 with support for multi-source replication, but I was expecting somebody to say they have a specific use case for FEDERATED that they can’t live with out. It turned out nobody present uses FEDERATED, and were unanimous in being quite willing to see it deprecated.
If I had to pick between FEDERATED and MyISAM MERGE storage engines, I’d have speculated before the meeting that FEDERATED would have more interest than MyISAM MERGE (slide 27). It turned out to be the opposite, but just barely. The use cases for MERGE have largely evaporated with the addition of partition management in MySQL 5.6, but there was one use case where MERGE had proven useful in the past: Quickly dealing with MyISAM tables which had reached maximum rows (by renaming the table and creating a MERGE table referenced by the application). That said, this use case is very infrequent, and becoming less and even the attendee who raised it said it’s been years since he dealt with MyISAM table problems like that.
As a side note, I’ll make the same request here that I made during my presentation: If you have a use case which requires using MyISAM tables instead of InnoDB tables, please let us know what it is. We’ve seen a lot of improvements in InnoDB, both from a performance perspective as well as features implemented, and we want to eliminate gaps which force people to say, “I’d like to use InnoDB, but for …” The one highlighted use case that surfaced during the meetup was performance of bulk-load data.
Again, a hearty thanks to the SF MySQL Meetup Group for the feedback provided on proposed deprecation and behavioral changes in MySQL 5.7. We welcome further comments from the group or from the MySQL community at large – please let us know if you agree with these proposals, and especially if you disagree! Comment here, or on one of Morgan’s blog posts where he covers these proposals in more detail.