Tag Archives: MySQL Cluster

More GlassFish loadbalancing tips for Connector/J

Almost two weeks ago, I encouraged GlassFish users who need load-balanced JDBC connections to MySQL Cluster (or master-master replicated MySQL Server) to set the loadBalanceValidateConnectionOnSwapServer property to true in order to help ensure the connection chosen at re-balance is still usable.  That advice triggered finding a bug (14563127) which will cause the following Exception message:

No operations allowed after connection closed. Connection closed after inability to pick valid new connection during fail-over.

If you implemented the loadBalanceValidateConnectionOnSwapServer property and are seeing the above error message, updating your driver to the newly-released 5.1.22 build will likely solve this problem.  Here’s a quick look at internals of Connector/J re-balance operations, some additional configuration suggestions, and details on the bug fixed in 5.1.22: Continue reading More GlassFish loadbalancing tips for Connector/J

Learn From MySQL Support Staff at MySQL Connect!

Members of the MySQL Support Team wear a number of different hats here at Oracle.  Obviously, our top priority is to provide amazing technical support that makes customers rave.  We also have a team dedicated to processing bug reports from the MySQL Community.  Some of us are active bloggers or assist on mailing lists or forums, while others find other ways to contribute to the MySQL Community.  We help out with QA and product planning, write books, and the consultative support aspects of MySQL subscriptions allow us the opportunity to help train and advise customers on best practices and successful MySQL-backed application architecture.  In short, we’re deeply invested in making sure that MySQL deployments are successful.

This year, we’re privileged to send some of these great MySQL experts to the inaugural MySQL Connect Conference this September 29-30, and it will be a great opportunity for you to experience first-hand the value that MySQL Support can provide:

Be sure to stop by the MySQL Support booth in the demo grounds at MySQL Connect and meet these MySQL experts – they’ll be happy to help you solve any of your tricky MySQL problems.

If you’re an Oracle DBA just getting started on MySQL, it’s worth the $100 fee to add MySQL Connect to your OOW registration, but Ben Krug will also be leading a session at OOW with the goal of Demystifying MySQL for Oracle DBAs and Developers.

And if California is just too far to travel, perhaps you can catch Uma Bhat‘s presentation at PyCon India, Introduction to MySQL Connector/Python.

We hope to see you at one of these upcoming conferences!

Load-balanced JDBC Tip for GlassFish Deployments

Having supported a number of successful load-balanced JDBC applications using MySQL Cluster and MySQL Connector/J over the years, I’ve found a few problems that are unique to specific Java app servers.  A recent customer inquiry reminded me of a GlassFish-specific issue, and the Connector/J connection property we introduced to help solve it.  I thought it might be useful to document this here for any GlassFish users looking to deploy a load-balanced JDBC application with MySQL (Cluster or multi-master replication).

If you’re entirely new to the load-balancing functionality in MySQL Connector/J, you may want to review some earlier posts.  In particular, it’s important to understand how a load-balanced Connector/J (logical) Connection object maintains multiple physical connections (small “c”) – up to one for each load-balanced server.  When the driver determines it’s appropriate to re-balance the connection, a new host is chosen, and if a physical connection to that server already exists within the Connection object, it will be reused, rather than creating a new physical connection.  Because we’re maintaining multiple physical connections, it’s important that the connection validation process used by the application server trigger validation of all underlying physical connections – not just the currently-active physical connection.  In Connector/J, load-balanced Connection objects will validate all underlying physical connections when a query starting with “/* ping */” (exactly) is encountered.

And this is where GlassFish users sometimes have problems – it can be difficult (or impossible, in some versions) to define a custom validation query that starts with “/* ping */” to trigger the necessary validation of all physical connections.  Here’s the scenario that proves most problematic:

  1. The application gets a Connection object from the connection pool, and the Server A is the currently-active server to which traffic is routed by the Connection object.
  2. The application issues commit(), which triggers re-balance, and Server B is made active.  The physical connection to Server A is retained.
  3. The application returns the Connection to the connection pool.
  4. The connection pool periodically validates the Connection using “SELECT 1” – a query that is executed against the currently-active physical connection to Server B.  The physical connection to Server A remains idle, and eventually dies (unnoticed).
  5. The application is given this same Connection from the connection pool later, perhaps even validates it on check-out (again with “SELECT 1”), and uses it successfully.
  6. The application issues commit(), triggering re-balance again, and Server A is selected and made active.
  7. The application tries to use the Connection again, and it fails due to CommunicationException, as the physical connection to Server A has been left idle too long.

Explicitly in support of users of GlassFish or other JDBC connection pools where the connection validation query cannot be set to start with “/* ping */”, we added the loadBalanceValidateConnectionOnSwapServer connection property starting in MySQL Connector/J version 5.1.11.  In the example above, this affects step #6.  When the Connection re-balances and chooses a new physical connection, that physical connection is explicitly validated before returning control to the application.  Thus, when the application tries to use the Connection object again in step #7, it should find a valid Connection.

Having confidence that the Connection won’t choose an idle physical connection at re-balance sounds good, right?  So why not make this the default behavior?

Well, this adds some overhead to re-balance operations – in particular, commit().  If you have a JDBC connection pool that can maintain load-balanced connections adequately using the /*ping */ connection validation query, checking the physical connection at each commit() can be unwelcome overhead.  It also doesn’t guarantee the Connection won’t throw a CommunicationException – with applications connecting over a network, a disconnect is possible at any time.

Hopefully this tip will help users of GlassFish (and other JDBC connection pools where validation queries cannot be specified) be more successful with load-balanced MySQL deployments.

See you at MySQL Innovation Day!

I’m excited to attend the MySQL Innovation Day at Oracle headquarters in Redwood Shores next Tuesday. There are some great sessions planned on MySQL advances, including:

  • InnoDB online operations
  • Global transaction IDs for replication and failover
  • NoSQL interfaces to InnoDB and NDB
  • PERFORMANCE_SCHEMA improvements
  • Optimizer enhancements

This is a great opportunity for MySQL users to get details straight from MySQL developers on upcoming features.  While the focus is entirely on what is being done in Engineering to drive MySQL forward, I’ll be around to chat about MySQL Support at the community reception in the afternoon.  If you’re a MySQL Support user, or just interested in learning more about the value MySQL Support can offer, stop by and say hi!

Registration for Innovation Day on-site remains open, so if you can make time this Tuesday, join us!  And if you can’t join us in person, there’s also the live webcast option!

MySQL Cluster Quick Start Script

Many Windows users first looking to experiment with MySQL Cluster may find the process of getting everything up and running daunting.  Fortunately, there’s MySQL Cluster Manager (MCM) which can help jump-start that process – and that’s all many people will need.

In my work, I frequently have to install specific version of MySQL Server or Cluster to validate behavior of a handful of commands, and MCM is a bit overkill.  I have a couple of tricks that I use to quickly set up MySQL Cluster on Windows environments:

  1. I always use the .ZIP packages
  2. I have a set cluster data directory (in my case, D:\cluster-data)
  3. I have a standard Cluster configuration file (config.ini) that I use, with two data nodes, one management node, and two SQL nodes
  4. I have a script I use to do some setup and start the necessary processes, after first making sure no other Cluster is running and there are no port conflicts for MySQL SQL nodes

The Cluster configuration file is very sparse:


[ndbd default]

DataDir= D:\cluster-data
# Data Nodes
# One entry for each node
# SQL Nodes
# One entry for each node

If you use this configuration file, make sure that you also create the DataDir you specify – the Cluster management node will not start if it does not exist.

The batch file I use is similarly simple, but it saves me plenty of time when I have to quickly deploy a specific version for a quick 5-minute test:

REM stop MySQL SQL nodes running on ports 3307 or 3308, if any exist
REM (ignore errors if mysqld isn't available on the port)
bin\mysqladmin -uroot -P3307 shutdown
bin\mysqladmin -uroot -P3308 shutdown

REM stop MySQL data and management nodes, if running (connection failure errors can be ignored)
bin\ndb_mgm -e "shutdown"

REM start the MySQL Cluster management node
start bin\ndb_mgmd --configdir=%CD% -f config.ini

REM start the MySQL Cluster data nodes, passing --initial if the batch file received that argument
start bin\ndbd %1
start bin\ndbd %1

REM created a copy of the MySQL SQL node data directory, if it doesn't already exist
IF NOT EXIST %CD%\data2 XCOPY %CD%\data %CD%\data2 /i /S

REM start the MySQL SQL nodes on ports 3307 and 3308
start bin\mysqld --no-defaults --basedir=%CD% --datadir=%CD%\data --port=3307 --ndbcluster --ndb-connectstring=localhost:1186 --console
start bin\mysqld --no-defaults --basedir=%CD% --datadir=%CD%\data2 --port=3308 --ndbcluster --ndb-connectstring=localhost:1186 --console

The script expects to be run from within the directory created when unpacking the .ZIP file.  It leverages the default (blank) password for the root account in Windows .ZIP distributions to shutdown the SQL nodes, and uses port 3307 and 3308 to avoid causing problems with any existing MySQL server that may be running on port 3306 (which is my personal, persistent installation).

Hopefully this will help people looking to do a quick deployment of MySQL Cluster on Windows.

Connector/J extension points – Load Balancing Strategies

A fourth and final Connector/J extension point I covered in my JavaOne and Silicon Valley Code Camp presentations is load-balancing strategies.  This exists in order to allow you to define behavior for balancing load across multiple back-end MySQL server instances.  MySQL Connector/J’s load-balancing implementation is a simple internal connection pool.  What appears to your application as a single Connection object can actually have multiple physical connections to MySQL servers underneath (one per configured host/port pair).  At specific points, Connector/J will re-balance and choose another host to interface with.  This extension point allows you to define how Connector/J determines which host it should pick next.

Unlike the previous extension points, my demo code does not contain examples of this.  In this case, though, there are some standard implementations provided with Connector/J that we can look at, instead.  The two implementations shipped with Connector/J today implement a “best response time” strategy and a “random” strategy.  The default behavior when using load-balanced deployments is “random”, and the Connector/J configuration properties documentation describes their use cases.  Below are the actual classes which define the behavior:

  • com.mysql.jdbc.RandomBalanceStrategy
  • com.mysql.jdbc.BestResponseTimeBalanceStrategy

These classes – and any user-implemented load-balancing strategy – implements the com.mysql.jdbc.BalanceStrategy interface.  As the purpose of this extension point is to define how load is balanced across MySQL instances, there’s really only one key method you need to focus on:  pickConnection().  The purpose of this method is to return a Connection (more specifically, a com.mysql.jdbc.ConnectionImpl object).  Looking at the RandomLoadBalacneStrategy code, you will see the logic that’s involved.

The first parameter to pickConnection() is a LoadBalancingConnectionProxy object.  This is the object that does much of the load-balancing work.  It also contains a few callback methods you will want to consider:

  1. getGlobalBlacklist() – this method returns a Map<String, Long> of hosts that have been identified as unavailable.  The String key is the host/port, while the Long is the time that the blacklist entry should expire.  Inside the proxy, this global blacklist is defined as a static Map, meaning that Host X will be found in the blacklist by one Connection if another Connection object put it there after experiencing problems.  Access to the static variable is synchronized, and the Map returned from this method is a local copy.
  2. shouldExceptionTriggerFailover() – this method takes a SQLException and determines whether such an Exception should trigger a failover.  This, too, is user-configurable, although the defaults are usually sufficient for most deployments.  A previous post contains detailed information on how to customize this behavior.
  3. addToGlobalBlacklist() – this is the method you want to call if you want to add a host to the global blacklist.
  4. createConnectionForHost() – this is a utility method that handles creation of a new ConnectionImpl object based on the host/port String, so that you don’t have to wire up ConnectionImpl objects directly.  If you look at the method implementation, you will see the work that goes into setting up a properly-configured ConnectionImpl.

The remaining parameters are, in order:

  • List<String> – a list of configured hosts involved in load-balancing
  • Map<String, ConnectionImpl> – a Map of “live” connections already established, accessed through the host/port key.  Thinking of this as a connection pool, these are the cached connections which can be reused if the host/port pair is chosen, instead of doing the additional work of setting up a new physical connection.
  • long[] – an array of response times in the same order as the List<String> of configured hosts.  This is used in BestResponseTimeBalanceStrategy.
  • int – number of retries that should be attempted before giving up on finding a new connection.

So, what can you do with this?  People frequently ask for a true round-robin load-balancer.  Our experience has been that RandomBalanceStrategy is far better, but if you really need a true round-robin load-balancing algorithm, you could implement it here.


Managing load-balanced Connector/J deployments

Connector/J has long provided an effective means to distribute read/write load across multiple MySQL server instances for Cluster or master-master replication deployments, but until version 5.1.13, managing such deployments frequently required a service outage to redeploy a new configuration.  Given that ease of scaling out by adding additional MySQL Cluster (server) instances is a key element in that product offering, which is also naturally targeted at deployments with very strict availability requirements, we had to add support for online changes of this nature.  It’s also critical for online upgrades – the other option is to take a MySQL Cluster server instance down hard, which loses any in-process transactions and generates application exceptions, if any application is trying to use that particular server instance.

Mark Matthews and I first presented this at the MySQL Conference and Expo 2010, but the live demo we had planned didn’t work out (buggy video adapter, not product, really!).  This is my belated attempt to rectify that situation by walking through configuration, deployment and management of load-balancing configurations.

First, let’s talk about the configuration properties associated with this new functionality:

  • loadBalanceConnectionGroup – This is somewhat tricky to explain – imagine a scenario where you have an application which pulls JDBC connections from three different sources.  They may be dedicated resources for specific components – one for your JMS, one for your web app, etc.  This allows you to manage these JDBC sources within a single class-loader in any combination you choose.  If you have different servers for JMS than for your web app, you would define two different connection groups.  If they use the same configuration, and you want to manage them as a logical single group, give them the same name.   This is the key property for management, though – if you don’t define a name (string) for loadBalanceConnectionGroup, you cannot manage the connections.
  • loadBalanceEnableJMX – The ability to manage the connections is exposed when you define a loadBalanceConnectionGroup, but if you want to manage this externally, you will want to enable JMX by setting this property to true.  This enables a JMX implementation which exposes the management and monitoring operations of a connection group.  To expose this externally, you need to start your application with the -Dcom.sun.management.jmxremote JVM flag.  You can then perform connect and perform operations using a JMX client such as jconsole.

Now that you have the properties set, what operations can you perform?  First, there are a few monitoring details exposed:

  • Current active host count
  • Current active physical connection count
  • Current active logical connection count
  • Total logical connections created
  • Total transaction count

The following management operations are also exposed:

  • Add host
  • Remove host

The JMX interface (com.mysql.jdbc.jmx.LoadBalanceConnectionGroupManagerMBean) has the following methods:

int getActiveHostCount(String group);

int getTotalHostCount(String group);

long getTotalLogicalConnectionCount(String group);

long getActiveLogicalConnectionCount(String group);

long getActivePhysicalConnectionCount(String group);

long getTotalPhysicalConnectionCount(String group);

long getTotalTransactionCount(String group);

void removeHost(String group, String host) throws SQLException;

void stopNewConnectionsToHost(String group, String host) throws SQLException;

void addHost(String group, String host, boolean forExisting);

String getActiveHostsList(String group);

String getRegisteredConnectionGroups();

The getRegisteredConnectionGroups() method will return the names of all connection groups defined in that class-loader.

So, what does this look like when running?  Let’s take a look at the long-delayed demo!

Here’s the code for my test application:

public class ToddTest {

private static String URL = "jdbc:mysql:loadbalance://" +
"localhost:3306,localhost:3310/test?" +

public static void main(String[] args) throws Exception {
new Thread(new Repeater()).start();
new Thread(new Repeater()).start();
new Thread(new Repeater()).start();

static Connection getNewConnection() throws SQLException, ClassNotFoundException {
return DriverManager.getConnection(URL, "root", "");

static void executeSimpleTransaction(Connection c, int conn, int trans){
try {
Statement s = c.createStatement();
s.executeQuery("SELECT SLEEP(1) /* Connection: " + conn + ", transaction: " + trans + " */");
} catch (SQLException e) {

public static class Repeater implements Runnable {

public void run() {
for(int i=0; i < 100; i++){ try { Connection c = getNewConnection(); for(int j=0; j < 10; j++){ executeSimpleTransaction(c, i, j); Thread.sleep(Math.round(100 * Math.random())); } c.close(); Thread.sleep(100); } catch (Exception e) { e.printStackTrace(); } } } } }

I start this (with -Dcom.sun.management.jmxremote flag), then start jconsole:

There's the testsuite.ToddTest main class listed there I select that and navigate to the com.mysql.jdbc.jmx.LoadBalanceConnectionGroupManager bean:

Great, let's look at the methods exposed:

Let's get the number of total physical connections established:

So, 27 physical connections have been established.  Now, let's see what the currently-configured hosts are:

So the connections are being balanced against instances running on ports 3306 and 3310.  Now, I've started up an instance on port 3309, and I want to tell Connector/J to start using it.  I do this with the addHost() method:

Adding port 3309 instance

That worked, and here's how I can tell - the queries start showing up in SHOW PROCESSLIST output on port 3309:

PROCESSLIST output on port 3309

Not quite as riveting as a live demo, I'm sure, but the instructions and code necessary to reproduce this quickly on your own local machine are above.  Hopefully this will prove useful to users who want to scale out, maintain and reconfigure Connector/J-driven applications without downtime.

Connector/J load-balancing for auto-commit-enabled deployments

In a previous blog post, I wrote about Connector/J’s failover policies, I described three triggers which cause Connector/J to re-balance (potentially selecting a new physical connection to another host):

  1. At transaction boundaries (transactions are explicitly committed or rolled back)
  2. A communication exception (SQL State starting with “08”) is encountered
  3. When a SQLException matches conditions defined by user, using the extension points defined by the loadBalanceSQLStateFailover, loadBalanceSQLExceptionSubclassFailover or loadBalanceExceptionChecker properties.

Those conditions fit most needs very well, but there are situations where people are running with auto-commit enabled (no explicit transaction commit or rollback), and the end result is that Connector/J never re-balances, and keeps chugging away using the same physical connection.  That’s a problem – particularly when load-balancing is being used under-the-hood to distribute read-only load across multiple slaves.  And as of today, there is new functionality which allows Connector/J to be configured to re-balance after a certain number of statements when auto-commit is enabled.  This functionality is dependent upon the following new properties:

  • loadBalanceAutoCommitStatementThreshold – defines the number of matching statements which will trigger the driver to (potentially) swap physical server connections.  The default value (0) retains the previously-established behavior that connections with auto-commit enabled are never balanced.
  • loadBalanceAutoCommitStatementRegex – the regular expression against which statements must match.  The default value (blank) matches all statements.

So, for example, using the following properties will cause Connector/J to re-balance after every third statement which contains the string “test”:



Why does loadBalanceAutoCommitStatementRegex exist?  Maybe your application uses temporary tables.  Maybe it uses server-side variables.  Maybe it uses some other connection state that wouldn’t be good to lose by letting the driver arbitrarily swap physical connections before you are done with it (this post explores some of these scenarios).  This allows you to identify a trigger statement which is only executed at times where it is known to be safe to swap physical connections.  Generally, you likely won’t need to use it, but it may come in handy if you rely on server session-side state.

Load-balancing for MySQL Cluster

Shortly after I wrote my last post regarding some advanced Connector/J load-balancing properties, Anirudh published a post describing configuration of RHEL LVS for load-balancing and failover of MySQL Cluster SQL nodes.  It’s an interesting post, and I admit I know very little about RHEL LVS, but it reminded me of problems I experienced when trying to set up load-balanced ColdFusion(!) servers at my last job, years back.  We ended up with a nice hardware load-balancer sitting in front of multiple ColdFusion web servers.  The problems we found were that our application depended upon session state, which was stored (of course) on a single web server.  The load-balancer allowed us to define sticky sessions, which is what we did, but it cost us.

We couldn’t really balance load – we could balance session counts, sort of.  Every time a new session started, the balancer would pick which server would handle that session – for the full duration of the session.  Some sessions might be short and little load, while others may be very long and represent a huge amount of load.

We also had a limited HA solution.  We implemented a heartbeat function so that when a web server went offline, the load-balancer would re-route affected users to an available server.  But because the session data was stored on the original server, the user had to log in again and recreate session data.  If the user was in the middle of a complex transaction, too bad.

The above problem also made maintenance a pain.  We could reconfigure the load-balancer on the fly to stop using a specific server for new sessions, but we couldn’t take that web server offline until all of the user sessions on that machine terminated.  That might take 5 minutes, or it might take 5 hours.

As I said, I’m no LVS expert, but I would expect similar problems when using it as a load-balancer for MySQL Cluster.  I suspect that only new connection requests are balanced, making persistent connections (like common Java connection pools) “sticky” to whatever machine the connection was originally assigned.  You probably cannot balance load at anything less than “connection” level, while Connector/J will rebalance after transactions or communications errors.  And anytime you lack the ability to redistribute load except at new connections, taking servers offline for maintenance will be problematic (Connector/J 5.1.13 provides a new mechanism to facilitate interruption-free maintenance, which I intend to blog about later).

This means that it probably works best when using other connectors which don’t support load-balancing, or with applications that don’t use persistent connections, but I wouldn’t use it instead of Connector/J’s load-balancing, and I definitely would not use it with Connector/J’s load-balancing – Connector/J won’t understand that multiple MySQL server instances live behind a single address, and won’t be able to coordinate load-balancing with LVS.

Connector/J’s load-balancing failover policies

Connector/J provides a useful load-balancing implementation for Cluster or multi-master deployments.  As of Connector/J 5.1.12, this same implementation is used under the hood for balancing load between read-only slaves with ReplicationDriver.  When trying to balance workload between multiple servers, though, the driver has to decide when it’s safe to swap servers – doing so in the middle of a transaction would not make applications very happy.  Many of the same principles which apply to autoReconnect also apply here – you don’t want to lose important state information.

As a result, Connector/J will only try to pick a new server when one of the following happen:

  1. At transaction boundaries (transactions are explicitly committed or rolled back)
  2. A communication exception (SQL State starting with “08”) is encountered
  3. When a SQLException matches conditions defined by user, using the extension points defined by the loadBalanceSQLStateFailover, loadBalanceSQLExceptionSubclassFailover or loadBalanceExceptionChecker properties.

The third condition is new, and revolves around three new properties introduced with Connector/J 5.1.13.  It allows you to control which SQLExceptions trigger failover. Let’s examine each of the new properties in detail.


The loadBalanceExceptionChecker property is really the key.  This takes a fully-qualified class name which implements the new com.mysql.jdbc.LoadBalanceExceptionChecker interface.  This interface is very simple, and you only need to implement the following method:

public boolean shouldExceptionTriggerFailover(SQLException ex)

In goes a SQLException, out comes a boolean.  True triggers a failover, false does not.  Easy!

You can use this to implement your own custom logic. An example where this might be useful is when dealing with transient errors with MySQL Cluster, where certain buffers may be overloaded. At the 2010 MySQL Conference, Mark Matthews and I presented a simple example during our tutorial which does this:

public class NdbLoadBalanceExceptionChecker
 extends StandardLoadBalanceExceptionChecker {

 public boolean shouldExceptionTriggerFailover(SQLException ex) {
  return super.shouldExceptionTriggerFailover(ex)
    ||  checkNdbException(ex);

 private boolean checkNdbException(SQLException ex){
 // Have to parse the message since most NDB errors
 // are mapped to the same DEMC, sadly.
  return (ex.getMessage().startsWith("Lock wait timeout exceeded") ||
  (ex.getMessage().startsWith("Got temporary error")
  && ex.getMessage().endsWith("from NDB")));

The code above extends com.mysql.jdbc.StandardLoadBalanceExceptionChecker, which is the default implementation. There’s a few convenient shortcuts built into this, for those who want to have some level of control using properties, without writing Java code. This default implementation uses the two remaining properties: loadBalanceSQLStateFailover and loadBalanceSQLExceptionSubclassFailover.


The loadBalanceSQLStateFailover property allows you to define a comma-delimited list of SQLState code prefixes, against which a SQLException is compared. If the prefix matches, failover is triggered. So, for example, the following would trigger a failover if a given SQLException starts with “00”, or is “12345”:



This property can be used in conjunction with loadBalanceSQLStateFailover or on it’s own. If you want certain subclasses of SQLException to trigger failover, simply provide a comma-delimited list of fully-qualified class or interface names to check against. For example, say you want all SQLTransientConnectionExceptions to trigger failover:


That’s all there is to it!