Live reconfiguration of replication topography in Connector/Java
As noted in a previous post, MySQL Connector/Java supports multi-master replication topographies as of version 5.1.27, allowing you to scale read load to slaves while directing write traffic to multi-master (or replication ring) servers. The new 5.1.28 release builds upon this, allowing live management of replication host (single or multi-master) topographies. This parallels functionality that has long existed for load-balanced connections, and enables users to add or remove hosts – or now promote slaves – for Java applications without requiring an application restart. This post aims to explain how to leverage this functionality (the TL;DR/fun demo is found in the examples section).
ReplicationConnection is the java.sql.Connection implementation used by Connector/Java to handle replication deployments. It’s used automatically when a JDBC URL starting with the prefix “jdbc:mysql:replication://…” is specified. Most people will have no interest in managing the host configuration of individual Connection objects, but it’s possible at this level if you cast the Connection object to com.mysql.jdbc.ReplicationConnection. More frequently, you’ll want to manage hosts in the context of replication connection groups.
A replication connection group is modeled by the ReplicationConnectionGroup class, and represents a logical grouping of connections which can be managed together. There may be one or more ReplicationConnectionGroups in a given Java class loader (you may have an application with two different JDBC resources that need to be managed independently). This key class exposes host management methods for replication connections, and ReplicationConnection objects register themselves with the appropriate ReplicationConnectionGroup if a value for the new replicationConnectionGroup property is specified. The ReplicationConnectionGroup object tracks these connections until they are closed and is used to manipulate the hosts associated with them.
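To make the registration step concrete, here is a minimal sketch of a replication URL that carries the replicationConnectionGroup property. The helper name buildUrl and the group name "exampleGroup" are made up for illustration; only the URL prefix and the property name come from the text above.

```java
class ReplicationUrlSketch {

    /**
     * Builds a replication JDBC URL with one master, one slave, and a
     * connection group name. Hypothetical helper, not part of Connector/J.
     */
    static String buildUrl(String masterHost, String slaveHost, String database, String groupName) {
        return "jdbc:mysql:replication://" + masterHost + "," + slaveHost + "/" + database
                + "?replicationConnectionGroup=" + groupName;
    }

    public static void main(String[] args) {
        // "exampleGroup" is an assumed name; any value groups the connections together.
        String url = buildUrl("localhost:3306", "localhost:3307", "test", "exampleGroup");
        System.out.println(url);
        // With Connector/J on the classpath and both servers running, the URL
        // would be passed to java.sql.DriverManager.getConnection(url, user, password).
    }
}
```

Connections opened with the same group name should then be manageable together through that group.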
Some important methods related to host management include:
- getMasterHosts() – returns a Collection of Strings representing the hosts configured as masters
- getSlaveHosts() – returns a Collection of Strings representing the hosts configured as slaves
- addSlaveHost(String host) – adds new host to pool of possible slave hosts for selection at start of new read-only workload
- promoteSlaveToMaster(String host) – removes the host from the pool of potential slaves for future read-only work (existing read-only work allowed to continue to completion) and adds the host to pool of potential master hosts
- removeSlaveHost(String host, boolean closeGently) – removes the host (String must match exactly!) from the list of configured slaves; if closeGently is false, existing connections which have this host as currently active will be closed hard (application should expect Exceptions)
- removeMasterHost(String host, boolean closeGently) – same as removeSlaveHost(), but removes the host from the list of configured masters
There are also some useful management metrics exposed:
- getConnectionCountWithHostAsSlave(String host) – returns the number of ReplicationConnection objects that have the given host configured as a possible slave
- getConnectionCountWithHostAsMaster(String host) – returns the number of ReplicationConnection objects that have the given host configured as a possible master
- getNumberOfSlavesAdded() – returns the number of times a slave host has been dynamically added to the group pool
- getNumberOfSlavesRemoved() – returns the number of times a slave host has been dynamically removed from the group pool
- getNumberOfSlavePromotions() – returns the number of times a slave host has been promoted to a master
- getTotalConnectionCount() – returns the number of ReplicationConnection objects which have registered with this group
- getActiveConnectionCount() – returns the number of ReplicationConnection objects currently being managed by this group
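The bookkeeping behind these methods can be illustrated with a toy in-memory model. This is not Connector/J's implementation: ToyReplicationGroup is a hypothetical class, it tracks no live connections (so the closeGently flag is omitted), and whether a promotion also counts as a slave removal in the real driver is not something the model tries to answer.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-in for a replication connection group; tracks host names only. */
class ToyReplicationGroup {
    private final Set<String> masters = new LinkedHashSet<>();
    private final Set<String> slaves = new LinkedHashSet<>();
    private int slavesAdded;
    private int slavesRemoved;
    private int slavePromotions;

    ToyReplicationGroup(String initialMaster, String initialSlave) {
        masters.add(initialMaster);
        slaves.add(initialSlave);
    }

    void addSlaveHost(String host) {
        if (slaves.add(host)) {
            slavesAdded++;
        }
    }

    void removeSlaveHost(String host) {
        // The host string must match exactly, as with the real method.
        if (slaves.remove(host)) {
            slavesRemoved++;
        }
    }

    void promoteSlaveToMaster(String host) {
        // Leaves the slave pool and joins the master pool.
        slaves.remove(host);
        masters.add(host);
        slavePromotions++;
    }

    void removeMasterHost(String host) {
        masters.remove(host);
    }

    List<String> getMasterHosts() { return new ArrayList<>(masters); }
    List<String> getSlaveHosts() { return new ArrayList<>(slaves); }
    int getNumberOfSlavesAdded() { return slavesAdded; }
    int getNumberOfSlavesRemoved() { return slavesRemoved; }
    int getNumberOfSlavePromotions() { return slavePromotions; }

    public static void main(String[] args) {
        ToyReplicationGroup group = new ToyReplicationGroup("db1:3306", "db2:3306");
        group.addSlaveHost("db3:3306");
        group.promoteSlaveToMaster("db2:3306");
        group.removeMasterHost("db1:3306");
        System.out.println("masters=" + group.getMasterHosts() + " slaves=" + group.getSlaveHosts());
    }
}
```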
If you want to manage replication host topography programmatically, you’ll probably want to use ReplicationConnectionGroup. So, how do you get one? ReplicationConnectionGroupManager provides this access, along with some utility methods to make your life easier.
The com.mysql.jdbc.ReplicationConnectionGroupManager class provides a number of static methods which can make your programmatic management of host topographies easy. There are two utility methods:
- getConnectionGroup(String groupName) – returns the ReplicationConnectionGroup object matching the name provided
- getGroupsMatching(String group) – returns all ReplicationConnectionGroups if no name is provided, otherwise returns just the matching group
The other methods in ReplicationConnectionGroupManager mirror those for ReplicationConnectionGroup, except that the first argument is a String group name. These methods will operate on all matching ReplicationConnectionGroups – helpful if you are removing a server from service, and want it decommissioned across all possible ReplicationConnectionGroups.
These methods might be useful for in-JVM management of replication hosts – that is, if your application triggers topography changes. You could, for example, write application code which periodically reads in a configuration file and applies any changes to host topography without application restart. More likely, though, you’ll want to be able to manage host configuration from outside the JVM – fortunately, you can do that using JMX.
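The configuration-file idea could start from something like the sketch below, which compares the slave hosts currently configured with those listed in a freshly read configuration and produces the operations to apply. The class and method names are hypothetical, and the result is a plain list of descriptions; in a real application each entry would be replaced by the corresponding add or remove call on ReplicationConnectionGroupManager.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SlaveHostDiffSketch {

    /**
     * Returns the operations needed to move from the current slave hosts to
     * the desired ones: additions first, then removals.
     */
    static List<String> plan(Collection<String> currentSlaves, Collection<String> desiredSlaves) {
        Set<String> toAdd = new LinkedHashSet<>(desiredSlaves);
        toAdd.removeAll(currentSlaves);      // desired but not yet configured

        Set<String> toRemove = new LinkedHashSet<>(currentSlaves);
        toRemove.removeAll(desiredSlaves);   // configured but no longer desired

        List<String> operations = new ArrayList<>();
        for (String host : toAdd) {
            operations.add("addSlaveHost " + host);
        }
        for (String host : toRemove) {
            operations.add("removeSlaveHost " + host);
        }
        return operations;
    }

    public static void main(String[] args) {
        List<String> current = Arrays.asList("localhost:3307");
        List<String> desired = Arrays.asList("localhost:3307", "localhost:3308");
        System.out.println(plan(current, desired));
    }
}
```

Adding before removing keeps at least one slave available while the change is applied, which seems the safer ordering for a live application.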
When Connector/Java is started with replicationEnableJMX=true, a JMX MBean will be registered, allowing manipulation from a JMX client. The MBean interface is defined in com.mysql.jdbc.jmx.ReplicationGroupManagerMBean, and leverages the ReplicationConnectionGroupManager static methods:
public abstract void addSlaveHost(String groupFilter, String host) throws SQLException;
public abstract void removeSlaveHost(String groupFilter, String host) throws SQLException;
public abstract void promoteSlaveToMaster(String groupFilter, String host) throws SQLException;
public abstract void removeMasterHost(String groupFilter, String host) throws SQLException;
public abstract String getMasterHostsList(String group);
public abstract String getSlaveHostsList(String group);
public abstract String getRegisteredConnectionGroups();
public abstract int getActiveMasterHostCount(String group);
public abstract int getActiveSlaveHostCount(String group);
public abstract int getSlavePromotionCount(String group);
public abstract long getTotalLogicalConnectionCount(String group);
public abstract long getActiveLogicalConnectionCount(String group);
A JMX client such as JConsole (part of the standard Java distribution) enables easy host management on running applications:
JMX can also be managed remotely (from a different machine, not just JVM) if you start the application with the -Dcom.sun.management.jmxremote flag.
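JConsole is not the only option: any JMX client can invoke these operations by name. The sketch below shows the invocation pattern using only the JDK. Because it has to run without Connector/J or a MySQL server, it registers a hypothetical in-memory stand-in (HostManager, under the made-up object name example.sketch:type=HostManager) that mimics two of the operations listed above. Against a real application you would use the object name shown in the JConsole tree and an MBeanServerConnection to the target JVM.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

class JmxHostManagementSketch {

    /** Hypothetical management interface mimicking two of the real operations. */
    public interface HostManager {
        void addSlaveHost(String groupFilter, String host);
        String getSlaveHostsList(String group);
    }

    /** In-memory stand-in; it never contacts a MySQL server. */
    static class InMemoryHostManager implements HostManager {
        private final List<String> slaves = new ArrayList<>();

        public synchronized void addSlaveHost(String groupFilter, String host) {
            slaves.add(host);
        }

        public synchronized String getSlaveHostsList(String group) {
            return String.join(",", slaves);
        }
    }

    /** Registers the stand-in, invokes its operations through JMX, returns the slave list. */
    static String demo() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("example.sketch:type=HostManager"); // made-up name
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(new StandardMBean(new InMemoryHostManager(), HostManager.class), name);

            String stringType = String.class.getName();
            // Operations are invoked by name, with arguments and their type names.
            server.invoke(name, "addSlaveHost",
                    new Object[] { "", "localhost:3308" }, new String[] { stringType, stringType });
            String slaves = (String) server.invoke(name, "getSlaveHostsList",
                    new Object[] { "" }, new String[] { stringType });

            server.unregisterMBean(name);
            return slaves;
        } catch (Exception e) {
            throw new IllegalStateException("JMX sketch failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```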
This is all very dry without examples, though, so let’s walk through how this can be done. I’ve written a simple application driver which starts a configurable number of threads, each of which creates a series of connections, then executes a series of transactions against each connection. Transactions alternate between read-only (work distributed to slaves) and read-write (sent to masters), with random delays introduced between each transaction. The goal here is to randomize load and simulate a more realistic production deployment environment. Any SQLException at all causes its stack trace to be dumped, so we can easily see if host management operations trigger problems. The Replication Host Management example code can be found here.
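For readers without the example code at hand, the per-connection loop might look roughly like the following. This is a sketch under assumptions, not the driver application itself: the class, method names and SQL text are invented, and the recordingConnection stub exists only so the loop can be exercised without a server. The part that matters is setReadOnly(), which is what tells a replication connection whether the next transaction may go to a slave.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

class AlternatingWorkloadSketch {

    /** Runs alternating read-only and read-write transactions on one connection. */
    static void runTransactions(Connection conn, int count, long maxDelayMillis)
            throws SQLException, InterruptedException {
        conn.setAutoCommit(false);
        for (int i = 0; i < count; i++) {
            boolean readOnly = (i % 2 == 0);
            conn.setReadOnly(readOnly); // read-only work may go to slaves, read-write to masters
            try (Statement stmt = conn.createStatement()) {
                stmt.execute(readOnly ? "SELECT 1 /* read-only */" : "SELECT 1 /* read-write */");
            }
            conn.commit();
            if (maxDelayMillis > 0) {
                // Random pause between transactions to vary the load.
                Thread.sleep(ThreadLocalRandom.current().nextLong(maxDelayMillis));
            }
        }
    }

    /** Dry-run stub: records executed SQL instead of contacting a server. */
    static Connection recordingConnection(List<String> log) {
        ClassLoader loader = AlternatingWorkloadSketch.class.getClassLoader();
        Statement stmt = (Statement) Proxy.newProxyInstance(loader, new Class<?>[] { Statement.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("execute")) {
                        log.add((String) args[0]);
                        return true;
                    }
                    return null; // close() and other void methods
                });
        return (Connection) Proxy.newProxyInstance(loader, new Class<?>[] { Connection.class },
                (proxy, method, args) -> method.getName().equals("createStatement") ? stmt : null);
    }

    /** Runs the loop against the stub and returns the SQL that would have been sent. */
    static List<String> dryRun(int count) {
        List<String> log = new ArrayList<>();
        try {
            runTransactions(recordingConnection(log), count, 0);
        } catch (SQLException | InterruptedException e) {
            e.printStackTrace(); // mirrors the example's "dump any exception" approach
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(dryRun(4));
    }
}
```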
Before I start, I’ve got three MySQL Server instances running:
The example application connects using the instance on port 3306 as the master, and the instance running on port 3307 as the slave:
private static String REP_URL = "jdbc:mysql:replication://localhost:3306,localhost:3307/test";
This can be observed once the example application is started, by looking at PROCESSLIST output:
Notice that there are no connections established to the instance running on port 3308. Now we’ll start JConsole. From the “New Connection” dialog, I’ll select test.ReplicationHostManagement 5:
Clicking the “MBeans” tab at the top and expanding the com.mysql.jdbc.jmx tree presents the attributes and operations exposed:
Here, we can add “localhost:3308” to the existing slaves:
We should now see traffic directed towards that instance, with no problems noted by the application – and that’s indeed what we see:
We can also promote the existing slave instance on port 3307 to a master:
We can confirm this change by looking at the traffic being routed to port 3307 in PROCESSLIST (the comments regarding “read-only” will indicate some read-write traffic is now being directed towards the 3307 instance):
This can also be confirmed using JConsole to get the list of current master hosts:
We can also take a host out of the pool of available hosts. In this case, I’ll remove the instance running on port 3306 (the original master) using removeMasterHost():
In this case, I didn’t specify which connection group to affect. Chances are good that when you want to take a server offline, you don’t really care which connection group it’s a part of. Confirmation that the server has been removed from the pool can be seen in the PROCESSLIST output, as connections slowly die off as transactions end and load is re-routed to other hosts:
Finally, the example application eventually stops running, with no reported problems:
Success! In this example, we’ve gone from a replication topography with the master on port 3306 and the slave on 3307 to a master on 3307 and a slave on 3308 – and we did it all with zero interruption to the running application.