MySQL High Availability with Oracle Clusterware

MySQL has an extensive range of high-availability solutions to suit many different use cases and deployment needs. The list ranges from the time-tested – yet continuously improved – MySQL replication to the just-released MySQL Fabric, giving users many certified options for highly available MySQL deployments. The list is growing yet again, with Oracle Clusterware adding support for MySQL.

Oracle’s Clusterware product is the foundation for Oracle RAC, and has been battle-tested for high availability support for Oracle Database, as well as other Oracle applications. This technology is now available as part of the MySQL Enterprise subscription, and – like all Oracle commercial products – is freely available for evaluation purposes. This post explains the Oracle Clusterware architecture and its benefits to MySQL users, and will be followed by a later post focusing on how to deploy Clusterware agents with MySQL.

A very flexible architecture gives Oracle Clusterware the ability to support various consistency mechanisms. The initial release of the Clusterware agent for MySQL uses a shared resource approach, where essential resources – such as the data directory – are deployed on a shared disk. A similar strategy is employed in other high-availability solutions (OVM High Availability Template for MySQL, Oracle Solaris Clustering, MySQL with Windows Cluster Failover). Clusterware doesn’t dictate a specific shared resource implementation – anything from a simple NFS mount to a high-performance SAN may be used. The recommended and tested solution uses the Oracle ACFS filesystem. As with other shared-disk high availability solutions for MySQL, an Oracle Clusterware-based solution requires that only one MySQL instance use the shared MySQL data directory at any one time.

While no high availability solution for MySQL is truly transparent, the Clusterware system provides useful infrastructure to minimize downtime. The agent performs periodic health checks of the running MySQL Server using mysqladmin, and applications connect through a managed virtual IP address, which directs application traffic to a failover host without requiring configuration changes at the application layer. Failover time is bounded by the interval of agent health checks (every second by default) plus the time required to start the MySQL Server on the failover host (including any necessary crash recovery processing).
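To make the health check and the failover bound concrete, here is a minimal Python sketch. It is not how the Clusterware agent is implemented; the function names, the host name and the timing figures are all hypothetical. The only thing it relies on from MySQL itself is that `mysqladmin ping` exits with status 0 when the server is reachable and a non-zero status when it is not.

```python
import subprocess
from typing import Callable, List


def ping_command(host: str, timeout_s: int = 1) -> List[str]:
    """Build a `mysqladmin ping` command line (hypothetical helper)."""
    return ["mysqladmin", f"--host={host}", f"--connect-timeout={timeout_s}", "ping"]


def run_command(cmd: List[str]) -> int:
    """Run a command quietly and return its exit status."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def is_healthy(host: str, runner: Callable[[List[str]], int] = run_command) -> bool:
    """True if `mysqladmin ping` reports the server as running (exit status 0)."""
    return runner(ping_command(host)) == 0


def worst_case_failover_s(check_interval_s: float, startup_s: float,
                          recovery_s: float = 0.0) -> float:
    """Upper bound described above: one health-check interval to notice the
    failure, plus server startup and any crash recovery on the failover host."""
    return check_interval_s + startup_s + recovery_s
```

As an illustration with assumed numbers, a 1-second check interval, 5 seconds of server startup and 30 seconds of InnoDB crash recovery give worst_case_failover_s(1.0, 5.0, 30.0), a bound of 36 seconds. The runner parameter exists only so the check can be exercised without a live server.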

A big thanks goes out to the Oracle Clusterware team who did the heavy lifting in adding MySQL support!

 

 

4 thoughts on “MySQL High Availability with Oracle Clusterware”

    1. Hi James,

      This is an alternative HA solution to traditional MySQL replication. Because it uses shared storage, it cannot guard against component failures in the storage subsystem, while replication uses non-shared storage. Replication failover can be nearly instantaneous, in that failover can happen as quickly as the applications can be re-routed to use a slave. Typical replication can have lossy failover, though – there’s no guarantee the slave is up-to-date with data on the master (and if the master is unavailable, getting any residual pending data could be problematic). This can be mitigated with semi-synchronous replication, to some extent – but that can also introduce latency in normal operations. Failover using this Clusterware solution requires InnoDB crash recovery, which can take time on heavily-loaded machines, but will avoid data loss.

      As with all HA solutions, it’s a matter of picking and choosing which attributes are most important for your deployment needs. Finding a balance between performance of normal operations, speed of recovery in a failover situation, acceptable data loss, and resilience to component failure can be tough.

  1. Are there any figures on speed/performance, please? As far as I can imagine, if there are a lot of MySQL instances, writes would be quite slow.

    Thanks in advance

    1. Hi James,

      Unfortunately, I don’t have performance figures available. Remember that Clusterware has only one MySQL instance running at a given time, so there’s not really overhead there. The Clusterware solution (currently) relies on shared storage, and I would expect that to be the most notable performance change from a standard deployment. Obviously, using NFS for shared storage is likely to be significantly slower than using ACFS – you’ll probably want to benchmark performance for your deployment and use case (read-heavy workloads may be minimally impacted, while write-heavy workloads may see more measurable impacts).

      I hope that helps! If you need more information, I’m sure Oracle staff (support or sales) can dig deeper into this (I’ve recently left Oracle).
