Load-balancing for MySQL Cluster
Shortly after I wrote my last post regarding some advanced Connector/J load-balancing properties, Anirudh published a post describing configuration of RHEL LVS for load-balancing and failover of MySQL Cluster SQL nodes. It’s an interesting post, and I admit I know very little about RHEL LVS, but it reminded me of problems I experienced when trying to set up load-balanced ColdFusion(!) servers at my last job, years back. We ended up with a nice hardware load-balancer sitting in front of multiple ColdFusion web servers. The problem we found was that our application depended upon session state, which was stored (of course) on a single web server. The load-balancer allowed us to define sticky sessions, which is what we did, but it cost us.
We couldn’t really balance load; we could balance session counts, sort of. Every time a new session started, the balancer would pick which server would handle that session for its full duration. Some sessions might be short and generate little load, while others might be very long and represent a huge amount of load.
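To make the difference between balancing session counts and balancing load concrete, here is a minimal sketch. It assumes a simple round-robin assignment and made-up load numbers per session; I don't know what algorithm our hardware balancer actually used, so the class and method names here are purely illustrative.

```java
import java.util.List;

public class StickySessionDemo {
    // Pins each session to a server in round-robin order (an assumption,
    // not the real balancer's algorithm) and returns, per server,
    // {number of sessions, total load of those sessions}.
    static int[][] assignSticky(List<Integer> sessionLoads, int servers) {
        int[][] stats = new int[servers][2];
        for (int i = 0; i < sessionLoads.size(); i++) {
            int server = i % servers;                  // chosen once, never revisited
            stats[server][0]++;                        // session count
            stats[server][1] += sessionLoads.get(i);   // load carried by that session
        }
        return stats;
    }

    public static void main(String[] args) {
        // Hypothetical load units per session: light and heavy sessions alternate.
        List<Integer> loads = List.of(1, 50, 2, 40, 1, 60);
        int[][] stats = assignSticky(loads, 2);
        for (int s = 0; s < stats.length; s++) {
            System.out.println("server " + s + ": sessions=" + stats[s][0]
                    + ", load=" + stats[s][1]);
        }
    }
}
```

Both servers end up with three sessions each, yet one carries 4 units of load and the other 150. The session counts look perfectly balanced while the actual work is not.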
We also had a limited HA solution. We implemented a heartbeat function so that when a web server went offline, the load-balancer would re-route affected users to an available server. But because the session data was stored on the original server, the user had to log in again and recreate session data. If the user was in the middle of a complex transaction, too bad.
The above problem also made maintenance a pain. We could reconfigure the load-balancer on the fly to stop using a specific server for new sessions, but we couldn’t take that web server offline until all of the user sessions on that machine terminated. That might take 5 minutes, or it might take 5 hours.
As I said, I’m no LVS expert, but I would expect similar problems when using it as a load-balancer for MySQL Cluster. I suspect that only new connection requests are balanced, making persistent connections (like common Java connection pools) “sticky” to whatever machine the connection was originally assigned to. You probably cannot balance load at anything finer than the “connection” level, while Connector/J will rebalance after transactions or communication errors. And anytime you lack the ability to redistribute load except at new connections, taking servers offline for maintenance will be problematic (Connector/J 5.1.13 provides a new mechanism to facilitate interruption-free maintenance, which I intend to blog about later).
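For comparison, Connector/J's own load-balancing is enabled by listing every SQL node in a jdbc:mysql:loadbalance:// URL, so the driver knows about each server individually. The sketch below only builds such a URL; the host names (sqlnode1, sqlnode2), the database name, and the helper names are hypothetical, and actually connecting would require the Connector/J driver on the classpath and reachable servers.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.List;

public class LoadBalancedUrl {
    // Builds a Connector/J load-balancing URL that lists every SQL node explicitly.
    static String buildUrl(List<String> hostPorts, String database) {
        return "jdbc:mysql:loadbalance://" + String.join(",", hostPorts) + "/" + database;
    }

    // Opens a connection through the driver. Not called in main(), because it
    // needs the Connector/J jar and live servers to succeed.
    static Connection connect(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) {
        // Hypothetical SQL node addresses and database name.
        String url = buildUrl(List.of("sqlnode1:3306", "sqlnode2:3306"), "test");
        System.out.println(url);
    }
}
```

Because each SQL node appears in the URL, the driver itself can choose a server at transaction boundaries rather than only when the connection is first opened.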
This means that LVS probably works best with other connectors which don’t support load-balancing, or with applications that don’t use persistent connections. I wouldn’t use it instead of Connector/J’s load-balancing, and I definitely would not use it together with Connector/J’s load-balancing: Connector/J won’t understand that multiple MySQL server instances live behind a single address, and won’t be able to coordinate load-balancing with LVS.