Debugging Communications Link Failure exceptions in Connector/J
Have you seen an error message similar to the following?
Communications link failure – Last packet sent to the server was X ms ago.
Generally speaking, this error suggests that the network connection has been closed. There can be several root causes:
- Firewalls or routers may drop idle connections (the MySQL client/server protocol does not send periodic pings to keep them active).
- The MySQL Server may be closing idle connections that exceed the wait_timeout or interactive_timeout threshold.
A couple of diagnostic details can help here. For starters, when a recent (5.1.13) version of Connector/J is used, you should see additional details about both the last packet sent and the last packet received. Older versions may simply indicate the last time a packet was sent to the server, which is frequently zero ms ago. That’s not terribly useful: you may have just sent a packet, but not have received one from the server for 12 hours. Knowing how long it has been since Connector/J last received a packet from the server is useful information, so if you are not seeing this in your exception message, update your driver.
The second useful diagnostic detail shows up when Connector/J notices that the time a packet was last sent/received exceeds the wait_timeout or interactive_timeout threshold. It will attempt to notify you of this in the exception message.
The following can be helpful in avoiding such problems, but ultimately network connections can be volatile:
- Ensure connections are validated when checked out of the connection pool (use a validation query that starts *exactly* with “/* ping */” so that Connector/J executes a lightweight ping instead of a full query).
- Minimize the time a Connection object is left idle while other application logic is executed.
- Explicitly validate a Connection before using it after it has been left idle for an extended period of time.
- Ensure wait_timeout and interactive_timeout are set sufficiently high.
- Ensure tcpKeepAlive is enabled.
- Ensure that any configurable firewall or router timeout setting accounts for maximum expected idle connection time.
- Make sure that you are not setting socketTimeout, or that it is set to a sufficiently high value to avoid socket timeouts.
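The validation advice in the list above can be sketched in a few lines of plain JDBC. This is only an illustration, not how any particular pool does it: the class and method names (IdleConnectionCheck, isPingQuery, isAlive) are hypothetical, and the only thing taken from Connector/J is the “/* ping */” prefix, which has to match exactly.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of Connector/J or of any connection pool.
final class IdleConnectionCheck {
    // The statement must begin with this marker exactly to be handled as a
    // lightweight ping; a leading space or different spacing makes it a
    // normal query.
    static final String PING_MARKER = "/* ping */";
    static final String VALIDATION_QUERY = PING_MARKER + " SELECT 1";

    private IdleConnectionCheck() {}

    /** True only if the SQL text starts with the ping marker, character for character. */
    static boolean isPingQuery(String sql) {
        return sql != null && sql.startsWith(PING_MARKER);
    }

    /** Returns true if the connection answers the ping; false if it is null or the ping fails. */
    static boolean isAlive(Connection conn) {
        if (conn == null) {
            return false;
        }
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(VALIDATION_QUERY);
            return true;
        } catch (SQLException e) {
            // A closed or broken connection ends up here; the caller should
            // discard it and obtain a fresh one.
            return false;
        }
    }
}
```

With a connection pool, the same VALIDATION_QUERY string would normally go into the pool's validation query setting, so that the check runs on checkout. Outside a pool, calling isAlive(conn) before reusing a Connection that has been idle for a while serves the same purpose; the standard JDBC 4 method Connection.isValid(int) is an alternative. The two driver settings from the list are connection properties and can be set on the JDBC URL, for example jdbc:mysql://db.example.com:3306/app?tcpKeepAlive=true&socketTimeout=0 (host and schema are made up here; a socketTimeout of 0 means no timeout).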
I’ve seen exception messages which indicate Connections being used after sitting idle for hours, sometimes days. If you do this, make sure that you explicitly test the connection before using it after lengthy idle periods. Network connections fail, and applications need to be prepared to handle that. But expecting connections to survive extended idle periods and work magically when used again hours later is just asking for trouble.
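One way an application can be prepared for a failed connection is to recognize the failure and retry the work once on a fresh connection. The sketch below is an assumption-laden illustration rather than a recommended implementation: LinkFailureRetry, ConnectionSource and SqlWork are hypothetical names, and it relies on connection failures carrying an SQLState in class “08” (Connector/J reports communications link failures as 08S01). Retrying like this is only safe when the work is idempotent and not part of a transaction that was already under way.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper; all names here are illustrative only.
final class LinkFailureRetry {
    /** Supplies a connection, for example by borrowing one from a pool. */
    interface ConnectionSource {
        Connection get() throws SQLException;
    }

    /** A unit of database work that can safely be run a second time. */
    interface SqlWork<T> {
        T run(Connection conn) throws SQLException;
    }

    private LinkFailureRetry() {}

    /** SQLState class "08" denotes a connection exception. */
    static boolean isConnectionFailure(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("08");
    }

    /** Runs the work; on a connection failure, retries exactly once on a new connection. */
    static <T> T runWithOneRetry(ConnectionSource source, SqlWork<T> work)
            throws SQLException {
        try (Connection conn = source.get()) {
            return work.run(conn);
        } catch (SQLException first) {
            if (!isConnectionFailure(first)) {
                throw first; // not a link problem, so do not mask it
            }
        }
        // Second and final attempt; any failure now propagates to the caller.
        try (Connection conn = source.get()) {
            return work.run(conn);
        }
    }
}
```

A caller might write something like runWithOneRetry(dataSource::getConnection, conn -> loadReport(conn)), where dataSource and loadReport stand in for whatever the application already has.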