Description of problem:
We have a clustered enterprise MySQL service that runs on a Fibre Channel SAN. We recently had a SAN outage during which the MySQL filesystem went read-only for a time. After the SAN outage, we could not restart MySQL because it was hitting the "MYSQL_timeout" from the /usr/share/cluster/mysql.sh file. This timeout seems to be set to 30 seconds, and MySQL was not able to start up in that time, which caused the service to restart constantly. I manually changed MYSQL_timeout to 900 seconds (which tracks the default from MySQL), and then I was able to start the service up fine.

Version-Release number of selected component (if applicable):
~> rpm -qf /usr/share/cluster/mysql.sh
rgmanager-2.0.31-1.el5
~> uname -a
Linux rtmdb1 2.6.18-53.1.4.el5 #1 SMP Wed Nov 14 10:37:27 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
~> cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.1 (Tikanga)
~> rpm -qa | grep -i mysql
MySQL-shared-compat-enterprise-gpl-5.0.56sp1-0.rhel5
MySQL-server-enterprise-gpl-5.0.56sp1-0.rhel5
MySQL-devel-enterprise-gpl-5.0.56sp1-0.rhel5
MySQL-client-enterprise-gpl-5.0.56sp1-0.rhel5

How reproducible:
Sometimes

Steps to Reproduce:
1. Crash a MySQL database and the MySQL service
2. Try to re-enable the service
3.

Actual results:
The service could not be enabled with the 30-second timeout; I had to increase that timeout value in /usr/share/cluster/mysql.sh. There appeared to be no way to configure the startup timeout in cluster.conf.

Expected results:
The service would gracefully re-enable itself.

Additional info:
BTW, here's the 'rm' section of my cluster.conf for this cluster:

    <rm log_facility="local4" log_level="10">
      <failoverdomains>
        <failoverdomain name="rtmdb_failover" ordered="0" restricted="0">
          <failoverdomainnode name="rtmdb1" priority="1"/>
          <failoverdomainnode name="rtmdb2" priority="1"/>
        </failoverdomain>
      </failoverdomains>
      <resources/>
      <service autostart="1" domain="rtmdb_failover" exclusive="0" name="rtmdb" recovery="restart">
        <ip address="10.43.36.139" monitor_link="1">
          <fs device="/dev/mapper/sanvg-vol1" force_fsck="0" force_unmount="1" fsid="6845" fstype="ext3" mountpoint="/dbr" name="/dbr" options="defaults" self_fence="1">
            <mysql config_file="/etc/my.cnf" listen_address="10.43.36.139" name="rtm-mysql" shutdown_wait="180"/>
          </fs>
        </ip>
      </service>
    </rm>
Created attachment 329977
Adds startup_wait option

This patch adds support for startup_wait to the mysql resource agent. After applying this patch, it will be possible to configure the startup timeout directly from cluster.conf.
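For illustration, the mysql resource from the reporter's cluster.conf might look roughly like this once the patch is applied. This is only a sketch: the attribute name startup_wait comes from the attachment title, while the value of 900 and the assumption that it is given in seconds (like shutdown_wait) are taken from the reporter's manual workaround, not from the patch itself.

```xml
<!-- Sketch only: startup_wait="900" mirrors the reporter's manual
     MYSQL_timeout workaround; the unit (seconds) is an assumption. -->
<mysql config_file="/etc/my.cnf" listen_address="10.43.36.139"
       name="rtm-mysql" shutdown_wait="180" startup_wait="900"/>
```

All other attributes are unchanged from the configuration posted above.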
Rock on, Marek! The test patch works for me in our test lab. Is this likely to be made generally available? If so, when?
I believe it will be in the next update, so 5.4.
~~ Attention Partners RHEL 5.4 Partner Alpha Released! ~~ RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner! If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!
~~ Attention - RHEL 5.4 Beta Released! ~~ RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner! If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value. Questions can be posted to this bug or your customer or partner representative.
~~ Attention Partners - RHEL 5.4 Snapshot 1 Released! ~~ RHEL 5.4 Snapshot 1 has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please notice that there should be a fix available now that addresses this particular request. Please test and report back your results here, at your earliest convenience. The RHEL 5.4 exception freeze is quickly approaching. If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.
~~ Attention Partners - RHEL 5.4 Snapshot 5 Released! ~~ RHEL 5.4 Snapshot 5 is the FINAL snapshot to be released before RC. It has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please note that there should be a fix available now that addresses this particular issue. Please test and report back your results here, at your earliest convenience. If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. If it is urgent, escalate the issue to your partner manager as soon as possible. There is /very/ little time left to get additional code into 5.4 before GA. Partners, after you have verified, do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1339.html