Bug 1037650

Summary: MySQL server does not restart after unclean reboot
Product: Red Hat Enterprise Linux 6
Reporter: Lon Hohberger <lhh>
Component: mysql
Assignee: Honza Horak <hhorak>
Status: CLOSED CURRENTRELEASE
QA Contact: qe-baseos-daemons
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.5
CC: byte, chris+bugzilla.redhat.com, databases-maint, dpolyakov, jherrman, mark, matthias, milan.kerslager, mkolaja, oe, ondrejj, ovasik, pclegg, sclewis, scott, webdawg, xdmoon
Target Milestone: rc
Keywords: Patch, Regression, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Prior to this update, the MySQL server attempted to verify if the MySQL socket existed, but did not take into consideration whether any processes were using it. As a consequence, if the MySQL server terminated unexpectedly, the MySQL socket was not reset after the reboot, and thus prevented the MySQL server from restarting correctly. With this update, the MySQL server no longer attempts to verify the existence of a MySQL socket when the socket is not being used by any processes. As a result, the MySQL server can now be successfully restarted after a crash occurs.
Story Points: ---
Clone Of:
: 1037748 1037749 1064862
Environment:
Last Closed: 2014-10-22 07:14:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1037748, 1037749, 1058719, 1064862, 1207113
Attachments:
Description Flags
proposed patch that uses fuser call lhh: review+

Description Lon Hohberger 2013-12-03 14:42:06 UTC
Description of problem:

MySQL server does not restart after reboot.


Version-Release number of selected component (if applicable): mysql-5.1.71-1.el6

How reproducible: 100%

Steps to Reproduce:
1. service mysqld start
2. reboot -fn

Actual results: 

Another MySQL daemon already running with the same unix socket.
Starting mysqld:  [FAILED]

Expected results:

Starting mysqld:  [  OK  ]

Additional info:

* MySQL starts correctly with mysql-5.1.69-1.el6_4

* Suspect fix for bug 884651 - the block checking for socket existence should only matter if there's a PID attached to the other end of the socket.  On unclean reboot, this socket will persist and thus cause the failure.

Comment 1 Lon Hohberger 2013-12-03 14:59:41 UTC
Perhaps the simplest thing to do is remove the socket check.  I'm not sure what it gains in this particular case.  If there's no mysqld pids, then the existence of the socket may not be a useful check.

Comment 2 Lon Hohberger 2013-12-03 15:08:24 UTC
Deleting this block:

 	# and some users might prefer to configure logging to syslog.)
 	# Note: set --basedir to prevent probes that might trigger SELinux
 	# alarms, per bug #547485
>	if [ -S "$socketfile" ] ; then
>	    echo "Another MySQL daemon already running with the same unix socket."
>	    action $"Starting $prog: " /bin/false
>	    return 1
>	fi
 	$exec   --datadir="$datadir" --socket="$socketfile" \
 		--pid-file="$mypidfile" \
 		--basedir=/usr --user=mysql >/dev/null 2>&1 &

Allows the daemon to restart on hard reboot or after 'killall -9 mysqld'.  If checking the socket is important, you can see if it is running using something like:

  netstat -lx | grep "$socketfile"

Comment 3 Lon Hohberger 2013-12-03 15:25:25 UTC
The netstat program is provided by net-tools, a dependency of initscripts, so using this command would work fine.

[root@localhost ~]# rpm -qf `which netstat`
net-tools-1.60-110.el6_2.x86_64
[root@localhost ~]# rpm -q --whatrequires net-tools
initscripts-9.03.40-2.el6.x86_64
facter-1.6.6-1.el6_4.x86_64
[root@localhost ~]# rpm -q --whatrequires initscripts | grep mysql-server
mysql-server-5.1.71-1.el6.x86_64

-       if [ -S "$socketfile" ] ; then
+       if netstat -lx | grep -q "$socketfile"; then

[root@localhost ~]# service mysqld start
Starting mysqld:  [  OK  ]
[root@localhost ~]# export socketfile="/var/lib/mysql/mysql.sock"
[root@localhost ~]# if [ -S "$socketfile" ]; then echo Yup; else echo Nope; fi
Yup
[root@localhost ~]# service mysqld stop
Stopping mysqld:  [  OK  ]
[root@localhost ~]# if [ -S "$socketfile" ]; then echo Yup; else echo Nope; fi
Nope
[root@localhost ~]# if netstat -lx | grep -q "$socketfile"; then echo Yup; else echo Nope; fi
Yup
[root@localhost ~]# killall -9 mysqld mysqld_safe
[root@localhost ~]# if [ -S "$socketfile" ]; then echo Yup; else echo Nope; fi
Yup <-- BAD
[root@localhost ~]# if netstat -lx | grep -q "$socketfile"; then echo Yup; else echo Nope; fi
Nope
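
Note that `grep "$socketfile"` matches substrings and treats "." as a wildcard, so another listening socket whose path merely contains the string would also count as a hit. A minimal sketch of an exact-match variant follows. The function names match_socket_path and socket_is_listening are hypothetical (they are not part of the init script), and the sketch assumes the socket path is the last column of `netstat -lx` output and contains no whitespace.

```shell
# Hypothetical helpers, not part of the shipped mysqld init script.

# Read "netstat -lx"-style lines on stdin; succeed only if the last
# column of some line is exactly the given path.
match_socket_path() {
    awk -v path="$1" '$NF == path { found = 1 } END { exit !found }'
}

# Succeed if some process is currently listening on the given unix socket.
# Assumes netstat (net-tools) is installed; if it is missing, the pipeline
# sees no input and the function reports "not listening".
socket_is_listening() {
    netstat -lx 2>/dev/null | match_socket_path "$1"
}
```

Usage would look like `if socket_is_listening "$socketfile"; then echo Yup; else echo Nope; fi`, mirroring the checks in the transcript.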

One could also use 'fuser' instead:

[root@localhost ~]# if [ -S "$socketfile" ]; then echo Yup; else echo Nope; fi
Yup
[root@localhost ~]# if fuser "$socketfile" &> /dev/null; then echo Yup; else echo Nope; fi
Nope

(fuser is required by initscripts as well)

Comment 4 Lon Hohberger 2013-12-03 15:30:35 UTC
Whoops, forgot to add positive unit test result for using fuser (above is negative test; e.g. after killing mysqld_safe & mysqld):

[root@localhost ~]# service mysqld start
Starting mysqld:  [  OK  ]
[root@localhost ~]# if [ -S "$socketfile" ]; then echo Yup; else echo Nope; fi
Yup
[root@localhost ~]# if fuser "$socketfile" &> /dev/null; then echo Yup; else echo Nope; fi
Yup

Comment 5 Honza Horak 2013-12-03 17:43:32 UTC
Lon, thank you for the elaboration. I'd like to use the fuser approach. I've tested it as well and it seems to work fine. Simply removing the check is not an option: it is not only the fix for bug 884651, it also prevents mysqld_safe from removing the socket file of a different process:

/usr/bin/mysqld_safe:598
  rm -f $safe_mysql_unix_port "$pid_file"       # Some extra safety

Simple steps to reproduce this issue are:
1) root> service mysqld start
2) root> killall -9 mysqld_safe mysqld
3) root> service mysqld start

Actual results:
Another MySQL daemon already running with the same unix socket.
Starting mysqld:                                           [FAILED]

Expected results:
Starting mysqld:                                           [  OK  ]
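
The fuser approach described in this comment could be sketched as a replacement for the `[ -S "$socketfile" ]` block in the init script. This is an illustration only, not the text of the attached patch. The names socket_in_use and check_socket_free are hypothetical, and the sketch assumes fuser is installed and exits non-zero when no process is using the file.

```shell
# Illustrative sketch only; not the patch attached to this bug.

# Succeed only if the path is a socket AND some process still holds it.
# fuser exits non-zero when no process is using the file.
socket_in_use() {
    [ -S "$1" ] && fuser "$1" >/dev/null 2>&1
}

# Refuse to start only when a live process owns the socket; a stale
# socket left behind by a crash or "reboot -fn" does not block startup.
check_socket_free() {
    if socket_in_use "$1"; then
        echo "Another MySQL daemon already running with the same unix socket."
        return 1
    fi
    return 0
}
```

In a start function this might be called as `check_socket_free "$socketfile" || return 1` before launching the daemon. The check is meant to run as root, as the init script does; an unprivileged fuser may not see processes owned by other users.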

Comment 6 Honza Horak 2013-12-03 17:48:18 UTC
Created attachment 832221
proposed patch that uses fuser call

Comment 7 Lon Hohberger 2013-12-03 19:38:52 UTC
Tested and approved :]

Comment 8 Lon Hohberger 2013-12-03 19:43:13 UTC
Unit test with patch applied:

[root@localhost init.d]# service mysqld start
Starting mysqld:  [  OK  ]
[root@localhost init.d]# echo $?  
0
[root@localhost init.d]# netstat -lx | grep mysql
unix  2      [ ACC ]     STREAM     LISTENING     882338  /var/lib/mysql/mysql.sock
[root@localhost init.d]# killall -9 mysqld mysqld_safe
[root@localhost init.d]# ls -l /var/lib/mysql/mysql.sock 
srwxrwxrwx. 1 mysql mysql 0 Dec  3 14:38 /var/lib/mysql/mysql.sock
[root@localhost init.d]# service mysqld start
Starting mysqld:  [  OK  ]
[root@localhost init.d]# echo $?
0

All good.

Comment 9 Honza Horak 2013-12-13 08:36:37 UTC
*** Bug 1042644 has been marked as a duplicate of this bug. ***

Comment 10 Jan ONDREJ 2013-12-20 05:50:03 UTC
Same problem still here. Please, make an update. Thank you.

Comment 11 Dima Polyakov 2013-12-20 22:33:21 UTC
Same problem happened here (after an unexpected VM host reboot). Guys, any ETA on this one?
Workaround:
# service mysqld stop
# mv /var/lib/mysql/mysql.sock /var/lib/mysql/mysql.sock.bak
# service mysqld start
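
The workaround above can be wrapped so that the socket is moved only when it is safe to do so. This is a rough sketch under stated assumptions: move_stale_socket is a hypothetical name, the daemon's process name is assumed to be exactly "mysqld", and pgrep (procps) is assumed to be installed.

```shell
# Hypothetical wrapper around the workaround; run as root.
move_stale_socket() {
    sock=$1
    # Nothing to do if there is no socket file at that path.
    [ -S "$sock" ] || return 0
    # Leave the socket alone while a mysqld process is still running.
    if pgrep -x mysqld >/dev/null 2>&1; then
        echo "mysqld is still running; not touching $sock" >&2
        return 1
    fi
    # Move the stale socket aside instead of deleting it.
    mv -- "$sock" "$sock.bak"
}
```

For example: `move_stale_socket /var/lib/mysql/mysql.sock && service mysqld start`.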

Comment 12 Milan Kerslager 2014-01-09 19:08:46 UTC
The fix works here.

Comment 13 Chris Wilson 2014-01-10 11:42:07 UTC
Works here too, and needed. Please RH, can we have this fixed soon?

Comment 14 Ondrej Vasik 2014-01-10 13:42:16 UTC
We appreciate the feedback and look to use reports such as this to guide our efforts at improving our products. That being said, this bug tracking system is not a mechanism for requesting support, and we are not able to guarantee the timeliness.

If this issue is critical or in any way time sensitive, please raise a ticket through your regular Red Hat support channels to make certain it receives the proper attention and prioritization to assure a timely resolution.
 
For information on how to contact the Red Hat production support team, please visit:
https://www.redhat.com/support/process/production/#howto

Comment 15 Chris Wilson 2014-01-10 13:50:53 UTC
@Ondrej Vasik, I am not requesting support, I am hoping to "guide your efforts at improving your products." :)

Comment 16 Ondrej Vasik 2014-01-10 13:56:54 UTC
I know, but you are requesting "fixed soon" ... and we can't guarantee that without support escalation.

Comment 21 Oden Eriksson 2014-02-12 16:07:07 UTC
I fixed this 3 years, 11 months ago in Mandriva, please look here:

http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/mysql/current/SOURCES/mysql-initscript.diff?view=log

Comment 23 Oden Eriksson 2014-02-12 16:12:56 UTC
Oh, I just realized you have to add the logic for the socket, well :)

Comment 24 Honza Horak 2014-02-13 09:50:12 UTC
I'm a bit confused now. It seems the only issue we have right now is handling a socket that nobody uses any more. The solution we're going to use is to check, with the "fuser" utility, whether the socket file is being used by some process. Or do you have some other particular considerations? Because it seems to me you solved different issues with your patches.

Comment 25 Oden Eriksson 2014-02-13 09:53:30 UTC
(In reply to Honza Horak from comment #24)
> I'm a bit confused now. It seems the only issue we have right now is
> handling the socket that nobody uses any more. The solutions we're going to
> use is to check if the socket file is being used by some process using
> "fuser" utility. Or do you have some other particular considerations?
> Because it seems to me you solved different issues with your patches.

Yes, disregard that. The fixes are for stale pid files and other stuff.

Comment 26 webdawg 2014-02-20 09:05:36 UTC
So like.  I get that redhat wants escalation through support and stuff but what the hell.  Install a new version of 6.5 and freaking mysql does not work correctly?  Pretty bad that they took the time to reply and say they will not do anything instead of using that time to just fix it.

I just figured big company, mysql is used in enterprises, big issue.  Nope.

Sorry for the rant.  I see that it is included in the next release...

Can anyone tell me the proper way to apply this patch?

Comment 27 Honza Horak 2014-02-20 11:32:36 UTC
(In reply to webdawg from comment #26)
> Can anyone tell me the proper way to apply this patch?

No need to apply the patch manually, since the fixed package is already delivered. See more at https://bugzilla.redhat.com/show_bug.cgi?id=1058719#c9