Bug 448534

Summary: upgrade to RHEL5.2 - breaks mysql replication between MasterDB and Slave
Product: Red Hat Enterprise Linux 5
Reporter: Chris Stankaitis <cstankaitis>
Component: mysql
Assignee: Tom Lane <tgl>
Status: CLOSED ERRATA
Severity: high
Priority: low
Version: 5.2
CC: byte, charles, hhorak, kvolny, patrickm
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-09-02 09:46:19 UTC

Description Chris Stankaitis 2008-05-27 14:36:03 UTC
Description of problem:

The MasterDB is still running mysql-server-5.0.22-2.1.0.1.x86_64 (RHEL5.1). When I
patch a Slave server to RHEL5.2 and reboot, replication breaks, which has forced
me to redo replication for multiple servers.

Before patching, I both ran "stop slave" and shut down the mysql service on the
slave to ensure it was in a safe state to upgrade.
 
How reproducible:

Always

Steps to Reproduce:
1. stop slave on slave server
2. stop mysqld service on slave server
3. yum update to RHEL5.2
4. reboot to apply the new 5.2 kernel, etc.
5. run show slave status \G
6. Notice Seconds_Behind_Master: NULL
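
The check in steps 5 and 6 can be scripted. Below is a minimal Python sketch that parses the vertical output of "show slave status \G" and reports whether Seconds_Behind_Master is NULL. The function names are hypothetical and the sample text is illustrative only; it is not output captured from the affected server.

```python
def parse_slave_status(text):
    # Turn the "Key: value" lines of the vertical (backslash-G) output into a dict.
    # The "*** 1. row ***" banner has no colon and is skipped.
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        status[key.strip()] = value.strip()
    return status


def replication_stopped(status):
    # Step 6 of the report: a NULL Seconds_Behind_Master means the slave
    # is not currently applying events from the master.
    return status.get("Seconds_Behind_Master") == "NULL"


if __name__ == "__main__":
    # Illustrative sample, not taken from the bug report.
    sample = (
        "*************************** 1. row ***************************\n"
        "   Seconds_Behind_Master: NULL\n"
    )
    print(replication_stopped(parse_slave_status(sample)))
```

In practice the text would come from something like the mysql client run with -e "show slave status\G"; how it is obtained is left out of the sketch.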


Expected results:

The upgrade should not break replication.

Comment 1 Chris Stankaitis 2008-05-27 14:50:56 UTC
MySQL log when I try to "start slave":

080526 15:50:43  mysqld started
080526 15:50:44 [Warning] Can't open and lock time zone table: Table
'mysql.time_zone_leap_second' doesn't exist trying to live without them
080526 15:50:44 [Warning] Neither --relay-log nor --relay-log-index were used;
so replication may break when this MySQL server acts as a slave and has his
hostname changed!! Please use
'--relay-log=/var/run/mysqld/mysqld-relay-bin' to avoid this problem.
080526 15:50:44 [ERROR] Failed to open the relay log
'./dynamic-relay-bin.000406' (relay_log_pos 235)
080526 15:50:44 [ERROR] Could not find target log during relay log initialization
080526 15:50:44 [ERROR] Failed to initialize the master info structure
080526 15:50:44 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.45'  socket: '/var/lib/mysql/mysql.sock'  port: 0  Source distribution
080527 10:45:58 [ERROR] Failed to open the relay log
'./dynamic-relay-bin.000406' (relay_log_pos 235)
080527 10:45:58 [ERROR] Could not find target log during relay log initialization


[root.fmpub.net mysql]# ll
total 60
-rw-rw---- 1 mysql mysql  254 May 26 15:50 dynamic-relay-bin.000406
-rw-rw---- 1 mysql mysql   27 May 26 15:27 dynamic-relay-bin.index
-rw-rw---- 1 mysql mysql   80 May 26 15:27 master.info
drwx------ 2 mysql mysql 4096 Apr 30 20:15 mysql
srwxrwxrwx 1 mysql mysql    0 May 26 15:50 mysql.sock
drwx------ 2 mysql mysql 4096 May  9 22:27 phpads
-rw-rw---- 1 mysql mysql   58 May 26 15:27 relay-log.info


[root.fmpub.net mysql]# mysqlbinlog dynamic-relay-bin.000406 
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#080526 15:27:10 server id 101  end_log_pos 98  Start: binlog v 4, server v
5.0.22 created 080526 15:27:10
# at 98
#691231 19:00:00 server id 1  end_log_pos 0     Rotate to mysql-bin.000006  pos: 4
# at 141
#080526 15:26:59 server id 1  end_log_pos 98    Start: binlog v 4, server v
5.0.45-log created 080526 15:26:59 at startup
ROLLBACK/*!*/;
# at 235
#080526 15:50:41 server id 101  end_log_pos 254         Stop
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;


Comment 2 Tom Lane 2008-05-27 15:03:33 UTC
Hm, did you change the hostname as part of the upgrade?  This looks pretty close to the scenario 
described at

http://dev.mysql.com/doc/refman/5.0/en/replication-howto-additionalslaves.html


Comment 3 Chris Stankaitis 2008-05-27 15:42:33 UTC
There was no hostname change at all. These are not new slaves; they have
been replicating for the better part of 8 months now. There have been no
changes to the box except for the yum update.



Comment 4 Chris Stankaitis 2008-06-22 08:09:46 UTC
This is a well-known MySQL bug that has been fixed in 5.0.54. Please see the
following references:

http://bugs.centos.org/view.php?id=2327
http://arjen-lentz.livejournal.com/115899.html
http://bugs.mysql.com/bug.php?id=28597

"The problem originated upstream at MySQL where the default path for the PID
file was changed from the datadir to /var/run/mysqld. That in itself was a
correct move (again, for LSB compliance), but elsewhere in the code the base
path for the relay log got derived from the path of the PID file, and that's
where the trouble originates.
The change of the PID default path was not really noticed anywhere, as most
distros already put it in /var/run/mysqld through their default my.cnf file. But
since none of them explicitly specifies a relay-log path, it gets put where the
compiled-in defaults tell it to go. Kaboom.
"

Please respond and advise.
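
The mechanism described in the quote above can be sketched in a few lines of Python. This is an illustration only, not MySQL's actual code. The function names are hypothetical, the hostname "dynamic" is an assumption inferred from the dynamic-relay-bin files in comment 1, and the PID file path is an assumption consistent with the --relay-log suggestion printed in that log.

```python
import posixpath


def old_default_relay_base(datadir, hostname):
    # Earlier behaviour: <hostname>-relay-bin inside the data directory.
    return posixpath.join(datadir, hostname + "-relay-bin")


def new_default_relay_base(pid_file):
    # Behaviour described in the quote: the base path is derived from
    # the PID file path, so it follows the PID file into /var/run/mysqld.
    root, _ext = posixpath.splitext(pid_file)
    return root + "-relay-bin"


if __name__ == "__main__":
    # Assumed values, see the paragraph above.
    print(old_default_relay_base("/var/lib/mysql", "dynamic"))
    print(new_default_relay_base("/var/run/mysqld/mysqld.pid"))
```

With those assumed inputs the two functions return different directories, which is why a slave whose relay-log.info still names the old file cannot find it after the upgrade.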

Comment 5 Chris Stankaitis 2008-06-22 08:12:36 UTC
So because of this, every time the box gets rebooted /var/run is rm'ed and all
the relay logs go poof...



Comment 6 Tom Lane 2008-06-22 18:41:42 UTC
Of course, 5.0.54 isn't actually *out* yet --- MySQL's support for the community release seems to have 
gotten mighty weak.

My recommendation is to use the workaround suggested by Arjen.  The log fragment you quote seems to 
be recommending setting these things explicitly anyway for other reasons.  Unfortunately, /etc/my.cnf
is a customizable config file, so I can't overwrite it during a package upgrade.

Comment 7 Chris Stankaitis 2008-06-23 14:06:55 UTC
We have already set explicit paths for the relay-log and related files to
work around this issue on our networks. We hope Red Hat will get a code fix in
for this ASAP in a future erratum for mysql-server.
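
For reference, a minimal sketch of what such explicit settings might look like in /etc/my.cnf on a slave. It is not the reporter's actual configuration. The directory /var/lib/mysql and the dynamic-relay-bin base name are assumptions taken from the directory listing in comment 1; adjust both to match the existing files so the slave keeps finding the relay log named in relay-log.info.

```ini
[mysqld]
# Pin the relay log files to the data directory so their location no
# longer depends on the PID file path and they survive a reboot.
# Paths and base name below are assumptions, see the note above.
relay-log           = /var/lib/mysql/dynamic-relay-bin
relay-log-index     = /var/lib/mysql/dynamic-relay-bin.index
relay-log-info-file = /var/lib/mysql/relay-log.info
```

Putting these under /var/run/mysqld instead would not help here, since comment 5 notes that directory is emptied on reboot.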

Comment 11 Tom Lane 2009-08-19 22:23:21 UTC
*** Bug 518326 has been marked as a duplicate of this bug. ***

Comment 12 errata-xmlrpc 2009-09-02 09:46:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1289.html