Bug 190623

Summary: MySQL slave doesn't read data from master on kernel 2107
Product: [Fedora] Fedora
Reporter: William Shubert <wms>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 5
CC: pfrields, wtogami
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: 2111
Doc Type: Bug Fix
Last Closed: 2006-05-07 00:10:59 UTC

Description William Shubert 2006-05-04 01:21:02 UTC
Description of problem:
I run a slave MySQL-server-4.1.18, from mysql.com, with statically linked glibc.
Under kernel-smp 2096, it works fine. Under kernel-smp 2107, the slave does not
work; "netstat" shows that a socket to the master MySQL server was created and
that data is waiting on it, but MySQL never reads that data.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.16-1.2107_FC5

How reproducible:
Every time (I made many attempts to get MySQL working).

Steps to Reproduce:
1. Install kernel 2107.
2. Start mysql as a slave database.
  
Actual results:
Data waits to be read by the slave ("netstat --tcp" shows inbound data not being
read). Slave database is not kept up to date. Connecting to database and
examining database status indicates that the slave is waiting for data from the
master.
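
The symptom above (inbound data queued on an established socket but never
read) can also be checked without netstat. The following is a minimal sketch,
assuming a Linux host whose /proc/net/tcp has the usual layout (fifth column is
tx_queue:rx_queue in hex, addresses in little-endian hex) and IPv4 only. The
function names are hypothetical and chosen for illustration.

```python
import socket
import struct


def decode_addr(hex_addr):
    """Decode 'HEXIP:HEXPORT' as found in /proc/net/tcp.

    Assumption: IPv4 on a little-endian host (e.g. x86), where the
    address is printed as a byte-swapped 32-bit hex value.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def sockets_with_unread_data(proc_net_tcp_text):
    """Return (local, remote, rx_bytes) for ESTABLISHED sockets whose
    receive queue is non-empty, i.e. data the application has not read."""
    results = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        state = fields[3]
        if state != "01":  # 01 == TCP_ESTABLISHED
            continue
        _tx_hex, rx_hex = fields[4].split(":")
        rx_bytes = int(rx_hex, 16)
        if rx_bytes > 0:
            results.append(
                (decode_addr(fields[1]), decode_addr(fields[2]), rx_bytes)
            )
    return results


def report(path="/proc/net/tcp"):
    """Print every established socket that has unread inbound data."""
    with open(path) as f:
        for local, remote, rx_bytes in sockets_with_unread_data(f.read()):
            print("%s:%d <- %s:%d  %d bytes unread" % (local + remote + (rx_bytes,)))
```

Calling report() a few seconds apart and seeing the same socket with a
receive queue that never shrinks would match the behaviour described here.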

Expected results:
Slave should read data on inbound network socket.

Additional info:
Switched back to kernel 2096 and everything went back to normal. I see bug 190543,
which looks similar, except the applications having trouble are nfs and ypbind.

As an added note, the socket to the master doesn't go directly over the network;
it actually goes through an SSL tunnel first, so the connection is:
  Master database --> (over internet) --> xinetd-spawned SSL tunnel --> slave database

This may or may not be significant, I'm not sure.

Comment 1 Dave Jones 2006-05-04 03:37:59 UTC
Does booting with pci=nomsi make this go away?

Comment 2 Dave Jones 2006-05-04 04:13:07 UTC
Actually, that option is also broken in this kernel (sigh).

2108, available from http://people.redhat.com/davej/kernels/Fedora/FC5/
has this disabled by default. Give that a try?


Comment 3 William Shubert 2006-05-04 05:06:40 UTC
I did try that option. I rebooted three times; twice with pci=nomsi, once
without. Surprisingly, one time with pci=nomsi, mysql *did* start pulling in
data, and in fact worked fine! The other two times it was stuck and didn't pull
in any data, as before. Maybe it was a coincidence; it is also possible that I
accidentally ran the wrong kernel the time it worked, though I was pretty careful.

I'll try 2108 another day; it is getting late and there's work I have to do.

Another point: Connecting to MySQL via named pipe works fine every time.
Connecting via Java, which always uses network sockets, fails when 2107 is
running; the connection is made, but the server never answers messages sent.

Wouldn't pci=nomsi disable an option on the PCI bus? It would surprise me if
that helps: the data is already sitting in the network buffer, so I don't see
how PCI bus properties could stop it from getting to the application. Still, if
you think it is worthwhile, I'll try 2108 once I have time to do so.