Bug 200341 - possible bad interaction between pvmove and mysqld
Summary: possible bad interaction between pvmove and mysqld
Keywords:
Status: CLOSED DUPLICATE of bug 179201
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Milan Broz
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-07-27 00:22 UTC by Mark Tinberg
Modified: 2013-03-01 04:04 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-01-14 14:19:40 UTC
Target Upstream Version:
Embargoed:


Attachments
sysreport from affected system (228.25 KB, application/octet-stream)
2006-07-27 00:24 UTC, Mark Tinberg
vgdisplay verbose output (10.94 KB, text/plain)
2006-07-27 00:24 UTC, Mark Tinberg
mysql configuration for affected database (1.11 KB, text/plain)
2006-07-27 00:26 UTC, Mark Tinberg
mysql error log from affected database (14.40 KB, application/octet-stream)
2006-07-27 00:27 UTC, Mark Tinberg

Description Mark Tinberg 2006-07-27 00:22:29 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/418.8 (KHTML, like Gecko) Safari/419.3

Description of problem:
I apologize in advance for not having sufficient factual data with which to make a proper bug report.

On 7 Jun 06, a db server that I manage experienced massive MySQL InnoDB tablespace corruption at
approximately the same time that a pvmove of its most recently added disk space completed. The
corruption seemed to affect only the most recently added data, from the last few minutes or hours;
data from the previous day and earlier did not seem to be corrupted. I'm not sure whether this was
a MySQL bug, a problem in pvmove, or some kind of cache or buffer inconsistency when the two are
used together or with InnoDB tablespaces that are accessed with the O_DIRECT option. There were no
kernel errors for this time frame. The system has not been rebooted since the incident and continues
to run a new database (with the old data backfilled in) on a different disk without incident.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-34.EL lvm2-2.02.01-1.3.RHEL4 mysql-server-4.1.12-3.RHEL4.1

How reproducible:
Couldn't Reproduce


Steps to Reproduce:
I tried to reproduce the problem on test hardware by running pvmove in a continuous loop between two
disk devices while running mysql benchmarks for several weeks, but was unable to reproduce it. The
test hardware was a Dell PE850 with two SATA disks; unfortunately, I don't have an FC test setup
that would more closely replicate the production config.
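
For anyone who wants to retry this, the pvmove loop described above could be sketched roughly as follows. This is only an illustration, not the script that was actually used: the function names, the dry_run switch and the device paths in the usage comment are all hypothetical, and the only LVM behaviour assumed is the documented "pvmove SourcePV DestinationPV" form.

```python
import subprocess


def pvmove_commands(pv_a, pv_b, rounds):
    """Build argv lists that move extents back and forth between two PVs.

    pv_a and pv_b are placeholders for the two physical volumes under test.
    """
    commands = []
    src, dst = pv_a, pv_b
    for _ in range(rounds):
        commands.append(["pvmove", src, dst])
        # Swap direction so the next round moves the data back again.
        src, dst = dst, src
    return commands


def run_loop(pv_a, pv_b, rounds, dry_run=True):
    """Print the commands, or run them as root when dry_run is False."""
    for argv in pvmove_commands(pv_a, pv_b, rounds):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops the loop at the first failing pvmove.
            subprocess.run(argv, check=True)


# Hypothetical usage: run_loop("/dev/sdb1", "/dev/sdc1", rounds=4)
```

The database benchmark would run separately against a filesystem on the volume group being moved; the dry-run default is there so the commands can be reviewed before anything touches real disks.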

Actual Results:


Expected Results:


Additional info:
System: Dell PE1850 x86_64 w/ 2x 2.8GHz Xeon w/HT, 4GB RAM
HBA: 2x QLogic QLA2312 w/ Optical SFP
SAN: Infortrend A16F-R1211+JBOD & Infortrend A16F-R2221

Comment 1 Mark Tinberg 2006-07-27 00:24:07 UTC
Created attachment 133120
sysreport from affected system

Comment 2 Mark Tinberg 2006-07-27 00:24:52 UTC
Created attachment 133121
vgdisplay verbose output

Comment 3 Mark Tinberg 2006-07-27 00:26:01 UTC
Created attachment 133122
mysql configuration for affected database

this config includes the force_recovery option added during db recovery

Comment 4 Mark Tinberg 2006-07-27 00:27:24 UTC
Created attachment 133123
mysql error log from affected database

this has sensitive and redundant data removed

Comment 5 Milan Broz 2008-01-14 14:19:40 UTC
Well, there is not much information about the physical volumes used during the
pvmove operation, but I guess that this could be another manifestation of bug
179201 (without a kernel oops, only with corruption of a temporary in-memory
structure).

From the log I see there are many PVs used in the usagelog VG with the mysqlold
and mysql LVs, so there are multi-segment LVs, and exactly this triggers the
kernel bug during the pvmove operation.
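
As a rough aid for checking whether a volume group has such LVs before a pvmove, the output of "lvs --noheadings -o lv_name,seg_count <vg>" can be filtered for segment counts above one. The sketch below is an assumption-laden illustration, not part of the fix: the function name is made up, and the LV names and counts in the sample string are invented rather than taken from the attached logs.

```python
def multisegment_lvs(lvs_output):
    """Return names of LVs whose reported segment count is greater than 1.

    Expects text in the shape printed by
    `lvs --noheadings -o lv_name,seg_count <vg>`: one LV per line,
    name first, segment count second.
    """
    names = []
    for line in lvs_output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not "name count".
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        name, seg_count = fields
        if int(seg_count) > 1:
            names.append(name)
    return names


# Invented sample output; lv_a, lv_b and lv_c are hypothetical LV names.
sample = "  lv_a   5\n  lv_b   3\n  lv_c   1\n"
flagged = multisegment_lvs(sample)
```

Here flagged would hold the two LVs with more than one segment; an LV with a single segment is left out.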

Closing as a duplicate of the recently fixed bug.


*** This bug has been marked as a duplicate of 179201 ***

