Bug 481178 - LVM Incorrect metadata area header checksum after update from UP kernel to SMP
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: anaconda
Version: 4.8
Hardware: i386 Linux
Priority: high, Severity: medium
Target Milestone: beta
Assigned To: Anaconda Maintenance Team
QA Contact: Alexander Todorov
Reported: 2009-01-22 11:23 EST by Jan Tluka
Modified: 2009-05-18 16:16 EDT
Doc Type: Bug Fix
Last Closed: 2009-05-18 16:16:03 EDT
Description Jan Tluka 2009-01-22 11:23:29 EST
Description of problem:
When installing the RHEL4 nightly build RHEL4-U8-re20090115.nightly on the machine ibm-mongoose.rhts.bos.redhat.com, I got the following failure, which resulted in a system panic:

--snip--
Scanning logical volumes

  Reading all physical volumes.  This may take a while...

  Incorrect metadata area header checksum

  Volume group "VolGroup00" inconsistent

  Incorrect metadata area header checksum

  WARNING: Inconsistent metadata found for VG VolGroup00 - updating to use version 3

  Incorrect metadata area header checksum

  Automatic metadata correction failed

ERROR: /bin/lvm exited abnormally! (pid 697)

Activating logical volumes

  Couldn't find device with uuid '7NR7g9-40oo-gIUr-XMRV-nLnr-379e-c2K3Zx'.

  Couldn't find device with uuid 'XrrTeW-UNMQ-BjRw-l4x6-OxXe-rYu0-8IPGH0'.

  Couldn't find device with uuid '7NR7g9-40oo-gIUr-XMRV-nLnr-379e-c2K3Zx'.

  Couldn't find device with uuid 'XrrTeW-UNMQ-BjRw-l4x6-OxXe-rYu0-8IPGH0'.

  LV LogVol00: segment 1 has inconsistent PV area 0

  Couldn't read all logical volumes for volume group VolGroup00.

  Couldn't find device with uuid '7NR7g9-40oo-gIUr-XMRV-nLnr-379e-c2K3Zx'.

  Couldn't find device with Kernel panic - not syncing: Attempted to kill init!

uuid 'XrrTeW-UNM Q-BjRw-l4x6-OxXe
--snip--
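
For triage, the two recurring messages in the excerpt above can be pulled out mechanically. The following is a minimal Python sketch, not part of the original report; the function name summarize_lvm_boot_log and the shortened sample text are illustrative assumptions, while the message strings are copied from the log above.

```python
import re

# Message texts copied from the boot log excerpt in this report.
CHECKSUM_ERROR = "Incorrect metadata area header checksum"
MISSING_PV = re.compile(r"Couldn't find device with uuid '([^']+)'")


def summarize_lvm_boot_log(text):
    """Count checksum errors and collect distinct missing-PV UUIDs in order."""
    checksum_errors = 0
    missing = []
    for line in text.splitlines():
        if CHECKSUM_ERROR in line:
            checksum_errors += 1
        match = MISSING_PV.search(line)
        if match and match.group(1) not in missing:
            missing.append(match.group(1))
    return {"checksum_errors": checksum_errors, "missing_pv_uuids": missing}


# Shortened sample (hypothetical selection of lines from the log above).
sample = """\
  Incorrect metadata area header checksum
  Volume group "VolGroup00" inconsistent
  Couldn't find device with uuid '7NR7g9-40oo-gIUr-XMRV-nLnr-379e-c2K3Zx'.
  Couldn't find device with uuid 'XrrTeW-UNMQ-BjRw-l4x6-OxXe-rYu0-8IPGH0'.
  Couldn't find device with uuid '7NR7g9-40oo-gIUr-XMRV-nLnr-379e-c2K3Zx'.
"""
summary = summarize_lvm_boot_log(sample)
print(summary)
```

The collected UUIDs could then be compared with the PV UUIDs that the LVM tools report on the installed system, to see which physical volumes the boot-time scan failed to find.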

See the full log in the RHTS job: http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=42848

Recipe: http://rhts.redhat.com/cgi-bin/rhts/recipes.cgi?id=147680
This recipe installs the uniprocessor (UP) kernel, updates it to the SMP kernel on a multiprocessor system, and then reboots the system.

Version-Release number of selected component (if applicable):
2.6.9-78.28.ELsmp

How reproducible:
Not sure if this is 100% reproducible. I will check tomorrow.

Steps to Reproduce:
1. Install RHEL4-U8-re20090115.nightly with UP kernel on ibm-mongoose.rhts.bos.redhat.com. 
2. Update to SMP kernel.
3. Reboot.
  
Actual results:
System hangs.

Expected results:
System boots OK.

Additional info:
Comment 1 Milan Broz 2009-01-22 12:05:27 EST
I expect that there was just old metadata with the same VG name which appeared during boot (or anaconda didn't wipe all metadata properly).

See the anaconda log, there are also some errors:
* WARNING: Installing on a USB device.  This may or may not produce a working system.
...
* parted exception: Error: File system has an invalid signature for a FAT file systems.
* parted exception: Error: File system has an invalid signature for a FAT file systems.
* parted exception: Error: Can't have the end before the start!
...

Isn't it possible that anaconda just didn't initialize some device during install, and that this device reappeared during system boot with wrong LVM metadata?

Reassigning to anaconda. If you still think it is a bug in lvm2, please provide a full lvm2 debug log (run the commands with -vvvv) and, if possible, "lvmdump -m" diagnostic data from the machine.
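
The diagnostic request above could be scripted roughly as follows. This is a minimal Python sketch under stated assumptions: the comment does not say which LVM commands to rerun, so the choice of vgscan and vgdisplay is an assumption, and the helper names lvm_debug_commands and run_all are hypothetical. Only the -vvvv option and "lvmdump -m" come from the comment itself. By default the sketch only prints the commands and runs nothing.

```python
import shlex
import subprocess


def lvm_debug_commands(vg_name):
    """Return diagnostic commands as argument lists (nothing is executed here)."""
    return [
        ["vgscan", "-vvvv"],              # assumed command; -vvvv is from the comment
        ["vgdisplay", "-vvvv", vg_name],  # assumed command; VG name from the boot log
        ["lvmdump", "-m"],                # requested in the comment; -m gathers metadata
    ]


def run_all(commands, dry_run=True):
    """Print each command line; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            subprocess.run(argv, check=False)


commands = lvm_debug_commands("VolGroup00")
run_all(commands)  # dry run: prints the three command lines
```

Calling run_all(commands, dry_run=False) as root on the affected machine would execute them; the debug output of the -vvvv commands would still need to be captured to a file for attaching to the bug.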
Comment 3 Joel Andres Granados 2009-01-23 05:22:20 EST
It is entirely possible that anaconda did not initialize the disk. But judging from the parted messages, it might be that the partition table was corrupted and anaconda just ignored the disk.

Did something change in the kernel partitioning code?
Comment 4 Joel Andres Granados 2009-01-27 03:35:25 EST
anaconda and parted were built on Jan 15. This means that the test was done with the RHEL 4.7 anaconda version. Please test with the current nightly and confirm whether the behavior persists.
Comment 5 Joel Andres Granados 2009-01-27 10:14:32 EST
The new anaconda has been modified so that it uses vgreduce before vgremove. Please test with the current anaconda version, anaconda-10.1.1.94-1.

I also think that this bug is a duplicate of bug 481698, but please test and post your findings.
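
The ordering described in comment 5 can be illustrated with a short Python sketch. It is not anaconda's actual code: the thread does not state which options anaconda passes, so the --removemissing flag on vgreduce is an assumption, and the function name stale_vg_cleanup_plan is hypothetical. The sketch only builds and prints the command sequence; it does not run these destructive commands.

```python
def stale_vg_cleanup_plan(vg_name):
    """Return the cleanup commands in order: vgreduce first, then vgremove."""
    return [
        # Assumption: drop references to PVs that can no longer be found,
        # so the volume group becomes consistent enough to remove.
        ["vgreduce", "--removemissing", vg_name],
        # Then remove the (now consistent) stale volume group.
        ["vgremove", vg_name],
    ]


plan = stale_vg_cleanup_plan("VolGroup00")
for argv in plan:
    print(" ".join(argv))
```

The point of the ordering is that a volume group with missing physical volumes may refuse removal until the missing PVs are dropped from its metadata.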
Comment 6 Jan Tluka 2009-01-27 12:02:47 EST
The following job was queued in RHTS:
http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=43499

It will install the RHEL4-U8-re20090126.2 tree. I'm aware that it will fail because of a udev bug, but it should be sufficient for our purpose. Does this tree include the anaconda version you mentioned in comment 5?
Comment 7 Joel Andres Granados 2009-01-27 12:11:28 EST
Jan:

RHEL4-U8-re20090126.2 does not have the latest anaconda. Please wait for today's compose (Jan 27); it has a new LVM fix that could make the difference. For this reason, I will ignore comment #6 and wait for a test that includes anaconda-10.1.1.94-1.
Comment 8 Jan Tluka 2009-01-29 06:25:19 EST
RHTS job running RHEL4-U8-re20090128.1 installation:
http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=43883
Comment 9 Joel Andres Granados 2009-01-30 12:30:31 EST
(In reply to comment #8)
> RHTS job running RHEL4-U8-re20090128.1 installation:
> http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=43883

This link does not show me any relevant info. Do you have the link to the test logs?
Comment 10 Jan Tluka 2009-02-02 05:00:42 EST
Test logs available here:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=6301495
Comment 11 Joel Andres Granados 2009-02-03 06:23:31 EST
We have built a new anaconda. This new version erases stale LVM metadata before doing anything else. Can you please retest with the new nightly? FYI, the anaconda version you need is anaconda-10.1.1.95-1.
Comment 12 Jan Tluka 2009-02-03 13:11:29 EST
Using the nightly from Feb 3, I got these test logs:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=6331359
Comment 13 Joel Andres Granados 2009-02-03 13:32:35 EST
Unless I am misreading the logs, this seems OK now. I see that the job passed. If you see any more misbehavior, please reopen this bug.
Comment 23 errata-xmlrpc 2009-05-18 16:16:03 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0978.html
