This reproducer causes corruption of PVs. The last pvcreate command loops forever, probably searching somewhere for an unterminated duplicate PV string.

#!/bin/bash
# uses /dev/sd[bcd]

dd if=/dev/zero of=/dev/sdb bs=1M count=1
dd if=/dev/zero of=/dev/sdc bs=1M count=1
dd if=/dev/zero of=/dev/sdd bs=1M count=1
pvcreate -ff /dev/sd[bcd]

dd if=/dev/zero of=/dev/sdb bs=512 count=1
dd if=/dev/sdb of=/dev/sdc bs=512 count=131072
sync

# Basically it simulates corruption caused by this:
#dmsetup create log --table "0 2 linear /dev/sdb 0"
#dmsetup create mirror --table "0 131072 mirror core 1 1024 2 /dev/sdc 0 /dev/sdd 0"
#sleep 5
#dmsetup remove mirror
#dmsetup remove log

pvcreate -ff /dev/sd[bcd]

------------

[root@saloonio ~]# /reproduce_stupid_hash
+ dd if=/dev/zero of=/dev/sdb bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0117125 s, 89.5 MB/s
+ dd if=/dev/zero of=/dev/sdc bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00682748 s, 154 MB/s
+ dd if=/dev/zero of=/dev/sdd bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00661354 s, 159 MB/s
+ pvcreate -ff /dev/sdb /dev/sdc /dev/sdd
  Physical volume "/dev/sdb" successfully created
  Physical volume "/dev/sdc" successfully created
  Physical volume "/dev/sdd" successfully created
+ dd if=/dev/zero of=/dev/sdb bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.0185469 s, 27.6 kB/s
+ dd if=/dev/sdb of=/dev/sdc bs=512 count=131072
131072+0 records in
131072+0 records out
67108864 bytes (67 MB) copied, 2.88822 s, 23.2 MB/s
+ sync
+ pvcreate -ff /dev/sdb /dev/sdc /dev/sdd
  Found duplicate PV 4Y3U4U6minJFhJkmNgkFpYQgfHYfJCdv: using /dev/sdc not /dev/sdb
  Found duplicate PV h8TIG8TynLe6nfcaqaqrRF8fwldn0LrB: using /dev/sdb not /dev/sdc
  Physical volume "/dev/sdb" successfully created
^C^C/reproduce_stupid_hash: line 23: 18537 Killed pvcreate -ff /dev/sd[bcd]

lvm version lvm2-2.02.39-6.fc10.x86_64 (also upstream cvs snapshot)
The infinite loop is in metadata/metadata.c, in the function _vg_read_orphans, while iterating through vginfo->infos. The iteration never ends, possibly because the vginfo->infos list is corrupted.
This is a simplified version of the reproducer:

#!/bin/bash
dd if=/dev/zero of=/dev/sdb
dd if=/dev/zero of=/dev/sdc
pvcreate -ff /dev/sdb /dev/sdc
dd if=/dev/sdb of=/dev/sdc
pvcreate -ff /dev/sdb /dev/sdc
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
OK, since nobody did a full analysis of this yet, I decided to start to look at it myself today.

The root cause of this problem is a bug in the internal lvmcache code. It maintains an index of devices by pvid. Given a pvid, a hash lookup returns the corresponding struct lvmcache_info, which in turn gives the struct device (and the pvid, which should match the one looked up).

static int _lvmcache_update_pvid(struct lvmcache_info *info, const char *pvid)
{
        if (!strcmp(info->dev->pvid, pvid))
                return 1;
...

When there are two devices with the same pvid, there'll be two info->dev->pvid the same, but the pvid hash only holds one of them. After pvcreate changes the first pvid, you're left without a hash entry for the original pvid, but crucially the second info->dev->pvid still holds it. So when the code should be adding it, the pvids in the strcmp match and the function just returns.

The fix is to check that the info is the one already stored in the hash:

        if (((dm_hash_lookup(_pvid_hash, pvid)) == info) &&
            !strcmp(info->dev->pvid, pvid))
                return 1;

There's also a misleading 'duplicate PV' error message when the cache is updated which should be suppressed - by testing for matching pvids?
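For anyone who wants to see the failure mode in isolation, here is a minimal sketch of it as a toy model. It is not the lvmcache code: struct pvid_index, index_lookup, index_insert, index_remove, update_pvid and second_device_indexed are all hypothetical names, the "hash" is just a small array, and the check_index flag exists only to switch between the old early-return test and the proposed one. The setup also assumes that the duplicate device already carries the shared pvid in its dev->pvid without being indexed, as described in the analysis above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PVID_LEN 32
#define MAX_PVS 8

/* Hypothetical stand-ins for struct device and struct lvmcache_info. */
struct device { char pvid[PVID_LEN + 1]; };
struct cache_info { struct device *dev; };

/* Toy replacement for the pvid hash: at most one entry per pvid. */
struct pvid_index {
    char key[MAX_PVS][PVID_LEN + 1];
    struct cache_info *val[MAX_PVS];
    size_t used;
};

static void copy_pvid(char *dst, const char *src)
{
    strncpy(dst, src, PVID_LEN);
    dst[PVID_LEN] = '\0';
}

static struct cache_info *index_lookup(const struct pvid_index *ix, const char *pvid)
{
    for (size_t i = 0; i < ix->used; i++)
        if (!strcmp(ix->key[i], pvid))
            return ix->val[i];
    return NULL;
}

static void index_insert(struct pvid_index *ix, const char *pvid, struct cache_info *info)
{
    for (size_t i = 0; i < ix->used; i++)
        if (!strcmp(ix->key[i], pvid)) {
            ix->val[i] = info;          /* same key: replace the entry */
            return;
        }
    assert(ix->used < MAX_PVS);
    copy_pvid(ix->key[ix->used], pvid);
    ix->val[ix->used++] = info;
}

static void index_remove(struct pvid_index *ix, const char *pvid)
{
    for (size_t i = 0; i < ix->used; i++)
        if (!strcmp(ix->key[i], pvid)) {
            ix->used--;                 /* move the last entry into the gap */
            copy_pvid(ix->key[i], ix->key[ix->used]);
            ix->val[i] = ix->val[ix->used];
            return;
        }
}

/* check_index == 0: old test (strcmp only); 1: also require the index to hold info. */
static int update_pvid(struct pvid_index *ix, struct cache_info *info,
                       const char *pvid, int check_index)
{
    int same = !strcmp(info->dev->pvid, pvid);

    if (same && (!check_index || index_lookup(ix, pvid) == info))
        return 1;                       /* nothing to do (or so the code believes) */

    if (*info->dev->pvid)
        index_remove(ix, info->dev->pvid);
    copy_pvid(info->dev->pvid, pvid);
    index_insert(ix, pvid, info);
    return 1;
}

/* Returns 1 if the second device ends up indexed under the shared pvid. */
static int second_device_indexed(int check_index)
{
    struct pvid_index ix = { .used = 0 };
    struct device dev_a = { "" }, dev_b = { "" };
    struct cache_info a = { &dev_a }, b = { &dev_b };

    update_pvid(&ix, &a, "X", check_index);  /* index: X -> a */
    copy_pvid(dev_b.pvid, "X");              /* duplicate on b, not in the index */

    update_pvid(&ix, &a, "Y", check_index);  /* "pvcreate" on a: X removed, Y -> a */
    update_pvid(&ix, &b, "X", check_index);  /* b should now be indexed under X */

    return index_lookup(&ix, "X") == &b;
}
```

Calling second_device_indexed(0) models the old test: the last update returns early and the lookup for the shared pvid finds nothing. Calling second_device_indexed(1) models the proposed test, where the second device is re-inserted. The sketch only covers the missing index entry; it does not attempt to reproduce the endless iteration over vginfo->infos.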
The fix has been uploaded to upstream (version 2.02.44).
lvm2-2.02.48-1.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/lvm2-2.02.48-1.fc11
This message is a reminder that Fedora 10 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 10. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 10 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.