Bug 481698
Summary: vgremove failed on s390*

Product: Red Hat Enterprise Linux 4
Component: anaconda
Version: 4.8
Hardware: s390
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Alexander Todorov <atodorov>
Assignee: Anaconda Maintenance Team <anaconda-maint-list>
QA Contact: Alexander Todorov <atodorov>
CC: borgan, jgranado, mbroz
Target Milestone: beta
Target Release: ---
Keywords: Regression, TestBlocker
Doc Type: Bug Fix
Last Closed: 2009-05-18 20:16:07 UTC
Description
Alexander Todorov, 2009-01-27 07:33:19 UTC

Created attachment 330073 [details]: anacdump.txt
Notice the 'unknown device' value in the local variables below:
Local variables in innermost frame:
pvs: ['/dev/dasdb1', '/dev/dasdc1', '/dev/dasdd1', '/dev/dasde1', '/dev/dasdf1', '/dev/dasdg1', '/dev/dasdh1', 'unknown device']
args: ['lvm', 'vgremove', 'VolGroup00']
pv: ('unknown device', 'VolGroup00', '2348810240')
vgname: VolGroup00
rc: 1280
And the leaked file descriptors, from /tmp/lvmout:
File descriptor 3 (/tmp/anaconda.log) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 4 (/tmp/product/.buildstamp) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 5 (socket:[959]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 6 (/proc/cmdline) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 7 (socket:[454]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 8 (socket:[459]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 9 (socket:[460]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 10 (socket:[464]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 11 (/.buildstamp) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 12 (socket:[473]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 13 (socket:[477]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 14 (socket:[478]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 15 (socket:[960]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 16 (socket:[487]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 17 (socket:[961]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 18 (pipe:[964]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 19 (pipe:[964]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 20 (socket:[1240]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 21 (socket:[1229]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 22 (socket:[1230]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 24 (socket:[1231]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 25 (socket:[1241]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 26 (socket:[1242]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 27 (socket:[2037]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 28 (socket:[2038]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 29 (socket:[2039]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 30 (socket:[2044]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 31 (socket:[2045]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 32 (socket:[2046]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 33 (socket:[2050]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 34 (socket:[2051]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
File descriptor 35 (socket:[2052]) leaked on lvm invocation. Parent PID 346: /usr/bin/python
Couldn't find device with uuid 'ShV8tU-PTSa-kOLw-hQmw-k3Fi-FKQ0-A2xPoN'.
Couldn't find device with uuid 'ShV8tU-PTSa-kOLw-hQmw-k3Fi-FKQ0-A2xPoN'.
Couldn't find device with uuid 'ShV8tU-PTSa-kOLw-hQmw-k3Fi-FKQ0-A2xPoN'.
Couldn't find device with uuid 'ShV8tU-PTSa-kOLw-hQmw-k3Fi-FKQ0-A2xPoN'.
Couldn't find device with uuid 'ShV8tU-PTSa-kOLw-hQmw-k3Fi-FKQ0-A2xPoN'.
Couldn't find device with uuid 'ShV8tU-PTSa-kOLw-hQmw-k3Fi-FKQ0-A2xPoN'.
Couldn't find device with uuid 'ShV8tU-PTSa-kOLw-hQmw-k3Fi-FKQ0-A2xPoN'.
Volume group "VolGroup00" not found, is inconsistent or has PVs missing.
Consider vgreduce --removemissing if metadata is inconsistent.
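The "File descriptor N leaked on lvm invocation" warnings above mean the parent python process (anaconda, PID 346) let its open files and sockets be inherited by the `lvm` child. The standard fix pattern is to mark descriptors close-on-exec; this is a hedged sketch of that pattern (illustrative file name, not anaconda's actual code):

```python
import fcntl
import os

def set_cloexec(fd):
    # Read the current descriptor flags and add FD_CLOEXEC so the
    # descriptor is closed automatically in exec'd children like `lvm`.
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

# Illustrative: a log file that should not leak into child processes.
log_fd = os.open("/tmp/example.log", os.O_WRONLY | os.O_CREAT, 0o644)
set_cloexec(log_fd)
```

With FD_CLOEXEC set, the kernel closes the descriptor across execve(), so lvm never sees it and prints no leak warning.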
This bug started happening around the 0120 tree:

- 0120.nightly (anaconda-10.1.1.92): test case passes on s390 but fails on s390x (bug #480793).
- 0121.nightly (anaconda-10.1.1.92): fails on s390 and s390x.
- 0123.nightly (anaconda-10.1.1.91): fails on s390 only.
- 0126.2 tree (anaconda-10.1.1.93): fails on s390 (s390x untested yet).

Something other than anaconda is failing. Did lvm change? For reference:

- anaconda-10.1.1.91 is the version used in RHEL 4.7.
- anaconda-10.1.1.92 was built around Jan 16.
- anaconda-10.1.1.93 is the final anaconda version, with some fixes to partitioning, built on Jan 26.

I think that if the bug is present in all three versions of anaconda, it must be somewhere else. The fix can still be made in anaconda, though. I will compare the RHEL 5 and RHEL 4 code bases and see what I can back-port from RHEL 5.

Found what could be a solution in the RHEL 5 tree: this was previously handled by running an lvm vgreduce before the vgremove. The fix will be present in the next version of anaconda, anaconda-10.1.1.94.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Created attachment 330336 [details]
anacdump.txt with anaconda-10.1.1.94
The traceback now is:
Traceback (most recent call last):
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/gui.py", line 1074, in handleRenderCallback
self.currentWindow.renderCallback()
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/iw/progress_gui.py", line 242, in renderCallback
self.intf.icw.nextClicked()
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/gui.py", line 789, in nextClicked
self.dispatch.gotoNext()
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/dispatch.py", line 171, in gotoNext
self.moveStep()
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/dispatch.py", line 239, in moveStep
rc = apply(func, self.bindArgs(args))
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/packages.py", line 564, in turnOnFilesystems
partitions.doMetaDeletes(diskset)
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/partitions.py", line 1236, in doMetaDeletes
lvm.vgremove(delete.name)
File "/var/tmp/anaconda-10.1.1.94//usr/lib/anaconda/lvm.py", line 200, in vgremove
raise SystemError, "pvremove failed"
SystemError: pvremove failed
Local variables in innermost frame:
vgname: VolGroup00
pv: ('unknown device', 'VolGroup00', '2348810240')
args: ['lvm', 'pvremove', '-ff', '-y', '-v', 'unknown device']
pvs: ['/dev/dasdb1', '/dev/dasdc1', '/dev/dasdd1', '/dev/dasde1', '/dev/dasdf1', '/dev/dasdg1', '/dev/dasdh1', 'unknown device']
pvname: unknown device
rc: 1280
I believe this is another manifestation of the same issue. Please advise if you need another bug report.
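The workaround discussed in the comments is to skip the literal "unknown device" entries rather than passing them to `lvm pvremove`. A hedged sketch of that idea, with illustrative names rather than anaconda's actual code:

```python
def filter_real_pvs(pvs):
    # Drop the placeholder name LVM reports for a PV whose metadata
    # can no longer be found; everything else is a real device path.
    return [pv for pv in pvs if pv != "unknown device"]

# The PV list from the traceback above, with the placeholder present.
pvs = ['/dev/dasdb1', '/dev/dasdc1', '/dev/dasdd1', 'unknown device']
real_pvs = filter_real_pvs(pvs)
# real_pvs now holds only paths that can safely be handed to pvremove.
```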
Alex: Can this be found on other arches, or is it s390-specific?

Joel, the new traceback is seen on i386 and s390 with the latest compose.

The "unknown device" comes from calling `lvm vgdisplay -C --noheadings --units b --separator : --nosuffix --options vg_name,vg_size,vg_extent_size`. Apparently vg_name is left with "unknown device", and it happens with dasda (the only one missing from the device list). How can this happen with lvm, and how can it be avoided through the command line?

(BTW, why parse vgdisplay output when vgs is designed for scripting and much better suited?) Anyway, if you have a VG consisting of several PVs and you lose some of them, the metadata is still read from the remaining PVs (by default every PV contains a full copy of the metadata). So LVM knows there should be a PV with a known UUID but cannot find it; in the output that device is marked as "unknown device". If you just need to remove all devices (to wipe the metadata), you can probably ignore such devices, or call vgreduce --removemissing --force first.

(In reply to comment #12)
> The "unknown_device" comes from calling `lvm vgdisplay -C --noheadings --units
This should be pvdisplay instead of vgdisplay.
> b --separator : --nosuffix --options vg_name,vg_size,vg_extent_size
This should be pv_name,vg_name,pv_size instead of vg_name,vg_size,vg_extent_size. Sorry for the confusion.

Does comment 13 still apply considering my mistake? Also note: I'm fairly sure, looking at the log files, that there is a dasda device in the system; for some reason pvdisplay is not seeing it properly. I guess this is because dasda does not have the correct metadata, so it gets reported as "unknown device". In any case, I don't think skipping "unknown device" when it occurs will cause a major error.

Yes, comment #13 still applies. You can easily simulate this situation (it is not arch-dependent; s390 just often combines many small devices into one VG).
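For illustration, a hedged sketch of parsing the colon-separated pvdisplay output described above (`--options pv_name,vg_name,pv_size`); a missing PV shows up with the literal name "unknown device" in the first field:

```python
def parse_pvdisplay(output):
    # Each line is pv_name:vg_name:pv_size (bytes, no suffix).
    pvs = []
    for line in output.strip().splitlines():
        pv_name, vg_name, pv_size = line.strip().split(":")
        pvs.append((pv_name, vg_name, int(pv_size)))
    return pvs

# Sample output matching the reproduction steps in the comments.
sample = (
    "/dev/sdc:vg_test:209715200\n"
    "/dev/sdd:vg_test:209715200\n"
    "unknown device:vg_test:209715200\n"
)
parsed = parse_pvdisplay(sample)
```

A caller can then test each pv_name against the literal "unknown device" before acting on it.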
Create a VG over several devices:

# pvcreate /dev/sd[bcd]
  Physical volume "/dev/sdb" successfully created
  Physical volume "/dev/sdc" successfully created
  Physical volume "/dev/sdd" successfully created
# vgcreate vg_test /dev/sd[bcd]
  Volume group "vg_test" successfully created
# pvdisplay -C --noheadings --units b --separator : --nosuffix --options pv_name,vg_name,pv_size
  /dev/sdb:vg_test:209715200
  /dev/sdc:vg_test:209715200
  /dev/sdd:vg_test:209715200

Now we destroy the metadata on sdb; sdc and sdd still contain metadata:

# dd if=/dev/zero of=/dev/sdb bs=1M count=1
  1+0 records in
  1+0 records out
  1048576 bytes (1.0 MB) copied, 0.0290812 s, 36.1 MB/s
# pvdisplay -C --noheadings --units b --separator : --nosuffix --options pv_name,vg_name,pv_size
  Couldn't find device with uuid 'LzIgRJ-r8Pc-1AL9-qW6k-6aOo-VlxP-7EB2hR'.
  ...
  Couldn't find device with uuid 'LzIgRJ-r8Pc-1AL9-qW6k-6aOo-VlxP-7EB2hR'.
  /dev/sdc:vg_test:209715200
  /dev/sdd:vg_test:209715200
  unknown device:vg_test:209715200

(This is upstream lvm2 code, but 4.8 should be exactly the same here.)

Skipping "unknown device" for pvremove -ff is probably OK for now. (Just note: when lvm cannot find a device referenced from metadata, it scans more and more aggressively to find it; the last attempt scans all block devices in the system.)

Changing status from ON_QA to FAILS_QA (the issue is still open, see comment #11).

A change went in on Jan 29 for this. Is it still failing with the latest tree?

All nightlies since Feb 1st pass and there's no traceback. Moving to VERIFIED.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0978.html
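The other repair option suggested in the comments is running vgreduce --removemissing --force before retrying the removal. A minimal sketch of invoking it from Python (illustrative wrapper names, not anaconda's actual code):

```python
import subprocess

def vgreduce_args(vgname):
    # --removemissing drops references to PVs whose metadata can no
    # longer be found; --force also removes LVs that spanned them.
    return ["lvm", "vgreduce", "--removemissing", "--force", vgname]

def vgreduce_removemissing(vgname):
    # Runs the repair step and returns the lvm exit status.
    # Requires lvm2 to be installed, so it is not exercised here.
    return subprocess.call(vgreduce_args(vgname))
```

After a successful vgreduce, the VG metadata no longer references the lost PV, so a subsequent vgremove sees a consistent volume group.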