| Summary: | pvs command hangs (or takes tens of minutes) when a system has many SAN LUNs - LVM claims paths before multipath has a chance to claim them |
|---|---|
| Product: | Red Hat Enterprise Linux 7 |
| Reporter: | Barry Donahue <bdonahue> |
| Component: | lvm2 |
| Assignee: | LVM and device-mapper development team <lvm-team> |
| lvm2 sub component: | Default / Unclassified |
| QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED DUPLICATE |
| Docs Contact: | |
| Severity: | high |
| Priority: | urgent |
| CC: | agk, bdonahue, bmarzins, dlehman, harald, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, xiaoli, zkabelac |
| Version: | 7.0 |
| Target Milestone: | rc |
| Target Release: | 7.0 |
| Hardware: | x86_64 |
| OS: | Linux |
| Whiteboard: | |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Doc Text: | |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2014-02-20 07:52:10 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Attachments: | |
Description
Barry Donahue
2013-12-13 14:27:54 UTC
I tried this again on snapshot 2 and I still can't install unless I disconnect the SAN storage.

The job you referenced has been deleted and therefore is not accessible. Please reproduce again and leave the job so we can see the logs. Thanks.

I reran this on the 20140110.0 build. It's hung in a loop now. https://beaker.engineering.redhat.com/jobs/575642 I'm also attaching the logs.

Created attachment 850604 [details]
anaconda log
Created attachment 850606 [details]
console log
Created attachment 850608 [details]
storage log
Created attachment 850609 [details]
sys log
Created attachment 850610 [details]
dmesg
Created attachment 850611 [details]
messages
The logs you attached seem to capture a successful installation. What was the problem?

Right now on the console, all I see is "........." This will continue until Beaker times out the install and kills it.

11:47:23 Running pre-installation scripts
11:47:24 Not asking for VNC because of an automated install
11:47:24 Not asking for VNC because text mode was explicitly asked for in kickstart
Starting automated install....................forever

One of your kickstart %post scripts must be hung. The last thing in anaconda.log is this:

16:39:39,928 INFO anaconda: Running kickstart %%post script(s)

Once that finishes, you would see something like this:

16:39:39,928 INFO anaconda: All kickstart %%post script(s) have been run

But the second line above is absent. I see that 'set -x' is in your %post, and that the logging should go to /dev/console. Maybe something there will provide some insight as to what's hanging.

It appears to be in the pre-install phase:

Starting installer, one moment...
anaconda 19.31.42-1 for Red Hat Enterprise Linux 7.0 (pre-release) started.
11:47:23 Running pre-installation scripts
11:47:24 Not asking for VNC because of an automated install
11:47:24 Not asking for VNC because text mode was explicitly asked for in kickstart
Starting automated install....................forever

Complete screwup on my part. Wrong install and wrong logs. Sigh... Off to get the correct info.

The correct job (still running): https://beaker.engineering.redhat.com/jobs/576459 Will add logs.

Created attachment 850668 [details]
anaconda log
Created attachment 850669 [details]
console log
Created attachment 850670 [details]
storage log
Created attachment 850671 [details]
sys log
The last thing in program.log is this:
14:39:22,193 INFO program: Running... lvm pvs --unit=k --nosuffix --nameprefixes --rows --unquoted --noheadings -opv_uuid,pe_start,vg_name,vg_uuid,vg_size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,pv_count --config global {locking_type=4} /dev/sda2
Had this command completed it would be followed by a line like this:
14:39:22,350 DEBUG program: Return code: 0
So it looks like this lvm command is hanging.
OK, I've tried the installation on the machine myself (thanks Barry for the access to the machine!) and I've found this:
The problem is that the devices are multipath members and we fire PV scans on those, even though these should be ignored!
We normally ignore devices (for pvscan) which have
'DM_MULTIPATH_DEVICE_PATH': '1'
...set in udev (this is set by mpath udev rules to indicate that this is a multipath device path).
However, we still fire a scan because I can see:
'SYSTEMD_WANTS': 'lvm2-pvscan@8:16.service'
...that instantiates a service to run a scan on the device. This is because the LVM label was detected as lost compared to the previous state, as indicated by this property recorded in udev:
'LVM_PV_GONE': '1'
This happens if and only if a PV label was previously detected and recorded in the udev db ('ID_FS_TYPE': 'LVM2_member') and now there's something else ('ID_FS_TYPE': 'mpath_member' in this exact case).
I still have to investigate how this happens. It looks like a problem with device claiming - the device should be claimed by mpath as the very first thing on boot if it is an mpath device, to prevent this problem. It looks like an ordering problem when scanning devices/running subsystem services (mpath should be first so any mpath member is properly marked before it's considered by lvm2).
Also, another part of this problem is numerous event processing (for which we already have a patch - bz #1056976 and original and related bz #1049387 - the mpath part built just yesterday and I'll do an lvm2 update as well) - this should also help here...
The extra problem in this report is the LVM_PV_GONE=1 flag set in the udev db, which fires the useless extra pvscans...
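The interplay of these udev properties can be modeled as a small decision function. The following is an illustrative Python sketch of the behavior described in the comments above, not the actual udev rule logic; the function names are hypothetical, only the udev property keys come from the report:

```python
def pv_gone(old_props, new_props):
    """Model of the "PV gone" detection: the flag is set when a device
    that previously carried an LVM2 PV label (ID_FS_TYPE 'LVM2_member')
    now reports something else, e.g. 'mpath_member'."""
    return (old_props.get("ID_FS_TYPE") == "LVM2_member"
            and new_props.get("ID_FS_TYPE") != "LVM2_member")


def should_fire_pvscan(old_props, new_props):
    """Model of when a lvm2-pvscan@.service instance gets queued.

    Multipath device paths (DM_MULTIPATH_DEVICE_PATH == '1') are
    normally ignored, but a pvscan still fires to tell lvmetad that
    the device is no longer a PV when the label transition above is
    detected - the useless extra scan seen in this report."""
    if pv_gone(old_props, new_props):
        return True   # inform lvmetad the PV label disappeared
    if new_props.get("DM_MULTIPATH_DEVICE_PATH") == "1":
        return False  # multipath path components are skipped
    return new_props.get("ID_FS_TYPE") == "LVM2_member"


# The transition observed here: the initramfs first scanned the path as
# a bare PV, then multipath claimed it - so a pvscan still fires.
pre = {"ID_FS_TYPE": "LVM2_member"}
post = {"ID_FS_TYPE": "mpath_member", "DM_MULTIPATH_DEVICE_PATH": "1"}
print(should_fire_pvscan(pre, post))   # True - the extra pvscan
print(should_fire_pvscan(post, post))  # False - path correctly ignored
```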
This is where it goes wrong - it's during the initramfs stage (I've chosen one path as an example - /dev/sddq here). In dracut's "pre-trigger", this is marked as ID_FS_TYPE=LVM2_member (incorrect), while at "switch-root" it's ID_FS_TYPE=mpath_member (correct). This transition causes LVM_PV_GONE=1 to be set, which then fires a pvscan later on.

Though it's odd that the ID_FS_TYPE=LVM2_member is not corrected to ID_FS_TYPE=mpath_member, as "multipath -c /dev/sddq" already shows that it's a valid multipath device path even in the pre-trigger stage (where ID_FS_TYPE has the incorrect value in the udev database). /lib/udev/rules.d/62-multipath.rules should have rewritten this to the correct value.

pre-trigger:/# multipath -c /dev/sddq
/dev/sddq is a valid multipath device path
pre-trigger:/# udevadm info --name=/dev/sddq
P: /devices/pci0000:00/0000:00:09.0/0000:05:00.1/host2/rport-2:0-3/target2:0:1/2:0:1:39/block/sddq
N: sddq
S: disk/by-id/scsi-360a9800044306a31715d436c58423270
S: disk/by-id/wwn-0x60a9800044306a31715d436c58423270
S: disk/by-path/pci-0000:05:00.1-fc-0x500a0983881bc1db-lun-39
E: DEVLINKS=/dev/disk/by-id/scsi-360a9800044306a31715d436c58423270 /dev/disk/by-id/wwn-0x60a9800044306a31715d436c58423270 /dev/disk/by-path/pci-0000:05:00.1-fc-0x500a0983881bc1db-lun-39
E: DEVNAME=/dev/sddq
E: DEVPATH=/devices/pci0000:00/0000:00:09.0/0000:05:00.1/host2/rport-2:0-3/target2:0:1/2:0:1:39/block/sddq
E: DEVTYPE=disk
E: DM_MULTIPATH_TIMESTAMP=1391420704
E: ID_BUS=scsi
E: ID_FS_TYPE=LVM2_member
E: ID_FS_USAGE=raid
E: ID_FS_UUID=3seHM5-sQzk-N1AH-wewl-3zeP-hBfL-mEpzFw
E: ID_FS_UUID_ENC=3seHM5-sQzk-N1AH-wewl-3zeP-hBfL-mEpzFw
E: ID_FS_VERSION=LVM2 001
E: ID_MODEL=LUN
E: ID_MODEL_ENC=LUN\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
E: ID_PATH=pci-0000:05:00.1-fc-0x500a0983881bc1db-lun-39
E: ID_PATH_TAG=pci-0000_05_00_1-fc-0x500a0983881bc1db-lun-39
E: ID_REVISION=820a
E: ID_SCSI=1
E: ID_SCSI_SERIAL=D0j1q]ClXB2p
E: ID_SERIAL=360a9800044306a31715d436c58423270
E: ID_SERIAL_SHORT=60a9800044306a31715d436c58423270
E: ID_TARGET_PORT=1
E: ID_TYPE=disk
E: ID_VENDOR=NETAPP
E: ID_VENDOR_ENC=NETAPP\x20\x20
E: ID_WWN=0x60a9800044306a31
E: ID_WWN_VENDOR_EXTENSION=0x715d436c58423270
E: ID_WWN_WITH_EXTENSION=0x60a9800044306a31715d436c58423270
E: MAJOR=71
E: MINOR=128
E: MPATH_SBIN_PATH=/sbin
E: SUBSYSTEM=block
E: SYSTEMD_READY=1
E: TAGS=:systemd:
E: USEC_INITIALIZED=449294

switch_root:/# udevadm info --name=/dev/sddq
P: /devices/pci0000:00/0000:00:09.0/0000:05:00.1/host2/rport-2:0-3/target2:0:1/2:0:1:39/block/sddq
N: sddq
S: disk/by-id/scsi-360a9800044306a31715d436c58423270
S: disk/by-id/wwn-0x60a9800044306a31715d436c58423270
S: disk/by-path/pci-0000:05:00.1-fc-0x500a0983881bc1db-lun-39
E: DEVLINKS=/dev/disk/by-id/scsi-360a9800044306a31715d436c58423270 /dev/disk/by-id/wwn-0x60a9800044306a31715d436c58423270 /dev/disk/by-path/pci-0000:05:00.1-fc-0x500a0983881bc1db-lun-39
E: DEVNAME=/dev/sddq
E: DEVPATH=/devices/pci0000:00/0000:00:09.0/0000:05:00.1/host2/rport-2:0-3/target2:0:1/2:0:1:39/block/sddq
E: DEVTYPE=disk
E: DM_MULTIPATH_DEVICE_PATH=1
E: DM_MULTIPATH_TIMESTAMP=1391420760
E: DM_MULTIPATH_WIPE_PARTS=1
E: ID_BUS=scsi
E: ID_FS_TYPE=mpath_member
E: ID_FS_USAGE=raid
E: ID_FS_UUID=3seHM5-sQzk-N1AH-wewl-3zeP-hBfL-mEpzFw
E: ID_FS_UUID_ENC=3seHM5-sQzk-N1AH-wewl-3zeP-hBfL-mEpzFw
E: ID_FS_VERSION=LVM2 001
E: ID_MODEL=LUN
E: ID_MODEL_ENC=LUN\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
E: ID_PATH=pci-0000:05:00.1-fc-0x500a0983881bc1db-lun-39
E: ID_PATH_TAG=pci-0000_05_00_1-fc-0x500a0983881bc1db-lun-39
E: ID_REVISION=820a
E: ID_SCSI=1
E: ID_SCSI_SERIAL=D0j1q]ClXB2p
E: ID_SERIAL=360a9800044306a31715d436c58423270
E: ID_SERIAL_SHORT=60a9800044306a31715d436c58423270
E: ID_TARGET_PORT=1
E: ID_TYPE=disk
E: ID_VENDOR=NETAPP
E: ID_VENDOR_ENC=NETAPP\x20\x20
E: ID_WWN=0x60a9800044306a31
E: ID_WWN_VENDOR_EXTENSION=0x715d436c58423270
E: ID_WWN_WITH_EXTENSION=0x60a9800044306a31715d436c58423270
E: LVM_PV_GONE=1
E: MAJOR=71
E: MINOR=128
E: MPATH_SBIN_PATH=/sbin
E: SUBSYSTEM=block
E: SYSTEMD_READY=1
E: TAGS=:systemd:
E: USEC_INITIALIZED=449294

Ben, can you have a look, please - it seems 62-multipath.rules did not detect a valid multipath device path in dracut when it should have? (...and did not set DM_MULTIPATH_DEVICE_PATH=1, and did not rewrite ID_FS_TYPE to the correct value during this early stage?)

(This "E: LVM_PV_GONE=1" is set whenever we change from ID_FS_TYPE="LVM2_member" to anything else - we fire a pvscan to inform lvmetad that the device is not a PV anymore...)

...tried rd.udev.debug and I see tons of these:

...
Feb 03 10:47:27 localhost systemd-udevd[653]: '/sbin/multipath -c /dev/sdag'(err) 'libudev: udev_device_read_db: no db file to read /run/udev/data/b66:0: No such file or directory'
Feb 03 10:47:27 localhost systemd-udevd[591]: '/sbin/multipath -c /dev/sdac'(err) 'libudev: udev_device_new_from_syspath: device 0x1982250 has devpath '/devices/pci0000:00/0000:00:09.0/0000:05:00.1/host2/rport-2:0-3/fc_remote_ports/rport-2:0-3''
Feb 03 10:47:27 localhost systemd-udevd[613]: '/sbin/multipath -c /dev/sdl' [881] exit with return code 1
...

So that's probably the reason why the dev is not marked as a multipath device path in the udev db! Ben, just to make sure - "multipath -c" uses libudev, right? If yes, then I think we can hand this over to udev, as it seems it can't handle so many devices at a time - maybe there are some races inside udev when numerous devices are processed in a short time.

Yes, multipath -c does use udev. It doesn't call udev_device_new_from_syspath() directly in this code path, but it does call udev_device_new_from_subsystem_sysname(), which may build the syspath from the subsystem name and device name inside libudev and call udev_device_new_from_syspath() internally, but I haven't looked.
There's another thing you might want to check, though. In order to keep from doing so much work on change events, the rules only check whether the device is a multipath device if the timestamp has been updated since the last uevent. Otherwise, they just copy the value from the udev database. It's possible that something is going wrong here. You could try removing all of this from 62-multipath.rules:
*******
ACTION!="change", GOTO="update_timestamp"
IMPORT{db}="DM_MULTIPATH_TIMESTAMP"
IMPORT{db}="DM_MULTIPATH_DEVICE_PATH"
IMPORT{db}="DM_MULTIPATH_WIPE_PARTS"
# Check if the device is part of a multipath device. the -T option just keeps
# the old result if the timestamp hasn't changed.
PROGRAM=="$env{MPATH_SBIN_PATH}/multipath -T $env{DM_MULTIPATH_TIMESTAMP}:$env{DM_MULTIPATH_DEVICE_PATH} -c $env{DEVNAME}", \
ENV{DM_MULTIPATH_DEVICE_PATH}="1", ENV{ID_FS_TYPE}="mpath_member", \
GOTO="update_timestamp"
# If the device isn't part of a multipath device, clear this
ENV{DM_MULTIPATH_DEVICE_PATH}=""
ENV{DM_MULTIPATH_WIPE_PARTS}=""
LABEL="update_timestamp"
*******
and change
ACTION=="add", ENV{DM_MULTIPATH_DEVICE_PATH}!="1", \
PROGRAM=="$env{MPATH_SBIN_PATH}/multipath -c $tempnode", \
ENV{DM_MULTIPATH_DEVICE_PATH}="1", ENV{ID_FS_TYPE}="mpath_member"
to
ACTION!="remove", ENV{DM_MULTIPATH_DEVICE_PATH}!="1", \
PROGRAM=="$env{MPATH_SBIN_PATH}/multipath -c $tempnode", \
ENV{DM_MULTIPATH_DEVICE_PATH}="1", ENV{ID_FS_TYPE}="mpath_member"
and then remake the initramfs. This will make multipath always actually do the check. However, given the errors, it seems more likely that this is working fine, and the problem is that udev can't give me a udev device to use.
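The timestamp shortcut that the suggested rule removal disables can be sketched as follows. This is an illustrative Python model of the `multipath -T <timestamp>:<cached> -c <dev>` behavior described above, under the assumption that the real check is an expensive WWID lookup; the function names are hypothetical:

```python
def check_path(dev_timestamp, cached_timestamp, cached_result, real_check):
    """Model of 'multipath -T <ts>:<cached> -c <dev>': if the recorded
    timestamp has not changed since the last uevent, reuse the cached
    answer instead of re-running the expensive multipath path check."""
    if dev_timestamp == cached_timestamp:
        return cached_result   # skip the real check entirely
    return real_check()        # timestamp changed: do the full check


# Failure mode the rule removal is meant to rule out: a stale cached
# "not a path" answer is reused, so the device never gets
# DM_MULTIPATH_DEVICE_PATH=1 / ID_FS_TYPE=mpath_member.
print(check_path(100, 100, False, lambda: True))  # False - stale cache wins
print(check_path(101, 100, False, lambda: True))  # True - re-checked
```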
It seems this is an instance of bug #1054806, where the problem is in the way devices are claimed. Multipath doesn't know whether the device is a multipath path unless there is another device with the same WWID, or the device is directly defined in the multipath configuration (which is of course not the case during installation). This ends up with LVM taking over the device (which later appears to be a multipath path). This problem is just being resolved at the moment (bug #1054806).

In addition to that, I'm also a bit worried about these messages:

Feb 03 10:47:27 localhost systemd-udevd[653]: '/sbin/multipath -c /dev/sdag'(err) 'libudev: udev_device_read_db: no db file to read /run/udev/data/b66:0: No such file or directory'

Since multipath uses libudev to enumerate block devices in the system, I assume this can end up with multipath not seeing a block device and hence concluding that there's no other device in the system with the same WWID. Harald - can you please confirm? Is it possible that the error above causes the block device to be "invisible" to the libudev user?

PROGRAM="" (not ==, btw) runs at rule processing time. So, if a device appears and has no records yet, of course there is no database file for it _yet_:
PROGRAM
Execute a program to determine whether there is a match; the key is true
if the program returns successfully. The device properties are made
available to the executed program in the environment. The program's stdout
is available in the RESULT key.
This can only be used for very short-running foreground tasks.
For details see RUN.
Also, please check that the custom initrd udev rules dracut adds are correct and do not interfere with yours, namely 11-dm.rules and 59-persistent-storage-dm.rules:

http://git.kernel.org/cgit/boot/dracut/dracut.git/tree/modules.d/90dm/11-dm.rules
http://git.kernel.org/cgit/boot/dracut/dracut.git/tree/modules.d/90dm/59-persistent-storage-dm.rules

Peter, I'm pretty sure that these
Feb 03 10:47:27 localhost systemd-udevd[653]: '/sbin/multipath -c /dev/sdag'(err) 'libudev: udev_device_read_db: no db file to read /run/udev/data/b66:0: No such file or directory'
messages aren't a problem. These are most likely coming from this chunk of code in multipath's get_uid() function:
value = udev_device_get_property_value(pp->udev, pp->uid_attribute);
if ((!value || strlen(value) == 0) && conf->dry_run == 2)
value = getenv(pp->uid_attribute);
pp->uid_attribute is "ID_SERIAL" for most devices, and "conf->dry_run == 2" is checking whether we are running "multipath -c". If we are, we just read the value from the udev environment variable if we can't grab it from the udev database. So you will see the error message, but multipath will deal with it.
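The fallback Ben describes can be restated in a short sketch. This is a Python analogue of the quoted C logic, not multipath's actual code; `dry_run=2` corresponds to "multipath -c" mode, and the environment lookup is the fallback that makes the "no db file" error harmless:

```python
import os


def get_uid(udev_props, uid_attribute="ID_SERIAL", dry_run=2):
    """Analogue of the quoted get_uid() chunk: prefer the udev database
    property; if it's missing or empty and we are in 'multipath -c'
    mode (dry_run == 2), fall back to the environment variable that
    udev exports to programs it runs via PROGRAM/RUN."""
    value = udev_props.get(uid_attribute)
    if not value and dry_run == 2:
        value = os.environ.get(uid_attribute)
    return value


# Database entry missing (the 'no db file to read' case), but the
# property is present in the environment udev passes along - so the
# WWID can still be resolved and multipath copes with the error.
os.environ["ID_SERIAL"] = "360a9800044306a31715d436c58423270"
print(get_uid({}))  # falls back to the environment value
```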
OK, thanks for the confirmation. I think we can mark this one as a dup of bug #1054806 then - once the other bug is fixed, the devices should be claimed properly and we should not hit the bug reported here anymore, I believe. *** This bug has been marked as a duplicate of bug 1054806 ***