Description of problem:
TUI install of RHEV-H 7.0 on a multipath device succeeds (the device can be identified with multipath -ll), but after rebooting into RHEV-H the physical volumes are created on a single path (/dev/sd*). As a result, RHEV-H itself is not on a multipath device and cannot fail over, which defeats the purpose of multipath.

Version-Release number of selected component (if applicable):
rhevh-7.0-20150114.0.el7ev.iso
ovirt-node-3.2.1-4.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. TUI install rhevh on multipath device lun 360a9800050334c33424b32542d43497a
   Note: multipath -ll can list this lun
   # multipath -ll
   Jan 15 12:29:22 | multipath.conf +7, invalid keyword: getuid_callout
   360a9800050334c33424b32542d43497a dm-3 NETAPP ,LUN
   size=20G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
   `-+- policy='service-time 0' prio=2 status=active
     |- 7:0:0:0 sdd 8:48 active ready running
     `- 7:0:1:0 sdb 8:16 active ready running
   360a9800050334c33424b32542d45446e dm-4 NETAPP ,LUN
   size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
   `-+- policy='service-time 0' prio=2 status=active
     |- 7:0:0:1 sde 8:64 active ready running
     `- 7:0:1:1 sdc 8:32 active ready running
2. After installation, log in to rhevh
3. Press F2 to get a shell
4. # pvs
   Found duplicate PV 9Ge9Zy7t2d2Uk1v5Tb5w6IF9nqtUwUwT: using /dev/sdd4 not /dev/sdb4
     PV         VG     Fmt  Attr PSize  PFree
     /dev/sdd4  HostVG lvm2 a--  11.71g 6.80g
   Note: here the PV is created on /dev/sd*
5. # multipath -ll
   Jan 15 11:18:58 | multipath.conf +7, invalid keyword: getuid_callout
   360a9800050334c33424b32542d45446e dm-2 NETAPP ,LUN
   size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
   `-+- policy='service-time 0' prio=2 status=active
     |- 7:0:0:1 sde 8:64 active ready running
     `- 7:0:1:1 sdc 8:32 active ready running
6.
# lsblk --nodeps -o name,serial
NAME  SERIAL
sda   S1W3J9BS604547
sdb   60a9800050334c33424b32542d43497a
sdc   60a9800050334c33424b32542d45446e
sdd   60a9800050334c33424b32542d43497a
sde   60a9800050334c33424b32542d45446e
sr0   M0094645757
sr1   110052081500
loop0
loop1
loop2

Actual results:
Physical volumes are created on a single path (/dev/sd*) after the RHEV-H installation reboots, so failover is not possible.

Expected results:
After TUI installation, the PV should still be on the multipath device so it can fail over.

Additional info:
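The duplicate serials in the lsblk output above are exactly what multipath keys on: any serial that appears on more than one /dev/sd* node is a LUN reachable over several paths, which multipathd should claim. A minimal sketch of that check follows; the input is hard-coded from this report's output rather than taken from a live `lsblk --nodeps -o name,serial` call:

```shell
# Find serials that appear on more than one /dev/sd* node, i.e. LUNs with
# multiple paths. Sample data from this bug report is used for illustration.
lsblk_out='sdb 60a9800050334c33424b32542d43497a
sdc 60a9800050334c33424b32542d45446e
sdd 60a9800050334c33424b32542d43497a
sde 60a9800050334c33424b32542d45446e'
# Print each serial, sort, and keep only the duplicated ones.
printf '%s\n' "$lsblk_out" | awk '{print $2}' | sort | uniq -d
```

Both NETAPP LUN serials show up twice, confirming that multipath should own these devices rather than LVM scanning the raw paths.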
Created attachment 980485 [details] varlog
Created attachment 980487 [details] sosreport
The command line from the logs:

BOOT_IMAGE=/vmlinuz0 root=live:LABEL=Root ro rootfstype=auto rootflags=ro ksdevice=bootif rd.dm=0 rd.md=0 crashkernel=256M lang= max_loop=256 rd.live.check quiet elevator=deadline rhgb rd.luks=0 rd.live.image mpath.wwid=360a9800050334c33424b32542d43497a

It seems that the serial given in mpath.wwid does not appear in the lsblk output after the installation.
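One quick way to cross-check this is to pull the WWID out of the kernel command line and derive the device-mapper node that multipath should have created for the boot LUN. A sketch under the assumption that this check is run in a shell; the command line is hard-coded from this report instead of read from /proc/cmdline:

```shell
# Extract the mpath.wwid= value from the kernel command line and build the
# /dev/mapper path the assembled boot LUN should appear under.
cmdline='BOOT_IMAGE=/vmlinuz0 root=live:LABEL=Root rd.live.image mpath.wwid=360a9800050334c33424b32542d43497a'
wwid=$(printf '%s\n' "$cmdline" | tr ' ' '\n' | sed -n 's/^mpath\.wwid=//p')
echo "/dev/mapper/$wwid"
# On a correctly assembled host this node exists:
#   test -b "/dev/mapper/$wwid"
```

If `test -b` fails for that node while `lsblk` still shows the raw /dev/sd* paths, the multipath map was never assembled, which matches the symptom above.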
Ben, looking at the logs from comment 2, it looks like mpath does not assemble the device in initramfs, but tries to do it in userspace (which fails, because the system has already booted off one of the paths).

Some excerpt from the initramfs part:
Jan 15 11:03:13 localhost kernel: qla4xxx 0000:04:01.3: Do not have CHAP table cache
Jan 15 11:03:13 localhost kernel: scsi 7:0:1:0: Direct-Access NETAPP LUN 7320 PQ: 0 ANSI: 4
Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] 41852928 512-byte logical blocks: (21.4 GB/19.9 GiB)
Jan 15 11:03:13 localhost kernel: scsi 7:0:1:1: Direct-Access NETAPP LUN 7320 PQ: 0 ANSI: 4
Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] Write Protect is off
Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] 62781440 512-byte logical blocks: (32.1 GB/29.9 GiB)
Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] Write Protect is off
Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jan 15 11:03:13 localhost kernel: sdc: unknown partition table
Jan 15 11:03:13 localhost kernel: sdb: sdb1 sdb2 sdb3 sdb4
Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] Attached SCSI disk
Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] Attached SCSI disk
Jan 15 11:03:13 localhost kernel: qla4xxx 0000:04:01.3: qla4xxx_get_fwddb_entry: DDB[0] MB0 4000 Tot 2 Next 1 State 0007 ConnErr 00000000 10.66.90.115 :3260 "iqn.1992-08.com.netapp:sn.135053389"
More information from the host:

[root@hp-z800-02 admin]# multipath -ll
Jan 15 13:34:57 | multipath.conf +5, invalid keyword: getuid_callout
Jan 15 13:34:57 | multipath.conf +18, invalid keyword: getuid_callout
Jan 15 13:34:57 | multipath.conf +37, invalid keyword: getuid_callout
360a9800050334c33424b32542d45446e dm-2 NETAPP ,LUN
size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:1 sde 8:64 active ready running
  `- 7:0:1:1 sdc 8:32 active ready running
[root@hp-z800-02 admin]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz0 root=live:LABEL=Root ro rootfstype=auto rootflags=ro ksdevice=bootif rd.dm=0 rd.md=0 crashkernel=256M lang= max_loop=256 rd.live.check quiet elevator=deadline rhgb rd.luks=0 rd.live.image mpath.wwid=360a9800050334c33424b32542d43497a
[root@hp-z800-02 admin]# lsblk -o name,serial
NAME SERIAL
…
sdb 60a9800050334c33424b32542d43497a
|-sdb1
|-sdb2
|-sdb3
`-sdb4
…
sdd 60a9800050334c33424b32542d43497a
|-sdd1
|-sdd2
|-sdd3
`-sdd4
  |-HostVG-Swap
  |-HostVG-Config
  |-HostVG-Logging
  `-HostVG-Data
sde 60a9800050334c33424b32542d45446e
`-360a9800050334c33424b32542d45446e
…
[root@hp-z800-02 admin]# blkid -L Root
/dev/sdb3

Considering the information above, it looks like the multipath device used as the boot device (60a…97a) cannot be assembled.
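The `blkid -L Root` result above (/dev/sdb3) already tells the story: the Root filesystem sits on a raw SCSI path rather than on a device-mapper node. A hedged sketch of that classification; the device name is hard-coded from this report instead of taken from a live `blkid -L Root` call:

```shell
# Classify where the Root filesystem lives: a /dev/mapper/* or /dev/dm-*
# node means multipath claimed the boot LUN (expected); a raw /dev/sd*
# node means the host booted off a single path, as seen in this report.
root_dev=/dev/sdb3
case "$root_dev" in
  /dev/mapper/*|/dev/dm-*) state="multipath" ;;
  /dev/sd*)                state="single-path" ;;
  *)                       state="unknown" ;;
esac
echo "Root is on a $state device"
```

On a healthy install the boot device would classify as multipath, matching the expected results in the description.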
Tested this bug on rhev-hypervisor6-6.6-20150114.0; the bug does not exist on the RHEV-H el6.6 build. So this is a RHEV-H 7.0-only issue.
Test version:
rhev-hypervisor7-7.0-20140115.dontuse.iso
ovirt-node-3.2.1-4.el7.noarch
device-mapper-multipath-0.4.9-66.el7.x86_64

Test steps and results:
1. TUI install rhevh on multipath device lun 360a9800050334c33424b32542d43497a
2. After installation, log in to rhevh
3. Press F2 to get a shell
4. # pvs
     PV                                             VG     Fmt  Attr PSize  PFree
     /dev/mapper/360a9800050334c33424b32542d43497a4 HostVG lvm2 a--  11.71g 400.00m
5. # multipath -ll
   Jan 16 12:01:26 | multipath.conf +5, invalid keyword: getuid_callout
   Jan 16 12:01:26 | multipath.conf +18, invalid keyword: getuid_callout
   Jan 16 12:01:26 | multipath.conf +37, invalid keyword: getuid_callout
   360a9800050334c33424b32542d43497a dm-1 NETAPP ,LUN
   size=20G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
   `-+- policy='service-time 0' prio=2 status=active
     |- 7:0:0:0 sdd 8:48 active ready running
     `- 7:0:1:0 sdb 8:16 active ready running
   360a9800050334c33424b32542d45446e dm-0 NETAPP ,LUN
   size=30G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
   `-+- policy='service-time 0' prio=2 status=active
     |- 7:0:0:1 sde 8:64 active ready running
     `- 7:0:1:1 sdc 8:32 active ready running
6.
# lsblk -o name,serial
NAME SERIAL
sda S1W3J9BS604547
├─sda1
├─sda2
├─sda3
└─sda4
sdb 60a9800050334c33424b32542d43497a
└─360a9800050334c33424b32542d43497a
  ├─360a9800050334c33424b32542d43497a1
  ├─360a9800050334c33424b32542d43497a2
  ├─360a9800050334c33424b32542d43497a3
  └─360a9800050334c33424b32542d43497a4
    ├─HostVG-Swap
    ├─HostVG-Config
    ├─HostVG-Logging
    └─HostVG-Data
sdc 60a9800050334c33424b32542d45446e
└─360a9800050334c33424b32542d45446e
sdd 60a9800050334c33424b32542d43497a
└─360a9800050334c33424b32542d43497a
  ├─360a9800050334c33424b32542d43497a1
  ├─360a9800050334c33424b32542d43497a2
  ├─360a9800050334c33424b32542d43497a3
  └─360a9800050334c33424b32542d43497a4
    ├─HostVG-Swap
    ├─HostVG-Config
    ├─HostVG-Logging
    └─HostVG-Data
sde 60a9800050334c33424b32542d45446e
└─360a9800050334c33424b32542d45446e
sr0 M0094645757
loop0
loop1
├─live-rw
└─live-base
loop2
└─live-rw

The multipath installation behavior is now correct, so consider the issue fixed in rhev-hypervisor7-7.0-20140115.dontuse.iso.
The issue here was that the initrd wasn't updated.
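Since the root cause was a stale initrd, the fix amounts to regenerating the initramfs so that dracut's multipath module is present at boot and the map is assembled before LVM scans the disks. The sketch below only prints the commands it would run rather than executing them; the kernel version and image path are illustrative assumptions, not values from this report (RHEV-H builds manage their initrd through the image build tooling):

```shell
# Print the dracut invocation that would rebuild the initramfs with the
# multipath module included. Echoed for illustration; nothing is modified.
kver=3.10.0-123.el7.x86_64   # example kernel version, an assumption
img="/boot/initramfs-${kver}.img"
echo "dracut --force --add multipath $img $kver"
echo "lsinitrd $img | grep -i multipath   # verify the module is present"
```

After a rebuild like this, the initramfs assembles the boot LUN under /dev/mapper before the root filesystem is mounted, which is the behavior verified in the later comments.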
*** Bug 1182048 has been marked as a duplicate of this bug. ***
*** Bug 1182516 has been marked as a duplicate of this bug. ***
According to comment 10, this issue did not exist on the RHEV-H 6.6 build for RHEV 3.5, so this bug was verified only on the el7 build.

[root@hp-z600-03 admin]# cat /etc/system-release
Red Hat Enterprise Virtualization Hypervisor release 7.0 (20150119.0.1.el7ev)
[root@hp-z600-03 admin]# rpm -q ovirt-node
ovirt-node-3.2.1-5.el7.noarch
[root@hp-z600-03 admin]# multipath -ll
Jan 20 07:35:39 | multipath.conf +5, invalid keyword: getuid_callout
Jan 20 07:35:39 | multipath.conf +18, invalid keyword: getuid_callout
Jan 20 07:35:39 | multipath.conf +37, invalid keyword: getuid_callout
35000c5001d5b2973 dm-15 SEAGATE ,ST3146356SS
size=137G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 6:0:0:0 sda 8:0 active ready running
360a9800050334c33424b334166784f55 dm-0 NETAPP ,LUN
size=19G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:1 sdh 8:112 active ready running
  `- 0:0:1:1 sdc 8:32 active ready running
360a9800050334c33424b334163434546 dm-3 NETAPP ,LUN
size=25G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:0 sdg 8:96 active ready running
  `- 0:0:1:0 sdb 8:16 active ready running
360a9800050334c33424b334167714852 dm-1 NETAPP ,LUN
size=1021M features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:2 sdi 8:128 active ready running
  `- 0:0:1:2 sdd 8:48 active ready running
360a9800050334c33424b334167742f70 dm-2 NETAPP ,LUN
size=2.0G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:3 sdj 8:144 active ready running
  `- 0:0:1:3 sde 8:64 active ready running
360a9800050334c33424b334167756648 dm-4 NETAPP ,LUN
size=3.0G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:4 sdk 8:160 active ready running
  `- 0:0:1:4 sdf 8:80 active ready running
[root@hp-z600-03 admin]# pvs
  PV                                              VG     Fmt  Attr PSize  PFree
  /dev/mapper/360a9800050334c33424b334163434546p4 HostVG lvm2 a--  16.60g 11.95g
[root@hp-z600-03 admin]# lsblk -o name,serial
NAME SERIAL
sda 5000c5001d5b2973
├─sda1
├─sda2
├─sda3
└─sda4
sdb 60a9800050334c33424b334163434546
└─360a9800050334c33424b334163434546
  ├─360a9800050334c33424b334163434546p1
  ├─360a9800050334c33424b334163434546p2
  ├─360a9800050334c33424b334163434546p3
  └─360a9800050334c33424b334163434546p4
    ├─HostVG-Swap
    ├─HostVG-Config
    ├─HostVG-Logging
    └─HostVG-Data
sdc 60a9800050334c33424b334166784f55
└─360a9800050334c33424b334166784f55
sdd 60a9800050334c33424b334167714852
└─360a9800050334c33424b334167714852
sde 60a9800050334c33424b334167742f70
└─360a9800050334c33424b334167742f70
sdf 60a9800050334c33424b334167756648
└─360a9800050334c33424b334167756648
sdg 60a9800050334c33424b334163434546
└─360a9800050334c33424b334163434546
  ├─360a9800050334c33424b334163434546p1
  ├─360a9800050334c33424b334163434546p2
  ├─360a9800050334c33424b334163434546p3
  └─360a9800050334c33424b334163434546p4
    ├─HostVG-Swap
    ├─HostVG-Config
    ├─HostVG-Logging
    └─HostVG-Data
sdh 60a9800050334c33424b334166784f55
└─360a9800050334c33424b334166784f55
sdi 60a9800050334c33424b334167714852
└─360a9800050334c33424b334167714852
sdj 60a9800050334c33424b334167742f70
└─360a9800050334c33424b334167742f70
sdk 60a9800050334c33424b334167756648
└─360a9800050334c33424b334167756648
sr0 005CD005080
sr1 110052081500
loop0
loop1
├─live-rw
└─live-base
loop2
└─live-rw

$ md5sum rhev-hypervisor7-7.0-20150119.0.1.iso
ed8647f757fc1199acfb3e9c673369b4  rhev-hypervisor7-7.0-20150119.0.1.iso

This bug is now fixed: after the RHEV-H TUI installation, the PV is created on the multipath device.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2015-0160.html