Bug 1182551 - incorrect multipath device behavior after rhevh 3.5 TUI installation
Summary: incorrect multipath device behavior after rhevh 3.5 TUI installation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Fabian Deutsch
QA Contact: Virtualization Bugs
URL:
Whiteboard: node
Depends On:
Blocks: rhev35rcblocker rhev35gablocker
Reported: 2015-01-15 12:36 UTC by Ying Cui
Modified: 2016-02-10 20:09 UTC

Fixed In Version: ovirt-node-3.2.1-5.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-11 21:08:15 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments
varlog (190.97 KB, application/x-gzip), 2015-01-15 12:38 UTC, Ying Cui
sosreport (5.93 MB, application/x-xz), 2015-01-15 12:40 UTC, Ying Cui


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2015:0160 0 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update 2015-02-12 01:34:52 UTC
oVirt gerrit 36971 0 master MERGED Add product info in %post, but inside the chroot Never
oVirt gerrit 36987 0 master MERGED Don't glob the initramfs, use explicit kernel versions Never
oVirt gerrit 37099 0 ovirt-3.5 MERGED Don't glob the initramfs, use explicit kernel versions Never

Description Ying Cui 2015-01-15 12:36:14 UTC
Description of problem:
The TUI installation of rhevh 7.0 succeeds on a multipath device (the device can be identified with multipath -ll). After the installation and a reboot into rhevh, however, the physical volume is created on a single path, /dev/sd*.
As a result, rhevh itself is not on a multipath device and cannot fail over, which defeats the purpose of multipath.

Version-Release number of selected component (if applicable):
rhevh-7.0-20150114.0.el7ev.iso
ovirt-node-3.2.1-4.el7.noarch


How reproducible:
100%

Steps to Reproduce:
1. TUI install rhevh on multipath device lun 360a9800050334c33424b32542d43497a
Note: multipath -ll can list this lun
# multipath -ll
Jan 15 12:29:22 | multipath.conf +7, invalid keyword: getuid_callout
360a9800050334c33424b32542d43497a dm-3 NETAPP  ,LUN             
size=20G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:0 sdd 8:48 active ready running
  `- 7:0:1:0 sdb 8:16 active ready running
360a9800050334c33424b32542d45446e dm-4 NETAPP  ,LUN             
size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:1 sde 8:64 active ready running
  `- 7:0:1:1 sdc 8:32 active ready running


2. After installation, log in to rhevh
3. F2 to shell
4. # pvs
  Found duplicate PV 9Ge9Zy7t2d2Uk1v5Tb5w6IF9nqtUwUwT: using /dev/sdd4 not /dev/sdb4
  PV         VG     Fmt  Attr PSize  PFree
  /dev/sdd4  HostVG lvm2 a--  11.71g 6.80g

Note: here the PV is created on /dev/sd*

5. # multipath -ll
Jan 15 11:18:58 | multipath.conf +7, invalid keyword: getuid_callout
360a9800050334c33424b32542d45446e dm-2 NETAPP  ,LUN             
size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:1 sde 8:64 active ready running
  `- 7:0:1:1 sdc 8:32 active ready running

6. #  lsblk --nodeps -o name,serial
NAME  SERIAL
sda   S1W3J9BS604547
sdb   60a9800050334c33424b32542d43497a
sdc   60a9800050334c33424b32542d45446e
sdd   60a9800050334c33424b32542d43497a
sde   60a9800050334c33424b32542d45446e
sr0   M0094645757
sr1   110052081500
loop0 
loop1 
loop2 

Actual results:
The physical volume is created on a single path, /dev/sd*, after rhevh is installed and restarted. It cannot fail over.


Expected results:
After TUI installation, the PV should still be on the multipath device and able to fail over.
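
The expected result can be checked from the device column of the pvs output. The following is a minimal sketch and not part of the original report: the function names are hypothetical, and it assumes the plain-text pvs layout shown in step 4.

```python
import re


def pv_devices(pvs_output):
    """Return the PV device paths listed in the text output of `pvs`."""
    devices = []
    for line in pvs_output.splitlines():
        fields = line.split()
        # Data rows start with a device path; the header row and the
        # "Found duplicate PV" warning do not.
        if fields and fields[0].startswith("/dev/"):
            devices.append(fields[0])
    return devices


def is_single_path(device):
    """True for plain SCSI disk nodes such as /dev/sdd4."""
    return re.fullmatch(r"/dev/sd[a-z]+\d*", device) is not None


# Output copied from step 4 of this report.
sample = """\
  Found duplicate PV 9Ge9Zy7t2d2Uk1v5Tb5w6IF9nqtUwUwT: using /dev/sdd4 not /dev/sdb4
  PV         VG     Fmt  Attr PSize  PFree
  /dev/sdd4  HostVG lvm2 a--  11.71g 6.80g
"""

for dev in pv_devices(sample):
    state = "single path" if is_single_path(dev) else "not a plain sd* node"
    print(dev, "->", state)
```

A PV path under /dev/mapper/ is not flagged by is_single_path, which is the state this report expects after installation.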

Additional info:

Comment 1 Ying Cui 2015-01-15 12:38:09 UTC
Created attachment 980485
varlog

Comment 2 Ying Cui 2015-01-15 12:40:54 UTC
Created attachment 980487
sosreport

Comment 3 Fabian Deutsch 2015-01-15 13:07:27 UTC
The commandline from the logs:

BOOT_IMAGE=/vmlinuz0 root=live:LABEL=Root ro rootfstype=auto rootflags=ro ksdevice=bootif rd.dm=0 rd.md=0 crashkernel=256M lang= max_loop=256 rd.live.check quiet elevator=deadline rhgb rd.luks=0 rd.live.image mpath.wwid=360a9800050334c33424b32542d43497a


It seems that the serial given in mpath.wwid does not appear in the lsblk output after the installation.
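
When comparing the two values, note that in the description the multipath map names differ from the lsblk serials only by one extra leading character: map 360a9800050334c33424b32542d45446e is built on sde, whose serial is listed as 60a9800050334c33424b32542d45446e. The following is a minimal sketch of the comparison under that assumption; the function names are hypothetical, and the one-character offset is inferred only from the outputs in this report.

```python
def mpath_wwid(cmdline):
    """Return the value of mpath.wwid= from a kernel command line, or None."""
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "mpath.wwid" and sep:
            return value
    return None


def paths_for_wwid(wwid, serials):
    """Return device names whose serial matches the WWID.

    Assumption: a serial matches if it equals the WWID itself or the
    WWID without its first character, as seen in this report.
    """
    candidates = {wwid, wwid[1:]}
    return sorted(name for name, serial in serials.items() if serial in candidates)


# Shortened command line from this comment and serials from step 6.
cmdline = "root=live:LABEL=Root ro rd.live.image mpath.wwid=360a9800050334c33424b32542d43497a"
serials = {
    "sda": "S1W3J9BS604547",
    "sdb": "60a9800050334c33424b32542d43497a",
    "sdc": "60a9800050334c33424b32542d45446e",
    "sdd": "60a9800050334c33424b32542d43497a",
    "sde": "60a9800050334c33424b32542d45446e",
}

wwid = mpath_wwid(cmdline)
print(wwid, paths_for_wwid(wwid, serials))
```

With this data the sketch reports sdb and sdd as the two paths behind the WWID passed on the command line.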

Comment 6 Fabian Deutsch 2015-01-15 13:32:57 UTC
Ben, looking at the logs from comment 2, it looks like mpath does not assemble the device in the initramfs, but tries to do it in userspace (which fails, because the system has already booted off one of the paths).

An excerpt from the initramfs part:
    Jan 15 11:03:13 localhost kernel: qla4xxx 0000:04:01.3: Do not have CHAP table cache
    Jan 15 11:03:13 localhost kernel: scsi 7:0:1:0: Direct-Access     NETAPP   LUN              7320 PQ: 0 ANSI: 4
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] 41852928 512-byte logical blocks: (21.4 GB/19.9 GiB)
    Jan 15 11:03:13 localhost kernel: scsi 7:0:1:1: Direct-Access     NETAPP   LUN              7320 PQ: 0 ANSI: 4
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] Write Protect is off
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] 62781440 512-byte logical blocks: (32.1 GB/29.9 GiB)
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] Write Protect is off
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
    Jan 15 11:03:13 localhost kernel: sdc: unknown partition table
    Jan 15 11:03:13 localhost kernel: sdb: sdb1 sdb2 sdb3 sdb4
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:1: [sdc] Attached SCSI disk
    Jan 15 11:03:13 localhost kernel: sd 7:0:1:0: [sdb] Attached SCSI disk
    Jan 15 11:03:13 localhost kernel: qla4xxx 0000:04:01.3: qla4xxx_get_fwddb_entry: DDB[0] MB0 4000 Tot 2 Next 1 State 0007 ConnErr 00000000 10.66.90.115 :3260 "iqn.1992-08.com.netapp:sn.135053389"

Comment 8 Fabian Deutsch 2015-01-15 13:40:29 UTC
More information from the host:

[root@hp-z800-02 admin]# multipath -ll
Jan 15 13:34:57 | multipath.conf +5, invalid keyword: getuid_callout
Jan 15 13:34:57 | multipath.conf +18, invalid keyword: getuid_callout
Jan 15 13:34:57 | multipath.conf +37, invalid keyword: getuid_callout
360a9800050334c33424b32542d45446e dm-2 NETAPP  ,LUN             
size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:1 sde 8:64 active ready running
  `- 7:0:1:1 sdc 8:32 active ready running

[root@hp-z800-02 admin]# cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz0 root=live:LABEL=Root ro rootfstype=auto rootflags=ro ksdevice=bootif rd.dm=0 rd.md=0 crashkernel=256M lang= max_loop=256 rd.live.check quiet elevator=deadline rhgb rd.luks=0 rd.live.image mpath.wwid=360a9800050334c33424b32542d43497a

[root@hp-z800-02 admin]# lsblk -o name,serial
NAME                                SERIAL
…
sdb                                 60a9800050334c33424b32542d43497a
|-sdb1                              
|-sdb2                              
|-sdb3                              
`-sdb4                              
…
sdd                                 60a9800050334c33424b32542d43497a
|-sdd1                              
|-sdd2                              
|-sdd3                              
`-sdd4                              
  |-HostVG-Swap                     
  |-HostVG-Config                   
  |-HostVG-Logging                  
  `-HostVG-Data                     
sde                                 60a9800050334c33424b32542d45446e
`-360a9800050334c33424b32542d45446e 

…

[root@hp-z800-02 admin]# blkid -L Root
/dev/sdb3

Considering the information above, it looks like the mpath device that is used as the boot device, 60a…97a, cannot be assembled.
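
The same conclusion can be reached mechanically by listing which WWIDs multipath -ll reports as assembled maps and checking whether the boot WWID is among them. The following is a minimal sketch, assuming the multipath -ll text layout shown in this comment; the regular expressions and the function name are illustrative and not part of the multipath tooling.

```python
import re

# A map header line looks like: "<wwid> dm-<n> <vendor>,<product>"
MAP_RE = re.compile(r"^(?P<wwid>[0-9a-f]+) (?P<dm>dm-\d+) ")
# A path line contains: "<host>:<channel>:<id>:<lun> sdX ..."
PATH_RE = re.compile(r"\d+:\d+:\d+:\d+ (?P<dev>sd[a-z]+) ")


def parse_multipath_ll(output):
    """Map each WWID in `multipath -ll` output to its list of path devices."""
    maps = {}
    current = None
    for line in output.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group("wwid")
            maps[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            maps[current].append(path.group("dev"))
    return maps


# Output copied from this comment.
sample = """\
Jan 15 13:34:57 | multipath.conf +5, invalid keyword: getuid_callout
360a9800050334c33424b32542d45446e dm-2 NETAPP  ,LUN
size=30G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:1 sde 8:64 active ready running
  `- 7:0:1:1 sdc 8:32 active ready running
"""

boot_wwid = "360a9800050334c33424b32542d43497a"  # from mpath.wwid= above
maps = parse_multipath_ll(sample)
print(maps)
print("boot device assembled:", boot_wwid in maps)
```

For the output in this comment only the 30G LUN appears as a map, so the check for the boot WWID is false.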

Comment 10 Ying Cui 2015-01-16 10:20:32 UTC
Tested this bug on rhev-hypervisor6-6.6-20150114.0; it does not exist on the rhevh el6.6 build.

So this bug is a rhevh 7.0-only issue.

Comment 11 wanghui 2015-01-16 12:04:49 UTC
Test version:
rhev-hypervisor7-7.0-20140115.dontuse.iso
ovirt-node-3.2.1-4.el7.noarch
device-mapper-multipath-0.4.9-66.el7.x86_64

Test steps and results:
1. TUI install rhevh on multipath device lun 360a9800050334c33424b32542d43497a
2. After installation, log in to rhevh
3. F2 to shell
4. # pvs
   PV                                             VG     Fmt  Attr PSize  PFree  
  /dev/mapper/360a9800050334c33424b32542d43497a4 HostVG lvm2 a--  11.71g 400.00m

5. # multipath -ll
Jan 16 12:01:26 | multipath.conf +5, invalid keyword: getuid_callout
Jan 16 12:01:26 | multipath.conf +18, invalid keyword: getuid_callout
Jan 16 12:01:26 | multipath.conf +37, invalid keyword: getuid_callout
360a9800050334c33424b32542d43497a dm-1 NETAPP  ,LUN             
size=20G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:0 sdd 8:48 active ready running
  `- 7:0:1:0 sdb 8:16 active ready running
360a9800050334c33424b32542d45446e dm-0 NETAPP  ,LUN             
size=30G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 7:0:0:1 sde 8:64 active ready running
  `- 7:0:1:1 sdc 8:32 active ready running

6. #  lsblk -o name,serial
NAME                                   SERIAL
sda                                    S1W3J9BS604547
├─sda1                                 
├─sda2                                 
├─sda3                                 
└─sda4                                 
sdb                                    60a9800050334c33424b32542d43497a
└─360a9800050334c33424b32542d43497a    
  ├─360a9800050334c33424b32542d43497a1 
  ├─360a9800050334c33424b32542d43497a2 
  ├─360a9800050334c33424b32542d43497a3 
  └─360a9800050334c33424b32542d43497a4 
    ├─HostVG-Swap                      
    ├─HostVG-Config                    
    ├─HostVG-Logging                   
    └─HostVG-Data                      
sdc                                    60a9800050334c33424b32542d45446e
└─360a9800050334c33424b32542d45446e    
sdd                                    60a9800050334c33424b32542d43497a
└─360a9800050334c33424b32542d43497a    
  ├─360a9800050334c33424b32542d43497a1 
  ├─360a9800050334c33424b32542d43497a2 
  ├─360a9800050334c33424b32542d43497a3 
  └─360a9800050334c33424b32542d43497a4 
    ├─HostVG-Swap                      
    ├─HostVG-Config                    
    ├─HostVG-Logging                   
    └─HostVG-Data                      
sde                                    60a9800050334c33424b32542d45446e
└─360a9800050334c33424b32542d45446e    
sr0                                    M0094645757
loop0                                  
loop1                                  
├─live-rw                              
└─live-base                            
loop2                                  
└─live-rw      

The multipath installation now behaves correctly, so I consider the issue fixed in rhev-hypervisor7-7.0-20140115.dontuse.iso.

Comment 12 Fabian Deutsch 2015-01-19 07:57:29 UTC
The issue here was that the initrd wasn't updated.
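
The linked gerrit changes are titled "Don't glob the initramfs, use explicit kernel versions". The patches themselves are not shown in this report, so the following is only a sketch of the general idea behind that title and not the actual fix: a glob can match more than one image, while a name built from a known kernel version is unambiguous. The file names and kernel versions are hypothetical examples.

```python
import fnmatch


def glob_initramfs(boot_files):
    """Return every file that a pattern like initramfs-*.img would match."""
    return sorted(f for f in boot_files if fnmatch.fnmatch(f, "initramfs-*.img"))


def explicit_initramfs(kernel_versions):
    """Build one initramfs file name per explicitly given kernel version."""
    return ["initramfs-%s.img" % kver for kver in kernel_versions]


# Hypothetical /boot contents with two kernels installed.
boot_files = [
    "initramfs-3.10.0-123.el7.x86_64.img",
    "initramfs-3.10.0-123.13.2.el7.x86_64.img",
    "vmlinuz-3.10.0-123.13.2.el7.x86_64",
]

print(glob_initramfs(boot_files))  # two candidates, the choice is ambiguous
print(explicit_initramfs(["3.10.0-123.13.2.el7.x86_64"]))  # exactly one name
```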

Comment 13 Fabian Deutsch 2015-01-19 14:29:08 UTC
*** Bug 1182048 has been marked as a duplicate of this bug. ***

Comment 14 Anatoly Litovsky 2015-01-19 15:40:17 UTC
*** Bug 1182516 has been marked as a duplicate of this bug. ***

Comment 18 Ying Cui 2015-01-20 07:43:00 UTC
According to comment 10, this issue does not occur on rhevh 6.6 for the rhev 3.5 build.

So this bug was verified only on the el7 build.

[root@hp-z600-03 admin]# cat /etc/system-release
Red Hat Enterprise Virtualization Hypervisor release 7.0 (20150119.0.1.el7ev)
[root@hp-z600-03 admin]# rpm -q ovirt-node
ovirt-node-3.2.1-5.el7.noarch

[root@hp-z600-03 admin]# multipath -ll
Jan 20 07:35:39 | multipath.conf +5, invalid keyword: getuid_callout
Jan 20 07:35:39 | multipath.conf +18, invalid keyword: getuid_callout
Jan 20 07:35:39 | multipath.conf +37, invalid keyword: getuid_callout
35000c5001d5b2973 dm-15 SEAGATE ,ST3146356SS     
size=137G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 6:0:0:0 sda 8:0   active ready running
360a9800050334c33424b334166784f55 dm-0 NETAPP  ,LUN             
size=19G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:1 sdh 8:112 active ready running
  `- 0:0:1:1 sdc 8:32  active ready running
360a9800050334c33424b334163434546 dm-3 NETAPP  ,LUN             
size=25G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:0 sdg 8:96  active ready running
  `- 0:0:1:0 sdb 8:16  active ready running
360a9800050334c33424b334167714852 dm-1 NETAPP  ,LUN             
size=1021M features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:2 sdi 8:128 active ready running
  `- 0:0:1:2 sdd 8:48  active ready running
360a9800050334c33424b334167742f70 dm-2 NETAPP  ,LUN             
size=2.0G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:3 sdj 8:144 active ready running
  `- 0:0:1:3 sde 8:64  active ready running
360a9800050334c33424b334167756648 dm-4 NETAPP  ,LUN             
size=3.0G features='3 pg_init_retries 50 retain_attached_hw_handler' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=2 status=active
  |- 0:0:0:4 sdk 8:160 active ready running
  `- 0:0:1:4 sdf 8:80  active ready running
[root@hp-z600-03 admin]# pvs
  PV                                              VG     Fmt  Attr PSize  PFree 
  /dev/mapper/360a9800050334c33424b334163434546p4 HostVG lvm2 a--  16.60g 11.95g
[root@hp-z600-03 admin]# lsblk -o name,serial
NAME                                    SERIAL
sda                                     5000c5001d5b2973
├─sda1                                  
├─sda2                                  
├─sda3                                  
└─sda4                                  
sdb                                     60a9800050334c33424b334163434546
└─360a9800050334c33424b334163434546     
  ├─360a9800050334c33424b334163434546p1 
  ├─360a9800050334c33424b334163434546p2 
  ├─360a9800050334c33424b334163434546p3 
  └─360a9800050334c33424b334163434546p4 
    ├─HostVG-Swap                       
    ├─HostVG-Config                     
    ├─HostVG-Logging                    
    └─HostVG-Data                       
sdc                                     60a9800050334c33424b334166784f55
└─360a9800050334c33424b334166784f55     
sdd                                     60a9800050334c33424b334167714852
└─360a9800050334c33424b334167714852     
sde                                     60a9800050334c33424b334167742f70
└─360a9800050334c33424b334167742f70     
sdf                                     60a9800050334c33424b334167756648
└─360a9800050334c33424b334167756648     
sdg                                     60a9800050334c33424b334163434546
└─360a9800050334c33424b334163434546     
  ├─360a9800050334c33424b334163434546p1 
  ├─360a9800050334c33424b334163434546p2 
  ├─360a9800050334c33424b334163434546p3 
  └─360a9800050334c33424b334163434546p4 
    ├─HostVG-Swap                       
    ├─HostVG-Config                     
    ├─HostVG-Logging                    
    └─HostVG-Data                       
sdh                                     60a9800050334c33424b334166784f55
└─360a9800050334c33424b334166784f55     
sdi                                     60a9800050334c33424b334167714852
└─360a9800050334c33424b334167714852     
sdj                                     60a9800050334c33424b334167742f70
└─360a9800050334c33424b334167742f70     
sdk                                     60a9800050334c33424b334167756648
└─360a9800050334c33424b334167756648     
sr0                                     005CD005080
sr1                                     110052081500
loop0                                   
loop1                                   
├─live-rw                               
└─live-base                             
loop2                                   
└─live-rw        

$ md5sum rhev-hypervisor7-7.0-20150119.0.1.iso
ed8647f757fc1199acfb3e9c673369b4  rhev-hypervisor7-7.0-20150119.0.1.iso


This bug is now fixed: after the rhevh TUI installation, the PV is created on the multipath device.

Comment 20 errata-xmlrpc 2015-02-11 21:08:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-0160.html

