Bug 1204212 - Half of the paths to HostVG lost post-upgrade of the rhev-h host
Summary: Half of the paths to HostVG lost post-upgrade of the rhev-h host
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.2
Target Release: 3.6.0
Assignee: Fabian Deutsch
QA Contact: cshao
URL:
Whiteboard: node
Depends On: 1051742 1235965
Blocks:
 
Reported: 2015-03-20 15:37 UTC by akotov
Modified: 2019-09-12 08:22 UTC (History)
13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-02 06:37:42 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1191401 0 high CLOSED [RHEV-H 6.6] multipath fails while creating map 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1235965 0 high CLOSED Regenerate initramfs during installation to include multipath related configuration 2021-02-22 00:41:40 UTC
oVirt gerrit 42912 0 master MERGED Add initramfs re-generation during installation Never

Internal Links: 1191401 1235965

Description akotov 2015-03-20 15:37:22 UTC
Description of problem:

After performing the standard upgrade procedure for a RHEV-H host, 2 paths out of 4 were missing. Neither rebooting nor scsi-rescan commands were able to reveal the missing paths.
 
AFTER UPGRADE PROCEDURE:

Mar 18 10:20:47 | *word = A, len = 1
Mar 18 10:20:47 | *word = 0, len = 1
360060e8007e2b9000030e2b90000xxxx dm-8 HITACHI,OPEN-V
size=500G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 0:0:0:0  sda  8:0     active ready running
  `- 1:0:0:0  sdk  8:160   active ready running
Mar 18 10:20:47 | params = 1 queue_if_no_path 0 1 1 round-robin 0 4 1 65:96 1 130:128 1 69:80 1 134:112 1 

AFTER CLEAN REINSTALL OF SAME HOST:

360060e8007e2b9000030e2b90000xxxx dm-15 HITACHI,OPEN-V
size=500G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:0:0  sda  8:0     active ready running
  |- 4:0:0:0  sdeg 128:128 active ready running
  |- 0:0:0:0  sdjw 65:416  active ready running
  `- 5:0:0:0  sdeq 129:32  active ready running



Version-Release number of selected component (if applicable):

Red Hat Enterprise Virtualization Hypervisor 6.6 (20150128.0.el6ev)

How reproducible:

Unknown; we do not have the hardware for a reproducer yet

Steps to Reproduce:
1. Upgrade rhev-h

Actual results:

2 paths for HostVG

Expected results:

4 paths for HostVG

Additional info:

Noticeable difference is the addition of the mpath.wwid parameter to the grub configuration after the clean-reinstall was complete: 

mpath.wwid=360060e8007e2b9000030e2b90000xxxx
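The manual workaround mentioned later in this bug amounts to appending that `mpath.wwid=` argument to the kernel line in the grub configuration. The following is only an illustrative sketch: it works on a sample file in /tmp rather than the real grub config (which on RHEV-H lives on a read-only /boot until remounted rw), and the trailing "xxxx" of the WWID is kept masked exactly as it appears in the report.

```shell
# Create a sample grub config to edit (stand-in for the real file;
# the actual RHEV-H grub config path is not reproduced here).
cat > /tmp/grub.conf.sample <<'EOF'
title RHEV Hypervisor
    root (hd0,0)
    kernel /vmlinuz0 ro root=live:LABEL=Root rootfstype=auto
EOF

# WWID as given in this bug, with the masked suffix left as-is.
WWID=360060e8007e2b9000030e2b90000xxxx

# Append mpath.wwid= to the kernel command line.
sed -i "/^[[:space:]]*kernel /s/$/ mpath.wwid=${WWID}/" /tmp/grub.conf.sample

# Show the result.
grep mpath.wwid /tmp/grub.conf.sample
```

On the real host the same edit would go onto the kernel line of the active grub entry, so the wwid is available to multipath inside the initramfs at boot.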

Comment 4 Fabian Deutsch 2015-04-01 14:05:29 UTC
The post-upgrade sosreport from comment 1 only shows messages from a RHEV-H 6.5 runtime.

It would be necessary to get the logs from the RHEV-H 6.6 runtime, after the upgrade.

Thanks to the findings of Ben, it is quite clear that the mpath.wwid= argument is missing from the kernel commandline.

This argument is normally added when a user upgrades from 6.5 to 6.6.

If the line is missing, that is a bug, but it can be fixed manually.

Alexander, can you provide the logs after the upgrade to 6.6?

Comment 5 akotov 2015-04-06 09:35:11 UTC
Fabian, I see that the first sosreport is from the 6.6 runtime:

[cash@dhcp-26-166 mtafscpq2800bh01-2015031810181426673917]$ cat etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor 6.6 (20150128.0.el6ev)


$ cat etc/multipath/wwids  | grep 360060e8007e2b9000030e2b900001003
/360060e8007e2b9000030e2b900001003/
[cash@dhcp-26-166 mtafscpq2800bh01-2015031810181426673917]$ pwd
/home/cash/sos/01385391/mtafscpq2800bh01-2015031810181426673917

I also want to add that the customer tried to add the boot LUN (360060e8007e2b9000030e2b900001003) back into /etc/multipath/wwids (it was missing after the upgrade), cleared the LVM cache (/etc/lvm/cache/.cache), and rebooted the hypervisor. That did not resolve the issue; only a fresh reinstall plus adding the wwid to the kernel cmdline fixed it.

Comment 6 Ben Marzinski 2015-04-08 20:21:37 UTC
(In reply to akotov from comment #5)
> I also want to add, that customer tried to add boot LUN 
> (360060e8007e2b9000030e2b900001003) back into /etc/multipath/wwids - it was
> missing after the upgrade, cleared the LVM cache (/etc/lvm/cache/.cache),
> and rebooted the hypervisor. It did not resolve the issue, only fresh
> reinstall and adding wwid to kernel cmdline fixed it.

The issue with adding it to the wwids file is that if those paths are claimed in the initramfs and the wwids file there doesn't have it, it won't help. You would have to either remake the initramfs, or edit the kernel cmdline to make the wwid appear in the initramfs.
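As a sketch of the two-part fix Ben describes: the wwids file stores one /WWID/ entry per line, and the same entry must also exist in the copy baked into the initramfs. The snippet below works on a temp copy so it stays side-effect free; the WWID is the boot LUN from comment 5, and the dracut command is shown only as a comment since rebuilding the initramfs is system-specific.

```shell
# Temp stand-in for /etc/multipath/wwids.
WWIDS=/tmp/wwids.sample
WWID=360060e8007e2b9000030e2b900001003

# The wwids file lists valid WWIDs as /WWID/, one per line.
printf '# Valid WWIDs:\n' > "$WWIDS"

# Add the entry only if it is not already present.
grep -qx "/${WWID}/" "$WWIDS" || echo "/${WWID}/" >> "$WWIDS"

cat "$WWIDS"

# On the live system the same entry also has to reach the copy of
# /etc/multipath/wwids inside the initramfs, which on RHEL 6 means
# rebuilding it, e.g. (not run here):
#   dracut --force /boot/initramfs-$(uname -r).img $(uname -r)
```

Alternatively, as noted above, putting mpath.wwid=<WWID> on the kernel cmdline makes the wwid visible in the initramfs without rebuilding it.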

Comment 18 Ying Cui 2015-11-24 08:27:41 UTC
Chen, could we check this bug description and try to reproduce this issue on our test env.?

Comment 19 cshao 2015-11-27 02:47:04 UTC
(In reply to Ying Cui from comment #18)
> Chen, could we check this bug description and try to reproduce this issue on
> our test env.?

Hi ycui, 

It seems the machine that has the NetApp FC Storage (2 paths) + Emulex HBA can't be accessed now. I will send a ticket to the admins to ask them to fix this asap, and then I will try to reproduce this issue on our test env.

Thanks!

Comment 20 Yaniv Kaul 2015-11-29 11:58:52 UTC
Still needinfo on QE to reproduce.

Comment 21 cshao 2015-11-30 07:06:38 UTC
Still can't reproduce this issue on VIRT-QE ENV.

Test version:
RHEV-H 6.5 20150115 + ovirt-node-3.0.1-19.el6_5.18.noarch 
RHEV-H 6.6 20150128.0.el6ev + ovirt-node-3.2.1-6.el6.noarch

Test machine:
dell-per510-01 multipath FC
NetApp FC Storage(2 paths) + Emulex HBA.


Test steps:
1. Install RHEV-H 6.5 20150115.
2. Upgrade to RHEV-H 6.6 20150128.0.el6ev
3. Check all paths

Test result:
Before upgrade
# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor release 6.5 (20150115.0.el6ev)
[root@unused admin]# multipath -ll
360050763008084e6e000000000000058 dm-1 IBM,2145
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 1:0:1:3 sdg 8:96 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 1:0:0:3 sdd 8:48 active ready running
360050763008084e6e000000000000057 dm-0 IBM,2145
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 1:0:0:2 sdc 8:32 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 1:0:1:2 sdf 8:80 active ready running
36782bcb03cdfa200174636ff055184dc dm-7 DELL,PERC 6/i
size=544G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 0:2:0:0 sda 8:0  active ready running
360050763008084e6e000000000000056 dm-2 IBM,2145
size=200G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 1:0:1:1 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 1:0:0:1 sdb 8:16 active ready running


After upgrade
# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor 6.6 (20150128.0.el6ev) 
[root@unused admin]# 
[root@unused admin]# multipath -ll
360050763008084e6e000000000000058 dm-1 IBM,2145
size=100G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 0:0:1:3 sdf 8:80 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 0:0:0:3 sdc 8:32 active ready running
360050763008084e6e000000000000057 dm-4 IBM,2145
size=100G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 0:0:0:2 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 0:0:1:2 sde 8:64 active ready running
36782bcb03cdfa200174636ff055184dc dm-12 DELL,PERC 6/i
size=544G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 2:2:0:0 sdg 8:96 active ready running
360050763008084e6e000000000000056 dm-0 IBM,2145
size=200G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 0:0:1:1 sdd 8:48 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 0:0:0:1 sda 8:0  active ready running

