Bug 1125178

Summary: multipathd reload fails when installing and running vdsm for the first time on a fresh install where multipath.conf is missing
Product: Red Hat Enterprise Virtualization Manager
Reporter: rhev-integ
Component: vdsm
Assignee: Nir Soffer <nsoffer>
Status: CLOSED ERRATA
QA Contact: Elad <ebenahar>
Severity: high
Docs Contact:
Priority: high
Version: 3.4.0
CC: acanan, acathrow, alonbl, amureini, asegurap, bazulay, danken, ebenahar, ecohen, gklein, iheim, lbopf, lpeer, nlevinki, nsednev, nsoffer, oourfali, Rhev-m-bugs, scohen, tnisan, yeylon, ykaplan
Target Milestone: ---
Keywords: ZStream
Target Release: 3.4.2
Hardware: x86_64
OS: Linux
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when attempting to add a Red Hat Enterprise Linux 7 host to the Manager for the first time following clean RHEL 7 installation, Vdsm failed to start on the host, and thus the host failed to add. This happened because the multipathd service behaves differently in RHEL 7 from how it behaves in RHEL 6.5. In RHEL 6.5, the multipathd service disables all devices when starting if the multipath configuration file does not exist. Creating a new multipath.conf file and reloading the multipathd service causes multipathd to reload the new configuration, and installation can continue. In RHEL 7, under the same circumstances, the multipathd service behaved differently, and would stop silently, without any error. Reloading the multipathd service failed, because the service was not running. Now, the behavior in RHEL 7 has been modified to simulate that of multipathd on RHEL 6.5 hosts; if the multipath configuration file is missing, the install script creates a new multipath configuration file, and Vdsm successfully reloads the multipathd service during startup.
Story Points: ---
Clone Of: 1120209
Environment:
Last Closed: 2014-09-04 12:59:53 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1120209, 1129232    
Bug Blocks: 1123858    
Attachments:
  logs 12.8.14 (flags: none)
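The fix described in the Doc Text above boils down to ensuring a multipath configuration file exists before multipathd is reloaded. A minimal sketch of that logic, assuming a simplified default configuration (the function name and the configuration contents are hypothetical, not vdsm's actual template):

```python
import os

# Hypothetical minimal default configuration; the file vdsm actually
# writes contains more sections and a revision header.
DEFAULT_MULTIPATH_CONF = """\
defaults {
    polling_interval 5
}
"""

def ensure_multipath_conf(path="/etc/multipath.conf"):
    """Create a default multipath.conf if it is missing, mirroring what
    the install script does on RHEL 7 so that the subsequent multipathd
    reload has a configuration to load. Returns True if the file was
    created, False if it already existed."""
    if os.path.exists(path):
        return False
    with open(path, "w") as f:
        f.write(DEFAULT_MULTIPATH_CONF)
    return True
```

With the file guaranteed to exist, reloading multipathd on RHEL 7 no longer hits the silently stopped service described above.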

Comment 2 Elad 2014-08-12 09:22:48 UTC
Created attachment 925990 [details]
logs 12.8.14

Failed to add a RHEL 7 host to RHEV-M 3.4.2

vdsm.log:

Traceback (most recent call last):
  File "/usr/share/vdsm/sampling.py", line 435, in run
    sample = self.sample()
  File "/usr/share/vdsm/sampling.py", line 425, in sample
    hs = HostSample(self._pid)
  File "/usr/share/vdsm/sampling.py", line 209, in __init__
    self.timestamp - os.stat(P_VDSM_CLIENT_LOG).st_mtime <
OSError: [Errno 2] No such file or directory: '/var/run/vdsm/client.log'
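The traceback itself is a plain missing-file race: os.stat() on /var/run/vdsm/client.log raises OSError with errno 2 when the log has not been created yet. A guarded variant of that stat call (the helper name is hypothetical; this is an illustration, not vdsm's fix):

```python
import errno
import os

def safe_mtime(path):
    """Return the file's modification time, or None if the file does not
    exist yet, instead of letting OSError (Errno 2) propagate as in the
    traceback above."""
    try:
        return os.stat(path).st_mtime
    except OSError as e:
        if e.errno == errno.ENOENT:
            return None
        raise
```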

Also, it seems that the engine does not know how to handle such an error received from vdsm. engine.log:


2014-08-12 12:05:26,862 ERROR [org.ovirt.engine.core.utils.ssh.SSHDialog] (org.ovirt.thread.pool-4-thread-49) SSH error running command root.102.11:'umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "c
hmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/setup DIALOG/dialect=str:ma
chine DIALOG/customization=bool:True': java.io.IOException: Command returned failure code 1 during SSH session 'root.102.11'


Nir, should I open a separate bug on engine about this issue?

This was tested using:
av11
rhevm-3.4.2-0.1.el6ev.noarch
vdsm-4.14.13-1.el7ev.x86_64

Host with RHEL7 installed
Red Hat Enterprise Linux Server release 7.0 (Maipo)

Attaching full logs from vdsm and engine

Comment 3 Nir Soffer 2014-08-12 09:51:22 UTC
(In reply to Elad from comment #2)
> Nir, should I open a separate bug on engine about this issue?

Yes, this error is not related in any way to this bug or to the fix. The bug is *not* about any error that prevents adding a RHEL 7 host, but about the failure of the multipath reload while vdsm starts.

Returning to ON_QA - if you don't see the multipath error any more, please mark this as VERIFIED.

Comment 4 Elad 2014-08-12 10:49:25 UTC
I don't see the multipath error, but host installation fails with 'Command '/bin/systemctl' failed to execute'

I'll open a bug about the other issue

Comment 5 Elad 2014-08-12 11:15:08 UTC
This bug cannot be verified due to https://bugzilla.redhat.com/show_bug.cgi?id=1129232

Comment 6 Nir Soffer 2014-08-12 12:57:46 UTC
Based on comment 4, moving to verified.

Comment 7 Aharon Canan 2014-08-12 13:28:15 UTC
Please be aware of comment #5; also, QA engineers are the ones who move bugs to VERIFIED.

Comment 8 Nir Soffer 2014-08-12 14:45:01 UTC
Aharon, Elad

How to reproduce:

1. Installing a fresh image
2. Installing vdsm without this fix
3. Running vdsm-tool configure --force
4. Starting vdsm

Vdsm fails to start because of a multipathd reload error

How to verify:

1. Installing a fresh image
2. Installing vdsm including this fix
3. Running vdsm-tool configure --force
4. Starting vdsm

Vdsm should not fail to start because of a multipathd reload error. It will fail because of bug 1127877 and bug 1129232.

But failing to add a host to the engine cannot block this bug. Keeping this open does not help anyone; we have enough real open bugs to handle.

Comment 9 Elad 2014-08-12 15:13:01 UTC
(In reply to Nir Soffer from comment #8)
> Aharon, Elad
> 
> How to reproduce:
> 
> 1. Installing a fresh image
> 2. Installing vdsm without this fix
> 3. Running vdsm-tool configure --force
> 4. Starting vdsm
> 
> Vdsm fails to start because of a multipathd reload error
> 
> How to verify:
> 
> 1. Installing a fresh image
> 2. Installing vdsm including this fix
> 3. Running vdsm-tool configure --force
> 4. Starting vdsm
> 
> Vdsm should not fail to start because of a multipathd reload error. It will
> fail because of bug 1127877 and bug 1129232.
> 
> But failing to add a host to engine cannot block this bug. Keeping this open
> does not help anyone, we have enough real open bugs to handle.

That's exactly the scenario I've executed.
The reason for keeping this bug open is because we don't know if the other issues reported in bug 1127877 and in bug 1129232 that prevents vdsm on host with rhel7 installed to be deployed in rhevm comes before the multipathd reload issue or after it during the installation. We think that there is a possibility that the multipathd phase during installation is blocked by the other issues.

Comment 10 Nir Soffer 2014-08-12 16:06:50 UTC
(In reply to Elad from comment #9)
> (In reply to Nir Soffer from comment #8)
> The reason for keeping this bug open is because we don't know if the other
> issues reported in bug 1127877 and in bug 1129232 that prevents vdsm on host
> with rhel7 installed to be deployed in rhevm comes before the multipathd
> reload issue or after it during the installation.

Bug 1127877 was revealed by the fix for this bug; I found it while testing
the fix.

I don't have any information about bug 1129232, but by looking at the logs you can
easily tell whether the multipathd reload step completed successfully, or
whether vdsm failed before it tried to reload multipath.

If you cannot tell this from the logs, please upload the test logs.

Comment 11 Elad 2014-08-13 09:20:39 UTC
Indeed, from the logs we can tell that the multipath reload went fine:

MainThread::DEBUG::2014-08-12 12:05:29,224::multipath::169::Storage.Misc.excCmd::(setupMultipath) '/usr/bin/sudo -n /usr/bin/cp /tmp/tmpMKn5f6 /etc/multipath.conf' (cwd None)
MainThread::DEBUG::2014-08-12 12:05:29,242::multipath::169::Storage.Misc.excCmd::(setupMultipath) SUCCESS: <err> = ''; <rc> = 0
MainThread::DEBUG::2014-08-12 12:05:29,244::multipath::175::Storage.Misc.excCmd::(setupMultipath) '/usr/bin/sudo -n /sbin/multipath -F' (cwd None)
MainThread::DEBUG::2014-08-12 12:05:29,263::multipath::175::Storage.Misc.excCmd::(setupMultipath) SUCCESS: <err> = ''; <rc> = 0
MainThread::DEBUG::2014-08-12 12:05:29,264::multipath::178::Storage.Misc.excCmd::(setupMultipath) '/usr/bin/sudo -n /usr/bin/vdsm-tool service-reload multipathd' (cwd None)
MainThread::DEBUG::2014-08-12 12:05:29,594::multipath::178::Storage.Misc.excCmd::(setupMultipath) SUCCESS: <err> = ''; <rc> = 0
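The three SUCCESS pairs above correspond to the command sequence that setupMultipath runs. A dry-run reconstruction of that sequence, with the command strings taken verbatim from the log lines (the function itself is illustrative, not vdsm code):

```python
def setup_multipath_commands(tmp_conf):
    """Return, without executing anything, the privileged commands that
    appear in the setupMultipath log lines above, in order."""
    sudo = ["/usr/bin/sudo", "-n"]
    return [
        # 1. Install the freshly generated configuration file
        sudo + ["/usr/bin/cp", tmp_conf, "/etc/multipath.conf"],
        # 2. Flush all unused multipath device maps
        sudo + ["/sbin/multipath", "-F"],
        # 3. Reload the multipathd service so it picks up the new config
        sudo + ["/usr/bin/vdsm-tool", "service-reload", "multipathd"],
    ]
```

A zero return code from each of these, as logged, is what lets the bug be moved to VERIFIED.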


Therefore, moving to VERIFIED

Comment 19 errata-xmlrpc 2014-09-04 12:59:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1152.html