Bug 1308880

Summary: multipathd service sometimes fails to start
Product: Red Hat Enterprise Linux 7
Reporter: Lev Veyde <lveyde>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED CURRENTRELEASE
QA Contact: Lin Li <lilin>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: agk, bmarzins, chakumar, coughlan, heinzm, jshivers, lilin, loberman, lveyde, mgandhi, msnitzer, nsoffer, prajnoha, sauchter
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-01 15:38:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1385242
Attachments:
/var/log/messages (flags: none)
systemctl status multipathd and relevant log lines from journalctl (flags: none)
ovirt-engine-setup log (flags: none)
ovirt-host-deploy log (flags: none)

Description Lev Veyde 2016-02-16 10:47:27 UTC
Created attachment 1127560
/var/log/messages

Description of problem:
On some occasions the multipathd service fails to start.

Version-Release number of selected component (if applicable):
device-mapper-multipath-0.4.9-85.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Make sure the network and NetworkManager services are started
2. Start installation of oVirt-Engine Manager on oVirt-Live
3. Wait till the installation tries to start the multipathd service as a dependency for VDSM.

Actual results:
The multipathd service fails to start properly.

Expected results:
The multipathd service should start without any issue.

Additional info:
Manually starting the service at a later time will normally work.

Comment 1 Lev Veyde 2016-02-16 10:49:17 UTC
Created attachment 1127561
systemctl status multipathd and relevant log lines from journalctl

Comment 2 Lev Veyde 2016-02-16 10:50:09 UTC
Created attachment 1127562
ovirt-engine-setup log

Comment 3 Lev Veyde 2016-02-16 10:50:51 UTC
Created attachment 1127563
ovirt-host-deploy log

Comment 5 Lev Veyde 2016-02-16 11:34:36 UTC
Some more info: creating an empty /etc/multipath.conf file and manually starting the multipathd service, as well as restarting it after VDSM has installed its own configuration, seems to work just fine.

Comment 6 Ben Marzinski 2016-02-16 17:20:26 UTC
Are you sure that multipathd isn't starting up? Running multipathd simply starts the daemon, and then returns. The daemon doesn't write the pid file until it is all the way up. From your logs, I can see:

Feb 15 14:34:15 livecd.localdomain libvirtd[27722]: Module /usr/lib64/libvirt/connection-driver/libvirt_driver_lxc.so not accessible
Feb 15 14:34:15 livecd.localdomain kernel: device-mapper: multipath service-time: version 0.2.0 loaded
Feb 15 14:34:15 livecd.localdomain kernel: device-mapper: table: 253:6: multipath: error getting device
Feb 15 14:34:15 livecd.localdomain kernel: device-mapper: ioctl: error adding target to table
Feb 15 14:34:15 livecd.localdomain multipathd[27744]: SanDisk_Cruzer_Micro_1738921D9DC36B20-0:0: ignoring map
Feb 15 14:34:15 livecd.localdomain systemd[1]: multipathd.service never wrote its PID file. Failing.

So multipathd was clearly running here. It looks like it was doing the initial configuration when it logged

Feb 15 14:34:15 livecd.localdomain multipathd[27744]: SanDisk_Cruzer_Micro_1738921D9DC36B20-0:0: ignoring map

This is before it writes the pid file.  If multipathd is actually starting up successfully, then the actual problem is that the multipathd command is returning too quickly after it forks off the daemon.  It should be waiting for the pid file to be written (or until a reasonable timeout has passed) before returning.

Comment 7 Ben Marzinski 2016-02-26 19:37:45 UTC
Could you see if these issues are fixed by using
device-mapper-multipath-0.4.9-87.el7

This package has a fix for Bug #1253913, which may also fix this issue. The multipathd command will now wait for the pidfile to be written before returning (or 3 seconds have passed), and the multipathd daemon now creates the config file much earlier in startup.
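
The wait described above (return once the pid file has been written, or give up after 3 seconds) can be sketched roughly as follows. This is only an illustrative Python sketch, not the actual multipathd code: the function name `wait_for_pidfile`, the polling interval, and the check that the file is non-empty are all assumptions made for the example. Only the 3-second timeout comes from this comment.

```python
import os
import time


def wait_for_pidfile(path, timeout=3.0, interval=0.05):
    """Poll until `path` exists and is non-empty, or `timeout` seconds pass.

    Hypothetical helper for illustration; returns True if the pid file
    appeared in time and False if the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Treat a non-empty file as "the daemon has written its pid".
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            # The file does not exist yet (or cannot be read); keep waiting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A launcher built this way would fork the daemon, call something like `wait_for_pidfile("/path/to/multipathd.pid")` (the path here is a placeholder), and only then return to systemd, so that systemd does not look for the pid file before the daemon has had a chance to write it.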

Comment 8 Lev Veyde 2016-03-14 11:16:08 UTC
(In reply to Ben Marzinski from comment #7)
> Could you see if these issues are fixed by using
> device-mapper-multipath-0.4.9-87.el7
> 
> This package has a fix for Bug #1253913, which may also fix this issue. The
> multipathd command will now wait for the pidfile to be written before
> returning (or 3 seconds have passed), and the multipathd daemon now creates
> the config file much earlier in startup.

Will try to test in the next version of the build.

Comment 9 Lev Veyde 2016-04-21 12:40:47 UTC
(In reply to Ben Marzinski from comment #7)
> Could you see if these issues are fixed by using
> device-mapper-multipath-0.4.9-87.el7
> 
> This package has a fix for Bug #1253913, which may also fix this issue. The
> multipathd command will now wait for the pidfile to be written before
> returning (or 3 seconds have passed), and the multipathd daemon now creates
> the config file much earlier in startup.

The package is still not available on the latest CentOS 7, which is the basis for the oVirt-Live build.

The current package version there is:
device-mapper-multipath-0.4.9-85.el7.x86_64.rpm

Comment 10 Ben Marzinski 2016-04-21 15:19:29 UTC
device-mapper-multipath-0.4.9-87.el7 isn't released yet for RHEL either. Could you manually install this package to see if it fixes the issue?

Comment 23 Nir Soffer 2017-02-07 08:59:06 UTC
Lev, can you reproduce the issue on your setup with the current multipath version?