Bug 786174

Summary: iscsid service fails to start
Product: [Fedora] Fedora Reporter: Rami Vaknin <rvaknin>
Component: iscsi-initiator-utilsAssignee: Mike Christie <mchristi>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rawhideCC: agrover, alitke, bubrown, dallan, danken, hdegoede, ilvovsky, joshua, knesenko, martix, mburns, mchristi, mgoldboi, ogvura, yaroslav.sapozhnik, yeylon
Target Milestone: ---Keywords: Regression, TestBlocker
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: iscsi-initiator-utils-6.2.0.872-16.fc16 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-02-08 02:28:26 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
fix return value when iscsid is not started none

Description Rami Vaknin 2012-01-31 16:13:02 UTC
Scenario and reproduction steps:
Clean installation of Fedora 16, installing vdsm which also requires iscsid, iscsid didn't start and refuses to start manually. It reproduced in second machine too.

Versions:
Fedora release 16 (Verne), kernel-3.2.2-1.fc16.x86_64
iscsi-initiator-utils-6.2.0.872-14.fc16.x86_64


[root@intg-vdsb ~]# systemctl status iscsid.service
iscsid.service - LSB: Starts and stops login iSCSI daemon.
	  Loaded: loaded (/etc/rc.d/init.d/iscsid)
	  Active: failed since Tue, 31 Jan 2012 20:19:54 +0200; 47min ago
	 Process: 1130 ExecStart=/etc/rc.d/init.d/iscsid start (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/iscsid.service
[root@intg-vdsb ~]# 



Jan 31 19:51:28 intg-vdsb systemd-vdsmd[2055]: Starting iscsid...
Jan 31 19:51:28 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 19:51:28 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 19:51:28 intg-vdsb systemd-vdsmd[2055]: Starting iscsid (via systemctl):  Job failed. See system logs and 'systemctl status' for details.
Jan 31 19:51:28 intg-vdsb systemd-vdsmd[2055]: [FAILED]
Jan 31 19:51:28 intg-vdsb systemd-vdsmd[2055]: vdsm: Dependent iscsid failed to start[FAILED]
Jan 31 19:51:28 intg-vdsb systemd-vdsmd[2055]: vdsm: one of the dependent services did not start, error code 1[FAILED]
Jan 31 19:51:28 intg-vdsb systemd[1]: vdsmd.service: control process exited, code=exited status=1
Jan 31 19:51:28 intg-vdsb systemd[1]: Unit vdsmd.service entered failed state.
Jan 31 19:51:43 intg-vdsb systemd[1]: Reloading.
Jan 31 19:51:43 intg-vdsb systemd-vdsmd[2172]: vdsm: libvirt already configured for vdsm [  OK  ]
Jan 31 19:51:43 intg-vdsb systemd-vdsmd[2172]: Starting iscsid...
Jan 31 19:51:43 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 19:51:43 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 19:51:43 intg-vdsb systemd-vdsmd[2172]: Starting iscsid (via systemctl):  Job failed. See system logs and 'systemctl status' for details.
Jan 31 19:51:43 intg-vdsb systemd-vdsmd[2172]: [FAILED]
Jan 31 19:51:43 intg-vdsb systemd-vdsmd[2172]: vdsm: Dependent iscsid failed to start[FAILED]
Jan 31 19:51:43 intg-vdsb systemd-vdsmd[2172]: vdsm: one of the dependent services did not start, error code 1[FAILED]
Jan 31 19:51:43 intg-vdsb systemd[1]: vdsmd.service: control process exited, code=exited status=1
Jan 31 19:51:43 intg-vdsb systemd[1]: Unit vdsmd.service entered failed state.
Jan 31 19:56:04 intg-vdsb systemd-logind[1036]: New session 9 of user root.
Jan 31 19:56:04 intg-vdsb systemd-logind[1036]: Removed session 9.
Jan 31 19:56:04 intg-vdsb systemd-logind[1036]: New session 10 of user root.
Jan 31 19:56:04 intg-vdsb systemd-logind[1036]: Removed session 10.
Jan 31 19:56:05 intg-vdsb systemd-logind[1036]: New session 11 of user root.
Jan 31 19:56:05 intg-vdsb systemd-logind[1036]: Removed session 11.
Jan 31 19:56:05 intg-vdsb systemd-logind[1036]: New session 12 of user root.
Jan 31 19:56:05 intg-vdsb systemd-logind[1036]: Removed session 12.
Jan 31 19:59:13 intg-vdsb systemd[1]: Reloading.
Jan 31 19:59:13 intg-vdsb systemd-vdsmd[2460]: vdsm: libvirt already configured for vdsm [  OK  ]
Jan 31 19:59:13 intg-vdsb systemd-vdsmd[2460]: Starting iscsid...
Jan 31 19:59:13 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 19:59:13 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 19:59:13 intg-vdsb systemd-vdsmd[2460]: Starting iscsid (via systemctl):  Job failed. See system logs and 'systemctl status' for details.
Jan 31 19:59:13 intg-vdsb systemd-vdsmd[2460]: [FAILED]
Jan 31 19:59:13 intg-vdsb systemd-vdsmd[2460]: vdsm: Dependent iscsid failed to start[FAILED]
Jan 31 19:59:13 intg-vdsb systemd-vdsmd[2460]: vdsm: one of the dependent services did not start, error code 1[FAILED]
Jan 31 19:59:13 intg-vdsb systemd[1]: vdsmd.service: control process exited, code=exited status=1
Jan 31 19:59:13 intg-vdsb systemd[1]: Unit vdsmd.service entered failed state.
Jan 31 19:59:41 intg-vdsb systemd[1]: Reloading.
Jan 31 19:59:42 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 19:59:42 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 20:00:20 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 20:00:20 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 20:00:24 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 20:00:24 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 20:00:32 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 20:00:32 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 20:01:01 intg-vdsb logrotate: ALERT clean old core files exited abnormally with [1]
Jan 31 20:02:13 intg-vdsb systemd-logind[1036]: New session 14 of user root.
Jan 31 20:02:13 intg-vdsb systemd-logind[1036]: Removed session 14.
Jan 31 20:02:13 intg-vdsb systemd-logind[1036]: New session 15 of user root.
Jan 31 20:02:13 intg-vdsb systemd-logind[1036]: Removed session 15.
Jan 31 20:02:13 intg-vdsb systemd-logind[1036]: New session 16 of user root.
Jan 31 20:02:13 intg-vdsb systemd-logind[1036]: Removed session 16.
Jan 31 20:02:14 intg-vdsb systemd-logind[1036]: New session 17 of user root.
Jan 31 20:02:14 intg-vdsb systemd-logind[1036]: Removed session 17.
Jan 31 20:04:23 intg-vdsb systemd[1]: PID file /var/run/iscsid.pid not readable (yet?) after start.
Jan 31 20:04:23 intg-vdsb systemd[1]: Unit iscsid.service entered failed state.
Jan 31 20:08:21 intg-vdsb systemd-logind[1036]: New session 18 of user root.
Jan 31 20:08:21 intg-vdsb systemd-logind[1036]: Removed session 18.
Jan 31 20:08:21 intg-vdsb systemd-logind[1036]: New session 19 of user root.
Jan 31 20:08:21 intg-vdsb systemd-logind[1036]: Removed session 19.
Jan 31 20:08:22 intg-vdsb systemd-logind[1036]: New session 20 of user root.
Jan 31 20:08:22 intg-vdsb systemd-logind[1036]: Removed session 20.
Jan 31 20:08:22 intg-vdsb systemd-logind[1036]: New session 21 of user root.
Jan 31 20:08:22 intg-vdsb systemd-logind[1036]: Removed session 21.
Jan 31 20:13:19 intg-vdsb systemd-logind[1036]: New session 23 of user root.

Comment 1 Mike Christie 2012-01-31 22:59:17 UTC
Oh shoot, I thought I uploaded the fix for this in

https://bugzilla.redhat.com/show_bug.cgi?id=749987

but the patch did not merged correctly. Will fix this up and repush.

Comment 2 Rami Vaknin 2012-02-01 08:59:29 UTC
Note that the service exists with code 0 although it fails to start.

Comment 3 Dave Allan 2012-02-01 14:43:26 UTC
Let me know when there's a package in updates testing and I'll try it out & give karma.

Comment 4 Mike Christie 2012-02-02 00:18:46 UTC
*** Bug 785037 has been marked as a duplicate of this bug. ***

Comment 5 Mike Christie 2012-02-02 04:03:34 UTC
Created attachment 558965 [details]
fix return value when iscsid is not started

Hey,

Does this patch to /etc/init.d/iscsid fix the problem for you?

After you apply the patch I think you have to do "systemctl --system daemon-reload"?

It fixes the return value in the failure case.

This patch will not:

1. Fix the PID error messages. For some reason even if we return failed, systemd still complains about there not being a PID file.

This might be a systemd bug.

2. Will not fix the problem of iscsi/iscsid trying to get startup and failing even though you are not using iscsi.

I am working on this, but it is going to be a bigger fix that might require some changes to other packages, so the patch in this bz is just a workaround to get us working.

Also what packages do I need to install to get the vdsm service like in this bz?

Comment 6 Dan Kenigsberg 2012-02-02 12:37:41 UTC
Rami, could you describe how to reproduce the issue, or provide a machine for Mike's disposal?

Comment 7 Dan Kenigsberg 2012-02-02 12:40:57 UTC
Please note that this bug is a proud member of http://www.ovirt.org/wiki/Releases/First_Release_Blockers

Comment 9 Dave Allan 2012-02-02 20:46:12 UTC
The patch doesn't fix it on my machine.  I get:

[root@node0 ~]# service vdsmd status
Redirecting to /bin/systemctl  status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
	  Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)
	  Active: failed since Thu, 02 Feb 2012 20:39:58 -0500; 51s ago
	 Process: 641 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=1/FAILURE)
	  CGroup: name=systemd:/system/vdsmd.service
[root@node0 ~]# service iscsid status
iscsid.service - LSB: Starts and stops login iSCSI daemon.
	  Loaded: loaded (/etc/rc.d/init.d/iscsid)
	  Active: failed since Thu, 02 Feb 2012 20:39:58 -0500; 58s ago
	 Process: 904 ExecStart=/etc/rc.d/init.d/iscsid start (code=exited, status=6/NOTCONFIGURED)
	  CGroup: name=systemd:/system/iscsid.service
[root@node0 ~]#

Comment 10 Mike Christie 2012-02-03 03:42:12 UTC
Hey,

The system you loaned me does not have the vdsm stuff installed.

But here is the issue.

At some point someone made iscsi/iscsid do a delayed start. It only starts when there are targets setup. So if iscsid/iscsi do not see targets setup they do not start. When someone runs iscsiadm to find targets it will start iscsid for you. Then next time you start the service it runs as expected (starts and logs into targets).

So if no targets are found the iscsi/iscsid scripts return a failure.

So my questions are:

1. I did not implement that behavior. It was done to improve boot up times and fix the issue where iscsi got installed by default but users did not need iscsi.

If we are saying that behavior is not ok now then I can easily just remove the checks. iscsid will start when the service is started and we will always return success.

2. Does vdsm need iscsi started when it starts? If "service iscsi/iscsid start" was only run when users used iscsi would that work?

See #3.

3. Is the problem that the iscsi rpm sets things so that it starts by default (it does chkconfig --add iscsi/iscsid when you install the rpm)? Should we be having the user do this if they actually use iscsi?

Comment 11 Dan Kenigsberg 2012-02-05 09:50:52 UTC
(In reply to comment #10)
> 
> 1. I did not implement that behavior. It was done to improve boot up times and
> fix the issue where iscsi got installed by default but users did not need
> iscsi.
> 
> If we are saying that behavior is not ok now then I can easily just remove the
> checks. iscsid will start when the service is started and we will always return
> success.

I find this behavior awkward and unlike other services:

# systemctl start iscsid.service
Job failed. See system logs and 'systemctl status' for details.
# systemctl status  iscsid.service
iscsid.service - LSB: Starts and stops login iSCSI daemon.
	  Loaded: loaded (/etc/rc.d/init.d/iscsid)
	  Active: failed since Sun, 05 Feb 2012 11:42:47 +0200; 12s ago
	 Process: 24960 ExecStart=/etc/rc.d/init.d/iscsid start (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/iscsid.service

I'd appreciate if iscsid is fixed to start when asked to.

 
> 2. Does vdsm need iscsi started when it starts? If "service iscsi/iscsid start"
> was only run when users used iscsi would that work?

Vdsm expects iscsid running when it starts.

Comment 12 Dave Allan 2012-02-06 21:12:33 UTC
(In reply to comment #10)
> 1. I did not implement that behavior. It was done to improve boot up times and
> fix the issue where iscsi got installed by default but users did not need
> iscsi.
> 
> If we are saying that behavior is not ok now then I can easily just remove the
> checks. iscsid will start when the service is started and we will always return
> success.

That sounds good to me.  I've patched the iscsid startup script to do that and it appears to allow vdsm to start properly.

Comment 13 Mike Christie 2012-02-07 03:46:28 UTC
I have made the requested change in:
iscsi-initiator-utils-6.2.0.872-17.fc17/iscsi-initiator-utils-6.2.0.872-17.fc16.

Comment 14 Fedora Update System 2012-02-07 03:47:08 UTC
iscsi-initiator-utils-6.2.0.872-16.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/FEDORA-2012-1333/iscsi-initiator-utils-6.2.0.872-16.fc16

Comment 15 Rami Vaknin 2012-02-07 06:43:47 UTC
(In reply to comment #14)
> iscsi-initiator-utils-6.2.0.872-16.fc16 has been submitted as an update for
> Fedora 16.
> https://admin.fedoraproject.org/updates/FEDORA-2012-1333/iscsi-initiator-utils-6.2.0.872-16.fc16

iscsid and vdsmd start well with this rpm (reboot was needed in my machine - restart of iscsid.service after package update didn't do the job).

Comment 16 Fedora Update System 2012-02-08 02:28:26 UTC
iscsi-initiator-utils-6.2.0.872-16.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Martin Holec 2012-02-10 11:31:48 UTC
Update iscsi-initiator-utils-6.2.0.872-16.fc16 cause boot delay on my F16. Boot resumes after 60s timeout.
`systemctl disable iscsi.service` fixed problem for me, but until iscsi.service is enabled by default on F16 installation this bug could hit many users after this update.

/var/spool/plymouth/boot.log:
Starting LSB: Starts and stops login and scanning of iSCSI devices. ^[[1;31mfailed^[[0m, see 'systemctl status iscsi.service' for details.

systemctl status iscsi.service:
iscsi.service - LSB: Starts and stops login and scanning of iSCSI devices.
          Loaded: loaded (/etc/rc.d/init.d/iscsi)
          Active: active (exited) since Fri, 10 Feb 2012 12:17:00 +0100; 8min ago
         Process: 1570 ExecStart=/etc/rc.d/init.d/iscsi start (code=exited, status=0/SUCCESS)
          CGroup: name=systemd:/system/iscsi.service

Comment 18 Mike Christie 2012-02-10 22:41:34 UTC
Could you attach your /var/log/messages when you hit the boot delays?

And you are not using any iscsi devices right?

Comment 19 Martin Holec 2012-02-11 10:26:25 UTC
I don't use any iscsi device.

grep iscsi /var/log/messages

Feb 11 11:21:04 Martix-NB kernel: [   17.350280] iscsi: registered transport (tcp)
Feb 11 11:21:04 Martix-NB kernel: [   17.404566] iscsi: registered transport (iser)
Feb 11 11:21:04 Martix-NB kernel: [   17.410848] iscsi: registered transport (cxgb3i)
Feb 11 11:21:04 Martix-NB kernel: [   17.414369] iscsi: registered transport (cxgb4i)
Feb 11 11:21:04 Martix-NB kernel: [   17.417639] iscsi: registered transport (bnx2i)
Feb 11 11:21:04 Martix-NB kernel: [   17.424054] iscsi: registered transport (be2iscsi)
Feb 11 11:21:04 Martix-NB iscsid: iSCSI logger with pid=1163 started!
Feb 11 11:21:04 Martix-NB iscsid[1043]: Spouštění iscsid: [  OK  ]#015[  OK  ]
Feb 11 11:21:05 Martix-NB iscsid: transport class version 2.0-870. iscsid version 2.0-872.15.f16
Feb 11 11:21:05 Martix-NB iscsid: iSCSI daemon with pid=1164 started!
Feb 11 11:22:05 Martix-NB systemd[1]: iscsi.service: control process exited, code=exited status=3
Feb 11 11:22:05 Martix-NB systemd[1]: Unit iscsi.service entered failed state.
Feb 11 11:22:14 Martix-NB iscsi[1648]: Spouštění iscsi: iscsiadm: No records found
Feb 11 11:22:14 Martix-NB iscsi[1648]: [  OK  ]
Feb 11 11:22:14 Martix-NB dbus-daemon[1024]: Starting iscsi (via systemctl):  [  OK  ]

Comment 20 Mike Christie 2012-02-14 01:11:43 UTC
Ah shoot. I know what it is. It is waiting on the NetworkManager to get the network up.

Comment 21 Mike Christie 2012-02-14 01:12:23 UTC
Let's discuss this here

https://bugzilla.redhat.com/show_bug.cgi?id=789683