Bug 677475

Summary: iscsid: semop down failed; tgtd: semop up failed
Product: Red Hat Enterprise Linux 6
Component: scsi-target-utils
Version: 6.1
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Target Milestone: rc
Reporter: Mike Christie <mchristi>
Assignee: Andy Grover <agrover>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: agrover, bdonahue, roland.friedwagner, syeghiay
Doc Type: Bug Fix
Doc Text:
Running iscsid and tgtd on the same machine resulted in semaphore errors being logged by both daemons because of an identifier collision. This has been corrected, and these errors no longer appear.
Clone Of: 676804
Last Closed: 2011-05-19 14:15:04 UTC
Bug Depends On: 676804

Description Mike Christie 2011-02-14 22:29:31 UTC
+++ This bug was initially created as a clone of Bug #676804 +++

Description of problem:

When iscsid and tgtd run on the same machine,
both daemons log semaphore errors.

Version-Release number of selected component (if applicable):
scsi-target-utils-1.0.8-0.el5
iscsi-initiator-utils-6.2.0.872-6.el5

How reproducible: Every time

Steps to Reproduce:

1. Stop both services::

     $ service tgtd stop
     $ service iscsid stop

2. Start iscsid::

     $ service iscsid start

3. Start tgtd::

     $ service tgtd start

4. Stop iscsid::

     $ service iscsid stop

5. Stop tgtd::

     $ service tgtd stop

>> syslog error: **tgtd: semop up failed**

6. Start iscsid again::

     $ service iscsid start

7. Start tgtd again::

     $ service tgtd start

8. Stop tgtd::

     $ service tgtd stop

9. Stop iscsid::

     $ service iscsid stop

>> syslog error: **iscsid: semop down failed 22**

  
Actual results:

  Syslog Errors: tgtd: semop up failed
                 iscsid: semop down failed 22

Expected results: No Error Messages


Additional info:

Does each daemon interfere with the other's semaphores?
Is it safe to run both services on the same machine at the same time?

I have also seen this iscsid message in syslog: **iscsid: semop down failed 43**
when doing iscsiadm login/logout tests :-( (but I cannot reproduce it).
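
The trailing numbers in the iscsid messages look like errno values. That is an assumption; the report itself does not say so. Under that assumption, on Linux 22 is EINVAL and 43 is EIDRM, both of which semop(2) can return when the semaphore set it is given no longer exists. A minimal sketch for decoding them (the helper names are hypothetical and not taken from either package):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: name the two errno values seen in the syslog lines.
 * Assumes Linux, where EINVAL is 22 and EIDRM is 43. */
const char *sem_errno_name(int err)
{
    switch (err) {
    case EINVAL: return "EINVAL"; /* semaphore set id is not valid */
    case EIDRM:  return "EIDRM";  /* semaphore set was removed */
    default:     return "other";
    }
}

/* Print the number, its symbolic name and the libc description. */
void explain_semop_error(int err)
{
    printf("semop failed %d: %s (%s)\n",
           err, sem_errno_name(err), strerror(err));
}
```

Calling explain_semop_error(22) and explain_semop_error(43) from a main() prints one line per code.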

Increasing the kernel.sem values to "1024 64000 256 1024"
did not change the behavior.

--- Additional comment from roland.friedwagner.at on 2011-02-11 07:38:44 EST ---

OK, further testing shows how to reproduce
the **iscsid: semop down failed 43** error:

1. service tgtd stop
2. service iscsid restart

3. /usr/sbin/tgtd &>/dev/null
4. tgtadm --op delete --mode system
5. /usr/sbin/tgtd &>/dev/null
6. Log in to any iSCSI target (xxx):
   iscsiadm --mode node --targetname=iqn.xxx --login
>> Syslog Error: iscsid: semop down failed 43
7. Log out of any iSCSI target (xxx):
   iscsiadm --mode node --targetname=iqn.xxx --logout
>> Syslog Error: iscsid: semop down failed 43

Until iscsid is restarted, you get the **iscsid: semop down failed 43** error
on every login/logout of an iSCSI target.

==> Has tgtd broken the iscsid semaphores??

Additional info: same behavior with scsi-target-utils-0.0-6.20091205snap.el5_5.3

--- Additional comment from roland.friedwagner.at on 2011-02-11 11:25:25 EST ---

Created attachment 478273
SEMKEY clash with iscsid fix

--- Additional comment from roland.friedwagner.at on 2011-02-11 11:26:24 EST ---

Created attachment 478274
drbd control socket move2 /var/run

--- Additional comment from roland.friedwagner.at on 2011-02-11 11:27:33 EST ---

Created attachment 478275
SPEC File diff

--- Additional comment from roland.friedwagner.at on 2011-02-11 11:28:57 EST ---

The cause is an ID collision: log.c in scsi-target-utils-1.0.8-0.el5 and
log.c in iscsi-initiator-utils-6.2.0.872-6.el5 use the same SEMKEY.
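
To illustrate why a shared key is a problem, here is a minimal sketch of two loggers that get-or-create a System V semaphore set from the same fixed key. It is not the code from either log.c: DEMO_SEMKEY and all function names are hypothetical, and the two "daemons" are simulated in one process. Because semget() returns the same set for the same key, a shutdown in one logger removes the set the other is still using, and the next semop() fails.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical key for this demo; NOT the SEMKEY value from either package. */
#define DEMO_SEMKEY ((key_t)0x5e3c0de)

/* On Linux/glibc the caller has to define union semun itself. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Startup: get or create the one-semaphore set for a fixed key, set it to 1. */
int logger_sem_open(key_t key)
{
    int semid = semget(key, 1, 0600 | IPC_CREAT);
    if (semid < 0)
        return -1;
    union semun arg = { .val = 1 };
    if (semctl(semid, 0, SETVAL, arg) < 0)
        return -1;
    return semid;
}

/* Shutdown: remove the set, whoever else may still be using it. */
int logger_sem_close(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}

/* "down" operation; returns 0 on success or the errno from semop(). */
int logger_sem_down(int semid)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    return semop(semid, &op, 1) < 0 ? errno : 0;
}

/* Returns the errno the first logger sees after the second one shuts down,
 * or -1 if the demo could not be set up. */
int demo_collision(void)
{
    int a = logger_sem_open(DEMO_SEMKEY); /* stands in for iscsid */
    int b = logger_sem_open(DEMO_SEMKEY); /* stands in for tgtd: same key */
    if (a < 0 || b < 0 || a != b)
        return -1;
    logger_sem_close(b);                  /* "tgtd" stops, set is removed */
    return logger_sem_down(a);            /* "iscsid" now fails */
}
```

On a Linux system with System V IPC available, demo_collision() is expected to return EINVAL or EIDRM; which one a real daemon sees may depend on the kernel version and on whether the set was recreated in the meantime.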

I uploaded the patch scsi-target-utils-fix-semkey-clash.patch to fix this!
It would be very good if this fix made it to RHN soon,
because until then the logging of iscsid and tgtd is undefined/broken.
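
For illustration only: one common way to avoid clashes between hard-coded keys is to derive the key with ftok() from a path plus a per-program project id. This is a sketch of that general technique, not a description of what the uploaded patch does; the helper name and the choice of path and ids are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/ipc.h>

/* Hypothetical helper: derive a semaphore key from an existing path and a
 * project id. ftok() uses only the low 8 bits of proj_id and returns
 * (key_t)-1 if the path cannot be stat'ed. */
key_t daemon_sem_key(const char *path, int proj_id)
{
    return ftok(path, proj_id);
}
```

With the same path, two daemons that pass different project ids (for example 'i' and 't') get different keys. Keys derived from different files are not guaranteed to be unique, so this reduces the chance of a clash rather than ruling it out.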

If Mike digs into the sources, I suggest also fixing the nasty location
of the tgtd control socket: it is placed in /tmp but IMHO should be in /var/run.
(I uploaded scsi-target-utils-namespace-socket-path.patch for this.)

Kind Regards,
Roland

--- Additional comment from mchristi on 2011-02-14 17:28:25 EST ---

Hey Roland,

Thanks for both patches and problem analysis. Patches look ok to me.

Comment 3 Barry Donahue 2011-04-19 20:16:57 UTC
I have tried testing this on RHEL6.1-20110413.1 and tgtd aborts when it tries to start after iscsid. From /var/log/messages:
Apr 19 16:11:12 storageqe-01 abrt: Kerneloops: Reported 1 kernel oopses to Abrt
Apr 19 16:11:12 storageqe-01 abrtd: Directory 'kerneloops-1303243872-1925-1' creation detected
Apr 19 16:11:12 storageqe-01 abrtd: Crash is in database already (dup of /var/spool/abrt/kerneloops-1303239160-1941-1)
Apr 19 16:11:12 storageqe-01 abrtd: Deleting crash kerneloops-1303243872-1925-1 (dup of kerneloops-1303239160-1941-1), sending dbus signal
Apr 19 16:12:25 storageqe-01 ntpd[1838]: synchronized to 10.16.71.254, stratum 2

   To repro:

   service iscsid start
   service tgtd start

iscsi-initiator-utils-6.2.0.872-19.el6.x86_64
scsi-target-utils-1.0.14-2.el6.x86_64

Comment 4 Andy Grover 2011-04-20 21:20:09 UTC
Still getting set up to try to repro, but one question...

This causes a kernel oops?? How can I get the text of that oops?

Very odd -- scsi-target-utils has no kernel component, so a bug in it should not cause a kernel oops.

Comment 7 Laura Bailey 2011-05-05 05:28:47 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Running iscsid and tgtd on the same machine resulted in semaphore errors being logged by both daemons because of an identifier collision. This has been corrected, and these errors no longer appear.

Comment 8 errata-xmlrpc 2011-05-19 14:15:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0734.html