Bug 431907 - 5.2: service iscsi stop on a busy target causes a soft CPU lockup on ppc64.
Summary: 5.2: service iscsi stop on a busy target causes a soft CPU lockup on ppc64.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: ppc64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: David Howells
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2008-02-07 19:30 UTC by Barry Donahue
Modified: 2014-08-22 23:13 UTC (History)
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-22 23:13:12 UTC
Target Upstream Version:
Embargoed:


Attachments
A piece of the /var/log/messages file with several examples of the lockup. (57.39 KB, text/plain)
2008-02-07 19:30 UTC, Barry Donahue

Description Barry Donahue 2008-02-07 19:30:27 UTC
Description of problem: If you install the 20080206 nightly build on a ppc64
box with the latest iscsi-initiator-utils, running service iscsi stop while a
target is busy causes a soft lockup.


Version-Release number of selected component (if applicable):
   kernel: 2.6.18-77.el5
   iscsi: iscsi-initiator-utils-6.2.0.868-0.3.el5

How reproducible: Every time


Steps to Reproduce:
1. Install 5.2
2. Install latest iscsi-initiator-utils
3. Add this to /etc/iscsi/iscsid.conf: node.conn[0].iscsi.HeaderDigest = None
4. Use --login to log in to the target, build a file system on the LUN, and mount it.
5. Do I/O to the iSCSI LUN and then run service iscsi stop.
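
For reference, steps 4 and 5 could be sketched roughly as below. This is only an illustration, not the exact commands used for this report: the portal address, target IQN, device, mount point and the choice of ext3 are all placeholder assumptions. By default the sketch only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/bash
# Placeholder values (assumptions, not taken from this bug report).
PORTAL="${PORTAL:-192.0.2.10:3260}"
TARGET_IQN="${TARGET_IQN:-iqn.2008-02.com.example:target0}"
DEV="${DEV:-/dev/sdb1}"
MNT="${MNT:-/mnt/sdb}"

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Same as run, but execute in the background when not in dry-run mode.
run_bg() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $* &"
    else
        "$@" &
    fi
}

repro_steps() {
    # Step 4: discover and log in to the target, build a file system, mount it.
    run iscsiadm -m discovery -t sendtargets -p "$PORTAL"
    run iscsiadm -m node -T "$TARGET_IQN" -p "$PORTAL" --login
    run mkfs.ext3 "$DEV"          # file system type is an assumption
    run mount "$DEV" "$MNT"
    # Step 5: keep the LUN busy, then stop the iscsi service.
    run_bg dd if=/dev/zero of="$MNT/file" bs=1024 count=1000000
    run sleep 5
    run service iscsi stop
}
```

Calling repro_steps with the default DRY_RUN=1 lists the command sequence without touching the system.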

Actual results: You will encounter a soft lockup.
   Target system was ibm-js21-01.lab.boston.redhat.com.


Expected results: We should only get some IO errors on the iscsi LUN.


Additional info:
   Actual test script was:
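#!/bin/bash
# Repeatedly stop iscsi while a background dd is writing to the LUN.
x=1
while [ "$x" -lt 100 ]
        do
                mount /dev/sdb1 /mnt/sdb
                status=$?
                if [ "$status" -ne 0 ]; then
                        echo "Mount FAILED"
                        exit 1
                fi
                # Start I/O in the background so the target is busy.
                dd if=/dev/zero of=/mnt/sdb/file bs=1024 count=1000000 &
                echo "x=$x"
                sleep 5
                # Stopping iscsi under load is what triggers the soft lockup.
                service iscsi stop
                umount /mnt/sdb
                service iscsi start
                sleep 5
                x=$((x + 1))
        done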

Comment 1 Barry Donahue 2008-02-07 19:30:27 UTC
Created attachment 294258
A piece of the /var/log/messages file with several examples of the lockup.

Comment 2 David Howells 2008-03-28 15:24:00 UTC
Can you run with a kernel that has LOCKDEP enabled?  That might determine what 
is causing CPUs to get stuck.
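
One way to check whether a given kernel was built with lock debugging is to look for the relevant options in its config file. The helper below is a sketch under assumptions: the function name is made up for illustration, and it assumes the config is installed as /boot/config-<version>, which may not hold on every system.

```shell
# Report whether lock debugging options are set in a kernel config file.
# Usage: lockdep_status [config-file]
# Defaults to the running kernel's config (assumed path).
lockdep_status() {
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "unknown"
        return 2
    fi
    # CONFIG_PROVE_LOCKING selects CONFIG_LOCKDEP; either one is enough here.
    if grep -Eq '^CONFIG_(LOCKDEP|PROVE_LOCKING)=y' "$cfg"; then
        echo "enabled"
    else
        echo "disabled"
        return 1
    fi
}
```

Running lockdep_status with no argument checks the currently booted kernel; passing a path checks another installed kernel's config.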

Comment 3 David Howells 2008-03-28 15:30:08 UTC
The log is a bit weird: there appear to be soft lockups occurring in 
vprintk().  This suggests that either someone's holding the spinlock and not
letting go, or that another CPU is hammering the printk's so hard that the
faulting CPU isn't getting a look in.  If this is running in a virtual
partition on a ppc64 machine, then the yield-to-hypervisor nature of spinlocks
there may be exacerbating the situation.
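
To see whether the lockups cluster on one CPU or are spread across several, the messages in the log can be counted per CPU. The function below is a rough sketch: its name is hypothetical, and it assumes each soft lockup line contains the words "soft lockup" and a "CPU#N" token, which should be checked against the attached log.

```shell
# Count "soft lockup" messages per CPU in a syslog file.
# Usage: count_soft_lockups [logfile]   (defaults to /var/log/messages)
count_soft_lockups() {
    log="${1:-/var/log/messages}"
    # Keep only soft lockup lines, pull out the CPU#N token,
    # then tally and list the most frequent CPU first.
    grep -i 'soft lockup' "$log" \
        | grep -o 'CPU#[0-9]*' \
        | sort | uniq -c | sort -rn
}
```

If nearly all hits land on one CPU while another is printing heavily, that would be consistent with the printk contention theory above.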

Comment 4 Peter Martuccelli 2008-04-07 16:28:43 UTC
This is not a blocker for RHEL 5.2.  Moving out to RHEL 5.3 for further
investigation.

Comment 6 RHEL Program Management 2008-07-25 17:05:39 UTC
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 7 Ludek Smid 2008-07-25 21:54:00 UTC
Unfortunately the previous automated notification about the
non-inclusion of this request in Red Hat Enterprise Linux 5.3 used
the wrong text template. It should have read: this request has been
reviewed by Product Management and is not planned for inclusion
in the current minor release of Red Hat Enterprise Linux.

If you would like this request to be reviewed for the next minor
release, ask your support representative to set the next rhel-x.y
flag to "?" or raise an exception.

Comment 9 RHEL Program Management 2014-03-07 12:46:50 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 10 Barry Donahue 2014-03-07 13:42:43 UTC
That sounds like the best plan.

