Bug 161591
| Summary: | scsi_cd_put: inverted refcounting | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | nate.dailey | ||||
| Component: | kernel | Assignee: | Jeff Layton <jlayton> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 4.0 | CC: | coughlan, davej, dledford, jbaron, steved | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | i686 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | RHBA-2007-0304 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2007-05-01 22:59:23 UTC | Type: | --- | ||||
| Bug Blocks: | 176344, 234547 | ||||||
Ok, slab debugging was what was needed. This popped up in dmesg after attempting
to reproduce on a kernel with CONFIG_DEBUG_SLAB. I'll build a kernel with the
above patch and make certain that it fixes the problem here:
scsi0 (6:0): rejecting I/O to dead device
SCSI error: host 0 id 6 lun 0 return code = 4000000
Sense class 0, sense error 0, extended sense 0
scsi0 (6:0): rejecting I/O to dead device
sr0: CDROM (ioctl) error, command: Xpwrite, Read disk info 00 00 00 00 00 00 00
02 00
sr: old sense key No Sense
Non-extended sense class 0 code 0x0
Unable to handle kernel paging request at virtual address 6b6b6b6b
printing eip:
e0145ce1
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: i915 nfsd exportfs lockd nfs_acl parport_pc lp parport
autofs4 i2c_dev i2c_core sunrpc md5 ipv6 button battery ac uhci_hcd ehci_hcd
snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore e1000
sr_mod aic7xxx scsi_mod dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod
CPU: 1
EIP: 0060:[<e0145ce1>] Not tainted VLI
EFLAGS: 00010286 (2.6.9-41.EL.TEST.bz161591.1smp)
EIP is at scsi_device_put+0x3/0x40 [scsi_mod]
eax: 6b6b6b6b ebx: 6b6b6b6b ecx: 6b76f680 edx: c17f86e0
esi: c17f86e4 edi: cc425964 ebp: df65aab4 esp: cc204e48
ds: 007b es: 007b ss: 0068
Process sr_open (pid: 10536, threadinfo=cc204000 task=df12b330)
Stack: e007cde8 e00796c0 cc204000 cc425964 c0162c56 cc4259d8 00000000 d3f69784
c15699c0 def23190 ded2cf08 c015c8da d3f69784 00000000 df069db0 00000001
c015b4f9 df069db0 00000001 0000000c c0123b5b df12b870 df069db0 df12b330
Call Trace:
[<e00796c0>] sr_block_release+0x59/0x6d [sr_mod]
[<c0162c56>] blkdev_put+0x8d/0x18f
[<c015c8da>] __fput+0x55/0x100
[<c015b4f9>] filp_close+0x59/0x5f
[<c0123b5b>] put_files_struct+0x57/0xc0
[<c012476f>] do_exit+0x245/0x404
[<c0124a19>] sys_exit_group+0x0/0xd
[<c012cd46>] get_signal_to_deliver+0x31e/0x346
[<c0105bd4>] do_signal+0x55/0xd9
[<c0129f38>] del_timer+0x5d/0x65
[<c0129fe4>] del_singleshot_timer_sync+0x8/0x21
[<c02d3d48>] schedule_timeout+0x140/0x154
[<c012a6de>] process_timeout+0x0/0x5
[<c012a85a>] sys_nanosleep+0x167/0x1a1
[<c0105c80>] do_notify_resume+0x28/0x38
[<c02d55ba>] work_notifysig+0x13/0x15
Code: 8b 40 10 74 0e c1 e0 07 8d 04 02 ff 80 00 01 00 00 eb 0e 89 f0 e8 7c 9d 0d
e0 ba fa ff ff ff eb 02 31 d2 5b 89 d0 5e c3 53 89 c3 <8b> 00 8b 40 74 8b 10 85
d2 74 26 b8 00 f0 ff ff 21 e0 8b 40 10
<0>Fatal exception: panic in 5 seconds
Patch seems to fix the problem, so I'll propose it internally.

Created attachment 132379
patch that seems to correct problem

Here's the patch sent by IBM that seems to correct the issue.

Committed in stream U5 build 42.7. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2

Description of problem:
I noticed that in Update 1, your linux-2.6.9-scsi-inverted-refcounting.patch fixes an "inverted refcounting" problem in sd.c. The same problem also exists in sr.c. Here's a patch that fixes it.

--- sr.c.orig	2005-06-23 12:38:10.000000000 -0400
+++ sr.c	2005-06-13 17:32:29.000000000 -0400
@@ -155,9 +155,11 @@ static inline struct scsi_cd *scsi_cd_ge
 
 static inline void scsi_cd_put(struct scsi_cd *cd)
 {
+	struct scsi_device *sdev = cd->device;
+
 	down(&sr_ref_sem);
 	kref_put(&cd->kref, sr_kref_release);
-	scsi_device_put(cd->device);
+	scsi_device_put(sdev);
 	up(&sr_ref_sem);
 }
 

Version-Release number of selected component (if applicable):
kernel-2.6.9-5.EL

How reproducible:
Didn't try

Steps to Reproduce:

Additional info:
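
For readers who want to see why the order of the two puts matters, here is a minimal user-space sketch of the pattern. It is not the kernel code: `fake_cd`, `fake_device`, and every function below are hypothetical stand-ins, and the "release" only overwrites the object with the byte 0x6b (the value slab debugging uses to poison freed objects) instead of freeing it, so the stale read can be inspected safely. The 6b6b6b6b fault address in the oops in this report is consistent with that kind of read.

```c
#include <string.h>

/* Hypothetical stand-ins for struct scsi_device and struct scsi_cd. */
struct fake_device {
    int refs;
};

struct fake_cd {
    int refs;
    struct fake_device *device;
};

/* Byte that slab debugging writes over freed objects. */
#define POISON_BYTE 0x6b

/* Simulated release: poison the object rather than free it, so the
 * sketch can look at "freed" memory without undefined behaviour. */
static void fake_cd_release(struct fake_cd *cd)
{
    memset(cd, POISON_BYTE, sizeof(*cd));
}

/* Drop one reference on cd; the last put releases the object. */
static void fake_cd_kref_put(struct fake_cd *cd)
{
    if (--cd->refs == 0)
        fake_cd_release(cd);
}

static void fake_device_put(struct fake_device *dev)
{
    dev->refs--;
}

/* Ordering used by the patch: save the device pointer first, because
 * the put on cd may release the object that holds it. */
static void fake_cd_put_fixed(struct fake_cd *cd)
{
    struct fake_device *dev = cd->device;

    fake_cd_kref_put(cd);
    fake_device_put(dev);
}

/* Shows what the unpatched ordering would read: drop the reference,
 * then inspect cd->device. Returns 1 if every byte of the stored
 * pointer is now the poison byte, 0 otherwise. */
static int stale_device_is_poison(struct fake_cd *cd)
{
    unsigned char raw[sizeof(cd->device)];
    size_t i;

    fake_cd_kref_put(cd);
    memcpy(raw, &cd->device, sizeof(raw));
    for (i = 0; i < sizeof(raw); i++)
        if (raw[i] != POISON_BYTE)
            return 0;
    return 1;
}
```

Calling `fake_cd_put_fixed()` on an object holding the last reference still decrements the device count correctly, while `stale_device_is_poison()` reports that the pointer read after the final put has been overwritten. In the real driver the object is actually freed, so without slab debugging the stale read may appear to work, which would explain why the problem only showed up on a CONFIG_DEBUG_SLAB kernel.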