
Bug 465862

Summary: Warning from rt_mutex code while testing infiniband
Product: Red Hat Enterprise MRG
Component: realtime-kernel
Version: 1.0
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Clark Williams <williams>
Assignee: Steven Rostedt <srostedt>
CC: bhu, davids, gozen, lgoncalv, pzijlstr, srostedt, williams
Target Milestone: 1.1
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-01-22 10:44:51 UTC
Attachments:
patch to correct locking order of ib driver (flags: none)
patch to allow rwlocks to unlock out of order (flags: none)
update rwlock torture test to include checking of unnested locks (flags: none)

Description Clark Williams 2008-10-06 19:39:16 UTC
Description of problem:

Testing of the openib code generated the following traceback on the -65 kernel:

WARNING: at kernel/rtmutex.c:1852 rt_read_fastunlock()
Pid: 16811, comm: ibv_rc_pingpong Not tainted 2.6.24.7-65.el5rt #1

Call Trace:
 [<ffffffff811357b2>] ? free_layer+0x37/0x3f
 [<ffffffff8105f31f>] rt_mutex_up_read+0x1a4/0x232
 [<ffffffff8105fcbc>] rt_up_read+0x9/0xb
 [<ffffffff881922b1>] :ib_uverbs:put_uobj_read+0x15/0x21
 [<ffffffff881922f7>] :ib_uverbs:put_pd_read+0xd/0xf
 [<ffffffff88194f8f>] :ib_uverbs:ib_uverbs_create_qp+0x39c/0x4cf
 [<ffffffff88191ae0>] ? :ib_uverbs:ib_uverbs_qp_event_handler+0x0/0x2d
 [<ffffffff88191843>] :ib_uverbs:ib_uverbs_write+0x96/0xb0
 [<ffffffff810b00d5>] vfs_write+0xc7/0x170
 [<ffffffff810b06e5>] sys_write+0x4a/0x76
 [<ffffffff8100c35e>] traceret+0x0/0x5

Later testing on the MRG 1.0.3 errata kernel generated this traceback (from /var/log/messages):

Oct  2 04:44:35 dhcp71-141 kernel: WARNING: at kernel/rtmutex.c:1896 rt_read_fastunlock()
Oct  2 04:44:35 dhcp71-141 kernel: Pid: 4589, comm: ibv_rc_pingpong Not tainted 2.6.24.7-81.el5rt #1
Oct  2 04:44:35 dhcp71-141 kernel:
Oct  2 04:44:35 dhcp71-141 kernel: Call Trace:
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff81135ca6>] ? free_layer+0x37/0x3f
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff8105f521>] rt_mutex_up_read+0x1d2/0x260
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff8105fe55>] rt_up_read+0x9/0xb
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff883732b9>] :ib_uverbs:put_uobj_read+0x15/0x21
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff883732ff>] :ib_uverbs:put_pd_read+0xd/0xf
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff88375fd2>] :ib_uverbs:ib_uverbs_create_qp+0x39c/0x4cf
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff88372ae0>] ? :ib_uverbs:ib_uverbs_qp_event_handler+0x0/0x2d
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff88372843>] :ib_uverbs:ib_uverbs_write+0x96/0xb0
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff810b0479>] vfs_write+0xc7/0x170
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff810b0a89>] sys_write+0x4a/0x76
Oct  2 04:44:35 dhcp71-141 kernel:  [<ffffffff8100c37e>] traceret+0x0/0x5
Oct  2 04:44:35 dhcp71-141 kernel:


Version-Release number of selected component (if applicable):

kernel-rt-2.6.24.7-81.el5rt

How reproducible:
always

Steps to Reproduce:
1. Install RHEL5.2
2. Install MRG RT kernel
3. Install openib package
4. Reboot

Comment 1 Clark Williams 2008-10-06 19:47:05 UTC
Created attachment 319594 [details]
patch to correct locking order of ib driver 

Patch from srostedt to address this issue:

The ib driver does not release its locks in the reverse order in which it takes them.
The RW locks in RT are very sensitive to this.

Hopefully the attached patch will fix the issue.
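
For readers unfamiliar with the term, the following is a minimal sketch of what "reverse order" means here. It is not the driver code and not the attached patch; it is written in Python purely as an illustration, and the lock names and both helper functions are hypothetical placeholders.

```python
def release_order_is_nested(acquired, released):
    """Return True if `released` is exactly the reverse of `acquired`."""
    return list(released) == list(reversed(acquired))


def nested_release_order(acquired):
    """Return the release order that mirrors the acquisition order."""
    return list(reversed(acquired))


# Placeholder lock names, used only for illustration.
acquired = ["lock_a", "lock_b", "lock_c"]

# Releasing in the same order the locks were taken is NOT nested.
print(release_order_is_nested(acquired, ["lock_a", "lock_b", "lock_c"]))

# Releasing last-taken-first is nested.
print(release_order_is_nested(acquired, nested_release_order(acquired)))
```

A fix on the driver side amounts to reordering the release calls so that the last lock taken is the first one released.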

Comment 2 Luis Claudio R. Goncalves 2008-10-07 12:30:05 UTC
Patch added to MRG kernel -85

Comment 3 Steven Rostedt 2008-10-09 03:31:28 UTC
Created attachment 319823 [details]
patch to allow rwlocks to unlock out of order

This patch fixes the quirk inside rwlocks that expected locks to be released in the reverse order in which they were taken.

Since other places in the kernel may also unlock out of order, this is a better patch than the one already attached.
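
The difference between the two behaviours can be sketched with a toy model. This is not the kernel's rtmutex code and not the attached patch. It assumes, purely for illustration, that each task keeps an ordered list of the read locks it holds; the class name, the `allow_unnested` flag, and the warning text are all hypothetical.

```python
class ReaderLockTracker:
    """Toy model of one task's bookkeeping for the read locks it holds."""

    def __init__(self, allow_unnested):
        self.allow_unnested = allow_unnested
        self.held = []      # most recently acquired lock is last
        self.warnings = []  # stands in for a kernel WARNING

    def read_lock(self, name):
        self.held.append(name)

    def read_unlock(self, name):
        if name not in self.held:
            raise ValueError(f"{name} is not held")
        # Strict mode expects the lock being released to be the newest one.
        if self.held[-1] != name and not self.allow_unnested:
            self.warnings.append(f"unnested unlock of {name}")
        # Drop the most recent entry for this lock, wherever it sits.
        index = len(self.held) - 1 - self.held[::-1].index(name)
        del self.held[index]


def run(allow_unnested):
    tracker = ReaderLockTracker(allow_unnested)
    for name in ("A", "B"):
        tracker.read_lock(name)
    for name in ("A", "B"):  # same order as acquisition, so not nested
        tracker.read_unlock(name)
    return tracker.warnings, tracker.held


print(run(allow_unnested=False))  # strict model records a warning
print(run(allow_unnested=True))   # tolerant model releases quietly
```

In this model, the tolerant variant looks up the lock being released instead of assuming it is the most recent one, so callers that unlock out of order no longer trip the check.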

Comment 4 Steven Rostedt 2008-10-09 03:32:42 UTC
Created attachment 319824 [details]
update rwlock torture test to include checking of unnested locks

This patch updates the rwlock torture test to include testing locks being released in an order that is not nested.

Comment 5 Luis Claudio R. Goncalves 2008-10-09 14:17:01 UTC
Patches added to -85

Comment 7 David Sommerseth 2008-11-26 15:57:22 UTC
Verified by code review.  Tried to trigger the misbehaviour on mrg-13.lab.bos.redhat.com on 2.6.24.7-81 without success.  The problem did indeed exist on dell-pe1950-02.rhts.bos.redhat.com, but that box was not available at the moment of testing.  No errors found using 2.6.24.7-93 on mrg-13.

Found these patches for this bugzilla: 

* ib: release locks in the proper order
2b39f5cb4d843c6d32e55e99eff32ff99518c9cb - bz465862--ib-fix-locking-order.patch

* rt: rwlock fix non nested unlocking
98184ed03651cbaa362a258b74238dbb26631290 - bz465862-rwlock-handle-bad-locking-practices.patch

* rwlock: update torture test for testing unnested locking
695c1c048aa18caebaab7b8645eadc69fbf2f633 - bz465862-rwlock-update-torture-test.patch

These patches were found in the mrg-rt.git tree, for the 2.6.24.7-93 branch.

Comment 9 errata-xmlrpc 2009-01-22 10:44:51 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0009.html