Bug 660871 - mpctl module doesn't release fasync_struct at file close
mpctl module doesn't release fasync_struct at file close
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel (Show other bugs)
5.4
Unspecified Linux
urgent Severity urgent
: rc
: ---
Assigned To: Tomas Henzl
Storage QE
: OtherQA, ZStream
Depends On:
Blocks: 677173
  Show dependency treegraph
 
Reported: 2010-12-07 13:13 EST by David Jeffery
Modified: 2011-07-21 05:28 EDT (History)
11 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Calling the mptctl_fasync() function to enable async notification caused the fasync_struct data structure, which was allocated, to never be freed. fasync_struct remained on the event list of the mptctl module even after a file was closed and released. After the file was closed, fasync_struct had an invalid file pointer which was dereferenced when the mptctl module called the kill_fasync() function to report any events. The use of the invalid file pointer could result in a deadlock on the system because the send_sigio() function tried to acquire the rwlock in the f_owner field of the previously closed file. With this update, a release callback function has been added for the file operations in the mptctl module. fasync_struct is now properly freed when a file is closed, no longer causing a deadlock.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-21 05:28:26 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
add a mptctl_release function to clean up fasync state (682 bytes, patch)
2010-12-08 16:16 EST, David Jeffery
no flags Details | Diff

  None (edit)
Description David Jeffery 2010-12-07 13:13:05 EST
Description of problem:
When the mptctl_fasync() function has been called to enable async notification, the fasync_struct which is allocated is never freed.  It remains on mptctl's event list even after the file is closed and released.

After the file is closed, the fasync_struct now has an invalid file pointer which is dereferenced when the mptctl module calls kill_fasync() to report any events.

The use of the invalid file pointer resulted in a deadlock on one system.  The function send_sigio() tries to acquire the rwlock in the file's f_owner field.  Since the memory has been freed and reused, the memory location which was the lock containes a value which causes the read_lock() attempt to spin forever.

(taken from CAS core 20101125041947)
PID: 0      TASK: ffff8100091c0100  CPU: 1   COMMAND: "swapper"
 #0 [ffff8100091e2f20] crash_nmi_callback at ffffffff8007a3bf
 #1 [ffff8100091e2f40] do_nmi at ffffffff8006585a
 #2 [ffff8100091e2f50] nmi at ffffffff80064ebf
    [exception RIP: __read_lock_failed+5]
    RIP: ffffffff8006216d  RSP: ffff8100091dfd18  RFLAGS: 00000297
    RAX: 0000000000000000  RBX: ffff8100151c2d68  RCX: 0000000000000000
    RDX: 0000000000020001  RSI: 0000000000000012  RDI: ffff81001423ebc0
    RBP: ffff81001423ebc0   R8: 0000000032140ba8   R9: 0000000000000020
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000020001
    R13: 0000000000000012  R14: ffff81001fb80140  R15: 0000000000000002
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #3 [ffff8100091dfd18] __read_lock_failed at ffffffff8006216d
 #4 [ffff8100091dfd18] _read_lock at ffffffff80064b45
 #5 [ffff8100091dfd20] send_sigio at ffffffff800e6a96
 #6 [ffff8100091dfd50] __kill_fasync at ffffffff8004e93c
 #7 [ffff8100091dfd70] kill_fasync at ffffffff800452df
 #8 [ffff8100091dfd90] mptctl_event_process at ffffffff884718e8
 #9 [ffff8100091dfdb0] mpt_base_reply at ffffffff880c8f60
#10 [ffff8100091dfe70] mpt_interrupt at ffffffff880cf85c
#11 [ffff8100091dff20] handle_IRQ_event at ffffffff80010b81
#12 [ffff8100091dff50] __do_IRQ at ffffffff800b9808
#13 [ffff8100091dff90] do_IRQ at ffffffff8006c997
--- <IRQ stack> ---
#14 [ffff8100091d9e48] ret_from_intr at ffffffff8005d615
    [exception RIP: mwait_idle+54]
    RIP: ffffffff800571f4  RSP: ffff8100091d9ef0  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: 0000000000000001  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: ffffffff80301698
    RBP: ffff8100091c02f0   R8: ffff8100091d8000   R9: ffff81001eba7380
    R10: 0000000000000002  R11: 0000000000000008  R12: 00000000029f0b8d
    R13: 00000bdc90695279  R14: ffff8100091c0860  R15: ffff8100091c0100
    ORIG_RAX: ffffffffffffff56  CS: 0010  SS: 0018
#15 [ffff8100091d9ef0] cpu_idle at ffffffff8004939e

The cpu is now stuck trying to to acquire an invalid lock that will never be released.

The mptctl module has no release() function in its file_operations.  It thus never knows about the file being closed and doesn't make a call to fasync_helper() to free up the fasync_struct for the closing file.


Version-Release number of selected component (if applicable):
kernel-2.6.18-164

Steps to reproduce:
To reproduce
1: the mptctl's device file would be opened
2: fcntl() to set FASYNC on the device file handle
3: close the file
4: An mpt event to cause an async notification must occur.
  
Actual results:
kernel deadlocks from mptctl using an invalid pointer.

Expected results:
mptctl would free up the fasync_struct and not cause a deadlock.
Comment 1 David Jeffery 2010-12-08 16:16:02 EST
Created attachment 467583 [details]
add a mptctl_release function to clean up fasync state

I've attached an untested patch.  The upstream kernel's mptctl driver also looks to fail to clean up the fasync_struct at file close.
Comment 4 Tomas Henzl 2010-12-10 10:54:19 EST
Kashyap,
the patch seem to correct the issue. Is the the solution good from your point
of view?
Comment 8 Tomas Henzl 2011-01-20 09:57:36 EST
(In reply to comment #1)
> Created attachment 467583 [details]
> add a mptctl_release function to clean up fasync state
> 
> I've attached an untested patch.  The upstream kernel's mptctl driver also
> looks to fail to clean up the fasync_struct at file close.

David,
I couldn't find it posted, has it already been posted?
Comment 9 Tomas Henzl 2011-01-20 09:59:00 EST
(In reply to comment #8)
> David,
> I couldn't find it posted, has it already been posted?

Sorry, this is confusing - meant posted upstream for example in scsi-misc
Comment 10 Tomas Henzl 2011-01-20 09:59:49 EST
(In reply to comment #4)
> Kashyap,
> the patch seem to correct the issue. Is the the solution good from your point
> of view?

Ping Kashyap,
any thoughts?
Comment 13 kashyap 2011-02-02 06:54:03 EST
(In reply to comment #10)
> (In reply to comment #4)
> > Kashyap,
> > the patch seem to correct the issue. Is the the solution good from your point
> > of view?
> 
> Ping Kashyap,
> any thoughts?

Tomas, Sorry for delay...! Patch absolutely fine.! I have to post this to uptream as well..!
Comment 14 Tomas Henzl 2011-02-02 08:35:40 EST
(In reply to comment #13)
> Tomas, Sorry for delay...! Patch absolutely fine.! I have to post this to
> uptream as well..!

Kashyp,
thanks and for posting this upstream as well.
Let me know when you will have this posted.
Comment 19 Jarod Wilson 2011-02-18 17:39:41 EST
in kernel-2.6.18-244.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
Comment 22 Martin Prpič 2011-04-14 06:31:36 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Calling the mptctl_fasync() function to enable async notification caused the fasync_struct data structure, which was allocated, to never be freed. fasync_struct remained on the event list of the mptctl module even after a file was closed and released. After the file was closed, fasync_struct had an invalid file pointer which was dereferenced when the mptctl module called the kill_fasync() function to report any events. The use of the invalid file pointer could result in a deadlock on the system because the send_sigio() function tried to acquire the rwlock in the f_owner field of the previously closed file. With this update, a release callback function has been added for the file operations in the mptctl module. fasync_struct is now properly freed when a file is closed, no longer causing a deadlock.
Comment 24 Gris Ge 2011-07-04 02:39:25 EDT
Code reviewed, patch found in kernel-2.6.18-272.el5
Sanity Only
Comment 25 errata-xmlrpc 2011-07-21 05:28:26 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html

Note You need to log in before you can comment on or make changes to this bug.