Description of problem:
When you try two flocks, one after the other from the same process, with different file descriptors on the same file, gfs2 trips the kernel BUG at fs/gfs2/glock.c:1118!

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Run the test program flucker.c like ./flucker /mnt/gfs2/foo
2. Boom

Actual results:
Boom

Expected results:
No Boom.

Additional info:
ext3 behaves as expected:
  SH followed by SH - granted
  SH followed by EX - EAGAIN
  EX followed by SH - EAGAIN
  EX followed by EX - EAGAIN
gfs2 trips this assert in all the above cases.

Stack trace:
original: gfs2_flock+0x16a/0x1e9 [gfs2]
new: gfs2_flock+0x16a/0x1e9 [gfs2]
------------[ cut here ]------------
kernel BUG at fs/gfs2/glock.c:1118!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /fs/gfs2/niobe:gfs2/lock_module/block
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lock_dlm gfs2 dlm configfs sunrpc ipv6 video sbs backlight i2c_ec button battery asus_acpi ac lp ata_piix libata sg floppy ide_cd parport_pc parport cdrom i2c_i810 i2c_algo_bit i2c_i801 i2c_core pcspkr e1000 dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<e0d1efb4>]    Not tainted VLI
EFLAGS: 00010246   (2.6.18-44.gfs2abhi.003 #1)
EIP is at gfs2_glock_nq+0xe2/0x184 [gfs2]
eax: 00000020   ebx: d2e4f854   ecx: e0d34090   edx: d12b7ed8
esi: d28bcb14   edi: d0e39980   ebp: d0e39980   esp: d12b7ed4
ds: 007b   es: 007b   ss: 0068
Process flucker (pid: 2553, ti=d12b7000 task=d08cf000 task.ti=d12b7000)
Stack: e0d34090 00000006 00000001 e0d34083 000009f9 e0d34090 00000006 00000001
       e0d34083 000009f9 d0e1c000 00000000 00000000 d28bcb14 de6786c0 00000001
       e0d2732c d28bcb14 dfdb15fc 00000006 d040a9d4 d28bcb04 de6786c0 00000000
Call Trace:
 [<e0d2732c>] gfs2_flock+0x17a/0x1e9 [gfs2]
 [<c046c4a6>] cache_alloc_refill+0x14b/0x44f
 [<c044aba1>] audit_syscall_entry+0x11c/0x14e
 [<e0d271b2>] gfs2_flock+0x0/0x1e9 [gfs2]
 [<c0482ca1>] sys_flock+0x114/0x147
 [<c0404eff>] syscall_call+0x7/0xb
=======================
Code: df 8b 56 20 b8 b3 40 d3 e0 e8 4e 08 72 df ff 76 0c 68 83 40 d3 e0 e8 44 78 70 df ff 77 20 ff 77 14 68 90 40 d3 e0 e8 34 78 70 df <0f> 0b 5e 04 87 3e d3 e0 83 c4 28 8b 5e 0c 8d 4f 48 8b 47 48 eb
EIP: [<e0d1efb4>] gfs2_glock_nq+0xe2/0x184 [gfs2] SS:ESP 0068:d12b7ed4
<0>Kernel panic - not syncing: Fatal exception
Created attachment 183661 [details] Program to reproduce the problem.
Created attachment 193221 [details]
Initial patch

There are two scenarios when doing multiple flocks from the same process:

a) flocks through a single file descriptor. One fd means the same struct file* and the same holder structure for all flocks.

b) flocks through multiple file descriptors. Each fd has a different holder structure.

This patch adds a new function, gfs2_flock_glock_nq(), that is almost identical to gfs2_glock_nq(). It does the list_add from add_to_queue() but does not perform the checks that disallow the same process from queueing multiple holders onto a glock. We need this because of scenario (b), where it is legitimate for multiple flocks to come from the same process through multiple file descriptors.

In scenario (a), when a process requests the second flock through the same file descriptor, we dequeue the first flock, reinitialize the holder with the new flock, and enqueue. In scenario (b), when a process requests the second flock through another file descriptor, we need to find the glock (held by the first flock) and queue another holder (corresponding to the second file descriptor). This goes through gfs2_flock_glock_nq(), which doesn't trip BUG()s when it's the same process requesting the glocks.

Existing problems that this patch doesn't fix:

1) With gfs2, ctrl-c will not break out of a process that is blocked waiting for an flock. So, if a single-threaded process takes a SH flock and then requests a blocking EX flock, it will block. Since the SH flock can't be unlocked, we have a deadlock. If the process had two threads, one for each flock, things go smoothly when the first thread unlocks the SH flock. I'm not sure how this case can be handled, or whether it's acceptable to deadlock when a user's rogue program attempts such a thing.

2) When one process requests promotion or demotion of an flock through the same file descriptor (scenario (a) above) - SH followed by EX, or EX followed by SH - we currently unlock, reinitialize the holder, and relock. There's a race window between the unlock and the relock where another process/node can capture the lock. I don't know whether LM_FLAG_PRIORITY would help; ideally we would have an atomic operation to promote or demote an flock. This bz does not cover that issue, but I have a gfs1 bz that does.
Created attachment 194051 [details]
Second attempt

This patch adds a new flag, GL_FLOCK, to the gfs2_holder structure. It is set on holders of glocks representing flocks. The flag is checked in add_to_queue(), and a process is permitted to queue more than one holder onto a glock when it is set. I'm in the middle of testing this patch and will update this bz with my results.
That patch looks much better I think.
http://post-office.corp.redhat.com/archives/rhkernel-list/2007-September/msg00320.html Posted the rhel5.1 version of this patch to rhkernel-list. Marking this bz POST.
Fixed in 2.6.18-48.el5. You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0959.html