Bug 272301
Summary: | Bad things happen when you attempt multiple flocks from a single process | |
---|---|---|---
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Abhijith Das <adas>
Component: | GFS-kernel | Assignee: | Abhijith Das <adas>
Status: | CLOSED WONTFIX | QA Contact: | Cluster QE <mspqa-list>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 4 | CC: | anandab, kanderso, rkenna, teigland
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | All | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2008-04-14 20:28:25 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 198302 | |
Attachments: | | |
Description
Abhijith Das
2007-08-31 16:47:34 UTC
Created attachment 183681
Program to create the problem.
*** Bug 198302 has been marked as a duplicate of this bug. ***

The real fix for this is quite invasive and might break the already fragile flock code. There is an easy workaround to return -EAGAIN/-ENOSYS when a process tries to flock the same file twice. But this workaround will mask the bug if it ever appears in the field. If we find a real-world test-case that does single-process-multiple-flocks, we can go after this one. Marking it WONTFIX.

We initially came across this bug trying to work out why the nodes in our live cluster were occasionally rebooting. It turned out that one application had a race condition when handling concurrent requests which would cause it to attempt multiple locks on the same file. The result was kernel panics which were causing the reboots:

    Unable to handle kernel NULL pointer dereference at virtual address 0000000c
     printing eip:
    82293ebf
    *pde = 00004001
    Oops: 0000 [#1]
    SMP
    Modules linked in: i2c_dev i2c_core lock_dlm(U) gfs(U) lock_harness(U) ext3 jbd dm_cmirror(U) dm_mirror dlm(U) cman(U) bonding(U) md5 ipv6 aoe(U) dm_mod button battery ac uhci_hcd ehci_hcd tg3 sd_mod floppy ata_piix libata scsi_mod
    CPU:    0
    EIP:    0060:[<82293ebf>]    Not tainted VLI
    EFLAGS: 00010293   (2.6.9-67.0.7.ELhugemem)
    EIP is at add_to_queue+0x2c/0x27b [gfs]
    eax: 78a82030   ebx: 7767141c   ecx: 77671440   edx: 7fdfa524
    esi: 00000000   edi: 7fdfa4fc   ebp: 7fdfa4fc   esp: 70770eec
    ds: 007b   es: 007b   ss: 0068
    Process dod-upgrade-acc (pid: 10054, threadinfo=70770000 task=78a82030)
    Stack: 8222d000 7fdfa518 7767141c 8222d000 7fdfa4fc 822941d6 00000000 70904b88
           00000000 00000480 7767141c 822a95a1 7767141c 00000001 70904b88 77671400
           743107ec 704d5380 7fdfa4fc 80688500 70770f90 7836b8e0 021ad19a 70770f58
    Call Trace:
     [<822941d6>] gfs_glock_nq+0xc8/0x116 [gfs]
     [<822a95a1>] do_flock+0x111/0x182 [gfs]
     [<021ad19a>] selinux_file_lock+0x7f/0x88
     [<822a9673>] gfs_flock+0x0/0x76 [gfs]
     [<0216e462>] sys_flock+0x96/0x120
    Code: 57 56 53 89 c3 51 8b 78 08 8b 87 9c 00 00 00 89 04 24 8b 43 0c 85 c0 0f 84 29 02 00 00 8b 77 28 8d 57 28 39 d6 0f 84 f6 00 00 00 <39> 46 0c 0f 85 e6 00 00 00 f6 43 14 08 75 2d f6 46 14 08 74 27
     <0>Fatal exception: panic in 5 seconds

Even without a full fix, it would be good to find a way to avoid this.
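
Until the kernel side is addressed, an application that might flock the same file twice could refuse the second attempt itself, mirroring the -EAGAIN workaround in user space. The following is a minimal Python sketch of that idea. It is not the program from attachment 183681 and it does not fix the GFS code. The names `second_flock_conflicts` and `FlockGuard` are hypothetical, introduced only for illustration. The first function shows the single-process-multiple-flocks pattern (two descriptors on one file); its result depends on the filesystem, and on an affected GFS mount this pattern is what the report associates with the panic, so it should only be tried on a local filesystem.

```python
import errno
import fcntl
import os
import threading


def second_flock_conflicts(path):
    """Open `path` twice in this process and flock both descriptors.

    Returns True if the second, non-blocking flock is refused. Do not run
    this against an affected GFS mount.
    """
    fd1 = os.open(path, os.O_RDWR)
    fd2 = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd1, fcntl.LOCK_EX)
        try:
            fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                return True
            raise
        return False
    finally:
        # Closing a descriptor also drops any flock held through it.
        os.close(fd2)
        os.close(fd1)


class FlockGuard:
    """Hypothetical per-process registry: at most one flock per file.

    Files are identified by (st_dev, st_ino), so two different paths to
    the same file count as the same file.
    """

    def __init__(self):
        self._mutex = threading.Lock()
        self._held = {}  # (st_dev, st_ino) -> file descriptor

    def acquire(self, path):
        fd = os.open(path, os.O_RDWR)
        st = os.fstat(fd)
        key = (st.st_dev, st.st_ino)
        with self._mutex:
            if key in self._held:
                # Refuse in user space; flock() is never called a second time.
                os.close(fd)
                raise BlockingIOError(errno.EAGAIN, "file already flocked by this process")
            self._held[key] = fd  # reserve before the possibly blocking flock
        try:
            fcntl.flock(fd, fcntl.LOCK_EX)
        except BaseException:
            with self._mutex:
                del self._held[key]
            os.close(fd)
            raise
        return key

    def release(self, key):
        # Unlock and close under the mutex so no other thread can start a
        # new flock on this file while the old one is still held.
        with self._mutex:
            fd = self._held[key]
            fcntl.flock(fd, fcntl.LOCK_UN)
            os.close(fd)
            del self._held[key]
```

A caller would wrap its critical section in `key = guard.acquire(path)` and `guard.release(key)`, treating `BlockingIOError` as "retry later". The guard only covers threads within one process that share the same `FlockGuard` instance; locking between processes or cluster nodes is still left to flock itself.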