Bug 143823
| Summary: | [PATCH] Stale POSIX flock | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 3 | Reporter: | Garik E <kiragon> |
| Component: | kernel | Assignee: | Peter Staubach <staubach> |
| Status: | CLOSED ERRATA | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.0 | CC: | bnocera, jbaron, kiragon, peterm, petrides, riel, tao |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | i686 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | RHSA-2005-663 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2005-09-28 14:40:29 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 156320 | | |
| Attachments: | | | |
Description
Garik E
2004-12-29 04:57:44 UTC
Created attachment 109163 [details]
Patch to remove a stale POSIX flock
Here is a patch I've used.
Hello,
Here is another issue for that bug: if a process that is blocked by the stale POSIX lock gets SIGINT (Ctrl-C from the terminal), it may crash. Here is an oops message:
CPU:    0
EIP:    0060:[<8016582f>]    Tainted: PF
EFLAGS: 00010282
EIP is at __fput [kernel] 0xf (2.4.21-20cpsmp/i686)
eax: f72ffc00   ebx: f72ffc00   ecx: 8667fce4   edx: f72ffc00
esi: 00000000   edi: f6337c80   ebp: 00000000   esp: f1e83f10
ds: 0068   es: 0068   ss: 0068
Process awk (pid: 9572, stackpage=f1e83000)
Stack: 00000296 8667fce4 836fe580 f72ffc00 f6337c80 835801bc 80165817 8017ce72
       836fe580 00000000 00000000 836fe5cc f6fb2e80 00000003 f6337c80 00000000
       80163a97 f6fb2e80 f6337c80 00000001 00000003 f6337c80 00000001 8012d71c
Call Trace:
[<80165817>] fput [kernel] 0x17 (0xf1e83f28)
[<8017ce72>] locks_remove_posix [kernel] 0x132 (0xf1e83f2c)
[<80163a97>] filp_close [kernel] 0x87 (0xf1e83f50)
[<8012d71c>] put_files_struct [kernel] 0x6c (0xf1e83f6c)
[<8012dfea>] do_exit [kernel] 0x1ba (0xf1e83f88)
[<8012e35b>] do_group_exit [kernel] 0x8b (0xf1e83fa4)
[<8012e3a3>] sys_exit_group [kernel] 0x13 (0xf1e83fb8)
Code: 8b 7d 08 89 04 24 e8 76 76 01 00 8b 4b 60 85 c9 0f 85 cd 00
This issue is already resolved in 2.6, in revision 1.63 of fs/locks.c:
http://linux.bkbits.net:8080/linux-2.6/diffs/fs/locks.c@1.63?nav=index.html|src/|src/fs|hist/fs/locks.c

We're seeing this same kernel crash. We have a multithreaded application that uses file locking over NFS. Within 24 hours the RHE3-u3-4 kernels will panic with traces similar to the one above, always referencing "locks_remove_posix". Applying the patch in comment #1 prevents the kernel panic, although the error message added by the patch shows up frequently. Request that an official patch be rolled into U5.

The patch does not solve the problem. It is a recovery operation after the file counter becomes inconsistent.

Is there any known way to reproduce this situation? A reproducible testcase sure would be a help.

Unfortunately I don't have a simple testcase for this crash. You can ask Linux if he has one: he has fixed this bug in 2.6 ...

Well, it's Linus. My misspelling :)

*** Bug 157846 has been marked as a duplicate of this bug. ***

Removing from U5 blocker list.

Well, it's a shameful decision, but frankly, I haven't been expecting anything else. So, does somebody at RH support have the balls to admit that this bug won't be solved in the EL3 lifetime?

A fix for this problem has just been committed to the RHEL3 U6 patch pool this evening (in kernel version 2.4.21-32.6.EL).

Created attachment 115031 [details]
Testcase to reproduce the problem
This is a testcase which shows the dangling lock problem by reproducing
the fcntl/close race.
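For readers without access to the attachment, the following is a minimal sketch of how such a race can be provoked from user space. It is not the attached testcase. One thread tries to take a POSIX lock on a descriptor while a second thread closes that same descriptor, and a separate probe process then checks whether a lock was left behind. The names race_once, lock_is_free and run_rounds are hypothetical. On a kernel that handles the race correctly the probe should always succeed, and even on an affected kernel many rounds may be needed before the window is hit.

```python
import fcntl
import os
import subprocess
import sys
import tempfile
import threading


def race_once(path):
    """One round: a thread requests a POSIX lock while another closes the fd."""
    fd = os.open(path, os.O_RDWR)
    start = threading.Barrier(2)  # release both threads at the same moment

    def locker():
        start.wait()
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            pass  # close() won the race, so the fd is already invalid

    def closer():
        start.wait()
        os.close(fd)  # closing also drops this process's POSIX locks on the file

    threads = [threading.Thread(target=locker), threading.Thread(target=closer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def lock_is_free(path):
    """Probe from a separate process; our own locks never conflict with us."""
    probe = (
        "import fcntl, os, sys\n"
        "fd = os.open(sys.argv[1], os.O_RDWR)\n"
        "try:\n"
        "    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)\n"
        "except OSError:\n"
        "    sys.exit(1)\n"
    )
    return subprocess.run([sys.executable, "-c", probe, path]).returncode == 0


def run_rounds(rounds=50):
    """Race repeatedly on a scratch file; True means no dangling lock was seen."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        for _ in range(rounds):
            race_once(path)
        return lock_is_free(path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print("lock free after race:", run_rounds())
```

The probe runs in its own process because POSIX locks are owned per process, so a dangling lock held by the racing process is only visible from outside it.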
Created attachment 115032 [details]
Patch to remove locks when the close/fcntl race is detected
These changes detect the fcntl/close race and correctly release the lock
which was just acquired.
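The idea described above can be illustrated with a toy user-space model. This is not kernel code and not the attached patch. The class ToyKernel, its methods, and the "RACED" return value are all hypothetical names chosen for the sketch. The model assumes the lock request first resolves the descriptor to a file, that a close can slip in before the lock is recorded, and that the fix re-checks the descriptor afterwards and drops the lock if the descriptor no longer refers to the same file.

```python
class ToyKernel:
    """Toy model of one process's fd table and POSIX lock list (illustrative only)."""

    def __init__(self):
        self.fd_table = {}  # fd number -> file name
        self.locks = set()  # files on which the process holds a POSIX lock

    def open(self, fd, file):
        self.fd_table[fd] = file

    def close(self, fd):
        file = self.fd_table.pop(fd)
        self.locks.discard(file)  # close releases the process's locks on the file

    def setlk(self, fd, racing_close=False, detect_race=True):
        file = self.fd_table[fd]  # step 1: resolve the fd to a file
        if racing_close:
            self.close(fd)        # another thread closes the fd in this window
        self.locks.add(file)      # step 2: the lock is recorded
        if detect_race and self.fd_table.get(fd) != file:
            self.locks.discard(file)  # step 3: release the lock just acquired
            return "RACED"            # assumption: how the caller is told is a sketch choice
        return "OK"


if __name__ == "__main__":
    old = ToyKernel()
    old.open(3, "data")
    print(old.setlk(3, racing_close=True, detect_race=False), old.locks)

    new = ToyKernel()
    new.open(3, "data")
    print(new.setlk(3, racing_close=True), new.locks)
```

Without the re-check, the model ends with a lock on a file for which no descriptor remains, which corresponds to the stale lock reported in this bug. With the re-check, the lock list is empty again.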
Created attachment 115033 [details]
Patch to remove locks when the close/fcntl race is detected
These changes detect the fcntl/close race and correctly release the lock
which was just acquired.
*** Bug 157846 has been marked as a duplicate of this bug. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2005-663.html