Bug 1552621

Summary: POSIX fcntl() locks are lost across execve() if any threads are running
Product: Red Hat Enterprise Linux 7 Reporter: Daniel Berrangé <berrange>
Component: kernelAssignee: Eric Biederman <ebiederm>
kernel sub component: File Systems QA Contact: Kun Wang <kunwan>
Status: CLOSED WONTFIX Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bfields, eblake, fweimer, jfehlig, jlayton, xzhou
Version: 7.5   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1591544 1613648 (view as bug list) Environment:
Last Closed: 2021-02-15 07:37:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1591544    
Attachments:
Description Flags
Demo program to reproduce problem illustrated none

Description Daniel Berrangé 2018-03-07 13:00:20 UTC
Created attachment 1405329 [details]
Demo program to reproduce problem illustrated

Description of problem:
Traditional POSIX fcntl() locks are supposed to be preserved across an execve(), provided FD_CLOEXEC is not set on the file descriptor. e.g. fcntl() man page says:

  "Record locks are not inherited by a child created via fork(2),
   but are preserved across an execve(2)."

In a single threaded program this does indeed work correctly, but if any threads exist at time of execve(), the locks are being mistakenly released.

The attachment provides a demo program that can illustrate the problem.

First compile with -DJOIN_THREADS to ensure threads exit before execve(), and observe that locks are preserved across execve():

# gcc -o lockcheck -lpthread -DJOIN_THREADS lockcheck.c 

# ./lockcheck 
Lock held
Thread running
Thread has exited
Lock held
Re-exec()ing self with lock-1452.txt locked
Lock held


Now recompile without -DJOIN_THREADS, to see what happens when threads are left running at exevcve()

# gcc -o lockcheck -lpthread lockcheck.c 

# ./lockcheck 
Lock held
Thread running
Thread left running
Lock held
Re-exec()ing self with lock-1462.txt locked
Lock available

So the lock is mistakenly released when threads are running.

This appears to be long-standing buggy behaviour in Linux - RHEL5 and 6 kernels both show same behaviour.

Version-Release number of selected component (if applicable):
3.10.0-693.17.1.el7.x86_64

How reproducible:
Always.

Comment 3 Jeff Layton 2018-03-07 14:05:24 UTC
Fixable I think, but it's not trivial.

I think we can just walk the fdtable after we unshare_files() in execve code, and change all of the lock fl_owners to point to the new fdtable. That would be enough to preserve them when the other threads die. The nasty bit is unwinding that if execve subsequently fails.

It's possible that those other threads could add new locks after the unshare but before we go to change the owner back. If that happens we may need to re-merge those locks with the new ones, which is the nasty bit.

One idea would be to SIGSTOP the other threads before we do the execve. They're presumably going to be dead anyway so there's no reason to run them. But...we'd have to SIGCONT them on failure, and not do that to any that were already stopped or being ptraced, etc.

David Howells suggested allocating a one byte lock cookie that could be passed
over execve. Then we could pass ownership without having to change anything. That's an option but it adds overhead in the case where there is no locking involved.

Comment 4 Jeff Layton 2018-03-07 15:37:05 UTC
Ahh, we don't need to do it before the other threads are killed. We hold a reference to the old files_struct in that case, so we should be able to just change them in the case where we're going to return success.

Working on a draft patch now.

Comment 5 Jeff Layton 2018-03-07 18:49:22 UTC
Ok, patch sent to linux-fsdevel.org. It'll need to be backported to rhel7 if we're interested in fixing it there. Not sure if this approach is acceptable or not, but it should at least get a discussion going in that direction.

Comment 6 Jeff Layton 2018-04-10 15:00:30 UTC
The patch I had would be problematic for NFS, CIFS and other filesystems that use fl_owner_t. Eric B. had a proposed scheme to avoid the unshare_files call in most cases, but it's not trivial to implement.

Comment 7 Murphy Zhou 2018-08-07 01:47:55 UTC
Hi Jeff, any progress upstream? Can we expect this in RHEL-7.6?
Thanks!

Comment 11 RHEL Program Management 2021-02-15 07:37:47 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.