Bug 1986138

Summary: Lockd invalid cast to nlm_lockowner
Product: Red Hat Enterprise Linux 8 Reporter: Benjamin Coddington <bcodding>
Component: kernel Assignee: Benjamin Coddington <bcodding>
kernel sub component: NFS QA Contact: Zhi Li <yieli>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: ajmitchell, ampatil, brdeoliv, dwysocha, jbyrd, jiyin, kcleveng, knweiss, lmiksik, nfs-maint, nmurray, rbergant, u.sibiller, xzhou, yieli, yoyang
Version: 8.5 Keywords: Regression, Triaged, ZStream
Target Milestone: beta   
Target Release: 8.5   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-4.18.0-341.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1991327 2000899 2010820 Environment:
Last Closed: 2021-11-09 19:25:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1991327, 2000899, 2010820    

Description Benjamin Coddington 2021-07-26 18:10:37 UTC
An F_GETLK sent to lockd that returns a conflicting lock can cause a crash, because the fl_owner of that conflicting lock is not guaranteed to be a struct nlm_lockowner.  We need the following patch, which is still pending upstream:

https://lore.kernel.org/linux-nfs/f94e02c019495fea4495fbef7498f342d5848dac.1627217317.git.bcodding@redhat.com/T/#u

Here's a simple demonstration for a system that has exported /exports and mounted that export over NFSv3 at /mnt/localhost:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>

#define LOCALFILE    "/exports/foo"
#define NFSFILE    "/mnt/localhost/foo"

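int
main(int argc, char **argv)
{
	int local_fd, nfs_fd, ret;

	struct flock lck_one = {
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 4,
	};

	struct flock lck_two = {
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 4,
	};

	/* Open the same file twice: directly on the export and via the NFS mount. */
	local_fd = open(LOCALFILE, O_RDWR|O_CREAT, 0666);
	if (local_fd < 0) {
		perror("open " LOCALFILE);
		return 1;
	}
	nfs_fd = open(NFSFILE, O_RDWR|O_CREAT, 0666);
	if (nfs_fd < 0) {
		perror("open " NFSFILE);
		return 1;
	}

	/* Take a write lock locally, so the lock is not owned by lockd. */
	lck_one.l_type = F_WRLCK;
	ret = fcntl(local_fd, F_SETLKW, &lck_one);
	if (ret < 0) {
		perror("fcntl F_SETLKW");
		return 1;
	}

	/*
	 * Test the same range over NFS. The original left l_type zeroed,
	 * which is F_RDLCK on most Linux architectures; set it explicitly.
	 */
	lck_two.l_type = F_RDLCK;
	ret = fcntl(nfs_fd, F_GETLK, &lck_two);
	if (ret < 0) {
		perror("fcntl F_GETLK");
		return 1;
	}

	return 0;
}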
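
On a kernel that does not crash, the struct flock filled in by F_GETLK can be inspected to confirm that the conflict was reported. The helpers below are a minimal sketch, not part of the original reproducer: getlk_found_conflict() and print_getlk_result() are hypothetical names, and the only behaviour relied on is the documented fcntl(2) rule that F_GETLK sets l_type to F_UNLCK when nothing blocks the request and otherwise describes one blocking lock.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: 1 if F_GETLK described a blocking lock, 0 if the range is free. */
static int getlk_found_conflict(const struct flock *lck)
{
	assert(lck != NULL);
	return lck->l_type != F_UNLCK;
}

/* Hypothetical helper: print what F_GETLK wrote back into the struct flock. */
static void print_getlk_result(const struct flock *lck)
{
	if (!getlk_found_conflict(lck)) {
		printf("no conflicting lock\n");
		return;
	}
	printf("conflicting %s lock: start=%lld len=%lld pid=%ld\n",
	       lck->l_type == F_WRLCK ? "write" : "read",
	       (long long)lck->l_start, (long long)lck->l_len,
	       (long)lck->l_pid);
}
```

In the reproducer this would be called as print_getlk_result(&lck_two) after the F_GETLK call succeeds. What l_pid contains for a lock reported through NFS depends on the kernel version, so the sketch prints it without interpreting it.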

Comment 29 Zhi Li 2021-09-14 01:36:27 UTC
Moving to VERIFIED according to Comment#26.

Comment 31 Dave Wysochanski 2021-09-14 14:43:21 UTC
This is a regression from 8.2.0 kernels (4.18.0-193*el8), with a simple fix, per Roberto's analysis of the commit in question:
~~~
Since 8.3 we got this commit :
...
    [fs] lockd: Show pid of lockd for remote locks
...
-       conflock->svid = lock->fl.fl_pid;
+       conflock->svid = ((struct nlm_lockowner *)lock->fl.fl_owner)->pid;  <<<---
~~~
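
The hazard in that one-line change can be modelled in userspace. The sketch below is an illustration only, not the kernel code or the actual patch: model_nlm_lockowner, model_file_lock, model_test_lock() and model_conflict_svid() are hypothetical stand-ins, and it assumes that a lock test which finds a conflict overwrites the request with the details of the conflicting lock, including its owner pointer. Under that assumption the owner pointer may no longer refer to lockd's own owner structure, so the pid has to come from the generic pid field rather than from a cast.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct model_nlm_lockowner {
	int   count;
	pid_t pid;
};

struct model_file_lock {
	void  *fl_owner;	/* opaque: real type depends on who took the lock */
	pid_t  fl_pid;
};

/*
 * Models a lock test that finds a conflict: the request is overwritten
 * with the holder's details, so req->fl_owner now points at whatever
 * the holder used as its owner (assumption stated in the text above).
 */
static void model_test_lock(struct model_file_lock *req,
			    const struct model_file_lock *holder)
{
	assert(req != NULL && holder != NULL);
	*req = *holder;
}

/* Reads the pid from the generic field; makes no assumption about fl_owner. */
static pid_t model_conflict_svid(const struct model_file_lock *fl)
{
	return fl->fl_pid;
}
```

After model_test_lock() runs against a locally held lock, casting req.fl_owner to struct model_nlm_lockowner * and reading ->pid would interpret unrelated memory, which is the pattern marked in the quoted diff; model_conflict_svid() avoids that by never dereferencing the owner.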

Comment 38 errata-xmlrpc 2021-11-09 19:25:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4356