Bug 2240280

Summary: [CephFS-NFS] CEPH_FS_NONBLOCKING_IO is stuck when compiling the Linux kernel in the NFS mount director
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Frank Filz <ffilz>
Component: CephFS Assignee: Frank Filz <ffilz>
Status: CLOSED ERRATA QA Contact: Manisha Saini <msaini>
Severity: high Docs Contact: Rivka Pollack <rpollack>
Priority: unspecified    
Version: 7.0 CC: akraj, ceph-eng-bugs, cephqe-warriors, hyelloji, tserlin, vdas
Target Milestone: ---   
Target Release: 7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-18.2.0-47.el9cp Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-12-13 15:24:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2237662    

Description Frank Filz 2023-09-22 20:29:45 UTC
Description of problem:

Originally reported in https://github.com/nfs-ganesha/nfs-ganesha/issues/988

My ganesha version is 5.5.
The Ceph version is the latest, and the NONBLOCKING_IO build option is on.

When my client mounts the NFS directory, both reading and writing are normal.

But when I tried to compile the Linux kernel (or other software) in the mounted directory, the build quickly got stuck.

I analyzed the ganesha log and found that one client's read request never completed.
For that read request, ceph_ll_nonblocking_readv_writev returned 0, but the Ceph client never called the completion callback.
I then traced the request in the Ceph client and found that the file being read has size 0, while the request asked for offset 0 and len 8192.
In this case, Ceph returns 0 immediately, treats the read as complete, and never calls the callback function, so ganesha keeps waiting for a completion that never arrives.
see:
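
To make the failure mode easier to follow, here is a small illustrative model of the behavior described above. It is not the Ceph source the report points to. Every name in it (PendingRead, submit_read, notify_on_empty) is hypothetical, and the callback is invoked synchronously for simplicity, whereas a real completion would arrive asynchronously. The notify_on_empty=True path shows one plausible remedy and is an assumption; the actual change is in the upstream fix.

```python
# Illustrative model only: these names are hypothetical, not the libcephfs API.
from dataclasses import dataclass
from typing import Callable


@dataclass
class PendingRead:
    """Caller-side state for one read; done flips when the callback fires."""
    done: bool = False
    result: int = -1

    def on_complete(self, nbytes: int) -> None:
        self.done = True
        self.result = nbytes


def submit_read(file_size: int, offset: int, length: int,
                on_complete: Callable[[int], None],
                notify_on_empty: bool) -> int:
    """Model of a nonblocking read submission.

    Returns 0 to mean "request accepted"; the byte count is expected to
    arrive through on_complete.
    """
    nbytes = max(0, min(length, file_size - offset))
    if nbytes == 0 and not notify_on_empty:
        # Behavior described in the report: the read is treated as already
        # finished, 0 is returned, and the callback is never invoked.
        return 0
    # Otherwise the completion is delivered, even for a zero-byte read.
    on_complete(nbytes)
    return 0


# Empty file, offset 0, len 8192, as in the report.
stuck = PendingRead()
submit_read(0, 0, 8192, stuck.on_complete, notify_on_empty=False)

fixed = PendingRead()
submit_read(0, 0, 8192, fixed.on_complete, notify_on_empty=True)

print(stuck.done, fixed.done)  # prints: False True
```

In the first call the caller sees a return value of 0 and waits for a callback that never fires, which matches the stalled read seen in the ganesha log. In the second call the zero-byte result is reported through the callback, so the request can be completed.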

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results: Kernel build stalls


Expected results: Kernel build completes as expected


Additional info:

Comment 1 Frank Filz 2023-09-22 20:32:56 UTC
Upstream fix is available and merged:

https://github.com/ceph/ceph/pull/53407

Patch backported and merged into ceph-7.0-rhel-patches

Comment 8 Frank Filz 2023-10-11 21:05:56 UTC
As a bug introduced by the async/nonblocking work, I don't think this requires doc text. Please advise on how to proceed.

Comment 9 errata-xmlrpc 2023-12-13 15:24:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7780