RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1767664 - Backport CIFS stale ESTALE handling and dentry revalidation patches
Summary: Backport CIFS stale ESTALE handling and dentry revalidation patches
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: ---
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: rc
: 8.0
Assignee: Dave Wysochanski
QA Contact: xiaoli feng
URL:
Whiteboard:
: 1768695 (view as bug list)
Depends On:
Blocks: 1771657
TreeView+ depends on / blocked
 
Reported: 2019-10-31 23:35 UTC by Dave Wysochanski
Modified: 2020-04-28 16:32 UTC (History)
5 users (show)

Fixed In Version: kernel-4.18.0-150.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1771657 (view as bug list)
Environment:
Last Closed: 2020-04-28 16:31:50 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2020:1769 0 None None None 2020-04-28 16:32:19 UTC

Description Dave Wysochanski 2019-10-31 23:35:18 UTC
Request to backport the following to RHEL kernels.

commit 30573a82fb179420b8aac30a3a3595aa96a93156
Author: Pavel Shilovsky <piastryyy>
Date:   Mon Sep 30 10:06:18 2019 -0700

    CIFS: Gracefully handle QueryInfo errors during open
    
    Currently if the client identifies problems when processing
    metadata returned in CREATE response, the open handle is being
    leaked. This causes multiple problems like a file missing a lease
    break by that client which causes high latencies to other clients
    accessing the file. Another side-effect of this is that the file
    can't be deleted.
    
    Fix this by closing the file after the client hits an error after
    the file was opened and the open descriptor wasn't returned to
    the user space. Also convert -ESTALE to -EOPENSTALE to allow
    the VFS to revalidate a dentry and retry the open.
    
    Cc: <stable.org>
    Signed-off-by: Pavel Shilovsky <pshilov>
    Signed-off-by: Steve French <stfrench>

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 3758237bf951..5ad15de2bb4f 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -253,6 +253,12 @@ int cifs_posix_open(char *full_path, struct inode **pinode,
                rc = cifs_get_inode_info(&inode, full_path, buf, inode->i_sb,
                                         xid, fid);
 
+       if (rc) {
+               server->ops->close(xid, tcon, fid);
+               if (rc == -ESTALE)
+                       rc = -EOPENSTALE;
+       }
+
 out:
        kfree(buf);
        return rc;

Comment 2 Dave Wysochanski 2019-11-04 19:15:26 UTC
Trying to refine reproducer to see if we can find increasing slab usage or something else.  Looks like this reproduces ESTALE fairly easily.  server is running 4.18.0-80.el8

[1]   Running                 tcpdump -i ens3 -w /tmp/tcpdump.pcap host 192.168.122.44 &  (wd: /tmp)
[2]-  Running                 nohup stap -v -e 'probe module("cifs").function("cifs_get_inode_info").return { if ($return) { printf("%s %s returning %d\n", tz_ctime(gettimeofday_s()), ppfunc(), $return) } }' &
[3]+  Running                 while true; do
    python3 -c "with open('/mnt/example/bz1767664/1') as f: print(f)"; rm -i -f /mnt/example/bz1767664/1; touch /mnt/example/bz1767664/1; touch /mnt/example/bz1767664/2;
done &

Roughly every 30 seconds:
# touch: cannot touch '/mnt/example/bz1767664/1': No such file or directory
touch: setting times of '/mnt/example/bz1767664/2': No such file or directory
Traceback (most recent call last):
  File "<string>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/example/bz1767664/1'
touch: setting times of '/mnt/example/bz1767664/2': No such file or directory
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OSError: [Errno 116] Stale file handle: '/mnt/example/bz1767664/1'


# tail nohup.out 
Mon Nov  4 13:37:08 2019 EST cifs_get_inode_info returning -2
Mon Nov  4 13:37:08 2019 EST cifs_get_inode_info returning -2
Mon Nov  4 13:37:08 2019 EST cifs_get_inode_info returning -2
Mon Nov  4 13:37:08 2019 EST cifs_get_inode_info returning -116
Mon Nov  4 13:37:08 2019 EST cifs_get_inode_info returning -116
Mon Nov  4 13:37:38 2019 EST cifs_get_inode_info returning -2
Mon Nov  4 13:37:38 2019 EST cifs_get_inode_info returning -2
Mon Nov  4 13:37:39 2019 EST cifs_get_inode_info returning -2
Mon Nov  4 13:37:39 2019 EST cifs_get_inode_info returning -116
Mon Nov  4 13:37:39 2019 EST cifs_get_inode_info returning -116

Comment 3 Dave Wysochanski 2019-11-04 19:54:14 UTC
This series had 2 other patches, so 3 patches total:
https://patchwork.kernel.org/patch/11166977/
https://patchwork.kernel.org/patch/11166979/
https://patchwork.kernel.org/patch/11166981/

30573a82fb17 CIFS: Gracefully handle QueryInfo errors during open
c82e5ac7fe35 CIFS: Force revalidate inode when dentry is stale
0b3d0ef9840f CIFS: Force reval dentry if LOOKUP_REVAL flag is set

Comment 5 Frank Sorenson 2019-11-12 17:55:12 UTC
This is also reproduced using 2 smb clients mounting the same share:

*** client 1 ***
# mount //vm3/user1 /mnt/vm3 -overs=3,sec=ntlmssp,credentials=/root/.user1_smb_creds
# cd /mnt/vm3
# echo foo > foo ; echo bar > bar

# cat foo bar
foo
bar


*** client 2 ***
# mount //vm3/user1 /mnt/vm3 -overs=3,sec=ntlmssp,credentials=/root/.user1_smb_creds
# cd /mnt/vm3

# cat foo bar
foo
bar


*** smb server ***
# smbstatus | grep -i locked -A1000
No locked files


*** client 2 ***
# while true ; do cat foo bar >/dev/null 2>/dev/null ; cat bar || break ; done
bar
bar
bar
...


*** client 1 ***
# mv -f foo bar


*** client 2 ***
...
bar
bar
bar
cat: bar: Stale file handle


*** smb server ***

# smbstatus | grep -i locked -A1000
Locked files:
Pid          Uid        DenyMode   Access      R/W        Oplock           SharePath   Name   Time
--------------------------------------------------------------------------------------------------
22254        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   bar   Tue Nov 12 11:20:12 2019
22254        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   bar   Tue Nov 12 11:20:12 2019
22254        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   bar   Tue Nov 12 11:20:12 2019
22254        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   bar   Tue Nov 12 11:20:12 2019

Comment 7 Phillip Lougher 2019-11-15 17:09:54 UTC
Patch(es) available on kernel-4.18.0-150.el8

Comment 11 Bruno Meneguele 2019-12-10 04:41:12 UTC
*** Bug 1768695 has been marked as a duplicate of this bug. ***

Comment 13 errata-xmlrpc 2020-04-28 16:31:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1769


Note You need to log in before you can comment on or make changes to this bug.