Bug 1771687 - Process killed while opening a file can result in leaked open handle on the server
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: cifs-maint
QA Contact: xiaoli feng
URL:
Whiteboard:
Depends On:
Blocks: 1771691
Reported: 2019-11-12 19:14 UTC by Frank Sorenson
Modified: 2024-06-13 22:18 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1771691 1927839
Environment:
Last Closed: 2019-12-12 00:29:48 UTC
Target Upstream Version:
Embargoed:


Links
System                              ID       Private  Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)   4583551  0        None      None    None     2022-06-27 01:07:17 UTC

Internal Links: 1927839

Description Frank Sorenson 2019-11-12 19:14:30 UTC
Description of problem:

If a process opening a file is killed while waiting for a SMB2_CREATE response from the server, the response may not be handled by the client, leaking an open file handle on the server.


Version-Release number of selected component (if applicable):

all kernels (RHEL 7, RHEL 8, upstream)


How reproducible:

easy


Steps to Reproduce:

# mount //vm3/user1 /mnt/vm3 -o vers=3,sec=ntlmssp,credentials=/root/.user1_smb_creds
# cd /mnt/vm3
# echo foo > foo

# for i in {1..100} ; do cat foo >/dev/null 2>&1 & sleep 0.0001 ; kill -9 $! ; done

(increase count if necessary--100 appears sufficient to cause multiple leaked file handles)
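
For repeated runs, the loop above can be wrapped in a small function so that the file, the iteration count, and the delay are adjustable. This is only a sketch of the same steps: the function name race_open_kill and its variable names are made up for illustration, and the fractional sleep assumes a sleep implementation that accepts it (GNU coreutils does).

```shell
# race_open_kill FILE [COUNT] [DELAY]
# Repeatedly start a reader on FILE in the background and SIGKILL it after
# DELAY seconds, mirroring the one-line loop in the steps to reproduce.
# The function and variable names are illustrative, not part of any tool.
race_open_kill() {
    rok_file=$1
    rok_count=${2:-100}      # same default as the original loop
    rok_delay=${3:-0.0001}   # fractional seconds: not guaranteed by POSIX sleep
    rok_i=0
    while [ "$rok_i" -lt "$rok_count" ]; do
        cat "$rok_file" >/dev/null 2>&1 &
        rok_pid=$!
        sleep "$rok_delay"
        # The reader may already have exited; ignore a failed kill.
        kill -9 "$rok_pid" 2>/dev/null || true
        # Reap the child so killed readers do not pile up as zombies.
        wait "$rok_pid" 2>/dev/null || true
        rok_i=$((rok_i + 1))
    done
    return 0
}
```

On the CIFS mount from the steps above this would be called as "race_open_kill /mnt/vm3/foo 100". Whether a given iteration actually interrupts an in-flight open depends on timing, so the count and delay may need tuning.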


Actual results:

the client stops waiting for the response, and outputs the following message when the response arrives:
    CIFS VFS: Close unmatched open

the server leaks an open file handle; on a Samba server this can be seen with the following:

# smbstatus | grep -i Locked -A1000
Locked files:
Pid          Uid        DenyMode   Access      R/W        Oplock           SharePath   Name   Time
--------------------------------------------------------------------------------------------------
25936        501        DENY_NONE  0x80        RDONLY     NONE             /home/user1   .   Tue Nov 12 12:29:24 2019
25936        501        DENY_NONE  0x80        RDONLY     NONE             /home/user1   .   Tue Nov 12 12:29:24 2019
25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   foo   Tue Nov 12 12:29:24 2019
25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   foo   Tue Nov 12 12:29:24 2019
25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   foo   Tue Nov 12 12:29:24 2019
25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   foo   Tue Nov 12 12:29:24 2019
25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)       /home/user1   foo   Tue Nov 12 12:29:24 2019
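
To compare runs, the entries for one file in the "Locked files" listing can be counted instead of read by eye. The sketch below assumes the column layout shown above (Name is the eighth whitespace-separated field) and that neither the share path nor the file name contains whitespace; the function name count_open_handles is hypothetical.

```shell
# count_open_handles NAME
# Read smbstatus output on stdin and print how many rows in the
# "Locked files:" section refer to the file NAME.
# Assumes the column layout shown above and no whitespace in paths or names.
count_open_handles() {
    awk -v name="$1" '
        /^Locked files:/ { in_locked = 1; next }            # section starts here
        in_locked && $1 ~ /^[0-9]+$/ && $8 == name { n++ }  # data rows begin with a PID
        END { print n + 0 }                                 # print 0 when nothing matched
    '
}
```

Usage would be "smbstatus | count_open_handles foo". If no client process still has the file open, a non-zero count after the reproducer finishes suggests leaked handles.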


Expected results:

the client handles the open response, and then closes the file (can the create/open be canceled?)


Additional info:

Comment 4 Dave Wysochanski 2019-12-04 21:30:33 UTC
FWIW, the 3 upstream commits in the series for this bug (currently in cifs-next, not yet in Linus's tree) do not backport cleanly against kernel-3.10.0-1117.el7, which likely means we need more commits:
9150c3adbf24d77cfba37f03639d4a908ca4ac25 CIFS: Close open handle after interrupted close
7b71843fa7028475b052107664cbe120156a2cfc CIFS: Do not miss cancelled OPEN responses
86a7964be7afaf3df6b64faaa10a7032d2444e51 CIFS: Fix NULL pointer dereference in mid callback

9150c3adbf24d77cfba37f03639d4a908ca4ac25 depends on 97ca1762246d6eeb1b48dff8a179d1a92ce24227
- seems ok to backport

7b71843fa7028475b052107664cbe120156a2cfc may depend on ?? - maybe afe6f65353b64 and 3190b59a050ec and others?
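
One way to check this kind of dependency chain is to cherry-pick the candidate commits, in order, onto a throwaway worktree and see where the first failure occurs. The following is a rough sketch, not the procedure used for this bug: the function name check_backport is made up, it must be run from inside the kernel git repository, and it assumes a git new enough to provide "git worktree remove".

```shell
# check_backport BASE COMMIT...
# Cherry-pick each COMMIT in order onto a temporary worktree checked out at
# BASE. Prints one line per commit tried; returns 0 if all apply, 1 at the
# first commit that fails (conflict or otherwise), 2 on setup errors.
# The function name is hypothetical. Run it from inside the repository.
check_backport() {
    cb_base=$1
    shift
    cb_tree=$(mktemp -d) || return 2
    git worktree add --detach "$cb_tree" "$cb_base" >/dev/null 2>&1 || return 2
    cb_rc=0
    for cb_commit in "$@"; do
        if git -C "$cb_tree" cherry-pick "$cb_commit" >/dev/null 2>&1; then
            echo "clean:  $cb_commit"
        else
            echo "failed: $cb_commit"
            # Undo the partial cherry-pick before cleaning up.
            git -C "$cb_tree" cherry-pick --abort >/dev/null 2>&1 || true
            cb_rc=1
            break
        fi
    done
    # Discard the temporary worktree; the main checkout is never touched.
    git worktree remove --force "$cb_tree" >/dev/null 2>&1 || true
    return "$cb_rc"
}
```

For example, assuming a local ref named kernel-3.10.0-1117.el7 exists, "check_backport kernel-3.10.0-1117.el7 97ca1762246d6eeb1b48dff8a179d1a92ce24227 9150c3adbf24d77cfba37f03639d4a908ca4ac25" would try the prerequisite first and then the fix that depends on it.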

The last commit may be peripheral to this bug. Commit 86a7964be7afaf3df6b64faaa10a7032d2444e51 seems to depend on a fairly invasive non-stable patch that adds 'cifs_credits':
commit 335b7b62ffb69d18055f2bb6f3a029263a07c735
Author: Pavel Shilovsky <pshilov>
Date:   Wed Jan 16 11:12:41 2019 -0800

    CIFS: Respect reconnect in MTU credits calculations
    
    Every time after a session reconnect we don't need to account for
    credits obtained in previous sessions. Introduce new struct cifs_credits
    which contains both credits value and reconnect instance of the
    time those credits were taken. Modify a routine that add credits
    back to handle the reconnect instance by assuming zero credits
    if the reconnect happened after the credits were obtained and
    before we decided to add them back due to some errors during sending.
    
    This patch fixes the MTU credits cases. The subsequent patch
    will handle non-MTU ones.
    
    Signed-off-by: Pavel Shilovsky <pshilov>
    Signed-off-by: Steve French <stfrench>


I'm not sure how likely this is to make RHEL7.8, given that the patches are only just going upstream and given dependencies such as the above.  The patches addressing this bug do have stable cc on them, though, and are not a huge number of lines.  It is not inconceivable that they could be fixed up, assuming the impact is high enough and there is no workaround, etc., but Ronnie can say more.

NOTE: If they go into 7.8 we need to make sure they also go into 8.2

Comment 8 Mayank Thapliyal 2021-02-08 10:54:43 UTC
Hi,

Our field teams are reporting customers on RHEL (7.7 and others) who are looking for this fix for RHEL 7 SKUs as well.
Could the Red Hat team please work on this fix for supported RHEL 7 SKUs as well?

Comment 9 Dave Wysochanski 2021-02-08 14:38:30 UTC
(In reply to Mayank Thapliyal from comment #8)
> Hi,
> 
> Our field teams are reporting customers on RHEL (7.7 and others) who are
> looking for this fix for RHEL 7 SKUs as well.

How many customers would you say are affected?  And do you have a specific customer for whom a case could be filed, so we can work on a more targeted fix?  If we have at least one customer that is willing to try test kernels, it will increase the chances that we can get something into RHEL7, though there are no guarantees.

At the time of the original assessment, it was thought that this issue was very hard to hit, so not many customers would be affected.  If you're seeing something different we may need to dig in further.

> Could the Red Hat team please work on this fix for supported RHEL 7 SKUs as
> well?

We had decided against this patchset over a year ago, so to set expectations, it may be unlikely to get fixed in RHEL7.  However, I'm not sure whether some specific patch or patches could still be appropriate for RHEL7 given the lifecycle.

We're at a point where many customers are still on RHEL7, but need to start migrating to RHEL8 due to lifecycle.  This bug is just one of many examples of something that is fixed in RHEL8 but unfortunately still exists on RHEL7.

Comment 11 Mayank Thapliyal 2021-02-11 04:12:31 UTC
We have at least one big customer (a big European car manufacturer) who is using these VMs on Azure. There are also smaller 3PP publishers. Others who are not yet aware of this leak may come forward later.

Comment 12 Dave Wysochanski 2021-02-11 09:47:58 UTC
(In reply to Mayank Thapliyal from comment #11)
> We have at least one big customer (a big European car manufacturer) who is
> using these VMs on Azure. There are also smaller 3PP publishers. Others who
> are not yet aware of this leak may come forward later.

Do you have a current customer that is hitting this problem, and would be willing to use a test kernel and verify the fix?

Comment 16 Red Hat Bugzilla 2023-09-18 00:18:21 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

