Bug 1737511 - RHEL 7.6: smb2.1 client repeated cifsd kernel panic due to stale reference to old socket
Keywords:
Status: CLOSED DUPLICATE of bug 1588123
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: cifs-maint
QA Contact: Filesystem QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-08-05 14:33 UTC by Frank Sorenson
Modified: 2019-11-06 16:20 UTC (History)
5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-05 20:00:13 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3479301 0 None None None 2019-08-05 14:44:48 UTC

Description Frank Sorenson 2019-08-05 14:33:44 UTC
This bug was initially created as a copy of Bug #1588123

I am copying this bug because: 

The bug has occurred on the 7.6 kernel, while the original BZ was filed against 7.5-alt.


Description of problem:

(modified with information from the crashes in 7.6)

Crash in cifsd while receiving a message on a socket that had been closed and was by then in FIN_WAIT1.  A new socket had recently been created, but the crashing task still held pointers to the old one.


The crash backtraces in multiple vmcores are in the same code path:

[28046.468135] Status code returned 0xc000000d STATUS_INVALID_PARAMETER
[28046.468653] Status code returned 0xc000000d STATUS_INVALID_PARAMETER
[28046.469195] Status code returned 0xc000000d STATUS_INVALID_PARAMETER
[28046.499617] Status code returned 0xc0000022 STATUS_ACCESS_DENIED
[28046.784968] Status code returned 0xc000000d STATUS_INVALID_PARAMETER
[28046.785439] Status code returned 0xc000000d STATUS_INVALID_PARAMETER
[28046.785856] Status code returned 0xc000000d STATUS_INVALID_PARAMETER
[28046.786128] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[28046.786304] IP: [<ffffffff9ea1eb15>] sk_wait_data+0xc5/0x120
[28046.786480] PGD 0 
[28046.786639] Oops: 0002 [#1] SMP 
[28046.786829] Modules linked in: macsec vsock_diag tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag fuse cmac nfsv3 rpcsec_gss_krb5 nfs_acl arc4 nfsv4 md4 nls_utf8 cifs nfs lockd grace ccm fscache dns_resolver vmw_vsock_vmci_transport vsock ppdev vmw_balloon iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr sg vmw_vmci i2c_piix4 parport_pc parport auth_rpcgss sunrpc binfmt_misc cbsensor(POE) ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif ata_generic crct10dif_generic pata_acpi crct10dif_pclmul crct10dif_common crc32c_intel vmwgfx drm_kms_helper syscopyarea sysfillrect serio_raw sysimgblt fb_sys_fops ttm ata_piix drm vmw_pvscsi libata vmxnet3 drm_panel_orientation_quirks nfit libnvdimm dm_mirror dm_region_hash dm_log
[28046.788341]  dm_mod
[28046.788665] CPU: 3 PID: 11361 Comm: cifsd Kdump: loaded Tainted: P           OE  ------------   3.10.0-957.5.1.el7.x86_64 #1
[28046.789035] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[28046.789437] task: ffff9398306730c0 ti: ffff9398297d4000 task.ti: ffff9398297d4000
[28046.789826] RIP: 0010:[<ffffffff9ea1eb15>]  [<ffffffff9ea1eb15>] sk_wait_data+0xc5/0x120
[28046.790268] RSP: 0018:ffff9398297d7b10  EFLAGS: 00010246
[28046.790678] RAX: 0000000000000000 RBX: ffff9398242ebe00 RCX: ffff9398297d7fd8
[28046.791135] RDX: ffff9398297d7fd8 RSI: 0000000000000200 RDI: ffffffff9ea1d5f5
[28046.791598] RBP: ffff9398297d7b60 R08: ffff9398297d4000 R09: 0000000000000001
[28046.792038] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[28046.792507] R13: 0000000000000000 R14: ffff9398242ebe90 R15: 0000000000000000
[28046.792982] FS:  0000000000000000(0000) GS:ffff93983d6c0000(0000) knlGS:0000000000000000
[28046.793452] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[28046.793954] CR2: 0000000000000008 CR3: 0000000ffa9ae000 CR4: 00000000007607e0
[28046.794473] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[28046.794963] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[28046.795457] PKRU: 00000000
[28046.795944] Call Trace:
[28046.796454]  [<ffffffff9e4c2d40>] ? wake_up_atomic_t+0x30/0x30
[28046.796954]  [<ffffffff9ea91bd3>] tcp_recvmsg+0x6d3/0xb30
[28046.797474]  [<ffffffff9eac00e0>] inet_recvmsg+0x80/0xb0
[28046.798005]  [<ffffffff9ea194f5>] sock_recvmsg+0xc5/0x100
[28046.798530]  [<ffffffff9e461253>] ? x2apic_send_IPI_mask+0x13/0x20
[28046.799049]  [<ffffffff9e4d6530>] ? try_to_wake_up+0x190/0x390
[28046.799583]  [<ffffffff9ea1956a>] kernel_recvmsg+0x3a/0x50
[28046.800133]  [<ffffffffc0652045>] cifs_readv_from_socket+0x205/0x310 [cifs]
[28046.800689]  [<ffffffff9e4d6530>] ? try_to_wake_up+0x190/0x390
[28046.801277]  [<ffffffffc065255c>] cifs_demultiplex_thread+0x11c/0xaa0 [cifs]
[28046.801874]  [<ffffffffc0652440>] ? cifs_handle_standard+0x1b0/0x1b0 [cifs]
[28046.802456]  [<ffffffff9e4c1c71>] kthread+0xd1/0xe0
[28046.803028]  [<ffffffff9e4c1ba0>] ? insert_kthread_work+0x40/0x40
[28046.803699]  [<ffffffff9eb74c1d>] ret_from_fork_nospec_begin+0x7/0x21
[28046.804360]  [<ffffffff9e4c1ba0>] ? insert_kthread_work+0x40/0x40
[28046.804965] Code: 44 c2 49 39 c5 74 5f 31 f6 48 89 df e8 b5 ea ff ff 4c 8b a3 98 00 00 00 b8 00 00 00 00 4d 39 e6 4c 0f 44 e0 48 8b 83 48 02 00 00 <f0> 80 60 08 fd 48 8b bb e0 00 00 00 48 8d 75 b0 e8 16 3e aa ff 
[28046.806349] RIP  [<ffffffff9ea1eb15>] sk_wait_data+0xc5/0x120
[28046.807022]  RSP <ffff9398297d7b10>
[28046.807722] CR2: 0000000000000008



Version-Release number of selected component (if applicable):
* Red Hat Enterprise Linux 7.6 (cifs client)
  * kernel 3.10.0-957.5.1.el7.x86_64


How reproducible:
Reported by one customer so far on the standard RHEL 7.6 kernel; multiple customers on 7.5-alt.


Steps to Reproduce:

Unknown, but may involve disconnect/reconnect or certain error paths.


Actual results:

crash


Expected results:

no crash


Additional info:

The following upstream patch has been identified in bz1588123:

CIFS: Do not reconnect TCP session in add_credits()
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ef68e831840c40c7d01b328b3c0f5d8c4796c232

Comment 3 Frank Sorenson 2019-08-05 21:13:37 UTC
vmcore 833419566


PID: 7698   TASK: ffff9d0575a9a080  CPU: 5   COMMAND: "cifsd"
    [exception RIP: sk_wait_data+0xc5]
#10 [ffff9d0568603b68] tcp_recvmsg at ffffffff94691bd3
#11 [ffff9d0568603c08] inet_recvmsg at ffffffff946c00e0
#12 [ffff9d0568603c38] sock_recvmsg at ffffffff946194f5
#13 [ffff9d0568603da0] kernel_recvmsg at ffffffff9461956a
#14 [ffff9d0568603dc0] cifs_readv_from_socket at ffffffffc0601045 [cifs]
#15 [ffff9d0568603e58] cifs_demultiplex_thread at ffffffffc060155c [cifs]
#16 [ffff9d0568603ec8] kthread at ffffffff940c1c71
#17 [ffff9d0568603f50] ret_from_fork_nospec_begin at ffffffff94774c1d

int cifs_readv_from_socket(struct TCP_Server_Info *, struct kvec *, unsigned int, unsigned int);
3 RDI: 0xffff9d057675a000
7 RSI: 0xffff9d0568603e80
4 RDX: 0x1
5 RCX: 0x4


that's this connection:

3) (struct TCP_Server_Info *)0xffff9d057675a000
    hostname: '10.1.2.3'
    task: (struct task_struct *)0xffff9d0575a9a080 - pid: 7698 - cifsd
    TCP Status: 0x3 (CifsNeedReconnect)
    last response 20 jiffies (0  0:00:00.020)
    Local users to server: 1    requests on wire: 0
    sec_mode: 0x1 (MAY_SIGN)
    CurrentMid: 0x002a1cc0
    socket: (struct socket *)0xffff9d0547dd3480, (struct sock *)0xffff9d0050bb5540
    src: (None, None), dst: ('10.2.3.4', 445)
        request_q has 2 entries: list -H 0xffff9d057675a188
            0xffff9d05707ba080    14776   grep             UN 0  0:00:00.018   
            0xffff9d05446dc100    16978   kworker/u16:1    UN 0  0:00:00.018   
        response_q has 0 entries: list -H 0xffff9d057675a170
        pending_mid_q has 0 entries: list -H 0xffff9d057675a198 -s mid_q_entry
    1 smb session(s): list -H 0xffff9d057675a010 -o cifs_ses.smb_ses_list -s cifs_ses
        1) (struct cifs_ses *)0xffff9d057be73c00
        .serverName: '10.2.3.4', .serverDomain: 'None'
        .serverOS: None
        .serverNOS: None
        .ses_count: 1, capabilities: 0x300036: CAP_MPX_MODE | CAP_UNICODE | CAP_NT_SMBS | CAP_RPC_REMOTE_APIS | 0x300000
        .SMB session status: 1 (CifsGood), .need_reconnect: True
        .auth_key.len: 16 - 1f5c7863 623f087f 1a5c7838 645c7864
        .user_name: 'cifsuser', domainName: 'LOCAL_CLUSTER'
        .tcon_ipc: (struct cifs_tcon *)0xffff9d057e253000 - tree: '\\10.2.3.4\IPC$', need_reconnect: True
            .ipc: True, .pipe: True, .retry: False, .nocase: False, .seal: False, .unix_ext: False, .local_lease: False
            .broken_posix_open: False, .broken_sparse_sup: False, .need_reopen_files: False, .use_resilient: False, .use_persistent: False
            .capabilities: 0x0 - 
            .nativeFileSystem: 'None', .tc_count: 1
            mount options: versdefault,sec=unknown?
            open files: list -H 0xffff9d057e253028 -s cifsFileInfo
            .rlist: list -H 0xffff9d057e253018 - []
        .tcon_list: list -H 0xffff9d057be73c10 -s cifs_tcon
            1) (struct cifs_tcon *)0xffff9d057e253800 - tree: '\\10.2.3.4\archive', need_reconnect: True
                .ipc: False, .pipe: False, .retry: False, .nocase: False, .seal: False, .unix_ext: False, .local_lease: False
                .broken_posix_open: False, .broken_sparse_sup: False, .need_reopen_files: False, .use_resilient: False, .use_persistent: True
                .capabilities: 0x10 - SMB2_SHARE_CAP_CONTINUOUS_AVAILABILITY
                .nativeFileSystem: 'None', .tc_count: 1
                mount options: versdefault,sec=unknown?
                open files: list -H 0xffff9d057e253828 -s cifsFileInfo
                1) (struct cifsFileInfo *)0xffff9d057de1ce00
                    partial path: 'path/filename1.csv'
                    .dentry: 0xffff9d0547cc30c0
                    .uid: <kuid_t 0xffff9d057de1ce28>, .pid: 14776 'grep'
                2) (struct cifsFileInfo *)0xffff9d0546f90a00
                    partial path: '/path/filename2.csv'
                    .dentry: 0xffff9d057c5b6e40
                    .uid: <kuid_t 0xffff9d0546f90a28>, .pid: 14776 'grep'
                3) (struct cifsFileInfo *)0xffff9d0555f9c400
                    partial path: '/path/filename3'
                    .dentry: 0xffff9d0572bd23c0
                    .uid: <kuid_t 0xffff9d0555f9c428>, .pid: 13692 'grep'
                .rlist: list -H 0xffff9d057e253818 - []

looking at the data on the stack:

int kernel_recvmsg(struct socket *, struct msghdr *, struct kvec *, size_t, size_t, int);
5 RDI: 0xffff9d0547dd3480
6 RSI: 0xffff9d0568603de8
1 RDX: 0xffff9d0576ff1810
4 RCX: 0x1

crash> socket.sk ffff9d0547dd3480
  sk = 0xffff9d0050bb5540

crash> xportshow -tv
...
        windows: rcv=29312, snd=14480  advmss=1448 rcv_ws=7 snd_ws=7
        nonagle=0 sack_ok=3 tstamp_ok=1
        rmem_alloc=0, wmem_alloc=1
        rx_queue=0, tx_queue=0
        rcvbuf=367360, sndbuf=87040
        rcv_tstamp=0.02 s, lsndtime=0.02 s ago,  RTO=201 ms

however, sk_wait_data was called with a different sock: ffff9d05747ef440


ffff9d05747ef440

crash> tcp_sock.inet_conn ffff9d05747ef440 -ox
struct tcp_sock {
  [ffff9d05747ef440] struct inet_connection_sock inet_conn;

crash> inet_connection_sock.icsk_inet ffff9d05747ef440 -ox
struct inet_connection_sock {
  [ffff9d05747ef440] struct inet_sock icsk_inet;

crash> inet_sock.sk ffff9d05747ef440 -ox 
struct inet_sock {
  [ffff9d05747ef440] struct sock sk;

crash> struct sock.sk_socket ffff9d05747ef440
  sk_socket = 0x0


the sock was saved from socket->sk in inet_recvmsg


int inet_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
                 size_t size, int flags)
{
        struct sock *sk = sock->sk;


the socket was obtained from TCP_Server_Info->ssocket in cifs_readv_from_socket:
int
cifs_readv_from_socket(struct TCP_Server_Info *server, struct kvec *iov_orig,
                       unsigned int nr_segs, unsigned int to_read)
{
...
        for (total_read = 0; to_read; total_read += length, to_read -= length) {
                try_to_freeze();

                if (server_unresponsive(server)) {
                        total_read = -ECONNABORTED;
                        break;
                }

                segs = kvec_array_init(iov, iov_orig, nr_segs, total_read);

                length = kernel_recvmsg(server->ssocket, &smb_msg,  <<
                                        iov, segs, to_read, 0);     <<


int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
                   struct kvec *vec, size_t num, size_t size, int flags)
{
...
        msg->msg_iov = (struct iovec *)vec, msg->msg_iovlen = num;
        result = sock_recvmsg(sock, msg, size, flags);              <<<<


int sock_recvmsg(struct socket *sock, struct msghdr *msg,
                 size_t size, int flags)
{
        struct kiocb iocb;
        struct sock_iocb siocb;  <<<< sock_iocb
        int ret;

        init_sync_kiocb(&iocb, NULL);
        iocb.private = &siocb;
        ret = __sock_recvmsg(&iocb, sock, msg, size, flags);   <<<<

static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
                                 struct msghdr *msg, size_t size, int flags)
{
        int err = security_socket_recvmsg(sock, msg, size, flags);

        return err ?: __sock_recvmsg_nosec(iocb, sock, msg, size, flags);  <<<<

static inline int __sock_recvmsg_nosec(struct kiocb *iocb, struct socket *sock,
                                       struct msghdr *msg, size_t size, int flags)
{
        struct sock_iocb *si = kiocb_to_siocb(iocb);

        si->sock = sock;   <<<< socket stored
        si->scm = NULL;
        si->msg = msg;
        si->size = size;
        si->flags = flags;

        return sock->ops->recvmsg(iocb, sock, msg, size, flags);

int inet_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
                 size_t size, int flags)
{
        struct sock *sk = sock->sk;
        int addr_len = 0;
        int err;

        sock_rps_record_flow(sk);

        err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
                                   flags & ~MSG_DONTWAIT, &addr_len);


int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
                size_t len, int nonblock, int flags, int *addr_len)
{
...
        do {
...
                if (copied >= target) {
                        /* Do not sleep, just process backlog. */
                        release_sock(sk);
                        lock_sock(sk);
                } else {
                        sk_wait_data(sk, &timeo, last);
                }


int sk_wait_data(struct sock *sk, long *timeo, const struct sk_buff *skb)
{
        int rc;
        DEFINE_WAIT(wait);

        prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
        set_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags);     <<<<

sk->sk_socket is NULL


that 'struct sock' is:

<struct tcp_sock 0xffff9d05747ef440>            TCP
tcp   10.1.2.3:38430         10.2.3.4:445           FIN_WAIT1
        windows: rcv=3119360, snd=3145728  advmss=1448 rcv_ws=7 snd_ws=7
        nonagle=0 sack_ok=7 tstamp_ok=1
        rmem_alloc=0, wmem_alloc=1
        rx_queue=0, tx_queue=1
        rcvbuf=6291456, sndbuf=4194304
        rcv_tstamp=0.03 s, lsndtime=0.04 s ago,  RTO=203 ms


note that the new sock has just begun passing data:


the old tcp_sock:
<struct tcp_sock 0xffff9d05747ef440>            TCP
tcp   10.1.2.3:38430         10.2.3.4:445           FIN_WAIT1
        rcv_tstamp=0.03 s, lsndtime=0.04 s ago,  RTO=203 ms

crash> tcp_sock.segs_in,bytes_received,segs_out,rcv_tstamp 0xffff9d05747ef440
  segs_in = 0x9bec43
  bytes_received = 0x1583b248c2
  segs_out = 0x26dd4ae
  rcv_tstamp = 0xae9d2f

crash> pd (jiffies & 0xffffffff - 0xae9d2f)
$13 = 64

64 msec

the new tcp_sock:
<struct tcp_sock 0xffff9d0050bb5540>            TCP
tcp   10.1.2.3:49526         10.2.3.4:445           ESTABLISHED
        rcv_tstamp=0.02 s, lsndtime=0.02 s ago,  RTO=201 ms

crash> tcp_sock.segs_in,bytes_received,segs_out,rcv_tstamp 0xffff9d0050bb5540
  segs_in = 0x1
  bytes_received = 0x0
  segs_out = 0x2
  rcv_tstamp = 0xae9d39

crash> pd (jiffies & 0xffffffff - 0xae9d39)
$12 = 68

68 msec


==========================================================================================
vmcore 953557243


int cifs_readv_from_socket(struct TCP_Server_Info *, struct kvec *, unsigned int, unsigned int);

#14 cifs_readv_from_socket called from 0xffffffffc065255c <cifs_demultiplex_thread+284>
3 RDI: 0xffff9389a8e46000
7 RSI: 0xffff9398297d7e80
4 RDX: 0x1
5 RCX: 0x4

(again see cifs_info.out)
3) (struct TCP_Server_Info *)0xffff9389a8e46000
    hostname: '10.2.3.4'
    task: (struct task_struct *)0xffff9398306730c0 - pid: 11361 - cifsd
    TCP Status: 0x4 (CifsNeedNegotiate)
    last response 22 jiffies (0  0:00:00.022)
    Local users to server: 1    requests on wire: 1
    sec_mode: 0x1 (MAY_SIGN)
    CurrentMid: 0x00000001
    socket: (struct socket *)0xffff93983d149b80, (struct sock *)0xffff93987fd83e00
    *** TODO: socket state
    src: (None, None), dst: ('10.2.3.4', 445)
        request_q has 2 entries: list -H 0xffff9389a8e46188
            0xffff93952b7a5140    10203   perl             UN 0  0:00:00.000   
            0xffff93981caa5140    82652   kworker/u256:3   UN 0  0:00:00.000   
        response_q has 1 entries: list -H 0xffff9389a8e46170
            0xffff939634f8e180     9185   grep             UN 0  0:00:00.020   
        pending_mid_q has 1 entries: list -H 0xffff9389a8e46198 -s mid_q_entry
            0xffff9397b972ae00 mid: 0x00000000 MID_FREE | MID_REQUEST_SUBMITTED Create Directory
            allocated 22 jiffies ago (0  0:00:00.022)
                0xffff939634f8e180     9185   grep             UN 0  0:00:00.020   
    1 smb session(s): list -H 0xffff9389a8e46010 -o cifs_ses.smb_ses_list -s cifs_ses
        1) (struct cifs_ses *)0xffff939834efae00
        .serverName: '10.2.3.4', .serverDomain: 'None'
        .serverOS: None
        .serverNOS: None
        .ses_count: 1, capabilities: 0x300036: CAP_MPX_MODE | CAP_UNICODE | CAP_NT_SMBS | CAP_RPC_REMOTE_APIS | 0x300000
        .SMB session status: 1 (CifsGood), .need_reconnect: True
        .auth_key.len: 16 - 5c786339 260e5c78 64665c78 64645c78
        .user_name: 'cifsuser', domainName: 'LOCAL_CLUSTER'
        .tcon_ipc: (struct cifs_tcon *)0xffff9389a8e41800 - tree: '\\10.2.3.4\IPC$', need_reconnect: True
            .ipc: True, .pipe: True, .retry: False, .nocase: False, .seal: False, .unix_ext: False, .local_lease: False
            .broken_posix_open: False, .broken_sparse_sup: False, .need_reopen_files: False, .use_resilient: False, .use_persistent: False
            .capabilities: 0x0 - 
            .nativeFileSystem: 'None', .tc_count: 1
            mount options: vers3.0,sec=unknown?
            open files: list -H 0xffff9389a8e41828 -s cifsFileInfo
            .rlist: list -H 0xffff9389a8e41818 - []
        .tcon_list: list -H 0xffff939834efae10 -s cifs_tcon
            1) (struct cifs_tcon *)0xffff9389a8e40000 - tree: '\\10.2.3.4\archive', need_reconnect: True
                .ipc: False, .pipe: False, .retry: False, .nocase: False, .seal: False, .unix_ext: False, .local_lease: False
                .broken_posix_open: False, .broken_sparse_sup: False, .need_reopen_files: False, .use_resilient: False, .use_persistent: True
                .capabilities: 0x10 - SMB2_SHARE_CAP_CONTINUOUS_AVAILABILITY
                .nativeFileSystem: 'None', .tc_count: 1
                mount options: vers3.0,sec=unknown?
                open files: list -H 0xffff9389a8e40028 -s cifsFileInfo
                1) (struct cifsFileInfo *)0xffff93982937b400
                    partial path: '/path/filename1.csv'
                    .dentry: 0xffff93983d284840
                    .uid: <kuid_t 0xffff93982937b428>, .pid: 10203 'perl'
                2) (struct cifsFileInfo *)0xffff9396d2fa3200
                    partial path: '/path/filename2'
                    .dentry: 0xffff938fd25c2300
                    .uid: <kuid_t 0xffff9396d2fa3228>, .pid: 9185 'grep'
                .rlist: list -H 0xffff9389a8e40018 - []

so:
    socket: (struct socket *)0xffff93983d149b80, (struct sock *)0xffff93987fd83e00

crash> xportshow -tv
<struct tcp_sock 0xffff93987fd83e00>            TCP
tcp   10.1.2.3:55964         10.2.3.4:445           ESTABLISHED
        windows: rcv=30336, snd=14592  advmss=1448 rcv_ws=7 snd_ws=7
        nonagle=0 sack_ok=3 tstamp_ok=1
        rmem_alloc=2176, wmem_alloc=1
        rx_queue=212, tx_queue=0
        rcvbuf=367360, sndbuf=87040
        rcv_tstamp=0.02 s, lsndtime=0.02 s ago,  RTO=201 ms

and there's another sock:
<struct tcp_sock 0xffff9398242ebe00>            TCP
tcp   10.1.2.3:48146         10.2.3.4:445           FIN_WAIT1
        windows: rcv=3145728, snd=3145088  advmss=1448 rcv_ws=7 snd_ws=7
        nonagle=0 sack_ok=7 tstamp_ok=1
        rmem_alloc=0, wmem_alloc=1
        rx_queue=0, tx_queue=1
        rcvbuf=6291456, sndbuf=4194304
        rcv_tstamp=0.02 s, lsndtime=0.02 s ago,  RTO=201 ms

which is the one on which we crashed:
crash> sock.sk_socket 0xffff9398242ebe00
  sk_socket = 0x0


so we should have been waiting on this sock:
<struct tcp_sock 0xffff93987fd83e00>            TCP
tcp   10.1.2.3:55964         10.2.3.4:445           ESTABLISHED

but instead were waiting on:
<struct tcp_sock 0xffff9398242ebe00>            TCP
tcp   10.1.2.3:48146         10.2.3.4:445           FIN_WAIT1


for the valid sock:
crash> search -t 0xffff93987fd83e00 | grep PID | awk '{print $2}' | paste -s - -d ' '
9185

crash> ps -m 9185
[0 00:00:00.021] [UN]  PID: 9185   TASK: ffff939634f8e180  CPU: 8   COMMAND: "grep"

crash> bt -f 9185 | egrep -e ffff93987fd83e00 -e PID | egrep -e ffff93987fd83e00 -B1 | grep PID

So this address is found only in pid 9185, and only in the remnant of its stack; however, the process has only just run, and is in the middle of reconnecting:

PID: 9185   TASK: ffff939634f8e180  CPU: 8   COMMAND: "grep"
 #0 [ffff9396cbb436b8] __schedule at ffffffff9eb67747
 #1 [ffff9396cbb43748] schedule at ffffffff9eb67c49
 #2 [ffff9396cbb43758] wait_for_response at ffffffffc066c253 [cifs]
 #3 [ffff9396cbb437b0] cifs_send_recv at ffffffffc066d3e7 [cifs]
 #4 [ffff9396cbb43800] smb2_send_recv at ffffffffc066d9b1 [cifs]
 #5 [ffff9396cbb43928] SMB2_negotiate at ffffffffc068245d [cifs]
 #6 [ffff9396cbb439b8] smb2_negotiate at ffffffffc067a75d [cifs]
 #7 [ffff9396cbb439c8] cifs_negotiate_protocol at ffffffffc0654f13 [cifs]
 #8 [ffff9396cbb439f0] smb2_reconnect at ffffffffc0681e06 [cifs]
 #9 [ffff9396cbb43a80] smb2_plain_req_init at ffffffffc068209e [cifs]
#10 [ffff9396cbb43ac0] smb2_new_read_req.constprop.11 at ffffffffc0683732 [cifs]
#11 [ffff9396cbb43af0] SMB2_read at ffffffffc0685886 [cifs]
#12 [ffff9396cbb43b80] smb2_sync_read at ffffffffc067a4fd [cifs]
#13 [ffff9396cbb43b90] cifs_readpage_worker at ffffffffc065bf76 [cifs]
#14 [ffff9396cbb43c48] cifs_readpage at ffffffffc065c665 [cifs]
#15 [ffff9396cbb43c80] generic_file_aio_read at ffffffff9e5b81f0
#16 [ffff9396cbb43d58] cifs_strict_readv at ffffffffc0662a69 [cifs]
#17 [ffff9396cbb43da0] do_sync_read at ffffffff9e640a83
#18 [ffff9396cbb43e78] vfs_read at ffffffff9e6414bf
#19 [ffff9396cbb43ea8] cb_sys_write at ffffffffc03d4c8b [cbsensor]
#20 [ffff9396cbb43f50] system_call_fastpath at ffffffff9eb74ddb


again, a very new sock:
old tcp_sock:
<struct tcp_sock 0xffff9398242ebe00>            TCP
tcp   10.1.2.3:48146         10.2.3.4:445           FIN_WAIT1

crash> tcp_sock.segs_in,bytes_received,segs_out,rcv_tstamp 0xffff9398242ebe00
  segs_in = 0xc885fa
  bytes_received = 0x17b0ff6548
  segs_out = 0x2eeff5f
  rcv_tstamp = 0x1a76240

crash> pd (jiffies & 0xffffffff - 0x1a76240)
$20 = 23

23 msec


new tcp_sock:
<struct tcp_sock 0xffff93987fd83e00>            TCP
tcp   10.1.2.3:55964         10.2.3.4:445           ESTABLISHED

crash> tcp_sock.segs_in,bytes_received,segs_out,rcv_tstamp 0xffff93987fd83e00
  segs_in = 0x3
  bytes_received = 0xd4
  segs_out = 0x4
  rcv_tstamp = 0x1a76241

crash> pd (jiffies & 0xffffffff - 0x1a76241)
$22 = 22

22 msec

