Description of problem:
A kernel WARNING due to an invalid error code returned by smb2_get_enc_key, followed quickly by a NULL pointer dereference.

The kernel warning matches a warning found and resolved upstream in the following commit:

commit 83728cbf366e334301091d5b808add468ab46b27
Author: Paul Aurich <paul>
Date:   2021-04-13 14:25:27 -0700

    cifs: Return correct error code from smb2_get_enc_key

    Avoid a warning if the error percolates back up:

    [440700.376476] CIFS VFS: \\otters.example.com crypt_message: Could not get encryption key
    [440700.386947] ------------[ cut here ]------------
    [440700.386948] err = 1
    [440700.386977] WARNING: CPU: 11 PID: 2733 at /build/linux-hwe-5.4-p6lk6L/linux-hwe-5.4-5.4.0/lib/errseq.c:74 errseq_set+0x5c/0x70
    ...
    [440700.397304] CPU: 11 PID: 2733 Comm: tar Tainted: G OE 5.4.0-70-generic #78~18.04.1-Ubuntu
    ...
    [440700.397334] Call Trace:
    [440700.397346]  __filemap_set_wb_err+0x1a/0x70
    [440700.397419]  cifs_writepages+0x9c7/0xb30 [cifs]
    [440700.397426]  do_writepages+0x4b/0xe0
    [440700.397444]  __filemap_fdatawrite_range+0xcb/0x100
    [440700.397455]  filemap_write_and_wait+0x42/0xa0
    [440700.397486]  cifs_setattr+0x68b/0xf30 [cifs]
    [440700.397493]  notify_change+0x358/0x4a0
    [440700.397500]  utimes_common+0xe9/0x1c0
    [440700.397510]  do_utimes+0xc5/0x150
    [440700.397520]  __x64_sys_utimensat+0x88/0xd0

    Fixes: 61cfac6f267d ("CIFS: Fix possible use after free in demultiplex thread")
    Signed-off-by: Paul Aurich <paul>
    CC: stable.org
    Signed-off-by: Steve French <stfrench>

It is unclear whether the crash is directly related to the earlier warning; however, the crash followed very shortly after the warning, so a connection cannot be ruled out.
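For context on why "err = 1" produces the warning: errseq_set() only records negative errno values in the range -MAX_ERRNO..-1 and WARNs on anything else, including zero and positive values. The following is a minimal user-space sketch of that range check, not the kernel code itself. The names sketch_errseq_accepts, SKETCH_MAX_ERRNO and errseq_sketch_demo are made up for illustration; the value 4095 mirrors the kernel's MAX_ERRNO. To my understanding, the upstream fix makes smb2_get_enc_key return -EAGAIN instead of 1, which is the case the sketch contrasts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumption: mirrors the kernel's MAX_ERRNO from include/linux/err.h. */
#define SKETCH_MAX_ERRNO 4095u

/*
 * Hypothetical helper: returns true when a value would pass the range
 * check at the top of errseq_set(), false when it would trigger the WARN.
 * The negation is done in unsigned arithmetic to avoid signed overflow.
 */
static bool sketch_errseq_accepts(int err)
{
	if (err == 0)
		return false;	/* zero would silently clear a previous error */
	return (0u - (unsigned int)err) <= SKETCH_MAX_ERRNO;
}

/* Illustrative usage: a positive 1 is rejected, a negative errno is accepted. */
void errseq_sketch_demo(void)
{
	assert(!sketch_errseq_accepts(1));	/* the value seen in the WARNING */
	assert(sketch_errseq_accepts(-EAGAIN));	/* a well-formed negative errno */
}
```

The sketch only covers the validity check; it says nothing about whether the rejected error code is connected to the later crash.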
[735580.840999] ---[ end trace 06621dc5d043e510 ]---
[735581.250444] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
[735581.252976] PGD 0 P4D 0
[735581.255018] Oops: 0000 [#1] SMP PTI
[735581.257608] CPU: 5 PID: 1567 Comm: cifsd Kdump: loaded Tainted: G W --------- - - 4.18.0-348.7.1.el8_5.x86_64 #1
[735581.270029] RIP: 0010:smb2_writev_callback+0x49/0x3a0 [cifs]
[735581.328007]  ? kmem_cache_free+0x385/0x3b0
[735581.330245]  cifs_reconnect+0x324/0xe00 [cifs]
[735581.333315]  cifs_readv_from_socket+0x1ad/0x260 [cifs]
[735581.336015]  cifs_read_from_socket+0x4a/0x70 [cifs]
[735581.339023]  ? smb3_receive_transform+0x292/0x880 [cifs]
[735581.341024]  ? cifs_small_buf_get+0x16/0x20 [cifs]
[735581.344949]  ? allocate_buffers+0x66/0x120 [cifs]
[735581.346285]  cifs_demultiplex_thread+0xf6/0xc40 [cifs]
[735581.349015]  ? finish_task_switch+0xaa/0x2e0
[735581.353017]  ? cifs_handle_standard+0x190/0x190 [cifs]
[735581.355877]  kthread+0x116/0x130
[735581.357386]  ? kthread_flush_work_fn+0x10/0x10
[735581.360132]  ret_from_fork+0x35/0x40

Version-Release number of selected component (if applicable):
kernel 4.18.0-348.7.1.el8_5.x86_64

How reproducible:
unknown, but customer reports the crash has occurred twice.

Steps to Reproduce:
unknown

Actual results:
kernel warning and crash

Expected results:
no kernel warning or crash

Additional info:
vmcores from two kernel versions were provided by the customer:
  kernel 4.18.0-348.7.1.el8_5.x86_64
  kernel 4.18.0-348.12.2.el8_5.x86_64

The WARNINGs are the same in both vmcores:

  CIFS: VFS: \\server.example.com crypt_message: Could not get encryption key
  err = 1
  WARNING: CPU: 6 PID: 54199 at lib/errseq.c:74 errseq_set+0x5b/0x70

In both cases, the crash occurred very shortly after the warning (~0.5 seconds).

The RIPs in the vmcores are just one instruction from each other in smb2_writev_callback:

  <smb2_writev_callback+0x49>: mov 0x98(%rax),%rax   << 4.18.0-348.7.1.el8_5.x86_64
  <smb2_writev_callback+0x50>: mov 0x38(%rax),%r14   << 4.18.0-348.12.2.el8_5.x86_64

  smb2_writev_callback(struct mid_q_entry *mid)
          struct cifs_writedata *wdata = mid->callback_data;
          struct cifs_tcon *tcon = tlink_tcon(wdata->cfile->tlink);

4.18.0-348.7.1.el8_5.x86_64 kernel

  0xffffffffc0a89d4b <smb2_writev_callback+0x3b>: mov 0x80(%rbx),%rax
  0xffffffffc0a89d52 <smb2_writev_callback+0x42>: mov 0xa8(%rbx),%r12
  0xffffffffc0a89d59 <smb2_writev_callback+0x49>: mov 0x98(%rax),%rax

((struct cifs_writedata *)mid->callback_data)->cfile was zero

4.18.0-348.12.2.el8_5.x86_64 kernel

  0xffffffffc088dd4b <smb2_writev_callback+0x3b>: mov 0x80(%rbx),%rax
  0xffffffffc088dd52 <smb2_writev_callback+0x42>: mov 0xa8(%rbx),%r12
  0xffffffffc088dd59 <smb2_writev_callback+0x49>: mov 0x98(%rax),%rax
  0xffffffffc088dd60 <smb2_writev_callback+0x50>: mov 0x38(%rax),%r14

((struct cifs_writedata *)(mid->callback_data))->cfile->tlink was zero, which means that ->cfile was non-zero when the crash occurred. However, examination of the vmcore indicates that ->cfile is now 0 as well, so it has apparently been modified by another task.
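The reasoning above, that a fault on ->tlink implies ->cfile was still non-NULL at the moment of the dereference, can be sketched as a walk along the pointer chain that reports the first NULL link. This is only an illustration in user space: the sketch_* structures are hypothetical, heavily simplified stand-ins for mid_q_entry, cifs_writedata, cifsFileInfo and tcon_link, and they do not reproduce the real layouts or offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_tlink { void *tl_tcon; };
struct sketch_cfile { struct sketch_tlink *tlink; };
struct sketch_wdata { struct sketch_cfile *cfile; };
struct sketch_mid   { void *callback_data; };

enum sketch_link {
	LINK_OK,			/* whole chain is populated */
	LINK_CALLBACK_DATA_NULL,	/* mid->callback_data is NULL */
	LINK_CFILE_NULL,		/* wdata->cfile is NULL */
	LINK_TLINK_NULL,		/* wdata->cfile->tlink is NULL */
};

/*
 * Walk mid->callback_data->cfile->tlink in the same order as the
 * dereferences in the callback and report the first NULL link.
 * A LINK_TLINK_NULL result can only be reached if cfile was non-NULL.
 */
static enum sketch_link sketch_first_null_link(const struct sketch_mid *mid)
{
	const struct sketch_wdata *wdata = mid->callback_data;

	if (!wdata)
		return LINK_CALLBACK_DATA_NULL;
	if (!wdata->cfile)
		return LINK_CFILE_NULL;
	if (!wdata->cfile->tlink)
		return LINK_TLINK_NULL;
	return LINK_OK;
}

/* Illustrative usage: clear links one at a time, as another task might. */
void sketch_chain_demo(void)
{
	struct sketch_tlink tlink = { NULL };
	struct sketch_cfile cfile = { &tlink };
	struct sketch_wdata wdata = { &cfile };
	struct sketch_mid mid = { &wdata };

	assert(sketch_first_null_link(&mid) == LINK_OK);

	cfile.tlink = NULL;	/* tlink cleared first */
	assert(sketch_first_null_link(&mid) == LINK_TLINK_NULL);

	wdata.cfile = NULL;	/* cfile cleared afterwards */
	assert(sketch_first_null_link(&mid) == LINK_CFILE_NULL);
}
```

A check like this only describes a snapshot. If another task can modify the chain concurrently, as the vmcore suggests, testing the pointers without proper locking or reference counting would not by itself make the callback safe.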
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7683