Bug 1615258 - nfs: kernel BUG at include/linux/scatterlist.h:143!
Summary: nfs: kernel BUG at include/linux/scatterlist.h:143!
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-08-13 07:47 UTC by Michael Young
Modified: 2020-09-15 20:32 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
iSCSI local LIO target (11.91 KB, text/plain)
2018-08-13 15:07 UTC, Martin Hoyer
dmesg - 4.19.0-0.rc0.git4.1.iscsifix.fc30.x86_64 (11.28 KB, text/plain)
2018-08-20 08:09 UTC, Martin Hoyer

Description Michael Young 2018-08-13 07:47:54 UTC
NFS mounts seem to have been broken since I updated to rawhide (F29). Mounting produces the following backtrace:

[   32.762320] kernel BUG at include/linux/scatterlist.h:143!
[   32.762526] invalid opcode: 0000 [#1] SMP PTI
[   32.762540] CPU: 1 PID: 1780 Comm: mount.nfs Not tainted 4.18.0-0.rc8.git2.1.fc29.x86_64 #1
[   32.762547] Hardware name: Viglen VIG830S/Q170M-C, BIOS 3019 01/07/2017
[   32.762563] RIP: 0010:sg_init_one+0x7d/0x90
[   32.762569] Code: d6 48 03 05 5d 09 ec 00 83 e6 03 a8 03 75 1d 83 e2 01 75 1a 48 09 f0 41 89 6c 24 0c 5b 49 89 04 24 5d 41 89 4c 24 08 41 5c c3 <0f> 0b 0f 0b 0f 0b 48 8b 05 56 6a 10 01 eb b6 0f 1f 40 00 81 fe 80 
[   32.762771] RSP: 0018:ffffa6790164b670 EFLAGS: 00010246
[   32.762783] RAX: 0000000000000000 RBX: ffffa6790164b6ec RCX: 0000000000000027
[   32.762791] RDX: 0000182b8164b6ec RSI: 0000000000000030 RDI: ffffa6798164b6ec
[   32.762798] RBP: 0000000000000004 R08: ffff8e4f6e3cc3c0 R09: ffffa6790164b6c8
[   32.762805] R10: ffff8e4f6e3cc3c0 R11: 0000000000000000 R12: ffffa6790164b6c8
[   32.762812] R13: ffff8e4f6c109668 R14: 0000000000000008 R15: ffff8e4f748eb900
[   32.762821] FS:  00007f5c15106880(0000) GS:ffff8e4fad000000(0000) knlGS:0000000000000000
[   32.762828] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   32.762836] CR2: 00007fd34d0391d8 CR3: 000000021c9b4001 CR4: 00000000003606e0
[   32.762845] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   32.762852] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   32.762858] Call Trace:
[   32.762875]  make_checksum+0x4e7/0x760 [rpcsec_gss_krb5]
[   32.762897]  gss_get_mic_kerberos+0x26e/0x310 [rpcsec_gss_krb5]
[   32.762921]  gss_marshal+0x126/0x1a0 [auth_rpcgss]
[   32.762942]  ? __local_bh_enable_ip+0x80/0xe0
[   32.762975]  ? call_transmit_status+0x1d0/0x1d0 [sunrpc]
[   32.763007]  call_transmit+0x137/0x230 [sunrpc]
[   32.763046]  __rpc_execute+0x9b/0x490 [sunrpc]
[   32.763082]  rpc_run_task+0x119/0x150 [sunrpc]
[   32.763118]  nfs4_run_exchange_id+0x1bd/0x250 [nfsv4]
[   32.763163]  _nfs4_proc_exchange_id+0x2d/0x490 [nfsv4]
[   32.763208]  nfs41_discover_server_trunking+0x1c/0xa0 [nfsv4]
[   32.763249]  nfs4_discover_server_trunking+0x80/0x270 [nfsv4]
[   32.763297]  nfs4_init_client+0x16e/0x240 [nfsv4]
[   32.763324]  ? nfs_get_client+0x4c9/0x5d0 [nfs]
[   32.763343]  ? _raw_spin_unlock+0x24/0x30
[   32.763366]  ? nfs_get_client+0x4c9/0x5d0 [nfs]
[   32.763412]  nfs4_set_client+0xb2/0x100 [nfsv4]
[   32.763460]  nfs4_create_server+0xff/0x290 [nfsv4]
[   32.763507]  nfs4_remote_mount+0x28/0x50 [nfsv4]
[   32.763522]  mount_fs+0x3b/0x16a
[   32.763541]  vfs_kern_mount.part.35+0x54/0x160
[   32.763582]  nfs_do_root_mount+0x7f/0xc0 [nfsv4]
[   32.763623]  nfs4_try_mount+0x43/0x70 [nfsv4]
[   32.763649]  ? get_nfs_version+0x21/0x80 [nfs]
[   32.763680]  nfs_fs_mount+0x789/0xbf0 [nfs]
[   32.763695]  ? pcpu_alloc+0x6ca/0x7e0
[   32.763722]  ? nfs_clone_super+0x70/0x70 [nfs]
[   32.763749]  ? nfs_parse_mount_options+0xb40/0xb40 [nfs]
[   32.763767]  mount_fs+0x3b/0x16a
[   32.763783]  vfs_kern_mount.part.35+0x54/0x160
[   32.763798]  do_mount+0x1fd/0xd50
[   32.763817]  ksys_mount+0xba/0xd0
[   32.763831]  __x64_sys_mount+0x21/0x30
[   32.763844]  do_syscall_64+0x60/0x1f0
[   32.763856]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   32.763866] RIP: 0033:0x7f5c157bc07e
[   32.763871] Code: 48 8b 0d 0d 1e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d da 1d 0c 00 f7 d8 64 89 01 48 
[   32.764070] RSP: 002b:00007fff1246ff18 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[   32.764081] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c157bc07e
[   32.764089] RDX: 00005604fa90a4a0 RSI: 00005604fa90a480 RDI: 00005604fa90a4c0
[   32.764096] RBP: 00007fff124700a0 R08: 00005604fa90e880 R09: 00007fff1246f270
[   32.764103] R10: 0000000000000002 R11: 0000000000000202 R12: 00007fff124700a0
[   32.764111] R13: 00005604fa90e590 R14: 0000000000000010 R15: 00005604f93eb279
[   32.764130] Modules linked in: arc4 rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM ipt_MASQUERADE xt_conntrack tun nf_tables_set nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv6 nft_reject nft_ct nft_chain_nat_ipv6 nft_chain_nat_ipv4 devlink nf_tables ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mei_wdt snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support eeepc_wmi asus_wmi
[   32.764362]  sparse_keymap rfkill wmi_bmof snd_hda_codec_realtek mxm_wmi snd_hda_codec_generic ppdev irqbypass snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hda_core ghash_clmulni_intel snd_hwdep intel_cstate snd_seq intel_uncore snd_seq_device snd_pcm intel_rapl_perf snd_timer snd soundcore i2c_i801 mei_me mei parport_pc wmi parport pcc_cpufreq acpi_pad auth_rpcgss sunrpc i915 i2c_algo_bit drm_kms_helper crc32c_intel e1000e drm serio_raw video
[   32.764548] ---[ end trace 912e9fde5efce1d6 ]---
[   32.764563] RIP: 0010:sg_init_one+0x7d/0x90
[   32.764569] Code: d6 48 03 05 5d 09 ec 00 83 e6 03 a8 03 75 1d 83 e2 01 75 1a 48 09 f0 41 89 6c 24 0c 5b 49 89 04 24 5d 41 89 4c 24 08 41 5c c3 <0f> 0b 0f 0b 0f 0b 48 8b 05 56 6a 10 01 eb b6 0f 1f 40 00 81 fe 80 
[   32.764791] RSP: 0018:ffffa6790164b670 EFLAGS: 00010246
[   32.764802] RAX: 0000000000000000 RBX: ffffa6790164b6ec RCX: 0000000000000027
[   32.764810] RDX: 0000182b8164b6ec RSI: 0000000000000030 RDI: ffffa6798164b6ec
[   32.764818] RBP: 0000000000000004 R08: ffff8e4f6e3cc3c0 R09: ffffa6790164b6c8
[   32.764826] R10: ffff8e4f6e3cc3c0 R11: 0000000000000000 R12: ffffa6790164b6c8
[   32.764833] R13: ffff8e4f6c109668 R14: 0000000000000008 R15: ffff8e4f748eb900
[   32.764842] FS:  00007f5c15106880(0000) GS:ffff8e4fad000000(0000) knlGS:0000000000000000
[   32.764851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   32.764859] CR2: 00007fd34d0391d8 CR3: 000000021c9b4001 CR4: 00000000003606e0
[   32.764866] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   32.764874] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Comment 1 Martin Hoyer 2018-08-13 15:07:45 UTC
Created attachment 1475602
iSCSI local LIO target

Reproduced with local iSCSI LIO target.

Comment 2 Laura Abbott 2018-08-13 17:51:08 UTC
This looks like an underlying bug. make_checksum_hmac_md5 in gss_krb5_crypto.c is trying to use a stack buffer (rc4salt) for DMA, which doesn't work with vmapped stacks. I'm guessing nobody has run into this with debugging enabled, hence the BUG_ON.
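
The check that fires here can be modelled in userspace. The following is a minimal sketch, not the kernel's code: it assumes the debug check amounts to "the buffer address must lie inside the kernel's linear mapping", and every name in it (linear_map, model_virt_addr_valid, model_sg_init_one, struct model_sg) is hypothetical. Where the real kernel would hit the BUG, the model returns -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's linear (direct) mapping. */
unsigned char linear_map[4096];

/* Model of an address-validity check: true only inside linear_map. */
bool model_virt_addr_valid(const void *buf)
{
    uintptr_t p = (uintptr_t)buf;
    uintptr_t start = (uintptr_t)linear_map;

    return p >= start && p < start + sizeof(linear_map);
}

/* Hypothetical, simplified scatterlist entry. */
struct model_sg {
    const void *buf;
    unsigned int len;
};

/*
 * Model of initialising a one-entry scatterlist with debug checks on.
 * Returns 0 on success, -1 where the real kernel would BUG.
 */
int model_sg_init_one(struct model_sg *sg, const void *buf, unsigned int len)
{
    assert(sg != NULL);

    if (!model_virt_addr_valid(buf))
        return -1; /* buffer is outside the linear map, e.g. on a vmapped stack */

    sg->buf = buf;
    sg->len = len;
    return 0;
}
```

In this model a local array passed to model_sg_init_one() is rejected, while a pointer into linear_map is accepted, which mirrors why an on-stack rc4salt trips the check once stacks are virtually mapped.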

Comment 3 Martin Hoyer 2018-08-17 12:16:06 UTC
I suspect enabling CRC32C checking on iSCSI is enough to trigger this.
@Laura, can I help by getting more logs here? If so, let me know how.

Comment 4 Laura Abbott 2018-08-17 21:18:15 UTC
The iSCSI bug is a similar but separate issue from the NFS bug. I have a scratch build with a possible fix for the iSCSI issue: https://koji.fedoraproject.org/koji/taskinfo?taskID=29150725

Comment 5 Laura Abbott 2018-08-17 21:19:20 UTC
Can you test the scratch build for the iSCSI fix?

Comment 6 Martin Hoyer 2018-08-20 08:09:26 UTC
Created attachment 1477094
dmesg - 4.19.0-0.rc0.git4.1.iscsifix.fc30.x86_64

Hi Laura,
the scratch build does not appear to fix the issue.

Comment 7 Laura Abbott 2018-08-20 20:17:02 UTC
I fixed the second issue; please try this scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=29205022

Comment 8 Martin Hoyer 2018-08-22 11:31:38 UTC
Sorry for the late reply, I'm in training this week.
With the scratch build from comment#7, the issue is no longer reproducible. What was the issue?

Comment 9 Laura Abbott 2018-08-22 17:17:56 UTC
Both of the issues had the same root cause. As a side effect of switching to virtually mapped stacks several versions ago, you can no longer use stack-allocated buffers with scatterlists. The fix is to switch to dynamically allocated buffers. This only gets caught with CONFIG_DEBUG_SG, which is enabled in kernel-debug or rawhide snapshots. I'll see about cleaning up the patch and submitting it.
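
The shape of that fix can be sketched in userspace C. This is an illustration under stated assumptions, not the submitted patch: malloc() and free() stand in for the kernel's kmalloc() and kfree(), the 4-byte salt size is assumed, and model_digest, model_usage_to_salt and model_make_checksum are hypothetical names. The point is only the pattern: allocate the buffer dynamically, handle allocation failure, and free it on every exit path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SALT_LEN 4 /* assumed salt size, for illustration only */

/* Hypothetical stand-in for a crypto call that would consume the buffer. */
uint32_t model_digest(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = sum * 31u + buf[i];
    return sum;
}

/* Fill the salt from a usage number, least significant byte first. */
void model_usage_to_salt(uint32_t usage, uint8_t salt[SALT_LEN])
{
    for (int i = 0; i < SALT_LEN; i++)
        salt[i] = (uint8_t)(usage >> (8 * i));
}

/*
 * Before the fix the salt would have been a local array on the stack.
 * Here it is heap-allocated, so its address is safe to hand to code
 * that requires a buffer outside the stack.
 */
int model_make_checksum(uint32_t usage, uint32_t *out)
{
    uint8_t *salt;

    assert(out != NULL);

    salt = malloc(SALT_LEN); /* kernel code would use kmalloc() */
    if (!salt)
        return -ENOMEM;

    model_usage_to_salt(usage, salt);
    *out = model_digest(salt, SALT_LEN);

    free(salt); /* kernel code would use kfree() */
    return 0;
}
```

The cost of this pattern is one small allocation per call and a new failure mode (-ENOMEM) that callers have to propagate, which a stack buffer never needed.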

