Bug 1816765 - kernel crash when loading mac80211_hwsim module
Summary: kernel crash when loading mac80211_hwsim module
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Davide Caratti
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-24 16:53 UTC by Vladimir Benes
Modified: 2020-05-21 04:49 UTC

Fixed In Version: 5.5.13-200.fc31
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-21 04:49:12 UTC
Type: Bug
Embargoed:


Attachments
backport for f31 (4.99 KB, patch)
2020-03-24 20:14 UTC, Davide Caratti

Description Vladimir Benes 2020-03-24 16:53:14 UTC
1. Please describe the problem:
There is a crash when loading the mac80211_hwsim module:
[12750.707904] cfg80211: Loading compiled-in X.509 certificates for regulatory database 
[12750.716473] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7' 
[12750.783611] mac80211_hwsim: initializing netlink 
[12750.789726] BUG: kernel NULL pointer dereference, address: 0000000000000048 
[12750.797204] #PF: supervisor read access in kernel mode 
[12750.802679] #PF: error_code(0x0000) - not-present page 
[12750.808107] PGD 0 P4D 0  
[12750.810817] Oops: 0000 [#1] SMP PTI 
[12750.814575] CPU: 0 PID: 612049 Comm: modprobe Tainted: G           OE     5.5.10-200.fc31.x86_64 #1 
[12750.824181] Hardware name: LENOVO 2756A48/MAHOBAY, BIOS 9SKT73AUS 08/23/2013 
[12750.831646] RIP: 0010:device_links_flush_sync_list+0xbd/0x100 
[12750.837772] Code: 48 89 42 08 48 89 10 48 89 9d d0 00 00 00 48 89 9d d8 00 00 00 49 39 ed 74 0c 48 8d bd 80 00 00 00 e8 e7 cc 39 00 48 8b 45 60 <48> 8b 40 48 48 85 c0 0f 85 68 ff ff ff 48 8b 45 68 48 85 c0 0f 84 
[12750.857758] RSP: 0018:ffffb725c03dbb30 EFLAGS: 00010246 
[12750.863374] RAX: 0000000000000000 RBX: ffff9810525a98d0 RCX: 0000000000000000 
[12750.870972] RDX: ffffb725c03dbb60 RSI: ffff9810525a98d0 RDI: ffff9810525a98d0 
[12750.878507] RBP: ffff9810525a9800 R08: ffffb725c03dbb60 R09: ffffb725c03dbb60 
[12750.886140] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb725c03dba90 
[12750.893732] R13: ffff9810525a9800 R14: ffffb725c03dbb60 R15: 0000000000000000 
[12750.901385] FS:  00007fe821d9a740(0000) GS:ffff9810da400000(0000) knlGS:0000000000000000 
[12750.910052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[12750.916190] CR2: 0000000000000048 CR3: 000000011602c002 CR4: 00000000001606f0 
[12750.923851] Call Trace: 
[12750.926497]  device_links_driver_bound+0x17e/0x1c0 
[12750.931590]  driver_bound+0x4c/0xe0 
[12750.935314]  device_bind_driver+0x4d/0x60 
[12750.939599]  mac80211_hwsim_new_radio+0x14a/0xdb0 [mac80211_hwsim] 
[12750.946183]  ? 0xffffffffc0a74000 
[12750.949789]  init_mac80211_hwsim+0x271/0x1000 [mac80211_hwsim] 
[12750.956044]  ? 0xffffffffc0a74000 
[12750.959599]  do_one_initcall+0x46/0x200 
[12750.963715]  ? _cond_resched+0x15/0x30 
[12750.967721]  ? kmem_cache_alloc_trace+0x162/0x220 
[12750.972711]  ? do_init_module+0x23/0x230 
[12750.976924]  do_init_module+0x5c/0x230 
[12750.980918]  load_module+0x28c2/0x2b20 
[12750.984933]  ? __do_sys_init_module+0x16e/0x1a0 
[12750.989731]  __do_sys_init_module+0x16e/0x1a0 
[12750.994387]  do_syscall_64+0x5b/0x1c0 
[12750.998282]  entry_SYSCALL_64_after_hwframe+0x44/0xa9 
[12751.003753] RIP: 0033:0x7fe821eca12e 
[12751.007552] Code: 48 8b 0d 5d fd 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2a fd 0b 00 f7 d8 64 89 01 48 
[12751.027623] RSP: 002b:00007ffeb85af418 EFLAGS: 00000246 ORIG_RAX: 00000000000000af 
[12751.035700] RAX: ffffffffffffffda RBX: 000055eb95a80ff0 RCX: 00007fe821eca12e 
[12751.043302] RDX: 000055eb93dc5358 RSI: 000000000001f50e RDI: 000055eb9645d2b0 
[12751.050954] RBP: 000055eb9645d2b0 R08: 0000000000000000 R09: 0000000000000002 
[12751.058591] R10: 0000000000000001 R11: 0000000000000246 R12: 000055eb93dc5358 
[12751.066225] R13: 0000000000000000 R14: 000055eb95a81090 R15: 000055eb95a80ff0 
[12751.073868] Modules linked in: mac80211_hwsim(+) mac80211 cfg80211 rmd160 wireguard(OE) ip6_gre ip6_tunnel authenc echainiv xfrm_interface ah6 ah4 esp6 esp4 xfrm4_tunnel ipcomp ipcomp6 xfrm6_tunnel xfrm_ipcomp tunnel6 chacha20poly1305 cmac camellia_generic camellia_x86_64 cast6_generic cast5_generic cast_common ccm serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common xcbc sha256_ssse3 sha512_ssse3 des_generic libdes af_key ppp_mppe libarc4 ppp_deflate bsd_comp ppp_async ppp_generic slhc nsh macsec team_mode_random team_mode_activebackup team_mode_broadcast team_mode_loadbalance libcrc32c ip_set nfnetlink bluetooth ecdh_generic ecc team_mode_roundrobin veth vxlan ip6_udp_tunnel udp_tunnel tun sit macvtap tap macvlan ipip tunnel4 8021q garp mrp team bonding bridge stp llc dummy ip_gre ip_tunnel gre rfkill sunrpc intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp 
[12751.073892]  kvm_intel kvm irqbypass crct10dif_pclmul mei_hdcp crc32_pclmul iTCO_wdt mei_wdt iTCO_vendor_support ghash_clmulni_intel snd_hda_codec_hdmi intel_cstate snd_hda_codec_realtek intel_uncore ppdev snd_hda_codec_generic ledtrig_audio intel_rapl_perf qcserial wmi_bmof usb_wwan snd_hda_intel snd_intel_dspcfg snd_pcsp snd_hda_codec snd_hda_core snd_hwdep snd_pcm e1000e snd_timer snd parport_pc mei_me parport soundcore mei i2c_i801 lpc_ich ie31200_edac i915 i2c_algo_bit drm_kms_helper drm crc32c_intel wmi video [last unloaded: ip_tables] 
[12751.218009] CR2: 0000000000000048 
[12751.221572] ---[ end trace 0121e7b4ec169ba6 ]--- 
[12751.226552] RIP: 0010:device_links_flush_sync_list+0xbd/0x100 
[12751.232660] Code: 48 89 42 08 48 89 10 48 89 9d d0 00 00 00 48 89 9d d8 00 00 00 49 39 ed 74 0c 48 8d bd 80 00 00 00 e8 e7 cc 39 00 48 8b 45 60 <48> 8b 40 48 48 85 c0 0f 85 68 ff ff ff 48 8b 45 68 48 85 c0 0f 84 
[12751.252714] RSP: 0018:ffffb725c03dbb30 EFLAGS: 00010246 
[12751.258314] RAX: 0000000000000000 RBX: ffff9810525a98d0 RCX: 0000000000000000 
[12751.265941] RDX: ffffb725c03dbb60 RSI: ffff9810525a98d0 RDI: ffff9810525a98d0 
[12751.273550] RBP: ffff9810525a9800 R08: ffffb725c03dbb60 R09: ffffb725c03dbb60 
[12751.281152] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb725c03dba90 
[12751.288770] R13: ffff9810525a9800 R14: ffffb725c03dbb60 R15: 0000000000000000 
[12751.296418] FS:  00007fe821d9a740(0000) GS:ffff9810da400000(0000) knlGS:0000000000000000 
[12751.305016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[12751.311164] CR2: 0000000000000048 CR3: 000000011602c002 CR4: 00000000001606f0 

2. What is the Version-Release number of the kernel:
kernel-5.5.11-200.fc31

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
Yes. I don't know exactly when it first occurred; roughly a few weeks ago.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
Yes, by running the simwifi_open test from the NetworkManager test suite:

https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci

Run nmcli/./runtest.sh simwifi_open from the NetworkManager-ci directory. Be careful: some steps are meant for a dedicated test machine (the root password is changed, a test user is deployed, multiple test devices are created, etc.).

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
kernel 5.6.0.rc4 seems to fix the issue
scratch build from https://koji.fedoraproject.org/koji/taskinfo?taskID=42738694 seems to fix the issue too

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
The module is from the kernel-extra-internal subpackage.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 Davide Caratti 2020-03-24 19:09:50 UTC
(In reply to Vladimir Benes from comment #0)
> 1. Please describe the problem:
> there is a crash when loading mac80211_hwsim module:
> [12750.707904] cfg80211: Loading compiled-in X.509 certificates for
> regulatory database 
> [12750.716473] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7' 
> [12750.783611] mac80211_hwsim: initializing netlink 
> [12750.789726] BUG: kernel NULL pointer dereference, address:
> 0000000000000048

[...]
 
> scratch build from
> https://koji.fedoraproject.org/koji/taskinfo?taskID=42738694 seems to fix
> the issue too

Sorry for the wrong Bugzilla number in the build version.
I rebuilt it as https://koji.fedoraproject.org/koji/taskinfo?taskID=42745543 .
The fix consists of including the following two commits:

commit 77036165d8bcf7c7b2a2df28a601ec2c52bb172d
Author: Saravana Kannan <saravanak>
Date:   Fri Feb 21 00:05:10 2020 -0800

    driver core: Skip unnecessary work when device doesn't have sync_state()
    
    A bunch of busy work is done for devices that don't have sync_state()
    support. Stop doing the busy work.
    
    Signed-off-by: Saravana Kannan <saravanak>
    Link: https://lore.kernel.org/r/20200221080510.197337-4-saravanak@google.com
    Signed-off-by: Greg Kroah-Hartman <gregkh>

commit ac338acf514e7b578fa9e3742ec2c292323b4c1a
Author: Saravana Kannan <saravanak>
Date:   Fri Feb 21 00:05:09 2020 -0800

    driver core: Add dev_has_sync_state()
    
    Add an API to check if a device has sync_state support in its driver or
    bus.
    
    Signed-off-by: Saravana Kannan <saravanak>
    Link: https://lore.kernel.org/r/20200221080510.197337-3-saravanak@google.com
    Signed-off-by: Greg Kroah-Hartman <gregkh>
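
For readers unfamiliar with these two commits, the following is a minimal user-space sketch in C of the idea behind them, not the kernel patch itself. The structures are cut down to the two pointers that matter here, and flush_sync_list(), count_sync_state() and sync_state_calls are hypothetical names introduced only for illustration. dev_has_sync_state() follows the description in the second commit message (a check on the device's driver or bus). That the oops in the description comes from a device with no bus is an inference from the register dump (RAX is 0 and the faulting address is 0x48); it is not confirmed in this report.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified user-space stand-ins for the kernel structures.
 * Only the fields relevant to this sketch are modelled (assumption).
 */
struct device;

struct bus_type {
	void (*sync_state)(struct device *dev);
};

struct device_driver {
	void (*sync_state)(struct device *dev);
};

struct device {
	struct bus_type *bus;         /* may be NULL, e.g. for a bus-less device */
	struct device_driver *driver; /* may be NULL before a driver is bound */
};

/* Hypothetical callback that only counts how often it was invoked. */
static int sync_state_calls;

static void count_sync_state(struct device *dev)
{
	(void)dev;
	sync_state_calls++;
}

/*
 * NULL-safe check in the spirit of dev_has_sync_state(): true only if
 * the driver or the bus provides a sync_state() callback.
 */
static bool dev_has_sync_state(const struct device *dev)
{
	if (!dev)
		return false;
	if (dev->driver && dev->driver->sync_state)
		return true;
	if (dev->bus && dev->bus->sync_state)
		return true;
	return false;
}

/*
 * Hypothetical flush loop: devices without sync_state() support are
 * skipped before any of their pointers are dereferenced.
 * Returns the number of callbacks that were invoked.
 */
static int flush_sync_list(struct device *const *devs, size_t n)
{
	int called = 0;

	for (size_t i = 0; i < n; i++) {
		struct device *dev = devs[i];

		if (!dev_has_sync_state(dev))
			continue; /* nothing to do, and dev->bus may be NULL */

		if (dev->bus && dev->bus->sync_state)
			dev->bus->sync_state(dev);
		else
			dev->driver->sync_state(dev);
		called++;
	}
	return called;
}
```

With a check like this in front of the loop body, a device that has neither a bus nor a driver callback is skipped instead of having a callback read through a NULL bus pointer.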

Comment 2 Davide Caratti 2020-03-24 20:14:04 UTC
Created attachment 1673199
backport for f31

Comment 3 Davide Caratti 2020-03-24 20:17:25 UTC
Requested a stable backport: https://lore.kernel.org/lkml/f22b7cd6fb6256f56e908e021f4fe389f3a6ee07.camel@redhat.com/

