Bug 1691034

Summary: kernel BUG: unable to handle kernel NULL pointer dereference at 0000000000000043 when starting hostapd
Product: [Fedora] Fedora
Reporter: Gabriel Ramirez <gabriello.ramirez>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 29
CC: airlied, bskeggs, hdegoede, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, labbott, linville, mchehab, mjg59, steved
Flags: gabriello.ramirez: needinfo-
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-09-17 20:04:49 UTC
Type: Bug
Attachments:
Description                                              Flags
kernel-5.0.3-200.fc29.x86_64 bug when hostapd starts     none
kernel bug in kernel-5.1.0-0.rc0.git9.1.fc31.x86_64      none
working hostapd with kernel 4.20.16-200.fc29.x86_64      none
Revert ff9fb72bc07705c00795ca48631f7fffe24d2c6b          none
test patch from upstream                                 none

Description Gabriel Ramirez 2019-03-20 16:58:02 UTC
Created attachment 1546174 [details]
kernel-5.0.3-200.fc29.x86_64 bug when hostapd starts

1. Please describe the problem:


The system has a PCI Express wireless card:
08:00.0 Network controller: Intel Corporation Wireless 8265 / 8275 (rev 50)

When I start hostapd with:
systemctl start hostapd
under kernel-5.0.3-200.fc29.x86_64+debug, kernel-5.0.3-200.fc29.x86_64, or kernel-5.0.1-300.fc29.x86_64, the kernel bug occurs.

2. What is the Version-Release number of the kernel:
kernel-5.0.3-200.fc29.x86_64; it also occurs in kernel-5.0.1-300.fc29.x86_64

3. Did it work previously in Fedora? Yes, it works fine in kernel-4.20.16-200.fc29.x86_64


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
Run systemctl start hostapd under kernel-5.0.3-200.fc29.x86_64

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes, it also occurs with kernel 5.1.0-0.rc0.git9.1.fc31.x86_64

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

no

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 Gabriel Ramirez 2019-03-20 17:06:49 UTC
Created attachment 1546176 [details]
kernel bug in kernel-5.1.0-0.rc0.git9.1.fc31.x86_64

The bug also occurs in the Rawhide kernel, kernel-5.1.0-0.rc0.git9.1.fc31.x86_64.

Comment 2 Gabriel Ramirez 2019-03-20 17:16:29 UTC
Created attachment 1546178 [details]
working hostapd with kernel 4.20.16-200.fc29.x86_64

working hostapd with kernel 4.20.16-200.fc29.x86_64

Comment 3 Laura Abbott 2019-03-22 17:27:12 UTC
I sent a message to the upstream maintainers; we'll see if they respond. In the meantime, a bisect might be helpful to narrow this down to a specific commit.

Comment 4 Gabriel Ramirez 2019-03-26 23:21:24 UTC
Hi,

The last kernel without the error is kernel-5.0.0-0.rc5.git0.1.fc30.

The first kernel with the error is kernel-5.0.0-0.rc6.git0.1.fc30.

The bug is also present in kernel-5.0.4-200.fc29.x86_64.

I bisected starting from kernel-5.0.0-0.rc5.git0.1.fc30. After the last "git bisect bad", git reports:

ff9fb72bc07705c00795ca48631f7fffe24d2c6b is the first bad commit
commit ff9fb72bc07705c00795ca48631f7fffe24d2c6b
Author: Greg Kroah-Hartman <gregkh>
Date:   Wed Jan 23 11:28:14 2019 +0100

    debugfs: return error values, not NULL
    
    When an error happens, debugfs should return an error pointer value, not
    NULL.  This will prevent the totally theoretical error where a debugfs
    call fails due to lack of memory, returning NULL, and that dentry value
    is then passed to another debugfs call, which would end up succeeding,
    creating a file at the root of the debugfs tree, but would then be
    impossible to remove (because you can not remove the directory NULL).
    
    So, to make everyone happy, always return errors, this makes the users
    of debugfs much simpler (they do not have to ever check the return
    value), and everyone can rest easy.
    
    Reported-by: Gary R Hook <ghook>
    Reported-by: Heiko Carstens <heiko.carstens.com>
    Reported-by: Masami Hiramatsu <mhiramat>
    Reported-by: Michal Hocko <mhocko>
    Reported-by: Sebastian Andrzej Siewior <bigeasy>
    Reported-by: Ulf Hansson <ulf.hansson>
    Reviewed-by: Masami Hiramatsu <mhiramat>
    Reviewed-by: Sebastian Andrzej Siewior <bigeasy>
    Signed-off-by: Greg Kroah-Hartman <gregkh>

So I tried reverting that commit on top of 5.0.4-200.fc29.x86_64, and with the revert hostapd runs without error.
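
For readers unfamiliar with the convention the commit message refers to, here is a minimal userspace sketch in C of how a NULL return differs from an error-pointer return. The helpers ERR_PTR, PTR_ERR, IS_ERR and IS_ERR_OR_NULL are modeled on the kernel's include/linux/err.h, but this is not kernel code. struct fake_dentry, create_old, create_new, legacy_check_ok and robust_check_ok are hypothetical names invented for this illustration. Nothing in this report confirms which code path in the wireless stack actually crashed, so the sketch only shows the general hazard: a caller that tests only for NULL will treat an error pointer as a valid object.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the kernel's error-pointer convention. */
#define MAX_ERRNO 4095

/* Encode a negative errno value in the pointer itself. */
static inline void *ERR_PTR(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
    return ptr == NULL || IS_ERR(ptr);
}

/* Hypothetical stand-in for a debugfs entry; not the kernel's struct dentry. */
struct fake_dentry { int id; };

static struct fake_dentry demo_entry = { 1 };

/* Old convention (assumed for illustration): failure is reported as NULL. */
static inline struct fake_dentry *create_old(int fail)
{
    if (fail)
        return NULL;
    return &demo_entry;
}

/* New convention (assumed for illustration): failure is an error pointer. */
static inline struct fake_dentry *create_new(int fail)
{
    if (fail)
        return ERR_PTR(-ENOMEM);
    return &demo_entry;
}

/* A caller written for the old convention: only NULL counts as failure. */
static inline int legacy_check_ok(const struct fake_dentry *d)
{
    return d != NULL;
}

/* A caller that rejects both NULL and error pointers. */
static inline int robust_check_ok(const struct fake_dentry *d)
{
    return !IS_ERR_OR_NULL(d);
}
```

Under these assumptions, legacy_check_ok(create_new(1)) returns 1 even though the creation failed, so a caller using that check would go on to use an invalid pointer, while robust_check_ok(create_new(1)) returns 0. The commit message's stated intent is that callers should not need to check the return value at all; the sketch only covers callers that still do.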

Comment 5 Gabriel Ramirez 2019-03-26 23:24:30 UTC
Created attachment 1548328 [details]
Revert ff9fb72bc07705c00795ca48631f7fffe24d2c6b

With this patch applied to kernel-5.0.4-200.fc29.x86_64, hostapd runs without causing any error.

Comment 6 Laura Abbott 2019-03-26 23:56:46 UTC
Thanks for the bisect! It looks like there was a patch to fix this that wasn't applied by the Intel maintainers. I e-mailed the maintainers asking them to provide a proper fix.

Comment 7 Gabriel Ramirez 2019-03-27 02:38:10 UTC
I can test the patch if you want.

thanks

Comment 8 Laura Abbott 2019-04-01 15:41:14 UTC
Created attachment 1550620 [details]
test patch from upstream

Can you test this patch? This is the proposed fix.

Comment 9 Gabriel Ramirez 2019-04-02 08:01:07 UTC
Hello,

I applied the patch to kernel-5.0.5-200.fc29.src.rpm and booted the resulting kernel (Linux version 5.0.5-200.chdebugfs1.fc29.x86_64).

In one terminal I ran systemctl start hostapd; the process just hangs and does not return to the shell.
In a second terminal, "journalctl -f --no-hostname -k -l" shows no output related to "systemctl start hostapd".

So there is nothing to report beyond the fact that systemctl start hostapd hangs.

Comment 10 Justin M. Forbes 2019-08-20 17:41:19 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 29 kernel bugs.

Fedora 29 has now been rebased to 5.2.9-100.fc29.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 30, and are still experiencing this issue, please change the version to Fedora 30.

If you experience different issues, please open a new bug report for those.

Comment 11 Justin M. Forbes 2019-09-17 20:04:49 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.