Bug 1674268

Summary: Kernels 4.19, 4.20, 5.0 hang on boot with Realtek r8169
Product: [Fedora] Fedora Reporter: Steven Usdansky <usdanskys>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: NEW --- QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rawhide CC: airlied, bskeggs, hdegoede, hkallweit1, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, jyundt, kernel-maint, linville, mchehab, mjg59, steved, y9t7sypezp
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Steven Usdansky 2019-02-10 16:22:34 UTC
1. Please describe the problem:
Kernels 4.19, 4.20, 5.0 hang on boot with Realtek r8169


2. What is the Version-Release number of the kernel:
Latest kernel with which this occurs is 5.0.0-0.rc5.git0.1.fc30.x86_64 (current Rawhide kernel)



3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
First appeared with 4.19 kernels

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
a) Install any 4.19+ kernel
b) reboot system


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
Yes

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

No log from hung boot.

There appears to be a fix, described at https://bugzilla.kernel.org/show_bug.cgi?id=202357
Reverting the change to line 4964 as shown in the link (line 4972 in the current kernel) and recompiling has allowed me to boot and file this bug.

Previous incarnations of this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1648366
https://bugzilla.redhat.com/show_bug.cgi?id=1660649

Comment 1 Steve 2019-02-13 19:20:20 UTC
> See the description at https://bugzilla.kernel.org/show_bug.cgi?id=202357 
> Reverting the change to line 4964 as shown in the link (line 4972 in the current kernel) and recompiling 
> has allowed me to boot and file this bug.

You can link to the source code for a specific kernel version in the kernel git stable repo:

Click on a "Tag" here and then on the "tree" tab:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/refs/tags

Thus, this is the source code for r8169.c in kernel 4.20.7:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/realtek/r8169.c?h=v4.20.7

NB: The line numbers along the left side are links.
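As a rough sketch, the same kind of link can be assembled on the command line. The URL layout is copied from the links quoted in this report; the function name kernel_src_url is made up for the example, and the "#nNNNN" line anchor is assumed to behave the same way for every tag.

```shell
# Build a link into the stable tree for a tag, a file path and an optional line.
# kernel_src_url is a hypothetical helper name, not an existing tool.
kernel_src_url() {
    tag=$1
    path=$2
    line=${3:-}
    base="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree"
    url="$base/$path?h=$tag"
    # Append the line anchor only when a line number was given.
    if [ -n "$line" ]; then
        url="$url#n$line"
    fi
    printf '%s\n' "$url"
}

# Link to line 4972 of r8169.c as of v5.0-rc5.
kernel_src_url v5.0-rc5 drivers/net/ethernet/realtek/r8169.c 4972
```

Leaving out the third argument gives the link to the whole file, as in the 4.20.7 example above.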

Comment 2 Steven Usdansky 2019-02-17 16:35:27 UTC
I modified the code at https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/realtek/r8169.c?h=v5.0-rc5#n4972 by commenting out the line. On my system, I just modified r8169.c, returned it to ~/rpmbuild/SOURCES/linux-5.0-rc5.tar.gz, and built the binary rpm:
rpmbuild -bb --without debug --without debuginfo kernel.spec



~/Desktop$ diff -u r8169.c r8169.modified.c > r8169.patch
~/Desktop$ cat r8169.patch 
--- r8169.c	2019-02-17 10:13:51.729632146 -0600
+++ r8169.modified.c	2019-02-17 10:14:19.212118736 -0600
@@ -4969,7 +4969,7 @@
     RTL_W32(tp, MISC, RTL_R32(tp, MISC) | PWM_EN);
     RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~Spi_en);
 
-    rtl_hw_aspm_clkreq_enable(tp, true);
+    //rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_start_8168f(struct rtl8169_private *tp)
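
For anyone who wants to reproduce the edit without an editor, here is a minimal sketch using sed and diff on a scratch copy. The file created below is only a five-line stand-in for r8169.c, not the real source. The real file may call rtl_hw_aspm_clkreq_enable() from more than one place, so the sed expression would have to be limited to the line in question there (for example by prefixing it with the line number). The resulting patch has no @@ context matching the real file and is for illustration only.

```shell
# Work in a scratch directory so no real source tree is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for r8169.c: only the lines around the change shown in the patch.
cat > r8169.c <<'EOF'
    RTL_W32(tp, MISC, RTL_R32(tp, MISC) | PWM_EN);
    RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~Spi_en);

    rtl_hw_aspm_clkreq_enable(tp, true);
}
EOF

# Comment out the call, keeping the original indentation (captured as \1).
sed 's|^\( *\)rtl_hw_aspm_clkreq_enable(tp, true);|\1//rtl_hw_aspm_clkreq_enable(tp, true);|' \
    r8169.c > r8169.modified.c

# diff exits with status 1 when the files differ, which is expected here.
diff -u r8169.c r8169.modified.c > r8169.patch || true
cat r8169.patch
```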

Comment 3 Heiner Kallweit 2019-12-02 21:32:59 UTC
Issue should be gone since 5.3.
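
A quick way to see which side of 5.3 a machine is on, as a sketch only: it assumes GNU sort with the -V (version sort) option, the helper name kernel_at_least is made up, and everything after the first "-" in the release string is ignored, so a 5.3 release candidate counts as 5.3. It compares version numbers and does not verify that the fix is present in a given distribution kernel.

```shell
# Return success when kernel version $1 is at least $2.
# kernel_at_least is a hypothetical helper; it relies on GNU `sort -V`.
kernel_at_least() {
    have=${1%%-*}   # drop the release suffix, e.g. "-0.rc5.git0.1.fc30.x86_64"
    want=$2
    # The smaller of the two versions sorts first; if that is $want, $have >= $want.
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

if kernel_at_least "$(uname -r)" 5.3; then
    echo "running kernel is 5.3 or newer"
else
    echo "running kernel predates 5.3"
fi
```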