Bug 1335173
| Summary: | CONFIG_DEBUG_VM_PGFLAGS causes kernel panic for the X.org server: kernel BUG at /usr/src/kernels/4.5.3-300.fc24.x86_64/include/linux/page-flags.h:272 | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Artem S. Tashkinov <aros> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 24 | CC: | ben.r.xiao, bugzilla-redhat, clodoaldo.pinto.neto, gansalmon, itamar, jforbes, jonathan, kernel-maint, knutjbj, leigh123linux, madhu.chinakonda, marek.gresko, mchehab, mike, mrmazda, spetreolle, todoleza |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-4.5.5-300.fc24 kernel-4.5.5-201.fc23 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-05-27 13:12:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1303860, 1317296, 1335392 | | |
| Attachments: | | | |

Description
Artem S. Tashkinov
2016-05-11 13:27:02 UTC
Fedora does not provide or support proprietary device drivers. You will need to address this with whomever you received the nvidia driver from.

Considering that nouveau is unusable on my GPU (see bug 1335175) and causes frequent (every minute) hard freezes, yeah, that's amazing.

Considering that overall nouveau is several times slower than NVIDIA's blob, yeah, that's great. It also doesn't support power management features.

Considering that overall nouveau is a lot less stable and still has numerous glitches even on the hardware that is ostensibly well supported, yeah, that's admirable.

Considering that you simply hate your own users and believe everyone should be running Intel GPUs whose drivers under Linux have *dozens* of critical errors, yeah, that's exemplary.

Oh, maybe this particular bug report is an exception? Let's check:

You do not intentionally break hardware compatibility? Oh, wait, you do.
You do not intentionally break API/ABI compatibility? Oh, wait, you do.
You do not limit users' choice in regard to running software or drivers? Oh, wait, you do.
You make sure your software is bug-free? Oh, wait, you don't.
You make sure you have the most stringent QA/QC process? Oh, wait, you don't.

You know why less than 1% of people run Linux? That's because you're actively fucking with your users. You actively make their lives harder - you actually do everything so that they don't use Linux.

I hope you're happy in your little wretched Linux world.

If that's the official position of the company, then I'm simply appalled.

CLOSED -> CANT FIX?

More like "Closed" -> "Fuck you" because I'm currently successfully running vanilla kernel 4.5.3 with NVIDIA drivers, which means you actively sabotaged the Linux kernel to make it incompatible with 3rd party binary drivers.

That's fucking great. You should be proud of yourself.

(In reply to Artem S. Tashkinov from comment #2)
> If that's the official position of the company, then I'm simply appalled.

Comment #1 is not an official position of Red Hat.

(In reply to Artem S. Tashkinov from comment #3)
> which means you actively sabotaged
> the Linux kernel to make it incompatible with 3rd party binary drivers.

For the record, this is untrue. We do not actively or intentionally add anything to make external modules more difficult to use.

We have not done anything to make the nvidia driver difficult to use, we cannot support it simply because the source is not visible to us. Even if it were, out of tree modules are much more difficult to track and debug, so the blanket Fedora statement is that we can't support out of tree modules. We do not do anything at all to make it more difficult to install or run them though, in fact we have made changes to the kernel spec recently that should make it easier to do so.

Frankly, it is kind of important that the nvidia driver does run in F24 because nouveau is pretty much unusable as a desktop on any recent nvidia card (maxwell+ chipsets) since nvidia introduced the signed blob requirement in hardware (this improves in 4.6 since nvidia finally pushed some things upstream). That doesn't change the fact that we have zero visibility into their driver source, and that severely limits our ability to fix issues with it. Nvidia claims to support all released stable kernels, so they should fix issues with 4.5 on their end. Unfortunately they are also in the process of fighting with upstream because they want to do wayland support differently from everyone else.

Created attachment 1156782
patch for 4.5 kernel

(In reply to Artem S. Tashkinov from comment #3)
> CLOSED -> CANT FIX?
>
> More like "Closed" -> "Fuck you" because I'm currently successfully running
> vanilla kernel 4.5.3 with NVIDIA drivers, which means you actively sabotaged
> the Linux kernel to make it incompatible with 3rd party binary drivers.
>
> That's fucking great. You should be proud of yourself.

Well it's fairly easy to patch nvidia to work with the fedora kernel.

(In reply to Josh Boyer from comment #5)
> (In reply to Artem S. Tashkinov from comment #3)
> > which means you actively sabotaged
> > the Linux kernel to make it incompatible with 3rd party binary drivers.
>
> For the record, this is untrue. We do not actively or intentionally add
> anything to make external modules more difficult to use.

Can you see if the patch in comment #7 helps you work out why the fedora kernel doesn't work with nvidia? Other distros don't have this issue.

(In reply to leigh scott from comment #8)

Great many thanks!

My only question is why I need no patches to successfully run the same driver on my vanilla kernel (4.5.3).

(In reply to Artem S. Tashkinov from comment #9)
> (In reply to leigh scott from comment #8)
>
> Great many thanks!
>
> My only question is why I need no patches to successfully run the same
> driver on my vanilla kernel (4.5.3).

I guess the fedora kernel has a different CONFIG file or a patch that causes the issue:
https://devtalk.nvidia.com/default/topic/928352/crash-with-kernel-4-5-and-4-6/

I also believe the intel driver has a similar issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1303860

(In reply to leigh scott from comment #8)

This patch fixes the problem. Thank you!

Still, only Fedora users face this issue, which means some Fedora-specific patches and/or kernel options interfere with NVIDIA drivers.

(In reply to Justin M. Forbes from comment #6)
> We have not done anything to make the nvidia driver difficult to use, we

This is the commit that breaks the nvidia driver, please revert:
http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/commit/config-generic?id=42aa4321c736d85b027e9cf2e595db174b3cc76b

+CONFIG_DEBUG_VM_PGFLAGS=y

There are several related bugs: bug 1303860, bug 1317296 and bug 1335392.

*** Bug 1336093 has been marked as a duplicate of this bug. ***

Okay, there are a couple of issues here.
What is being exposed by this new config option is quite possibly valid bugs, so we don't want to disable it entirely. I noticed from the thread that nvidia will be tracking it down on their end. I am guessing some of the other associated issues are real bugs as well. It would have been nice if they didn't exist, but having this enabled has pointed some out. And since this never hit a stable release, it is just another example of how community testing can really help out.

I am turning off CONFIG_DEBUG_VM_PGFLAGS for non debug kernels, so things should work as well as they did before on release, but it will remain on for debug kernels so that hopefully some of these issues can be found/resolved and we have continuous testing with rawhide. The next builds for F23/F24/rawhide should have the change, and I will not push the kernel without it to F23, it will remain in updates-testing.

(In reply to Justin M. Forbes from comment #15)
> Okay, there are a couple of issues here. What is being exposed by this new
> config option is quite possibly valid bugs, so we don't want to disable it
> entirely. I noticed from the thread that nvidia will be tracking it down on
> their end. I am guessing some of the other associated issues are real bugs
> as well. It would have been nice if they didn't exist, but having this
> enabled has pointed some out. And since this never hit a stable release, it
> is just another example of how community testing can really help out.
>

Thank you for revisiting this issue and reverting it for now.

> I am turning off CONFIG_DEBUG_VM_PGFLAGS for non debug kernels, so things
> should work as well as they did before on release, but it will remain on for
> debug kernels so that hopefully some of these issues can be found/resolved
> and we have continuous testing with rawhide. The next builds for
> F23/F24/rawhide should have the change, and I will not push the kernel
> without it to F23, it will remain in updates-testing.

And thank you for not pushing the current build in updates-testing.

*** Bug 1338064 has been marked as a duplicate of this bug. ***

*** Bug 1338076 has been marked as a duplicate of this bug. ***

Trace of Bug 1338076 (duplicate)

May 22 11:12:53 lap kernel: ------------[ cut here ]------------
May 22 11:12:53 lap kernel: kernel BUG at include/linux/page-flags.h:272!
May 22 11:12:53 lap kernel: invalid opcode: 0000 [#1] SMP
May 22 11:12:53 lap kernel: Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack michael_mic arc4 lib80211_crypt_tkip lib80211_crypt_ccmp ip_set nfnetlink ebtable_nat ebtable_broute bridge ip6table_raw ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_security iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_security ebtable_filter ebtables ip6table_filter ip6_tables ipw2200 snd_intel8x0m snd_intel8x0 libipw snd_ac97_codec lib80211 iTCO_wdt ac97_bus snd_seq snd_seq_device snd_pcm ppdev cfg80211 iTCO_vendor_support hp_wmi sparse_keymap snd_timer snd rfkill lpc_ich joydev soundcore irda tifm_7xx1 parport_pc
May 22 11:12:53 lap kernel: tifm_core parport tpm_infineon crc_ccitt acpi_cpufreq tpm_tis tpm nfsd auth_rpcgss nfs_acl lockd grace i915 8021q i2c_algo_bit garp drm_kms_helper stp llc mrp syscopyarea sysfillrect sysimgblt fb_sys_fops sdhci_pci tg3 drm sdhci mmc_core ata_generic serio_raw ptp yenta_socket pata_acpi pps_core wmi fjes video sunrpc scsi_transport_iscsi
May 22 11:12:53 lap kernel: CPU: 0 PID: 1014 Comm: Xorg Not tainted 4.5.4-300.fc24.i686 #1
May 22 11:12:53 lap kernel: Hardware name: Hewlett-Packard HP Compaq nc6220 (PU982AW#ABA)/308A, BIOS 68DTU Ver. F.16 07/24/2009
May 22 11:12:53 lap kernel: task: eff75280 ti: f1378000 task.ti: f1378000
May 22 11:12:53 lap kernel: EIP: 0060:[<f7ec3ca2>] EFLAGS: 00013286 CPU: 0
May 22 11:12:53 lap kernel: EIP is at drm_pci_alloc+0xc2/0x1b0 [drm]
May 22 11:12:53 lap kernel: EAX: 00000000 EBX: 00004000 ECX: f69fbec8 EDX: 00000007
May 22 11:12:53 lap kernel: ESI: f05c4880 EDI: c04092e0 EBP: f1379b90 ESP: f1379b6c
May 22 11:12:53 lap kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
May 22 11:12:53 lap kernel: CR0: 80050033 CR2: b6848000 CR3: 3059a000 CR4: 000006d0
May 22 11:12:53 lap kernel: Stack:
May 22 11:12:54 lap kernel: 024040c0 00000000 f6412064 f8152688 024040c0 90f47428 00000000 efc558c0
May 22 11:12:54 lap kernel: efc55938 f1379ba8 f80c761f 00000100 f66d0a00 f5693300 f5693240 f1379bdc
May 22 11:12:54 lap kernel: f810e71b f40a0000 eec5e000 f4b1e800 f80f6201 ee928c00 f4117000 00000000
May 22 11:12:54 lap kernel: Call Trace:
May 22 11:12:54 lap kernel: [<f80c761f>] i915_gem_object_attach_phys+0xef/0x190 [i915]
May 22 11:12:54 lap kernel: [<f810e71b>] intel_prepare_plane_fb+0x18b/0x300 [i915]
May 22 11:12:54 lap kernel: [<f80f6201>] ? pipe_dsl_stopped+0x31/0x80 [i915]
May 22 11:12:54 lap kernel: [<f7e61385>] drm_atomic_helper_prepare_planes+0x45/0xb0 [drm_kms_helper]
May 22 11:12:55 lap kernel: [<f8105d90>] intel_atomic_commit+0x270/0x1770 [i915]
May 22 11:12:55 lap kernel: [<f81086f6>] ? intel_atomic_check+0x886/0x10a0 [i915]
May 22 11:12:55 lap kernel: [<f8107e70>] ? intel_link_compute_m_n+0x50/0x50 [i915]
May 22 11:12:55 lap kernel: [<f7ed8cdf>] ? drm_atomic_check_only+0x19f/0x670 [drm]
May 22 11:12:55 lap kernel: [<f7ed84a4>] ? drm_atomic_get_crtc_state+0x54/0xc0 [drm]
May 22 11:12:55 lap kernel: [<c057ad4a>] ? kmemdup+0x2a/0x40
May 22 11:12:56 lap kernel: [<f8105b20>] ? modeset_get_crtc_power_domains+0x140/0x140 [i915]
May 22 11:12:56 lap kernel: [<f7ed91e4>] drm_atomic_commit+0x34/0x60 [drm]
May 22 11:12:56 lap kernel: [<f7e61d5c>] drm_atomic_helper_update_plane+0xbc/0x100 [drm_kms_helper]
May 22 11:12:56 lap kernel: [<f7e61ca0>] ? drm_atomic_helper_wait_for_vblanks+0x210/0x210 [drm_kms_helper]
May 22 11:12:56 lap kernel: [<f7ec8d4b>] __setplane_internal+0x1eb/0x230 [drm]
May 22 11:12:56 lap kernel: [<f7ec8f0e>] drm_mode_cursor_common+0x17e/0x3a0 [drm]
May 22 11:12:57 lap kernel: [<f7ecd210>] ? drm_mode_setcrtc+0x570/0x570 [drm]
May 22 11:12:57 lap kernel: [<f7ecd267>] drm_mode_cursor_ioctl+0x57/0x70 [drm]
May 22 11:12:57 lap kernel: [<f7ebe319>] drm_ioctl+0x149/0x4f0 [drm]
May 22 11:12:57 lap kernel: [<f80ca4c6>] ? i915_gem_fault+0xa6/0x4c0 [i915]
May 22 11:12:57 lap kernel: [<f7ecd210>] ? drm_mode_setcrtc+0x570/0x570 [drm]
May 22 11:12:57 lap kernel: [<c0589e97>] ? __do_fault+0x67/0x180
May 22 11:12:58 lap kernel: [<f7ebe1d0>] ? drm_getmap+0xc0/0xc0 [drm]
May 22 11:12:58 lap kernel: [<c05d8521>] do_vfs_ioctl+0x91/0x6f0
May 22 11:12:58 lap kernel: [<c06ca09d>] ? selinux_file_ioctl+0xfd/0x1c0
May 22 11:12:58 lap kernel: [<c06c0f4c>] ? security_file_ioctl+0x3c/0x60
May 22 11:12:58 lap kernel: [<c05d8be8>] SyS_ioctl+0x68/0x80
May 22 11:12:59 lap kernel: [<c0401bef>] do_fast_syscall_32+0x8f/0x140
May 22 11:12:59 lap kernel: [<c0af675b>] sysenter_past_esp+0x40/0x61
May 22 11:12:59 lap kernel: Code: 70 8d 82 00 00 00 40 c1 e8 0c 8d 0c 80 a1 04 41 ff c0 8d 04 c8 8b 08 80 e5 40 74 7d 90 8d 74 26 00 ba 18 a5 ee f7 e8 9e 3d 6c c8 <0f> 0b 8d 74 26 00 8d 55 ec 8d 45 e4 e8 cd 54 54 c8 84 c0 0f 84
May 22 11:12:59 lap kernel: EIP: [<f7ec3ca2>] drm_pci_alloc+0xc2/0x1b0 [drm] SS:ESP 0068:f1379b6c
May 22 11:12:59 lap kernel: ---[ end trace 3e84658473ffb87a ]---

kernel-4.5.5-300.fc24 has been submitted as an update to Fedora 24.
https://bodhi.fedoraproject.org/updates/FEDORA-2016-f8739a80b0

I have installed this new kernel on both my machines, the i686 laptop and the x86_64 desktop, and booted each machine once.

x86_64: there never was a problem and this does not appear to have damaged anything: everything appears to be as it was before.

i686: booting was only possible with kernel option nomodeset, but now normal booting is possible: problem solved (booting without kernel option nomodeset does not cause kernel crash).

kernel-4.5.5-300.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-f8739a80b0

kernel-4.5.5-201.fc23 has been submitted as an update to Fedora 23.
https://bodhi.fedoraproject.org/updates/FEDORA-2016-06f1572324

kernel-4.5.5-300.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.5.5-201.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-06f1572324

kernel-4.5.5-201.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Meanwhile NVIDIA has fixed the issue on their side:
https://devtalk.nvidia.com/default/topic/941337

Release highlights since 367.18:
Fixed a bug that caused kernel panics when using the NVIDIA driver on v4.5 and newer Linux kernels built with CONFIG_DEBUG_VM_PGFLAGS.
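
For anyone who wants to check whether the kernel they are running was built with CONFIG_DEBUG_VM_PGFLAGS, here is a minimal sketch in Python. It is not taken from this report: the helper names (`option_state`, `running_kernel_config`) are made up for illustration, and it assumes the build config is installed as /boot/config-<release>, which is where Fedora kernels normally put it; other distributions may keep it elsewhere or not ship it at all.

```python
import platform
import re
from pathlib import Path

OPTION = "CONFIG_DEBUG_VM_PGFLAGS"


def option_state(config_text: str, option: str = OPTION) -> str:
    """Return the value assigned to `option` (e.g. 'y'), or 'unset'.

    Kernel config files write enabled options as NAME=value and
    disabled ones as a comment ('# NAME is not set'), so only an
    uncommented assignment counts as set.
    """
    pattern = re.compile(rf"^{re.escape(option)}=(.+)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1).strip() if match else "unset"


def running_kernel_config() -> Path:
    # Assumed location (hypothetical outside Fedora-like layouts).
    return Path("/boot") / f"config-{platform.release()}"


if __name__ == "__main__":
    path = running_kernel_config()
    if path.is_file():
        print(f"{OPTION}: {option_state(path.read_text())} ({path})")
    else:
        print(f"No config found at {path}; cannot tell.")
```

Going by the comments above, a non-debug 4.5.5 Fedora kernel should report the option as unset, while a debug kernel should still report `y`.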