Description of problem:
Commit 309381feaee564281c3d9e90fbca8963bb7428ad to the Linux kernel added some debugging code which takes some functions that were previously internal and exports them with EXPORT_SYMBOL_GPL. The debug code is exposed only if CONFIG_DEBUG_VM is set, and if it is, then public kernel interfaces which were previously usable by non-GPL modules (e.g. get_page()) regardless of whether CONFIG_DEBUG_VM was set will only be usable by GPL-compatible modules when CONFIG_DEBUG_VM is set. An example of an affected module is the "nvidia-uvm" module, which ships as part of NVIDIA 331 and later drivers. Before this kernel change, nvidia-uvm was buildable against a kernel with CONFIG_DEBUG_VM set (e.g. the default configuration in Fedora); after this change, it cannot be built due to GPL conflicts.

Version-Release number of selected component (if applicable):
3.14

How reproducible:
The nvidia-uvm module fails to build on Linux 3.14 RC kernels, if CONFIG_DEBUG_VM is set (as is the case on the default Fedora kernel configuration).

Steps to Reproduce:
1. Build a 3.14 RC kernel (e.g. 3.14-rc6), using the default Fedora kernel configuration as a starting point.
2. Install and boot into the kernel built in step (1).
3. Extract the contents of a recent NVIDIA driver, version 331 or later, passing the "--extract" option to the .run archive.
4. Change to the "kernel/uvm" subdirectory of the extracted installer.
5. Run `make module`

Actual results:
Building the nvidia-uvm module will fail, so long as CONFIG_DEBUG_VM is set on a 3.14 or later kernel.

Expected results:
Building the nvidia-uvm module should succeed.

Additional info:
It is reasonable for this build failure to occur, since the GPL-only exports are only exposed when VM debugging is configured.
It is unfortunate that kernel modules which were previously buildable on default Fedora kernels will no longer be buildable on default Fedora kernels as of Linux 3.14, if Fedora continues to enable CONFIG_DEBUG_VM by default. Possible remedies include:

a) Disable CONFIG_DEBUG_VM on the default Fedora kernel configuration (a quick survey of popular distros shows that this option is rarely enabled on the default kernel configuration for most distros: Fedora is the only distro we have identified so far that does set CONFIG_DEBUG_VM by default)
b) Implement the debugging from commit 309381feaee564281c3d9e90fbca8963bb7428ad in a way that doesn't

For (a), it would be reasonable to enable CONFIG_DEBUG_VM in Fedora kernels early in the development of a Fedora release (e.g. alpha and pre-alpha), and disable the option for beta and later releases.

A similar conflict exists between the nvidia.ko module and kernels with CONFIG_DEBUG_LOCK_ALLOC enabled: Fedora alpha kernels have this option enabled by default, and Fedora beta and release kernels have it disabled.
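For anyone wanting to check whether a given kernel falls into the affected configuration, a minimal sketch follows. It assumes the kernel's configuration file is installed at /boot/config-<release> (where <release> is the output of `uname -r`), which may not hold on every system; the helper name `config_has_option` is hypothetical and only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if OPTION is built in (=y) in CONFIG_FILE.
# Usage: config_has_option CONFIG_FILE OPTION
config_has_option() {
    # Anchor both ends so CONFIG_DEBUG_VM does not match CONFIG_DEBUG_VM_RB.
    grep -q "^$2=y\$" "$1"
}

# Assumption: the running kernel's config lives at /boot/config-<release>.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ] && config_has_option "$cfg" CONFIG_DEBUG_VM; then
    echo "CONFIG_DEBUG_VM is enabled in $cfg"
else
    echo "CONFIG_DEBUG_VM is not enabled (or $cfg is not readable)"
fi
```

The same helper can be pointed at CONFIG_DEBUG_LOCK_ALLOC, or at a .config file in a kernel build tree instead of /boot.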
Did you bring this issue up with the upstream authors of the commit? They may not have realized the impacts of the change they performed in terms of the GPL issues. If you did bring it up, could you please provide a link to a list archive with the discussion?
We had the same thought--the patch authors may not have noticed that they made get_page() and other routines effectively GPL-only. However, we then noticed that other major distros were not setting CONFIG_DEBUG_VM, so we decided to contact Fedora first, to get your views on this.
We've had CONFIG_DEBUG_VM set for a very long time. It is true we are one of the few distros to do so, but that also makes us the primary source for finding and reporting the bugs DEBUG_VM looks for. It has caught a few issues over time and having the added information available is always helpful to those debugging the problem. When last discussed, some of the core MM developers were surprised but generally supportive of us having this enabled. I would strongly suggest you contact upstream on their change first. Fedora may be the only major distro hitting this, but the code in question is not Fedora specific by any means. Getting resolution upstream is going to be more beneficial to a wider set of users and third-party module developers.
The initial response from the patch author was that he wants to keep it EXPORT_SYMBOL_GPL. I've added Josh to that email thread, but I don't expect that upstream is going to change this, from the sound of it.
Oh, and here is the link to the upstream discussion thread, on linux-mm, as requested: http://permalink.gmane.org/gmane.linux.kernel.mm/114539
I just got notified that upstream has accepted the patch for 3.15-rc1, so I think we can now consider this fixed!

-----Original Message-----
From: akpm [akpm]
Sent: Tuesday, April 01, 2014 4:09 PM
To: mm-commits.org; sasha.levin; jwboyer; John Hubbard
Subject: + mm-page_allocc-change-mm-debug-routines-back-to-export_symbol.patch added to -mm tree

Subject: + mm-page_allocc-change-mm-debug-routines-back-to-export_symbol.patch added to -mm tree
To: jhubbard,jwboyer,sasha.levin
From: akpm
Date: Tue, 01 Apr 2014 16:08:44 -0700

The patch titled
     Subject: mm/page_alloc.c: change mm debug routines back to EXPORT_SYMBOL
has been added to the -mm tree. Its filename is
     mm-page_allocc-change-mm-debug-routines-back-to-export_symbol.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_allocc-change-mm-debug-routines-back-to-export_symbol.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_allocc-change-mm-debug-routines-back-to-export_symbol.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days
(In reply to John Hubbard from comment #6)
> I just got notified that upstream has accepted the patch for 3.15-rc1, so I
> think we can now consider this fixed!

Well, sort of. That will fix it for 3.15, but it still leaves 3.14 broken. I'll keep an eye out and as soon as the patch lands in Linus' tree I'll backport it. I suspect there might be some further pushback, but I hope to be pleasantly surprised.
Added to the F20 3.14.1 rebase.
kernel-3.14.1-200.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/kernel-3.14.1-200.fc20
Package kernel-3.14.1-200.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.14.1-200.fc20'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-5502/kernel-3.14.1-200.fc20
then log in and leave karma (feedback).
kernel-3.14.2-200.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/kernel-3.14.2-200.fc20
kernel-3.14.2-200.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
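One way to confirm how a particular kernel build exports a symbol is to look it up in that build's Module.symvers file. The sketch below assumes the common tab-separated layout in which the second column is the symbol name and the fourth is the export type (EXPORT_SYMBOL or EXPORT_SYMBOL_GPL); some kernel versions lay the columns out differently, so treat that as an assumption. The function name `symbol_export_type` is hypothetical, and the path in the usage comment assumes the Fedora kernel-devel package is installed.

```shell
#!/bin/sh
# Hypothetical helper: print the export type recorded for SYMBOL in a
# Module.symvers file; exits non-zero if the symbol is not listed.
# Assumes column 2 is the symbol name and column 4 is the export type.
# Usage: symbol_export_type MODULE_SYMVERS SYMBOL
symbol_export_type() {
    awk -v sym="$2" '
        $2 == sym { print $4; found = 1 }
        END       { exit !found }
    ' "$1"
}

# Example (path is an assumption; replace SOME_SYMBOL with a real name):
#   symbol_export_type "/usr/src/kernels/$(uname -r)/Module.symvers" SOME_SYMBOL
```

A result of EXPORT_SYMBOL_GPL means only modules declaring a GPL-compatible license can link against that symbol.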
I am unable to compile the nvidia driver. I found another report online:
---------------------------------------------------------------------------
Linux drivers 334.21 kernel module fails to compile with kernel 3.14.2-200.fc20.x86_64
https://forums.geforce.com/default/topic/738458/linux-drivers-334-21-kernel-module-fails-to-compile-with-kernel-3-14-2-200-fc20-x86_64/?offset=1
---------------------------------------------------------------------------
I have reverted to the prior kernel I had installed (3.12.7-300.fc20.x86_64). The driver rebuilt against it with no problem, so I assume the new kernel is different?
The report that you listed shows an ACPI-related compilation failure, which is an entirely separate issue from the one being tracked in this bug:

/var/lib/dkms/nvidia/334.21/build/nv-acpi.c:58:21: error: variable ‘nv_acpi_driver_template’ has initializer but incomplete type
 static const struct acpi_driver nv_acpi_driver_template = {

We fixed that ACPI compilation problem about a month ago, so I'd like to request that you download a recent version of the NVIDIA driver; that should fix your problem. If that fails, please file a new bug in order to track the issue.
Thank you for the note. I have found a difference in release dates between the long-lived branch and the short-lived branch. The most recent driver from NVIDIA is currently in the long-lived branch. I downloaded it today and the installer/compile works properly. Since the short-lived branch driver failed in my prior attempt, I assume that branch has not yet received the update. I will check the next version of the short-lived branch when it becomes available, or just continue to use the long-lived branch.

http://www.nvidia.com/object/unix.html
Linux x86_64/AMD64/EM64T
---------------------------------------------
Latest Long Lived Branch version: 331.67
Version: 331.67
Release Date: 2014.4.9
Operating System: Linux 64-bit
Language: English (US)
File Size: 60.00 MB
---------------------------------------------
Latest Short Lived Branch version: 334.21
Version: 334.21
Release Date: 2014.3.3
Operating System: Linux 64-bit
Language: English (US)
File Size: 67.00 MB
---------------------------------------------

Thank you.
Kevin
Our system here shows that the next version of the short-lived branch (something later than 334.21) should have that fix. I'm a little surprised that 334.21 is the latest one posted. But yes, a fix should get there pretty soon.
There likely won't be any further releases from the short-lived 334 branch, since it will soon be superseded by the 337 short-lived branch. The long-lived branches continue getting updates for some time after their initial release, while the short-lived branches typically do not get updated once a new short-lived branch is available.