Description of problem:
When the dell-laptop module is loaded on my Dell laptop AND F20 udev is running, the system locks up. This is a complete lockup: caps lock doesn't work, magic-sysrq can't be used, etc.
This breaks with either the F19 or F20 3.12.5 kernel, but I had successfully booted once under F19 with the F19 3.12.5 kernel, with no issues.
Version-Release number of selected component (if applicable):
systemd-208-9.fc20.x86_64
BREAKS:
kernel-3.11.9-200.fc19.x86_64
kernel-3.11.10-200.fc19.x86_64
WORKS:
kernel-3.11.9-200.fc19.x86_64
kernel-3.11.10-200.fc19.x86_64
How reproducible:
Always, on F20
Steps to Reproduce:
1. Run F20
2. Boot into emergency mode (to test the ordering manually)
3. systemctl enable debug-shell.service
4. systemctl start debug-shell.service
5. modprobe dell-laptop
6. Observe no issues
7. Change to VT9; run udevadm monitor; change back to VT1
8. systemctl start systemd-udevd.service
9. systemctl start systemd-udev-trigger.service
10. Quickly change back to VT9
Actual results:
System lockup; last line from the udev monitor is the load of dell-laptop
Expected results:
No lockup
Additional info:
This is 100% reproducible. There's presumably some udev trigger/interaction, because the system doesn't lock up until udev starts the triggering, AND it worked on F19. I only booted the 3.12.5 kernel once on F19 though (it updated yesterday, and I booted into it to run fedup to F20).
Workaround is to blacklist dell-laptop in /etc/modprobe.d/ - that lets me boot up normally. Any modprobe of the driver causes the lockup again.
This is a Dell Vostro 3560.
$ git lg v3.11.9..v3.12.5 drivers/platform/x86/dell-laptop.c
$
No upstream changes to the driver between those versions, so likely udev is somehow triggering a loading race. If you boot without the 'rhgb quiet' options and trigger the issue, are you able to see any messages on the screen at the time of the lockup? Could you attach a screenshot in that case?
There's nothing logged - the machine just locks up. As long as udev is running, and it's the kernel on F20, it hangs. It could be a race, but loading the module manually doesn't cause the issue until after udev loads - how can I tell what udev is trying to do?
Hi, I am also suffering from this issue - it's a Dell Vostro 3560 - the computer totally freezes, no keys work except the power button, which I have to keep pressed for about 5 seconds to turn it off. I removed 'rhgb quiet' but the problem persists. I am using F20 upgraded from F19 x86_64.
If you're hitting this, the workaround is:
- boot with 'rhgb quiet' removed, AND add 'emergency'
- when prompted put in your root password
- mount -o remount,rw /
- create a file /etc/modprobe.d/dell.conf, containing:
    blacklist dell-laptop
    blacklist dell_laptop
- reboot. Probably best to remove 'rhgb quiet' for the first time, just in case there are any other issues
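The file-creation step of the workaround can be scripted. The following is a minimal sketch, not an official tool: the function names `write_dell_blacklist` and `is_dell_laptop_blacklisted` are made up for illustration, and the target path is a parameter so the helper can be tried on a scratch file before touching /etc/modprobe.d/.

```shell
#!/bin/sh
# Hypothetical helper: write the two blacklist lines from the workaround.
# $1 = file to write (defaults to the path used in the workaround).
write_dell_blacklist() {
    conf="${1:-/etc/modprobe.d/dell.conf}"
    printf '%s\n' 'blacklist dell-laptop' 'blacklist dell_laptop' > "$conf"
}

# Hypothetical check: succeed if a modprobe.d-style file blacklists the
# module under either spelling (modprobe treats '-' and '_' in module
# names as equivalent, so either line would do on its own).
is_dell_laptop_blacklisted() {
    grep -Eq '^[[:space:]]*blacklist[[:space:]]+dell[-_]laptop[[:space:]]*$' "$1"
}
```

From the emergency shell, after remounting / read-write, this would be run as `write_dell_blacklist` with no argument, followed by a reboot.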
We're carrying a backported patch series to fix the dell-laptop module on Latitude machines. Apparently this is causing your Vostros to fail. Hans?
Weird, the new code paths in dell-laptop should not be triggered on Vostros. Because the rfkill functionality caused issues on various models in the past, it is currently only enabled on Latitudes:
    /*
     * rfkill causes trouble on various non Latitudes, according to Dell
     * actually testing the rfkill functionality is only done on Latitudes.
     */
    product = dmi_get_system_info(DMI_PRODUCT_NAME);
    if (!force_rfkill && (!product || strncmp(product, "Latitude", 8)))
        return 0;
Bradley, can you try changing your /etc/modprobe.d/dell.conf from:
    blacklist dell_laptop
To:
    options dell_laptop force_rfkill=1
That should actually enable the rfkill functionality. Can you see if that makes a difference?
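The gate quoted above can be mimicked in a few lines of shell to see which side of it a given machine falls on. This is only an illustrative sketch: `rfkill_enabled` is a hypothetical helper, not part of the driver, and it assumes the DMI product name that the kernel sees is the same string exposed in /sys/class/dmi/id/product_name.

```shell
#!/bin/sh
# Hypothetical mirror of the driver's check, for illustration only.
# $1 = DMI product name (may be empty), $2 = force_rfkill (0 or 1).
# Returns 0 if the rfkill code path would be set up, 1 if it would be skipped.
rfkill_enabled() {
    product="$1"
    force="${2:-0}"
    # force_rfkill=1 bypasses the product name check entirely.
    if [ "$force" = "1" ]; then
        return 0
    fi
    # Same effect as strncmp(product, "Latitude", 8) == 0: prefix match.
    case "$product" in
        Latitude*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running system it could be called as `rfkill_enabled "$(cat /sys/class/dmi/id/product_name)" 0 && echo "rfkill setup would run"`. A product string beginning with "Vostro" would be skipped unless force_rfkill is set.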
I've started a scratchbuild of the F-20 kernel with the dell-laptop patches removed here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=6331521
Note this will take a significant amount of time to finish!
Once this is finished, can you give this kernel a try (without the blacklisting of the dell-laptop module)? If this kernel does work, then that confirms that the dell-laptop rfkill patches are the cause.
Thanks,
Hans
Hi Again,
The scratchbuild from comment #7 is done now, and can be downloaded.
In the meantime I've been looking at the code from the point of view of what my changes mean for non-whitelisted devices like Vostros, and there is one changed code path for them as a result of my patches. I've written another patch so that this one functional change for non-whitelisted devices is avoided, and dell-laptop.c should work 100% as before on non-whitelisted devices with this patch.
I've started another scratchbuild with the dell-laptop patches added back in, including the new patch. You can find it here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=6332009
Again please allow for some time for it to complete building.
Please try my previous scratchbuild first. If that does not help, then my dell-laptop patches are definitely not the culprit. If it does help, give this new build a try as it should fix things.
Also can you please do:
    sudo dmidecode > dmi.log
And attach the generated dmi.log file here?
Thanks,
Hans
In case people don't get around to testing this with Christmas and everything, and the buildsys deletes the scratch builds during its automated cleanups, I've put the kernel rpms here:
http://people.fedoraproject.org/~jwrdegoede/rhbz1045807/
Please test the kernel-3.12.6-300.hgd_bz1045807.fc20.x86_64.rpm build first; if that does not help, then testing the _2 build will be of little use.
Regards,
Hans
Neither kernel helps. However, what does help is removing 'acpi_backlight=vendor' - I need that to get the backlight working (see bug 986653). With that removed, I can boot with the dell-laptop driver loaded, using the standard F20 kernel. (The backlight doesn't work, of course.) I'll attach the dmidecode output.
Created attachment 841371: dmidecode output
Hi,
Merry Christmas :)
(In reply to Bradley from comment #10)
> Neither kernel helps.
Given that the first kernel removes all my dell-laptop rfkill patches, it is unsurprising that the second kernel, which re-adds them plus a small fix, does not fix things either. This does rule out my dell-laptop rfkill patches being the cause of your issues.
> However, what does help is removing 'acpi_backlight=vendor' - I need that to
> get the backlight working (see bug 986653)
Ah, ok, so somehow we have a conflict between various drivers here which has gotten worse with the latest kernel. Note that dell-laptop also registers a brightness device.
This is probably best discussed upstream, where people more knowledgeable about this are involved. Can you please send a mail about this to the platform-driver-x86 mailing list, with me (hdegoede) in the CC?
Thanks,
Hans
I previously said that I could reproduce this in single user mode, but I can't any more, so I must have done it wrong. Possibly it breaks when starting up X? I'll try a git bisect against upstream first.
I've bisected this down to 81c0a2bb515fd4daae8cab64352877480792b515
That's a mm patch to the zone allocator. I thought I'd done the bisection wrong, but I double checked by rebuilding the commits either side again, and it's definitely this one. (I also double checked a few times along the way, once the bisection hit the -mm tree's diffs, which seem completely unrelated.)
I confirmed by reverting that patch against 3.12.0 (breaks with the patch, works with it reverted). Same results against master (where part of this patch has already been reverted, so I also reverted fff4068cba484e6b0abe334ed6b15d5a215a3b25) - master is broken, but master with this patch (and fff4068cba484e6b0abe334ed6b15d5a215a3b25) reverted works fine.
This issue isn't intermittent, and while it could be a timing thing, I would have expected other changes between 3.11 and master to have impacted timing like this. So I'm confused.
I'm going to rebuild the Fedora RPM overnight with this revert for one more test case (under mock), and also try the working 3.11 kernel with the patch applied to confirm that it breaks.
Yep, definitely that patch. Mail sent - http://marc.info/?l=linux-mm&m=138811453914848&w=2
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs. Fedora 20 has now been rebased to 3.13.4-200.fc20. Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.
This was 'resolved' by 3.13 changing the backlight logic so that the kernel param in question wasn't needed. (It looks like the kernel param was using some memory that ACPI/EFI/SMI/something was reserving)