Created attachment 1011622 [details]
Journal output from Fedora 22 Design Suite Beta TC8 Live media

Description of problem:
Booting the live media (in this case Design Suite, which is based on Workstation) led to a backtrace related to UID. As a result, the user needs to manually start the X session as "liveuser" from a virtual terminal. Not sure whether the issue affects AMD-based systems in general, especially those using a KAVERI APU.

Version-Release number of selected component (if applicable):
Design Suite Beta TC8

How reproducible:
Always

Steps to Reproduce:
1. Simply boot the live media

Actual results:
Generated backtrace leading to segmentation fault

[ 79.919] (II) input device 'ETPS/2 Elantech Touchpad', /dev/input/event8 is a touchpad
[ 79.920] (II) config/udev: Adding input device ETPS/2 Elantech Touchpad (/dev/input/mouse1)
[ 79.920] (II) No input driver specified, ignoring this device.
[ 79.920] (II) This device may have been added with another device file.
[ 80.188] (EE)
[ 80.188] (EE) Backtrace:
[ 80.190] (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x139) [0x599dd9]
[ 80.192] (EE) 1: /lib64/libc.so.6 (__restore_rt+0x0) [0x7f94cc954b1f]
[ 80.194] (EE) 2: /lib64/libc.so.6 (__GI___strcmp_ssse3+0x16) [0x7f94cca683d6]
[ 80.194] (EE) 3: /usr/libexec/Xorg (xf86SIGIOSupported+0xa18) [0x4a36b8]
[ 80.195] (EE) 4: /lib64/libdbus-1.so.3 (dbus_connection_dispatch+0x375) [0x7f94ceb78095]
[ 80.197] (EE) 5: /lib64/libdbus-1.so.3 (dbus_connection_dispatch+0x64d) [0x7f94ceb7891d]
[ 80.197] (EE) 6: /usr/libexec/Xorg (config_fini+0x4c1) [0x49d7b1]
[ 80.197] (EE) 7: /usr/libexec/Xorg (WakeupHandler+0x6d) [0x43ed3d]
[ 80.198] (EE) 8: /usr/libexec/Xorg (WaitForSomething+0x1e7) [0x592ea7]
[ 80.198] (EE) 9: /usr/libexec/Xorg (SendErrorToClient+0x111) [0x439f81]
[ 80.198] (EE) 10: /usr/libexec/Xorg (remove_fs_handlers+0x41b) [0x43e26b]
[ 80.200] (EE) 11: /lib64/libc.so.6 (__libc_start_main+0xf0) [0x7f94cc940790]
[ 80.200] (EE) 12: /usr/libexec/Xorg (_start+0x29) [0x428659]
[ 80.201] (EE) 13: ? (?+0x29) [0x29]
[ 80.201] (EE)
[ 80.201] (EE) Segmentation fault at address 0x0
[ 80.201] (EE) Fatal server error:
[ 80.201] (EE) Caught signal 11 (Segmentation fault). Server aborting
[ 80.201] (EE)
[ 80.201] (EE)

Expected results:
Session should start as expected without user intervention.

Additional info:
Test done on an ASUS X550ZE using default settings.
Proposed as a Blocker for 22-beta by Fedora user luya using the blocker tracking app because: The issue seems to occur on AMD-based systems, especially those using an APU, when running a Live Media (in this case Design Suite Beta TC8). The test was done with a brand new ASUS X550ZE, where a straight boot led to a backtrace, forcing the user to manually start the session as "liveuser". I am unable to find what exactly causes the problem.
Luya, would you mind trying to load the Workstation Live Beta TC8 image (rather than Design Suite) on the same hardware? We need to rule out if this is a general issue with this hardware or if there's a bug in the Design Suite spin. (If it's an issue only with Design Suite, it's not considered a blocker for release).
Created attachment 1011871 [details]
Journal output from Fedora 22 Workstation Beta TC8 Live Media

(In reply to Stephen Gallagher from comment #2)
> Luya, would you mind trying to load the Workstation Live Beta TC8 image
> (rather than Design Suite) on the same hardware?

Sure, although the problem is actually a Workstation issue, since Design Suite is built from Workstation. I successfully reproduced the issue by simply booting Workstation Live Beta TC8. Highlighted issue from the journal below:

Apr 07 06:36:03 localhost /usr/libexec/gdm-x-session[1639]: (EE) systemd-logind: failed to take device /dev/dri/card1: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Apr 07 06:36:05 localhost /usr/libexec/gdm-x-session[1639]: (EE) /dev/dri/card1: failed to set DRM interface version 1.4: Permission denied
Apr 07 06:36:30 localhost /usr/libexec/gdm-x-session[1639]: (EE) systemd-logind: failed to take device /dev/dri/card1: Device already taken

Afterward, a backtrace leading to a segmentation fault:

Apr 07 06:36:33 localhost /usr/libexec/gdm-x-session[1639]: (EE) Segmentation fault at address 0x0

I had to switch to vt2 to manually start the GNOME X session I am posting from.

> We need to rule out if this is a general issue with this hardware or if
> there's a bug in the Design Suite spin. (If it's an issue only with Design
> Suite, it's not considered a blocker for release).

As mentioned above, the issue affects Workstation as well. The fact that I am able to manually start the X session and then install successfully without further problems shows that something goes wrong during the automatic session start.
Created attachment 1011874 [details] Xorg.log output on Workstation Live Beta TC8 Here is the Xorg.log output when booting Workstation Live Beta TC8.
Created attachment 1011875 [details] Xorg.log output on Workstation Live Beta TC8 after manual start Xorg.log output after manually logging as liveuser via tty2 screen.
Reassigning bug to xorg-x11-drv-amd
Can you try selecting "Basic Graphics" from the troubleshooting menu on boot? Does that avoid the issue?
Luya, is this a laptop with a hybrid graphics setup? (Like a low-powered GPU for general operation and a high-powered one for graphics-intensive stuff?) The log shows both /dev/dri/card0 and /dev/dri/card1.
For right now, I'm voting -1 blocker on this, on the grounds that we don't have any evidence that this issue is present on a significant amount of hardware.
Created attachment 1012408 [details]
Hardware info

(In reply to Stephen Gallagher from comment #8)
> Luya, is this a laptop with a hybrid graphics setup? (Like a low-powered GPU
> for general operation and a high-powered one for graphics-intensive stuff?)
>
> The log shows both /dev/dri/card0 and /dev/dri/card1.

Yes, it is AMD Radeon® R5 M230 + Radeon® R7 M265 DX Dual Graphics with 2GB DDR3 VRAM, Built-in A10-7400P.
Laptop specification straight from ASUS website: http://www.asus.com/Notebooks_Ultrabooks/X550ZE/specifications/
(In reply to Stephen Gallagher from comment #7)
> Can you try selecting "Basic Graphics" from the troubleshooting menu on
> boot? Does that avoid the issue?

Yes, using Basic Graphics from the troubleshooting menu avoids the issue.
Created attachment 1012425 [details]
Journal output with basic graphics troubleshooting

Output using Basic Graphics mode from the troubleshooting menu.
Created attachment 1012426 [details] Xorg.log output from Workstation 22 Beta RC1 Xorg.log output following the journal output with basic graphic troubleshooting on Fedora Workstation 22 Beta RC1.
I can confirm this issue with an Nvidia card also (GT730). Hopefully this is fixed in the final.
(In reply to Greg` from comment #15)
> i can confirm this issue, with Nvidia card also. hopefully this is fixed in
> the final.
>
> GT730 Nvidia

Also forgot to mention: I have also removed "rhgb quiet" and it works.
I can also reproduce this with an Alienware laptop with both Intel and nVidia GPUs in it. So it definitely looks like a problem with hybrid graphics. As above, booting into basic graphics mode works successfully. I'm on the fence about calling this a blocker. On the one hand, hybrid graphics setups are fairly common; on the other hand, basic graphics mode works as a workaround and could be documented in Common Bugs.
FYI, it's likely irrelevant, but even with removing 'rhgb quiet', my system remains hung at "Started User Manager for UID 1000". Only basic graphics mode succeeds.
Disregard my comments above. I'm having a different issue related to a really odd arrangement of hardware in my Alienware system. In general, I'm -1 blocker for this in Beta, but I'd be +1 to Final Blocker.
Aaaaand I discovered I had an older Dell XPS 15 with Intel/nVidia hybrid graphics and I can in fact reproduce the original issue on it. So it definitely seems to be related to hybrid graphics.
(In reply to Stephen Gallagher from comment #20)
> Aaaaand I discovered I had an older Dell XPS 15 with Intel/nVidia hybrid
> graphics and I can in fact reproduce the original issue on it.
>
> So it definitely seems to be related to hybrid graphics.

In that case, should this be treated as a bug in the Xorg driver in general for these devices? Maybe I should change the title to reflect the comments?
OK, so after copious debugging, it is apparently an SELinux issue of some sort. However, permissive mode doesn't resolve it. Booting with 'selinux=0' (disabled, not permissive) allows everything to work correctly.
FWIW I have a very old school hybrid-type laptop (a 2010 Vaio Z), from before NVIDIA was branding this stuff as 'Optimus' (it's marked 'DYNAMIC HYBRID GRAPHICS SYSTEM', catchy!), and can't reproduce the bug on that. Even set to 'dynamic' mode so I see both adapters in the 'lspci -nn' output, Workstation live boots just fine for me. However, it certainly could be the case that this affects all/most more recent hybrid designs.
What I think is happening: logind thinks that card0 and card1 are two seats. gdm is started on seat0 with card0, but X would like to access card1 too. logind refuses. If this was correct, loginctl list-seats would show two seats, with card1 attached to the second one. loginctl seat-status $(loginctl list-seats --no-legend), anyone? No idea why disabling selinux changes stuff.
Luya says selinux=0 doesn't actually help for him, so that might be a red herring.
(In reply to Zbigniew Jędrzejewski-Szmek from comment #24)
> What I think is happening: logind thinks that card0 and card1 are two seats.
> gdm is started on seat0 with card0, but X would like to access card1 too.
> logind refuses. If this was correct, loginctl list-seats would show two
> seats, with card1 attached to the second one.
>
> loginctl seat-status $(loginctl list-seats --no-legend), anyone?
>

I can't easily paste the output, but 'loginctl list-seats --no-legend' returns 'seat0'. No seat1 mentioned.

> No idea why disabling selinux changes stuff.

Well, this is odd. I can't reproduce that now. Maybe this is a race of some kind. It definitely worked at least once...
(In reply to Zbigniew Jędrzejewski-Szmek from comment #24)
> What I think is happening: logind thinks that card0 and card1 are two seats.
> gdm is started on seat0 with card0, but X would like to access card1 too.
> logind refuses. If this was correct, loginctl list-seats would show two
> seats, with card1 attached to the second one.
>
> loginctl seat-status $(loginctl list-seats --no-legend), anyone?
>
> No idea why disabling selinux changes stuff.

I confirm what Stephen reported. No seat1 mentioned. See http://ur1.ca/k51ps

In case that didn't work:

seat0
  Sessions: *2
  Devices:
    ├─/sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
    │ input:input3 "Power Button"
    ├─/sys/device...SYBUS:00/PNP0A03:00/LNXVIDEO:00/input/input13
    │ input:input13 "Video Bus"
    ├─/sys/device...XSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
    │ input:input0 "Power Button"
    ├─/sys/device...XSYSTM:00/LNXSYBUS:00/PNP0C0D:00/input/input1
    │ input:input1 "Lid Switch"
    ├─/sys/device...XSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input2
    │ input:input2 "Sleep Button"
    ├─/sys/devices/pci0000:00/0000:00:01.0/drm/card0
    │ drm:card0
    ├─/sys/devices/pci0000:00/0000:00:01.0/drm/renderD128
    │ drm:renderD128
    ├─/sys/devices/pci0000:00/0000:00:01.0/graphics/fb0
    │ [MASTER] graphics:fb0 "radeondrmfb"
    ├─/sys/devices/pci0000:00/0000:00:01.1/sound/card0
    │ sound:card0 "Generic"
    │ └─/sys/devices/pci0000:00/0000:00:01.1/sound/card0/input14
    │   input:input14 "HD-Audio Generic HDMI/DP,pcm=3"
    ├─/sys/devices/pci0000:00/0000:00:02.1/0000:01:00.0/drm/card1
    │ drm:card1
    ├─/sys/device...0:00/0000:00:02.1/0000:01:00.0/drm/renderD129
    │ drm:renderD129
    ├─/sys/device...000:00/0000:00:02.1/0000:01:00.0/graphics/fb1
    │ [MASTER] graphics:fb1 "radeondrmfb"
    ├─/sys/devices/pci0000:00/0000:00:10.0/usb1
    │ usb:usb1
    ├─/sys/devices/pci0000:00/0000:00:10.0/usb2
    │ usb:usb2
    ├─/sys/device...11.0/ata3/host2/target2:0:0/2:0:0:0/block/sr0
    │ block:sr0
    ├─/sys/device...a3/host2/target2:0:0/2:0:0:0/scsi_generic/sg1
    │ scsi_generic:sg1
    ├─/sys/devices/pci0000:00/0000:00:12.0/usb5
    │ usb:usb5
    │ └─/sys/device...C52B.0003/0003:046D:4013.0004/input/input18
    │   input:input18 "Logitech M525"
    ├─/sys/devices/pci0000:00/0000:00:12.2/usb3
    │ usb:usb3
    ├─/sys/devices/pci0000:00/0000:00:13.0/usb6
    │ usb:usb6
    ├─/sys/devices/pci0000:00/0000:00:13.2/usb4
    │ usb:usb4
    │ ├─/sys/device...0000:00:13.2/usb4/4-4/4-4:1.0/input/input19
    │ │ input:input19 "USB Camera"
    │ └─/sys/device...00:13.2/usb4/4-4/4-4:1.0/video4linux/video0
    │   video4linux:video0 "USB Camera"
    ├─/sys/devices/pci0000:00/0000:00:14.2/sound/card1
    │ sound:card1 "Generic_1"
    │ ├─/sys/devices/pci0000:00/0000:00:14.2/sound/card1/input15
    │ │ input:input15 "HD-Audio Generic Mic"
    │ └─/sys/devices/pci0000:00/0000:00:14.2/sound/card1/input16
    │   input:input16 "HD-Audio Generic Headphone"
    ├─/sys/devices/platform/asus-nb-wmi/input/input17
    │ input:input17 "Asus WMI hotkeys"
    ├─/sys/devices/platform/i8042/serio0/input/input4
    │ input:input4 "AT Translated Set 2 keyboard"
    ├─/sys/devices/platform/i8042/serio4/input/input12
    │ input:input12 "ETPS/2 Elantech Touchpad"
    ├─/sys/devices/virtual/misc/kvm
    │ misc:kvm
    └─/sys/devices/virtual/misc/rfkill
      misc:rfkill
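For anyone else checking the two-seat theory on their own hardware, here is a rough Python sketch that reads `loginctl seat-status` text and lists which DRM cards sit under which seat. The function name `drm_cards_by_seat` and the trimmed sample are made up for illustration; the parsing assumes the output looks like the paste above (a bare seat name on its own line, then `drm:cardN` summary lines), which may not hold for other systemd versions.

```python
import re


def drm_cards_by_seat(seat_status_text):
    """Map each seat name to the DRM card nodes listed under it.

    Hypothetical helper: assumes the layout seen in this thread's paste.
    """
    seats = {}
    current = None
    for line in seat_status_text.splitlines():
        stripped = line.strip()
        # A seat header is a bare token such as "seat0".
        if re.fullmatch(r"seat\S*", stripped):
            current = stripped
            seats[current] = []
            continue
        # Device summary lines look like "│ drm:card0"; sysfs path lines
        # contain "drm/card0" instead and are deliberately not matched.
        match = re.search(r"\bdrm:(card\d+)\b", line)
        if match and current is not None:
            seats[current].append(match.group(1))
    return seats


# Trimmed sample taken from the seat-status paste in this thread.
sample = """\
seat0
  Sessions: *2
  Devices:
    ├─/sys/devices/pci0000:00/0000:00:01.0/drm/card0
    │ drm:card0
    ├─/sys/devices/pci0000:00/0000:00:01.0/drm/renderD128
    │ drm:renderD128
    ├─/sys/devices/pci0000:00/0000:00:02.1/0000:01:00.0/drm/card1
    │ drm:card1
"""

print(drm_cards_by_seat(sample))
```

With the paste from this machine, both cards end up under seat0, which is what argues against the two-seat explanation.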
Problem does not reproduce on MacbookPro 8,2, which has i915 and AMD graphics, with either Workstation RC1 netinstall or live; and I'm not using nomodeset (or any other changes to the top boot entry).

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Whistler [Radeon HD 6630M/6650M/6750M/7670M/7690M] [1002:6741] (prog-if 00 [VGA controller])
        Subsystem: Apple Inc. MacBookPro8,2 [Core i7, 15", Late 2011] [106b:00e2]
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Apple Inc. Device [106b:00dc]
(In reply to Stephen Gallagher from comment #22)
> OK, so after copious debugging, it is apparently an SELinux issue of some
> sort. However, permissive mode doesn't resolve it. Booting with 'selinux=0'
> (disabled, not permissive) allows everything to work correctly.

That worked for me also.
Discussed at 2015-04-09 Go/No-Go meeting, acting as a blocker review meeting: https://meetbot.fedoraproject.org/fedora-meeting-2/2015-04-09/f22_beta_gono-go_meeting.2015-04-09-17.00.html . This was a close decision, but given that hybrid graphics laptops are still not the *most* common kind (we checked some retail store sites), reports so far indicate not *all* are affected (though many do seem to be), and basic graphics mode is available as a workaround, we agreed this doesn't quite rate as a Beta blocker. It is accepted as a Final blocker, however, and as a Beta freeze exception issue.
I can't reproduce it either on a Lenovo T520 with NVidia Optimus technology. It is a three-year-old laptop (I can't say how recent this configuration is, but it's not that old).

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GF119M [Quadro NVS 4200M] (rev a1)

I even tried switching modes in the BIOS: all three (integrated GPU, discrete GPU and NVidia Optimus) work.
I looked at this a bit today with sgallagh.

The problem is that the X server has a 500ms timeout for TakeDevice calls, and in some instances logind takes longer than 500ms to return from a TakeDevice call (on /dev/dri/card1). From the log:

Apr 05 20:39:47 localhost /usr/libexec/gdm-x-session[1606]: (EE) systemd-logind: failed to take device /dev/dri/card1: ...the reply timeout expired...

That alone is somewhat problematic, but when a timeout error occurs, it's still possible to get a real reply later on, and the message filter set up by the logind code in the X server isn't anticipating that, which leads to a crash:

Apr 05 20:40:05 localhost /usr/libexec/gdm-x-session[1606]: (EE) Backtrace:

So it's taking 8 seconds for TakeDevice to return. I didn't investigate why it's taking so long, and that probably needs more investigation. Still, I can think of two fixes:

1) Make sure the message_filter discards message replies. All calls are blocking calls, so message replies can only come in spuriously.
2) Don't second-guess the default D-Bus timeout, and just use DBUS_TIMEOUT_USE_DEFAULT, which is about 50 times longer than the timeout the X server is using now.

Hans, what do you think?
Created attachment 1013228 [details] systemd-logind: filter out non-signal messages from message filter It's possible to receive a message reply in the message filter if a previous message call timed out locally before the reply arrived. The message_filter function only handles signals, at the moment, and does not properly handle message replies. This commit changes the message_filter function to filter out all non-signal messages, including spurious message replies.
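To illustrate the idea behind this patch without reading the C, here is a toy Python model of a message filter that only understands signals. It is not the X server code: the `Message` class, both filter functions, and the use of the logind signal name `PauseDevice` are stand-ins chosen for the sketch. The `AttributeError` on a missing member name stands in for the NULL dereference; that reading is inferred from the `strcmp` frame and "address 0x0" in the original backtrace, and is not confirmed anywhere in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    kind: str               # "signal" or "method_return"
    member: Optional[str]   # signal name; None for a method reply


def naive_filter(msg: Message) -> str:
    # Assumes every message is a signal and inspects its member name.
    # A reply carries no member, so this raises AttributeError, which
    # models the crash on a late TakeDevice reply.
    if msg.member.startswith("Pause"):
        return "handled"
    return "ignored"


def guarded_filter(msg: Message) -> str:
    # The patched behaviour in spirit: drop anything that is not a
    # signal before touching its fields.
    if msg.kind != "signal":
        return "discarded"
    return naive_filter(msg)


late_reply = Message(kind="method_return", member=None)
pause_signal = Message(kind="signal", member="PauseDevice")

try:
    naive_filter(late_reply)
    naive_crashed = False
except AttributeError:
    naive_crashed = True

print(naive_crashed, guarded_filter(late_reply), guarded_filter(pause_signal))
```

The guard goes first in the filter so that signal handling itself stays unchanged; only the unexpected message types are turned away.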
Created attachment 1013239 [details]
systemd-logind: don't second guess D-Bus default timeout

At the moment, the X server uses a non-default timeout for D-Bus messages to systemd-logind. The only timeouts normally used with D-Bus are:

1) Infinite
2) Default

Anything else is just as arbitrary as Default, and so rarely makes sense to use instead of Default.

Put another way, there's little reason to be fault tolerant against a local, root-running daemon (logind) that, in some configurations, the X server already depends on for proper functionality.

This commit changes systemd-logind to just use the default timeouts.
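As a quick sanity check on the numbers, here is a small Python sketch comparing the two timeouts against the reply latency estimated earlier in this thread. The 500 ms and 8 s figures come from the analysis above; the 25 s default is an assumption about libdbus (it matches the "about 50 times longer" remark but is not stated in the thread), and `call_times_out` is a made-up helper.

```python
X_SERVER_TIMEOUT_MS = 500          # from the analysis in this thread
DBUS_DEFAULT_TIMEOUT_MS = 25_000   # assumption: libdbus default, ~50x longer
TAKE_DEVICE_LATENCY_MS = 8_000     # estimate given in this thread


def call_times_out(reply_latency_ms: int, timeout_ms: int) -> bool:
    """Return True if a blocking call would give up before the reply."""
    return reply_latency_ms > timeout_ms


print(call_times_out(TAKE_DEVICE_LATENCY_MS, X_SERVER_TIMEOUT_MS))
print(call_times_out(TAKE_DEVICE_LATENCY_MS, DBUS_DEFAULT_TIMEOUT_MS))
```

Under these assumptions the old timeout gives up long before logind answers, while the default timeout would have waited for the reply.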
https://admin.fedoraproject.org/updates/xorg-x11-server-1.17.1-9.fc22 includes the above mentioned patches.
(In reply to Kalev Lember from comment #35)
> https://admin.fedoraproject.org/updates/xorg-x11-server-1.17.1-9.fc22
> includes the above mentioned patches.

Ray, please add this BZ as fixed by the above Bodhi update. Otherwise, it might be missed when building the RC2 compose. Thanks!
This change will be in Beta RC2; please test it carefully. Thanks!
I just tested this fix on Beta RC2. Looks like it's working properly.
Oops, not sure how I missed adding the bug ID. Will fix.
xorg-x11-server-1.17.1-9.fc22 has been submitted as an update for Fedora 22. https://admin.fedoraproject.org/updates/xorg-x11-server-1.17.1-9.fc22
Created attachment 1014428 [details]
Journal report with F22 Beta RC2

Just tested Fedora 22 Workstation Beta RC2. The boot went straight into the GNOME session. Attached is the journal report, which shows that the fix seems to be working.
Excellent, thanks for testing! Closing the ticket now that the update has gone to stable.