Bug 1392651 - Logging out from first user after switching to second user and then back to first user crashes SDDM
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: sddm
Version: 25
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Martin Bříza
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F25FinalBlocker
Reported: 2016-11-08 00:57 UTC by Adam Williamson
Modified: 2016-11-08 01:06 UTC (History)
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-08 01:06:44 UTC
Type: Bug
Embargoed:


Attachments
all sessions killed - journal (276.98 KB, text/plain)
2016-11-08 00:59 UTC, Adam Williamson
Xorg.0.log.old of 'crashed' user1 session (21.28 KB, text/plain)
2016-11-08 01:01 UTC, Adam Williamson
Xorg.1.log of logged out user2 session. (19.58 KB, text/plain)
2016-11-08 01:01 UTC, Adam Williamson

Description Adam Williamson 2016-11-08 00:57:38 UTC
This report splits out the issue described in https://bugzilla.redhat.com/show_bug.cgi?id=1382001#c5 to its own report, as it is distinct from #c4 in that report. kparal's reproduction steps, starting from a current Fedora 25 KDE installation:

All sessions are killed if user2 logs out. Reproducer:

1. create user1 and user2
2. reboot the system
3. log in with user1
4. switch to user2
5. log out user2
6. see that user1 has been logged out (or killed) as well. TTY2 is now empty (it should not be). If you try to log in as user1, you get a new session (you should get back to the existing session). Sometimes an error flashes right before the login screen is shown, saying "ksmserver could not be started" or something like that.

I have reproduced this with both qxl and modesetting drivers after an install of Fedora-KDE-Live-x86_64-25-20161107.n.0.iso . We have not yet discovered any solution for this.

I am applying the 'accepted blocker' decision from #1382001 to this; for now we must consider it to apply to both bugs. This one is, if anything, the more serious, as it allows one user to crash another's session, effectively a DoS.

Comment 1 Adam Williamson 2016-11-08 00:59:39 UTC
Created attachment 1218325
all sessions killed - journal

just cloned https://bugzilla.redhat.com/attachment.cgi?id=1212417 from original bug here.

Comment 2 Adam Williamson 2016-11-08 01:01:13 UTC
Created attachment 1218326
Xorg.0.log.old of 'crashed' user1 session

Clone of Oliver's https://bugzilla.redhat.com/attachment.cgi?id=1216197 .

Comment 3 Adam Williamson 2016-11-08 01:01:44 UTC
Created attachment 1218327
Xorg.1.log of logged out user2 session.

clone of Oliver's https://bugzilla.redhat.com/attachment.cgi?id=1216198 .

Comment 4 Adam Williamson 2016-11-08 01:02:14 UTC
Comment on attachment 1218326
Xorg.0.log.old of 'crashed' user1 session

sorry, wrong attachment.

Comment 5 Adam Williamson 2016-11-08 01:04:46 UTC
Oliver Henshaw's notes from the original bug:

I used perf to trace which processes issue the DRM_SET_MASTER and DRM_DROP_MASTER ioctls. It looks like the Xorg on vt1 tries to take DRM master before the Xorg on vt2 releases it. In fact, there is no DRM_DROP_MASTER from the vt2 Xorg during logout, so I think master is only released when the fd is closed. I tried to add a filter for the close(2) event on the DRM master fd in each process, but didn't get anything. Grepping the xf86-video-qxl code, it doesn't look like drmClose is used in anything but an error path, so the fd is only closed when Xorg exits, unless I'm missing something.

(It's a little trickier to get the sys_exit_ioctl event corresponding to the DRM_SET_MASTER etc. calls, so I couldn't confirm the return values, but the timing does correlate with the error message in Xorg.0.log.old.)

Tracing with strace instead of perf makes the bug disappear, so it seems to be timing dependent.
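
To make the suspected ordering problem concrete, here is a toy Python model of it. This is only a sketch of the hypothesis in these notes, not kernel or Xorg code: the class ToyDrmDevice and the function logout_user2 are made-up names, and the model assumes that a set-master request simply fails while another client still holds master, and that closing the fd releases master. The pids are the ones from the trace below.

```python
class ToyDrmDevice:
    """Toy stand-in for a DRM device: at most one client is master."""

    def __init__(self):
        self.master = None

    def set_master(self, pid):
        # Models DRM_SET_MASTER: refused while another client is master.
        if self.master not in (None, pid):
            return False
        self.master = pid
        return True

    def drop_master(self, pid):
        # Models DRM_DROP_MASTER: only the current master can give it up.
        if self.master == pid:
            self.master = None

    def close(self, pid):
        # Models closing the fd (e.g. at process exit), which also
        # releases master if this client held it.
        self.drop_master(pid)


def logout_user2(vt1_enters_first):
    """Return the sorted list of Xorg pids still alive after user2 logs out."""
    dev = ToyDrmDevice()
    vt1_pid, vt2_pid = 4844, 5885
    dev.set_master(vt2_pid)  # user2's session on vt2 is in the foreground
    alive = {vt1_pid, vt2_pid}

    def vt1_enter_vt():
        # The vt1 server needs master when its VT becomes active again.
        if not dev.set_master(vt1_pid):
            alive.discard(vt1_pid)  # the server aborts

    def vt2_exit():
        # The vt2 server never drops master explicitly; exit closes the fd.
        dev.close(vt2_pid)
        alive.discard(vt2_pid)

    steps = [vt1_enter_vt, vt2_exit] if vt1_enters_first else [vt2_exit, vt1_enter_vt]
    for step in steps:
        step()
    return sorted(alive)


if __name__ == "__main__":
    print(logout_user2(vt1_enters_first=True))   # racy order: []
    print(logout_user2(vt1_enters_first=False))  # safe order: [4844]
```

In the racy order both servers end up dead, which matches the "all sessions killed" symptom. If the vt2 server exits first, the vt1 server survives, which would also fit the bug disappearing when strace changes the timing.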

1. Login as user1, create a new session for user2

2. Switch to a text terminal
# pgrep -fla Xorg
...
[shows xorg on vt1 is pid 4844, xorg on vt2 is pid 5885]
...

4. switch to a text terminal
# perf record -e syscalls:sys_enter_ioctl --filter 'cmd == 0x641e || cmd == 0x641f' -e sched:sched_process_exit --filter 'pid == 4844 || pid == 5885' -e sched:sched_process_free --filter 'pid == 4844 || pid == 5885' --call-graph dwarf -a -D 5000 -T -o perf_logouttwo_more.data sleep 120

5. logout from user2, wait two minutes and switch back to the text terminal
# perf script -i perf_logouttwo_more.data
Xorg  4844 [000]  3373.156721: syscalls:sys_enter_ioctl: fd: 0x00000015, cmd: 0x0000641e, arg: 0x00000000
                   f8b67 __GI___ioctl+0xffff005ba85ce007 (/usr/lib64/libc-2.23.so)
                    42b8 drmIoctl+0xffff005ba5e14028 (/usr/lib64/libdrm.so.2.4.0)
                   12888 qxl_enter_vt_kms+0xffff005bb2a2a028 (/usr/lib64/xorg/modules/drivers/qxl_drv.so)
                   b6d7d xf86RandR12EnterVT+0xffffffffff80007d (/usr/libexec/Xorg)
                   79940 xf86VTEnter+0xffffffffff800070 (/usr/libexec/Xorg)
                   3b94d WakeupHandler+0xffffffffff80006d (/usr/libexec/Xorg)
                  197fb9 WaitForSomething+0xffffffffff8001e9 (/usr/libexec/Xorg)
                   36bde Dispatch+0xffffffffff80008e (/usr/libexec/Xorg)
                   3add3 dix_main+0xffffffffff8003d3 (/usr/libexec/Xorg)
                   20731 __libc_start_main+0xffff005ba85ce0f1 (/usr/lib64/libc-2.23.so)
                   24d59 _start+0xffffffffff800029 (/usr/libexec/Xorg)

Xorg  4844 [000]  3373.157012: syscalls:sys_enter_ioctl: fd: 0x00000015, cmd: 0x0000641f, arg: 0x00000000
                   f8b67 __GI___ioctl+0xffff005ba85ce007 (/usr/lib64/libc-2.23.so)
                    42b8 drmIoctl+0xffff005ba5e14028 (/usr/lib64/libdrm.so.2.4.0)
                   12902 qxl_leave_vt_kms+0xffff005bb2a2a032 (/usr/lib64/xorg/modules/drivers/qxl_drv.so)
                   7b91f AbortDDX+0xffffffffff80007f (/usr/libexec/Xorg)
                  1a79f2 AbortServer+0xffffffffff800022 (/usr/libexec/Xorg)
                  1a87fd [unknown] (/usr/libexec/Xorg)
                   79ad6 [unknown] (/usr/libexec/Xorg)
                   3b94d WakeupHandler+0xffffffffff80006d (/usr/libexec/Xorg)
                  197fb9 WaitForSomething+0xffffffffff8001e9 (/usr/libexec/Xorg)
                   36bde Dispatch+0xffffffffff80008e (/usr/libexec/Xorg)
                   3add3 dix_main+0xffffffffff8003d3 (/usr/libexec/Xorg)
                   20731 __libc_start_main+0xffff005ba85ce0f1 (/usr/lib64/libc-2.23.so)
                   24d59 _start+0xffffffffff800029 (/usr/libexec/Xorg)

Xorg  5885 [000]  3373.195317: sched:sched_process_exit: comm=Xorg pid=5885 prio=120
            7fffa60a7b20 do_exit+0x80005a003560 ([kernel.kallsyms])
            7fffa60a8197 do_group_exit+0x80005a003047 ([kernel.kallsyms])
            7fffa60a8214 sys_exit_group+0x80005a003014 ([kernel.kallsyms])
            7fffa6006d52 do_syscall_64+0x80005a003062 ([kernel.kallsyms])
            7fffa67ef661 return_from_SYSCALL_64+0x80005a003000 ([kernel.kallsyms])

rcuos/0     9 [000]  3373.199799: sched:sched_process_free: comm=Xorg pid=5885 prio=120
            7fffa60a6039 delayed_put_task_struct+0x80005a003069 ([kernel.kallsyms])
            7fffa610b0fd rcu_nocb_kthread+0x80005a0032ad ([kernel.kallsyms])
            7fffa60c3638 kthread+0x80005a0030d8 ([kernel.kallsyms])
            7fffa67ef7bf ret_from_fork+0x80005a00301f ([kernel.kallsyms])

Xorg  4844 [000]  3373.243504: sched:sched_process_exit: comm=Xorg pid=4844 prio=120
            7fffa60a7b20 do_exit+0x80005a003560 ([kernel.kallsyms])
            7fffa60a8197 do_group_exit+0x80005a003047 ([kernel.kallsyms])
            7fffa60a8214 sys_exit_group+0x80005a003014 ([kernel.kallsyms])
            7fffa6006d52 do_syscall_64+0x80005a003062 ([kernel.kallsyms])
            7fffa67ef661 return_from_SYSCALL_64+0x80005a003000 ([kernel.kallsyms])

rcuos/0     9 [000]  3373.257202: sched:sched_process_free: comm=Xorg pid=4844 prio=120
            7fffa60a6039 delayed_put_task_struct+0x80005a003069 ([kernel.kallsyms])
            7fffa610b0fd rcu_nocb_kthread+0x80005a0032ad ([kernel.kallsyms])
            7fffa60c3638 kthread+0x80005a0030d8 ([kernel.kallsyms])
            7fffa67ef7bf ret_from_fork+0x80005a00301f ([kernel.kallsyms])

Xorg  6878 [000]  3373.464285: syscalls:sys_enter_ioctl: fd: 0x00000015, cmd: 0x0000641e, arg: 0x00000000
                   f8b67 __GI___ioctl+0xffff0148328c8007 (/usr/lib64/libc-2.23.so)
                    42b8 drmIoctl+0xffff01483010e028 (/usr/lib64/libdrm.so.2.4.0)
                   12888 qxl_enter_vt_kms+0xffff01483cd24028 (/usr/lib64/xorg/modules/drivers/qxl_drv.so)
                   37141 AddScreen+0xffffffffff800101 (/usr/libexec/Xorg)
                   7cf72 InitOutput+0xffffffffff8003c2 (/usr/libexec/Xorg)
                   3abd6 dix_main+0xffffffffff8001d6 (/usr/libexec/Xorg)
                   20731 __libc_start_main+0xffff0148328c80f1 (/usr/lib64/libc-2.23.so)
                   24d59 _start+0xffffffffff800029 (/usr/libexec/Xorg)
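
For reference, the two cmd values in the perf filter and trace above (0x641e and 0x641f) are the DRM set-master and drop-master ioctl request numbers. The short Python sketch below rebuilds them from the generic Linux ioctl encoding as used on x86_64, where the "no data" direction value is 0. The constants mirror the C macros in the kernel headers (DRM_IOCTL_BASE is 'd', and the two requests are DRM_IO(0x1e) and DRM_IO(0x1f)); the decode helper is a made-up name for illustration.

```python
# Bit layout of an ioctl request number in the generic Linux encoding.
_IOC_NRSHIFT = 0     # bits 0-7: command number
_IOC_TYPESHIFT = 8   # bits 8-15: "magic" type character
_IOC_SIZESHIFT = 16  # bits 16-29: argument size
_IOC_DIRSHIFT = 30   # bits 30-31: transfer direction
_IOC_NONE = 0        # no argument copied in or out (value on x86_64)


def _IO(type_char, nr):
    """Python equivalent of the C macro _IO(type, nr): an argument-less ioctl."""
    return (
        (_IOC_NONE << _IOC_DIRSHIFT)
        | (ord(type_char) << _IOC_TYPESHIFT)
        | (nr << _IOC_NRSHIFT)
        | (0 << _IOC_SIZESHIFT)
    )


DRM_IOCTL_BASE = "d"                               # as defined in drm.h
DRM_IOCTL_SET_MASTER = _IO(DRM_IOCTL_BASE, 0x1E)   # DRM_IO(0x1e)
DRM_IOCTL_DROP_MASTER = _IO(DRM_IOCTL_BASE, 0x1F)  # DRM_IO(0x1f)

NAMES = {
    DRM_IOCTL_SET_MASTER: "DRM_IOCTL_SET_MASTER",
    DRM_IOCTL_DROP_MASTER: "DRM_IOCTL_DROP_MASTER",
}


def decode(cmd):
    """Map a cmd value from the trace to its ioctl name (hypothetical helper)."""
    return NAMES.get(cmd, "unknown")


if __name__ == "__main__":
    for cmd in (0x641E, 0x641F):
        print(hex(cmd), decode(cmd))
```

Read with these names, the trace shows pid 4844 issuing SET_MASTER from qxl_enter_vt_kms and then DROP_MASTER from qxl_leave_vt_kms on the abort path, both before pid 5885 exits.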


xf86-video-ati, for example, seems to have more drmDropMaster() and drmClose() callers in cleanup/teardown paths. Seems like it might be an xf86-video-qxl bug? If so, I wonder why gdm users don't appear to see this. It might be because, iirc, they have per-user unprivileged Xorg processes and so have some other code managing DRM master transitions? Or perhaps they wait for the Xorg process to log out and terminate before switching vt? Or perhaps I'm misunderstanding the whole thing.

Comment 6 Adam Williamson 2016-11-08 01:06:44 UTC
goddamnit, I totally mixed up the two bugs. closing this and starting over. sorry for the mess.

