Bug 1748681 - ps shows '?' for the tty of most processes
Summary: ps shows '?' for the tty of most processes
Alias: None
Product: Fedora
Classification: Fedora
Component: gnome-session
Version: 31
Hardware: x86_64
OS: Unspecified
Target Milestone: ---
Assignee: Ray Strode [halfline]
QA Contact: Fedora Extras Quality Assurance
Whiteboard: openqa
Depends On:
Reported: 2019-09-04 01:37 UTC by Adam Williamson
Modified: 2019-09-16 13:16 UTC

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2019-09-13 07:54:43 UTC
Type: Bug

Attachments
packages in update after which the bug appears (26.92 KB, text/plain)
2019-09-04 20:56 UTC, Adam Williamson

Description Adam Williamson 2019-09-04 01:37:37 UTC
I'm still trying to pin this one down a bit more, but so far this is what I know. In recent Fedora 31, ps run from a root console at a VT seems to show '?' as the tty for most processes, particularly all processes in a GNOME session running on another VT.

This is a problem for me as it breaks a technique the Fedora openQA tests use to figure out which tty a GNOME session is running on. I've had to work around it using loginctl for now, but that's a bit of a bigger hammer in context.

I haven't yet figured out what's causing it. It doesn't seem to happen with a disk image representing the state of F31 on about August 28th. After updating that same install to current F31 (with updates-testing enabled) and rebooting, it starts happening. I thought at first it was systemd rc2 vs. final, but that doesn't seem to be it...

Comment 1 Adam Williamson 2019-09-04 20:56:41 UTC
Created attachment 1611690
packages in update after which the bug appears

So, the bug should be in one of these packages. This is the update after which the bug starts happening, in my test VM. I tested and systemd does *not* seem to be the cause. My next most likely suspect is gnome-session, I think.

Comment 2 Adam Williamson 2019-09-04 21:47:13 UTC
OK, yeah, so downgrading gnome-session - and gnome-control-center and gnome-settings-daemon, which have to go with it - to 3.33.4 (for session), 3.33.3 (for control-center) and 3.33.0 (for settings-daemon) makes the bug go away. So assigning to gnome-session for now.

Comment 3 Adam Williamson 2019-09-04 22:24:16 UTC
CCing bberg as he seems to have written the big changes in gnome-session 3.33.90. I poked a bit to try and guess what broke this, but it's a bit difficult to guess, especially as procps-ng's code is not super obvious about exactly how it *determines* the tty of a process...

Comment 4 Benjamin Berg 2019-09-05 08:04:12 UTC
The difference is that the processes are now forked off the systemd user instance rather than the GDM process. Because of this, they don't even technically belong to a systemd session scope (they are part of the systemd user scope).

Now, I don't know how "ps" determines the TTY of a process. Maybe the processes used to inherit the actual VT for stdin/stdout/stderr, or something like that.

I fear that there is nothing we can do on the gnome-session side unfortunately.

Comment 5 Adam Williamson 2019-09-11 15:02:08 UTC
CCing Jan Rybar, who seems to be the procps-ng maintainer. Any thoughts here?

Comment 6 Benjamin Berg 2019-09-13 07:54:43 UTC
I am going to close this bug as wontfix. I am happy to explain this further and will follow up on any questions, though.

Basically, in the current world, we have:
 * gdm-{x,wayland}-session: Run by GDM to launch/monitor the session; correct TTY
 * gnome-session-binary (stub): Runs in the systemd session-X.scope; correct TTY
 * systemd --user: Launched by pam_systemd for the *user* (not session), running in user-X.scope.
 * Everything else launched as services or by services in the user scope.

For these processes it is *not* directly obvious that they are part of the graphical session. What makes them part of the session are two things:
 * They connect to the display server (wayland/X11)
 * They voluntarily shut down on logout (i.e. BindsTo=graphical-session.target or similar)

Then we have some things, like evolution-data-server currently, which do neither of the above. For this reason, we currently restart the DBus server at GNOME logout, so that they are reaped. This is a hack though, and e-d-s should be doing an idle shutdown.

The main way to fix this would be to:
 1. Figure out that a process is connected to a display server
 2. Have a heuristic to find the TTY that display server is running on
which really does not seem feasible to me.

Comment 7 Adam Williamson 2019-09-13 15:30:05 UTC
Well I'm not gonna bother having an opinion on all of that, but it seems to be the case that I can rely on gnome-session-binary to determine the tty a single running GNOME session is on, right?

Comment 8 Benjamin Berg 2019-09-13 15:50:30 UTC
Yes. You should be able to rely on the fact that there is one gnome-session-binary process that has the TTY. It just happens that in a systemd-launched session there is a second gnome-session-binary process which has no TTY.
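
A minimal sketch of that approach in Python, for illustration only. It assumes a procps-style `ps` that accepts `-e -o tty=,args=`, and that exactly one GNOME session is running, as in comment 7. The helper names `pick_session_tty` and `gnome_session_tty` are hypothetical and not part of any existing tool. It skips the gnome-session-binary process whose TTY column is '?' and returns the TTY of the other one.

```python
import os
import subprocess
from typing import Optional


def pick_session_tty(ps_output: str) -> Optional[str]:
    """Return the TTY of the first gnome-session-binary line that has one.

    Expects lines in the form printed by `ps -e -o tty=,args=`:
    a TTY column ('?' when there is none) followed by the command line.
    """
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # no command line on this row
        tty, args = parts
        argv = args.split()
        if not argv:
            continue
        if os.path.basename(argv[0]) != "gnome-session-binary":
            continue
        if tty != "?":
            return tty  # the process that kept the session's TTY
    return None


def gnome_session_tty() -> Optional[str]:
    """Run ps and extract the session TTY (assumes a single GNOME session)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "tty=,args="],
        capture_output=True,
        text=True,
        check=False,
    )
    return pick_session_tty(result.stdout)


if __name__ == "__main__":
    print(gnome_session_tty())
```

With more than one graphical session running, this would return whichever matching process ps lists first, so it is only suitable for the single-session case discussed here.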

Comment 9 Jan Rybar 2019-09-16 13:16:57 UTC
(In reply to Adam Williamson from comment #5)
> CCing Jan Rybar, who seems to be the procps-ng maintainer. Any thoughts here?

Sorry to reply so late, I was on vacation.

Even though the bug is closed, I just want to leave a note, FYI, that ps takes the TTY information of a process from its /proc/<PID>/stat file. So if it's not there, ps would need to get this information from somewhere else.
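
For reference, a rough sketch of reading that value directly, in Python. Per proc(5), the controlling terminal is field 7 (tty_nr) of /proc/<PID>/stat, and a value of 0 means the process has none, which ps prints as '?'. The function names are hypothetical, and the device-name mapping is a deliberate simplification (virtual consoles on major 4 and the first Unix98 pty major 136 only), not a copy of what procps-ng actually does.

```python
def tty_nr_from_stat(stat_line: str) -> int:
    """Extract tty_nr (field 7) from the contents of /proc/<PID>/stat.

    The comm field (field 2) is wrapped in parentheses and may itself
    contain spaces or ')' characters, so split after the LAST ')'.
    """
    after_comm = stat_line[stat_line.rindex(")") + 1:].split()
    # after_comm: state, ppid, pgrp, session, tty_nr, ...
    return int(after_comm[4])


def describe_tty(tty_nr: int) -> str:
    """Turn a tty_nr into a ps-like name (simplified mapping, see above)."""
    if tty_nr == 0:
        return "?"  # no controlling terminal
    # Kernel device encoding: major in bits 8..19,
    # minor in bits 0..7 plus bits 20..31.
    major = (tty_nr >> 8) & 0xFFF
    minor = (tty_nr & 0xFF) | ((tty_nr >> 12) & 0xFFF00)
    if major == 4 and minor < 64:
        return f"tty{minor}"  # virtual console
    if major == 136:
        return f"pts/{minor}"  # Unix98 pseudo-terminal
    return f"{major}:{minor}"  # anything else: raw device numbers


if __name__ == "__main__":
    with open("/proc/self/stat") as f:
        print(describe_tty(tty_nr_from_stat(f.read())))
```

Processes started by the systemd user instance have tty_nr set to 0 here, which is why ps has nothing to show for them.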
