Bug 1590537 - Race between gnome-pty-helper and VteTerminal leads to a defunct child process
Summary: Race between gnome-pty-helper and VteTerminal leads to a defunct child process
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: vte291
Version: 7.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Debarshi Ray
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-12 19:37 UTC by Lenny Szubowicz
Modified: 2018-10-30 10:29 UTC (History)

Fixed In Version: vte291-0.52.2-2.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-30 10:28:48 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
GNOME Gitlab | GNOME vte issues 7 | 0 | None | None | None | 2018-06-22 12:53:59 UTC
Red Hat Product Errata | RHSA-2018:3140 | 0 | None | None | None | 2018-10-30 10:29:23 UTC

Description Lenny Szubowicz 2018-06-12 19:37:04 UTC
Description of problem:

After yum update to RHEL-7.6-20180612.n.1 nightly build, which includes gnome-terminal-3.28.2-2.el7.x86_64, I can't create a usable gnome-terminal.

The window is created, a bash process is created, but the /dev/pts/<N> device does not exist, there is no bash prompt in the gnome terminal window, and the bash process is a zombie.


Version-Release number of selected component (if applicable):

gnome-terminal-3.28.2-2.el7.x86_64


How reproducible: 100%

Steps to Reproduce:
1. yum update to RHEL-7.6-20180612.n.1 from an earlier 7.6 nightly
   I didn't try a clean install.

2. Log in and select terminal from gnome shell application menu

Actual results:

1. Terminal window is created
2. No bash prompt in created terminal window

Additional info:

I don't know if it's relevant, but I was able to install terminator from the epel repo and it is able to successfully create usable terminal windows.

Comment 2 Tomas Pelka 2018-06-13 09:55:11 UTC
Have you restarted your session (Alt+F2 r <Enter>)? Or, better, rebooted to be sure?

I saw that too, but after a session restart it works fine.

Comment 3 Tomas Pelka 2018-06-13 13:43:23 UTC
OK, it is back; it seems vte is incredibly slow in displaying the prompt.

Comment 4 Martin Krajnak 2018-06-13 13:45:44 UTC
When this occurs, the following message is logged in /var/log/messages:

Jun 13 15:43:54 localhost journal: void vte_terminal_watch_child(VteTerminal*, GPid): assertion 'impl->m_pty != NULL' failed

Comment 5 Debarshi Ray 2018-06-13 13:51:10 UTC
Just to narrow things down a bit, what happens if you install the vte291-devel package and try the toy /usr/bin/vte-2.91 program? Does it give you a working shell prompt?

Comment 6 Michal Odehnal 2018-06-13 14:02:56 UTC
Installed vte291-devel, tried some simple commands. Seems to be working well for me.

Comment 7 Martin Krajnak 2018-06-13 14:06:26 UTC
In my case it behaves randomly, the same as gnome-terminal.

Comment 8 Michal Odehnal 2018-06-13 14:06:44 UTC
(In reply to Michal Odehnal from comment #6)
> Installed vte291-devel, tried some simple commands. Seems to be working well
> for me.

And just as I thought it was working, a few restarts later it stopped working again. A few restarts after that, it was working again.

Comment 9 Lenny Szubowicz 2018-06-13 15:16:47 UTC
Still a hard failure for me.

I installed vte291-devel (and all the other packages it depends on) and rebooted.


I'm also seeing the following in /var/log/messages: 

Jun 13 11:09:38 rhel-vm01 journal: void vte_terminal_watch_child(VteTerminal*, GPid): assertion 'impl->m_pty != NULL' failed

                                 -Lenny.

Comment 10 Lenny Szubowicz 2018-06-13 15:48:07 UTC
(In reply to Debarshi Ray from comment #5)
> Just to narrow things down a bit, what happens if you install the
> vte291-devel package and try the toy /usr/bin/vte-2.91 program? Does it give
> you working shell prompt?

Sorry, I missed the part about /usr/bin/vte-2.91

Indeed /usr/bin/vte-2.91 works and gets me a shell prompt.

I do get this message though:

(Terminal:3803): GLib-GIO-CRITICAL **: 11:44:33.549: g_dbus_proxy_new_sync: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

                               -Lenny.

Comment 11 Lenny Szubowicz 2018-06-14 13:46:18 UTC
(In reply to Lenny Szubowicz from comment #10)
> (In reply to Debarshi Ray from comment #5)
> > Just to narrow things down a bit, what happens if you install the
> > vte291-devel package and try the toy /usr/bin/vte-2.91 program? Does it give
> > you working shell prompt?
> 
> Sorry, I missed the part about /usr/bin/vte-2.91
> 
> Indeed /usr/bin/vte-2.91 works and gets me a shell prompt.
> 
> I do get this message though:
> 
> (Terminal:3803): GLib-GIO-CRITICAL **: 11:44:33.549: g_dbus_proxy_new_sync:
> assertion 'G_IS_DBUS_CONNECTION (connection)' failed
> 
>                                -Lenny.

I also see the same intermittent behavior as reported in comment 7.

/usr/bin/vte-2.91 sometimes also ends up with a window with no shell prompt.
When that happens, I see the following:

(Terminal:3330): Vte-CRITICAL **: 09:41:58.736: void vte_terminal_watch_child(VteTerminal*, GPid): assertion 'impl->m_pty != NULL' failed

                                 -Lenny.

Comment 12 Debarshi Ray 2018-06-21 10:22:38 UTC
Just a quick update.  I can reproduce it too, and I am looking into it.  It seems related to the gnome-pty-helper code, because it doesn't happen if you build without --enable-gnome-pty-helper.

(--enable-gnome-pty-helper needs to be explicitly passed because it's not there by default. That's why I didn't notice it when testing the rebased patches.  Sorry about that.)

Comment 13 Debarshi Ray 2018-06-21 17:25:37 UTC
For some reason, VteTerminal was getting a "hang up" (i.e. G_IO_HUP) while polling the pseudo-terminal master device (i.e. the end of the pseudo-terminal device pair that is held by the UI process), and this only happened when using gnome-pty-helper together with vte_terminal_spawn_async. Not using gnome-pty-helper, or using vte_terminal_spawn_sync, did not trigger the bug.

The move from vte_terminal_spawn_sync to vte_terminal_spawn_async happened between RHEL 7.4 and 7.6. So running gnome-terminal from RHEL < 7.6 with newer vte291 would not cause this bug.

The reason for this is that the asynchronous code path starts polling the pseudo-terminal master device a little too early in the sequence. When gnome-pty-helper is used, the helper is in charge of creating the pseudo-terminal device pair, and it closes its copies of the file descriptors after sending them to the main UI process. I believe that polling the master device immediately after receiving it from the helper can leave it vulnerable to a race condition where the helper closes its copies after the poll has been initiated.
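
The hang-up condition described above can be observed outside VTE. The following is a minimal Python sketch, not VTE or gnome-pty-helper code: it assumes Linux pseudo-terminal semantics, and the function name master_sees_hangup is hypothetical. It opens a master/slave pair, optionally closes the only slave descriptor (standing in for the helper closing its copy), and reports whether polling the master yields POLLHUP, the poll(2) condition that GLib exposes as G_IO_HUP.

```python
import os
import select


def master_sees_hangup(keep_slave_open: bool, timeout_ms: int = 200) -> bool:
    """Return True if polling the pty master reports POLLHUP.

    Hypothetical helper for illustration only; assumes Linux pty behaviour.
    """
    master_fd, slave_fd = os.openpty()
    try:
        if not keep_slave_open:
            # Stand-in for the helper closing its copy while no other
            # process holds the slave side open.
            os.close(slave_fd)
            slave_fd = -1

        poller = select.poll()
        poller.register(master_fd, select.POLLIN | select.POLLHUP)
        events = poller.poll(timeout_ms)
        return any(mask & select.POLLHUP for _fd, mask in events)
    finally:
        os.close(master_fd)
        if slave_fd >= 0:
            os.close(slave_fd)


if __name__ == "__main__":
    print("slave still open :", master_sees_hangup(keep_slave_open=True))
    print("slave closed     :", master_sees_hangup(keep_slave_open=False))
```

On Linux, a poller that is already watching the master when the last slave descriptor goes away should see the hang-up, which is the window the asynchronous code path appears to leave open.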

Instead, we should start polling after the child process has been forked and exec-ed. Since the child would keep its copy of the pseudo-terminal slave device open, having the helper close its copy after that wouldn't matter. This is also what the synchronous code path does, which we have been using without any problem so far.
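
The ordering proposed here can be sketched in Python under the same caveat: this is an illustration assuming Linux pty semantics, not the actual VTE patch, and the function name poll_after_spawn is hypothetical. The child is started with the slave device as its standard streams first; only then does the parent drop its own slave descriptor (standing in for the helper closing its copy) and begin polling the master.

```python
import os
import select
import subprocess
import sys


def poll_after_spawn(timeout_ms: int = 200) -> bool:
    """Return True if the master reports POLLHUP when polling starts
    only after the child has been spawned. Hypothetical illustration."""
    master_fd, slave_fd = os.openpty()

    # Spawn first: the child inherits the slave as stdin/stdout/stderr.
    # Popen returns once the child has been forked and exec-ed.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(2)"],
        stdin=slave_fd,
        stdout=slave_fd,
        stderr=slave_fd,
    )
    try:
        # Closing this copy no longer matters: the child keeps the
        # slave side open through its standard streams.
        os.close(slave_fd)

        # Only now start watching the master for a hang-up.
        poller = select.poll()
        poller.register(master_fd, select.POLLHUP)
        events = poller.poll(timeout_ms)
        return any(mask & select.POLLHUP for _fd, mask in events)
    finally:
        child.kill()
        child.wait()
        os.close(master_fd)


if __name__ == "__main__":
    print("hang-up seen while child is alive:", poll_after_spawn())
```

While the child is alive and holding the slave device, no hang-up is expected on the master, so the watcher only fires once the child really exits.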

Comment 14 Debarshi Ray 2018-06-22 13:07:18 UTC
Scratch build:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16831663

Comment 15 Debarshi Ray 2018-06-22 13:08:14 UTC
Building vte291-0.52.2-2.el7:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16831853

Comment 16 Martin Krajnak 2018-06-22 14:05:31 UTC
(In reply to Debarshi Ray from comment #15)
> Building vte291-0.52.2-2.el7:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16831853

quickly tested in wm, works fine for me, thanks rishi

Comment 18 Lenny Szubowicz 2018-06-26 12:28:13 UTC
Works for me as well. Thanks! -Lenny.

Comment 19 Martin Krajnak 2018-08-15 12:06:20 UTC
gnome-terminal-3.28.2-2.el7.x86_64
vte291-0.52.2-2.el7.x86_64

The terminal has been working fine for a long time; moving to VERIFIED.

Comment 21 errata-xmlrpc 2018-10-30 10:28:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3140

