Bug 732302

Summary: Anaconda could not set a new controlling tty
Product: [Fedora] Fedora
Reporter: Mark Hamzy <hamzy>
Component: anaconda
Assignee: Will Woods <wwoods>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: medium
Version: 16
CC: anaconda-maint-list, jonathan, karsten, vanmeeuwen+fedora, wwoods
Target Milestone: ---   
Target Release: ---   
Hardware: ppc64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2013-01-20 15:48:58 UTC
Bug Blocks: 718272    

Description Mark Hamzy 2011-08-21 18:49:35 UTC
Description of problem:
Booting Fedora-20110817-ppc64-netinst.iso with the following options

linux vnc=1

hangs when the Anaconda installer is run.  I see the following output:

Starting Anaconda version 16.14.
could not set new controlling tty

When I run gdb on the loader, I see the following stack trace:

(gdb) bt
#0  0x00000fffa5aa3808 in ___newselect_nocancel () from /lib64/libc.so.6
#1  0x00000fffa6972098 in newtFormRun (co=0x10029deb380, es=0xffffa712a10) at form.c:1016
#2  0x00000fffa69725b8 in newtRunForm (co=<optimized out>) at form.c:840
#3  0x00000fffa69798b4 in newtWinMenu (title=0x100319f8 "Choose a Language", text=<optimized out>, suggestedWidth=<optimized out>,
    flexDown=<optimized out>, flexUp=<optimized out>, maxListHeight=<optimized out>, items=<optimized out>, listItem=
    0xffffa712dd4, button1=0x1002f528 "OK") at windows.c:190
#4  0x0000000010012820 in .chooseLanguage ()
#5  0x000000001000d374 in .doLoaderMain ()
#6  0x000000001000a358 in .main ()

So it seems to be waiting on user input for choosing a language.

The following simple program also reproduces the error:

/* gcc -o test_ioctl -g test_ioctl.c
*/

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <errno.h>

int
main (int argc, const char *argv[])
{
    int   rc;
    int   savedErrno;
    pid_t sessionID;

    errno = 0;

    /* Try to become a session leader with no controlling tty. */
    sessionID = setsid();
    savedErrno = errno;   /* save it: printf() may overwrite errno */
    printf ("setsid() = %d\n", (int) sessionID);
    if (sessionID == -1)
        /* On error, -1 is returned, and errno is set. The only error
         * which can happen is EPERM. It is returned when the process
         * group ID of any process equals the PID of the calling process.
         * Thus, in particular, setsid() fails if the calling process is
         * already a process group leader. */
        printf ("errno = %d\n", savedErrno);

    /* Ask the kernel to make the terminal on stdin our controlling tty. */
    errno = 0;
    rc = ioctl(0, TIOCSCTTY, NULL);
    savedErrno = errno;   /* save it before the next printf() */
    printf ("ioctl(0, TIOCSCTTY, NULL) = %d\n", rc);

    if (rc) {
        printf ("errno = %d\n", savedErrno);
        printf ("could not set new controlling tty\n");
    }

    return 0;
}

Unfortunately, the kernel returns the same error for several different problems, so I cannot tell whether the process is not a session leader or whether it already has a controlling tty.

Comment 1 Mark Hamzy 2011-08-22 13:48:02 UTC
It fails in setsid() because of:

        /* Fail if a process group id already exists that equals the
         * proposed session id.
         */
        if (pid_task(sid, PIDTYPE_PGID))
                goto out;
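
The kernel check above can be exercised from user space. This is a small sketch under the assumption that fork() and setpgid() behave as on a standard Linux system; the function name setsid_errno_in_child() is hypothetical. A freshly forked child is not a process group leader, so setsid() succeeds there; once the child makes itself a group leader with setpgid(0, 0), a process group with its PID exists and setsid() fails with EPERM.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, optionally make it a process group
 * leader, then call setsid() in it. Returns the child's errno from
 * setsid() (0 on success), 255 if setpgid() failed, or -1 if the
 * fork/wait itself failed. */
static int setsid_errno_in_child(int become_group_leader)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: optionally create a process group whose ID is our PID. */
        if (become_group_leader && setpgid(0, 0) != 0)
            _exit(255);
        errno = 0;
        if (setsid() == (pid_t)-1)
            _exit(errno);   /* errno values are small enough for a status */
        _exit(0);
    }

    /* Parent: collect the child's result from its exit status. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With this helper, setsid_errno_in_child(0) should return 0 and setsid_errno_in_child(1) should return EPERM, which matches the failure path quoted from the kernel.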

Comment 2 Mark Hamzy 2011-08-24 19:47:32 UTC
I have noticed that if I start the program via systemd, then it works.

bash-4.2# cat << __EOF__ > /lib/systemd/system/bob.service
[Unit]
Description=Test program
Requires=dbus.service udev.service rsyslog.service  instperf.service
After=dbus.service udev.service rsyslog.service  instperf.service

[Service]
Environment=HOME=/root MALLOC_CHECK_=2 MALLOC_PERTURB_=204 PATH=/usr/bin:/bin:/sbin:/usr/sbin:/mnt/sysimage/bin:/mnt/sysimage/usr/bin:/mnt/sysimage/usr/sbin:/mnt/sysimage/sbin PYTHONPATH=/tmp/updates TERM=linux
Type=oneshot
WorkingDirectory=/root
ExecStart=/tmp/test_ioctl
StandardInput=tty-force
TimeoutSec=0
__EOF__
bash-4.2# /bin/systemctl start bob.service
setsid() = -1
errno = 2
ioctl(0, TIOCSCTTY, NULL) = 0
[  253.697892] systemd[1]: Received SIGCHLD from PID 924 (test_ioctl).
[  253.698034] systemd[1]: Got SIGCHLD for process 924 (test_ioctl)
[  253.698340] systemd[1]: Child 924 died (code=exited, status=0/SUCCESS)
[  253.698351] systemd[1]: Child 924 belongs to bob.service
[  253.698364] systemd[1]: bob.service: main process exited, code=exited, status=0
[  253.699509] systemd[1]: bob.service changed start -> dead
[  253.734022] systemd[1]: Job bob.service/start finished, result=done

Comment 3 Mark Hamzy 2011-08-26 21:53:47 UTC
Okay, here is a workaround: add the "serial" parameter to the boot line.

It seems that init_serial() in loader/serial.c is called.  If "serial" is passed to anaconda, then get_serial_fd() returns /dev/hvc0, which works.  Using /dev/tty1 does not seem to work.

You do not need the "serial" parameter when installing RHEL6.1.  So I need to investigate why that is.

Comment 4 Will Woods 2011-11-09 23:15:53 UTC
RHEL6 tries to set up the 'weird' serial devices (hvc0, xvc0, hvsi0, etc) unconditionally, but Fedora gained a 'serial_requested()' check early in the F16 branch (see git commit 422702a3).

I think that check was meant to handle the normal ttyS0 case and not the 'weird' devices, so that might need fixing. I've filed bug 752596 for that.

The question of why we get the 'could not set new controlling tty' message remains, but fixing 752596 might make it irrelevant.
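
To illustrate the difference between probing unconditionally and probing only when "serial" was requested, here is a rough sketch. It is not the loader's code: cmdline_has_flag() and open_first_tty() are hypothetical helpers, and treating both a bare "serial" token and a "serial=value" token as a request is an assumption made for the example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if the whitespace-separated command line
 * contains `flag` as a whole token, either bare or followed by "=value". */
static int cmdline_has_flag(const char *cmdline, const char *flag)
{
    const char *sep = " \t\n";
    size_t len = strlen(flag);
    const char *p = cmdline;

    while (*p) {
        p += strspn(p, sep);            /* skip separators */
        size_t tok = strcspn(p, sep);   /* length of the next token */
        if (tok >= len && strncmp(p, flag, len) == 0 &&
            (tok == len || p[len] == '='))
            return 1;
        p += tok;
    }
    return 0;
}

/* Hypothetical helper: open the first path that exists and is a
 * terminal. Returns its file descriptor, or -1 if none qualifies. */
static int open_first_tty(const char *const paths[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int fd = open(paths[i], O_RDWR | O_NOCTTY);

        if (fd < 0)
            continue;
        if (isatty(fd))
            return fd;
        close(fd);
    }
    return -1;
}
```

In these terms, the RHEL6 behaviour corresponds to always calling open_first_tty() with the hvc0, xvc0 and hvsi0 device paths, while the F16 behaviour corresponds to calling it only when cmdline_has_flag(cmdline, "serial") is true.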

Comment 5 Will Woods 2011-11-10 16:42:29 UTC
Aha - I think we may have figured this out. It looks like a couple of chunks of terminal setup code were switched around during some of the refactoring work over the past 4 years.

I'll put together a patch so we can test this theory.

Comment 6 Will Woods 2011-11-17 16:37:59 UTC
Okay! Here's a test image:
http://wwoods.fedorapeople.org/Fedora-16-ppc64-netinst-20111116.iso
SHA1SUM: a7c0ec797b56ccf6e31a90e3667789ed3416dc78

This image has proposed fixes for this bug and bug 752596. Could someone try booting that image and report back? If it works, I'll commit the patches and we can close this bug.

Comment 7 Will Woods 2011-11-17 16:42:17 UTC
To test the fixes, boot the image without "serial" and confirm that:

1) no "could not set new controlling tty" message appears (this bug)

2) anaconda sends subsequent output (i.e. welcome messages, menus, etc.) to the hvc0 console automatically (bug 752596)

Comment 8 Karsten Hopp 2011-11-21 16:45:00 UTC
I've never seen the "could not set new controlling tty" message; on my system, anaconda just stops after 'Starting Anaconda version xxxxx' when the serial parameter is not given.

This is still the case with the iso from comment #6 and the cmdline 'linux  vnc':


Started anaconda performance monitor.
Starting System Logging Service...
Starting Shell on tty2...
Started Shell on tty2.
Starting Shell on hvc1...
Started Shell on hvc1.
Starting Anaconda version 16.24.



I've tried again with 'serial' parameter and anaconda continues with
detecting hardware...
waiting for hardware to initialize...
....

Comment 9 Will Woods 2011-11-21 17:06:12 UTC
Karsten, you're talking about bug 752596. I'm using this for the "could not set a new controlling tty" message (which is actually harmless - keep in mind that "controlling tty" is a totally different thing from "tty where output goes", so this message has no effect on where the output actually ends up.)

To clarify a bit:

After the "Starting Anaconda version XXX" message, loader tries to figure out which terminal it should talk to. Once it finds one it likes, it switches its controlling terminal there, then sends all further output to that terminal.

In pseudocode, here's what it was doing:

  fd = find_serial_fd();
  if (fd == -1)
      fd = open("/dev/tty1");
  set_controlling_tty(0); /* BUG: should be fd! */
  dup2(fd, 0); dup2(fd, 1); dup2(fd, 2);

Two things go wrong:

1) find_serial_fd() isn't finding /dev/hvc0 (that's bug 752596)
2) set_controlling_tty() should use fd, not 0.
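
For reference, the corrected ordering can be sketched in C as follows. This is an illustration only, not the actual patch: claim_terminal() is a hypothetical name, and the sketch assumes the caller is allowed to start a new session. The key point is that TIOCSCTTY is issued on the descriptor that was just opened, before stdin, stdout and stderr are redirected to it.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: make the terminal at `path` the controlling tty
 * and redirect stdin/stdout/stderr to it. Returns 0 on success, or -1
 * with errno set on failure. */
static int claim_terminal(const char *path)
{
    int fd = open(path, O_RDWR | O_NOCTTY);

    if (fd < 0)
        return -1;
    if (!isatty(fd)) {
        close(fd);
        errno = ENOTTY;
        return -1;
    }

    /* Start a new session so we have no controlling tty yet. This fails
     * with EPERM if we are already a process group leader; we carry on
     * and let TIOCSCTTY report whether the terminal could be claimed. */
    (void)setsid();

    /* Use the terminal we opened (fd), not descriptor 0. */
    if (ioctl(fd, TIOCSCTTY, 0) < 0) {
        int saved = errno;

        close(fd);
        errno = saved;
        return -1;
    }

    /* Only now point stdin, stdout and stderr at the terminal. */
    if (dup2(fd, 0) < 0 || dup2(fd, 1) < 0 || dup2(fd, 2) < 0)
        return -1;
    if (fd > 2)
        close(fd);
    return 0;
}
```

A caller would try the serial console first and fall back to claim_terminal("/dev/tty1") when no serial device is found.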

So - unless you see the "could not set new controlling tty" message, this bug is fixed. If your output stops after "Starting Anaconda version XXX", that's bug 752596.

Comment 10 Will Woods 2011-11-21 23:05:03 UTC
Argh, that image was messed up. Here's a (hopefully) fixed one:
  http://wwoods.fedorapeople.org/Fedora-16-ppc64-netinst-20111116.iso
SHA1SUM: 013c8dec6d41cf21862667866a306761e8d20ef0

Please try booting without 'serial' or 'text' on the commandline and let me know if the normal text-based installer comes up.

Comment 11 Karsten Hopp 2011-11-22 14:01:16 UTC
The requested URL /Fedora-16-ppc64-netinst-20111116.iso was not found on this server.

Comment 13 Karsten Hopp 2011-11-22 14:34:09 UTC
The 20111121.iso boots fine into the TUI installer with a simple <enter> at the yaboot prompt.

Comment 14 Fedora End Of Life 2013-01-17 02:01:21 UTC
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping