Bug 1569045

Summary: kickstart file trigger critical exception in anaconda
Product: [Fedora] Fedora Reporter: Michel Normand <normand>
Component: anacondaAssignee: Martin Kolman <mkolman>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 28CC: anaconda-maint-list, awilliam, fzatlouk, jkonecny, jonathan, kellin, mkolman, normand, vanmeeuwen+fedora, v.podzimek+fedora, vponcova, wwoods
Target Milestone: ---   
Target Release: ---   
Hardware: powerpc   
OS: Unspecified   
Whiteboard: AcceptedFreezeException
Fixed In Version: anaconda-28.22.10-1 anaconda-28.22.10-1.fc28 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-25 00:03:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1071880, 1469207    
Attachments:
Description Flags
anaconda.log
none
server.ks
none
alltmplogs.txt.gz none

Description Michel Normand 2018-04-18 14:31:17 UTC
Created attachment 1423623 [details]
anaconda.log

kickstart file trigger critical exception in anaconda
while trying  to use the createhdds 

How reproducible:
virt-install --disk size=8,path=disk_f28_server_3_ppc64le.img.tmp --os-variant fedora-unknown -x inst.ks=file:/server.ks --initrd-inject /root/createhdds/server.ks --location https://dl.fedoraproject.org/pub/fedora-secondary/development/28/Server/ppc64le/os/ --name createhdds --memory 4096 --noreboot --wait -1 --graphics none --extra-args console=ttyS0 --network user

From anaconda root console, the /tmp/anaconda.log ultimately reports traceback error:
===
16:14:02,614 DBG exception: running handleException^M
16:14:02,617 CRT exception: Traceback (most recent call last):^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/threading.py", line 291, in run^M
    threading.Thread.run(self)^M
^M
  File "/usr/lib64/python3.6/threading.py", line 864, in run^M
    self._target(*self._args, **self._kwargs)^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/installation.py", line 361, in doInstall^M
    installation_queue.start()^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/installation_tasks.py", line 304, in start^M
    item.start()^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/installation_tasks.py", line 304, in start^M
    item.start()^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/installation_tasks.py", line 472, in start^M
    self.run_task()^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/installation_tasks.py", line 438, in run_task^M
    self._task(*self._task_args, **self._task_kwargs)^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/bootloader.py", line 2552, in writeBootLoader^M
    writeBootLoaderFinal(storage, payload, instClass, ksdata)^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/bootloader.py", line 2471, in writeBootLoaderFinal^M
    payload=payload)^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/bootloader.py", line 847, in set_boot_args^M
    setup_args = cfg_obj.dracutSetupArgs()^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/payload/__init__.py", line 909, in dracutSetupArgs^M
    if util.isConsoleOnVirtualTerminal() and \^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/core/util.py", line 578, in isConsoleOnVirtualTerminal^M
    console = get_active_console(dev)           # e.g. 'tty1', 'ttyS0', 'hvc1'^M
^M
  File "/usr/lib64/python3.6/site-packages/pyanaconda/core/util.py", line 573, in get_active_console^M
    dev = open("/sys/class/tty/%s/active" % dev).read().split()[-1]^M
^M
IndexError: list index out of range^M
===

Comment 1 Michel Normand 2018-04-18 14:33:43 UTC
Created attachment 1423625 [details]
server.ks

the server.ks is from createhdds git tree
https://pagure.io/fedora-qa/createhdds/blob/master/f/server.ks

Comment 2 Jiri Konecny 2018-04-19 06:57:22 UTC
Hello,

Could you please provide us other missing logs from /tmp/*.log as plain text files. Also it would be nice to see output of the `cat /sys/class/tty/<lowest tty>/active` command.

Thank you for your report.

Comment 3 Michel Normand 2018-04-19 08:14:59 UTC
Created attachment 1423940 [details]
alltmplogs.txt.gz

(In reply to Jiri Konecny from comment #2)
> Could you please provide us other missing logs from /tmp/*.log as plain text
> files. Also it would be nice to see output of the `cat
> /sys/class/tty/<lowest tty>/active` command.

I gathered all /tmp/*log in attached alltmplogs.txt.gz
that also contains the cat /sys/class/tty/tty0/active
=== /sys/class/tty/tty0/active:
tty1
=== /sys/class/tty/tty0/active end

Comment 4 Martin Kolman 2018-04-19 11:58:40 UTC
(In reply to Michel Normand from comment #3)
> Created attachment 1423940 [details]
> alltmplogs.txt.gz
> 
> (In reply to Jiri Konecny from comment #2)
> > Could you please provide us other missing logs from /tmp/*.log as plain text
> > files. Also it would be nice to see output of the `cat
> > /sys/class/tty/<lowest tty>/active` command.
> 
> I gathered all /tmp/*log in attached alltmplogs.txt.gz
> that also contains the cat /sys/class/tty/tty0/active
> === /sys/class/tty/tty0/active:
> tty1
> === /sys/class/tty/tty0/active end

Are there some other /sys/class/*/ devices that have "active" but it's empty ?

That's what going on IMHO, the piece of code:


def get_active_console(dev="console"):
    '''Find the active console device.
      
    Some tty devices (/dev/console, /dev/tty0) aren't actual devices;
    they just redirect input and output to the real console device(s).
      
    These 'fake' ttys have an 'active' sysfs attribute, which lists the real
    console device(s). (If there's more than one, the *last* one in the list
    is the primary console.)   
    '''
    # If there's an 'active' attribute, this is a fake console..
    while os.path.exists("/sys/class/tty/%s/active" % dev):
        # So read the name of the real, primary console out of the file.
        dev = open("/sys/class/tty/%s/active" % dev).read().split()[-1]
    return dev 


Most importantly this part:


    # If there's an 'active' attribute, this is a fake console..
    while os.path.exists("/sys/class/tty/%s/active" % dev):
        # So read the name of the real, primary console out of the file.
        dev = open("/sys/class/tty/%s/active" % dev).read().split()[-1]

We basically iterate over all tty-like devices known to the system and if they have the active "file", we try to read it, expecting it to be nonempty.

Basically:


In [7]: "tty2".split()[-1]
Out[7]: 'tty2'

In [8]: "".split()[-1]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-8-1de257272cc7> in <module>()
----> 1 "".split()[-1]

IndexError: list index out of range


So even one tty-like device with active that is empty is enough to crash this piece of code. 

The question is, what is the proper fix ? We can quite easily change to code to handle empty active but maybe the issue is there are tty-like device that have active but it's empty ? In such case this could be a bug in kernel or the tty subsystem.

Comment 5 Michel Normand 2018-04-19 12:22:26 UTC
(In reply to Martin Kolman from comment #4)
> [CUT]...
> Are there some other /sys/class/*/ devices that have "active" but it's empty
> ?
> 

As  per output below, only /sys/class/tty with activate file, and "console" has empty file.
============================
[anaconda root@localhost /]# ls /sys/class/*/*/active
/sys/class/tty/console/active  /sys/class/tty/tty0/active
[anaconda root@localhost /]# for xx in $(ls /sys/class/*/*/active); do echo "=== $xx:"; cat $xx; echo "=== $xx end"; done
=== /sys/class/tty/console/active:
=== /sys/class/tty/console/active end
=== /sys/class/tty/tty0/active:
tty1
=== /sys/class/tty/tty0/active end
============================

Comment 6 Martin Kolman 2018-04-19 12:45:38 UTC
(In reply to Michel Normand from comment #5)
> (In reply to Martin Kolman from comment #4)
> > [CUT]...
> > Are there some other /sys/class/*/ devices that have "active" but it's empty
> > ?
> > 
> 
> As  per output below, only /sys/class/tty with activate file, and "console"
> has empty file.
> ============================
> [anaconda root@localhost /]# ls /sys/class/*/*/active
> /sys/class/tty/console/active  /sys/class/tty/tty0/active
> [anaconda root@localhost /]# for xx in $(ls /sys/class/*/*/active); do echo
> "=== $xx:"; cat $xx; echo "=== $xx end"; done
> === /sys/class/tty/console/active:
> === /sys/class/tty/console/active end
> === /sys/class/tty/tty0/active:
> tty1
> === /sys/class/tty/tty0/active end
> ============================
Looking at the code, things should still work as long as at least one active is nonempty.

So I guess we can go the path of fixing Anaconda to ignore empty active and that should fix the crash.

BTW, it this should be fixed in the F28 install media you will need to request a freeze exception here as F28 is now in final freeze:
https://qa.fedoraproject.org/blockerbugs/milestone/28/final/buglist

Otherwise the fix will go only to Rawhide.

Comment 7 Adam Williamson 2018-04-19 21:55:08 UTC
Proposing this for an FE (thanks for flagging it up Martin), this is actually kinda a significant issue for openQA PPC testing and it'd be best to get it fixed in the release.

Comment 8 Martin Kolman 2018-04-20 21:30:06 UTC
Fix PR: https://github.com/rhinstaller/anaconda/pull/1452

Comment 9 FrantiĊĦek Zatloukal 2018-04-23 17:53:26 UTC
Discussed during the 2018-04-23 blocker review meeting: [1]

The decision to classify this bug as an AcceptedFreezeException was made:

"this can cause kickstart / console installs to crash on certain systems, and cannot be fixed with a post-release update"

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2018-04-23/f28-blocker-review.2018-04-23-16.00.log.txt

Comment 10 Fedora Update System 2018-04-24 10:52:46 UTC
anaconda-28.22.10-1.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-1884c34b53

Comment 11 Fedora Update System 2018-04-24 19:14:56 UTC
anaconda-28.22.10-1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-1884c34b53

Comment 12 Fedora Update System 2018-04-25 00:03:40 UTC
anaconda-28.22.10-1.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 13 Michel Normand 2018-04-25 14:48:31 UTC
I verified with above change that no more exception with virt-install called with text console parameter; BUT identify a new problem with virt-install called with vnc console parameter => new bug https://bugzilla.redhat.com/show_bug.cgi?id=1571860