Bug 1569045
Summary: | kickstart file trigger critical exception in anaconda | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Michel Normand <normand>
Component: | anaconda | Assignee: | Martin Kolman <mkolman>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 28 | CC: | anaconda-maint-list, awilliam, fzatlouk, jkonecny, jonathan, kellin, mkolman, normand, vanmeeuwen+fedora, v.podzimek+fedora, vponcova, wwoods
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | powerpc | |
OS: | Unspecified | |
Whiteboard: | AcceptedFreezeException | |
Fixed In Version: | anaconda-28.22.10-1 anaconda-28.22.10-1.fc28 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-04-25 00:03:40 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1071880, 1469207 | |
Description
Michel Normand
2018-04-18 14:31:17 UTC
Created attachment 1423625 [details]
server.ks

the server.ks is from createhdds git tree
https://pagure.io/fedora-qa/createhdds/blob/master/f/server.ks

Hello,

Could you please provide us the other missing logs from /tmp/*.log as plain text files? Also, it would be nice to see the output of the `cat /sys/class/tty/<lowest tty>/active` command.

Thank you for your report.

Created attachment 1423940 [details]
alltmplogs.txt.gz

(In reply to Jiri Konecny from comment #2)
> Could you please provide us other missing logs from /tmp/*.log as plain text
> files. Also it would be nice to see output of the `cat
> /sys/class/tty/<lowest tty>/active` command.

I gathered all /tmp/*log in the attached alltmplogs.txt.gz, which also contains the output of cat /sys/class/tty/tty0/active:

=== /sys/class/tty/tty0/active:
tty1
=== /sys/class/tty/tty0/active end

(In reply to Michel Normand from comment #3)
> Created attachment 1423940 [details]
> alltmplogs.txt.gz
>
> (In reply to Jiri Konecny from comment #2)
> > Could you please provide us other missing logs from /tmp/*.log as plain text
> > files. Also it would be nice to see output of the `cat
> > /sys/class/tty/<lowest tty>/active` command.
>
> I gathered all /tmp/*log in attached alltmplogs.txt.gz
> that also contains the cat /sys/class/tty/tty0/active
> === /sys/class/tty/tty0/active:
> tty1
> === /sys/class/tty/tty0/active end

Are there some other /sys/class/*/ devices that have "active" but it's empty? That's what's going on IMHO. The piece of code:

    def get_active_console(dev="console"):
        '''Find the active console device.

        Some tty devices (/dev/console, /dev/tty0) aren't actual devices;
        they just redirect input and output to the real console device(s).

        These 'fake' ttys have an 'active' sysfs attribute, which lists the real
        console device(s). (If there's more than one, the *last* one in the list
        is the primary console.)
        '''
        # If there's an 'active' attribute, this is a fake console..
        while os.path.exists("/sys/class/tty/%s/active" % dev):
            # So read the name of the real, primary console out of the file.
            dev = open("/sys/class/tty/%s/active" % dev).read().split()[-1]
        return dev

Most importantly this part:

        # If there's an 'active' attribute, this is a fake console..
        while os.path.exists("/sys/class/tty/%s/active" % dev):
            # So read the name of the real, primary console out of the file.
            dev = open("/sys/class/tty/%s/active" % dev).read().split()[-1]

We basically iterate over all tty-like devices known to the system and if they have the active "file", we try to read it, expecting it to be nonempty. Basically:

    In [7]: "tty2".split()[-1]
    Out[7]: 'tty2'

    In [8]: "".split()[-1]
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    <ipython-input-8-1de257272cc7> in <module>()
    ----> 1 "".split()[-1]

    IndexError: list index out of range

So even one tty-like device with an active file that is empty is enough to crash this piece of code.

The question is, what is the proper fix? We can quite easily change the code to handle an empty active file, but maybe the issue is that there are tty-like devices that have active but it's empty? In such a case this could be a bug in the kernel or the tty subsystem.

(In reply to Martin Kolman from comment #4)
> [CUT]...
> Are there some other /sys/class/*/ devices that have "active" but it's empty
> ?
>

As per the output below, only /sys/class/tty has active files, and "console" has an empty file.

============================
[anaconda root@localhost /]# ls /sys/class/*/*/active
/sys/class/tty/console/active /sys/class/tty/tty0/active
[anaconda root@localhost /]# for xx in $(ls /sys/class/*/*/active); do echo "=== $xx:"; cat $xx; echo "=== $xx end"; done
=== /sys/class/tty/console/active:
=== /sys/class/tty/console/active end
=== /sys/class/tty/tty0/active:
tty1
=== /sys/class/tty/tty0/active end
============================

(In reply to Michel Normand from comment #5)
> (In reply to Martin Kolman from comment #4)
> > [CUT]...
> > Are there some other /sys/class/*/ devices that have "active" but it's empty
> > ?
> >
>
> As per output below, only /sys/class/tty with activate file, and "console"
> has empty file.
> ============================
> [anaconda root@localhost /]# ls /sys/class/*/*/active
> /sys/class/tty/console/active /sys/class/tty/tty0/active
> [anaconda root@localhost /]# for xx in $(ls /sys/class/*/*/active); do echo
> "=== $xx:"; cat $xx; echo "=== $xx end"; done
> === /sys/class/tty/console/active:
> === /sys/class/tty/console/active end
> === /sys/class/tty/tty0/active:
> tty1
> === /sys/class/tty/tty0/active end
> ============================

Looking at the code, things should still work as long as at least one active is nonempty. So I guess we can go the path of fixing Anaconda to ignore an empty active file, and that should fix the crash.

BTW, if this should be fixed in the F28 install media, you will need to request a freeze exception here, as F28 is now in final freeze:
https://qa.fedoraproject.org/blockerbugs/milestone/28/final/buglist

Otherwise the fix will go only to Rawhide.

Proposing this for an FE (thanks for flagging it up Martin), this is actually kinda a significant issue for openQA PPC testing and it'd be best to get it fixed in the release.
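The approach discussed above, ignoring an empty "active" attribute instead of indexing into an empty list, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the change that shipped in anaconda-28.22.10-1: the `sysfs_tty` parameter and the loop guard are hypothetical additions made so the sketch can be exercised against a scratch directory, and returning the current device name when "active" is empty is an assumed fallback.

```python
import os


def get_active_console(dev="console", sysfs_tty="/sys/class/tty"):
    """Follow 'active' sysfs attributes from dev to the real console device.

    sysfs_tty is a hypothetical parameter (not in the original function)
    that lets the lookup be pointed at a test directory.
    """
    seen = set()  # hypothetical guard against devices that refer to each other
    while dev not in seen:
        seen.add(dev)
        active_path = os.path.join(sysfs_tty, dev, "active")
        if not os.path.exists(active_path):
            # No 'active' attribute: dev is a real device, stop here.
            break
        with open(active_path) as f:
            names = f.read().split()
        if not names:
            # Empty 'active' attribute: nothing to follow, so keep the
            # current name instead of raising IndexError (assumed fallback).
            break
        # The last listed device is the primary console.
        dev = names[-1]
    return dev
```

With the sysfs contents reported in comment #5, this sketch would return "console" when started from the default device and "tty1" when started from "tty0".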
Discussed during the 2018-04-23 blocker review meeting: [1]

The decision to classify this bug as an AcceptedFreezeException was made: "this can cause kickstart / console installs to crash on certain systems, and cannot be fixed with a post-release update"

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2018-04-23/f28-blocker-review.2018-04-23-16.00.log.txt

anaconda-28.22.10-1.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-1884c34b53

anaconda-28.22.10-1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-1884c34b53

anaconda-28.22.10-1.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

I verified with the above change that there is no more exception when virt-install is called with the text console parameter, BUT identified a new problem when virt-install is called with the vnc console parameter => new bug https://bugzilla.redhat.com/show_bug.cgi?id=1571860