Bug 967502 - harness should handle a misconfigured serial console
Status: CLOSED CURRENTRELEASE
Product: Beaker
Classification: Community
Component: beah
Version: 0.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 21.0
Assigned To: Dan Callaghan
QA Contact: tools-bugs
Whiteboard: Misc
Keywords: NeedsTestCase, Patch
Duplicates: 1136141
Reported: 2013-05-27 06:12 EDT by Jan Stancek
Modified: 2015-08-26 02:17 EDT
Doc Type: Bug Fix
Last Closed: 2015-08-26 02:17:18 EDT
Type: Bug


Description Jan Stancek 2013-05-27 06:12:44 EDT
Description of problem:
If a system has a misconfigured serial console, beah can't run any tests and all of them fail with:

Redirecting to /bin/systemctl status  rhts-compat.service
+++ basename /var/lib/beah/tortilla/wrappers.d/runtest
++ tortilla get-next-wrapper runtest
+ WRAP_NEXT=
+ /usr/bin/rhts-test-runner.sh
logger: /usr/bin/rhts-test-runner.sh rhts-extend lab-02 12662707 5400
logger: /usr/bin/rhts-test-runner.sh rhts-test-checkin 127.0.0.1:7091 intel-mccreary-01 422430 /kernel/networking/kdump 5400 12662707
Traceback (most recent call last):
  File "/usr/bin/rhts-test-checkin", line 41, in <module>
    res = os.write(fd, "%s JobID:%s Test:%s Response:%s\n" % (timetext, jobid, test, resp))
OSError: [Errno 5] Input/output error
+ rc=1
+ '[' -n '' ']'
+ exit 1


Version-Release number of selected component (if applicable):
0.12

How reproducible:
100%

Steps to Reproduce:
1. Run any test on a system with a misconfigured serial console, such that any write to /dev/console fails with "Input/output error".

Actual results:
beah fails to run every test

Expected results:
beah should report failure to write to stdout (/dev/console), but this shouldn't stop tests from running
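
A minimal sketch of the expected behaviour, written for Python 3 rather than the Python 2 that beah ran on at the time. The helper names write_best_effort and _warn are hypothetical and do not come from beah or rhts. The idea is that a failed open or write is reported and then ignored, so the caller can carry on:

```python
import os
import sys


def _warn(message):
    """Report a problem on stderr, ignoring the case where stderr is broken too."""
    try:
        sys.stderr.write(message + "\n")
    except (OSError, ValueError):
        pass


def write_best_effort(path, text):
    """Write text to path; return True on success, False on any OS error."""
    try:
        # O_NOCTTY: do not let a terminal device become our controlling tty.
        fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_NOCTTY)
    except OSError as exc:
        _warn("cannot open %s: %s" % (path, exc))
        return False
    try:
        os.write(fd, text.encode("utf-8"))
        return True
    except OSError as exc:
        # A console pointed at a disconnected serial port fails here with EIO.
        _warn("cannot write to %s: %s" % (path, exc))
        return False
    finally:
        os.close(fd)
```

A caller would use the return value only for reporting, for example write_best_effort("/dev/console", "task started\n"), and continue running the task either way.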

Additional info:
Bug 966942 - /dev/ttyS0: not a character device, starting with 3.10.0-0.rc1.56.el7.x86_64
Comment 1 Dan Callaghan 2014-09-01 22:28:48 EDT
/dev/console is not a real file and we really shouldn't be pretending it is. beah also needs to handle the case where open("/dev/console") fails with ENODEV or EIO [1].

On systemd distros we should just stop all this logging business and write everything to stdout/stderr, and let the systemd journal capture it and forward it to the console. I think restraint does this currently. Then the quirks of the console are systemd's problem.

For RHEL6 and earlier we may need to introduce a new beah config option for logging to the console, using a different code path (not the stdlib logging file handler) that is resilient to the various failure cases with /dev/console. In this case we will also need to be careful about redirecting stdout/stderr for child processes -- see the traceback in comment 0 for an example of why.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554172/comments/245
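
One way such a resilient code path could look, as a sketch only: a logging handler that reopens the target for every record and swallows I/O errors instead of keeping a long-lived file object. The class name BestEffortFileHandler and its dropped counter are assumptions for illustration, not beah's actual implementation, and the code targets Python 3.

```python
import logging
import os


class BestEffortFileHandler(logging.Handler):
    """Hypothetical handler: open the target per record, ignore I/O errors."""

    def __init__(self, path):
        logging.Handler.__init__(self)
        self.path = path
        self.dropped = 0  # number of records lost to I/O errors

    def emit(self, record):
        try:
            data = (self.format(record) + "\n").encode("utf-8")
            fd = os.open(self.path, os.O_WRONLY | os.O_APPEND | os.O_NOCTTY)
            try:
                os.write(fd, data)
            finally:
                os.close(fd)
        except OSError:
            # ENODEV, EIO and similar: lose the message, keep the process alive.
            self.dropped += 1
```

Reopening on every record costs an extra system call per message, but it means a console that starts working later is picked up without restarting the service.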
Comment 2 Dan Callaghan 2014-09-01 22:34:09 EDT
I forgot the context for comment 1... We had a report of a similar problem, with the harness services failing to start on RHEL7 due to IOError opening /dev/console. It is essentially the same issue (/dev/console has some weird behaviour, we can't treat it like a regular file).

systemd[1]: Starting The Beaker backend server....
systemd[1]: Started The Beaker backend server..
beah-beaker-backend[1319]: --- WARNING: Value for DEFAULT.HOSTNAME (None) is not an string.
beah-beaker-backend[1319]: Traceback (most recent call last):
beah-beaker-backend[1319]: File "/usr/bin/beah-beaker-backend", line 9, in <module>
beah-beaker-backend[1319]: load_entry_point('beah==0.7.3.dev201403262029', 'console_scripts', 'beah-beaker-backend')()
beah-beaker-backend[1319]: File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 2070, in main
beah-beaker-backend[1319]: log_handler()
beah-beaker-backend[1319]: File "/usr/lib/python2.7/site-packages/beah/wires/internals/twbackend.py", line 90, in log_handler
beah-beaker-backend[1319]: console=parse_bool(conf.get('DEFAULT', 'CONSOLE_LOG')))
beah-beaker-backend[1319]: File "/usr/lib/python2.7/site-packages/beah/misc/__init__.py", line 214, in make_log_handler
beah-beaker-backend[1319]: lhandler = logging.FileHandler('/dev/console')
beah-beaker-backend[1319]: File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
beah-beaker-backend[1319]: StreamHandler.__init__(self, self._open())
beah-beaker-backend[1319]: File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
beah-beaker-backend[1319]: stream = open(self.baseFilename, self.mode)
beah-beaker-backend[1319]: IOError: [Errno 5] Input/output error: '/dev/console'
systemd[1]: beah-beaker-backend.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Unit beah-beaker-backend.service entered failed state.
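
The traceback shows the service dying while constructing logging.FileHandler('/dev/console'). A minimal sketch of guarding that construction, assuming a fallback to stderr is acceptable (on systemd that ends up in the journal). The function name make_console_handler is hypothetical and this is not the patch that was eventually merged:

```python
import logging
import sys


def make_console_handler(path="/dev/console"):
    """Return a FileHandler for path, or a stderr StreamHandler if opening fails."""
    try:
        return logging.FileHandler(path)
    except (IOError, OSError):
        # Opening the console failed (EIO, ENODEV, ...): log to stderr instead.
        return logging.StreamHandler(sys.stderr)
```

This only protects startup. Writes that fail later are already routed through the handler's handleError() by the stdlib StreamHandler, so they do not terminate the process.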
Comment 3 Nick Coghlan 2014-09-01 22:45:51 EDT
I agree relying on systemd is the right thing to do for systems that offer it, but I'm less convinced it's worth the hassle of coming up with our own solution for older RHEL versions.

I'll split this into two issues (this one for systemd based systems, a new one for RHEL6 and earlier), as I'd like to separate the two decisions.
Comment 5 Dan Callaghan 2014-09-02 19:41:52 EDT
The cloned bug for dealing with /dev/console on RHEL6 and earlier is bug 1136141.
Comment 6 Dan Callaghan 2015-07-21 02:02:12 EDT
The simplest way to reproduce this (broken /dev/console) is to run a recipe on x86_64 with kernel_options_post="console=ttyS2". Most Beaker systems have a serial console on ttyS0 and then there are four standard PC serial ports on ttyS[1-4] which are not hooked up to anything. Using ttyS2 is probably the most reliable way to get a port which is definitely not connected because a few systems have ttyS1 as a real port too.

When /dev/console is pointed at a serial port which is not connected, every write to it will fail with EIO (Input/output error) which is the same error condition in comment 0.
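
To confirm that a system is actually in this state after booting with console=ttyS2, a small probe along these lines could be run on it. This is a sketch with a hypothetical function name, probe_write. It performs one real write, so a newline will appear on a working console.

```python
import errno
import os


def probe_write(path="/dev/console", data=b"\n"):
    """Try one write to path; return None on success or the errno name on failure."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NOCTTY)
        try:
            os.write(fd, data)
        finally:
            os.close(fd)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

On a system showing the failure described in this bug, the probe would be expected to return "EIO".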
Comment 7 Dan Callaghan 2015-07-21 02:12:20 EDT
The actual problem that causes tasks to fail is not inside beah itself, but rather in a few of the rhts scripts (rhts-test-checkin in particular), which try to write to /dev/console directly and don't suppress any errors when that fails.

Since we don't keep separate branches of rhts for every RHEL release we will actually need to solve this for all releases, not just RHEL7. Hence duping the cloned bug back onto this one.
Comment 8 Dan Callaghan 2015-07-21 02:13:55 EDT
*** Bug 1136141 has been marked as a duplicate of this bug. ***
Comment 9 Dan Callaghan 2015-07-21 02:45:05 EDT
(In reply to Dan Callaghan from comment #6)
> The simplest way to reproduce this (broken /dev/console) is to run a recipe
> on x86_64 with kernel_options_post="console=ttyS2".

Note that on RHEL6 this does not work because Anaconda also injects the proper console argument (console=ttyS0 or whatever it is) on the kernel command line ahead of console=ttyS2 and the earlier option wins.

However even with a more aggressive approach like:

<ks_appends>
<ks_append><![CDATA[
%post
sed -i -r -e 's/console=\S+/console=ttyS2/' /boot/grub/grub.conf
%end
]]></ks_append>
</ks_appends>

there are no errors when writing to /dev/console. Instead the output seems to be silently discarded. It seems like the earlier kernel's behaviour was a lot more forgiving of a misconfigured /dev/console.

So I don't currently have any way to reproduce this on RHEL < 7.
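
When experimenting with this, it helps to check which console= arguments actually ended up on the kernel command line after the reboot, and in which order. A small sketch, with hypothetical helper names (console_args, read_cmdline), that only lists the arguments and makes no claim about which one the kernel picks:

```python
def console_args(cmdline):
    """Return the values of all console= arguments, in command-line order."""
    prefix = "console="
    return [word[len(prefix):] for word in cmdline.split()
            if word.startswith(prefix)]


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the test system this would be called as console_args(read_cmdline()).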
Comment 10 Dan Callaghan 2015-07-21 03:06:07 EDT
(And RHEL5 with a misconfigured console= doesn't even boot so...)
Comment 11 Dan Callaghan 2015-07-21 03:15:38 EDT
(In reply to Dan Callaghan from comment #10)
> (And RHEL5 with a misconfigured console= doesn't even boot so...)

Correction: it does boot, and then writes to /dev/console return EIO, the same behaviour as RHEL7.
Comment 12 Dan Callaghan 2015-07-21 03:37:57 EDT
Attempt to handle errors from /dev/console in beah:
http://gerrit.beaker-project.org/4304
(But note that this is still just a best-effort attempt in case of transient errors. If /dev/console is broken, beah will still proceed but logs will be lost.)

On systemd distros, make systemd write to /dev/console for us:
http://gerrit.beaker-project.org/4305
(although this doesn't actually help to simplify matters much since we need to keep all beah's logging infrastructure for non-systemd distros)

In rhts, avoid writing to /dev/console:
http://gerrit.beaker-project.org/4306
This is a behaviour change, but I think overall a positive one since it reduces the amount of different spew on the console (that message will go to syslog/journal, and also to stderr which is captured by beah). It also means we don't need to make all the rhts-* scripts deal with errors when writing to /dev/console.
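
As an illustration of the rhts change described above, the check-in message can go to syslog and stderr instead of /dev/console. This is a sketch, not the contents of the gerrit patch. The message format is taken from the traceback in comment 0, while the function names format_checkin and report are hypothetical:

```python
import sys
import syslog


def format_checkin(timetext, jobid, test, resp):
    """Build the check-in line in the format shown in comment 0."""
    return "%s JobID:%s Test:%s Response:%s\n" % (timetext, jobid, test, resp)


def report(message, stream=None):
    """Send message to syslog and to stream (stderr by default), not /dev/console."""
    syslog.syslog(syslog.LOG_INFO, message.rstrip("\n"))
    out = sys.stderr if stream is None else stream
    out.write(message)
```

Because stderr is captured by the harness and syslog does not depend on the serial port, neither path fails when the console is misconfigured.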
Comment 13 Dan Callaghan 2015-07-21 03:39:07 EDT
And lastly, make Beaker only set CONSOLE_LOG=Console for non-systemd distros:
http://gerrit.beaker-project.org/4302
This is just a tidy-up, since the CONSOLE_LOG option will be overridden anyway with the beah patch above.
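
The decision itself is simple. The sketch below swaps in a runtime check for illustration, which is an assumption: Beaker makes this choice when generating the recipe's configuration, and that code is not reproduced here. The helper names are hypothetical. The /run/systemd/system directory test is the check documented for sd_booted().

```python
import os


def systemd_booted(run_dir="/run/systemd/system"):
    """Return True if the system was booted with systemd."""
    return os.path.isdir(run_dir)


def console_log_settings(is_systemd):
    """Return harness settings: CONSOLE_LOG=Console only without systemd."""
    settings = {}
    if not is_systemd:
        settings["CONSOLE_LOG"] = "Console"
    return settings
```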
Comment 14 Dan Callaghan 2015-07-21 18:53:00 EDT
On RHEL5 (and earlier I guess) when rhts-compat is enabled, which is the default, all tasks will break because rhts-compat runs with stdout hooked up to the console and many, many different scripts inside rhts-compat write to stdout. It seems that at some point (I can't figure out exactly where) a script dies without actually executing the task.

So these patches do not fix the case of RHEL5, with broken console, and rhts-compat enabled. But it doesn't seem worthwhile trying to fix that at this point.
Comment 15 Dan Callaghan 2015-07-21 19:56:21 EDT
Verification strategy:
# Recipe on RHEL5, rhts-compat disabled
# Recipe on RHEL5 with bad console, rhts-compat disabled
# Recipe on RHEL6
# Recipe on RHEL7
# Recipe on RHEL7 with bad console
Each recipe has /distribution/install plus some /distribution/dummy tasks to give the harness some work to do.

Expected results:
For recipes with bad console, console.log will be empty after installation finishes and the system reboots. However the recipe will run normally with no failures. Some "Input/output error" tracebacks may appear in debug/task_beah_unexpected as beah fails to log to the console.
For recipes with normal console, console.log will show the beah/rhts log messages as tasks are executed.
On RHEL7 all beah/rhts messages will also be accessible from the systemd journal (journalctl -u beah-srv) regardless of whether the console is working or not.
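
For the RHEL7 checks, the journal can be read and scanned from a script. A sketch with hypothetical helper names. It uses only the journalctl -u option mentioned above plus --no-pager, and counts lines containing the "Input/output error" text from the tracebacks:

```python
import subprocess


def journal_command(unit="beah-srv"):
    """Build the journalctl command line for one unit."""
    return ["journalctl", "--no-pager", "-u", unit]


def read_journal(unit="beah-srv"):
    """Return the unit's journal as text (requires a systemd system)."""
    output = subprocess.check_output(journal_command(unit))
    return output.decode("utf-8", "replace")


def count_io_errors(text):
    """Count lines that mention an Input/output error."""
    return sum(1 for line in text.splitlines() if "Input/output error" in line)
```

The same count_io_errors helper can be applied to the contents of debug/task_beah_unexpected once that log has been downloaded.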
Comment 19 Dan Callaghan 2015-08-26 02:17:18 EDT
Beaker 21.0 has been released.
