Description of problem:
If a system has a misconfigured serial console then beah can't run any tests, and all of them fail with:

Redirecting to /bin/systemctl status rhts-compat.service
+++ basename /var/lib/beah/tortilla/wrappers.d/runtest
++ tortilla get-next-wrapper runtest
+ WRAP_NEXT=
+ /usr/bin/rhts-test-runner.sh
logger: /usr/bin/rhts-test-runner.sh rhts-extend lab-02 12662707 5400
logger: /usr/bin/rhts-test-runner.sh rhts-test-checkin 127.0.0.1:7091 intel-mccreary-01 422430 /kernel/networking/kdump 5400 12662707
Traceback (most recent call last):
  File "/usr/bin/rhts-test-checkin", line 41, in <module>
    res = os.write(fd, "%s JobID:%s Test:%s Response:%s\n" % (timetext, jobid, test, resp))
OSError: [Errno 5] Input/output error
+ rc=1
+ '[' -n '' ']'
+ exit 1

Version-Release number of selected component (if applicable):
0.12

How reproducible:
100%

Steps to Reproduce:
1. Run any test on a system with a misconfigured serial console, such that any write to /dev/console fails with "Input/output error"

Actual results:
beah fails to run every test

Expected results:
beah should report the failure to write to stdout (/dev/console), but this shouldn't stop tests from running

Additional info:
Bug 966942 - /dev/ttyS0: not a character device, starting with 3.10.0-0.rc1.56.el7.x86_64
/dev/console is not a real file and we really shouldn't be pretending it is. beah also needs to handle the case where open("/dev/console") fails with ENODEV or EIO [1].

On systemd distros we should just stop all this logging business and write everything to stdout/stderr, and let the systemd journal capture it and forward it to the console. I think restraint does this currently. Then the quirks of the console are systemd's problem.

For RHEL6 and earlier we may need to introduce a new beah config option for logging to the console, which uses a different code path (not the stdlib logging file handler) which is resilient to the various failure cases with /dev/console. In this case we will also need to be careful about redirecting stdout/stderr for child processes -- see the traceback in comment 0 for an example of why.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554172/comments/245
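For illustration only, here is a minimal sketch of what a "resilient" console code path could look like. It is not the code from any beah patch: the class name BestEffortConsoleHandler and its dropped counter are hypothetical, and it is written for Python 3 whereas the tracebacks in this bug come from Python 2. The idea is to open the device for each record and swallow any OSError (EIO, ENODEV and so on), so that a broken console costs log lines but never an exception.

```python
import logging
import os


class BestEffortConsoleHandler(logging.Handler):
    """Hypothetical handler: write records to a device path, never raise on I/O errors."""

    def __init__(self, path="/dev/console"):
        super().__init__()
        self.path = path
        self.dropped = 0  # count of records lost because the device was unusable

    def emit(self, record):
        try:
            data = (self.format(record) + "\n").encode("utf-8", "replace")
            # Open per record: nothing is held open across a console that
            # breaks or recovers, and open() itself may fail with EIO/ENODEV.
            fd = os.open(self.path, os.O_WRONLY | os.O_APPEND | os.O_NOCTTY)
            try:
                os.write(fd, data)
            finally:
                os.close(fd)
        except OSError:
            # Best effort only: the record is lost, the caller keeps running.
            self.dropped += 1
```

Attaching it is the same as for any other handler, e.g. logging.getLogger().addHandler(BestEffortConsoleHandler()). Opening the device once per record is slower than keeping a stream open, but it avoids failing at start-up the way logging.FileHandler('/dev/console') does when open() returns EIO.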
I forgot the context for comment 1... We had a report of a similar problem, with the harness services failing to start on RHEL7 due to IOError opening /dev/console. It is essentially the same issue (/dev/console has some weird behaviour, we can't treat it like a regular file).

systemd[1]: Starting The Beaker backend server....
systemd[1]: Started The Beaker backend server..
beah-beaker-backend[1319]: --- WARNING: Value for DEFAULT.HOSTNAME (None) is not an string.
beah-beaker-backend[1319]: Traceback (most recent call last):
beah-beaker-backend[1319]:   File "/usr/bin/beah-beaker-backend", line 9, in <module>
beah-beaker-backend[1319]:     load_entry_point('beah==0.7.3.dev201403262029', 'console_scripts', 'beah-beaker-backend')()
beah-beaker-backend[1319]:   File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 2070, in main
beah-beaker-backend[1319]:     log_handler()
beah-beaker-backend[1319]:   File "/usr/lib/python2.7/site-packages/beah/wires/internals/twbackend.py", line 90, in log_handler
beah-beaker-backend[1319]:     console=parse_bool(conf.get('DEFAULT', 'CONSOLE_LOG')))
beah-beaker-backend[1319]:   File "/usr/lib/python2.7/site-packages/beah/misc/__init__.py", line 214, in make_log_handler
beah-beaker-backend[1319]:     lhandler = logging.FileHandler('/dev/console')
beah-beaker-backend[1319]:   File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
beah-beaker-backend[1319]:     StreamHandler.__init__(self, self._open())
beah-beaker-backend[1319]:   File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
beah-beaker-backend[1319]:     stream = open(self.baseFilename, self.mode)
beah-beaker-backend[1319]: IOError: [Errno 5] Input/output error: '/dev/console'
systemd[1]: beah-beaker-backend.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Unit beah-beaker-backend.service entered failed state.
I agree relying on systemd is the right thing to do for systems that offer it, but I'm less convinced it's worth the hassle of coming up with our own solution for older RHEL versions. I'll split this into two issues (this one for systemd based systems, a new one for RHEL6 and earlier), as I'd like to separate the two decisions.
The cloned bug for dealing with /dev/console on RHEL6 and earlier is bug 1136141.
The simplest way to reproduce this (broken /dev/console) is to run a recipe on x86_64 with kernel_options_post="console=ttyS2". Most Beaker systems have a serial console on ttyS0 and then there are four standard PC serial ports on ttyS[1-4] which are not hooked up to anything. Using ttyS2 is probably the most reliable way to get a port which is definitely not connected because a few systems have ttyS1 as a real port too. When /dev/console is pointed at a serial port which is not connected, every write to it will fail with EIO (Input/output error) which is the same error condition in comment 0.
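To confirm that a system really is in the broken state before running a recipe on it, a small probe along these lines could be run as root. This is only a sketch: probe_console is a hypothetical helper, not part of beah or rhts, and it is written for Python 3. It reports the errno name from either the open or the write, since both have been seen to fail with EIO in this bug.

```python
import errno
import os


def probe_console(path="/dev/console", payload=b"\n"):
    """Try to open and write to path.

    Returns None on success, otherwise the errno name (e.g. "EIO").
    Hypothetical diagnostic helper for illustration only.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NOCTTY)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        os.write(fd, payload)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
    return None
```

Calling probe_console() on a system where /dev/console points at an unconnected port would be expected to return "EIO", matching the error in comment 0; on a healthy system it writes a single newline to the console and returns None.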
The actual problem which causes tasks to fail is not inside beah itself, but rather in a few of the rhts scripts (rhts-test-checkin in particular) which try to write to /dev/console directly and don't suppress any errors when that fails. Since we don't keep separate branches of rhts for every RHEL release we will actually need to solve this for all releases, not just RHEL7. Hence duping the cloned bug back onto this one.
*** Bug 1136141 has been marked as a duplicate of this bug. ***
(In reply to Dan Callaghan from comment #6)
> The simplest way to reproduce this (broken /dev/console) is to run a recipe
> on x86_64 with kernel_options_post="console=ttyS2".

Note that on RHEL6 this does not work because Anaconda also injects the proper console argument (console=ttyS0 or whatever it is) on the kernel command line ahead of console=ttyS2 and the earlier option wins. However even with a more aggressive approach like:

<ks_appends>
<ks_append><![CDATA[
%post
sed -i -r -e 's/console=\S+/console=ttyS2/' /boot/grub/grub.conf
%end
]]></ks_append>
</ks_appends>

there are no errors when writing to /dev/console. Instead the output seems to be silently discarded. It seems like the earlier kernel's behaviour was a lot more forgiving of a misconfigured /dev/console. So I don't currently have any way to reproduce this on RHEL < 7.
(And RHEL5 with a misconfigured console= doesn't even boot so...)
(In reply to Dan Callaghan from comment #10)
> (And RHEL5 with a misconfigured console= doesn't even boot so...)

Correction: it does boot, and then writes to /dev/console return EIO, the same behaviour as RHEL7.
Attempt to handle errors from /dev/console in beah:
http://gerrit.beaker-project.org/4304
(but note that this is still just a best-effort attempt in case of transient errors, if /dev/console is broken beah will still proceed but logs will be lost)

On systemd distros, make systemd write to /dev/console for us:
http://gerrit.beaker-project.org/4305
(although this doesn't actually help to simplify matters much since we need to keep all beah's logging infrastructure for non-systemd distros)

In rhts, avoid writing to /dev/console:
http://gerrit.beaker-project.org/4306
This is a behaviour change, but I think overall a positive one since it reduces the amount of different spew on the console (that message will go to syslog/journal, and also to stderr which is captured by beah). It also means we don't need to make all the rhts-* scripts deal with errors when writing to /dev/console.
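As a rough illustration of the rhts change described above (not the actual patch, which is only linked here), the check-in message could be sent to syslog and stderr instead of /dev/console. The function name report_checkin and its stream parameter are hypothetical; the message format is the one visible in the traceback in comment 0. The sketch is Python 3 and assumes a Unix system where the stdlib syslog module is available.

```python
import sys
import syslog


def report_checkin(timetext, jobid, test, resp, stream=None):
    """Hypothetical replacement for the direct /dev/console write.

    Sends the message to syslog (and so to the journal on systemd
    distros) and to stderr, which the harness captures.
    """
    message = "%s JobID:%s Test:%s Response:%s" % (timetext, jobid, test, resp)
    syslog.syslog(syslog.LOG_INFO, message)
    if stream is None:
        stream = sys.stderr
    try:
        stream.write(message + "\n")
        stream.flush()
    except (OSError, ValueError):
        # stderr may itself be attached to a broken console (OSError)
        # or already closed (ValueError); do not let that fail the task.
        pass
    return message
```

The try/except around the stderr write matters because, as noted for rhts-compat, stderr can itself end up hooked to the console.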
And lastly, make Beaker only set CONSOLE_LOG=Console for non-systemd distros: http://gerrit.beaker-project.org/4302 This is just a tidy-up, since the CONSOLE_LOG option will be overridden anyway with the beah patch above.
On RHEL5 (and earlier I guess) when rhts-compat is enabled, which is the default, all tasks will break because rhts-compat runs with stdout hooked up to the console and many, many different scripts inside rhts-compat write to stdout. It seems that at some point (I can't figure out exactly where) a script dies without actually executing the task. So these patches do not fix the case of RHEL5, with broken console, and rhts-compat enabled. But it doesn't seem worthwhile trying to fix that at this point.
Verification strategy:

# Recipe on RHEL5, rhts-compat disabled
# Recipe on RHEL5 with bad console, rhts-compat disabled
# Recipe on RHEL6
# Recipe on RHEL7
# Recipe on RHEL7 with bad console

Each recipe has /distribution/install plus some /distribution/dummy tasks to give the harness some work to do.

Expected results:

For recipes with bad console, console.log will be empty after installation finishes and the system reboots. However the recipe will run normally with no failures. Some "Input/output error" tracebacks may appear in debug/task_beah_unexpected as beah fails to log to the console.

For recipes with normal console, console.log will show the beah/rhts log messages as tasks are executed.

On RHEL7 all beah/rhts messages will also be accessible from the systemd journal (journalctl -u beah-srv) regardless of whether the console is working or not.
Beaker 21.0 has been released.