Bug 1481858 - CVE-2017-8379 fix causes major regression in openQA usage (programmatic typing via VNC)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 26
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-15 21:44 UTC by Adam Williamson
Modified: 2017-08-22 20:43 UTC (History)

Fixed In Version: qemu-2.9.0-5.fc26
Clone Of:
Environment:
Last Closed: 2017-08-22 20:43:11 UTC
Type: Bug
Embargoed:



Description Adam Williamson 2017-08-15 21:44:21 UTC
Since 2017-08-09, openQA has been plagued by failures caused by typing errors: cases where key presses sent by the openQA test driver (os-autoinst) to the virtual machine running the actual test were missed. We've always had such failures, but they're usually quite rare, especially when typing at a console (we have a few tricks to mitigate the problem when typing in graphical desktops); the failure rate would be something like 1-2% of tests. Since 2017-08-10, something like 50% or more of all tests that do any significant amount of typing have been failing.

After some investigation, it looks like the boxes where the tests run were updated on 2017-08-09 (as part of a mass infra update/reboot task), and went from qemu-2.9.0-1.fc26.1 to qemu-2.9.0-3.fc26. The most recent test batch where most tests passed as usual was run with qemu-2.9.0-1.fc26.1; the earliest test batch where a large number of tests failed due to typing errors was run with qemu-2.9.0-3.fc26. I have now forcibly downgraded qemu on the openQA worker host boxes and re-run one set of tests, and for the first time since 2017-08-09, they mostly passed:

https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=25&build=Update-FEDORA-2017-bd0324f3e9&groupid=2

(the failure is not a typing failure but a screenshot that needs updating). So we're fairly confident the qemu update is the cause of the problem.

It seems highly likely the culprit is this commit:

https://github.com/qemu/qemu/commit/fa18f36a461984eae50ab957e47ec78dae3c14fc

which is intended to fix CVE-2017-8379. Cole also notes that a SUSE developer, Alex Graf, has landed three commits since that one which seem very relevant to the specific case of openQA:

https://github.com/qemu/qemu/commit/77b0359bf414ad666d1714dc9888f1017c08e283
https://github.com/qemu/qemu/commit/51dbea77a29ea46173373a6dad4ebd95d4661f42
https://github.com/qemu/qemu/commit/d3b0db6dfea6b3a9ee0d96aceb796bdcafa84314

so we're going to check whether applying those commits on top of the 'limit kbd queue depth' commit makes things behave better. I'll do a scratch build and test this soon.
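
For illustration only, here is a toy model of how a depth limit on a keyboard queue could lose programmatically typed keys when the sender outpaces the guest. This is not qemu's code: the class name, the queue depth and the drain rate are all made up for the sketch, and it assumes that events arriving at a full queue are simply discarded.

```python
from collections import deque


class BoundedKeyQueue:
    """Hypothetical input queue with a fixed maximum depth."""

    def __init__(self, max_depth):
        self.max_depth = max_depth
        self.events = deque()
        self.dropped = 0

    def push(self, key):
        # Assumption: a full queue discards the new event instead of growing.
        if len(self.events) >= self.max_depth:
            self.dropped += 1
            return False
        self.events.append(key)
        return True

    def pop(self):
        return self.events.popleft() if self.events else None


def type_string(queue, text, drain_every):
    """Send text one key at a time; the simulated guest consumes
    one key after every `drain_every` keys sent."""
    received = []
    for i, ch in enumerate(text, start=1):
        queue.push(ch)
        if i % drain_every == 0:
            key = queue.pop()
            if key is not None:
                received.append(key)
    # Once the sender is done, the guest reads whatever is still queued.
    while queue.events:
        received.append(queue.pop())
    return "".join(received)


if __name__ == "__main__":
    shallow = BoundedKeyQueue(max_depth=4)
    print(repr(type_string(shallow, "hello world", drain_every=3)), shallow.dropped)

    deep = BoundedKeyQueue(max_depth=64)
    print(repr(type_string(deep, "hello world", drain_every=3)), deep.dropped)
```

With the shallow queue some characters never reach the guest, which is the kind of symptom the failing tests show; with a queue deep enough for the whole burst, nothing is lost.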

Comment 1 Adam Williamson 2017-08-16 06:23:45 UTC
So I've deployed a scratch build with those three patches applied on openQA staging, and it seems to be behaving *much* better so far. Will plan with Cole tomorrow what to do about sending out updates etc.

Comment 2 Fedora Update System 2017-08-16 22:22:50 UTC
qemu-2.9.0-5.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-a314d15e62

Comment 3 Fedora Update System 2017-08-19 18:54:01 UTC
qemu-2.9.0-5.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-a314d15e62

Comment 4 Fedora Update System 2017-08-22 20:43:11 UTC
qemu-2.9.0-5.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

