Bug 1393961
Summary: | redhat.py:67:__init__:KeyError: 'filesystem' | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Paulo Andrade <pandrade> | |
Component: | sos | Assignee: | Pavel Moravec <pmoravec> | |
Status: | CLOSED ERRATA | QA Contact: | Miroslav Hradílek <mhradile> | |
Severity: | medium | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 7.2 | CC: | agk, bmr, dkochuka, gavin, isenfeld, mhradile, michele, plambri, pmoravec, sbradley, supergallego31, xnie, yozone | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | All | |||
OS: | All | |||
URL: | https://github.com/sosreport/sos/pull/942 | |||
Whiteboard: | ||||
Fixed In Version: | sos-3.4-4.el7 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1441155 1506596 (view as bug list) | Environment: | ||
Last Closed: | 2017-08-01 23:08:12 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1506596 |
Description
Paulo Andrade
2016-11-10 17:36:16 UTC
I'm curious how you ended up with a RHEL system, without the 'filesystem' package present?
# rpm -q --whatrequires filesystem
iputils-20121221-6.el7.x86_64
cpp-4.8.3-9.el7.x86_64
dracut-033-283.el7.x86_64
Is this a container, or other "minimal" environment?
> I believe the condition may be caused by either not having a PATH
> defined, or a timeout.
Was there any evidence to suggest this? A log of a failing run (with "-vv --debug") would be useful in understanding what's going on here.
Unfortunately, the patch suggested papers over the problem - if we failed to get the package list from RPM, we are pretty much hosed: most plugins rely on the package manager data to control enablement, so although this may allow the command to run to completion, it's not going to do anything very useful.
The filesystem package is installed. The first chunk of the pseudo patch would force running the "real" rpm, not what is in the path, or fail if the path is not set. The second chunk I agree is not fully correct. But note that under some conditions it does not trigger an error, and it attempts to create the pkgs variable from an empty string ("") due to the way utilities.py:sos_get_command_output() works. I believe increasing PackageManager.timeout to a value larger than 30, or using the 300 default, could be a good idea.

The sos command is being run from abrt.

I asked the user to test first with every chunk of the proposed pseudo patch; the first one to see if "rpm" is in $PATH, and the second to see if there was a timeout, or other error condition that caused it to create an empty 'pkgs'.

> The first chunk of the pseudo patch would force running
> the "real" rpm, not what is in the path, or fail if the
> path is not set.

That's not how we manage PATH in sos. If this command fails, then a whole bunch of others will fail too.

The cause of this is a change we made to support UsrMove; we need to know the version of the filesystem RPM in order to know the correct policy-defined PATH to set (the point of this is exactly to avoid problems like the one you report).

The problem in this case is that we're calling 'rpm' before we establish PATH, but that's not a problem, since neither rpm nor any of its dependencies are installed under /usr.

I'll add an 'initial path' (effectively PATH|grep -v usr) for the Red Hat policy family, which allows them to run commands from /bin and /sbin for policy bootstrapping.

> The second chunk I agree is not fully correct. But note
> that on some conditions, it is not triggering an error, and
> it attempts to create the pkgs variable with an empty string
> ("") due to the way utilities.py:sos_get_command_output()
> works.
It doesn't trigger an error, but it populates the packages dictionary in a way that's completely useless, and that will prevent almost all plugins from running.

*** Bug 1391615 has been marked as a duplicate of this bug. ***

I'm confused, what exactly is the cause here? How to reproduce it?

Failure to call "rpm" during initial policy set-up, but since we've never been given logs or other steps to reproduce, it's hard to say; with sos-3.2-4.el7 and "mv /bin/rpm /bin/notrpm" I just get:

# /bin/notrpm -q sos
sos-3.3-4.el7.noarch

So at a guess, in the reporter's case, rpm was taking so long that we got only a partial list of packages, and that partial list did not include the 'filesystem' package.

Actually, this little hack seems to work:

# prepare a package list _without_ filesystem
# rpm -qa --queryformat "%{NAME}|%{VERSION}\\n" | grep -v filesystem > /tmp/packages
# save the real rpm
# mv /bin/rpm /bin/notrpm
# cat <<EOF >/bin/rpm
#!/bin/bash
cat /tmp/packages
EOF
# sosreport now obliges:
# /usr/sbin/sosreport
Traceback (most recent call last):
  File "/usr/sbin/sosreport", line 25, in <module>
    main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1636, in main
    sos = SoSReport(args)
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 697, in __init__
    self.policy = sos.policies.load(sysroot=self.opts.sysroot)
  File "/usr/lib/python2.7/site-packages/sos/policies/__init__.py", line 38, in load
    cache['policy'] = policy(sysroot=sysroot)
  File "/usr/lib/python2.7/site-packages/sos/policies/redhat.py", line 158, in __init__
    super(RHELPolicy, self).__init__(sysroot=sysroot)
  File "/usr/lib/python2.7/site-packages/sos/policies/redhat.py", line 67, in __init__
    if pkgs['filesystem']['version'][0] == '3':
KeyError: 'filesystem'

Err, with sos-3.2-4.el7 and "mv /bin/rpm /bin/notrpm" I just get:

# sosreport
Could not obtain installed package list

Thanks Bryn.

*** Bug 1355986 has been marked as a duplicate of this bug. ***

Planned to be in 7.4.
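The failure mode reproduced above can be sketched in a few lines of Python. This is an illustration only, not the sos source: parse_package_list() and uses_usrmove_strict() are hypothetical names, and the "NAME|VERSION" line format is taken from the rpm query format quoted in the reproducer. It shows that a truncated (or empty) package listing parses without error, and that the KeyError only surfaces later, when the 'filesystem' entry is looked up.

```python
def parse_package_list(output):
    """Parse 'NAME|VERSION' lines into {name: {'name': ..., 'version': [...]}}.

    Hypothetical helper for illustration; not the actual sos implementation.
    """
    pkgs = {}
    for line in output.splitlines():
        if "|" not in line:
            continue  # skip blank or malformed lines
        name, version = line.split("|", 1)
        pkgs[name] = {"name": name, "version": version.split(".")}
    return pkgs


def uses_usrmove_strict(pkgs):
    """Mirror the check from the traceback: raises KeyError if the entry is missing."""
    return pkgs["filesystem"]["version"][0] == "3"


# A complete listing, and one where 'filesystem' is missing (e.g. rpm was cut short).
full_output = "bash|4.2.46\nfilesystem|3.2\nsos|3.3\n"
partial_output = "bash|4.2.46\nsos|3.3\n"

full_pkgs = parse_package_list(full_output)
partial_pkgs = parse_package_list(partial_output)
empty_pkgs = parse_package_list("")  # empty command output parses to an empty dict
```

With these inputs, uses_usrmove_strict(full_pkgs) succeeds, while the same call on partial_pkgs or empty_pkgs raises KeyError: 'filesystem', matching the traceback.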
Upstream PR raised. Trivial workaround evident in the PR.

POSTed to upstream in https://github.com/sosreport/sos/commit/361d663d7d99b02186c3b47bf144f55c7080198f

It cannot occur on RHEL5 as that release is still shipping sos-1.7, which lacks timeout support, and coreutils in RHEL5 does not include the "timeout" command (which sos uses to implement the feature).

I've cloned this as bug 1441155 for RHEL6.

I think the test may be broken - if we hit the error (that 'filesystem' is not in the package list because of a failure to run rpm, or a failure to retrieve full package data), then we cannot ever create a report - the expectation that output contains "send this file to your support representative" is false.

That said, I just checked the upstream patch and it is incomplete:

commit 361d663d7d99b02186c3b47bf144f55c7080198f
Author: Pavel Moravec <pmoravec>
Date:   Thu Feb 23 17:53:17 2017 +0100

    [policies] get package list without a timeout

    Package list shall never timeout, otherwise pkgs gets empty, causing
    KeyError later on.

    Resolves: #942

    Signed-off-by: Pavel Moravec <pmoravec>

As well as disabling the timeout we need to fail immediately (with a useful message) if the package set does not contain the 'filesystem' package.

The only thing that we actually depend on 'filesystem' for is whether or not to enable the UsrMove path handling in policy/redhat. I'm testing a patch which changes this behaviour to assume UsrMove if we cannot find 'filesystem' - that way we do go on and generate a report, rather than treating this as a fatal error.

This seems a better approach - if we cannot get any package data at all then many other things will break, but if we're just missing one, or a small number of packages, then we should at least try to continue.

I'll test this a bit more here and push it upstream later on (it also has the side effect of making the test assumptions true again ;).
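The fallback described in the last comment could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the patch that was merged: usrmove_enabled() is a hypothetical name, and the package dictionary shape (a 'version' list whose first element is the major version) is inferred from the traceback earlier in this bug.

```python
def usrmove_enabled(pkgs):
    """Decide whether UsrMove path handling should be used.

    Hypothetical helper: 'pkgs' maps package names to dicts holding a
    'version' list, as implied by pkgs['filesystem']['version'][0].
    """
    filesystem = pkgs.get("filesystem")
    if filesystem is None:
        # Per the comment above: if 'filesystem' cannot be found, assume
        # UsrMove and carry on rather than failing with KeyError.
        return True
    return filesystem["version"][0] == "3"


# Example inputs: a missing entry, a version 3 filesystem, and an older one.
missing = usrmove_enabled({})
modern = usrmove_enabled({"filesystem": {"version": ["3", "2"]}})
legacy = usrmove_enabled({"filesystem": {"version": ["2", "4", "30"]}})
```

The design choice here is to treat a missing 'filesystem' entry as a recoverable condition: report generation continues with a default, while a completely empty package list would still need to be caught and reported separately.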
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2203