Bug 1566089
| Summary: | oscap --remediate hangs at autofs | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Thomas Jones <redhat> |
| Component: | openscap | Assignee: | Matus Marhefka <mmarhefk> |
| Status: | CLOSED CANTFIX | QA Contact: | BaseOS QE Security Team <qe-baseos-security> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.5 | CC: | jcerny, loren, mhaicman, openscap-maint |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-18 09:41:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Thomas Jones
2018-04-11 13:47:28 UTC
FWIW, here's the same command but using the SCAP content distributed by the scap-security-guide RPM, resulting in the same issue.

Run:

```
oscap xccdf eval --remediate --profile xccdf_org.ssgproject.content_profile_C2S /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml
```

Hangs in the same place:

```
Title   Disable the Automounter
Rule    xccdf_org.ssgproject.content_rule_service_autofs_disabled
Ident   CCE-27498-5
```

Hello,

I was unable to reproduce this issue. I suppose you are using datastream from scap-security-guide-0.1.36-7.el7.noarch? Can you try running remediation only for this single rule:

$ sudo oscap xccdf eval --remediate --profile xccdf_org.ssgproject.content_profile_C2S --rule xccdf_org.ssgproject.content_rule_service_autofs_disabled /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml

so we are 100% sure that previous rules/remediations do not have any influence in that hanging issue. Also based on your reports it is not clear if the hanging issue appears during the scan or after the scan during remediation process. When looking at the remediation in ssg-rhel7-ds.xml for this rule, it runs the following command:

systemctl disable autofs

Can you try to execute the following commands manually and report their output:

$ sudo systemctl status autofs
$ sudo systemctl disable autofs
$ sudo systemctl status autofs

Thanks, yes, we are using the datastream from scap-security-guide-0.1.36-7.el7.noarch. The hanging issue appears during the scan portion, before any remediation begins. The issue occurs even just running the single rule, using the command you provided. autofs doesn't look to be installed...

```
# systemctl status autofs
Unit autofs.service could not be found.
# rpm -q autofs
package autofs is not installed
```

I went ahead and installed autofs, but that did not resolve the problem. It was installed in the disabled state, so I enabled it and tried again. Still no dice. It was enabled but dead, so I started it. Continued to hang.
Still searching for a cause here...

(In reply to Matus Marhefka from comment #3)
> Hello,
>
> I was unable to reproduce this issue. I suppose you are using datastream
> from scap-security-guide-0.1.36-7.el7.noarch? Can you try running
> remediation only for this single rule:
>
> $ sudo oscap xccdf eval --remediate --profile
> xccdf_org.ssgproject.content_profile_C2S --rule
> xccdf_org.ssgproject.content_rule_service_autofs_disabled
> /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml

Did some additional testing on a new test-rig. Looks like running `oscap xccdf eval --remediate --profile xccdf_org.ssgproject.content_profile_C2S --rule xccdf_org.ssgproject.content_rule_service_autofs_disabled /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml` doesn't so much hang/fail as it attempts to make a call-out to external content, pauses while that times out, but then ultimately returns:

```
# strace -D -f -ff -t -y -yy -o oscap-remediate.trace-$(date "+%Y%m%d%H%M") oscap xccdf eval --remediate --profile xccdf_org.ssgproject.content_profile_C2S --rule xccdf_org.ssgproject.content_rule_service_autofs_disabled /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml
WARNING: This content points out to the remote resources. Use `--fetch-remote-resources' option to download them.
WARNING: Skipping https://www.redhat.com/security/data/oval/com.redhat.rhsa-RHEL7.xml.bz2 file which is referenced from XCCDF content
Title   Disable the Automounter
Rule    xccdf_org.ssgproject.content_rule_service_autofs_disabled
Ident   CCE-27498-5
Result  pass

 --- Starting Remediation ---
WARNING: This content points out to the remote resources. Use `--fetch-remote-resources' option to download them.
WARNING: Skipping https://www.redhat.com/security/data/oval/com.redhat.rhsa-RHEL7.xml.bz2 file which is referenced from XCCDF content
```

Note: new test rig does not have autofs installed.
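
One way to tell a true hang apart from a long pause that eventually returns is to run the command under a deadline and record how long it took. The following is a minimal sketch, not something used in this report: `run_with_deadline` is a hypothetical helper name, the 600-second limit is an arbitrary assumption, and it relies on the coreutils `timeout` command, which exits with status 124 when the deadline is hit.

```shell
#!/bin/sh
# Sketch only: run_with_deadline is a hypothetical helper, not part of oscap.
# It runs a command under coreutils `timeout` and reports on stderr whether
# the command finished on its own or had to be stopped at the deadline.
run_with_deadline() {
    limit="$1"          # deadline in seconds
    shift
    start=$(date +%s)
    rc=0
    timeout "$limit" "$@" || rc=$?
    elapsed=$(( $(date +%s) - start ))
    if [ "$rc" -eq 124 ]; then
        # 124 is the status `timeout` uses when it stops the command
        echo "timed out after ${limit}s" >&2
    else
        echo "finished in ${elapsed}s with exit code ${rc}" >&2
    fi
    return "$rc"
}

# Example invocation (the 600s limit is an assumption; not run here):
#   run_with_deadline 600 oscap xccdf eval --remediate \
#       --profile xccdf_org.ssgproject.content_profile_C2S \
#       --rule xccdf_org.ssgproject.content_rule_service_autofs_disabled \
#       /usr/share/xml/scap/ssg/content/ssg-rhel7-ds.xml
```

A "finished in ..." line with a large elapsed time would point to a slow step that eventually returns, while "timed out" would suggest the process never completes within the chosen limit.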
While *this* test-EC2 isn't failing on the requested, stand-alone oscap-task, going to try to attach an sos report anyway (our test EC2s will generally align very closely to this current one).

Created attachment 1421436
Exemplar SOS report
Capturing SOS for test-EC2 (our EC2s are auto-built so should be just about identical if we need to launch a new one for further testing-exercises)
Did a lot more digging around, today. Looks like our issue's a bit more complicated than first blush.

The AMI we use is derived from the Red Hat published one. Specifically, to meet our customers' compliance-needs, the AMI needs to be partitioned and have FIPS-mode enabled. The derivation-method allows our customized AMI to use the RHUI entitlements that the Red Hat published upstream AMI provides.

We were initially testing against an RHEL 7.4 instance that _became_ (well, tried to, really) 7.5 when the remediation was run. Ultimately, it looks like there's an issue when moving from the 7.4 (March 2018) version of DBUS to 7.5's version. During yum update, we started getting DBUS hangs and "quark 39" errors. Upon attempting a post-update reboot, the reboot process would take a very long time. In looking at the logs, the slowness was because a number of services were getting connection refused errors from DBUS and having to be timeout-killed. Upon finally coming back online, DBUS continued to be deranged: even attempting to do `timedatectl status` resulted in "connection refused" errors and /var/log/messages was logging NetworkManager-related DBUS errors a couple times per minute.

There didn't seem to be a recovery path for the test-EC2, so we terminated it and launched a new one. Prior to doing the `yum update`, we:

* installed the versionlock yum plugin
* added "exclude=dbus dbus-libs teamd libteam wpa_supplicant" to the /etc/yum.conf
* added openscap-1.2.14 and openscap-scanner-1.2.14 to /etc/yum/pluginconf.d/versionlock.list.

The actual issue seems to be the DBUS update when going 7.4 to 7.5. The openscap-scanner, itself, has a DBUS dependency that, absent version-locking it, will also cause an attempted update of DBUS. After running yum update and rebooting, the original `oscap --remediate` that was hanging on prior testing EC2 instances succeeded.

Hello Thomas,

thank you for a detailed report. Have you also checked whether the problem occurs on a fresh RHEL-7.5 installation, or does your situation require an upgrade from 7.4 to 7.5?
Anyway, I think we should either report a new bug against dbus or change this bug's component to dbus. Do you agree?

The situation only seems to trigger on the 7.4 to 7.5 migration. The two specific scenarios we've observed are:

* Updating an instance built from a 7.4 AMI to the most recent patch set
* Attempting to create a "downstream" 7.5 AMI from a 7.4 upstream AMI

The latter is mostly a matter of inconvenience: we have to update our Packer jobs' starting AMI ID rather than being able to use an arbitrary AMI as we are able to do with our RHEL 6 AMI-generation routines. The former is the more critical problem as, right now, it seems like the tenants we provide tailored AMIs for may need to be told "you have two choices: re-deploy from the new, 7.5 AMI; or, use the following multi-step, manual procedure to version-pin your DBUS so that you can execute your normal `yum update` procedures".

Given that it's apparent this is an issue with DBUS, possibly with our particular method of (ab)using it, and that the apparent `oscap` issues were symptomatic of DBUS issues rather than `oscap` issues, it likely makes sense to close this issue in favor of opening a DBUS issue. Do you need me to open the issue or can you open the issue (backreferencing this one)?

Yes, please open the new issue for DBUS. I am closing this one, but you can still reference it in the new issue.
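
For reference, the version-pinning steps listed earlier in this thread could be scripted roughly as follows. This is a hedged sketch, not the reporter's actual procedure: `pin_dbus_excludes` is a hypothetical helper, it assumes `[main]` is the last (or only) section in the yum configuration file, and the commented commands assume the RHEL 7 package name `yum-plugin-versionlock` and its `yum versionlock add` subcommand, which locks the currently installed versions rather than writing to versionlock.list by hand.

```shell
#!/bin/sh
# Sketch only: pin_dbus_excludes is a hypothetical helper, not part of yum.
# It appends the exclude line quoted in this report to a yum.conf-style file.
# Assumption: [main] is the last section in the file, so an appended line
# lands inside it. If an exclude= line already exists, it refuses to guess.
pin_dbus_excludes() {
    conf="$1"
    line='exclude=dbus dbus-libs teamd libteam wpa_supplicant'
    if grep -q '^exclude=' "$conf"; then
        echo "exclude= already set in $conf; merge by hand" >&2
        return 1
    fi
    printf '%s\n' "$line" >> "$conf"
}

# Intended order on a 7.4 host, as root (not run here):
#   yum install -y yum-plugin-versionlock
#   pin_dbus_excludes /etc/yum.conf
#   yum versionlock add openscap openscap-scanner
#   yum update
```

Refusing to touch an existing `exclude=` line is a deliberate choice in this sketch, since silently overwriting it could re-enable updates that an administrator had excluded for other reasons.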