Bug 1784001
| Summary: | chkconfig network on returning failed to glob pattern /etc/rc0.d/[SK][0-9][0-9]network | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Chandan Kumar <chkumar> | ||||
| Component: | diskimage-builder | Assignee: | Ian Wienand <iwienand> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | nlevinki <nlevinki> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 16.2 (Train) | Keywords: | Triaged | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2020-10-06 19:47:48 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
Description
Chandan Kumar
2019-12-16 13:06:18 UTC
Can you run it under strace and post here the output? Or is it possible to upload somewhere the content of the image right before this step?

(In reply to Lukáš Nykrýn from comment #3)
> Or is it possible to upload somewhere the content of the image right before
> this step?

Not really, it's being built with chroot in DIB, and when it fails it cleans everything up. The error is not always reproducible (though it is in most cases); sometimes the build passes. I ran a simple script before chkconfig and captured both a passing and a failing run: https://review.opendev.org/#/c/699178/

    echo "Start debug"
    ls -alsh /etc/rc0.d/ || true
    ls -alsh /etc/rc0.d || true
    mkdir -p /etc/rc0.d || true

Here is the output of the passing case and the failing case:

PASS:

    dib-run-parts Running /tmp/in_target.d/post-install.d/51-enable-network-service
    + set -o pipefail
    + ls -alsh /etc/rc0.d/
    total 0
    0 drwxr-xr-x. 2 root root 24 Dec 16 09:52 .
    0 drwxr-xr-x. 10 root root 127 Aug 30 04:52 ..
    0 lrwxrwxrwx. 1 root root 17 Dec 16 09:52 K90network -> ../init.d/network
    + ls -alsh /etc/rc0.d
    0 lrwxrwxrwx. 1 root root 10 Aug 23 06:17 /etc/rc0.d -> rc.d/rc0.d
    + mkdir -p /etc/rc0.d
    + chkconfig network on
    dib-run-parts 51-enable-network-service completed

FAIL:

    dib-run-parts Running /tmp/in_target.d/post-install.d/51-enable-network-service
    + set -o pipefail
    + echo 'Start debug'
    Start debug
    + ls -alsh /etc/rc0.d/
    ls: cannot access '/etc/rc0.d/': No such file or directory
    + true
    + ls -alsh /etc/rc0.d
    0 lrwxrwxrwx. 1 root root 10 Aug 23 06:17 /etc/rc0.d -> rc.d/rc0.d
    + mkdir -p /etc/rc0.d
    mkdir: cannot create directory '/etc/rc0.d': File exists
    + true
    + chkconfig network on
    failed to glob pattern /etc/rc0.d/[SK][0-9][0-9]network: No such file or directory

I attach the strace of the failed case.
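The FAIL trace is consistent with /etc/rc0.d being a dangling symlink: the link is still listed, but its target rc.d/rc0.d is gone, so listing the directory fails and mkdir -p reports "File exists". This reading is an interpretation of the log, not something the thread states outright. A minimal sketch that reproduces the same symptoms in a scratch directory (the paths under $root and the helper name is_dangling are hypothetical, chosen only for illustration):

```shell
#!/bin/sh
# Reproduce the dangling-symlink symptoms in a throwaway directory.
# Everything under $root is a scratch stand-in, not the real /etc tree.
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT

mkdir -p "$root/etc/rc.d"            # rc.d/rc0.d is deliberately NOT created
ln -s rc.d/rc0.d "$root/etc/rc0.d"   # same link shape as /etc/rc0.d -> rc.d/rc0.d

# Hypothetical helper: succeeds when the path is a symlink whose target is missing.
is_dangling() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

ls -ld "$root/etc/rc0.d"             # the link itself is still listed
ls -ld "$root/etc/rc0.d/" || true    # fails: the target directory does not exist
mkdir -p "$root/etc/rc0.d" || true   # fails: the name is already taken by the link

if is_dangling "$root/etc/rc0.d"; then
    echo "dangling symlink: $root/etc/rc0.d"
fi
```

A check like is_dangling placed before chkconfig in the debug script would tell a missing target apart from a missing link, which the two ls calls only show indirectly.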
Just in case you want to see how strace executes in centos7 and compare, we have it here: https://88763c12f13d1aeca43c-63681721353a54dab1064b012b97b3cb.ssl.cf1.rackcdn.com/699221/3/check/tripleo-buildimage-overcloud-full-centos-7/09d02a8/build.log (look for 'strace chkconfig network on').

The strace for the rhel8 problem is here: http://logs.rdoproject.org/21/699221/3/openstack-check/tripleo-rhel-8-buildimage-overcloud-full/da44408/build.log and is also attached to this bug.

Created attachment 1645629
strace of failed case of chkconfig
This is assigned to me, but it seems like it has been fixed in [1] via [2]? Is this part of something else that has started happening I should look at?

[1] https://bugs.launchpad.net/tripleo/+bug/1823353
[2] https://review.openstack.org/650305

https://review.openstack.org/650305 did not fix the issue in terms of the image building process. For whatever reason, files are disappearing while building the image. It started in newer versions of RHEL/CentOS and is not something we saw when running under 7.

This seems to be related to the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1875266 which I think is saying tmpfile cleanup is randomly removing things? If so, I'd suggest the same thing as there: set DIB_TMP to some scratch space not in global /tmp. Would we agree this is the same issue, or is there more going on here?

> If so, I'd suggest the same thing as there, set DIB_TMP to some scratch space not in global /tmp.
> Would we agree this is the same issue, or is there more going on here?

I think it's the tmpfiles cleanup issue. We actually discussed this a couple of months back in relation to tmpfiles cleanup: http://eavesdrop.openstack.org/irclogs/%23tripleo/%23tripleo.2020-06-29.log.html#t2020-06-29T15:17:03 Only if it can still be reproduced with the tmpfiles cleanup workaround or DIB_TMP applied should we consider that something else is going on.

Closing, this was a temporary issue in early RHEL-8.
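The workaround suggested in the thread, pointing DIB_TMP at scratch space outside the global /tmp, might look like the sketch below. DIB_TMP is the variable name as quoted in the comments and has not been checked here against the diskimage-builder documentation; the dib-scratch location is a hypothetical choice.

```shell
#!/bin/sh
# Assumption: DIB_TMP is the variable name quoted in the thread; confirm the
# name your diskimage-builder version honors before relying on it.
# Assumption: "dib-scratch" under $HOME is a hypothetical location, picked
# only because it is not under the global /tmp that tmpfiles cleanup sweeps.
scratch="${HOME:-/var/tmp}/dib-scratch"
mkdir -p "$scratch"

# Use a fresh per-build directory so concurrent builds do not collide.
DIB_TMP=$(mktemp -d "$scratch/build.XXXXXX")
export DIB_TMP

echo "DIB_TMP=$DIB_TMP"
```

Run this in the same shell session that starts the image build so the exported variable is inherited by the build process.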