Bug 1784001 - chkconfig network on returning failed to glob pattern /etc/rc0.d/[SK][0-9][0-9]network
Summary: chkconfig network on returning failed to glob pattern /etc/rc0.d/[SK][0-9][0-9...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: diskimage-builder
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Ian Wienand
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-12-16 13:06 UTC by Chandan Kumar
Modified: 2023-04-05 10:09 UTC (History)
0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-06 19:47:48 UTC
Target Upstream Version:
Embargoed:


Attachments
strace of failed case of chkconfig (13.41 KB, text/plain)
2019-12-16 16:50 UTC, Sagi Shnaidman


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1853028 | 0 | None | None | None | 2019-12-16 13:06:17 UTC
Red Hat Issue Tracker | OSP-23985 | 0 | None | None | None | 2023-04-05 10:09:28 UTC

Description Chandan Kumar 2019-12-16 13:06:18 UTC
Description of problem:
On the TripleO CI side, we use diskimage-builder to create overcloud images on RHEL 8.1 with SELinux in enforcing mode.
Here is the build script used to create them: http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-rhel-8-buildimage-overcloud-full-master/b7e6593/build_images.sh

Here is the full log: http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-rhel-8-buildimage-overcloud-full-master/b7e6593/build.log

During the build process, network-scripts-10.00.4-1.el8.x86_64 is used.

At this step, it returns the following:
2019-12-12 00:31:12.054 | + set -o pipefail
2019-12-12 00:31:12.054 | + chkconfig network on
2019-12-12 00:31:12.055 | failed to glob pattern /etc/rc0.d/[SK][0-9][0-9]network: No such file or directory


We have no idea what is wrong there.

This issue is tracked externally here: https://bugs.launchpad.net/tripleo/+bug/1853028/

Comment 2 Lukáš Nykrýn 2019-12-16 14:40:42 UTC
Can you run it under strace and post here the output?

Comment 3 Lukáš Nykrýn 2019-12-16 15:09:46 UTC
Or is it possible to upload somewhere the content of the image right before this step?

Comment 4 Sagi Shnaidman 2019-12-16 16:49:53 UTC
(In reply to Lukáš Nykrýn from comment #3)
> Or is it possible to upload somewhere the content of the image right before
> this step?

Not really. The image is built in a chroot by DIB, which simply fails and cleans everything up. The error is not always reproducible: it fails in most cases, but sometimes the build passes.
I ran a simple debug script before chkconfig and captured both a passing and a failing run: https://review.opendev.org/#/c/699178/

echo "Start debug"
ls -alsh /etc/rc0.d/ || true
ls -alsh /etc/rc0.d || true
mkdir -p /etc/rc0.d || true

Here is the output of the passing case and the failing case:

PASS:

 dib-run-parts Running /tmp/in_target.d/post-install.d/51-enable-network-service
 + set -o pipefail
 + ls -alsh /etc/rc0.d/
 total 0
 0 drwxr-xr-x.  2 root root  24 Dec 16 09:52 .
 0 drwxr-xr-x. 10 root root 127 Aug 30 04:52 ..
 0 lrwxrwxrwx.  1 root root  17 Dec 16 09:52 K90network -> ../init.d/network
 + ls -alsh /etc/rc0.d
 0 lrwxrwxrwx. 1 root root 10 Aug 23 06:17 /etc/rc0.d -> rc.d/rc0.d
 + mkdir -p /etc/rc0.d
 + chkconfig network on
 dib-run-parts 51-enable-network-service completed

FAIL:

 dib-run-parts Running /tmp/in_target.d/post-install.d/51-enable-network-service
 + set -o pipefail
 + echo 'Start debug'
 Start debug
 + ls -alsh /etc/rc0.d/
 ls: cannot access '/etc/rc0.d/': No such file or directory
 + true
 + ls -alsh /etc/rc0.d
 0 lrwxrwxrwx. 1 root root 10 Aug 23 06:17 /etc/rc0.d -> rc.d/rc0.d
 + mkdir -p /etc/rc0.d
 mkdir: cannot create directory '/etc/rc0.d': File exists
 + true
 + chkconfig network on
 failed to glob pattern /etc/rc0.d/[SK][0-9][0-9]network: No such file or directory

I am attaching the strace of the failed case.
In case you want to compare with how chkconfig runs under strace on CentOS 7, that log is here: https://88763c12f13d1aeca43c-63681721353a54dab1064b012b97b3cb.ssl.cf1.rackcdn.com/699221/3/check/tripleo-buildimage-overcloud-full-centos-7/09d02a8/build.log (look for 'strace chkconfig network on').

The strace for the RHEL 8 failure is here: http://logs.rdoproject.org/21/699221/3/openstack-check/tripleo-rhel-8-buildimage-overcloud-full/da44408/build.log
and is also attached to this bug.
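
The failing output above is consistent with /etc/rc0.d being a symlink whose target directory (rc.d/rc0.d) no longer exists: `ls` on the link itself succeeds, `ls` with a trailing slash fails, and `mkdir -p` reports "File exists". Below is a minimal sketch of how a debug step could detect that state and recreate the target. The function names `classify_path` and `repair_dangling_dir_symlink` are hypothetical and are not part of diskimage-builder or chkconfig. This only works around the symptom; it does not address whatever removes the directory.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of DIB or chkconfig.

# Print what kind of thing a path is, distinguishing a dangling symlink
# (the FAIL case) from a symlink that resolves to a directory (the PASS case).
classify_path() {
    local path="$1"
    if [ -L "$path" ] && [ ! -e "$path" ]; then
        # -L is true for the link itself; -e follows it and fails if the
        # target is gone.
        echo "dangling-symlink"
    elif [ -d "$path" ]; then
        echo "directory"
    elif [ -e "$path" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# If the path is a dangling symlink, create the directory it points at.
# Plain "mkdir -p <link>" fails with "File exists", so the link target is
# resolved and created instead.
repair_dangling_dir_symlink() {
    local link="$1" target
    if [ "$(classify_path "$link")" != "dangling-symlink" ]; then
        return 0
    fi
    target="$(readlink "$link")"
    case "$target" in
        /*) ;;                                   # absolute target, use as-is
        *) target="$(dirname "$link")/$target" ;; # relative to the link's dir
    esac
    mkdir -p "$target"
}
```

In the failing element this might be called as `repair_dangling_dir_symlink /etc/rc0.d` before `chkconfig network on`; whether the other /etc/rcN.d links need the same treatment is an assumption that would have to be checked against the image.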

Comment 5 Sagi Shnaidman 2019-12-16 16:50:46 UTC
Created attachment 1645629 [details]
strace of failed case of chkconfig

Comment 6 Ian Wienand 2020-08-11 03:37:17 UTC
This is assigned to me, but it seems like it has been fixed in [1] via [2]?  Is this part of something else that has started happening I should look at?

[1] https://bugs.launchpad.net/tripleo/+bug/1823353
[2] https://review.openstack.org/650305

Comment 8 Alex Schultz 2020-08-11 13:19:18 UTC
https://review.openstack.org/650305 did not fix the issue in terms of the image building process. For whatever reason, files are disappearing while the image is being built. It started in newer versions of RHEL/CentOS and is not something we saw when running under 7.

Comment 11 Ian Wienand 2020-09-10 02:01:09 UTC
This seems to be related to the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1875266 which I think is saying tmpfile cleanup is randomly removing things?

If so, I'd suggest the same thing as there, set DIB_TMP to some scratch space not in global /tmp.

Would we agree this is the same issue, or is there more going on here?
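
A minimal sketch of the suggestion above, assuming DIB_TMP is honored by the diskimage-builder version in use. The variable name is taken from this comment and is not verified here, so check the diskimage-builder documentation for the exact name. The helper `make_scratch_dir` and the example base path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not part of diskimage-builder.

# Create a private scratch directory under the given base (outside the
# global /tmp) and print its path.
make_scratch_dir() {
    local base="$1"
    mkdir -p "$base" || return 1
    mktemp -d "$base/dib-scratch.XXXXXX"
}
```

Before starting the image build, this could be used as `DIB_TMP="$(make_scratch_dir "$HOME/dib-scratch")"; export DIB_TMP`, so that the build's working files sit outside the directory that tmpfile cleanup acts on.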

Comment 12 Yatin Karel 2020-09-15 09:54:25 UTC
> If so, I'd suggest the same thing as there, set DIB_TMP to some scratch space not in global /tmp.
> Would we agree this is the same issue, or is there more going on here?

I think it's the tmpfiles cleanup issue. We discussed this a couple of months back in relation to tmpfiles cleanup: http://eavesdrop.openstack.org/irclogs/%23tripleo/%23tripleo.2020-06-29.log.html#t2020-06-29T15:17:03. Only if it can still be reproduced with the tmpfiles cleanup workaround or DIB_TMP applied should we consider that something else is going on.

Comment 14 Steve Baker 2020-10-06 19:47:48 UTC
Closing; this was a temporary issue in early RHEL-8.

