Bug 1894477 - bash syntax error in nodeip-configuration.service
Summary: bash syntax error in nodeip-configuration.service
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Antonio Murdaca
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 1894483
Reported: 2020-11-04 11:05 UTC by Tomáš Nožička
Modified: 2021-02-24 15:30 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1894483 (view as bug list)
Environment:
Last Closed: 2021-02-24 15:30:03 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 2199 | 0 | None | closed | Bug 1894477: Fix bash in nodeip-configuration.service | 2021-01-31 23:31:28 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:30:33 UTC

Description Tomáš Nožička 2020-11-04 11:05:29 UTC
Description of problem:
nodeip-configuration.service contains invalid bash, which broke an OKD UPI cluster install.

Version-Release number of selected component (if applicable):
4.6.0-0.okd-2020-11-03-123207


How reproducible:
always


Steps to Reproduce:
1. Install OKD with UPI

Actual results:
[root@master-0 ~]# systemctl status nodeip-configuration
● nodeip-configuration.service - Writes IP address configuration so that kubelet and crio services select a valid node IP
     Loaded: loaded (/etc/systemd/system/nodeip-configuration.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2020-11-04 08:22:50 UTC; 22min ago
    Process: 980 ExecStart=/bin/bash -c     until    /usr/bin/podman run --rm    --authfile /var/lib/kubelet/config.json    --net=host    --volume /etc/systemd/system:/etc/systemd/system:z    registry.svc.ci.openshift.org/origin/4.6-2020>
   Main PID: 980 (code=exited, status=1/FAILURE)
        CPU: 4ms
lis 04 08:22:50 master-0 systemd[1]: Starting Writes IP address configuration so that kubelet and crio services select a valid node IP...
lis 04 08:22:50 master-0 bash[980]: /bin/bash: -c: line 0: syntax error near unexpected token `done'
lis 04 08:22:50 master-0 bash[980]: /bin/bash: -c: line 0: `    until    /usr/bin/podman run --rm    --authfile /var/lib/kubelet/config.json    --net=host    --volume /etc/systemd/system:/etc/systemd/system:z    registry.svc.ci.openshift>
lis 04 08:22:50 master-0 systemd[1]: nodeip-configuration.service: Main process exited, code=exited, status=1/FAILURE
lis 04 08:22:50 master-0 systemd[1]: nodeip-configuration.service: Failed with result 'exit-code'.
lis 04 08:22:50 master-0 systemd[1]: Failed to start Writes IP address configuration so that kubelet and crio services select a valid node IP.
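
The error above is what bash reports when an `until` loop has no command separator before `do`. systemd joins backslash-continued lines of ExecStart with spaces, so a line break in the unit file does not separate commands; an explicit `;` is needed. The following is a minimal sketch of that failure mode, not the exact script from the affected unit: it assumes the missing separator was the cause, and `true` is a hypothetical stand-in for the podman invocation. `bash -n` only parses the string and does not execute it.

```shell
#!/usr/bin/env bash
# Hypothetical reduced forms of the retry loop; `true` stands in for the
# real podman command, and its arguments are never evaluated under -n.
broken='until true set --retry-on-failure do sleep 5; done'
fixed='until true set --retry-on-failure; do sleep 5; done'

# Parse a command string without running it and report the outcome.
check_syntax() {
  if bash -n -c "$1" 2>/dev/null; then
    echo "ok"
  else
    echo "syntax error"
  fi
}

broken_result=$(check_syntax "$broken")
fixed_result=$(check_syntax "$fixed")

echo "broken: $broken_result"
echo "fixed:  $fixed_result"
```

In the first string, `do` is parsed as an ordinary argument to the command, so bash reaches `done` while still waiting for `do` and rejects it. The same check can be run by hand against a command string copied from a unit file before it is deployed.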


Expected results:
success

Additional info:

Comment 2 Michael Nguyen 2020-11-18 21:09:24 UTC
Verified the fix is in 4.7.0-0.nightly-2020-11-18-125028.


$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-11-18-125028   True        False         38m     Cluster version is 4.7.0-0.nightly-2020-11-18-125028
$ oc debug node/ip-10-0-147-250.us-west-2.compute.internal
Starting pod/ip-10-0-147-250us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/systemd/system/nodeip-configuration.service 
[Unit]
Description=Writes IP address configuration so that kubelet and crio services select a valid node IP
Wants=network-online.target
After=network-online.target ignition-firstboot-complete.service
Before=kubelet.service crio.service

[Service]
# Need oneshot to delay kubelet
Type=oneshot
# Would prefer to do Restart=on-failure instead of this bash retry loop, but
# the version of systemd we have right now doesn't support it. It should be
# available in systemd v244 and higher.
ExecStart=/bin/bash -c " \
  until \
  /usr/bin/podman run --rm \
  --authfile /var/lib/kubelet/config.json \
  --net=host \
  --volume /etc/systemd/system:/etc/systemd/system:z \
  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:de61a4e12d0893b90aaf170b28eca51504d94be9724b91ac27deed88001d54ee \
  node-ip \
  set --retry-on-failure; \
  do \
  sleep 5; \
  done"

[Install]
RequiredBy=kubelet.service
sh-4.4# systemctl status nodeip-configuration.service
● nodeip-configuration.service - Writes IP address configuration so that kubelet and crio services sele>
   Loaded: loaded (/etc/systemd/system/nodeip-configuration.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
sh-4.4# systemctl start nodeip-configuration.service


sh-4.4# 
sh-4.4# 
sh-4.4# systemctl status nodeip-configuration.service
● nodeip-configuration.service - Writes IP address configuration so that kubelet and crio services sele>
   Loaded: loaded (/etc/systemd/system/nodeip-configuration.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Nov 18 21:08:05 ip-10-0-147-250 bash[84158]: Writing manifest to image destination
Nov 18 21:08:05 ip-10-0-147-250 bash[84158]: Storing signatures
Nov 18 21:08:07 ip-10-0-147-250 bash[84158]: time="2020-11-18T21:08:07Z" level=info msg="Address 10.0.1>
Nov 18 21:08:07 ip-10-0-147-250 bash[84158]: time="2020-11-18T21:08:07Z" level=info msg="Chosen Node IP>
Nov 18 21:08:07 ip-10-0-147-250 bash[84158]: time="2020-11-18T21:08:07Z" level=info msg="Opening Kubele>
Nov 18 21:08:07 ip-10-0-147-250 bash[84158]: time="2020-11-18T21:08:07Z" level=info msg="Writing Kubele>
Nov 18 21:08:07 ip-10-0-147-250 bash[84158]: time="2020-11-18T21:08:07Z" level=info msg="Opening CRI-O >
Nov 18 21:08:07 ip-10-0-147-250 bash[84158]: time="2020-11-18T21:08:07Z" level=info msg="Writing CRI-O >
Nov 18 21:08:08 ip-10-0-147-250 systemd[1]: Started Writes IP address configuration so that kubelet and>
Nov 18 21:08:08 ip-10-0-147-250 systemd[1]: nodeip-configuration.service: Consumed 7.990s CPU time

sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...

Comment 3 Michael Nguyen 2020-11-18 21:48:23 UTC
There are no accepted builds for 4.7 nightly right now.  Looking at the release payload, the MCO is at a commit that contains the fix.

$ mco-commit  registry.svc.ci.openshift.org/origin/release:4.7.0-0.okd-2020-11-18-131704
Name:        registry.svc.ci.openshift.org/origin/4.7-2020-11-18-131704@sha256:c2e191d9e3cfd7338809dbcee745016dbc85748efb567d5eaf33051f5503bba3
Media Type:  application/vnd.docker.distribution.manifest.v2+json
Created:     9h ago
Image Size:  143.1MB in 6 layers
Layers:      73.86MB sha256:ec1681b6a383e4ecedbeddd5abc596f3de835aed6db39a735f62395c8edbff30
             1.789kB sha256:c4d668e229cd131e0a8e4f8218dca628d9cf9697572875e355fe4b247b6aa9f0
             4.468MB sha256:93495cb22058662a790cab6ce01d962573f9c7ba350ac36304cc22380294ea38
             447kB   sha256:3e2f47dba162e3f140c4104888167c7b39a56ace6c866c39a757e8020d7fb984
             12.38MB sha256:506373d67132f73bea88e6e527ee035b0140345f7d785e5e150ac9651a593467
             51.91MB sha256:64af0388a8325b520a869989e6bf677e25880c6897b7e508a1688e3c23b2a0aa
OS:          linux
Arch:        amd64
Entrypoint:  /usr/bin/machine-config-operator
User:        0
Environment: OPENSHIFT_CI=true
             PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
             container=oci
             GODEBUG=x509ignoreCN=0
             foo=bar
             OPENSHIFT_BUILD_NAME=machine-config-operator
             OPENSHIFT_BUILD_NAMESPACE=ci-op-ptw2jkjj
Labels:      architecture=x86_64
             build-date=2020-10-17T00:38:27.192916
             com.redhat.build-host=cpt-1002.osbs.prod.upshift.rdu2.redhat.com
             com.redhat.component=openshift-enterprise-base-container
             com.redhat.license_terms=https://www.redhat.com/agreements
             description=The Universal Base Image is designed and engineered to be the base layer for all of your containerized applications, middleware and utilities. This base image is freely redistributable, but Red Hat only supports Red Hat technologies through subscriptions for Red Hat products. This image is maintained by Red Hat and updated regularly.
             distribution-scope=public
             io.buildah.version=1.16.4
             io.k8s.description=This is the base image from which all OpenShift images inherit.
             io.k8s.display-name=OpenShift Base
             io.openshift.build.commit.author=
             io.openshift.build.commit.date=
             io.openshift.build.commit.id=e02ed51beb57f64bd8a17e2bddf5f5a37af2669f
             io.openshift.build.commit.message=
             io.openshift.build.commit.ref=master
             io.openshift.build.name=
             io.openshift.build.namespace=
             io.openshift.build.source-context-dir=
             io.openshift.build.source-location=https://github.com/openshift/machine-config-operator
             io.openshift.expose-services=
             io.openshift.release.operator=true
             io.openshift.tags=base rhel8
             maintainer=Red Hat, Inc.
             name=openshift/ose-base
             release=202010170037.13635
             summary=Provides the latest release of Red Hat Universal Base Image 8.
             url=https://access.redhat.com/containers/#/registry.access.redhat.com/openshift/ose-base/images/v4.0-202010170037.13635
             vcs-ref=e02ed51beb57f64bd8a17e2bddf5f5a37af2669f
             vcs-type=git
             vcs-url=https://github.com/openshift/machine-config-operator
             vendor=Red Hat, Inc.
             version=v4.0

Comment 6 errata-xmlrpc 2021-02-24 15:30:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

