Bug 1888467

Summary: Repos should be disabled in -firstboot.service before OS extensions are applied
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Machine Config Operator
Assignee: Vadim Rutkovsky <vrutkovs>
Status: CLOSED ERRATA
QA Contact: Michael Nguyen <mnguyen>
Severity: high
Docs Contact:
Priority: high
Version: 4.6
CC: amurdaca, bleanhar, cglombek, mkrejci, smilner
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-12-14 13:50:18 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1884165    
Bug Blocks:    

Description OpenShift BugZilla Robot 2020-10-14 22:40:19 UTC
+++ This bug was initially created as a clone of Bug #1884165 +++

MCO uses rpm-ostree install when installing shipped extensions. However, the image may contain enabled repos, so external repositories may be used to install RPMs during firstboot.

This mostly affects OKD, whose FCOS image has the standard Fedora repos enabled by default. As a result, installed RPMs are pulled from the ostree repo instead of the embedded RPM repository.
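
To check whether a node is exposed to this, one could list the repo files that still contain an enabled repository. The following is only a minimal sketch: the helper name list_enabled_repos is hypothetical (it is not part of the MCO), and it assumes repo files use the exact "enabled=1" spelling with no spaces around "=".

```shell
#!/bin/sh
# Hypothetical helper (not part of the MCO): print every .repo file in the
# given directory that contains at least one "enabled=1" line.
# Assumption: the exact "enabled=1" spelling is used in the repo files.
list_enabled_repos() {
    repo_dir="${1:-/etc/yum.repos.d}"
    # The glob is outside the quotes so the shell expands it;
    # 2>/dev/null hides grep's error when no .repo file exists.
    grep -l '^enabled=1' "$repo_dir"/*.repo 2>/dev/null
}

# Example: inspect the default location (on a node, after `chroot /host`).
list_enabled_repos /etc/yum.repos.d || echo "no enabled repos found"
```

Any file printed here is a repo that rpm-ostree could consult while installing extensions.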

--- Additional comment from bleanhar on 2020-10-02 19:24:40 UTC ---

This doesn't seem like a blocker. I'm going to move it out. I trust Vadim will backport it if necessary for OKD.

Comment 4 Christian Glombek 2020-11-25 20:16:36 UTC
Setting severity and priority to high, as this is an OKD 4.6 release blocker.

Comment 7 Michael Nguyen 2020-11-30 22:30:01 UTC
Verified on 4.6.0-0.nightly-2020-11-28-204928 -- machine-config-daemon-firstboot.service has the fix in the unit file.



$ oc debug node/ip-10-0-128-247.us-west-2.compute.internal
Starting pod/ip-10-0-128-247us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl cat machine-config-daemon-firstboot.service
# /etc/systemd/system/machine-config-daemon-firstboot.service
[Unit]
Description=Machine Config Daemon Firstboot
# Make sure it runs only on OSTree booted system
ConditionPathExists=/run/ostree-booted
# Removal of this file signals firstboot completion
ConditionPathExists=/etc/ignition-machine-config-encapsulated.json
After=machine-config-daemon-pull.service
Before=crio.service crio-wipe.service
Before=kubelet.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Disable existing repos (if any) so that OS extensions would use embedded RPMs only
ExecStartPre=-/usr/bin/sh -c "sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/*.repo"
ExecStart=/run/bin/machine-config-daemon firstboot-complete-machineconfig

[Install]
WantedBy=multi-user.target
RequiredBy=crio.service kubelet.service
sh-4.4# journalctl -u machine-config-daemon-firstboot
-- Logs begin at Mon 2020-11-30 22:15:10 UTC, end at Mon 2020-11-30 22:28:25 UTC. --
Nov 30 22:16:16 ip-10-0-128-247 systemd[1]: Starting Machine Config Daemon Firstboot...
Nov 30 22:16:16 ip-10-0-128-247 sh[1941]: sed: can't read /etc/yum.repos.d/*.repo: No such file or directory
Nov 30 22:16:17 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:17.017716    1943 rpm-ostree.go:261] Running captured: rpm-ostree status --json
Nov 30 22:16:22 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:22.288033    1943 daemon.go:226] Booted osImageURL:  (46.82.202010011740-0)
Nov 30 22:16:28 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:28.018090    1943 daemon.go:233] Installed Ignition binary version: 2.6.0
Nov 30 22:16:28 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:28.019571    1943 update.go:423] Checking Reconcilable for config mco-empty-mc to rendered-worker-e2825f140e84a200a54950101efa3790
Nov 30 22:16:28 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:28.020175    1943 update.go:1653] Starting update from mco-empty-mc to rendered-worker-e2825f140e84a200a54950101efa3790: &{osUpdate:true kargs:false fips:false >
Nov 30 22:16:28 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:28.023861    1943 update.go:1040] Updating files
Nov 30 22:16:28 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:28.024188    1943 update.go:1113] Deleting stale data
Nov 30 22:16:28 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:28.034266    1943 controlplane.go:52] Set root blockdev /sys/devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n1 to use scheduler bfq
Nov 30 22:16:28 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:16:28.034491    1943 run.go:18] Running: nice -- ionice -c 3 oc image extract --path /:/run/mco-machine-os-content/os-content-024229197 --registry-config /var/lib/>
Nov 30 22:17:40 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:17:40.424536    1943 update.go:1531] Updating OS to quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ccf308bdfbc7e0d076a94be9fc0b1671703c7be56a9de3fbbb46e880d>
Nov 30 22:17:40 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:17:40.424719    1943 rpm-ostree.go:261] Running captured: rpm-ostree status --json
Nov 30 22:17:40 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:17:40.487729    1943 rpm-ostree.go:184] Current origin is not custom
Nov 30 22:17:42 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:17:42.104810    1943 rpm-ostree.go:211] Pivoting to: 46.82.202011281819-0 (8398b4ee2024e71d9de24e95b5a97da3bc7e4a010a779f551f640d5f46d20739)
Nov 30 22:17:42 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:17:42.104833    1943 rpm-ostree.go:243] Executing rebase from repo path /run/mco-machine-os-content/os-content-024229197/srv/repo with customImageURL pivot://quay.>
Nov 30 22:17:42 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:17:42.104846    1943 rpm-ostree.go:261] Running captured: rpm-ostree rebase --experimental /run/mco-machine-os-content/os-content-024229197/srv/repo:8398b4ee2024e7>
Nov 30 22:18:03 ip-10-0-128-247 machine-config-daemon[1943]: I1130 22:18:03.219052    1943 update.go:1653] initiating reboot: Completing firstboot provisioning to rendered-worker-e2825f140e84a200a54950101efa3790
Nov 30 22:18:03 ip-10-0-128-247 systemd[1]: machine-config-daemon-firstboot.service: Main process exited, code=killed, status=15/TERM
Nov 30 22:18:03 ip-10-0-128-247 systemd[1]: machine-config-daemon-firstboot.service: Failed with result 'signal'.
Nov 30 22:18:03 ip-10-0-128-247 systemd[1]: Stopped Machine Config Daemon Firstboot.
Nov 30 22:18:03 ip-10-0-128-247 systemd[1]: machine-config-daemon-firstboot.service: Consumed 17.711s CPU time
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
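
The "sed: can't read /etc/yum.repos.d/*.repo" line in the journal above looks harmless here: this node has no .repo files, so the glob is passed to sed unexpanded and sed fails. The leading "-" in ExecStartPre tells systemd to treat that failure as success. The sketch below reproduces both cases in a scratch directory. It assumes GNU sed (for -i without a suffix argument), and the disable_repos helper is hypothetical, written only to parameterize the directory.

```shell
#!/bin/sh
# Same substitution as the ExecStartPre line in the unit file, but with the
# directory as a parameter. Hypothetical helper, for illustration only.
# Assumption: GNU sed, where -i takes no mandatory suffix argument.
disable_repos() {
    repo_dir="$1"
    sed -i 's/enabled=1/enabled=0/' "$repo_dir"/*.repo
}

work="$(mktemp -d)"

# Case 1: no .repo files. The glob stays literal, sed cannot open it and
# returns a non-zero status; systemd ignores this because of the '-' prefix.
if disable_repos "$work" 2>/dev/null; then
    echo "empty dir: sed succeeded"
else
    echo "empty dir: sed failed (ignored by systemd due to the '-' prefix)"
fi

# Case 2: an enabled repo is present and gets switched off in place.
printf '[fedora]\nenabled=1\n' > "$work/fedora.repo"
disable_repos "$work"
cat "$work/fedora.repo"

rm -rf "$work"
```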

Comment 9 errata-xmlrpc 2020-12-14 13:50:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.8 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5259