Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1819232

Summary:	MCO stuck in pending after upgrade
Product:	OpenShift Container Platform
Reporter:	Ben Parees <bparees>
Component:	Machine Config Operator
Assignee:	Antonio Murdaca <amurdaca>
Status:	CLOSED DUPLICATE
QA Contact:	Michael Nguyen <mnguyen>
Severity:	high
Priority:	unspecified
Version:	4.4
CC:	sbatsche
Target Milestone:	---
Target Release:	4.4.0
Hardware:	Unspecified
OS:	Unspecified
Last Closed:	2020-04-03 07:30:09 UTC
Type:	Bug

Comment 1 Antonio Murdaca 2020-03-31 14:35:09 UTC

What's happening is that the MCD on the masters is complaining about the etcd recovery files. I think this bug is already reported and fixed in 4.4 and maybe since this cluster started with a 4.4 nightly from 03/15 it may be the issue.

CC'ing Sam and lowering priority as I think we fixed this already:


```
...

E0330 21:59:16.155040 2504182 daemon.go:1354] content mismatch for file /usr/local/bin/etcd-snapshot-restore.sh: #!/usr/bin/env bash

set -o errexit
set -o pipefail


A: set -o errtrace

# example
# ./etcd-snapshot-restore.sh $path-to-backup

...
```
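
For readers unfamiliar with the check behind this kind of log line, the idea can be sketched as a byte-for-byte comparison between the file on disk and the content the rendered config expects. This is a minimal illustration in Python, not the MCD's actual Go code; the function name `content_matches` and the sample script contents are hypothetical and do not claim which side of the real mismatch carried which lines.

```python
import os
import tempfile


def content_matches(path: str, expected: bytes) -> bool:
    """Return True when the file at `path` holds exactly `expected`."""
    try:
        with open(path, "rb") as f:
            actual = f.read()
    except FileNotFoundError:
        # A missing file cannot match the expected content.
        return False
    return actual == expected


# Hypothetical contents: the two versions differ by a single line.
expected = b"#!/usr/bin/env bash\n\nset -o errexit\nset -o pipefail\nset -o errtrace\n"
on_disk = b"#!/usr/bin/env bash\n\nset -o errexit\nset -o pipefail\n"

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "etcd-snapshot-restore.sh")

    # Stale file on disk: the comparison reports a mismatch.
    with open(script, "wb") as f:
        f.write(on_disk)
    stale_ok = content_matches(script, expected)

    # After the file is rewritten with the expected content, it matches.
    with open(script, "wb") as f:
        f.write(expected)
    fresh_ok = content_matches(script, expected)
```

Under these assumptions, a node whose file was written by an older release keeps failing the comparison until the file is rewritten with the expected content.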

Comment 2 Antonio Murdaca 2020-03-31 14:40:59 UTC

also, Ben, was this cluster installed with 4.4.0-0.nightly-2020-03-15-215151 and later upgraded to 4.4.0-0.nightly-2020-03-30-163532? or it was something else before becoming 4.4.0-0.nightly-2020-03-15-215151? That would explain the bug on the MCO side, which we're fixing in 4.4 (https://github.com/openshift/machine-config-operator/pull/1593).

Comment 3 Ben Parees 2020-03-31 14:54:11 UTC

>  I think this bug is already reported and fixed in 4.4 and maybe since this cluster started with a 4.4 nightly from 03/15 it may be the issue.

If it's fixed in the version I've upgraded to, would the expectation be that it recovers? I'm not clear whether you're saying it's already been fixed (in some nightly) or not.


> also, Ben, was this cluster installed with 4.4.0-0.nightly-2020-03-15-215151 and later upgraded to 4.4.0-0.nightly-2020-03-30-163532? or it was something else before becoming 4.4.0-0.nightly-2020-03-15-215151


Looks like it started on 4.4.0-0.nightly-2020-03-10-115843

    history:
    - completionTime: null
      image: registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-03-30-163532
      startedTime: "2020-03-30T19:31:49Z"
      state: Partial
      verified: false
      version: 4.4.0-0.nightly-2020-03-30-163532
    - completionTime: "2020-03-24T15:54:19Z"
      image: registry.svc.ci.openshift.org/ocp/release:4.4
      startedTime: "2020-03-16T00:04:36Z"
      state: Completed
      verified: false
      version: 4.4.0-0.nightly-2020-03-15-215151
    - completionTime: "2020-03-10T19:38:36Z"
      image: registry.svc.ci.openshift.org/ocp/release@sha256:36cd5bc706b135e4dff14064dd4c6dffb87d5c04158e3a8243d2ffe94beea64e
      startedTime: "2020-03-10T19:18:46Z"
      state: Completed
      verified: false
      version: 4.4.0-0.nightly-2020-03-10-115843
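
The original install version can also be read off the history mechanically: the entry with the earliest `startedTime` is the initial install. A minimal sketch in Python, assuming the entries above are transcribed by hand into dicts (only the fields used here); the helper names `upgrade_path` and `initial_version` are hypothetical.

```python
# Entries transcribed by hand from the history paste above;
# only the fields this sketch needs are kept.
history = [
    {"version": "4.4.0-0.nightly-2020-03-30-163532", "state": "Partial",
     "startedTime": "2020-03-30T19:31:49Z"},
    {"version": "4.4.0-0.nightly-2020-03-15-215151", "state": "Completed",
     "startedTime": "2020-03-16T00:04:36Z"},
    {"version": "4.4.0-0.nightly-2020-03-10-115843", "state": "Completed",
     "startedTime": "2020-03-10T19:18:46Z"},
]


def upgrade_path(entries):
    """Return versions ordered oldest to newest by startedTime.

    The timestamps are all UTC in the same ISO 8601 layout, so plain
    string comparison orders them chronologically.
    """
    ordered = sorted(entries, key=lambda e: e["startedTime"])
    return [e["version"] for e in ordered]


def initial_version(entries):
    """Return the version the cluster was first installed with."""
    return upgrade_path(entries)[0]


print(initial_version(history))
print(" -> ".join(upgrade_path(history)))
```

Sorting by `startedTime` rather than relying on list position keeps the sketch correct even if the entries are pasted in a different order.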

Comment 4 Antonio Murdaca 2020-04-03 07:30:09 UTC


*** This bug has been marked as a duplicate of bug 1817455 ***

Comment 5 Red Hat Bugzilla 2023-09-14 05:54:52 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days