Bug 1819232 - MCO stuck in pending after upgrade
Summary: MCO stuck in pending after upgrade
Keywords:
Status: CLOSED DUPLICATE of bug 1817455
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Antonio Murdaca
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-31 14:00 UTC by Ben Parees
Modified: 2023-09-14 05:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-03 07:30:09 UTC
Target Upstream Version:
Embargoed:


Comment 1 Antonio Murdaca 2020-03-31 14:35:09 UTC
What's happening is that the MCD on the masters is complaining about the etcd recovery files. I think this bug is already reported and fixed in 4.4, and since this cluster started with a 4.4 nightly from 03/15, that may be the issue.

CC'ing Sam and lowering priority as I think we fixed this already:


```
...

E0330 21:59:16.155040 2504182 daemon.go:1354] content mismatch for file /usr/local/bin/etcd-snapshot-restore.sh: #!/usr/bin/env bash

set -o errexit
set -o pipefail


A: set -o errtrace

# example
# ./etcd-snapshot-restore.sh $path-to-backup

...
```
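
For readers unfamiliar with this kind of error, the check behind a "content mismatch" message can be illustrated with a minimal sketch. This is not the MCD's actual implementation (the daemon is a separate codebase); the function name `first_mismatch` and the sample file contents below are hypothetical and exist only to show how a rendered file can be compared against what is on disk.

```python
from typing import Optional, Tuple


def first_mismatch(expected: str, actual: str) -> Optional[Tuple[int, str, str]]:
    """Return (1-based line number, expected line, actual line) for the first
    differing line, or None when the two contents are identical."""
    # keepends=True so a difference in trailing newlines is also detected
    exp_lines = expected.splitlines(keepends=True)
    act_lines = actual.splitlines(keepends=True)
    for i in range(max(len(exp_lines), len(act_lines))):
        e = exp_lines[i] if i < len(exp_lines) else "<missing>"
        a = act_lines[i] if i < len(act_lines) else "<missing>"
        if e != a:
            return (i + 1, e, a)
    return None


# Hypothetical contents for illustration only; not the real script bodies.
expected = "#!/usr/bin/env bash\n\nset -o errexit\nset -o pipefail\nset -o errtrace\n"
on_disk = "#!/usr/bin/env bash\n\nset -o errexit\nset -o pipefail\n"

mismatch = first_mismatch(expected, on_disk)
if mismatch is not None:
    line_no, want, got = mismatch
    print(f"content mismatch at line {line_no}: expected {want!r}, found {got!r}")
```

Under these assumptions, any difference between the expected and on-disk contents is reported, which is consistent with a node refusing to proceed when a managed file does not match.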

Comment 2 Antonio Murdaca 2020-03-31 14:40:59 UTC
Also, Ben, was this cluster installed with 4.4.0-0.nightly-2020-03-15-215151 and later upgraded to 4.4.0-0.nightly-2020-03-30-163532, or was it something else before becoming 4.4.0-0.nightly-2020-03-15-215151? That would explain the bug on the MCO side, which we're fixing in 4.4 (https://github.com/openshift/machine-config-operator/pull/1593).

Comment 3 Ben Parees 2020-03-31 14:54:11 UTC
>  I think this bug is already reported and fixed in 4.4 and maybe since this cluster started with a 4.4 nightly from 03/15 it may be the issue.

If it's fixed in the version I've upgraded to, would the expectation be that it recovers? I'm not clear whether you're saying it's already been fixed (in some nightly) or not.


> also, Ben, was this cluster installed with 4.4.0-0.nightly-2020-03-15-215151 and later upgraded to 4.4.0-0.nightly-2020-03-30-163532? or it was something else before becoming 4.4.0-0.nightly-2020-03-15-215151


Looks like it started on 4.4.0-0.nightly-2020-03-10-115843

    history:
    - completionTime: null
      image: registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-03-30-163532
      startedTime: "2020-03-30T19:31:49Z"
      state: Partial
      verified: false
      version: 4.4.0-0.nightly-2020-03-30-163532
    - completionTime: "2020-03-24T15:54:19Z"
      image: registry.svc.ci.openshift.org/ocp/release:4.4
      startedTime: "2020-03-16T00:04:36Z"
      state: Completed
      verified: false
      version: 4.4.0-0.nightly-2020-03-15-215151
    - completionTime: "2020-03-10T19:38:36Z"
      image: registry.svc.ci.openshift.org/ocp/release@sha256:36cd5bc706b135e4dff14064dd4c6dffb87d5c04158e3a8243d2ffe94beea64e
      startedTime: "2020-03-10T19:18:46Z"
      state: Completed
      verified: false
      version: 4.4.0-0.nightly-2020-03-10-115843
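
The history above is listed newest first, so the original install is the last entry. A minimal sketch of reading the upgrade path out of such a history follows. The `history` list is transcribed by hand from the output above (only the `version` and `state` fields), and the helper names `upgrade_path` and `initial_version` are hypothetical, not part of any OpenShift tooling.

```python
# History entries copied from the comment above, newest first.
history = [
    {"version": "4.4.0-0.nightly-2020-03-30-163532", "state": "Partial"},
    {"version": "4.4.0-0.nightly-2020-03-15-215151", "state": "Completed"},
    {"version": "4.4.0-0.nightly-2020-03-10-115843", "state": "Completed"},
]


def upgrade_path(entries):
    """Return the versions in chronological order (oldest first)."""
    return [entry["version"] for entry in reversed(entries)]


def initial_version(entries):
    """Return the version the cluster was originally installed with."""
    return entries[-1]["version"]


# Updates that have not finished, e.g. the one stuck in this report.
unfinished = [e["version"] for e in history if e["state"] != "Completed"]

print("installed as:", initial_version(history))
print("path:", " -> ".join(upgrade_path(history)))
print("unfinished:", unfinished)
```

Read this way, the cluster went through two updates after the 03-10 install, and only the most recent one is still in the `Partial` state.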

Comment 4 Antonio Murdaca 2020-04-03 07:30:09 UTC

*** This bug has been marked as a duplicate of bug 1817455 ***

Comment 5 Red Hat Bugzilla 2023-09-14 05:54:52 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

