Bug 1956507 - master MCP degraded w possibly corrupted configMap after upgrade
Status: NEW
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.5
Assignee: Yu Qi Zhang
QA Contact: Michael Nguyen
Reported: 2021-05-03 19:53 UTC by milti leonard
Modified: 2021-05-10 16:52 UTC

Description milti leonard 2021-05-03 19:53:00 UTC
Description of problem:

Upgraded from 4.4.9 to 4.5.24 using a local mirror. The process appeared to run successfully; however, the cluster currently reports:

Failed to resync 4.5.24 because: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 3, updated: 3, unavailable: 0)

Version-Release number of selected component (if applicable):

How reproducible:
not very

Steps to Reproduce:

Actual results:
upgrade reports success but master MCP reports a degraded status

Expected results:
upgrade succeeds and MCP is healthy

Additional info:

Comment 4 milti leonard 2021-05-05 16:20:44 UTC
@jerzhang, the customer is hitting RBAC issues collecting both the must-gather and the inspection. I will update the BZ when either becomes available.

Comment 8 milti leonard 2021-05-06 15:07:56 UTC
@jerzhang, can you pull the file bundle from supportshell? That would be quicker than me downloading, splitting, and attaching it to this ticket. The RBAC errors that I posted in the BZ previously are what led me to believe that the CM corruption is preventing them from getting a must-gather (I can be wrong, I frequently am). And yes, this is the result of an upgrade.

Comment 12 milti leonard 2021-05-10 16:48:19 UTC
@jerzhang, must-gather has been executed and attached to the ticket.

Comment 13 milti leonard 2021-05-10 16:52:26 UTC
The error message for the master MCP has changed ever so slightly:

  - lastTransitionTime: "2021-04-29T16:46:04Z"
    message: |-
      Failed to render configuration for pool master: parsing Ignition config failed with error: config is not valid
      Report: error at line 1, column 1178
          1: {"ignition":{"config":{},"security":{"tls":{}},"timeouts":{},"version":"2.2.0"},"networkd":{},"passwd":{},"storage":{"files":[{"contents":{"source":"data:text/plain;charset=utf-8;base64,IyBTcGVjaWZ5IHRpbWUgc291cmNlcy4Kc2VydmVyICAgbnRwMmEubWwuY29tCnNlcnZlciAgIG50 cDJiLm1sLmNvbQpzZXJ2ZXIgICBudHAyYy5tbC5jb20Kc2VydmVyICAgbnRwMmQubWwuY29tCgoj IFJlY29yZCB0aGUgcmF0ZSBhdCB3aGljaCB0aGUgc3lzdGVtIGNsb2NrIGdhaW5zL2xvc3NlcyB0 aW1lLgpkcmlmdGZpbGUgL3Zhci9saWIvY2hyb255L2RyaWZ0CgojIEFsbG93IHRoZSBzeXN0ZW0g Y2xvY2sgdG8gYmUgc3RlcHBlZCBpbiB0aGUgZmlyc3QgdGhyZWUgdXBkYXRlcwojIGlmIGl0cyBv ZmZzZXQgaXMgbGFyZ2VyIHRoYW4gMSBzZWNvbmQuCm1ha2VzdGVwIDEuMCAzCgojIEVuYWJsZSBr ZXJuZWwgc3luY2hyb25pemF0aW9uIG9mIHRoZSByZWFsLXRpbWUgY2xvY2sgKFJUQykuCnJ0Y3N5 bmMKCiMgSW5jcmVhc2UgdGhlIG1pbmltdW0gbnVtYmVyIG9mIHNlbGVjdGFibGUgc291cmNlcyBy ZXF1aXJlZCB0byBhZGp1c3QKIyB0aGUgc3lzdGVtIGNsb2NrLgptaW5zb3VyY2VzIDIKCiMgU3Bl Y2lmeSBmaWxlIGNvbnRhaW5pbmcga2V5cyBmb3IgTlRQIGF1dGhlbnRpY2F0aW9uLgprZXlmaWxl IC9ldGMvY2hyb255LmtleXMKCiMgR2V0IFRBSS1VVEMgb2Zmc2V0IGFuZCBsZWFwIHNlY29uZHMg ZnJvbSB0aGUgc3lzdGVtIHR6IGRhdGFiYXNlLgpsZWFwc2VjdHogcmlnaHQvVVRDCgojIFNwZWNp ZnkgZGlyZWN0b3J5IGZvciBsb2cgZmlsZXMuCmxvZ2RpciAvdmFyL2xvZy9jaHJvbnkK
      invalid data character
    reason: ""
    status: "True"
    type: RenderDegraded

It could just be formatting, but I don't think so.
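
One way to tell a display artifact from real corruption is to run the base64 payload of the data: URL through a strict decoder. The sketch below is a minimal Python example, assuming the payload has been copied by hand out of the rendered config. The function names are illustrative and not part of the MCO, and EXCERPT is only the first part of the payload quoted above.

```python
import base64
import binascii
import string

# Characters allowed in standard (non-URL-safe) base64, including padding.
B64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")

# First part of the payload from the RenderDegraded message, copied as shown
# in the report (including the embedded space).
EXCERPT = (
    "IyBTcGVjaWZ5IHRpbWUgc291cmNlcy4Kc2VydmVyICAgbnRwMmEubWwuY29tCnNlcnZlciAgIG50"
    " cDJi"
)


def first_invalid_base64_char(payload):
    """Return (index, char) of the first character outside the base64
    alphabet, or None if every character is valid."""
    for index, char in enumerate(payload):
        if char not in B64_ALPHABET:
            return index, char
    return None


def decode_strict(payload):
    """Decode base64 and raise binascii.Error on any non-alphabet character."""
    return base64.b64decode(payload, validate=True)


if __name__ == "__main__":
    print("first invalid character:", first_invalid_base64_char(EXCERPT))
    try:
        decode_strict(EXCERPT)
    except binascii.Error as err:
        print("strict decode failed:", err)
    # Removing whitespace shows whether the rest of the data is intact.
    cleaned = "".join(EXCERPT.split())
    print(decode_strict(cleaned).decode("utf-8"))
```

If the strict decode fails on the stored config but succeeds once whitespace is removed, that would suggest the payload was line-wrapped somewhere before it reached the MachineConfig, rather than the report output merely being reflowed for display.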

