Bug 2156703

Summary: OCP upgrade from 4.11.2 to 4.11.9 stuck for months
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Vered Berenstein Paz <bvered>
Component: unclassified Assignee: Mudit Agarwal <muagarwa>
Status: CLOSED NOTABUG QA Contact: Elad <ebenahar>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.11 CC: bniver, bvered, ebenahar, muagarwa, ocs-bugs, odf-bz-bot, sostapov, tmuthami
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-03-20 06:44:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
must-gather none

Description Vered Berenstein Paz 2022-12-28 08:26:22 UTC
Created attachment 1934693 [details]
must-gather

An OCP upgrade was triggered over 2 months ago and has made no progress since.
The upgrade is from 4.11.2 to 4.11.9.

Looking at the cluster operators, I see that image-registry reports the following error:
 Degraded: Registry deployment has timed out progressing: ReplicaSet "image-registry-54b54c884" has timed out progressing.


I started a Slack discussion about this issue here: https://coreos.slack.com/archives/CEGKQ43CP/p1671520275127179

According to a quick investigation in that discussion, the issue appears to be related to a PV, and one of the nodes might be unhealthy.

Attaching all the collected logs and outputs.

Comment 2 Vered Berenstein Paz 2022-12-28 08:29:04 UTC
I'm unable to upload the other reports I collected here,
so I'm attaching Slack links:

https://redhat.enterprise.slack.com/files/U03MFV332DR/F04FQ3Q0X1V/inspect.tar.gz 
https://redhat.enterprise.slack.com/files/U03MFV332DR/F04GCSA0UG4/node.log.gz

Comment 7 Vered Berenstein Paz 2023-01-08 13:25:28 UTC
Hi,
Any update on this ticket?
Do you need any additional logs? 
Were you able to download the logs from the Slack links?

Comment 9 Vered Berenstein Paz 2023-02-27 08:41:03 UTC
Opened a Jira ticket according to Elad's instructions - https://issues.redhat.com/browse/OCPBUGS-7976

Comment 11 Vered Berenstein Paz 2023-03-20 06:44:38 UTC
The issue has been fixed.
This ticket can be closed.