Bug 2053616 - inability to detect static lifecycle failure [NEEDINFO]
Summary: inability to detect static lifecycle failure
Keywords:
Status: CLOSED DUPLICATE of bug 2053268
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.10
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: ---
Assignee: Wally
QA Contact: ge liu
URL:
Whiteboard: EmergencyRequest
Depends On: 2053582 2053617
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-02-11 16:08 UTC by Abu Kashem
Modified: 2022-02-14 15:26 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2053268
: 2053617 (view as bug list)
Environment:
Last Closed: 2022-02-14 15:26:32 UTC
Target Upstream Version:
Embargoed:
mfojtik: needinfo?


Attachments (Terms of Use)

Comment 1 Michal Fojtik 2022-02-11 16:09:38 UTC
** A NOTE ABOUT USING URGENT **

This BZ has been set to urgent severity and priority. When a BZ is marked urgent priority Engineers are asked to stop whatever they are doing, putting everything else on hold.
Please be prepared to have reasonable justification ready to discuss, and ensure your own and engineering management are aware and agree this BZ is urgent. Keep in mind, urgent bugs are very expensive and have maximal management visibility.

NOTE: This bug was automatically assigned to an engineering manager with the severity reset to *unspecified* until the emergency is vetted and confirmed. Please do not manually override the severity.

** INFORMATION REQUIRED **

Please answer these questions before escalation to engineering:

1. Has a link to must-gather output been provided in this BZ? We cannot work without. If must-gather fails to run, attach all relevant logs and provide the error message of must-gather.
2. Give the output of "oc get clusteroperators -o yaml".
3. In case of degraded/unavailable operators, have all their logs and the logs of the operands been analyzed [yes/no]
4. List the top 5 relevant errors from the logs of the operators and operands in (3).
5. Order the list of degraded/unavailable operators according to which is likely the cause of the failure of the other, root-cause at the top.
6. Explain why (5) is likely the right order and list the information used for that assessment.
7. Explain why Engineering is necessary to make progress.

Comment 2 Michal Fojtik 2022-02-11 16:39:44 UTC
** A NOTE ABOUT USING URGENT **

This BZ has been set to urgent severity and priority. When a BZ is marked urgent priority Engineers are asked to stop whatever they are doing, putting everything else on hold.
Please be prepared to have reasonable justification ready to discuss, and ensure your own and engineering management are aware and agree this BZ is urgent. Keep in mind, urgent bugs are very expensive and have maximal management visibility.

NOTE: This bug was automatically assigned to an engineering manager with the severity reset to *unspecified* until the emergency is vetted and confirmed. Please do not manually override the severity.

** INFORMATION REQUIRED **

Please answer these questions before escalation to engineering:

1. Has a link to must-gather output been provided in this BZ? We cannot work without. If must-gather fails to run, attach all relevant logs and provide the error message of must-gather.
2. Give the output of "oc get clusteroperators -o yaml".
3. In case of degraded/unavailable operators, have all their logs and the logs of the operands been analyzed [yes/no]
4. List the top 5 relevant errors from the logs of the operators and operands in (3).
5. Order the list of degraded/unavailable operators according to which is likely the cause of the failure of the other, root-cause at the top.
6. Explain why (5) is likely the right order and list the information used for that assessment.
7. Explain why Engineering is necessary to make progress.

Comment 5 Wally 2022-02-14 15:26:32 UTC

*** This bug has been marked as a duplicate of bug 2053268 ***


Note You need to log in before you can comment on or make changes to this bug.