Bug 2232106 - Cluster needs a few hours to recover after shutting down 2 worker nodes (10-minute shutdown)
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: unclassified
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Mudit Agarwal
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-15 10:16 UTC by Aviad Polak
Modified: 2023-08-15 10:55 UTC (History)




Description Aviad Polak 2023-08-15 10:16:09 UTC
After running the automated test https://github.com/red-hat-storage/ocs-ci/blob/master/tests/manage/z_cluster/nodes/test_check_pod_status_after_two_nodes_shutdown_recovery.py as part of our Tier4 testing, the cluster took a few hours to recover.
The test flow is to shut down 2 (out of 3) worker nodes for 10 minutes, then start them again and check the cluster status. After the run, a few pods went into CrashLoopBackOff (CLBO) or Init status. It took a few hours before the cluster recovered.
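
For triage, the pod states described above can be listed from the output of `oc get pods -n openshift-storage -o json`. The following is a minimal sketch, assuming the standard Kubernetes PodList JSON layout (`status.phase`, `status.containerStatuses`, `status.initContainerStatuses`). The function name `unhealthy_pods`, the namespace, and the pod names in the sample are hypothetical and are not taken from ocs-ci or from this cluster.

```python
def unhealthy_pods(pod_list):
    """Return (pod name, reason) pairs for pods that are not healthy.

    pod_list is the parsed JSON of a Kubernetes PodList, for example the
    output of `oc get pods -n openshift-storage -o json`.
    """
    problems = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        if phase == "Succeeded":
            continue  # completed pods are not a problem
        reason = None
        # An init container that has not exited with code 0 keeps the pod in Init.
        for cs in status.get("initContainerStatuses", []):
            terminated = cs.get("state", {}).get("terminated")
            if not (terminated and terminated.get("exitCode") == 0):
                reason = "Init"
                break
        # A waiting container carries a reason such as CrashLoopBackOff.
        if reason is None:
            for cs in status.get("containerStatuses", []):
                waiting = cs.get("state", {}).get("waiting")
                if waiting:
                    reason = waiting.get("reason", "Waiting")
                    break
                if not cs.get("ready", False):
                    reason = "NotReady"
                    break
        if reason is None and phase != "Running":
            reason = phase
        if reason:
            problems.append((name, reason))
    return problems


# Hypothetical sample data, shaped like `oc get pods -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "pod-healthy"},
         "status": {"phase": "Running",
                    "containerStatuses": [{"ready": True, "state": {"running": {}}}]}},
        {"metadata": {"name": "pod-crashing"},
         "status": {"phase": "Running",
                    "containerStatuses": [{"ready": False,
                                           "state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}},
        {"metadata": {"name": "pod-initializing"},
         "status": {"phase": "Pending",
                    "initContainerStatuses": [{"state": {"running": {}}}]}},
    ]
}
print(unhealthy_pods(sample))
```

Attaching the resulting list at a few points during the recovery window would show which pods stayed in CLBO or Init the longest.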

Version of all relevant components (if applicable):
ODF: full_version: 4.14.0-102
OCP:
openshiftVersion: 4.14.0-0.nightly-2023-08-08-005757
releaseClientVersion: 4.14.0-0.ci-2023-07-11-133509
serverVersion:
  buildDate: "2023-08-03T17:26:35Z"
  compiler: gc
  gitCommit: ee9c1a1f13b06f5e2a79dcbd06285ec3f8315448
  gitTreeState: clean
  gitVersion: v1.27.3+e123787
  goVersion: go1.20.5 X:strictfipsruntime
  major: "1"
  minor: "27"
  platform: linux/amd64

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Steps to Reproduce:
1. Run the automated test described above.
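
To put a number on the recovery time when reproducing, a polling loop along these lines could be run once the two nodes are powered back on. This is only a sketch and not part of the ocs-ci test: the names `wait_for_recovery` and `pods_not_ready`, the `openshift-storage` namespace, the 6-hour timeout, and the 60-second interval are all assumptions chosen for illustration.

```python
import json
import subprocess
import time


def pods_not_ready(namespace="openshift-storage"):
    """Return names of pods that are not fully ready (namespace is an assumption)."""
    out = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    names = []
    for pod in json.loads(out).get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Succeeded":
            continue  # completed pods are fine
        statuses = status.get("containerStatuses", [])
        all_ready = bool(statuses) and all(cs.get("ready") for cs in statuses)
        if status.get("phase") != "Running" or not all_ready:
            names.append(pod["metadata"]["name"])
    return names


def wait_for_recovery(check, timeout=6 * 3600, interval=60,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns an empty list; return elapsed seconds.

    Raises TimeoutError if problems remain once `timeout` seconds have passed.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        problems = check()
        elapsed = clock() - start
        if not problems:
            return elapsed
        if elapsed >= timeout:
            raise TimeoutError(
                f"still unhealthy after {elapsed:.0f}s: {problems}")
        sleep(interval)


# Example call against a live cluster (requires a logged-in `oc` client):
# elapsed = wait_for_recovery(pods_not_ready)
# print(f"cluster recovered after {elapsed / 60:.1f} minutes")
```

The check is passed in as a callable so the same loop can be reused with a stricter health check, for example one that also looks at Ceph health.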

