2262070 – Network fencing is not applied on a node which is down when 'node.kubernetes.io/out-of-service=nodeshutdown:NoExecute' label is applied

Bug 2262070 - Network fencing is not applied on a node which is down when 'node.kubernetes.io/out-of-service=nodeshutdown:NoExecute' label is applied [NEEDINFO]

Keywords:
Status:	ON_QA
Alias:	None
Product:	Red Hat OpenShift Data Foundation
Classification:	Red Hat Storage
Component:	rook
Sub Component:
Version:	4.15
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	ODF 4.15.8
Assignee:	Subham Rai
QA Contact:	jpinto
Docs Contact:
URL:
Whiteboard:
Depends On:	2259668
Blocks:	2265124

Reported:	2024-01-31 09:36 UTC by Joy John Pinto
Modified:	2024-10-09 16:33 UTC
CC List:	8 users
Fixed In Version:	4.15.8-1
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
Flags:	srai: needinfo? (kramdoss)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	red-hat-storage rook pull 572	0	None	open	Bug 2262070: core: remove namespace/ownerRef from networkFence	2024-02-14 02:58:30 UTC
Github	rook rook pull 13718	0	None	Draft	core: remove namespace/ownerRef from networkFence	2024-02-07 20:37:55 UTC

Description Joy John Pinto 2024-01-31 09:36:00 UTC

Description of problem (please be as detailed as possible and provide log
snippets):
Network fencing is not applied on a node which is down when the 'node.kubernetes.io/out-of-service=nodeshutdown:NoExecute' taint is applied

Version of all relevant components (if applicable):
OCP 4.15 and ODF 4.15.0-126

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
NA

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
NA

Steps to Reproduce:
1. Install OpenShift Data Foundation and deploy an application pod on the same node as the rook-ceph operator pod
2. Shut down the node on which the CephFS RWO pod is deployed
3. Once the node is down, add the taint
```oc adm taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute```
4. Wait for some time (if the application pod and the rook operator are on the same node, wait a bit longer), then check the NetworkFence CR status
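
The check in the last step could be scripted roughly as follows. This is only a sketch, not taken from the report: it assumes the NetworkFence CRD is the csi-addons one (`networkfences.csiaddons.openshift.io`) and that ODF runs in the `openshift-storage` namespace. The function names and the `ODF_NAMESPACE` variable are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch only: helpers for inspecting fencing state after the taint is added.
# Assumed (not stated in the report): CRD group csiaddons.openshift.io and
# namespace openshift-storage. Function names are illustrative.

# Print the taints currently set on the given node.
node_taints() {
  oc get node "$1" -o jsonpath='{.spec.taints}'
}

# List NetworkFence CRs. No entries after tainting a down node matches the bug.
list_network_fences() {
  oc get networkfences.csiaddons.openshift.io
}

# Show recent rook-ceph operator log lines that mention fencing.
operator_fence_logs() {
  oc -n "${ODF_NAMESPACE:-openshift-storage}" logs deploy/rook-ceph-operator --tail=500 \
    | grep -i fence
}
```

After sourcing the file, running `node_taints <node-name>` followed by `list_network_fences` shows whether the taint is present and whether a NetworkFence CR exists for it. `operator_fence_logs` may help to see whether the operator attempted to create one.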

Actual results:
Network fence is not created if the node is down, but the pod gets rescheduled on the new node immediately

Expected results:
Network fence should be created when the node is down

Additional info:
When the node is up and running, applying the taint with 'oc adm taint nodes compute-1 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute' does create the network fence

Comment 16 Joy John Pinto 2024-04-18 07:08:22 UTC

Status was updated to the Verified state by mistake; moving it back to the Assigned state.

Comment 21 Sunil Kumar Acharya 2024-08-26 11:22:42 UTC

Are there any blockers to provide devel ack for this bz? If not, please provide the devel ack.