Bug 2219552

Summary: NHC sections are disconnected
Product: Container Native Virtualization (CNV) Reporter: Fabian Deutsch <fdeutsch>
Component: Documentation Assignee: Avital Pinnick <apinnick>
Status: CLOSED NEXTRELEASE QA Contact: Oded Ramraz <oramraz>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.13.0 Flags: apinnick: needinfo? (fdeutsch)
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-07-20 08:34:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Fabian Deutsch 2023-07-04 09:36:07 UTC
Document URL: 
https://docs.openshift.com/container-platform/4.13/virt/install/preparing-cluster-for-virt.html#cluster-high-availability-options_preparing-cluster-for-virt
~ and ~
https://docs.openshift.com/container-platform/4.13/virt/virtual_machines/virt-triggering-vm-failover-resolving-failed-node.html

Section Number and Name: 
See above

Describe the issue: Today, two related parts of the "fencing" documentation are disconnected.
One part is under "Installation"; the other is under "VM Management".

Suggestions for improvement: 
Create a new section called "High Availability" under Advanced VM management that links to the two existing sections.

Additional information:

Comment 1 Avital Pinnick 2023-07-05 07:57:06 UTC
Fabian,

We have an existing section called "Node maintenance" (https://docs.openshift.com/container-platform/4.13/virt/node_maintenance/virt-about-node-maintenance.html). We could move the HA section and node documentation there. What do you think?

Also, the Operator name needs to be updated to Self Node Remediation Operator in the docs.

Comment 2 Fabian Deutsch 2023-07-05 08:02:19 UTC
Avital, I see where you are coming from.

But I actually think that the two places we have today are good:
1. Installation, to call out the need to make sure HA is enabled.
2. A user looking to understand how to make a VM HA; this belongs in a VM section.

Thus, while it concerns nodes, the two existing sections are good where they are.
However, I could imagine a new section under node called "Node recovery" or "remediation" that explains how this can be done manually or automatically. But this would be an RFE, IIUC.

Comment 3 Avital Pinnick 2023-07-05 08:25:19 UTC
Fabian,
> 2. A user looking to understand how to make a VM HA; this belongs in a VM section.

Re: https://docs.openshift.com/container-platform/4.13/virt/virtual_machines/virt-triggering-vm-failover-resolving-failed-node.html

This looks like a remediation to me. I am not sure why it belongs in the VM section.

Comment 4 Fabian Deutsch 2023-07-07 12:15:45 UTC
You are right: https://docs.openshift.com/container-platform/4.13/virt/virtual_machines/virt-triggering-vm-failover-resolving-failed-node.html fits better in node maintenance.
The remaining issue: we do not have anything about VM HA, IIRC. We do not call it out specifically in the docs.

Thus, from a content perspective, as you said, it should move to node maintenance, but from a topic perspective we should make sure we fill the gap for VM HA.

Thus, I'm now requesting a short section called "High-Availability" in "Advanced Virtual Machine Management" to
- state that VM HA is enabled by enabling node remediation
- state that node remediation can be done manually (link to resolving failed node) or automatically with NHC/MHC (link to preparing cluster)

Comment 5 Avital Pinnick 2023-07-19 11:10:44 UTC
Fabian, please review the PR: https://github.com/openshift/openshift-docs/pull/62506

Comment 6 Avital Pinnick 2023-07-20 08:34:16 UTC
Changes approved and merged for 4.14