Bug 2155453

Summary: fence_ibm_powervs fencing agent performance enhancements needed (RHEL8)
Product: Red Hat Enterprise Linux 8 Reporter: Andreas Schauberer <andreas.schauberer>
Component: fence-agents Assignee: Oyvind Albrigtsen <oalbrigt>
Status: MODIFIED --- QA Contact: Brandon Perkins <bperkins>
Severity: low Docs Contact:
Priority: unspecified    
Version: 8.4 CC: bperkins, cfeist, cluster-maint, fdanapfe, ksatarin
Target Milestone: rc Keywords: Triaged
Target Release: 8.9   
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: fence-agents-4.2.1-119.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2221643 (view as bug list) Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2221643    
Attachments:
Description Flags
enhanced fence_ibm_powervs.py none

Description Andreas Schauberer 2022-12-21 10:00:55 UTC
Description of problem:
The current version of the fence agent fence_ibm_powervs uses --method=onoff for the reboot action. As a result, failover takes about twice as long as with the new proposal.
 
Version-Release number of selected component (if applicable):
File: https://github.com/ClusterLabs/fence-agents/blob/main/agents/ibm_powervs/fence_ibm_powervs.py
Change Date: Oct 25 2022 
Tested Version: https://github.com/ClusterLabs/fence-agents/blob/3373431dc49d6e429bbf613765385cb33a56e917/agents/ibm_powervs/fence_ibm_powervs.py
 
How reproducible:
always
 
Steps to Reproduce:
1.	Deploy two PowerVS LPARs for SAP HANA using RHEL8.4 image
see https://cloud.ibm.com/docs/power-iaas?topic=power-iaas-creating-power-virtual-server 
and https://cloud.ibm.com/docs/sap?topic=sap-hana-iaas-offerings-profiles-power-vs  
2.	Install SAP HANA 2.0 on both nodes 
see https://help.sap.com/docs/SAP_HANA_PLATFORM/2c1988d620e04368aa4103bf26f17727/7eb0167eb35e4e2885415205b8383584.html?locale=en-US
3.	Set up SAP HANA System Replication and its Pacemaker HSR cluster policy
see https://access.redhat.com/articles/3004101
4.	Create PowerVS fence agent with command:
pcs stonith create fence_device fence_ibm_powervs token=${APIKEY} crn=${IBMCLOUD_CRN} instance=${GUID} region=${CLOUD_REGION} api-type=private proxy=http://${PROXY_IP}:3128  pcmk_host_map="${NODE1}:${POWERVSI_01};${NODE2}:${POWERVSI_02}" pcmk_reboot_timeout=600 pcmk_monitor_timeout=600
5.	Delay the reboot by setting GRUB_TIMEOUT=3600 in /etc/default/grub and running "grub2-mkconfig -o /boot/grub2/grub.cfg" (set a password for the console user before rebooting)
6.	Reboot the primary HSR node using "sync; echo b > /proc/sysrq-trigger"
 
Actual results:
Messages like the following will be shown:
Node List:
  * Node sap-ha-s1-1: online:
    * Resources:
      * fence_device    (stonith:fence_ibm_powervs):     Started (Monitoring)
      * SAPHanaTopology_ASD_00  (ocf::heartbeat:SAPHanaTopology):        Started
      * SAPHana_ASD_00  (ocf::heartbeat:SAPHana):        Slave
  * Node sap-ha-s1-2: UNCLEAN (offline):
    * Resources:
      * fence_device    (stonith:fence_ibm_powervs):     Started (Monitoring)
      * SAPHanaTopology_ASD_00  (ocf::heartbeat:SAPHanaTopology):        Started
      * vip_ASD_00      (ocf::heartbeat:IPaddr2):        Started
      * SAPHana_ASD_00  (ocf::heartbeat:SAPHana):        Master
-> Node sap-ha-s1-2 is flagged UNCLEAN, and node sap-ha-s1-1 takes over only after at least 2 minutes.
 
Expected results:
-> Node sap-ha-s1-2 should be flagged (offline), and node sap-ha-s1-1 should take over the VIP and the HANA primary in less than 2 minutes (in most cases about 1 minute).
 
Additional info:
-	A new version of the fencing agent that uses --method=cycle (def reboot_cycle) is needed. It helps to reduce the overall failover time.
-	The new version of the fencing agent should also reduce the get_power_status execution time compared to the current version.

Development question:
The RHEL 8 and RHEL 9 documentation contains the following statement in chapter “9.1. Displaying available fence agents and their options”:
Warning: For fence agents that provide a method option, a value of cycle is unsupported and should not be specified, as it may cause data corruption.

Is it still allowed to use the method cycle (as coded below) as the default in a fencing agent, given that the existing fencing agents fence_sbd and fence_gce also use this default?
all_opt["method"]["default"] = "cycle"
The fencing agent fence_ibm_powervs uses PowerVM “immediate-shutdown” for off and PowerVM “hard-reboot” for reboot which leads to the same possibilities of data corruption.
Since the end user does not need to specify this method, I would still like to go with the new method cycle if it is supported as a default in general.

Comment 1 Chris Feist 2022-12-21 15:28:22 UTC
We strongly recommend against using cycle because we've had problems in the past with the cycle/reboot call failing to succeed, but not notifying the cluster.  With off/on calls the cluster runs a check to confirm the node is powered off giving us higher confidence that we won't have to deal with a split brain situation.

That being said, I don't think using the 'onoff' method should double recovery times (assuming the actual API calls happen quickly).

How long does it take for the fencing agent to just power off the node?  Is there a large delay between the API call and when the node actually powers off?  Also, when using the cycle method does the API just immediately return before the node is actually restarted?  This will make things appear to speed up, but could be dangerous because the cluster thinks the node has been powered off, but it is actually running.
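
The concern in this comment can be illustrated with a small simulation. This is only a sketch: it is not code from fence_ibm_powervs or the fencing library, and `FakeNode`, `wait_for`, `reboot_onoff`, and `reboot_cycle` are hypothetical names, with an in-memory object standing in for the cloud API. It assumes an API that reports success for every request, and shows why an off/on sequence with a status check notices a power-off that never took effect while a single reboot call does not.

```python
import time


class FakeNode:
    """In-memory stand-in for a cloud instance (hypothetical, for illustration)."""

    def __init__(self, ignores_requests=False):
        self.power = "on"
        # When True, requests are accepted but the node never changes state.
        self.ignores_requests = ignores_requests

    def request(self, action):
        """Pretend REST call that always reports success."""
        if not self.ignores_requests:
            if action == "off":
                self.power = "off"
            elif action in ("on", "reboot"):
                self.power = "on"
        return True

    def status(self):
        return self.power


def wait_for(node, wanted, retries=3, delay=0.0):
    """Poll the status until it matches `wanted` or the retries run out."""
    for _ in range(retries):
        if node.status() == wanted:
            return True
        time.sleep(delay)
    return False


def reboot_onoff(node):
    """Off, verify off, then on: success requires a confirmed power-off."""
    node.request("off")
    if not wait_for(node, "off"):
        return False  # fencing is reported as failed
    node.request("on")
    return True


def reboot_cycle(node):
    """Single reboot call: the result is only as reliable as the API's answer."""
    return node.request("reboot")


healthy, stuck = FakeNode(), FakeNode(ignores_requests=True)
print("healthy node:", reboot_onoff(healthy), reboot_cycle(healthy))
print("stuck node:  ", reboot_onoff(stuck), reboot_cycle(stuck))
```

For the stuck node, the onoff variant reports a failure while the cycle variant still reports success, which is the split-brain risk described above. Whether a real reboot API behaves this way depends on whether its success response implies that the node was actually stopped.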

Comment 2 Andreas Schauberer 2023-01-12 14:50:25 UTC Comment hidden (obsolete)
Comment 3 Andreas Schauberer 2023-02-03 19:59:08 UTC Comment hidden (obsolete)
Comment 4 Andreas Schauberer 2023-02-07 09:55:51 UTC
I executed 36 HANA failover tests and all failovers were successful. This leads me to the conclusion that this is a rock solid HANA HA solution.
Nevertheless, my tests show that the fencing agent can be improved to get faster failover times.
The biggest difference between the GA and the newly proposed fencing agent is the number of REST API calls.

Below are the failover times for the GA fence_ibm_powervs (extension .org) compared to the failover times for the newly proposed fence_ibm_powervs (extension .mod).

Change history for proposed fence_ibm_powervs.mod
 - For action=status, one fewer PowerVS REST API call is now used, so the execution time is half of the original time. The original code triggered a second status call to get the list of all LPAR instances for a given workspace; this is now done only if the first status call fails.
 - For action=reboot (method=cycle), only the PowerVS REST API action HARD_REBOOT is now called. When this call returns OK, it is safe to say that the node was stopped. The original code had no cycle method implemented.
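
The status change described in the first item can be sketched as follows. This is a minimal illustration, not the code from the attachment or the upstream agent: `CountingClient`, `get_instance`, `list_instances`, and both `power_status_*` functions are hypothetical names, the "ACTIVE" status value is a placeholder, and the order of the two calls in the original pattern is an assumption. The point is only that the listing call moves from the normal path to the error path.

```python
class CountingClient:
    """Hypothetical stand-in for a REST client that counts its requests."""

    def __init__(self, instances):
        self.instances = instances  # maps instance id -> status string
        self.calls = 0

    def get_instance(self, instance_id):
        self.calls += 1
        # Raises KeyError for an unknown instance id.
        return {"status": self.instances[instance_id]}

    def list_instances(self):
        self.calls += 1
        return dict(self.instances)


def power_status_always_list(client, instance_id):
    """Original pattern as described: query the instance and always list all."""
    if instance_id not in client.list_instances():
        raise KeyError(instance_id)
    return client.get_instance(instance_id)["status"]


def power_status_direct_first(client, instance_id):
    """Proposed pattern: query the instance; list only if that query fails."""
    try:
        return client.get_instance(instance_id)["status"]
    except KeyError:
        # Fallback path: use the list to report which instances do exist.
        known = sorted(client.list_instances())
        raise KeyError(f"{instance_id} not found; known instances: {known}")


old = CountingClient({"lpar-1": "ACTIVE"})
new = CountingClient({"lpar-1": "ACTIVE"})
power_status_always_list(old, "lpar-1")
power_status_direct_first(new, "lpar-1")
print("calls with always-list:", old.calls, "calls with direct-first:", new.calls)
```

In this sketch the successful path drops from two requests to one, and the extra request is paid only when the direct query fails.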

Three failover scenarios were tested, measuring recovery times for a HANA client connected to the current HANA primary node:
 - Time to recover in seconds is calculated as the time between the last successful client call to the old primary node and the first successful client call to the new primary node.

Failover scenarios (time to recover avg/min/max):
- Scenario: HDB primary cmd “HDB kill -9” 
  - fence_ibm_powervs.mod: avg 61, min 24, max 81 sec 
  - fence_ibm_powervs.org: avg 68, min 51, max 83 sec 
- Scenario: HDB primary “LPAR Immediate shutdown” 
  - fence_ibm_powervs.mod: avg 60, min 41, max 66 sec 
  - fence_ibm_powervs.org: avg 98, min 64, max 135 sec 
- Scenario: HDB primary “OS kernel panic” 
  - fence_ibm_powervs.mod: avg 54, min 30, max 89 sec 
  - fence_ibm_powervs.org: avg 120, min 89, max 128 sec 
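
The time-to-recover definition above can be expressed as a short calculation. The sketch below is an assumption about how such a measurement could be evaluated, not the script used for these tests: `time_to_recover`, `summarize`, and `at` are hypothetical helpers, each probe is assumed to be recorded as (timestamp, node, success), and the sample probes are invented purely to exercise the code.

```python
from datetime import datetime
from statistics import mean


def time_to_recover(probes, old_primary, new_primary):
    """Seconds between the last successful probe on the old primary and the
    first later successful probe on the new primary."""
    last_old = max(ts for ts, node, ok in probes if ok and node == old_primary)
    first_new = min(
        ts for ts, node, ok in probes
        if ok and node == new_primary and ts > last_old
    )
    return (first_new - last_old).total_seconds()


def summarize(durations):
    """Average, minimum, and maximum over several failover runs."""
    return {"avg": mean(durations), "min": min(durations), "max": max(durations)}


def at(hms):
    """Parse an HH:MM:SS string into a datetime (date part is irrelevant here)."""
    return datetime.strptime(hms, "%H:%M:%S")


# Invented probe log for one failover run, for illustration only.
run = [
    (at("10:00:00"), "node-a", True),
    (at("10:00:02"), "node-a", True),
    (at("10:00:04"), "node-a", False),
    (at("10:00:30"), "node-b", False),
    (at("10:01:02"), "node-b", True),
]
print(time_to_recover(run, "node-a", "node-b"))
```

Collecting one such value per failover run and passing the list to `summarize` gives avg/min/max figures in the same form as the list above.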

Four different fencing actions were tested:
 - Execution time in seconds is calculated as the time for the fencing agent calling required REST APIs for one specific fencing action.

Fencing action scenarios (PowerVS REST API call number) (execution time avg/min/max):
- Scenario: action=status 
  - “fence_ibm_powervs.mod -o status” (2 API calls) avg 12, min 6, max 28 sec 
  - “fence_ibm_powervs.org -o status” (3 API calls) avg 24, min 10, max 39 sec 
- Scenario: action=off
  - “fence_ibm_powervs.mod -o off” (4 API calls) avg 44, min 28, max 57 sec
  - “fence_ibm_powervs.org -o off” (6 API calls) avg 81, min 74, max 89 sec
- Scenario: action=on
  - “fence_ibm_powervs.mod -o on” (4 API calls) avg 50, min 24, max 73 sec 
  - “fence_ibm_powervs.org -o on” (6 API calls) avg 74, min 48, max 94 sec 
- Scenario: action=reboot
  - “fence_ibm_powervs.mod -o reboot” (3 API calls) avg 12, min 9, max 19 sec 
  - “fence_ibm_powervs.org -o reboot” (9 API calls) avg 95, min 71, max 132 sec 
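
To put the averages from the list above side by side, the following sketch computes the reduction in API calls and the ratio of average execution times per action. The numbers are copied from the list; the `reported` dictionary and the `speedup` helper are just an illustrative way to arrange them.

```python
# Per action: ((mod API calls, mod avg seconds), (org API calls, org avg seconds)),
# copied from the reported results.
reported = {
    "status": ((2, 12), (3, 24)),
    "off": ((4, 44), (6, 81)),
    "on": ((4, 50), (6, 74)),
    "reboot": ((3, 12), (9, 95)),
}


def speedup(action):
    """Ratio of the original average time to the modified average time."""
    (_, mod_avg), (_, org_avg) = reported[action]
    return org_avg / mod_avg


for action, ((mod_calls, _), (org_calls, _)) in reported.items():
    print(
        f"{action}: {org_calls - mod_calls} fewer API calls, "
        f"{speedup(action):.1f}x faster on average"
    )
```

The ratios are derived from averages only, so they do not reflect the spread between the reported minimum and maximum times.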

Test Setup:
 - 2 node pacemaker cluster using RHEL HA Add-On SAP HANA HSR policy
 - Third system to run sql client to get failover times using script:
   while [ 1=1 ]; do date;/usr/sap/hdbclient/hdbsql -n <IP-Address> -i 00 -u SYSTEM -p <pw> 'select HARDWARE_KEY,SYSTEM_ID,EXPIRATION_DATE,PERMANENT,VALID from M_LICENSE';sleep 2;done

Comment 5 Oyvind Albrigtsen 2023-02-07 10:04:07 UTC
Nice. Can you add the modified agent to the bz, so I can see your changes?

Comment 6 Andreas Schauberer 2023-02-07 11:12:18 UTC
Created attachment 1942698 [details]
enhanced fence_ibm_powervs.py

enhanced fence_ibm_powervs.py:
 - For action=status, one fewer PowerVS REST API call is now used, so the execution time is half of the original time. The original code triggered a second status call to get the list of all LPAR instances for a given workspace; this is now done only if the first status call fails.
 - For action=reboot (method=cycle), only the PowerVS REST API action HARD_REBOOT is now called. When this call returns OK, it is safe to say that the node was stopped. The original code had no cycle method implemented.

Comment 7 Oyvind Albrigtsen 2023-02-09 10:27:01 UTC
Looks good to me, but we should keep the cycle method optional as the onoff method ensures the device has been fully rebooted.

Feel free to make an upstream PR at https://github.com/ClusterLabs/fence-agents unless you want me to make one for you.

Comment 8 Andreas Schauberer 2023-02-20 15:32:37 UTC
I plan to open the upstream PR this week.

Comment 9 Oyvind Albrigtsen 2023-04-17 14:33:35 UTC
Any updates on this?

Comment 10 Andreas Schauberer 2023-04-25 09:57:17 UTC
Opened pull request #542 for these changes.
https://github.com/ClusterLabs/fence-agents/pull/542

Comment 12 Andreas Schauberer 2023-06-16 06:54:51 UTC
All review comments have been resolved in PR #542.