Bug 1919061

Summary: [RFE] fence_vmware_soap: Add diag option to send NMI
Product: Red Hat Enterprise Linux 8
Reporter: Reid Wahl <nwahl>
Component: fence-agents
Assignee: Oyvind Albrigtsen <oalbrigt>
Status: ASSIGNED
QA Contact: cluster-qe <cluster-qe>
Severity: medium
Docs Contact:
Priority: medium
Version: 8.3
CC: cfeist, cluster-maint, mjuricek, sbradley
Target Milestone: rc
Keywords: FutureFeature, Triaged
Target Release: 8.5
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Feature Request
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Reid Wahl 2021-01-22 02:28:58 UTC
Description of problem:

This is a request to add a `diag` option to fence_vmware_soap, analogous to the one used in fence_ipmilan. The idea is to send an NMI to the guest, in order to trigger a panic and collect a vmcore.

This appears to be possible with the SendNMI method that the SOAP API exposes on the VirtualMachine object.
  - https://code.vmware.com/apis/42/vsphere/doc/vim.VirtualMachine.html#sendNMI
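A minimal sketch of what a `diag` action could look like, assuming the agent holds a VM object that exposes a `SendNMI()` method (as pyVmomi's `vim.VirtualMachine` does). The names `FakeVM` and `send_diag` are hypothetical and exist only for illustration; the real fence_vmware_soap agent talks to the API through a different client, and a real server would report an unsupported call with its own fault type rather than Python's `NotImplementedError`.

```python
class FakeVM:
    """Hypothetical stand-in for a VirtualMachine object, for illustration only."""

    def __init__(self, supported=True):
        self.supported = supported
        self.nmi_count = 0

    def SendNMI(self):
        # Assumption: an endpoint without SendNMI support raises an error.
        # NotImplementedError is used here in place of the server's real fault.
        if not self.supported:
            raise NotImplementedError(
                "The requested operation is not implemented by the server."
            )
        self.nmi_count += 1


def send_diag(vm):
    """Send an NMI to the VM.

    Returns True if the call succeeded and False if the endpoint
    reports that the operation is not implemented.
    """
    try:
        vm.SendNMI()
    except NotImplementedError:
        return False
    return True


if __name__ == "__main__":
    print(send_diag(FakeVM(supported=True)))
    print(send_diag(FakeVM(supported=False)))
```

Returning a boolean rather than raising lets the caller decide whether an unsupported endpoint should count as a failed fence action.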

As much as I would love to add this feature to fence_vmware_rest as well, as far as I can tell the REST API has no equivalent operation. If someone finds a way to send an NMI via the REST API, then let's try to add this to the REST agent too.
  - https://developer.vmware.com/docs/vsphere-automation/latest/vcenter/operation-index/

-----

Version-Release number of selected component (if applicable):

fence-agents-vmware-soap-4.2.1-53.el8_3.1

-----

How reproducible:

N/A / always

-----

Steps to Reproduce:

Run fence_vmware_soap with `-o diag`.

-----

Actual results:

# fence_vmware_soap -a vcenter.gsslab.brq.redhat.com -l gssuser -p <pw> --ssl-insecure -o diag -n my-vm-name
2021-01-21 18:25:52,338 ERROR: Failed: Unrecognised action 'diag'

2021-01-21 18:25:52,339 ERROR: Please use '-h' for usage

-----

Expected results:

An NMI is sent to the VM. If kdump is configured and the guest OS is configured to panic on NMI, then a vmcore is generated.
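As a rough illustration of the "configured to panic on NMI" precondition, the sketch below checks a sysctl value from inside the guest. It assumes that `kernel.unknown_nmi_panic` is the relevant setting, which is the commonly used one on Linux but may not be the only one that matters for a given guest. The function name `panics_on_unknown_nmi` is hypothetical.

```python
def panics_on_unknown_nmi(path="/proc/sys/kernel/unknown_nmi_panic"):
    """Return True if the sysctl file at `path` contains "1".

    Assumption: kernel.unknown_nmi_panic=1 is what makes the guest
    panic when it receives the NMI. A missing or unreadable file is
    treated as "not configured".
    """
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        return False


if __name__ == "__main__":
    if panics_on_unknown_nmi():
        print("Guest will panic on an unknown NMI.")
    else:
        print("Guest is not configured to panic on an unknown NMI.")
```

If the check returns False, an NMI sent by the agent would most likely be logged and ignored by the guest, so no vmcore would be produced.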

Comment 2 Reid Wahl 2021-01-22 05:02:27 UTC
This might not be possible. I performed two tests against nwahl-rhel-7-node1 and got the following result in the vCenter log both times: "The requested operation is not implemented by the server."

Based on the GitHub issue below, SendNMI might be supported only on the ESXi host and not on vCenter. If so, that is another very unfortunate VMware API limitation.
  - https://github.com/vmware/pyvmomi/issues/726

There seems to be a workaround involving vm-support. I haven't tested it yet, and it may or may not be viable.
  - https://gist.github.com/prziborowski/1a8ee0e3e4185e07f208212fcc083078


This BZ is just a nice-to-have anyway. Adding the feature looked simple enough until the "not implemented" error.