Description of problem:
Running "stonith_admin --confirm nonexistent_node" returns 0 and does not print any warning or error.

Version-Release number of selected component (if applicable):
pacemaker-1.1.15-10.el7.x86_64

How reproducible:
always, easily

Steps to Reproduce:
stonith_admin --confirm nonexistent_node

Actual results:
Exit code 0, no error messages.

Expected results:
Non-zero exit code, error message saying it is not possible to confirm fencing of a nonexistent node.

Additional info:
corosync.log:
Oct 18 16:18:38 [12585] rh72-node1 stonith-ng: notice: handle_request: Received manual confirmation that nonexistent is fenced
Oct 18 16:18:38 [12585] rh72-node1 stonith-ng: notice: initiate_remote_stonith_op: Initiating manual confirmation for nonexistent: 6eb84017-6c8c-4041-a323-a4e0ae75e38a
Oct 18 16:18:38 [12585] rh72-node1 stonith-ng: notice: stonith_manual_ack: Injecting manual confirmation that nonexistent is safely off/down
Oct 18 16:18:38 [12585] rh72-node1 stonith-ng: notice: remote_op_done: Operation off of nonexistent by a human for stonith_admin.13024: OK
Oct 18 16:18:38 [12589] rh72-node1 crmd: notice: tengine_stonith_notify: Peer nonexistent was terminated (off) by a human for rh72-node1: OK (ref=6eb84017-6c8c-4041-a323-a4e0ae75e38a) by client stonith_admin.13024
This is not a bug.

First, stonithd allows the user to use node names that aren't currently known at the cluster level, whether with --confirm or something like pcmk_host_list, because the node may be added to the cluster at any time, or it may have joined a partition that stonith_admin currently can't see (but a fence device can shoot).

Second (and not widely known), stonithd is designed to be usable for fencing anything, not just cluster nodes. A user can register arbitrary node names and arbitrary fence devices that can fence those nodes, and request that stonithd perform fencing. As long as some device is capable of fencing the node, stonithd doesn't care what the node name is or whether it is part of the cluster. The stonithd regression tests even use this behavior to set up imaginary fence scenarios.
Thanks for the quick answer and thorough explanation. Considering that the "pcs stonith confirm" command is pretty much just a wrapper for "stonith_admin --confirm", do you think it should or should not check whether a node exists in the cluster? Based on your explanation, pcs should not check it. If pacemaker cannot see a node, then pcs, which gets its list of nodes from pacemaker, cannot see it either. The check would make it impossible to confirm that such an invisible node has been fenced. However, it might be a good idea to explain the behavior of the command in more detail in the pcs documentation.
Agreed. At most, pcs could indicate in its success/fail message whether the fenced node was a known cluster node or not.
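A rough sketch of what that could look like, in Python since pcs is written in it. This is not pcs code: the function names are hypothetical, the message wording is made up, and the assumed "<id> <name> [<state>]" layout of "crm_node -l" output may differ between pacemaker versions. The point is only that the node lookup annotates the message and never blocks the confirmation.

```python
import subprocess


def parse_known_nodes(crm_node_output):
    """Extract node names from `crm_node -l` style output.

    Assumption: each non-empty line looks like "<id> <name> [<state>]".
    """
    names = set()
    for line in crm_node_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            names.add(fields[1])
    return names


def confirm_message(node, known_nodes, exit_code):
    """Build the user-facing message; an unknown name is noted, not rejected."""
    if exit_code != 0:
        return "Error: unable to confirm fencing of node '%s'" % node
    if node in known_nodes:
        return "Node '%s' confirmed fenced" % node
    return ("Node '%s' confirmed fenced "
            "(note: not a currently known cluster node)" % node)


def confirm_fencing(node):
    """Hypothetical wrapper: run the confirmation, then annotate the result."""
    # Ask pacemaker which nodes it currently knows about.
    listing = subprocess.run(["crm_node", "-l"],
                             capture_output=True, text=True)
    known = parse_known_nodes(listing.stdout)
    # Always pass the request through, whether or not the name is known.
    result = subprocess.run(["stonith_admin", "--confirm", node])
    return confirm_message(node, known, result.returncode)
```

With the node from this report, confirm_fencing("nonexistent_node") would still succeed, but the message would carry the "not a currently known cluster node" note.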