Bug 1321445

Summary: 8.0 puddle - 2016-03-22.1 Error: unable to fence 'overcloud-controller-2' Command failed: No route to host
Product: Red Hat OpenStack Reporter: Asaf Hirshberg <ahirshbe>
Component: rhosp-director Assignee: Giulio Fidente <gfidente>
Status: CLOSED NOTABUG QA Contact: Asaf Hirshberg <ahirshbe>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 8.0 (Liberty) CC: ahirshbe, dbecker, dsneddon, emacchi, gfidente, mburns, mcornea, morazi, oblaut, rhel-osp-director-maint
Target Milestone: ga Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-01 17:57:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Asaf Hirshberg 2016-03-27 08:02:15 UTC
Description of problem:
Fencing a controller fails in an IPv6 environment.

[root@overcloud-controller-1 ~]# pcs stonith fence overcloud-controller-2
Error: unable to fence 'overcloud-controller-2'
Command failed: No route to host

Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:   notice: handle_request: Client stonith_admin.26387.29b1abe6 wants to fence (reboot) 'overcloud-controller-2' with device '(any)'
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:   notice: initiate_remote_stonith_op:     Initiating remote operation reboot for overcloud-controller-2: 9641de72-e221-4a6e-9c19-2ca3903953da (0)
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:   notice: can_fence_host_with_device:     ipmilan-overcloud-controller-2 can fence (reboot) overcloud-controller-2: static-list
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:   notice: can_fence_host_with_device:     ipmilan-overcloud-controller-0 can not fence (reboot) overcloud-controller-2: static-list
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:     info: process_remote_stonith_query:   Query result 1 of 3 from overcloud-controller-1 for overcloud-controller-2/reboot (1 devices) 9641de72-e221-4a6e-9c19-2ca3903953da
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:     info: process_remote_stonith_query:   Query result 2 of 3 from overcloud-controller-2 for overcloud-controller-2/reboot (0 devices) 9641de72-e221-4a6e-9c19-2ca3903953da
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:     info: process_remote_stonith_query:   Query result 3 of 3 from overcloud-controller-0 for overcloud-controller-2/reboot (1 devices) 9641de72-e221-4a6e-9c19-2ca3903953da
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:     info: call_remote_stonith:    Total remote op timeout set to 120 for fencing of node overcloud-controller-2 for stonith_admin.26387.9641de72
Mar 27 07:51:15 [3407] overcloud-controller-1.localdomain stonith-ng:     info: call_remote_stonith:    Requesting that overcloud-controller-0 perform op reboot overcloud-controller-2 for stonith_admin.26387 (144s, 0s)
Mar 27 07:51:16 [3407] overcloud-controller-1.localdomain stonith-ng:     info: call_remote_stonith:    Requesting that overcloud-controller-1 perform op reboot overcloud-controller-2 for stonith_admin.26387 (144s, 0s)
Mar 27 07:51:16 [3407] overcloud-controller-1.localdomain stonith-ng:   notice: can_fence_host_with_device:     ipmilan-overcloud-controller-2 can fence (reboot) overcloud-controller-2: static-list
Mar 27 07:51:16 [3407] overcloud-controller-1.localdomain stonith-ng:   notice: can_fence_host_with_device:     ipmilan-overcloud-controller-0 can not fence (reboot) overcloud-controller-2: static-list
Mar 27 07:51:16 [3407] overcloud-controller-1.localdomain stonith-ng:     info: stonith_fence_get_devices_cb:   Found 1 matching devices for 'overcloud-controller-2'
Mar 27 07:51:16 [3407] overcloud-controller-1.localdomain stonith-ng:     info: internal_stonith_action_execute:        Attempt 2 to execute fence_ipmilan (reboot). remaining timeout is 120
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:     info: update_remaining_timeout:       Attempted to execute agent fence_ipmilan (reboot) the maximum number of times (2) allowed
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:    error: log_operation:  Operation 'reboot' [26417] (call 2 from stonith_admin.26387) for host 'overcloud-controller-2' with device 'ipmilan-overcloud-controller-2' returned: -201 (Generic Pacemaker error)
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:  warning: log_operation:  ipmilan-overcloud-controller-2:26417 [ Failed: Unable to obtain correct plug status or plug is not available ]
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:  warning: log_operation:  ipmilan-overcloud-controller-2:26417 [  ]
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:  warning: log_operation:  ipmilan-overcloud-controller-2:26417 [  ]
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:   notice: stonith_choose_peer:    Couldn't find anyone to fence (reboot) overcloud-controller-2 with any device
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:     info: call_remote_stonith:    None of the 3 peers are capable of fencing (reboot) overcloud-controller-2 for stonith_admin.26387 (1)
Mar 27 07:51:17 [3407] overcloud-controller-1.localdomain stonith-ng:    error: remote_op_done: Operation reboot of overcloud-controller-2 by <no-one> for stonith_admin.26387: No route to host
Mar 27 07:51:17 [3411] overcloud-controller-1.localdomain       crmd:   notice: tengine_stonith_notify: Peer overcloud-controller-2 was not terminated (reboot) by <anyone> for overcloud-controller-1: No route to host (ref=9641de72-e221-4a6e-9c19-2ca3903953da) by client stonith_admin.26387

The same operation succeeded in an environment deployed with IPv4:
[root@overcloud-controller-0 ~]# pcs stonith fence overcloud-controller-2
Node: overcloud-controller-2 fenced


How reproducible:
3/3 (deployments with IPv6)

Steps to Reproduce:
1. Deploy ospd-8 using the 2016-03-22.1 puddle with IPv6 infrastructure.
2. Configure fencing.
3. Run "pcs stonith fence <controller-x>".

Actual results:
[root@overcloud-controller-1 ~]# pcs stonith fence overcloud-controller-2
Error: unable to fence 'overcloud-controller-2'
Command failed: No route to host

Expected results:
[root@overcloud-controller-0 ~]# pcs stonith fence overcloud-controller-2
Node: overcloud-controller-2 fenced


Additional info:

Comment 3 Emilien Macchi 2016-03-31 16:02:09 UTC
To me, this sounds like a networking issue in your infrastructure. Can you confirm?

Comment 4 Dan Sneddon 2016-04-01 17:57:22 UTC
I'm reasonably certain that this is not a problem with OSP. The IP address 10.35.160.378 is not valid (each IP octet can only go up to 255, so 378 is automatically invalid).
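
For illustration, the octet rule above can be checked mechanically before an address is used in a fencing device configuration. The following is a minimal sketch using Python's standard ipaddress module; the helper name is_valid_ip and the two documentation-range sample addresses are assumptions made for the example and do not come from this deployment.

```python
import ipaddress


def is_valid_ip(address: str) -> bool:
    """Return True if `address` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(address)
    except ValueError:
        # Raised for malformed input, including octets greater than 255.
        return False
    return True


# The address quoted in this comment: the last octet (378) exceeds 255.
print(is_valid_ip("10.35.160.378"))

# Hypothetical documentation-range addresses, used only for comparison.
print(is_valid_ip("192.0.2.10"))
print(is_valid_ip("2001:db8::1"))
```

A check of this kind only confirms that the address is well formed; it does not confirm that the IPMI interface is reachable at that address.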

Comment 5 Asaf Hirshberg 2016-06-22 03:23:09 UTC
Emilien Macchi, apparently it was a configuration problem; see comment 4.