
Bug 1401704

Summary: Python exception error when configuring fence_rhevm
Product: Red Hat Enterprise Linux 6
Reporter: Sam Yangsao <syangsao>
Component: fence-agents
Assignee: Marek Grac <mgrac>
Status: CLOSED WONTFIX
QA Contact: cluster-qe <cluster-qe>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.8
CC: abeekhof, cluster-maint, rbalakri
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-06 10:40:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Sam Yangsao 2016-12-05 21:56:58 UTC
Description of problem:

An 'unknown error' is reported after configuring fence_rhevm with a valid option

Version-Release number of selected component (if applicable):

# uname -a
Linux sap1 2.6.32-642.4.2.el6.x86_64 #1 SMP Mon Aug 15 02:06:41 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qa |egrep 'corosync|pacemaker|pcs|fence'
fence-virt-0.2.3-19.el6.x86_64
pacemaker-1.1.14-8.el6_8.1.x86_64
corosync-1.4.7-5.el6.x86_64
pcs-0.9.148-7.el6_8.1.x86_64
pacemaker-cluster-libs-1.1.14-8.el6_8.1.x86_64
pacemaker-cli-1.1.14-8.el6_8.1.x86_64
fence-agents-4.0.15-12.el6.x86_64
pacemaker-libs-1.1.14-8.el6_8.1.x86_64
libxshmfence-1.2-1.el6.x86_64
corosynclib-1.4.7-5.el6.x86_64

How reproducible:

Always

Steps to Reproduce:

1.  Install RHEL 6 with the latest pacemaker bits as of 12/05/2016
2.  Configure stonith with the following option:

# pcs stonith create fence_sap1 fence_rhevm port="sap1" ipaddr="10.15.108.21" action="reboot" login="admin@internal" passwd="redhat" pcmk_host_list="sap1" ssl=1

3.  The fence_rhevm agent crashes on start with the results below

Actual results:

# pcs status
Cluster name: sap_pacemaker
Last updated: Mon Dec  5 15:54:33 2016		Last change: Mon Dec  5 15:03:36 2016 by root via cibadmin on sap1
Stack: cman
Current DC: sap1 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
2 nodes and 2 resources configured

Online: [ sap1 sap2 ]

Full list of resources:

 fence_sap2	(stonith:fence_rhevm):	Started sap1
 fence_sap1	(stonith:fence_rhevm):	Stopped

Failed Actions:
* fence_sap1_start_0 on sap2 'unknown error' (1): call=82, status=Error, exitreason='none',
    last-rc-change='Mon Dec  5 15:03:36 2016', queued=0ms, exec=2182ms
* fence_sap1_start_0 on sap1 'unknown error' (1): call=80, status=Error, exitreason='none',
    last-rc-change='Mon Dec  5 15:03:40 2016', queued=0ms, exec=2150ms


PCSD Status:
  sap1: Online
  sap2: Online

# /var/log/messages file

Dec  5 15:00:23 sap1 crmd[20483]:   notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Dec  5 15:00:23 sap1 stonith-ng[20479]:   notice: Added 'fence_sap1' to the device list (2 active devices)
Dec  5 15:00:23 sap1 pengine[20482]:   notice: Start   fence_sap1#011(sap2)
Dec  5 15:00:23 sap1 pengine[20482]:   notice: Calculated Transition 77: /var/lib/pacemaker/pengine/pe-input-77.bz2
Dec  5 15:00:23 sap1 crmd[20483]:   notice: Initiating action 4: monitor fence_sap1_monitor_0 on sap2
Dec  5 15:00:23 sap1 crmd[20483]:   notice: Initiating action 3: monitor fence_sap1_monitor_0 on sap1 (local)
Dec  5 15:00:23 sap1 crmd[20483]:   notice: Operation fence_sap1_monitor_0: not running (node=sap1, call=72, rc=7, cib-update=193, confirmed=true)
Dec  5 15:00:23 sap1 crmd[20483]:   notice: Initiating action 7: start fence_sap1_start_0 on sap2
Dec  5 15:00:26 sap1 crmd[20483]:  warning: Action 7 (fence_sap1_start_0) on sap2 failed (target: 0 vs. rc: 1): Error
Dec  5 15:00:26 sap1 crmd[20483]:   notice: Transition aborted by fence_sap1_start_0 'modify' on sap2: Event failed (magic=4:1;7:77:0:78492043-f970-40c7-a553-cc6a95a6f17e, cib=0.25.3, source=match_graph_event:381, 0)
Dec  5 15:00:26 sap1 crmd[20483]:  warning: Action 7 (fence_sap1_start_0) on sap2 failed (target: 0 vs. rc: 1): Error
Dec  5 15:00:26 sap1 crmd[20483]:   notice: Transition 77 (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-77.bz2): Complete
Dec  5 15:00:26 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap2: unknown error (1)
Dec  5 15:00:26 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap2: unknown error (1)
Dec  5 15:00:26 sap1 pengine[20482]:   notice: Recover fence_sap1#011(Started sap2)
Dec  5 15:00:26 sap1 pengine[20482]:   notice: Calculated Transition 78: /var/lib/pacemaker/pengine/pe-input-78.bz2
Dec  5 15:00:26 sap1 crmd[20483]:   notice: Initiating action 1: stop fence_sap1_stop_0 on sap2
Dec  5 15:00:26 sap1 crmd[20483]:   notice: Transition aborted by status-sap2-fail-count-fence_sap1, fail-count-fence_sap1=INFINITY: Transient attribute change (create cib=0.25.4, source=abort_unless_down:329, path=/cib/status/node_state[@id='sap2']/transient_attributes[@id='sap2']/instance_attributes[@id='status-sap2'], 0)
Dec  5 15:00:26 sap1 crmd[20483]:   notice: Transition 78 (Complete=2, Pending=0, Fired=0, Skipped=1, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-78.bz2): Stopped
Dec  5 15:00:26 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap2: unknown error (1)
Dec  5 15:00:26 sap1 pengine[20482]:  warning: Forcing fence_sap1 away from sap2 after 1000000 failures (max=1000000)
Dec  5 15:00:26 sap1 pengine[20482]:   notice: Start   fence_sap1#011(sap1)
Dec  5 15:00:26 sap1 pengine[20482]:   notice: Calculated Transition 79: /var/lib/pacemaker/pengine/pe-input-79.bz2
Dec  5 15:00:26 sap1 crmd[20483]:   notice: Initiating action 5: start fence_sap1_start_0 on sap1 (local)
Dec  5 15:00:27 sap1 abrt: detected unhandled Python exception in '/usr/sbin/fence_rhevm'
Dec  5 15:00:27 sap1 abrt-server[23380]: Saved Python crash dump of pid 23375 to /var/spool/abrt/pyhook-2016-12-05-15:00:27-23375
Dec  5 15:00:27 sap1 abrtd: Directory 'pyhook-2016-12-05-15:00:27-23375' creation detected
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [ Traceback (most recent call last): ]
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [   File "/usr/sbin/fence_rhevm", line 165, in <module> ]
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [     main() ]
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [   File "/usr/sbin/fence_rhevm", line 160, in main ]
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [     result = fence_action(None, options, set_power_status, get_power_status, get_list) ]
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [   File "/usr/share/fence/fencing.py", line 821, in fence_action ]
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [     status = status.upper() ]
Dec  5 15:00:27 sap1 stonith-ng[20479]:  warning: fence_rhevm[23375] stderr: [ AttributeError: 'NoneType' object has no attribute 'upper' ]
Dec  5 15:00:27 sap1 abrtd: Duplicate: core backtrace
Dec  5 15:00:27 sap1 abrtd: DUP_OF_DIR: /var/spool/abrt/pyhook-2016-12-05-14:09:04-19068
Dec  5 15:00:27 sap1 abrtd: Deleting problem directory pyhook-2016-12-05-15:00:27-23375 (dup of pyhook-2016-12-05-14:09:04-19068)
Dec  5 15:00:27 sap1 abrtd: Sending an email...
Dec  5 15:00:27 sap1 abrtd: Email was sent to: root@localhost
Dec  5 15:00:28 sap1 abrt: detected unhandled Python exception in '/usr/sbin/fence_rhevm'
Dec  5 15:00:28 sap1 abrt-server[23397]: Not saving repeating crash in '/usr/sbin/fence_rhevm'
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [ Traceback (most recent call last): ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [   File "/usr/sbin/fence_rhevm", line 165, in <module> ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [     main() ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [   File "/usr/sbin/fence_rhevm", line 160, in main ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [     result = fence_action(None, options, set_power_status, get_power_status, get_list) ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [   File "/usr/share/fence/fencing.py", line 821, in fence_action ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [     status = status.upper() ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:  warning: fence_rhevm[23383] stderr: [ AttributeError: 'NoneType' object has no attribute 'upper' ]
Dec  5 15:00:28 sap1 stonith-ng[20479]:   notice: Operation 'monitor' [23383] for device 'fence_sap1' returned: -201 (Generic Pacemaker error)
Dec  5 15:00:29 sap1 crmd[20483]:    error: Operation fence_sap1_start_0 (node=sap1, call=73, status=4, cib-update=196, confirmed=true) Error
Dec  5 15:00:29 sap1 crmd[20483]:  warning: Action 5 (fence_sap1_start_0) on sap1 failed (target: 0 vs. rc: 1): Error
Dec  5 15:00:29 sap1 crmd[20483]:   notice: Transition aborted by fence_sap1_start_0 'modify' on sap1: Event failed (magic=4:1;5:79:0:78492043-f970-40c7-a553-cc6a95a6f17e, cib=0.25.7, source=match_graph_event:381, 0)
Dec  5 15:00:29 sap1 crmd[20483]:  warning: Action 5 (fence_sap1_start_0) on sap1 failed (target: 0 vs. rc: 1): Error
Dec  5 15:00:29 sap1 crmd[20483]:   notice: Transition 79 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-79.bz2): Complete
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sending flush op to all hosts for: fail-count-fence_sap1 (INFINITY)
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sent update 187: fail-count-fence_sap1=INFINITY
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sending flush op to all hosts for: last-failure-fence_sap1 (1480971629)
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sent update 189: last-failure-fence_sap1=1480971629
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sending flush op to all hosts for: fail-count-fence_sap1 (INFINITY)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap2: unknown error (1)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap1: unknown error (1)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap1: unknown error (1)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Forcing fence_sap1 away from sap1 after 1000000 failures (max=1000000)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Forcing fence_sap1 away from sap2 after 1000000 failures (max=1000000)
Dec  5 15:00:29 sap1 pengine[20482]:   notice: Stop    fence_sap1#011(sap1)
Dec  5 15:00:29 sap1 pengine[20482]:   notice: Calculated Transition 80: /var/lib/pacemaker/pengine/pe-input-80.bz2
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sent update 191: fail-count-fence_sap1=INFINITY
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sending flush op to all hosts for: last-failure-fence_sap1 (1480971629)
Dec  5 15:00:29 sap1 attrd[20481]:   notice: Sent update 193: last-failure-fence_sap1=1480971629
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap2: unknown error (1)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap1: unknown error (1)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Processing failed op start for fence_sap1 on sap1: unknown error (1)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Forcing fence_sap1 away from sap1 after 1000000 failures (max=1000000)
Dec  5 15:00:29 sap1 pengine[20482]:  warning: Forcing fence_sap1 away from sap2 after 1000000 failures (max=1000000)
Dec  5 15:00:29 sap1 pengine[20482]:   notice: Stop    fence_sap1#011(sap1)
Dec  5 15:00:29 sap1 pengine[20482]:   notice: Calculated Transition 81: /var/lib/pacemaker/pengine/pe-input-81.bz2
Dec  5 15:00:29 sap1 crmd[20483]:   notice: Initiating action 2: stop fence_sap1_stop_0 on sap1 (local)
Dec  5 15:00:29 sap1 crmd[20483]:   notice: Operation fence_sap1_stop_0: ok (node=sap1, call=74, rc=0, cib-update=200, confirmed=true)
Dec  5 15:00:29 sap1 crmd[20483]:   notice: Transition 81 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-81.bz2): Complete
Dec  5 15:00:29 sap1 crmd[20483]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]

Expected results:

Pacemaker should give more information about the error and indicate whether or not the 'ssl=1' option is valid

Additional info:

Works fine _without_ the 'ssl=1' option

Comment 3 Andrew Beekhof 2016-12-05 22:17:27 UTC
At best this might be something for pcs, but it sounds more like a bug in the fence_rhevm agent.

Re-assigning.

Comment 5 Marek Grac 2016-12-07 12:34:28 UTC
I agree that this is a problem with the fence agents. The issue mentioned in the traceback (fencing.py line 821) should already be fixed in 6.9 (rhbz#1361623), so I believe this is a duplicate. Can you retest it with the latest build, please?

However, I'm quite surprised that it happens only with SSL. Can you please re-run it with the verbose flag, so I can see the complete communication between the fence agent and the device?
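
For reference, the failing code path in /usr/share/fence/fencing.py looks roughly like the sketch below. This is only an illustration of the guard that is missing in 4.0.15, not the actual rhbz#1361623 patch, and the fail()/EC_STATUS error helpers are the fencing-library calls as I remember them:

    # Sketch of fence_action() around line 821 (illustrative, not the real patch).
    # get_power_status() can return None when the agent cannot parse a power
    # state out of the RHEV-M response (which appears to be what happens here
    # when SSL is enabled), and the unguarded .upper() then raises
    # "AttributeError: 'NoneType' object has no attribute 'upper'".
    status = get_power_status(connection, options)
    if status is None:
        fail(EC_STATUS)              # report a status error instead of crashing
    status = status.upper()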

Comment 6 Sam Yangsao 2016-12-07 14:44:33 UTC
(In reply to Marek Grac from comment #5)
> I agree that this is a problem with the fence agents. The issue mentioned in
> the traceback (fencing.py line 821) should already be fixed in 6.9
> (rhbz#1361623), so I believe this is a duplicate. Can you retest it with the
> latest build, please?

Will test with this build.

> 
> However, I'm quite surprised that it happens only with SSL. Can you please
> re-run it with the verbose flag, so I can see the complete communication
> between the fence agent and the device?

It also occurs with 'ssl_insecure=1'; I have not tried many other options.

Which verbose flag are you referring to?  Just -vvv?

Thanks!

Comment 7 Marek Grac 2016-12-07 15:32:46 UTC
One -v is enough for us, or verbose=1 if you are using the fence agent via pcs.

ssl_insecure is the same as ssl in RHEL 6.
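
For example (illustrative only; the values below are taken from this bug, and the exact short options can be checked with 'fence_rhevm -h'):

    # one-off test of the agent from the command line, with verbose output
    fence_rhevm -a 10.15.108.21 -l admin@internal -p redhat -n sap1 -z -v -o status

    # or keep verbose logging enabled on the existing stonith resource
    pcs stonith update fence_sap1 verbose=1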

Comment 9 Jan Kurik 2017-12-06 10:40:52 UTC
Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/