1471506 – ocf:pacemaker:ClusterMon do not use "update" attribute

Summary: ocf:pacemaker:ClusterMon do not use "update" attribute

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	pacemaker
Sub Component:
Version:	7.3
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	low
Target Milestone:	rc
Target Release:	7.5
Assignee:	Ken Gaillot
QA Contact:	cluster-qe@redhat.com
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1565619

Reported:	2017-07-16 13:44 UTC by Strahil Nikolov
Modified:	2018-04-10 15:32 UTC
CC List:	4 users
Fixed In Version:	pacemaker-1.1.18-1.el7
Doc Type:	Bug Fix
Doc Text:	Cause: The meta-data for the ocf:pacemaker:ClusterMon resource agent incorrectly implied the "update" option used seconds (instead of milliseconds) as its units. Consequence: Users might configure the "update" option with a value less than 1000, which would be ignored. Fix: The meta-data now correctly specifies that the "update" units are in milliseconds. Additionally, the resource agent will interpret any values less than 1000 as seconds. Result: Users are clearly directed to use milliseconds as "update" units, and resources previously configured with the expectation of seconds as units now work properly.
Clone Of:
Clones:	1565619 (view as bug list)
Environment:
Last Closed:	2018-04-10 15:30:29 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2018:0860	0	None	None	None	2018-04-10 15:32:11 UTC

Description Strahil Nikolov 2017-07-16 13:44:12 UTC

Description of problem:
The "ocf:pacemaker:ClusterMon" resource accepts the "update" attribute but never uses it. It even ignores the default and starts crm_mon with "-i 0" instead.

Version-Release number of selected component (if applicable):
pacemaker-cli-1.1.15-11.el7_3.5.x86_64

How reproducible:
Always

Steps to Reproduce:

1. Create a cloned monitor resource:
# pcs resource create MONITOR ClusterMon user="root" update="30" \
extra_options="-r" htmlfile="/var/log/clustermon" --clone

2. Check the crm_mon daemon's parameters:
# ps -aux | grep [c]rm_mon
# grep [c]ontent /var/log/clustermon


3. The daemon is started as follows:
"/usr/sbin/crm_mon -p /tmp/ClusterMon_MONITOR.pid -d -i 0 -r -h /var/log/clustermon"
The HTML file has a refresh rate of "0".

Actual results:
crm_mon is started with "-i 0".
The HTML file has the same refresh rate of 0:
<meta http-equiv="refresh" content="0">

Expected results:
crm_mon should use the value of the resource's "update" attribute, in my case "-i 30".
The refresh rate in the HTML file should be the same value:
<meta http-equiv="refresh" content="30">

Additional info:
# pcs resource show MONITOR
 Resource: MONITOR (class=ocf provider=pacemaker type=ClusterMon)
  Attributes: user=root update=30 extra_options=-r htmlfile=/var/log/clustermon
  Operations: start interval=0s timeout=20 (MONITOR-start-interval-0s)
              stop interval=0s timeout=20 (MONITOR-stop-interval-0s)
              monitor interval=10 timeout=20 (MONITOR-monitor-interval-10)

Comment 2 Strahil Nikolov 2017-07-16 13:48:11 UTC

Update:
Recreating the resource without the "update" attribute picks up the default: crm_mon runs with "-i 15" and the refresh rate is defined as:
<meta http-equiv="refresh" content="15">

Comment 3 Andrew Beekhof 2017-07-17 00:35:50 UTC

Looks like the agent's metadata is wrong. The value should be in milliseconds because it does:

: ${OCF_RESKEY_update:="15000"}

OCF_RESKEY_update=`expr $OCF_RESKEY_update / 1000`

CMON_CMD="${HA_SBIN_DIR}/crm_mon -p $OCF_RESKEY_pidfile -d -i $OCF_RESKEY_update $OCF_RESKEY_extra_options -h $OCF_RESKEY_htmlfile"

which explains why '30' is functioning as '0'
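
The truncation can be reproduced outside the cluster. The following is a minimal sketch that repeats only the `expr` division quoted above; the function name `to_crm_mon_interval` is a hypothetical helper for illustration and does not exist in the agent.

```shell
#!/bin/sh
# Hypothetical helper (not part of the ClusterMon agent): repeats the
# agent's millisecond-to-second conversion in isolation.
to_crm_mon_interval() {
    # expr does integer division, so any value below 1000 truncates to 0.
    # expr also exits with status 1 when the result is 0; "|| true" keeps
    # that from aborting a script that runs under "set -e".
    expr "$1" / 1000 || true
}

to_crm_mon_interval 15000   # the agent default: prints 15
to_crm_mon_interval 30      # the value from this report: prints 0
```

With update=30 the division yields 0, which matches the "-i 0" seen in the process list.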

Comment 4 Ken Gaillot 2017-07-17 13:43:19 UTC

I'll update the metadata for 7.5. Values less than 1000 will currently be ignored anyway, so I'll also change it to treat the value as seconds if it's below 1000 and milliseconds otherwise.
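
The rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the upstream change: the function name `normalize_update` is hypothetical, and only the 1000 threshold and the 15000 default come from this report.

```shell
#!/bin/sh
# Hypothetical helper, not the upstream code: turn an "update" value into
# the number of seconds to pass to crm_mon -i.
normalize_update() {
    value="${1:-15000}"            # 15000 ms is the default quoted in comment 3
    if [ "$value" -lt 1000 ]; then
        echo "$value"              # below 1000: treat the value as seconds
    else
        echo $(( value / 1000 ))   # otherwise: milliseconds to whole seconds
    fi
}

normalize_update 30       # prints 30
normalize_update 30000    # prints 30
normalize_update          # prints 15
```

Under this rule a resource configured with update=30 and one configured with update=30000 both end up running crm_mon with "-i 30".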

Comment 5 Ken Gaillot 2017-07-18 23:40:52 UTC

Fixed upstream as of commit 02c294df

Comment 8 michal novacek 2017-12-06 09:52:25 UTC

I have verified that the update parameter of the clustermon resource is taken into
account with pacemaker-1.1.18-6.el7.x86_64.

---

[root@virt-265 ~]# pcs resource show clustermon
 Resource: clustermon (class=ocf provider=pacemaker type=ClusterMon)
  Attributes: extra_options=-r htmlfile=/var/log/clustermon pidfile=/tmp/ClusterMon_MONITOR.pid update=30 user=root
  Operations: monitor interval=10 timeout=20 (clustermon-monitor-interval-10)
              start interval=0s timeout=20 (clustermon-start-interval-0s)
              stop interval=0s timeout=20 (clustermon-stop-interval-0s)

[root@virt-265 ~]# ps axfu | grep crm_mon
root     10610  0.0  0.0 112704   928 pts/0    S+   10:49   0:00          \_ grep --color=auto crm_mon
root     10432  0.0  0.3  93480  3684 ?        S    10:48   0:00 /usr/sbin/crm_mon -p /tmp/ClusterMon_MONITOR.pid -d -i 30 -r -h /var/log/clustermon

[root@virt-265 ~]# pcs resource update clustermon update=10

[root@virt-265 ~]# ps axfu | grep crm_mon
root     10704  0.0  0.0 112704   928 pts/0    S+   10:49   0:00          \_ grep --color=auto crm_mon
root     10669  0.0  0.3  93484  3680 ?        S    10:49   0:00 /usr/sbin/crm_mon -p /tmp/ClusterMon_MONITOR.pid -d -i 10 -r -h /var/log/clustermon



> (1) pcs status
[root@virt-266 ~]# pcs status
Cluster name: STSRHTS9315
Stack: corosync
Current DC: virt-265 (version 1.1.18-6.el7-2b07d5c5a9) - partition with quorum
Last updated: Wed Dec  6 10:45:49 2017
Last change: Wed Dec  6 10:27:05 2017 by root via cibadmin on virt-265

2 nodes configured
7 resources configured (1 DISABLED)

Online: [ virt-265 virt-266 ]

Full list of resources:

 fence-virt-265 (stonith:fence_xvm):    Started virt-265
 fence-virt-266 (stonith:fence_xvm):    Started virt-266
 Clone Set: dlm-clone [dlm]
     Started: [ virt-265 virt-266 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ virt-265 virt-266 ]
 clustermon     (ocf::pacemaker:ClusterMon):    Stopped (disabled)

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

> (2) pcs config
[root@virt-266 ~]# pcs config
Cluster Name: STSRHTS9315
Corosync Nodes:
 virt-265 virt-266
Pacemaker Nodes:
 virt-265 virt-266

Resources:
 Clone: dlm-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
               start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s timeout=100 (dlm-stop-interval-0s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Attributes: with_cmirrord=1
   Operations: monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
               start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)
 Resource: clustermon (class=ocf provider=pacemaker type=ClusterMon)
  Attributes: extra_options=-r htmlfile=/var/log/clustermon pidfile=/tmp/ClusterMon_MONITOR.pid update=30 user=root
  Meta Attrs: target-role=Stopped 
  Operations: monitor interval=10 timeout=20 (clustermon-monitor-interval-10)
              start interval=0s timeout=20 (clustermon-start-interval-0s)
              stop interval=0s timeout=20 (clustermon-stop-interval-0s)

Stonith Devices:
 Resource: fence-virt-265 (class=stonith type=fence_xvm)
  Attributes: delay=5 pcmk_host_check=static-list pcmk_host_list=virt-265 pcmk_host_map=virt-265:virt-265.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-265-monitor-interval-60s)
 Resource: fence-virt-266 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=virt-266 pcmk_host_map=virt-266:virt-266.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-266-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start dlm-clone then start clvmd-clone (kind:Mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY)
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: STSRHTS9315
 dc-version: 1.1.18-6.el7-2b07d5c5a9
 have-watchdog: false
 no-quorum-policy: freeze

Quorum:
  Options:

Comment 11 errata-xmlrpc 2018-04-10 15:30:29 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860
