Bug 2008298
| Summary: | Invalid values of some hacluster metrics on s390x | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Jan Kurik <jkurik> | ||||
| Component: | pcp | Assignee: | Nathan Scott <nathans> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Jan Kurik <jkurik> | ||||
| Severity: | unspecified | Docs Contact: | Apurva Bhide <abhide> | ||||
| Priority: | unspecified | ||||||
| Version: | 8.6 | CC: | agerstmayr, jkurik, nathans, pevans | ||||
| Target Milestone: | rc | Keywords: | Bugfix, Triaged | ||||
| Target Release: | 8.6 | Flags: | pm-rhel: mirror+ | ||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | pcp-5.3.4-1.el8 | Doc Type: | No Doc Update | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2022-05-10 13:31:13 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | |||||||
Description
Jan Kurik
2021-09-27 20:07:26 UTC
Paul, any ideas as to what might be causing these issues on big endian platforms? Thanks!
I was looking into the source code of the PMDA and found some mismatches in the type definitions.
The file "pmda.c" contains the following code:
<snip pmda.c>
{ .m_desc = {
PMDA_PMID(CLUSTER_DRBD_PEER_DEVICE, DRBD_PEER_DEVICE_CONNECTIONS_RECEIVED),
PM_TYPE_U64, DRBD_PEER_DEVICE_INDOM, PM_SEM_INSTANT,
PMDA_PMUNITS(0,0,1,PM_SPACE_KBYTE,0,PM_COUNT_ONE) } },
{ .m_desc = {
PMDA_PMID(CLUSTER_DRBD_PEER_DEVICE, DRBD_PEER_DEVICE_CONNECTIONS_SENT),
PM_TYPE_U64, DRBD_PEER_DEVICE_INDOM, PM_SEM_INSTANT,
PMDA_PMUNITS(0,0,1,PM_SPACE_KBYTE,0,PM_COUNT_ONE) } },
{ .m_desc = {
PMDA_PMID(CLUSTER_DRBD_PEER_DEVICE, DRBD_PEER_DEVICE_CONNECTIONS_PENDING),
PM_TYPE_U32, DRBD_PEER_DEVICE_INDOM, PM_SEM_INSTANT,
PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) } },
{ .m_desc = {
PMDA_PMID(CLUSTER_DRBD_PEER_DEVICE, DRBD_PEER_DEVICE_CONNECTIONS_UNACKED),
PM_TYPE_U32, DRBD_PEER_DEVICE_INDOM, PM_SEM_INSTANT,
PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) } },
</snip>
However, in "drbd.h" the data types are defined differently:
<snip drbd.h>
uint32_t connections_received;
uint32_t connections_sent;
uint64_t connections_pending;
uint64_t connections_unacked;
</snip>
Similarly, for the "ha_cluster.corosync.member_votes.node_id" metric, the data types differ between the "pmda.c" and "corosync.h" files:
<snip pmda.c>
{ .m_desc = {
PMDA_PMID(CLUSTER_COROSYNC_NODE, COROSYNC_MEMBER_VOTES_NODE_ID),
PM_TYPE_U32, COROSYNC_NODE_INDOM, PM_SEM_INSTANT,
PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) } },
</snip>
<snip corosync.h>
struct member_votes {
uint32_t votes;
uint8_t local;
uint64_t node_id;
};
</snip>
Due to time constraints I have not yet tried modifying the code and testing it with aligned data types; however, IMO this is the core of the issue.
If I am mistaken, then feel free to correct me :-)
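One way a width mismatch like the ones quoted above can yield different values per architecture is sketched below. It models a 64-bit field whose first four bytes are read by a consumer that believes the field is 32 bits wide. The helper names (store_be64, store_le64, load_be32, load_le32) are hypothetical and are not part of the PMDA; whether this is the exact mechanism behind the values seen on s390x is an assumption, not something confirmed in this report.

```c
#include <stdint.h>

/* Hypothetical helper: store v in big-endian byte order (as on s390x). */
static void store_be64(uint64_t v, unsigned char buf[8])
{
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Hypothetical helper: store v in little-endian byte order (as on x86_64). */
static void store_le64(uint64_t v, unsigned char buf[8])
{
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (8 * i));
}

/* Read only the first four bytes as a big-endian 32-bit value. */
static uint32_t load_be32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Read only the first four bytes as a little-endian 32-bit value. */
static uint32_t load_le32(const unsigned char *buf)
{
    return  (uint32_t)buf[0]        | ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}
```

With a small value such as 3, the little-endian pair returns 3 because the low-order bytes come first, while the big-endian pair returns 0 because the first four bytes are the high-order half. This would explain why such a mismatch can go unnoticed on little-endian platforms.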
> If I am mistaken, then feel free to correct me :-)
Those are exactly the sorts of places to look at, Jan. The other place where things can go wrong is in the fetchCallback routine, where we copy into the pmAtomValue union field of each type. If the wrong field (ll, ull, l, ul) is used, truncation or sign extension can result.
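The point about choosing the right union field can be illustrated with a minimal sketch. The atom_value union below is a local stand-in that mirrors only the integer members (l, ul, ll, ull) of PCP's pmAtomValue, so the example compiles without the PCP headers. The metric_type enum and fill_atom function are hypothetical names used for illustration and are not taken from the hacluster PMDA.

```c
#include <stdint.h>

/* Local stand-in for the integer members of PCP's pmAtomValue union. */
typedef union {
    int32_t  l;
    uint32_t ul;
    int64_t  ll;
    uint64_t ull;
} atom_value;

/* Hypothetical tags standing in for PM_TYPE_U32 and PM_TYPE_U64. */
enum metric_type { TYPE_U32, TYPE_U64 };

/*
 * Copy value into the union member that matches the declared metric type.
 * Returns 0 on success, -1 for an unknown type.
 */
static int fill_atom(enum metric_type type, uint64_t value, atom_value *atom)
{
    switch (type) {
    case TYPE_U32:
        /* Truncates silently if value does not fit in 32 bits. */
        atom->ul = (uint32_t)value;
        return 0;
    case TYPE_U64:
        atom->ull = value;
        return 0;
    }
    return -1;
}
```

Keeping the declared type in the metric table, the struct field width, and the union member in agreement means a 64-bit value is never routed through a 32-bit member.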
Merged upstream (Paul, can you also review? LGTM).
commit cf5aefe663ba48ef0848290a1d5b51850c336702
Author: Jan Kurik <jkurik>
Date: Wed Sep 29 18:26:42 2021 +0200
Fix of bz2008298
Fix of datatypes for ha_cluster.corosync.member_votes.node_id and
ha_cluster.drbd.connections_* metrics.
Hi,
I can confirm the changes look good to me also (ACK); not too sure how the mismatches happened there. I have double-checked all the other type definitions and they look correct.
Thanks Jan for the fix!
Cheers,
Paul
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (pcp bug fix and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:1765