Bug 865588

Summary: cluster-cim: missing symbols/provider library not compiled properly
Product: Red Hat Enterprise Linux 6 Reporter: Jan Pokorný [poki] <jpokorny>
Component: clustermon Assignee: Jan Pokorný [poki] <jpokorny>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: high Docs Contact:
Priority: high    
Version: 6.3 CC: cluster-maint, fdinitto, mnovacek, rsteiger
Target Milestone: rc Keywords: EasyFix
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: clustermon-0.16.2-19.el6 Doc Type: Bug Fix
Doc Text:
Cause: The dynamic library representing the CIM provider of cluster status was not built with all its required dependencies, so some symbols could not be resolved at run time, making it impossible to fulfill cluster status queries.
Consequence: An attempt to access cluster status via CIM (e.g., via the wbemcli command-line client) failed with an error.
Fix: The necessary dependencies were added to the build process.
Result: No undefined symbols are encountered at run time, and accessing cluster status via CIM no longer leads to an error.
Story Points: ---
Clone Of:
: 882277 Environment:
Last Closed: 2013-02-21 10:56:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 882277, 957651    
Attachments:
Description Flags
reproducer bash script based on comment 6 none

Description Jan Pokorný [poki] 2012-10-11 20:43:30 UTC
(some steps may be accidentally omitted)

# yum install -y cluster-snmp
# (edit /etc/Pegasus/access.conf, remove all "EXCEPT" parts)
# service tog-pegasus start
# cimmof -n root/PG_InterOp RedHat_ClusterProvider.mof
# cimmof -n root/cimv2 RedHat_ClusterSchema.mof
# yum -y install sblim-wbemcli
# # the following uses system-wide credentials (root in my case)
# wbemcli ei https://user:pass@localhost:5989/root/cimv2:RedHat_Cluster


[current BAD SCENARIO + diagnostics]

*
* wbemcli: Cim: (1) CIM_ERR_FAILED: ProviderLoadFailure
  (/usr/lib64/Pegasus/providers/libRedHatClusterProvider.so:
  RedHatClusterProvider):Cannot load library,
  error: /usr/lib64/Pegasus/providers/libRedHatClusterProvider.so:
  undefined symbol: _ZNK17ClusterMonitoring4Node4nameEv
*

# c++filt _ZNK17ClusterMonitoring4Node4nameEv
ClusterMonitoring::Node::name() const

# objdump -T /usr/lib*/Pegasus/providers/libRedHatClusterProvider.so | \
  grep UND | grep lust | tr "\t" " " | tr -s " " | cut -d" " -f5 | c++filt
ClusterMonitoring::Node::name() const
ClusterMonitoring::Cluster::unclusteredNodes()
ClusterMonitoring::Cluster::nodes()
ClusterMonitoring::Cluster::name()
ClusterMonitoring::ClusterMonitor::get_cluster()
ClusterMonitoring::Cluster::clusteredNodes()
ClusterMonitoring::Node::votes() const
ClusterMonitoring::ClusterMonitor::~ClusterMonitor()
ClusterMonitoring::Cluster::failedServices()
ClusterMonitoring::Node::clustered() const
ClusterMonitoring::Service::running() const
ClusterMonitoring::Service::clustername() const
ClusterMonitoring::Node::services()
ClusterMonitoring::Node::online() const
ClusterMonitoring::ClusterMonitor::ClusterMonitor(std::basic_string<char,
                   std::char_traits<char>, std::allocator<char> > const&)
ClusterMonitoring::Cluster::runningServices()
ClusterMonitoring::Service::name() const
ClusterMonitoring::Node::clustername() const
ClusterMonitoring::Cluster::votes()
ClusterMonitoring::Cluster::minQuorum()
ClusterMonitoring::Service::failed() const
ClusterMonitoring::Service::nodename() const
ClusterMonitoring::Cluster::services()
ClusterMonitoring::Cluster::quorate()
ClusterMonitoring::Cluster::stoppedServices()
ClusterMonitoring::Service::autostart() const

Above are the undefined symbols (complete list should be reviewed
thoroughly).
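
The objdump pipeline above can be wrapped into a small helper. This is only
a sketch: the function name undefined_symbols and the default "Cluster"
filter are made up for illustration. It takes the last column of each *UND*
line as the symbol name rather than a fixed fifth field, on the assumption
that a version column may precede the name for versioned symbols.

```shell
# Hypothetical helper (not part of clustermon): reads `objdump -T` output on
# stdin and prints the mangled names of undefined dynamic symbols whose name
# contains the given substring (default: "Cluster").
undefined_symbols() {
    awk -v pat="${1:-Cluster}" '/\*UND\*/ && index($NF, pat) { print $NF }'
}

# Example (needs binutils; path taken from the report above):
#   objdump -T /usr/lib64/Pegasus/providers/libRedHatClusterProvider.so \
#     | undefined_symbols | c++filt
```

Empty output would mean that no symbol matching the filter is left undefined
in the library's dynamic symbol table.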


[GOOD SCENARIO]

The exact output is unknown, but it should definitely be some kind of
structured information.  An objdump command like the one above should list
none of "our" (ClusterMonitoring) symbols as undefined.


Most probably an easy fix (to have the library compiled correctly, not
necessarily to get proper results from the wbemcli command or from other
means of inspecting the CIM model of the cluster).

Comment 2 RHEL Program Management 2012-10-12 18:49:40 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.

Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.

Comment 3 Jan Pokorný [poki] 2012-10-12 18:57:15 UTC
Fix: http://git.fedorahosted.org/cgit/conga.git/commit/?id=2e7e1c3

Comment 6 Jan Pokorný [poki] 2012-10-22 15:08:07 UTC
A more proper reproducer follows.  Either make sure selinux-policy already
contains the fix for [bug 868959], or do not use SELinux for the experiment.

# yum install -y cluster-cim
# (edit /etc/Pegasus/access.conf, remove all "EXCEPT" parts)
# cimconfig -p -s enableHttpConnection=true
# cimconfig -p -s enableHttpsConnection=false
# service tog-pegasus start
# cimmof -n root/PG_InterOp RedHat_ClusterProvider.mof
# cimmof -n root/cimv2 RedHat_ClusterSchema.mof
# yum -y install sblim-wbemcli
# # the following uses system-wide credentials (root in my case)
# wbemcli ei http://root:password@localhost:5988/root/cimv2:RedHat_Cluster

with final output like this (3-node cluster with only a single node running,
hence not quorate; the original is a single long line):

> localhost:5988/root/cimv2:RedHat_Cluster.*KEYBINDING="MISSING*"
> Name="rhel63-64kvm",Votes=1,VotesNeededForQuorum=2,MaxNumberOfNodes=0,
> NodesNumber=3,AvailableNodesNumber=1,UnavailableNodesNumber=2,
> NodesNames="rhel63-64kvm-1","rhel63-64kvm-2","rhel63-64kvm-3",
> AvailableNodesNames="rhel63-64kvm-1",
> UnavailableNodesNames="rhel63-64kvm-2","rhel63-64kvm-3",
> ServicesNumber=1,RunningServicesNumber=0,StoppedServicesNumber=1,
> FailedServicesNumber=0,ServicesNames="hello",RunningServicesNames=,
> StoppedServicesNames="hello",FailedServicesNames=,OperationalStatus=3,
> StatusDescriptions="All services stopped, not quorate",ClusterState=2,
> Types=2,CreationClassName="RedHat_Cluster"
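
For scripted verification, single numeric properties can be pulled out of
such output. The following is a minimal sketch, assuming the
"Name=value,Name=value" layout shown above; the function name
cim_numeric_property is hypothetical and it handles only unquoted integer
values.

```shell
# Hypothetical helper: reads `wbemcli ei` output on stdin and prints the value
# of one numeric property. The name must be preceded by a space or a comma, so
# that e.g. NodesNumber does not match inside AvailableNodesNumber.
cim_numeric_property() {
    sed -n "s/.*[ ,]$1=\([0-9][0-9]*\).*/\1/p"
}

# Example:
#   wbemcli ei http://root:password@localhost:5988/root/cimv2:RedHat_Cluster \
#     | cim_numeric_property VotesNeededForQuorum
```

Quoted string and array properties (such as StatusDescriptions or NodesNames)
can contain commas themselves and would need a proper parser.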

Comment 7 Jan Pokorný [poki] 2012-10-25 15:18:12 UTC
NB: A half-way fix suffered from an unresolved "clock_gettime" symbol, which
    was introduced by the straightforward addition of the missing internal
    objects.  Once "-lrt" was used, no missing symbol was encountered at
    run time.
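
A problem like the one described above can, in principle, be spotted without
starting the CIM server by asking the dynamic linker to process all
relocations with "ldd -r". The following is a sketch only: the function name
unresolved_symbols is hypothetical, and it assumes the glibc message format
"undefined symbol: NAME<TAB>(LIBRARY)".

```shell
# Hypothetical helper: reads `ldd -r` output on stdin and prints the names of
# symbols that the dynamic linker could not resolve.
unresolved_symbols() {
    sed -n 's/^undefined symbol: \([^[:space:]]*\).*/\1/p'
}

# Example (path taken from the report above):
#   ldd -r /usr/lib64/Pegasus/providers/libRedHatClusterProvider.so 2>&1 \
#     | unresolved_symbols | c++filt
```

A provider library may also rely on symbols supplied by the process that
loads it, so a non-empty list is a hint to investigate rather than proof of
a broken build.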

Comment 8 michal novacek 2012-11-19 16:25:08 UTC
Created attachment 647863 [details]
reproducer bash script based on comment 6

Comment 9 michal novacek 2012-11-19 16:44:14 UTC
Created attachment 647880 [details]
reproducer bash script based on comment 6

Comment 10 michal novacek 2012-11-19 17:02:50 UTC
Created attachment 647886 [details]
reproducer bash script based on comment 6

Comment 11 michal novacek 2012-11-20 10:39:15 UTC
I was not able to reproduce the missing symbols reported for the -18 version.

However, version -18 definitely behaves incorrectly where version -19
behaves correctly: it should report the cluster status instead of reporting
an error.

I used the reproducer.sh script attached to this bug to verify the behaviour
on a three-node cluster that has all nodes online and modclusterd running:
$ clustat
Cluster Status for rhel63 @ Tue Nov 20 04:32:43 2012
Member Status: Quorate

 Member Name                                           ID   Status
 ------ ----                                           ---- ------
 rhel63-node01                                             1 Online
 rhel63-node02                                             2 Online
 rhel63-node03                                             3 Online, Local
$

REPORTED FAULTY version cluster-cim-0.16.2-18.el6:
$ ./reproducer.sh
...
+ wbemcli ei http://root:password@localhost:5988/root/cimv2:RedHat_Cluster
*
* wbemcli: Cim: (1) CIM_ERR_FAILED: Lost connection with cimprovagt \
* "RedHatClusterProviderModule".
*

CORRECTED version cluster-cim-0.16.2-19.el6:
$ ./reproducer.sh
...
+ wbemcli ei http://root:password@localhost:5988/root/cimv2:RedHat_Cluster
localhost:5988/root/cimv2:RedHat_Cluster.*KEYBINDING="MISSING*" Name="rhel63"\
,Votes=3,VotesNeededForQuorum=2,MaxNumberOfNodes=0,NodesNumber=3,AvailableNod\
esNumber=3,UnavailableNodesNumber=0,NodesNames="rhel63-node01","rhel63-node02\
","rhel63-node03",AvailableNodesNames="rhel63-node01","rhel63-node02","rhel63\
-node03",UnavailableNodesNames=,ServicesNumber=0,RunningServicesNumber=0,Stop\
pedServicesNumber=0,FailedServicesNumber=0,ServicesNames=,RunningServicesName\
s=,StoppedServicesNames=,FailedServicesNames=,OperationalStatus=2,StatusDescr\
iptions="All services and nodes functional",ClusterState=2,Types=2,CreationCl\
assName="RedHat_Cluster"

Comment 12 Jan Pokorný [poki] 2012-11-20 11:07:17 UTC
> I was not able to reproduce missing symbols as reported in -18 version.

Strange. Regarding /etc/Pegasus/access.conf, I tried leaving it empty
(": > /etc/Pegasus/access.conf"), then restarted the CIM server, and I am
still getting:

# wbemcli ei http://root:password@localhost:5988/root/cimv2:RedHat_Cluster
*
* wbemcli: Cim: (1) CIM_ERR_FAILED: ProviderLoadFailure
* (/usr/lib64/Pegasus/providers/libRedHatClusterProvider.so:
* RedHatClusterProvider):Cannot load library, error:
* /usr/lib64/Pegasus/providers/libRedHatClusterProvider.so: undefined
* symbol: _ZNK17ClusterMonitoring4Node4nameEv
*

Base RHEL 6.3, no Z-stream updates:
# rpm -q cluster-cim tog-pegasus sblim-wbemcli
cluster-cim-0.16.2-18.el6.x86_64
tog-pegasus-2.11.0-3.el6.x86_64
sblim-wbemcli-1.6.1-1.el6.x86_64

Comment 13 michal novacek 2012-11-20 13:35:58 UTC
I retried on a fresh install of RHEL 6.3 and it behaved as expected.
I'm marking the bug verified.

BUGGY version cluster-cim-0.16.2-18.el6:

$ /tmp/reproducer.sh 
+ trap err ERR
+ service modclusterd status
modclusterd (pid  1904) is running...
+ yum -y install sblim-wbemcli cluster-cim
...
Complete!
+ rpm -q cluster-cim
cluster-cim-0.16.2-18.el6.x86_64
+ echo -ne '-: ALL\n-: ALL\n'
+ cimconfig -p -s enableHttpConnection=true
Property 'enableHttpConnection' updated in configuration file.
+ cimconfig -p -s enableHttpsConnection=false
Property 'enableHttpsConnection' updated in configuration file.
+ service tog-pegasus restart
Shutting down CIM server:                                  [  OK  ]
tog-pegasus: Generating cimserver SSL certificates...      [  OK  ]
Starting up CIM server:                                    [  OK  ]
+ cimmof -n root/PG_InterOp /usr/share/doc/cluster-cim-0.16.2/RedHat_ClusterProvider.mof
+ cimmof -n root/cimv2 /usr/share/doc/cluster-cim-0.16.2/RedHat_ClusterSchema.mof
+ wbemcli ei http://root:password@localhost:5988/root/cimv2:RedHat_Cluster
*
* wbemcli: Cim: (1) CIM_ERR_FAILED: ProviderLoadFailure
* (/usr/lib64/Pegasus/providers/libRedHatClusterProvider.so:RedHatClusterProvider):Cannot
* load library, error:
* /usr/lib64/Pegasus/providers/libRedHatClusterProvider.so: undefined symbol:
* _ZNK17ClusterMonitoring4Node4nameEv


CORRECTED version cluster-cim-0.16.2-19.el6:

$ /tmp/reproducer.sh 
+ trap err ERR
+ service modclusterd status
modclusterd (pid  4780) is running...
+ yum -y install sblim-wbemcli cluster-cim
...
Nothing to do
+ rpm -q cluster-cim
cluster-cim-0.16.2-19.el6.x86_64
+ echo -ne '-: ALL\n-: ALL\n'
+ cimconfig -p -s enableHttpConnection=true
Planned value for the property enableHttpConnection is set to "true" in CIMServer.
+ cimconfig -p -s enableHttpsConnection=false
Planned value for the property enableHttpsConnection is set to "false" in CIMServer.
+ service tog-pegasus restart
Shutting down CIM server:                                  [  OK  ]
Starting up CIM server:                                    [  OK  ]
+ cimmof -n root/PG_InterOp /usr/share/doc/cluster-cim-0.16.2/RedHat_ClusterProvider.mof
+ cimmof -n root/cimv2 /usr/share/doc/cluster-cim-0.16.2/RedHat_ClusterSchema.mof
+ wbemcli ei http://root:password@localhost:5988/root/cimv2:RedHat_Cluster
localhost:5988/root/cimv2:RedHat_Cluster.*KEYBINDING="MISSING*" Name="rhel63",Votes=3,VotesNeededForQuorum=2,MaxNumberOfNodes=0,NodesNumber=3,AvailableNodesNumber=3,UnavailableNodesNumber=0,NodesNames="rhel63-node01","rhel63-node02","rhel63-node03",AvailableNodesNames="rhel63-node01","rhel63-node02","rhel63-node03",UnavailableNodesNames=,ServicesNumber=0,RunningServicesNumber=0,StoppedServicesNumber=0,FailedServicesNumber=0,ServicesNames=,RunningServicesNames=,StoppedServicesNames=,FailedServicesNames=,OperationalStatus=2,StatusDescriptions="All services and nodes functional",ClusterState=2,Types=2,CreationClassName="RedHat_Cluster"

Comment 15 errata-xmlrpc 2013-02-21 10:56:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0469.html