Bug 1167828
| Summary: | vdsm plugin broken for sos >= 3.1 | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Jiri Belka <jbelka> | |
| Component: | vdsm | Assignee: | Yeela Kaplan <ykaplan> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Jiri Belka <jbelka> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | high | |||
| Version: | 3.4.4 | CC: | aberezin, bazulay, bmr, bugproxy, danken, didi, ecohen, gklein, iheim, jbelka, lpeer, lsurette, lveyde, michal.skrivanek, oourfali, rbalakri, Rhev-m-bugs, rmm, sbonazzo, stirabos, yeylon, ykaplan | |
| Target Milestone: | --- | Keywords: | TestCaseNeeded, TestCaseProvided, ZStream | |
| Target Release: | 3.5.0 | |||
| Hardware: | All | |||
| OS: | Unspecified | |||
| Whiteboard: | infra | |||
| Fixed In Version: | vdsm-4.16.8.1-4.el6ev | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1171468 (view as bug list) | Environment: | ||
| Last Closed: | 2015-02-16 13:37:13 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1168500 | |||
| Bug Blocks: | 1122979, 1162189, 1171468, 1184995 | |||
| Attachments: | ||||
Can you please run "sosreport -o vdsm --verbose" on ibm-p8-rhevm-02 and paste the output?

[root@ibm-p8-rhevm-02 ~]# sosreport -o vdsm --verbose

sosreport (version 3.1)

This command will collect diagnostic and configuration information from
this Red Hat Enterprise Linux system and installed applications.

An archive containing the collected information will be generated in
/var/tmp and may be provided to a Red Hat support representative.

Any information provided to Red Hat will be treated in accordance with
the published support policies at:

  https://access.redhat.com/support/

The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before being
passed to any third party.

No changes will be made to system configuration.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [ibm-p8-rhevm-02.rhts.eng.brq.redhat.com]:
Please enter the case id that you are generating this report for: 666

Running plugins. Please wait ...

Running 1/1: base...
Creating compressed archive...

Your sosreport has been generated and saved in:
  /var/tmp/sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209.tar.xz

The checksum is: 550ab16d2fe99bfbdff225f28ca4b2d9

Please send this file to your support representative.

[root@ibm-p8-rhevm-02 ~]# tar tJf /var/tmp/sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209.tar.xz
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/version.txt
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/sos_logs/
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/sos_logs/ui.log
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/sos_logs/sos.log
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/sos_reports/
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/sos_reports/sos.txt
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/sos_reports/sos.html
sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com.666-20141125152209/sos_commands/

(A second terminal pane was captured interleaved with the session above; its contents - a reverse DNS lookup of the sibling host and the system firmware level - follow:)

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;249.69.16.10.in-addr.arpa. IN PTR

;; ANSWER SECTION:
249.69.16.10.in-addr.arpa. 21590 IN PTR ibm-p8-rhevm-03.rhts.eng.bos.redhat.com.

;; AUTHORITY SECTION:
69.16.10.in-addr.arpa. 21590 IN NS ns1.eng.bne.redhat.com.
69.16.10.in-addr.arpa. 21590 IN NS ns1.eng.nay.redhat.com.
69.16.10.in-addr.arpa. 21590 IN NS ns1.eng.rdu2.redhat.com.
69.16.10.in-addr.arpa. 21590 IN NS ns1.eng.brq.redhat.com.
69.16.10.in-addr.arpa. 21590 IN NS ns1.eng.tlv.redhat.com.
69.16.10.in-addr.arpa. 21590 IN NS ns1.eng.blr.redhat.com.
69.16.10.in-addr.arpa. 21590 IN NS ns1.app.eng.bos.redhat.com.

;; ADDITIONAL SECTION:
ns1.eng.tlv.redhat.com. 103 IN A 10.35.28.1
ns1.eng.tlv.redhat.com. 103 IN AAAA 2620:52:0:231c:5054:ff:fe12:1070
ns1.eng.brq.redhat.com. 276 IN A 10.34.32.3
ns1.eng.brq.redhat.com. 276 IN AAAA 2620:52:0:2223:5054:ff:fe0c:a664
ns1.eng.bne.redhat.com. 211 IN A 10.64.10.64
ns1.eng.bne.redhat.com. 211 IN AAAA 2620:52:0:400b:5054:ff:fe35:ef09
ns1.eng.rdu2.redhat.com. 246 IN A 10.10.160.1
ns1.eng.rdu2.redhat.com. 246 IN AAAA 2620:52:0:aa0::dead:beef
ns1.eng.nay.redhat.com. 194 IN A 10.66.78.111
ns1.eng.nay.redhat.com. 194 IN AAAA 2620:52:0:424f:216:3eff:fe52:717e

;; Query time: 0 msec
;; SERVER: 10.34.32.1#53(10.34.32.1)
;; WHEN: Tue Nov 25 14:05:54 GMT 2014
;; MSG SIZE rcvd: 506

[root@ibm-p8-rhevm-01 ~]# lsmcode
Version of System Firmware is FW810.10 (SV810_081) (t) FW810.10 (SV810_081) (p) FW810.10 (SV810_081) (b)
as per comment #3 the vdsm plugin is not working.

I need a VM / host for debugging since the plugin works on x86_64. Can I grab the vdsm and sos packages (ppc64, x86_64 or source are all fine) used in this test? What is the output of "sosreport -l vdsm" if run from the command line, as root, on the PPC host?

> permissions
corrected.
sosreport -l vdsm
sosreport (version 3.1)
The following plugins are currently enabled:
acpid acpid related information
anacron capture scheduled jobs information
ata ATA and IDE related information (including PATA and SATA)
auditd Auditd related information
block Block device related information
boot Bootloader information
cgroups Red Hat specific cgroup subsystem information
cluster cluster suite and GFS related information
corosync corosync information for RedHat based distribution
cron Crontab information
devicemapper device-mapper related information
dmraid dmraid related information
filesys information on filesystems
gdm gdm related information
general Basic system information for RedHat based distributions
gluster gluster related information
grub2 Bootloader information
hardware hardware related information for Red Hat distribution
hardwaretestsuite Red Hat Hardware Test Suite related information
i18n Internationalization related information
infiniband Infiniband related information
iscsi iscsi-initiator related information Red Hat based distributions
iscsitarget iscsi-target related information for Red Hat distributions
java basic java information
kdump Kdump related information for Red Hat distributions
kernel kernel related information
kimchi kimchi-related information
krb5 Kerberos related information
kvm KVM related information
ldap LDAP related information for RedHat based distribution
libraries information on shared libraries
libvirt libvirt-related information
logrotate logrotate configuration files and debug info
logs Basic system information for RedHat based distributions
lsbrelease Linux Standard Base information
lvm2 lvm2 related information
md MD subsystem information
memory memory usage information
mrggrid MRG GRID related information
mrgmessg MRG Messaging related information
multipath device-mapper multipath information
networking network related information for RedHat based distribution
nfs NFS related information
nis NIS related information
ntp NTP related information for RedHat based distributions
openhpi OpenHPI related information
Openshift Openshift related information
openssl openssl related information for Red Hat distributions
pam PAM related information for RedHat based distribution
pci PCI device related information
powerpc IBM Power System related information
ppp ppp, wvdial and rp-pppoe related information
process process information
processor CPU information
rpm RPM information
sanlock <no description available>
sar Collect system activity reporter data
scsi hardware related information
selinux selinux related information
snmp snmp related information for RedHat based distributions
ssh ssh-related information
startup startup information for RedHat based distributions
system core system related information
systemd Information on systemd and related subsystems
systemtap SystemTap information
sysvipc SysV IPC related information
tuned Tuned related information
udev Udev related information
usb USB device related information
base <no description available>
x11 X related information
xen Xen related information
xfs information on the XFS filesystem
yum yum information
The following plugins are currently disabled:
abrt inactive ABRT log dump
anaconda inactive Anaconda / Installation information
apache inactive Apache related information for Red Hat distributions
autofs inactive autofs server-related on RedHat based distributions
ceph inactive information on CEPH
cobbler inactive cobbler related information
certificatesystem inactive Red Hat Certificate System 7.1, 7.3, 8.0 and dogtag related information
printing inactive printing related information (cups)
dhcp inactive DHCP related information for Red Hat based distributions
distupgrade inactive <no description available>
dovecot inactive dovecot server related information for RedHat based distribution
directoryserver inactive Directory Server information
emc inactive EMC related information (PowerPath, Solutions Enabler CLI and Navisphere CLI)
foreman inactive Foreman project related information
grub inactive Grub information
ipa inactive IPA diagnostic information
ipsec inactive ipsec related information for Red Hat distributions
katello inactive Katello project related information
kernelrt inactive Information specific to the realtime kernel
lilo inactive Lilo information
mysql inactive MySQL related information for RedHat based distributions
named inactive named related information for RedHat based distribution
neutron inactive OpenStack Neutron related information for Red Hat distributions
nfsserver inactive NFS server-related information
nscd inactive NSCD related information
oddjob inactive oddjob related information
openstack-ceilometer inactive OpenStackCeilometer related information for Red Hat distributions.
openstack-cinder inactive OpenStack related information for Red Hat distributions
openstack-glance inactive OpenStackGlance related information for Red Hat distributions.
openstack-heat inactive OpenStackHeat related information for Red Hat distributions.
openstack-horizon inactive OpenStack Horizon related information for Red Hat distributions
openstack-keystone inactive OpenStack Keystone related information for Red Hat distributions
openstack-neutron inactive OpenStack Neutron related information for Red Hat distributions
openstack-nova inactive OpenStack nova related information for Red Hat distributions
openstack-swift inactive OpenStackSwift related information for Red Hat distributions.
openswan inactive ipsec related information
ovirt inactive oVirt Engine related information
postfix inactive mail server related information for RedHat based distributions
postgresql inactive PostgreSQL related information for Red Hat distributions
psacct inactive Process accounting related information for RedHat based distributions
pxe inactive PXE related information for RedHat based distributions
qpid inactive Messaging related information
quagga inactive quagga related information
radius inactive radius related information on Red Hat distributions
rhui inactive Red Hat Update Infrastructure for Cloud Providers data
s390 inactive s390 related information
samba inactive Samba related information
satellite inactive RHN Satellite and Spacewalk related information
sendmail inactive sendmail information for RedHat based distributions
smartcard inactive Smart Card related information
soundcard not default Sound card information for RedHat distros
squid inactive squid Red Hat related information
sssd inactive sssd-related Diagnostic Information on Red Hat based distributions
sunrpc inactive Sun RPC related information for Red Hat systems
tftpserver inactive tftpserver related information
tomcat inactive Tomcat related information
upstart inactive Information on Upstart, the event-based init system.
veritas inactive Veritas related information
vmware inactive VMWare related information
vsftpd inactive FTP server related information
xinetd inactive xinetd information
The following plugin options are available:
auditd.logsize 15 maximum size (MiB) of logs to collect
boot.all-images off collect a file listing for all initramfs images
cluster.gfslockdump off gather output of gfs lockdumps
cluster.crm_from off specify the --from parameter passed to crm_report
cluster.lockdump off gather dlm lockdumps
filesys.lsof off gathers information on all open files
filesys.dumpe2fs off dump filesystem information
libraries.ldconfigv off the name of each directory as it is scanned, and any links that are created.
logs.logsize 15 max size (MiB) to collect per syslog file
logs.all_logs off collect all log files defined in syslog.conf
lvm2.lvmdump off collect an lvmdump tarball
lvm2.lvmdump-am off attempt to collect an lvmdump with advanced options and raw metadata collection
networking.traceroute off collects a traceroute to www.example.com
Openshift.broker off Gathers broker specific files
Openshift.node off Gathers node specific files
rpm.rpmq on queries for package information via rpm -q
rpm.rpmva off runs a verify on all packages
sar.all_sar off gather all system activity records
selinux.fixfiles off Print incorrect file context labels
selinux.list off List objects and their context
startup.servicestatus off get a status of all running services
xfs.logprint off gathers the log information
yum.yumlist off list repositories and packages
yum.yumdebug off gather yum debugging data
[root@ibm-p8-rhevm-02 ~]#
Thanks - from the output in comment #10, vdsm.py is not loadable on this system. Do you see some exception if you run manually with --debug? Also, where are these 3.1 packages getting built/hosted? I can't seem to find anything in brew or dist-git.

I'm requesting to add an automated test to the release criteria:

- install, configure and start vdsm
- install vdsm-test.rpm
- run:

  cd /usr/share/vdsm/tests
  ./run_tests.sh functional/sosPluginTests.py

It's a simple test that could have caught this bug before release.

I tend to disagree with having severity urgent on this bug, since vdsm logs can be easily collected manually by executing "tar czf logs.tgz /var/log/vdsm"; I think high would have been enough.

Running sosreport using the python debugger:

> /usr/lib/python2.7/site-packages/sos/sosreport.py(731)load_plugins()
-> if not len(plugin_classes):
(Pdb) n
> /usr/lib/python2.7/site-packages/sos/sosreport.py(735)load_plugins()
-> plugin_class = self.policy.match_plugin(plugin_classes)
(Pdb) n
> /usr/lib/python2.7/site-packages/sos/sosreport.py(736)load_plugins()
-> if not self.policy.validate_plugin(plugin_class):
(Pdb) plugin_class
<class 'sos.plugins.vdsm.Base'>

I would have expected something like <class 'sos.plugins.vdsm.vdsm'> here, since the class is declared as "class vdsm(Base)" in vdsm.py. Bryn, is this right?

That's an artefact of the scoping changes caused by defining Base in sos/plugins/vdsm.py as opposed to sos/plugintools.py; what's slightly odd is that the plugin does run (albeit with the wrong name):

# sosreport -v --batch --debug -o vdsm

sosreport (version 3.1)
[...]
Running plugins. Please wait ...

Running 1/1: base...
Creating compressed archive...

We don't appear to be throwing an exception in setup/collect, but we're also not collecting anything. I've just logged into the box & started testing - will update the bug again when I have more information to share.

It's not simply that the class derivation is wrong; the resulting class object is completely missing the defined attributes present in the source for the vdsm class:
(Pdb) self.loaded_plugins
deque([('base', <sos.plugins.vdsm.Base object at 0x1000ad2eb50>)])
(Pdb) self.loaded_plugins[0]
('base', <sos.plugins.vdsm.Base object at 0x1000ad2eb50>)
(Pdb) dir(self.loaded_plugins[0])
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
No setup, no option_list, ... no nothing.
It's clear at this point that the compat hack is responsible but it's not entirely clear why.
Generally I would recommend not attempting these types of hacks; it's easier to just ship a native plugin (even if that means having to have two versions of the code lying around).
If they are absolutely required then at a minimum I'd suggest working with the sos team to ensure that the end result is going to function as expected.
The compat wrapper to attempt to load a sos2/sos3 version of the plugin at runtime breaks the sos classloader.
I don't have an environment in which to test right now but unless the python runtime sorts class definitions differently on ppc64 vs. x86_64 (seems rather unlikely) I don't see how this would be specific to Power platforms.
Looking into the namespace of sos.plugins.vdsm:
(Pdb) dir(sos.plugins.vdsm)
['Base', 'Plugin', 'RedHatPlugin', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '_importVdsmPylibModule', 'config', 'os', 'subprocess', 'vdsm']
And then into the vdsm class we find the missing attrs:
(Pdb) dir(sos.plugins.vdsm.vdsm)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_addVdsmRunDir', '_get_dest_for_srcpath', '_path_in_path_list', '_vdsm__addCopySpecLogLimit', 'addCopySpec', 'addCopySpecLimit', 'add_alert', 'add_cmd_output', 'add_copy_spec', 'add_copy_spec_limit', 'add_copy_specs', 'add_custom_text', 'add_forbidden_path', 'add_string_as_file', 'call_ext_prog', 'check_enabled', 'check_ext_prog', 'collect', 'collectExtOutput', 'collect_cmd_output', 'collect_copy_specs', 'collect_strings', 'copy_dir', 'copy_symlink', 'default_enabled', 'do_cmd_output_sub', 'do_copy_path', 'do_file_sub', 'do_path_regex_sub', 'do_regex_find_all', 'expand_copy_spec', 'file_grep', 'files', 'getOption', 'get_all_options', 'get_cmd_output_now', 'get_cmd_output_path', 'get_command_output', 'get_description', 'get_option', 'get_option_as_list', 'is_installed', 'make_command_filename', 'mangle_command', 'name', 'optionList', 'option_enabled', 'packages', 'plugin_name', 'policy', 'postproc', 'report', 'requires_root', 'set_option', 'setup', 'version']
The sos.plugins.vdsm.Base class (the first defined in the file) is being instantiated by the sosreport class loader instead of the desired vdsm class.
This occurs because it is correctly tagged (RedHatPlugin), loads, and appears first in the plugin_classes list for the vdsm module (remember that each sos plugin exists in its own python module):
plugbase: vdsm, [[<class 'sos.plugins.vdsm.Base'>, <class 'sos.plugins.vdsm.vdsm'>]]
It may be possible to fix this in the class loader but I am not keen to endorse this use-case; it is not something we test (or can reasonably test, since its purpose is to allow forward/backward compatibility between runtimes).
The only defined use for multiple class definitions in a plugin module today is portability: that works fine since the class loader will resolve a single instance via the set of enabled tagging classes.
I would have serious concerns about making these compat wrappers a supported feature: it encourages something we are trying to discourage (out-of-tree plugins) and our ability to meaningfully test and maintain the feature is limited.
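To make the failure mode concrete, here is a minimal sketch of the loading behaviour described above - an illustration with stub classes, not the actual sos source. A loader that enumerates a plugin module's classes (inspect.getmembers() returns members sorted by name, so 'Base' precedes 'vdsm') and takes the first match against the tagging class ends up with Base:

import inspect
import sys


class Plugin(object):
    """Stub standing in for sos.plugins.Plugin."""


class RedHatPlugin(object):
    """Stub standing in for the tagging class the Red Hat policies match."""


class Base(Plugin, RedHatPlugin):
    """Stand-in for the compat class defined at the top of sos/plugins/vdsm.py."""


class vdsm(Base):
    def setup(self):
        pass  # real collection logic would live here, but is never reached


def candidate_classes(module):
    # Mimic the loader: every class in the module that subclasses the
    # tagging class qualifies, in alphabetical order.
    return [cls for _, cls in inspect.getmembers(module, inspect.isclass)
            if issubclass(cls, RedHatPlugin) and cls is not RedHatPlugin]


plugin_classes = candidate_classes(sys.modules[__name__])
print(plugin_classes)     # [<class '__main__.Base'>, <class '__main__.vdsm'>]
print(plugin_classes[0])  # a first-match policy instantiates Base, silently
                          # dropping vdsm's setup() and options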
Moving to infra since this will be fixed by fixing bug #1168500, which is on infra.

The monkey patching in vdsm is incompatible with this post-3.1 patch:
commit 4d1351efbd09220c36e889e222c40fe3ae68958a
Author: Bryn M. Reeves <bmr>
Date: Wed Mar 12 20:25:19 2014 +0000
Match plugins against policies
Fixes Issue #238.
When tagging classes are used to enable plugins on multiple
platforms it is possible for there to be more than one valid class
instance for a given policy. For e.g.:
    class DebianFooPlugin(Plugin, DebianPlugin):
        ...

    class UbuntuFooPlugin(Plugin, UbuntuPlugin):
        ...
Since UbuntuPolicy includes both DebianPlugin and UbuntuPlugin in
its valid_subclasses list both classes pass the validity test and
both are added to the loaded_plugins list. This causes plugins
to run twice:
2014-03-12 19:57:50,974 DEBUG: copying file /var/log/mail.log to /var/log/mail.log
2014-03-12 19:57:50,975 DEBUG: added /var/log/mail.log to FileCacheArchive /tmp/sosreport-u1210-vm1-20140312195750
2014-03-12 19:57:51,293 DEBUG: copying file /var/log/mail.log to /var/log/mail.log
2014-03-12 19:57:51,294 DEBUG: added /var/log/mail.log to FileCacheArchive /tmp/sosreport-u1210-vm1-20140312195750
Fix this by adding a match_plugin() method to the policy base
class and prefer plugins that are subclasses of the first entry
in the list. This patch also reverses the order of the
valid_subclasses list for the UbuntuPolicy to ensure preference
is given to native plugins:
self.valid_subclasses = [UbuntuPlugin, DebianPlugin]
Signed-off-by: Bryn M. Reeves <bmr>
[...]
+ def match_plugin(self, plugin_classes):
+ if len(plugin_classes) > 1:
+ for p in plugin_classes:
+ # Give preference to the first listed tagging class
+ # so that e.g. UbuntuPlugin is chosen over DebianPlugin
+ # on an Ubuntu installation.
+ if issubclass(p, self.valid_subclasses[0]):
+ return p
+ return plugin_classes[0]
+
Using __all__ or _-prefixing unfortunately isn't sufficient to 'hide' this internal class from the sos class loader (I was hoping this might give a way to avoid this without needing changes outside vdsm.py).

I have a fix for this now that allows the monkey-patched vdsm.py to load on sos >= 3.1-36-g4d1351e. I'm not certain that it's something we should merge upstream however; it's a kludge and really only of help in this very specific use-case.
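For context on why __all__ doesn't help here, a quick standalone check (plain Python, not sos code): __all__ only constrains "from module import *"; attribute enumeration of the kind a class loader performs still sees every class.

import inspect
import types

mod = types.ModuleType("fakeplugin")
exec("__all__ = ['Wanted']\n"
     "class Wanted(object): pass\n"
     "class _Hidden(object): pass\n", mod.__dict__)

print([name for name, _ in inspect.getmembers(mod, inspect.isclass)])
# -> ['Wanted', '_Hidden']: the underscore-prefixed class is still
#    enumerated, because dir()/getmembers() ignore __all__ entirely.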
There is also another problem (even after addressing the vdsm load problem) with the packages as built and shipped by IBM.
The kimchi package (kimchi-1.2.0-17.7.pkvm2_1.13.ppc64) installed on this host installs an sos plugin:
# rpm -ql kimchi | grep sos
/usr/lib/python2.7/site-packages/sos/plugins/kimchi.py
/usr/lib/python2.7/site-packages/sos/plugins/kimchi.pyc
/usr/lib/python2.7/site-packages/sos/plugins/kimchi.pyo
This plugin attempts to execute several commands that are problematic in a RHEV environment:
    self.add_cmd_output("virsh pool-list --details")
    rc, out, _ = sos_get_command_output('virsh pool-list')
    if rc == 0:
        for pool in out.splitlines()[2:]:
            if pool:
                pool_name = pool.split()[0]
                self.add_cmd_output("virsh vol-list --pool %s --details"
                                    % pool_name)
Since virsh in a RHEV setup requires authentication for the pool-list action these commands all block indefinitely:
72064 pts/0 S+ 0:00 | \_ /usr/bin/python /usr/sbin/sosreport -vvv --batch --debug
72081 pts/0 S 0:00 | \_ /usr/bin/timeout 300s virsh pool-list
72082 pts/0 Tl 0:00 | \_ virsh pool-list
Sos now runs all commands via timeout, so these will be abandoned after 5m; however, since the plugin runs the command three times, it will cause sosreport generation on these systems to take over 15m.
Not exactly an optimal user experience...
The plugin also misuses sos.utilities.sos_get_command_output; this should never be called directly from plugin code (Plugin.get_cmd_output_now() is provided for that purpose).
Since the plugin was never submitted upstream it's had no review from sos developers before now.
Ricardo, can sos/plugins/kimchi.py mentioned in comment 25 use virsh's --readonly, so it can fly on a RHEV host?

If the kimchi plugin is to be used on RHEV systems it should be submitted upstream so it can be reviewed and integrated into the main sos tree. Considering the side effects of running it we've seen here, I'm considering making changes to sos to detect its presence and disable it.

Created attachment 964294 [details]
Fix the existing sos plugin for kimchi on PowerKVM (2.1.1)
------- Comment on attachment From clnperez.com 2014-12-03 18:19 EDT-------
I've tested this patch on a Power system here and would appreciate feedback. The person who was working on the plugin before is out for a couple of weeks so I'm taking a crack at it.
What it created was
$ ls sos_commands/kimchi/
virsh_pool-list_--details virsh_vol-list_--pool_ISO_--details virsh_vol-list_--pool_nfs-pool_--details
virsh_vol-list_--pool_default_--details virsh_vol-list_--pool_logical_--details
Also, given this comment:
If the kimchi plugin is to be used on RHEV systems it should be submitted upstream so it can be reviewed and integrated into the main sos tree.
Considering the side effects of running it we've seen here I'm considering making changes to sos to detect its presence and disable it.
---
is this okay for the time being?
It's an improvement but there are still some problems:
+ self.add_cmd_output_now("virsh -r pool-list --details")
+ file_name = Plugin.get_cmd_output_path('virsh -r pool-list',
+ suggest_filename = u'pools-list')
The return value of get_cmd_output_now() *is* the captured path; if you want to access the data as a file then just store the return and open() it.
I don't know what the next line is supposed to do; you should never call Plugin instance methods via the class object (there is exactly one class method defined on Plugin; Plugin.name()).
The get_cmd_output_path method returns a string that describes the path into which this plugin's command output should be stored. It does *not* collect any data and it does not accept a 'suggest_filename' kwarg. This is documented in the method docstring:
"""Return a path into which this module should store collected
command output
"""
The purpose of this is to support external data collector tools; typically something like:
dump_path = self.get_cmd_output_path("dump_stuff")
self.add_cmd_output("dump_stuff --output-path=%s" % dump_path)
It's clear that the plugin does not work as intended currently as there is no 'pool_list' or 'virsh_-r_pool-list' in the output collected in comment #29.
The rest of the plugin isn't really reviewable at the moment since it cannot possibly work (file_name will always reference a non-existent file, causing an exception in the subsequent open() - this should be visible when run with '--debug').
Why not just send this upstream where it can be maintained with the other plugins?
We are seriously considering adding a blacklisting mechanism due to the frequent problems with 3rd party plugins and the amount of developer time they consume.
------- Comment From clnperez.com 2014-12-04 17:15 EDT-------

(In reply to comment #18) Note: Your Comment #30 is our Comment #18.

First, thanks for the comments. I've got a few questions, and will also pick this up in the m/l ASAP.

> It's an improvement but there are still some problems:
>
> + self.add_cmd_output_now("virsh -r pool-list --details")
> + file_name = Plugin.get_cmd_output_path('virsh -r pool-list',
> + suggest_filename = u'pools-list')
>
> The return value of get_cmd_output_now() *is* the captured path; if you want
> to access the data as a file then just store the return and open() it.

Noted. Thanks.

> I don't know what the next line is supposed to do; you should never call
> Plugin instance methods via the class object (there is exactly one class
> method defined on Plugin; Plugin.name()).
>
> The get_cmd_output_path method returns a string that describes the path into
> which this plugin's command output should be stored. It does *not* collect
> any data and it does not accept a 'suggest_filename' kwarg. This is
> documented in the method docstring:
>
> """Return a path into which this module should store collected
> command output
> """

That was the intent of that line (to just use the output for the following virsh commands), so I didn't want it to store the call without the --details flag.

> The purpose of this is to support external data collector tools; typically
> something like:
>
> dump_path = self.get_cmd_output_path("dump_stuff")
> self.add_cmd_output("dump_stuff --output-path=%s" % dump_path)
>
> It's clear that the plugin does not work as intended currently as there is
> no 'pool_list' or 'virsh_-r_pool-list' in the output collected in comment
> #29.
>
> The rest of the plugin isn't really reviewable at the moment since it cannot
> possibly work (file_name will always reference a non-existent file causing
> an exception in the subsequent open() - this should be visible when run with
> '--debug').

So are you saying it won't work just on RHEV? It worked fine on our PowerKVM system. Is the fact that it is RHEV what's going to cause the file not to be created? So I'll need to find a different way to get the output of the first virsh command (to get the pools defined on the system) to use for the next?

It sounds like I'll be able to just use the --details call and parse that output (with some possible modification to the loop), so that's a non-issue. Just wondering why this won't work, to understand the env a bit better.

> Why not just send this upstream where it can be maintained with the other
> plugins?

We're planning to. We were under the impression that a short-term solution for the current plugin was preferable, and in the long term (a few weeks max, so not too long...), we'd get this upstream and out of kimchi.

> We are seriously considering adding a blacklisting mechanism due to the
> frequent problems with 3rd party plugins and the amount of developer time
> they consume.

That said -- should we abandon this short-term fix?

------- Comment From fnovak.com 2014-12-04 17:51 EDT-------

On the call w/ RH today, what both sides thought best is to, as planned, try and work this in parallel: make the short-term fix a best attempt aligned with what would be needed to push to the sos community (rather than the kimchi community), use that for short-term incorporation into the RHEV/PowerKVM delivery, and also submit it upstream to sos. If we get feedback, update, etc. and have time to respin before RHEV GA, great; if not, the next update will hopefully have the community side taken care of. Any suggestions RH or anyone has as this is updated are welcome. One of the biggest issues, I believe, was ensuring virsh --readonly, but the other updates as discussed are goodness...

> So are you saying it won't work just on RHEV? It worked fine on our PowerKVM
> system.

I'm puzzled as to how this can work on PowerKVM based on the code in comment #29:

01 self.add_cmd_output_now("virsh -r pool-list --details")
02 file_name = Plugin.get_cmd_output_path('virsh -r pool-list',
03                                        suggest_filename = u'pools-list')
04 if file_name is not None:
05     with open(file_name, '-r') as pools:
06         for pool in pools.splitlines()[2:]:
07             if pool:
08                 pool_name = pool.split()[0]
09                 self.add_cmd_output_now("virsh -r vol-list --pool \
10 %s --details" % pool_name)

Line 2 assigns the return of get_cmd_output_path() to file_name. This will be something like:

  "/var/tmp/sosreport-user-host-2014120412345/sos_commands/kimchi/virsh_-r_pool-list"

But nothing ever appears to write anything to that location; the file listing in comment #29 seems to confirm this:

$ ls sos_commands/kimchi/
virsh_pool-list_--details   virsh_vol-list_--pool_ISO_--details   virsh_vol-list_--pool_nfs-pool_--details
virsh_vol-list_--pool_default_--details   virsh_vol-list_--pool_logical_--details

There is no 'virsh_-r_pool-list'; this is why I don't see how this can work on PowerKVM (or anywhere), since the next thing the plugin does is to try to open that file for reading and parse the contents to drive the 'vol-list' loop.

What's really strange though is that you do appear to have the results of the "vol-list" loop in the above directory listing. Did this data really come from the plugin code from the same comment? Or is there some missing context in the diff and this file is collected elsewhere in the version of the plugin you are using? (I do not have access to the full source file for the plugin currently.)

The normal idiom for working with command output that you do not want to store in the report is to use the Plugin.call_ext_prog() method. Something like:

ip_link_result = self.call_ext_prog("ip -o link")
if ip_link_result['status'] == 0:
    for eth in self.get_eth_interfaces(ip_link_result['output']):
        self.add_cmd_output([
            "ethtool "+eth,
            "ethtool -i "+eth
        ])

The return value of call_ext_prog() is a dictionary containing 'status' and 'output' members that can be tested and used to drive further collection.

> We're planning to. We were under the impression that a short-term solution
> for the current plugin was preferable, and in the long term (a few weeks
> max, so not too long...), we'd get this upstream and out of kimchi.

That would be great, thanks - you can submit patches (or post them for discussion) at the sos-devel mailing list:

http://www.redhat.com/mailman/listinfo/sos-devel

Or via the GitHub project pages:

https://github.com/sosreport/sos

> That said -- should we abandon this short-term fix?

No need - if you need to get something together quickly to meet deadlines that's understandable and we can help with that, but the earlier things get submitted (or posted for review feedback) the less chance there is of needing to make significant changes or rewrites before the plugin can be accepted (as well as any knock-on changes to support tooling that consumes the data).
if we get feedback, update, etc and have time to respin before RHEV ga, great, if not, next update will hopefully have the community side taken care of.. Any suggestions RH or anyone has as this is updated welcome.. One of the biggest issues I believe was insuring virsh --readonly but other updates as discussed goodness... > So are you saying it won't work just on RHEV? It worked fine on our PowerKVM > system. I'm puzzled as to how this can work on PowerKVM based on the code in comment #29: 01 self.add_cmd_output_now("virsh -r pool-list --details") 02 file_name = Plugin.get_cmd_output_path('virsh -r pool-list', 03 suggest_filename = u'pools-list') 04 if file_name is not None: 05 with open(file_name, '-r') as pools: 06 for pool in pools.splitlines()[2:]: 07 if pool: 08 pool_name = pool.split()[0] 09 self.add_cmd_output_now("virsh -r vol-list --pool \ 10 %s --details" % pool_name) Line 2 assigns the return of get_cmd_output_path() to file_name. This will be something like: "/var/tmp/sosreport-user-host-2014120412345/sos_commands/kimchi/virsh_-r_pool-list" But nothing ever appears to write anything to that location; the file listing in comment #29 seems to confirm this: $ ls sos_commands/kimchi/ virsh_pool-list_--details virsh_vol-list_--pool_ISO_--details virsh_vol-list_--pool_nfs-pool_--details virsh_vol-list_--pool_default_--details virsh_vol-list_--pool_logical_--details There is no 'virsh_-r_pool-list'; this is why I don't see how this can work on PowerKVM (or anywhere) since the next thing the plugin does is to try to open that file for reading and parse the contents to drive the 'vol-list' loop. What's really strange though is that you do appear to have the results of the "vol-list" loop in the above directory listing. Did this data really come from the plugin code from the same comment? Or is there some missing context in the diff and this file is collected elsewhere in the version of the plugin you are using? (I do not have access to the full source file for the plugin currently). The normal idiom for working with command output that you do not want to store in the report is to use the Plugin.call_ext_prog() method. Something like: 115 ip_link_result = self.call_ext_prog("ip -o link") 116 if ip_link_result['status'] == 0: 117 for eth in self.get_eth_interfaces(ip_link_result['output']): 118 self.add_cmd_output([ 119 "ethtool "+eth, 120 "ethtool -i "+eth 121 ]) The return value of call_ext_prog() is a dictionary containing 'status' and 'output' members that can be tested and used to drive further collection. > We're planning to. We were under the impression that a short-term solution > for the current plugin was preferrable, and in the long term (a few weeks > max, so not too long...), we'd get this upstream and out of kimchi. That would be great, thank - you can submit patches (or post them for discussion) at the sos-devel mailing list: http://www.redhat.com/mailman/listinfo/sos-devel Or via the GitHub project pages: https://github.com/sosreport/sos > That said -- should we abandon this short-term fix? No need - if you need to get something together quickly to meet deadlines that's understandable and we can help with that but the earlier things get submitted (or posted for review feedback) the less chance there is of needing to make significant changes or rewrites before the plugin can be accepted (as well as any knock-on changes to support tooling that consumes the data). Created attachment 965219 [details]
short-term fix for powerkvm
------- Comment on attachment From clnperez.com 2014-12-05 20:24 EDT-------
So, Bryn, you were absolutely right that the output I pasted wasn't from the patch. It turns out that neither running from a local branch nor our make install picked up my changes. Sorry for that confusion.
This version was actually tested on a system (although still one without the RHEV build) and is candidate for our 12/12 internal build.
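For reference, this is roughly what the pool-list loop looks like when rewritten along the lines Bryn describes above - read-only virsh driven from call_ext_prog(), whose 'status'/'output' return members are documented in his comment. It is an illustration of the idiom, not the contents of any of the attached patches:

from sos.plugins import Plugin, RedHatPlugin


class Kimchi(Plugin, RedHatPlugin):
    """kimchi related information (illustrative rewrite, not the shipped code)"""

    plugin_name = 'kimchi'

    def setup(self):
        # Keep the detailed listing in the report...
        self.add_cmd_output("virsh -r pool-list --details")
        # ...but drive the per-pool loop from call_ext_prog(), which returns
        # a dict with 'status' and 'output' members.
        result = self.call_ext_prog("virsh -r pool-list")
        if result['status'] == 0:
            # Skip the two header lines of the pool-list table.
            for pool in result['output'].splitlines()[2:]:
                if pool:
                    pool_name = pool.split()[0]
                    self.add_cmd_output(
                        "virsh -r vol-list --pool %s --details" % pool_name)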
------- Comment From clnperez.com 2014-12-09 16:01 EDT-------

Hi RH, any feedback on this latest patch?

Created attachment 966346 [details]
3.4 vdsm plugin sos patch
Regarding the vdsm.py patch in comment #37: I still do not recommend this approach, and there is every chance that later versions of sos will break it again (e.g. has it been tested with a RHEL7 3.2 build?).

I've posted a reply in the review thread for the Kimchi plugin on sos-devel; generally the plugin looks OK, but I think the virsh operations belong in the existing libvirt plugin (since they are useful to more components than just kimchi but currently will only be collected if the kimchi package is installed). There were a few other minor nits but they should be easy to clear up.

The patch in attachment 966346 [details] works and packs the VDSM logs into the resulting archive, but a few errors show up when it is running:
----------------------------------
Running plugins. Please wait ...
Running 1/1: vdsm... caught IO error copying /var/run/vdsm/svdsm.sock
could not run '/etc/init.d/vdsmd status': command not found
Creating compressed archive...
----------------------------------
Those aren't 'errors' in the sense that something has gone wrong; the plugin is just doing dumb stuff. It's attempting to copy a UNIX socket (svdsm.sock), which obviously won't work, and it's trying to run an initscript that does not exist.

(In reply to Vitor de Lima from comment #40) Both issues are fixed in the master branch of vdsm. We can get them into the d/s 3.4 branch, but I don't think it's particularly urgent.

------- Comment From clnperez.com 2014-12-10 17:33 EDT-------

Quick question re: log sizes. Since the --all-logs option isn't available for this version, is it okay to pull in all files (including tar'd & zipped log archives)? Or should we limit it to only the .log files? e.g.:

]# ls var/log/nginx/
access.log               access.log-20141125.gz   access.log-20141202.gz   error.log-20141122.gz   error.log-20141203.gz   error.log-20141209.gz
access.log-20141119.gz   access.log-20141126.gz   access.log-20141206.gz   error.log-20141125.gz   error.log-20141204.gz   error.log-20141210.gz
access.log-20141121.gz   access.log-20141127.gz   access.log-20141210.gz   error.log-20141128.gz   error.log-20141205.gz
access.log-20141122.gz   access.log-20141128.gz   error.log                error.log-20141202.gz   error.log-20141206.gz

Kimchi uses nginx, so I'd like to pull in that info too. Thanks!

alright, given the time pressure I think it's good enough. So IIUC both changes as posted here in this bug are ok. And there's still some work upstream on both...

> Since the --all-logs option isn't available for this version

The all_logs option is available as a per-plugin option in 3.1, i.e. set like:

# sosreport -o logs -k logs.all_logs

Since the syntax for accessing global and per-plugin logs is the same, you can add this to the plugin now and it will get hooked up to the global --all-logs when used in versions that support that. E.g. from the 3.1 logs plugin:

if self.get_option('all_logs'):
    logs = self.do_regex_find_all("^\S+\s+(-?\/.*$)\s+",
                                  "/etc/syslog.conf")
    [...]

> Kimchi uses nginx, so I'd like to pull in that info too.

This should be in a separate plugin; Kimchi is not the only user of nginx.

Created attachment 967286 [details]
short-term fix for powerkvm v3
------- Comment on attachment From clnperez.com 2014-12-11 15:46 EDT-------
Here's the latest version, using the all_logs option, and I also added a log_size option.
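As a rough illustration of what that option wiring typically looks like in a sos 3.1-style plugin (the option names follow the conventions discussed above, but the /var/log/kimchi path and the exact options here are assumptions for the example, not the attachment contents):

from sos.plugins import Plugin, RedHatPlugin


class Kimchi(Plugin, RedHatPlugin):
    """kimchi related information (sketch of the option wiring only)"""

    plugin_name = 'kimchi'
    # (name, description, 'fast'/'slow'/'', default); settable on the
    # command line as e.g. "-k kimchi.all_logs=True".
    option_list = [
        ("all_logs", "collect all kimchi log files", "", False),
        ("logsize", "max size (MiB) to collect per log file", "", 15),
    ]

    def setup(self):
        if self.get_option("all_logs"):
            # Everything, including rotated/compressed archives.
            self.add_copy_spec("/var/log/kimchi/*")
        else:
            self.add_copy_spec_limit("/var/log/kimchi/*.log",
                                     sizelimit=self.get_option("logsize"))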
Created attachment 967365 [details]
short-term fix for powerkvm v4
------- Comment on attachment From clnperez.com 2014-12-11 18:53 EDT-------
One more try. I realized I misunderstood what Bryn said about the all_logs flag, so that's been added. I also renamed the log_limit parameter to match other plugins' logsize parameter, for consistency.
Sorry about the rapid-fire patches here...
Created attachment 969063 [details]
short-term fix for powerkvm v3
default comment by bridge
Created attachment 969064 [details]
short-term fix for powerkvm v4
default comment by bridge
ok, rhel7 x86_64. This bug could not be verified on ppc64, as we have _only_ 3.4 vdsm there, and that one is covered by BZ1171468.

vdsm-4.16.8.1-4.el7ev.x86_64
sos-3.2-9.el7.noarch
Created attachment 961216 [details]
ovirt-log-collector-20141125140232.log

Description of problem:

there are no vdsm logs for ppc64 hosts

# grep --color ',vdsm,' /var/log/ovirt-engine/ovirt-log-collector/ovirt-log-collector-20141125140232.log | grep ibm-p8
2014-11-25 14:08:07::DEBUG::engine-log-collector::218::root:: calling(['/usr/bin/ssh', '-n', '-p', '22', '-i', '/etc/pki/ovirt-engine/keys/engine_id_rsa', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=600', 'root.eng.brq.redhat.com', "\nVERSION=`/bin/rpm -q --qf '[%{VERSION}]' sos | /bin/sed 's/\\.//'`;\nif [ $VERSION -ge 30 ]; then\n /usr/sbin/sosreport --batch -k logs.all_logs=True -o logs,libvirt,vdsm,general,networking,hardware,process,yum,filesys,devicemapper,selinux,kernel,memory,rpm\nelif [ $VERSION -ge 22 ]; then\n /usr/sbin/sosreport --batch -k general.all_logs=True -o libvirt,vdsm,general,networking,hardware,process,yum,filesys,devicemapper,selinux,kernel,memory,rpm\nelif [ $VERSION -ge 17 ]; then\n /usr/sbin/sosreport --no-progressbar -k general.all_logs=True -o vdsm,general,networking,hardware,process,yum,filesys\nelse\n /bin/echo No", 'valid', 'version', 'of', 'sosreport', 'found. 1>&2\n exit 1\nfi\n'])
2014-11-25 14:08:07::DEBUG::engine-log-collector::218::root:: calling(['/usr/bin/ssh', '-n', '-p', '22', '-i', '/etc/pki/ovirt-engine/keys/engine_id_rsa', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=600', 'root.eng.brq.redhat.com', "\nVERSION=`/bin/rpm -q --qf '[%{VERSION}]' sos | /bin/sed 's/\\.//'`;\nif [ $VERSION -ge 30 ]; then\n /usr/sbin/sosreport --batch -k logs.all_logs=True -o logs,libvirt,vdsm,general,networking,hardware,process,yum,filesys,devicemapper,selinux,kernel,memory,rpm\nelif [ $VERSION -ge 22 ]; then\n /usr/sbin/sosreport --batch -k general.all_logs=True -o libvirt,vdsm,general,networking,hardware,process,yum,filesys,devicemapper,selinux,kernel,memory,rpm\nelif [ $VERSION -ge 17 ]; then\n /usr/sbin/sosreport --no-progressbar -k general.all_logs=True -o vdsm,general,networking,hardware,process,yum,filesys\nelse\n /bin/echo No", 'valid', 'version', 'of', 'sosreport', 'found. 1>&2\n exit 1\nfi\n'])

[root@jb-rhevm34 ~]# tar xJf /tmp/sosreport-LogCollector-20141125140835.tar.xz './log-collector-data/ibm-p8-rhevm-0*.rhts.eng.brq.redhat.com/ibm-p8-rhevm-0*.rhts.eng.brq.redhat.com-sosreport-ibm-p8-rhevm-0*.rhts.eng.brq.redhat.com-2014*.tar.xz'
[root@jb-rhevm34 ~]# for f in log-collector-data/*/* ; do tar tJvf $f ; done | fgrep vdsm
-rw-r--r-- root/root 76 2014-11-07 00:11 sosreport-ibm-p8-rhevm-01.rhts.eng.brq.redhat.com-20141125130808/etc/sysctl.d/vdsm.conf
-rw------- root/root 443 2014-11-25 10:04 sosreport-ibm-p8-rhevm-01.rhts.eng.brq.redhat.com-20141125130808/etc/libvirt/nwfilter/vdsm-no-mac-spoofing.xml
lrw------- root/root 0 2014-11-12 09:54 sosreport-ibm-p8-rhevm-01.rhts.eng.brq.redhat.com-20141125130808/etc/libvirt/qemu/networks/autostart/vdsm-rhevm.xml -> ../vdsm-rhevm.xml
-rw------- root/root 383 2014-11-12 09:54 sosreport-ibm-p8-rhevm-01.rhts.eng.brq.redhat.com-20141125130808/etc/libvirt/qemu/networks/vdsm-rhevm.xml
-rw-r--r-- root/root 91 2014-05-17 05:04 sosreport-ibm-p8-rhevm-01.rhts.eng.brq.redhat.com-20141125130808/etc/modprobe.d/vdsm-nestedvt.conf
-r--r--r-- root/root 2241 2014-11-25 14:08 sosreport-ibm-p8-rhevm-01.rhts.eng.brq.redhat.com-20141125130808/proc/net/dev_snmp6/;vdsmdummy;
-rw-r--r-- root/root 76 2014-11-07 00:11 sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com-20141125130813/etc/sysctl.d/vdsm.conf
-rw------- root/root 443 2014-11-25 12:40 sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com-20141125130813/etc/libvirt/nwfilter/vdsm-no-mac-spoofing.xml
lrw------- root/root 0 2014-11-12 09:53 sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com-20141125130813/etc/libvirt/qemu/networks/autostart/vdsm-rhevm.xml -> ../vdsm-rhevm.xml
-rw------- root/root 383 2014-11-12 09:53 sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com-20141125130813/etc/libvirt/qemu/networks/vdsm-rhevm.xml
-r--r--r-- root/root 2241 2014-11-25 14:08 sosreport-ibm-p8-rhevm-02.rhts.eng.brq.redhat.com-20141125130813/proc/net/dev_snmp6/;vdsmdummy;

but vdsm logs exist for x86 host in same DC

[root@jb-rhevm34 ~]# tar tvJf ./log-collector-data/10.34.63.223/10.34.63.223-sosreport-admin-20141125140829-b7e4.tar.xz | fgrep /var/log/vdsm/
drwxr-xr-x root/root 0 2014-11-25 14:08 dell-r210ii-04-2014112514081416920888/var/log/vdsm/
-rw-r--r-- root/root 13527 2014-11-25 13:55 dell-r210ii-04-2014112514081416920888/var/log/vdsm/supervdsm.log
-rw-r--r-- vdsm/kvm 686 2014-11-25 14:08 dell-r210ii-04-2014112514081416920888/var/log/vdsm/connectivity.log
-rw-r--r-- vdsm/kvm 3303920 2014-11-25 14:08 dell-r210ii-04-2014112514081416920888/var/log/vdsm/vdsm.log
-rw-r--r-- vdsm/kvm 0 2014-11-25 09:04 dell-r210ii-04-2014112514081416920888/var/log/vdsm/metadata.log
-rw-r--r-- vdsm/kvm 964 2014-11-25 09:09 dell-r210ii-04-2014112514081416920888/var/log/vdsm/mom.log

[root@ibm-p8-rhevm-02 ~]# ls -ldZ /var/log/vdsm
drwxr-xr-x. vdsm kvm system_u:object_r:virt_log_t:s0 /var/log/vdsm
[root@ibm-p8-rhevm-02 ~]# grep denied /var/log/audit/audit.log
[root@ibm-p8-rhevm-02 ~]#

Version-Release number of selected component (if applicable):
rhevm-log-collector-3.4.5-2.el6ev.noarch / sos-3.1-1.1.pkvm2_1_1.6.noarch

How reproducible:
100%

Steps to Reproduce:
1. add ppc64 hosts into 3.4 rhevm
2. execute log collector
3. inspect collected data

Actual results:
no vdsm logs for ppc64 hosts

Expected results:
should be there

Additional info: