Bug 1372093
| Summary: | vdsm sos plugin should collect 'nodectl info' output | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Derrick Ornelas <dornelas> | |
| Component: | vdsm | Assignee: | Ryan Barry <rbarry> | |
| Status: | CLOSED ERRATA | QA Contact: | Pavol Brilla <pbrilla> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 4.0.2 | CC: | bazulay, cshao, danken, dguo, dornelas, gklein, lsurette, lsvaty, mgoldboi, mkalinin, mperina, nsoffer, oourfali, pbrilla, srevivo, weiwang, ycui, ykaul | |
| Target Milestone: | ovirt-4.1.1-1 | Keywords: | TestOnly, ZStream | |
| Target Release: | --- | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | | Doc Type: | If docs needed, set a value | |
| Doc Text: | | Story Points: | --- | |
| Clone Of: | ||||
| : | 1381590 (view as bug list) | Environment: | ||
| Last Closed: | 2017-04-25 00:47:26 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1381590 | |||
--- /usr/lib/python2.7/site-packages/sos/plugins/vdsm.py.orig 2016-08-31 17:46:31.769168422 -0400
+++ /usr/lib/python2.7/site-packages/sos/plugins/vdsm.py 2016-08-31 17:48:34.583332264 -0400
@@ -112,6 +112,7 @@ class vdsm(Plugin, RedHatPlugin):
self.collectExtOutput("/usr/bin/iostat")
self.collectExtOutput("/sbin/iscsiadm -m node")
self.collectExtOutput("/sbin/iscsiadm -m session")
+ self.collectExtOutput("/usr/sbin/nodectl info")
sslopt = ['', '-s '][config.getboolean('vars', 'ssl')]
vdsclient = "/usr/bin/vdsClient " + sslopt + "0 "
self.collectExtOutput(vdsclient + "getVdsCapabilities")
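For reference, collectExtOutput() is the legacy sos plugin hook; sos 3.x plugins can also use the newer add_cmd_output() spelling. A hypothetical equivalent of the added line under that API, assuming the same vdsm plugin setup() context as in the patch above:

    # sos 3.x API equivalent of the collectExtOutput() call added above;
    # sos records the result as sos_commands/vdsm/nodectl_info.
    self.add_cmd_output("/usr/sbin/nodectl info")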
If sosreport can't find the command on the host, which will happen when this is run on RHEL, it just ignores it. This is actually good for us: we can use the presence or absence of sos_commands/vdsm/nodectl_info to tell right away whether we're dealing with RHV-H or RHEL (a short sketch of this check follows the package listing below).

This looks like a very easy fix, and it would become even easier if Derrick posts his patch to upstream vdsm and takes care of the backporting.

Moving back to MODIFIED as a new 4.0.5 clone has been created.

The file is there, but it contains Python exceptions rather than info about the node; the command itself works fine on the server:
# cat sosreport-p.1-20170208085724/sos_commands/vdsm/nodectl_info
WARNING: Not using lvmetad because config setting use_lvmetad=0.
WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/lib/python2.7/site-packages/nodectl/__main__.py", line 42, in <module>
CliApplication()
File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 200, in CliApplication
return cmdmap.command(args)
File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 118, in command
return self.commands[command](**kwargs)
File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 76, in info
Info(self.imgbased, self.machine).write()
File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 45, in __init__
self._fetch_information()
File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 49, in _fetch_information
self._get_layout()
File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 66, in _get_layout
layout = LayoutParser(self.app.imgbase.layout()).parse()
File "/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 148, in layout
return self.naming.layout()
File "/usr/lib/python2.7/site-packages/imgbased/naming.py", line 111, in layout
raise RuntimeError("No valid layout found. Initialize if needed.")
RuntimeError: No valid layout found. Initialize if needed.
# yum list sos vdsm systemd redhat-release-virtualization-host
Loaded plugins: imgbased-persist, product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Installed Packages
redhat-release-virtualization-host.x86_64 4.1-0.6.el7 installed
sos.noarch 3.3-5.el7_3 installed
systemd.x86_64 219-30.el7_3.6 installed
vdsm.x86_64 4.19.4-1.el7ev installed
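As a rough sketch of the triage idea above (a hypothetical helper, not part of sos; it assumes the sos_commands/vdsm layout seen in this bug and that the report has already been extracted):

    import os

    def host_flavor(sos_root):
        # RHV-H ships nodectl and RHEL does not, so the presence of the
        # collected nodectl_info file distinguishes the two at a glance.
        path = os.path.join(sos_root, "sos_commands", "vdsm", "nodectl_info")
        return "RHV-H" if os.path.exists(path) else "RHEL (or nodectl absent)"

For example, host_flavor("sosreport-p.1-20170208085724") would return "RHV-H" for the report quoted above.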
Can you attach the systemd log? Also, please paste the 'nodectl info' command output.

Adding 'nodectl info' output:
# nodectl info
layers:
  rhvh-4.1-0.20170202.0:
    rhvh-4.1-0.20170202.0+1
bootloader:
  default: rhvh-4.1-0.20170202.0+1
  entries:
    rhvh-4.1-0.20170202.0+1:
      index: 0
      title: rhvh-4.1-0.20170202.0
      kernel: /boot/rhvh-4.1-0.20170202.0+1/vmlinuz-3.10.0-514.6.1.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/rhvh-4.1-0.20170202.0+1 rd.lvm.lv=rhvh_slot-10/swap console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170202.0+1"
      initrd: /boot/rhvh-4.1-0.20170202.0+1/initramfs-3.10.0-514.6.1.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.1-0.20170202.0+1
current_layer: rhvh-4.1-0.20170202.0+1
Also, dev received access to the server.
I didn't find any vdsm or sos issue, moving to Node for investigation. If it turns out to be an Infra issue, please move it back.

Pavol, can you please provide a system I can test on? I can't reproduce this.

It is the same system you had access to, 14-15 Feb, when we talked about upgrade issues: LVM was in a state where LVM thought the device existed, but devicemapper did not.

As I was not able to reproduce the bug on any other system, feel free to close the bug. If you encounter this issue in the future, feel free to reopen.

(In reply to Pavol Brilla from comment #14)
> It is same system which you had access, 14-15 Feb, when we talked about
> upgrade issues:
> LVM was in a state where LVM thought the device existed, but devicemapper
> did not.

This sounds like a platform race between lvm and devicemapper, which is utterly unrelated to this little RFE. I believe that the RFE should be marked (by QE) as verified if the nodectl output is collected.

Dan, can you explain what info you need?

Nir, do you think that Pavol has stumbled on a platform race between lvm and devicemapper? Do we already have a bug about that race?

We got to comment #14 (discovered later) due to the lvmetad changes. RHV-H relies heavily on LV tagging behind the scenes. This is either:

- The change to lvmetad caused any LVM command to output tons of stderr warnings. Imgbased expected "lvs --noheading -otags,name" (for example) to output parsable information, and stderr went to stdout. We now put stderr into a different channel (a minimal sketch of that pattern appears after the nodectl_info output below).
- Some LV was not activated due to lvmetad. Imgbased now runs vgchange before upgrades, but "nodectl info" doesn't need that anyway.

If this isn't reproducible anymore, I'd call the root cause the former, and close this.

(In reply to Dan Kenigsberg from comment #18)
> Nir, do you think that Pavol has stumbled on a platform race between lvm and
> devicemapper? Do we already have a bug about that race?

No.

Pavol/Derrick -- can one of you please provide a test environment?

# cat sosreport-pb.1-20170411101411/sos_commands/vdsm/nodectl_in
layers:
  rhvh-4.1-0.20170403.0:
    rhvh-4.1-0.20170403.0+1
  rhvh-4.0-0.20170307.0:
    rhvh-4.0-0.20170307.0+1
bootloader:
  default: rhvh-4.1-0.20170403.0+1
  entries:
    rhvh-4.1-0.20170403.0+1:
      index: 0
      title: rhvh-4.1-0.20170403.0
      kernel: /boot/rhvh-4.1-0.20170403.0+1/vmlinuz-3.10.0-514.6.1.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/swap rd.lvm.lv=rhvh_slot-10/rhvh-4.1-0.20170403.0+1 console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170403.0+1"
      initrd: /boot/rhvh-4.1-0.20170403.0+1/initramfs-3.10.0-514.6.1.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.1-0.20170403.0+1
    rhvh-4.0-0.20170307.0+1:
      index: 1
      title: rhvh-4.0-0.20170307.0
      kernel: /boot/rhvh-4.0-0.20170307.0+1/vmlinuz-3.10.0-514.10.2.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/rhvh-4.0-0.20170307.0+1 rd.lvm.lv=rhvh_slot-10/swap console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.0-0.20170307.0+1"
      initrd: /boot/rhvh-4.0-0.20170307.0+1/initramfs-3.10.0-514.10.2.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.0-0.20170307.0+1
current_layer: rhvh-4.1-0.20170403.0+1
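The "different channel" fix mentioned in the comments amounts to keeping LVM's stderr warnings out of the stream that imgbased parses. A minimal sketch of that pattern (illustrative only, not the actual imgbased code; the lvs field names here are assumptions):

    import subprocess

    def lv_tags_and_names():
        # Capture stdout and stderr separately so lvmetad warnings emitted
        # on stderr cannot corrupt the parsable tag/name listing on stdout.
        proc = subprocess.Popen(
            ["lvs", "--noheadings", "-o", "lv_tags,lv_name"],
            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
            universal_newlines=True)
        out, err = proc.communicate()
        if proc.returncode != 0:
            raise RuntimeError("lvs failed: %s" % err.strip())
        return [line.split() for line in out.splitlines() if line.strip()]

Had the two streams been merged (stderr=subprocess.STDOUT), the lvmetad warnings visible at the top of the failing nodectl_info above would have landed in the parsed output.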
Description of problem:

RHV-H 4.0 no longer contains release information in /etc/redhat-release [1], and CEE needs a quick and reliable way to determine this information from a customer's sosreport. The wrapper script, nodectl, is installed on RHV-H by default, and the output of 'nodectl info' appears to be a nice summary of the available images and their boot options as well as the current image/layer in use.

Version-Release number of selected component (if applicable):
vdsm-4.18.11-1.el7ev
redhat-release-virtualization-host-4.0-2.el7

How reproducible:
100%

Steps to Reproduce:
1. Run sosreport on a RHV-H system
2. Extract the sosreport
3. Check for sos_commands/vdsm/nodectl_info

Actual results:
This data is not collected

Expected results:
sos_commands/vdsm/nodectl_info exists and contains the output from 'nodectl info'

Additional info:

# cat /etc/redhat-release
Red Hat Enterprise Linux release 7.2

# nodectl info
layers:
  rhvh-4.0-0.20160817.0:
    rhvh-4.0-0.20160817.0+1
bootloader:
  default: rhvh-4.0-0.20160817.0+1
  entries:
    rhvh-4.0-0.20160817.0+1:
      index: 0
      title: rhvh-4.0-0.20160817.0
      kernel: /boot/rhvh-4.0-0.20160817.0+1/vmlinuz-3.10.0-327.28.2.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_rhevh-25/rhvh-4.0-0.20160817.0+1 rd.lvm.lv=rhvh_rhevh-25/swap rhgb quiet LANG=en_US.UTF-8 img.bootid=rhvh-4.0-0.20160817.0+1"
      initrd: /boot/rhvh-4.0-0.20160817.0+1/initramfs-3.10.0-327.28.2.el7.x86_64.img
      root: /dev/rhvh_rhevh-25/rhvh-4.0-0.20160817.0+1
current_layer: rhvh-4.0-0.20160817.0+1

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1347621
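Since the 'nodectl info' output shown above is YAML, triage tooling can pull the running image directly from a collected report. A minimal sketch (hypothetical helper; assumes PyYAML is available and that the collected file contains clean output rather than the traceback or LVM warnings shown earlier):

    import yaml

    def current_layer(nodectl_info_path):
        # Return the active image/layer recorded by 'nodectl info',
        # e.g. "rhvh-4.0-0.20160817.0+1" for the output above.
        with open(nodectl_info_path) as f:
            data = yaml.safe_load(f)
        return data.get("current_layer")

    print(current_layer("sos_commands/vdsm/nodectl_info"))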