Description of problem:

RHV-H 4.0 no longer contains release information in /etc/redhat-release [1], and CEE needs a quick and reliable way to determine this information from a customer's sosreport. The wrapper script nodectl is installed on RHV-H by default, and the output of 'nodectl info' is a useful summary of the available images and their boot options, as well as the current image/layer in use.

Version-Release number of selected component (if applicable):
vdsm-4.18.11-1.el7ev
redhat-release-virtualization-host-4.0-2.el7

How reproducible:
100%

Steps to Reproduce:
1. Run sosreport on an RHV-H system
2. Extract the sosreport
3. Check for sos_commands/vdsm/nodectl_info

Actual results:
This data is not collected.

Expected results:
sos_commands/vdsm/nodectl_info exists and contains the output from 'nodectl info'.

Additional info:

# cat /etc/redhat-release
Red Hat Enterprise Linux release 7.2

# nodectl info
layers:
  rhvh-4.0-0.20160817.0:
    rhvh-4.0-0.20160817.0+1
bootloader:
  default: rhvh-4.0-0.20160817.0+1
  entries:
    rhvh-4.0-0.20160817.0+1:
      index: 0
      title: rhvh-4.0-0.20160817.0
      kernel: /boot/rhvh-4.0-0.20160817.0+1/vmlinuz-3.10.0-327.28.2.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_rhevh-25/rhvh-4.0-0.20160817.0+1 rd.lvm.lv=rhvh_rhevh-25/swap rhgb quiet LANG=en_US.UTF-8 img.bootid=rhvh-4.0-0.20160817.0+1"
      initrd: /boot/rhvh-4.0-0.20160817.0+1/initramfs-3.10.0-327.28.2.el7.x86_64.img
      root: /dev/rhvh_rhevh-25/rhvh-4.0-0.20160817.0+1
current_layer: rhvh-4.0-0.20160817.0+1

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1347621
--- /usr/lib/python2.7/site-packages/sos/plugins/vdsm.py.orig	2016-08-31 17:46:31.769168422 -0400
+++ /usr/lib/python2.7/site-packages/sos/plugins/vdsm.py	2016-08-31 17:48:34.583332264 -0400
@@ -112,6 +112,7 @@ class vdsm(Plugin, RedHatPlugin):
         self.collectExtOutput("/usr/bin/iostat")
         self.collectExtOutput("/sbin/iscsiadm -m node")
         self.collectExtOutput("/sbin/iscsiadm -m session")
+        self.collectExtOutput("/usr/sbin/nodectl info")
         sslopt = ['', '-s '][config.getboolean('vars', 'ssl')]
         vdsclient = "/usr/bin/vdsClient " + sslopt + "0 "
         self.collectExtOutput(vdsclient + "getVdsCapabilities")
If sosreport can't find the command on the host, which will happen when this is run on RHEL, then it just ignores it. This is actually good for us: the presence or absence of sos_commands/vdsm/nodectl_info tells us right away whether we're dealing with RHV-H or RHEL.
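The detection trick described above can be sketched as a small shell helper. This is a minimal sketch, not part of sos; the function name is ours, and it only assumes the sos_commands/vdsm/nodectl_info path mentioned in this report:

```shell
# Classify an extracted sosreport tree: RHV-H collects 'nodectl info',
# plain RHEL does not, so the file's presence is the discriminator.
detect_host_type() {
    sosdir="$1"
    if [ -f "$sosdir/sos_commands/vdsm/nodectl_info" ]; then
        echo "RHV-H"
    else
        echo "RHEL"
    fi
}
```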
This looks like a very easy fix, and it would become even easier if Derrick posted his patch to upstream vdsm and took care of the backporting.
Moving back to MODIFIED as new 4.0.5 clone has been created
The file is there, but it contains Python exceptions instead of info about the node; the command itself works fine on the server.

# cat sosreport-p.1-20170208085724/sos_commands/vdsm/nodectl_info
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/nodectl/__main__.py", line 42, in <module>
    CliApplication()
  File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 200, in CliApplication
    return cmdmap.command(args)
  File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 118, in command
    return self.commands[command](**kwargs)
  File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 76, in info
    Info(self.imgbased, self.machine).write()
  File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 45, in __init__
    self._fetch_information()
  File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 49, in _fetch_information
    self._get_layout()
  File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 66, in _get_layout
    layout = LayoutParser(self.app.imgbase.layout()).parse()
  File "/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 148, in layout
    return self.naming.layout()
  File "/usr/lib/python2.7/site-packages/imgbased/naming.py", line 111, in layout
    raise RuntimeError("No valid layout found. Initialize if needed.")
RuntimeError: No valid layout found. Initialize if needed.

# yum list sos vdsm systemd redhat-release-virtualization-host
Loaded plugins: imgbased-persist, product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Installed Packages
redhat-release-virtualization-host.x86_64    4.1-0.6.el7       installed
sos.noarch                                   3.3-5.el7_3       installed
systemd.x86_64                               219-30.el7_3.6    installed
vdsm.x86_64                                  4.19.4-1.el7ev    installed
Can you attach the systemd log? Also, please paste the output of the 'nodectl info' command.
Adding nodectl output:

# nodectl info
layers:
  rhvh-4.1-0.20170202.0:
    rhvh-4.1-0.20170202.0+1
bootloader:
  default: rhvh-4.1-0.20170202.0+1
  entries:
    rhvh-4.1-0.20170202.0+1:
      index: 0
      title: rhvh-4.1-0.20170202.0
      kernel: /boot/rhvh-4.1-0.20170202.0+1/vmlinuz-3.10.0-514.6.1.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/rhvh-4.1-0.20170202.0+1 rd.lvm.lv=rhvh_slot-10/swap console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170202.0+1"
      initrd: /boot/rhvh-4.1-0.20170202.0+1/initramfs-3.10.0-514.6.1.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.1-0.20170202.0+1
current_layer: rhvh-4.1-0.20170202.0+1

The dev has also received access to the server.
I didn't find any vdsm or sos issue, moving to Node investigation. If it turns out to be an Infra issue, please move it back.
Pavol, can you please provide a system I can test on? I can't reproduce this.
It is the same system you had access to on 14-15 Feb, when we talked about the upgrade issues: LVM was in a state where LVM thought the device existed, but devicemapper did not. As I was not able to reproduce the bug on any other system, feel free to close the bug.
If you encounter this issue in the future, feel free to reopen.
(In reply to Pavol Brilla from comment #14)
> It is same system which you had access, 14-15 Feb, when we talked about
> upgrade issues:
> LVM was in a state where LVM thought the device existed, but devicemapper
> did not.

This sounds like a platform race between lvm and devicemapper, which is utterly unrelated to this little RFE. I believe that the RFE should be marked (by QE) as verified if the nodectl output is collected.
Dan, can you explain what info you need?
Nir, do you think that Pavol has stumbled on a platform race between lvm and devicemapper? Do we already have a bug about that race?
We got to comment #14 (discovered later) due to the lvmetad changes. RHV-H relies heavily on LV tagging behind the scenes. This is either:

1. The change to lvmetad caused every LVM command to output tons of stderr warnings. Imgbased expected "lvs --noheading -otags,name" (for example) to output parsable information, but stderr went to stdout. We now put stderr into a different channel.

2. Some LV was not activated due to lvmetad. Imgbased now runs vgchange before upgrades, but "nodectl info" doesn't need that anyway.

If this isn't reproducible anymore, I'd call the former the root cause and close this.
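The stderr-to-stdout mixing described above can be illustrated with a minimal Python sketch. This is not the actual imgbased code, just a hedged example of the fix's idea: keep the two streams separate so lvmetad warnings cannot end up in the text being parsed. The `sh -c` command in the usage stands in for a real `lvs` invocation:

```python
import subprocess

def run_and_split(cmd):
    """Run a command, returning (stdout, stderr) separately.

    If stderr were merged into stdout (as in the failure mode above),
    lines like "WARNING: Not using lvmetad ..." would land in the same
    stream as the output a caller tries to parse.
    """
    proc = subprocess.Popen(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return out.decode(), err.decode()
```

For example, a command that prints a parsable line on stdout and a lvmetad-style warning on stderr yields clean stdout, with the warning isolated in the second return value.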
(In reply to Dan Kenigsberg from comment #18)
> Nir, do you think that Pavol has stumbled on a platform race between lvm and
> devicemapper? Do we already have a bug about that race?

No.
Pavol/Derrick -- can one of you please provide a test environment?
# cat sosreport-pb.1-20170411101411/sos_commands/vdsm/nodectl_in
layers:
  rhvh-4.1-0.20170403.0:
    rhvh-4.1-0.20170403.0+1
  rhvh-4.0-0.20170307.0:
    rhvh-4.0-0.20170307.0+1
bootloader:
  default: rhvh-4.1-0.20170403.0+1
  entries:
    rhvh-4.1-0.20170403.0+1:
      index: 0
      title: rhvh-4.1-0.20170403.0
      kernel: /boot/rhvh-4.1-0.20170403.0+1/vmlinuz-3.10.0-514.6.1.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/swap rd.lvm.lv=rhvh_slot-10/rhvh-4.1-0.20170403.0+1 console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170403.0+1"
      initrd: /boot/rhvh-4.1-0.20170403.0+1/initramfs-3.10.0-514.6.1.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.1-0.20170403.0+1
    rhvh-4.0-0.20170307.0+1:
      index: 1
      title: rhvh-4.0-0.20170307.0
      kernel: /boot/rhvh-4.0-0.20170307.0+1/vmlinuz-3.10.0-514.10.2.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/rhvh-4.0-0.20170307.0+1 rd.lvm.lv=rhvh_slot-10/swap console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.0-0.20170307.0+1"
      initrd: /boot/rhvh-4.0-0.20170307.0+1/initramfs-3.10.0-514.10.2.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.0-0.20170307.0+1
current_layer: rhvh-4.1-0.20170403.0+1