Bug 1372093 - vdsm sos plugin should collect 'nodectl info' output
Summary: vdsm sos plugin should collect 'nodectl info' output
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.0.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.1.1-1
Target Release: ---
Assignee: Ryan Barry
QA Contact: Pavol Brilla
URL:
Whiteboard:
Depends On:
Blocks: 1381590
 
Reported: 2016-08-31 21:46 UTC by Derrick Ornelas
Modified: 2021-08-30 12:17 UTC
CC: 18 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1381590 (view as bug list)
Environment:
Last Closed: 2017-04-25 00:47:26 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-43219 0 None None None 2021-08-30 12:17:31 UTC
Red Hat Knowledge Base (Solution) 2748561 0 None None None 2016-11-03 20:45:37 UTC
Red Hat Product Errata RHEA-2017:0998 0 normal SHIPPED_LIVE VDSM bug fix and enhancement update 4.1 GA 2017-04-18 20:11:39 UTC
oVirt gerrit 63765 0 master MERGED sos: collect 'nodectl info' output 2020-05-04 07:40:29 UTC
oVirt gerrit 63775 0 ovirt-4.0 MERGED sos: collect 'nodectl info' output 2020-05-04 07:40:29 UTC

Description Derrick Ornelas 2016-08-31 21:46:07 UTC
Description of problem:  

RHV-H 4.0 no longer contains release information in /etc/redhat-release[1], and CEE needs a quick and reliable way to determine this information from a customer's sosreport.  The wrapper script, nodectl, is installed on RHV-H by default, and the output of 'nodectl info' appears to be a nice summary of the available images and their boot options as well as the current image/layer in use.  


Version-Release number of selected component (if applicable):
vdsm-4.18.11-1.el7ev
redhat-release-virtualization-host-4.0-2.el7


How reproducible:  100%


Steps to Reproduce:
1.  Run sosreport on RHV-H system
2.  Extract sosreport
3.  Check for sos_commands/vdsm/nodectl_info

Actual results:
This data is not collected


Expected results:
sos_commands/vdsm/nodectl_info exists and contains the output from 'nodectl info'


Additional info:

# cat /etc/redhat-release 
Red Hat Enterprise Linux release 7.2

# nodectl info
layers: 
  rhvh-4.0-0.20160817.0: 
    rhvh-4.0-0.20160817.0+1
bootloader: 
  default: rhvh-4.0-0.20160817.0+1
  entries: 
    rhvh-4.0-0.20160817.0+1: 
      index: 0
      title: rhvh-4.0-0.20160817.0
      kernel: /boot/rhvh-4.0-0.20160817.0+1/vmlinuz-3.10.0-327.28.2.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_rhevh-25/rhvh-4.0-0.20160817.0+1 rd.lvm.lv=rhvh_rhevh-25/swap rhgb quiet LANG=en_US.UTF-8 img.bootid=rhvh-4.0-0.20160817.0+1"
      initrd: /boot/rhvh-4.0-0.20160817.0+1/initramfs-3.10.0-327.28.2.el7.x86_64.img
      root: /dev/rhvh_rhevh-25/rhvh-4.0-0.20160817.0+1
current_layer: rhvh-4.0-0.20160817.0+1



[1] https://bugzilla.redhat.com/show_bug.cgi?id=1347621

Comment 1 Derrick Ornelas 2016-08-31 21:59:18 UTC
--- /usr/lib/python2.7/site-packages/sos/plugins/vdsm.py.orig	2016-08-31 17:46:31.769168422 -0400
+++ /usr/lib/python2.7/site-packages/sos/plugins/vdsm.py	2016-08-31 17:48:34.583332264 -0400
@@ -112,6 +112,7 @@ class vdsm(Plugin, RedHatPlugin):
         self.collectExtOutput("/usr/bin/iostat")
         self.collectExtOutput("/sbin/iscsiadm -m node")
         self.collectExtOutput("/sbin/iscsiadm -m session")
+        self.collectExtOutput("/usr/sbin/nodectl info")
         sslopt = ['', '-s '][config.getboolean('vars', 'ssl')]
         vdsclient = "/usr/bin/vdsClient " + sslopt + "0 "
         self.collectExtOutput(vdsclient + "getVdsCapabilities")
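
For reference, a rough sketch of how the same one-line collection could look as a standalone sos 3.x plugin using add_cmd_output (illustrative only; the actual fix is the in-place patch to vdsm's bundled plugin shown above, and the class/attribute names here are hypothetical):

from sos.plugins import Plugin, RedHatPlugin

class NodectlInfo(Plugin, RedHatPlugin):
    """Collect 'nodectl info' output on RHV-H hosts (sketch)"""
    plugin_name = 'nodectl_info'
    files = ('/usr/sbin/nodectl',)  # only enable when nodectl exists on the host

    def setup(self):
        self.add_cmd_output("/usr/sbin/nodectl info")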

Comment 2 Derrick Ornelas 2016-08-31 22:03:32 UTC
If sosreport can't find the command on the host, which will happen when this is run on RHEL, it simply skips it.  This actually works in our favor: the presence or absence of sos_commands/vdsm/nodectl_info tells us right away whether we're dealing with RHV-H or RHEL.
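
As a quick sketch of that check (a hypothetical helper for support tooling, not part of sos or vdsm):

import os

def host_flavor(sosreport_dir):
    """Guess RHV-H vs RHEL from an extracted sosreport based on the collected file."""
    path = os.path.join(sosreport_dir, "sos_commands", "vdsm", "nodectl_info")
    return "RHV-H" if os.path.exists(path) else "RHEL"

# e.g. host_flavor("sosreport-p.1-20170208085724") -> "RHV-H"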

Comment 5 Dan Kenigsberg 2016-09-13 06:32:36 UTC
This looks like a very easy fix, and it would become even easier if Derrick posted his patch to upstream vdsm and took care of the backporting.

Comment 7 Martin Perina 2016-10-04 13:37:42 UTC
Moving back to MODIFIED as a new 4.0.5 clone has been created.

Comment 9 Pavol Brilla 2017-02-08 08:02:53 UTC
The file is there, but it contains Python exceptions instead of info about the node; the command itself works fine on the server.

# cat sosreport-p.1-20170208085724/sos_commands/vdsm/nodectl_info 
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/nodectl/__main__.py", line 42, in <module>
    CliApplication()
  File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 200, in CliApplication
    return cmdmap.command(args)
  File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 118, in command
    return self.commands[command](**kwargs)
  File "/usr/lib/python2.7/site-packages/nodectl/__init__.py", line 76, in info
    Info(self.imgbased, self.machine).write()
  File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 45, in __init__
    self._fetch_information()
  File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 49, in _fetch_information
    self._get_layout()
  File "/usr/lib/python2.7/site-packages/nodectl/info.py", line 66, in _get_layout
    layout = LayoutParser(self.app.imgbase.layout()).parse()
  File "/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 148, in layout
    return self.naming.layout()
  File "/usr/lib/python2.7/site-packages/imgbased/naming.py", line 111, in layout
    raise RuntimeError("No valid layout found. Initialize if needed.")
RuntimeError: No valid layout found. Initialize if needed.

# yum list sos vdsm systemd redhat-release-virtualization-host 
Loaded plugins: imgbased-persist, product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Installed Packages
redhat-release-virtualization-host.x86_64     4.1-0.6.el7          installed
sos.noarch                                    3.3-5.el7_3          installed
systemd.x86_64                                219-30.el7_3.6       installed
vdsm.x86_64                                   4.19.4-1.el7ev       installed

Comment 10 Irit Goihman 2017-02-08 12:02:16 UTC
Can you attach the systemd log?
Also, please paste the 'nodectl info' command output.

Comment 11 Pavol Brilla 2017-02-10 07:42:49 UTC
Adding 'nodectl info' output:

# nodectl info
layers: 
  rhvh-4.1-0.20170202.0: 
    rhvh-4.1-0.20170202.0+1
bootloader: 
  default: rhvh-4.1-0.20170202.0+1
  entries: 
    rhvh-4.1-0.20170202.0+1: 
      index: 0
      title: rhvh-4.1-0.20170202.0
      kernel: /boot/rhvh-4.1-0.20170202.0+1/vmlinuz-3.10.0-514.6.1.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/rhvh-4.1-0.20170202.0+1 rd.lvm.lv=rhvh_slot-10/swap console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170202.0+1"
      initrd: /boot/rhvh-4.1-0.20170202.0+1/initramfs-3.10.0-514.6.1.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.1-0.20170202.0+1
current_layer: rhvh-4.1-0.20170202.0+1


Also, dev received access to the server.

Comment 12 Irit Goihman 2017-02-14 09:15:20 UTC
I didn't find any vdsm or sos issue, moving to Node investigation.
If it turns out to be an Infra issue, please move it back.

Comment 13 Ryan Barry 2017-02-15 19:10:03 UTC
Pavol, can you please provide a system I can test on? I can't reproduce this.

Comment 14 Pavol Brilla 2017-03-01 10:19:57 UTC
It is the same system you had access to, 14-15 Feb, when we talked about the upgrade issues:
LVM was in a state where it thought the device existed, but devicemapper did not.

As I was not able to reproduce the bug on any other system, feel free to close it.

Comment 15 Pavol Brilla 2017-03-01 10:20:47 UTC
If you encounter this issue in the future, feel free to reopen.

Comment 16 Dan Kenigsberg 2017-03-01 10:28:27 UTC
(In reply to Pavol Brilla from comment #14)
> It is same system which you had access, 14-15 Feb, when we talked about
> upgrade issues:
> LVM was in a state where LVM thought the device existed, but devicemapper
> did not.

This sounds like a platform race between LVM and devicemapper, which is utterly unrelated to this little RFE. I believe the RFE should be marked (by QE) as verified if the nodectl output is collected.

Comment 17 Nir Soffer 2017-03-06 22:34:40 UTC
Dan, can you explain what info you need?

Comment 18 Dan Kenigsberg 2017-03-07 08:44:11 UTC
Nir, do you think that Pavol has stumbled on a platform race between lvm and devicemapper? Do we already have a bug about that race?

Comment 19 Ryan Barry 2017-03-07 14:26:38 UTC
We got to comment#14 (discovered later) due to the lvmetad changes.

RHV-H relies heavily on LV tagging behind the scenes. This is either:

The change to lvmetad caused any LVM command to output tons of stderr warnings. Imgbased expected "lvs --noheading -otags,name" (for example) to output parsable information, and stderr went to stdout. We now put stderr into a different channel.

Some LV was not activated due to lvmetad. Imgbased now vgchanges before upgrades, but "nodectl info" doesn't need that anyway.

If this isn't reproducible anymore, I'd call the root cause the former, and close this
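
A minimal sketch of that stderr-separation idea (illustrative only, not the actual imgbased code; the lvs field names here are just an example):

import subprocess

def lvs_tags():
    """Run lvs with stderr separated so LVM warnings cannot pollute the parsed output."""
    proc = subprocess.Popen(
        ["lvs", "--noheadings", "-o", "lv_tags,lv_name"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # warnings (e.g. lvmetad notices) stay out of stdout
    )
    out, _warnings = proc.communicate()
    return [line.split() for line in out.decode().splitlines() if line.strip()]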

Comment 20 Nir Soffer 2017-03-07 14:44:25 UTC
(In reply to Dan Kenigsberg from comment #18)
> Nir, do you think that Pavol has stumbled on a platform race between lvm and
> devicemapper? Do we already have a bug about that race?

No

Comment 21 Ryan Barry 2017-03-16 03:37:39 UTC
Pavol/Derrick -- can one of you please provide a test environment?

Comment 25 Pavol Brilla 2017-04-11 08:36:37 UTC
# cat  sosreport-pb.1-20170411101411/sos_commands/vdsm/nodectl_in
layers: 
  rhvh-4.1-0.20170403.0: 
    rhvh-4.1-0.20170403.0+1
  rhvh-4.0-0.20170307.0: 
    rhvh-4.0-0.20170307.0+1
bootloader: 
  default: rhvh-4.1-0.20170403.0+1
  entries: 
    rhvh-4.1-0.20170403.0+1: 
      index: 0
      title: rhvh-4.1-0.20170403.0
      kernel: /boot/rhvh-4.1-0.20170403.0+1/vmlinuz-3.10.0-514.6.1.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/swap rd.lvm.lv=rhvh_slot-10/rhvh-4.1-0.20170403.0+1 console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170403.0+1"
      initrd: /boot/rhvh-4.1-0.20170403.0+1/initramfs-3.10.0-514.6.1.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.1-0.20170403.0+1
    rhvh-4.0-0.20170307.0+1: 
      index: 1
      title: rhvh-4.0-0.20170307.0
      kernel: /boot/rhvh-4.0-0.20170307.0+1/vmlinuz-3.10.0-514.10.2.el7.x86_64
      args: "ro crashkernel=auto rd.lvm.lv=rhvh_slot-10/rhvh-4.0-0.20170307.0+1 rd.lvm.lv=rhvh_slot-10/swap console=ttyS1,115200n8 LANG=en_US.UTF-8 img.bootid=rhvh-4.0-0.20170307.0+1"
      initrd: /boot/rhvh-4.0-0.20170307.0+1/initramfs-3.10.0-514.10.2.el7.x86_64.img
      root: /dev/rhvh_slot-10/rhvh-4.0-0.20170307.0+1
current_layer: rhvh-4.1-0.20170403.0+1

