Bug 1709628 - lshw can take more than 15 seconds to execute depending on the system
Summary: lshw can take more than 15 seconds to execute depending on the system
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.2.8
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: ovirt-4.4.0
: ---
Assignee: Edward Haas
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: 1719125
TreeView+ depends on / blocked
 
Reported: 2019-05-14 02:35 UTC by Germano Veit Michel
Modified: 2020-08-04 13:27 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1719125 (view as bug list)
Environment:
Last Closed: 2020-08-04 13:26:28 UTC
oVirt Team: Network
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 4134861 0 Troubleshoot None VDSM periodically logs Worker blocked warning messages in vdsm.virt.sampling.HostMonitor 2019-05-14 04:34:45 UTC
Red Hat Product Errata RHEA-2020:3246 0 None None None 2020-08-04 13:27:03 UTC
oVirt gerrit 100193 0 'None' MERGED disable dpdk by default 2021-01-03 19:34:55 UTC
oVirt gerrit 100614 0 'None' MERGED disable dpdk by default 2021-01-03 19:34:53 UTC

Description Germano Veit Michel 2019-05-14 02:35:55 UTC
Description of problem:

This is seen very frequently in vdsm logs during host monitoring:

vdsm[121675]: WARN Worker blocked: <Worker name=periodic/82 running <Task <Operation action=<vdsm.virt.sampling.HostMonitor object at 0x7f94e443fe50> at 0x7f94e443fe90> timeout=15,
                                         File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
                                           self.__bootstrap_inner()
                                         File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
                                           self.run()
                                         File: "/usr/lib64/python2.7/threading.py", line 765, in run
                                           self.__target(*self.__args, **self.__kwargs)
                                         File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
                                           ret = func(*args, **kwargs)
                                         File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
                                           self._execute_task()
                                         File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
                                           task()
                                         File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
                                           self._callable()
                                         File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 220, in __call__
                                           self._func()
                                         File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 576, in __call__
                                           sample = HostSample(self._pid)
                                         File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 240, in __init__
                                           self.interfaces = _get_interfaces_and_samples()
                                         File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 200, in _get_interfaces_and_samples
                                           for link in ipwrapper.getLinks():
                                         File: "/usr/lib/python2.7/site-packages/vdsm/network/ipwrapper.py", line 267, in getLinks
                                           in six.viewitems(dpdk.get_dpdk_devices()))
                                         File: "/usr/lib/python2.7/site-packages/vdsm/network/link/dpdk.py", line 47, in get_dpdk_devices
                                           dpdk_devices = _get_dpdk_devices()
                                         File: "/usr/lib/python2.7/site-packages/vdsm/common/cache.py", line 41, in __call__
                                           value = self.func(*args)
                                         File: "/usr/lib/python2.7/site-packages/vdsm/network/link/dpdk.py", line 111, in _get_dpdk_devices
                                           devices = _lshw_command()
                                         File: "/usr/lib/python2.7/site-packages/vdsm/network/link/dpdk.py", line 123, in _lshw_command
                                           rc, out, err = cmd.exec_sync(['lshw', '-json'] + filterout_cmd)
                                         File: "/usr/lib/python2.7/site-packages/vdsm/network/cmd.py", line 38, in exec_sync
                                           retcode, out, err = exec_sync_bytes(cmds)
                                         File: "/usr/lib/python2.7/site-packages/vdsm/common/cmdutils.py", line 156, in exec_cmd
                                           out, err = p.communicate()
                                         File: "/usr/lib64/python2.7/site-packages/subprocess32.py", line 924, in communicate
                                           stdout, stderr = self._communicate(input, endtime, timeout)
                                         File: "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1706, in _communicate
                                           orig_timeout)
                                         File: "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1779, in _communicate_with_poll
                                           ready = poller.poll(self._remaining_time(endtime))

It is because the lshw command used to get the network devices can taker a longer time apparently depending on the number of devices present on the system.
Even with the improvements from "dpdk: update lshw command", which filters out several classes of devices, it can still take more than 15s on some systems.

$ time lshw -disable usb -disable pcmcia -disable isapnp -disable ide -disable scsi -disable dmi -disable memory -disable cpuinfo
real    0m23.723s    
user    0m23.521s    
sys     0m0.176s  

Version-Release number of selected component (if applicable):
vdsm-4.20.43-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
# ip link add name dummy type dummy
# for i in {1..300} ; do ip link add link dummy type macvtap; done
# time lshw -disable usb -disable pcmcia -disable isapnp -disable ide -disable scsi -disable dmi -disable memory -disable cpuinfo

Actual results:
lshw time goes up with the number of devices (such as VM vnets), it can take more than 15s

Expected results:
Perhaps something can be done more efficiently by  _get_dpdk_devices? Or allow for a greater time...

Comment 4 RHV bug bot 2019-10-22 17:26:06 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 5 RHV bug bot 2019-10-22 17:39:27 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 6 RHV bug bot 2019-10-22 17:46:39 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 7 RHV bug bot 2019-10-22 18:02:28 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 8 Michael Burman 2019-10-24 10:07:05 UTC
Verified on - vdsm-4.40.0-128.git7bfc751.el8ev.x86_64

Comment 9 RHV bug bot 2019-11-19 11:53:34 UTC
WARN: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 10 RHV bug bot 2019-11-19 12:03:36 UTC
WARN: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 11 RHV bug bot 2019-12-13 13:16:56 UTC
WARN: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 12 RHV bug bot 2019-12-20 17:46:18 UTC
WARN: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 13 RHV bug bot 2020-01-08 14:50:40 UTC
WARN: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 14 RHV bug bot 2020-01-08 15:19:54 UTC
WARN: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 15 RHV bug bot 2020-01-24 19:52:22 UTC
WARN: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops: Bug status (VERIFIED) wasn't changed but the folowing should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 17 errata-xmlrpc 2020-08-04 13:26:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) 4.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3246


Note You need to log in before you can comment on or make changes to this bug.