Bug 1340234 - Save network configuration takes more than 3 minute makes hypervisor non responsive in RHEV-M
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.6.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.0.0-rc
Target Release: ---
Assigned To: Edward Haas
QA Contact: Michael Burman
Keywords: Performance, ZStream
Depends On:
Blocks: 1349029
Reported: 2016-05-26 15:22 EDT by nijin ashok
Modified: 2018-08-06 09:15 EDT

See Also:
Fixed In Version: v4.17.31
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1349029
Environment:
Last Closed: 2016-08-23 16:16:16 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 58221 | ovirt-3.5 | ABANDONED | Revert "net: always persist owned ifcfg files on ovirt node" | 2016-07-26 03:53 EDT
oVirt gerrit | 58298 | master | MERGED | Revert "net: always persist owned ifcfg files on ovirt node" | 2016-05-31 10:36 EDT
oVirt gerrit | 58299 | ovirt-3.6 | MERGED | Revert "net: always persist owned ifcfg files on ovirt node" | 2016-06-02 09:08 EDT
Red Hat Product Errata | RHEA-2016:1671 | normal | SHIPPED_LIVE | VDSM 4.0 GA bug fix and enhancement update | 2016-09-02 17:32:03 EDT

Description nijin ashok 2016-05-26 15:22:28 EDT
Description of problem:

On a RHEV-H hypervisor, "save network configuration" takes more than 3 minutes when a large number of logical networks is assigned to the host, and the hypervisor becomes non-responsive as a result. This is observed only on RHEV-H, not on RHEL-H.

In the customer environment, Host.setSafeNetworkConfig takes about 4 minutes. The customer has 24 bridge networks. Any minor change, such as removing a single VLAN from the hypervisor, takes more than 3 minutes, which causes the hypervisor to become non-responsive and results in the VMs being migrated to another hypervisor.


Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160413.0.el7ev)
vdsm-4.17.26-0.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:

1. Add more than 25 VLANs to a hypervisor.

2. Make any minor change, such as removing a logical network.

3. The hypervisor becomes non-responsive during the "save network configuration" process.

Actual results:

"save network configuration" makes the hypervisor non-responsive.

Expected results:

"save network configuration" should complete without making the hypervisor non-responsive.

Additional info:
Comment 2 nijin ashok 2016-05-26 15:23:17 EDT
It seems like the delay is here.

node_persist_owned_ifcfgs() {
    for f in $(find "$NET_CONF_DIR" -type f); do
        if grep -q "# Generated by VDSM version" "$f"; then
            ovirt_store_config "$f"
        fi
    done
}

ovirt_store_config() {
    for p in "$@"; do
        python <<EOP
from ovirtnode.ovirtfunctions import ovirt_store_config_retnum
ovirt_store_config_retnum("$p")

ovirtfunctions.py is loaded separately for each ifcfg file, and it seems to take more than 4 seconds to load on each iteration.
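
To illustrate why the per-file interpreter start-up matters, here is a minimal Python 3 sketch that finds the VDSM-owned files and persists them all from one process, so any import cost is paid once rather than once per file. This is not the change that was merged for this bug (the External Trackers entries show the fix was a revert); it is only an illustration. The `store_config` callback is a hypothetical stand-in for `ovirt_store_config_retnum`, and the demo directory and file contents are made up.

```python
import os
import tempfile

# Marker string taken from the grep in node_persist_owned_ifcfgs().
VDSM_MARKER = "# Generated by VDSM version"


def find_owned_ifcfgs(net_conf_dir, marker=VDSM_MARKER):
    """Return the sorted paths of files under net_conf_dir that contain marker."""
    owned = []
    for root, _dirs, names in os.walk(net_conf_dir):
        for name in names:
            path = os.path.join(root, name)
            with open(path, errors="replace") as handle:
                if marker in handle.read():
                    owned.append(path)
    return sorted(owned)


def persist_owned_ifcfgs(net_conf_dir, store_config):
    """Call store_config once per owned file, all within this one process."""
    paths = find_owned_ifcfgs(net_conf_dir)
    for path in paths:
        store_config(path)  # hypothetical stand-in for the real persistence call
    return paths


# Demonstration with a throwaway directory and made-up file contents.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "ifcfg-vlan10"), "w") as handle:
    handle.write("# Generated by VDSM version 4.17.26\nDEVICE=vlan10\n")
with open(os.path.join(demo_dir, "ifcfg-lo"), "w") as handle:
    handle.write("DEVICE=lo\n")

stored = []
persisted = persist_owned_ifcfgs(demo_dir, stored.append)
print([os.path.basename(p) for p in persisted])
```

Only the file carrying the marker is handed to the callback; the unmarked file is skipped, mirroring the grep filter in the shell function.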

===
time python /usr/lib/python2.7/site-packages/ovirtnode/ovirtfunctions.py

real	0m4.041s
user	0m3.587s
sys	0m0.178s
===

For the customer, there are 52 ifcfg files to persist.

grep -ir "Generated by VDSM version" etc/sysconfig/network-scripts/|wc -l
52
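
As a rough cross-check, the two figures above can be multiplied to estimate the total delay. The sketch below assumes every file costs the full 4.041 s measured for a standalone run of ovirtfunctions.py, which is only an approximation of the real per-file overhead.

```python
# Figures taken from the measurements in this comment.
PER_FILE_SECONDS = 4.041   # "real" time of one ovirtfunctions.py run
IFCFG_FILE_COUNT = 52      # files matching "Generated by VDSM version"


def estimated_total_seconds(file_count, per_file_seconds):
    """Estimate the total time if each file pays the full per-file cost."""
    return file_count * per_file_seconds


total = estimated_total_seconds(IFCFG_FILE_COUNT, PER_FILE_SECONDS)
print("estimated total: %.1f s (%.1f min)" % (total, total / 60))
```

Under that assumption the estimate comes to roughly three and a half minutes, which is in line with the reported 3 to 4 minutes for Host.setSafeNetworkConfig.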
Comment 3 Dan Kenigsberg 2016-05-29 09:12:24 EDT
Could you attach supervdsm.log to BZ?

This bug might have been introduced by https://gerrit.ovirt.org/#/c/44929/3/vdsm/network/configurators/ifcfg.py which was required to solve bug 1252268.
Comment 5 Dan Kenigsberg 2016-05-29 09:27:21 EDT
Actually, it is more likely that it's due to https://gerrit.ovirt.org/#/q/Ibc717b86194a32c050d346e235a5c35fd318e1ff - one of the many patches done to solve bug 1203422.
Comment 7 Dan Kenigsberg 2016-05-31 10:39:11 EDT
> Dan, am i right about the patch number?

yes you are.
Comment 10 nijin ashok 2016-06-01 19:33:48 EDT
I am checking with the customer whether they can test this.

Meanwhile I tried in my test environment with 40 logical networks.

Before the patch, it was taking more than 3 minutes.

jsonrpc.Executor/0::DEBUG::2016-06-01 16:13:00,173::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.setSafeNetworkConfig' in bridge with {}
jsonrpc.Executor/0::DEBUG::2016-06-01 16:16:25,880::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.setSafeNetworkConfig' in bridge with True


After applying the patch, the process finished within a few milliseconds!

jsonrpc.Executor/1::DEBUG::2016-06-01 16:23:00,187::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.setSafeNetworkConfig' in bridge with {}
jsonrpc.Executor/1::DEBUG::2016-06-01 16:23:00,228::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.setSafeNetworkConfig' in bridge with True
Comment 11 nijin ashok 2016-06-01 19:46:15 EDT
Sorry, there was an error when I copied the file. The correct result after applying the patch in my test environment is:

jsonrpc.Executor/5::DEBUG::2016-06-01 16:36:00,562::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.setSafeNetworkConfig' in bridge with {}
jsonrpc.Executor/5::DEBUG::2016-06-01 16:36:03,222::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.setSafeNetworkConfig' in bridge with True

 
supervdsm log

MainProcess|jsonrpc.Executor/5::DEBUG::2016-06-01 16:36:00,565::utils::671::root::(execCmd) /usr/bin/taskset --cpu-list 0-1 /usr/share/vdsm/vdsm-store-net-config unified (cwd None)
MainProcess|jsonrpc.Executor/5::DEBUG::2016-06-01 16:36:03,221::utils::689::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainProcess|jsonrpc.Executor/5::DEBUG::2016-06-01 16:36:03,222::supervdsmServer::123::SuperVdsm.ServerCallback::(wrapper) return setSafeNetworkConfig with None

So it now takes only about 3 seconds to complete.
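
For reference, the elapsed times can be computed directly from the log timestamps quoted in comment 10 (before the patch) and in this comment (after the patch). This is a small sketch that assumes the `YYYY-MM-DD HH:MM:SS,mmm` layout seen in the log lines above.

```python
from datetime import datetime

# Timestamp layout used in the vdsm log lines quoted above (comma before milliseconds).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    finished = datetime.strptime(end, LOG_TIME_FORMAT)
    return (finished - started).total_seconds()


# 'Calling' and 'Return' timestamps for Host.setSafeNetworkConfig.
before_patch = elapsed_seconds("2016-06-01 16:13:00,173", "2016-06-01 16:16:25,880")
after_patch = elapsed_seconds("2016-06-01 16:36:00,562", "2016-06-01 16:36:03,222")

print("before patch: %.3f s" % before_patch)
print("after patch:  %.3f s" % after_patch)
```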
Comment 17 errata-xmlrpc 2016-08-23 16:16:16 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1671.html
