Bug 1448837 - Engine and vdsm complaining when trying to perform any SetupNetworks command on latest master vdsm
Summary: Engine and vdsm complaining when trying to perform any SetupNetworks command ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 4.2.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: 4.2.0
Assignee: Edward Haas
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On: 1457889
Blocks:
Reported: 2017-05-08 11:08 UTC by Michael Burman
Modified: 2017-12-20 10:45 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-20 10:45:07 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: blocker+


Attachments
Logs (1.12 MB, application/x-gzip)
2017-05-08 11:08 UTC, Michael Burman


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 76603 | 0 | master | MERGED | net: Fix store net config - delete dir correctly. | 2017-05-09 10:28:27 UTC
oVirt gerrit | 76630 | 0 | master | MERGED | net: Fix store net config - delete dir only if existed. | 2017-05-09 14:50:46 UTC

Description Michael Burman 2017-05-08 11:08:44 UTC
Created attachment 1277087
Logs

Description of problem:
Engine and vdsm both report errors when any SetupNetworks command is performed on the latest master vdsm.

This looks like a regression on the vdsm side that affects every setup networks command on the host.
It appears to be related to the latest changes in /var/lib/vdsm/persistence/netconf:

2017-05-08 13:54:32,679+0300 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call Host.getCapabilities succeeded in 0.18 seconds (__init__:570)
2017-05-08 13:54:33,087+0300 ERROR (jsonrpc/1) [jsonrpc.JsonRpcServer] Internal server error (__init__:607)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 602, in _handle_request
    res = method(**params)
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod
    result = fn(*methodArgs)
  File "/usr/share/vdsm/API.py", line 1434, in setSafeNetworkConfig
    supervdsm.getProxy().setSafeNetworkConfig()
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in <lambda>
    **kwargs)
  File "<string>", line 2, in setSafeNetworkConfig
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
OSError: [Errno 21] Is a directory: '/var/lib/vdsm/persistence/netconf.1493183658658928008'
2017-05-08 13:54:33,088+0300 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.setSafeNetworkConfig failed (error -32603) in 0.01 seconds (__init__:570)


MainProcess|jsonrpc/2::DEBUG::2017-05-08 13:55:26,754::supervdsm_server::92::SuperVdsm.ServerCallback::(wrapper) call setSafeNetworkConfig with () {}
MainProcess|jsonrpc/2::ERROR::2017-05-08 13:55:26,757::supervdsm_server::96::SuperVdsm.ServerCallback::(wrapper) Error in setSafeNetworkConfig
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 94, in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 264, in setSafeNetworkConfig
    netconfpersistence.RunningConfig.store()
  File "/usr/lib/python2.7/site-packages/vdsm/network/netconfpersistence.py", line 204, in store
    _store_net_config()
  File "/usr/lib/python2.7/site-packages/vdsm/network/netconfpersistence.py", line 281, in _store_net_config
    os.remove(real_old_safeconf_dir)
OSError: [Errno 21] Is a directory: '/var/lib/vdsm/persistence/netconf.1493122321050905813'
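
The supervdsm traceback shows os.remove() being called on a directory, which is what produces Errno 21. The following is a minimal sketch of that failure and of a more tolerant cleanup, in the spirit of the two linked gerrit patch titles ("delete dir correctly", "delete dir only if existed"). It is not the vdsm code: remove_old_config, demo, and the temporary paths are hypothetical names used only for illustration.

```python
import os
import shutil
import tempfile


def remove_old_config(path):
    """Remove a stale config path, whatever its type.

    Hypothetical helper for illustration; not the vdsm implementation.
    """
    if os.path.islink(path) or os.path.isfile(path):
        # Files and symlinks are unlinked.
        os.remove(path)
    elif os.path.isdir(path):
        # Real directories need a recursive delete.
        shutil.rmtree(path)
    # A missing path is silently ignored: nothing to delete.


def demo():
    base = tempfile.mkdtemp()
    old_dir = os.path.join(base, "netconf.old")
    os.makedirs(os.path.join(old_dir, "nets"))

    # os.remove() only unlinks files; on a directory it raises OSError
    # (errno 21, EISDIR, on Linux, as in the traceback above).
    try:
        os.remove(old_dir)
        failed = False
    except OSError:
        failed = True

    remove_old_config(old_dir)  # deletes the directory tree
    remove_old_config(old_dir)  # second call is a no-op: path is gone
    still_there = os.path.exists(old_dir)

    shutil.rmtree(base)
    return failed, still_there
```

Calling demo() exercises both paths: the bare os.remove() attempt fails, while the helper removes the directory and tolerates being called again once it no longer exists.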


Version-Release number of selected component (if applicable):
vdsm-4.20.0-753.gitbdeadde.el7.centos.x86_64
4.2.0-0.0.master.20170505124438.git61f971b.el7.centos


How reproducible:
100%

Steps to Reproduce:
1. Update to the latest vdsm (vdsm-4.20.0-753.gitbdeadde.el7.centos.x86_64)
2. Make any change on the host via the setup networks dialog


Actual results:
The operation fails with the error:
Error while executing action Commit Network changes: Unexpected exception
The engine reports that the operation failed, although it actually seems to succeed.

Expected results:
SetupNetworks should complete without any errors on either the engine or the vdsm side.

Comment 1 Michael Burman 2017-05-08 11:15:15 UTC
On hosts that were upgraded to the latest vdsm version (vdsm-4.20.0-753.gitbdeadde.el7.centos.x86_64), /var/lib/vdsm/persistence/netconf/ looks like this:

[root@navy-vds1 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf/
/var/lib/vdsm/persistence/netconf/
├── bonds
│   └── bond1
└── nets
    ├── m1
    ├── m3
    ├── m5
    └── ovirtmgmt

2 directories, 5 files


[root@camel-vdsa ~]# tree /var/lib/vdsm/persistence/netconf/
/var/lib/vdsm/persistence/netconf/
├── bonds
└── nets
    ├── ip6_migration_n
    ├── m1
    ├── m2
    ├── m3
    └── ovirtmgmt

2 directories, 5 files

- But a new host that was installed with the latest vdsm version now has something different:

[root@orchid-vds2 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf.
netconf.9yxYlB22/ netconf.GLoG83Nv/ netconf.GNmyh950/ netconf.OUUtRV5K/ 
[root@orchid-vds2 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf.9yxYlB22/
/var/lib/vdsm/persistence/netconf.9yxYlB22/
├── bonds
└── nets
    ├── m1
    └── ovirtmgmt

2 directories, 2 files
[root@orchid-vds2 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf.
netconf.9yxYlB22/ netconf.GLoG83Nv/ netconf.GNmyh950/ netconf.OUUtRV5K/ 
[root@orchid-vds2 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf.G
netconf.GLoG83Nv/ netconf.GNmyh950/ 
[root@orchid-vds2 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf.G
netconf.GLoG83Nv/ netconf.GNmyh950/ 
[root@orchid-vds2 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf.GLoG83Nv/
/var/lib/vdsm/persistence/netconf.GLoG83Nv/
├── bonds
└── nets
    └── ovirtmgmt

2 directories, 1 file
[root@orchid-vds2 yum.repos.d]# tree /var/lib/vdsm/persistence/netconf.GNmyh950/
/var/lib/vdsm/persistence/netconf.GNmyh950/
├── bonds
└── nets

2 directories, 0 files

This seems to be the change that is causing all the failures and errors.
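
To compare hosts like the ones above, a small diagnostic can report what kind of object netconf is and which netconf.<suffix> directories sit next to it. This is only a sketch: describe_netconf and demo are hypothetical names, and whether netconf is meant to be a symlink to one of the suffixed directories is an assumption, since the report does not say so.

```python
import glob
import os
import shutil
import tempfile


def describe_netconf(persistence_dir, name="netconf"):
    """Return (kind, resolved_target, sibling_names) for a config path.

    Hypothetical diagnostic helper; not part of vdsm.
    """
    path = os.path.join(persistence_dir, name)
    if os.path.islink(path):
        kind = "symlink"
    elif os.path.isdir(path):
        kind = "directory"
    else:
        kind = "missing"
    # Resolve symlinks so the real backing directory is visible.
    target = os.path.realpath(path) if kind != "missing" else None
    # Collect sibling entries such as netconf.9yxYlB22.
    siblings = sorted(os.path.basename(p) for p in glob.glob(path + ".*"))
    return kind, target, siblings


def demo():
    # Build a throwaway layout: two suffixed directories and a symlink
    # (the symlink is the assumed layout, see the note above).
    base = os.path.realpath(tempfile.mkdtemp())
    try:
        for suffix in ("AAAA", "BBBB"):
            os.makedirs(os.path.join(base, "netconf." + suffix, "nets"))
        os.symlink(os.path.join(base, "netconf.AAAA"),
                   os.path.join(base, "netconf"))
        return describe_netconf(base), base
    finally:
        shutil.rmtree(base)
```

On a real host the call would be describe_netconf("/var/lib/vdsm/persistence"); more than one sibling in the result would correspond to the several netconf.<suffix> directories shown in the listing above.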

Comment 2 Michael Burman 2017-05-08 12:17:53 UTC
So it's an upgrade bug on the vdsm side.

Comment 3 Red Hat Bugzilla Rules Engine 2017-05-08 17:23:24 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 4 Edward Haas 2017-05-08 21:35:13 UTC
Could you please help verify if https://gerrit.ovirt.org/#/c/76603 fixes this?

Comment 5 Michael Burman 2017-05-09 08:31:07 UTC
(In reply to Edward Haas from comment #4)
> Could you please help verify if https://gerrit.ovirt.org/#/c/76603 fixes
> this?

Verified.

Comment 6 Michael Burman 2017-06-01 13:11:08 UTC
Verification of this bug currently depends on BZ 1457889.

Comment 7 Michael Burman 2017-07-05 08:59:40 UTC
Tested and verified on vdsm-4.20.1-120.git28558d7.el7.centos.x86_64
Upgrade flow: vdsm-4.19.20-1.el7ev > vdsm-4.20.1-120.git28558d7.el7.centos.x86_64

Comment 8 Sandro Bonazzola 2017-12-20 10:45:07 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

