Bug 1252268
Summary: Network definitions are missing after restoration of networks that were changed since the last network persistence.

Product: Red Hat Enterprise Virtualization Manager
Reporter: Chaofeng Wu <cwu>
Component: ovirt-node
Assignee: Fabian Deutsch <fdeutsch>
Status: CLOSED ERRATA
QA Contact: Chaofeng Wu <cwu>
Severity: urgent
Priority: urgent
Version: 3.5.4
CC: bazulay, bmcclain, cshao, cwu, danken, dougsland, fdeutsch, gklein, huiwa, huzhao, lpeer, lsurette, mburman, mgoldboi, yaniwang, ycui, yeylon, ykaul
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Keywords: Reopened, ZStream
Hardware: Unspecified
OS: Unspecified
Fixed In Version: ovirt-node-3.3.0-0.4.20150906git14a6024.el7ev
Doc Type: Bug Fix
Clones: 1254305 (view as bug list)
Bug Blocks: 1254305
Last Closed: 2016-03-09 14:35:23 UTC
Type: Bug
oVirt Team: Node
Description (Chaofeng Wu, 2015-08-11 06:05:58 UTC)
Created attachment 1061330 [details]
libvirt error during system reboot
Created attachment 1061331 [details]
sosreport after installing rhevh6.6
Created attachment 1061332 [details]
sosreport after upgrading to rhevh6.7, before reboot
Created attachment 1061333 [details]
sosreport: vdsmd and libvirtd status after rebooting rhevh6.7 twice
This does not reproduce, at least not in the scenario proposed in the Description. Also, if this is a bug it is probably a duplicate of https://bugzilla.redhat.com/1251040, because network restoration is only remotely related to whether there are VLANs or not.

*** This bug has been marked as a duplicate of bug 1251040 ***

Checked with the latest ISO, rhev-hypervisor6-6.7-20150811.0.iso, provided in bug 1251040 comment 31. Some network configurations are still missing, so I am reopening this bug.

After upgrading from rhevh-6.6-20150512 to rhevh-6.7-20150811 and making some network changes, reboot the system: all the networks are fine. Reboot the system again, and you will find that the libvirtd service fails to start during boot. After the system boots up, check the vdsmd and libvirtd service status: both are stopped. The network configuration files in /var/lib/vdsm/persistence/netconf/ are all fine. It seems that the networks are not restored because the vdsmd service does not run after the reboot.

Created attachment 1061904 [details]
sosreport
Created attachment 1061906 [details]
manual restart service failed
Can't reproduce:
1. Installed 6.6.
2. Created a VLAN via the TUI and registered the host in the engine.
3. Attached some networks (over a bond and over a NIC) and rebooted.
4. Upgraded.
5. Made some changes in SN > reboot.
6. Another reboot.

One difference between the two networks might be that the DHCP responses are slower in the network where the failure appears. Two ways to identify whether it is the slow response:
1. Try a small network with a DHCP server (the response should be fast).
2. Try using a static IP.

Still reproduce this bug with a static IP.

Created attachment 1062489 [details]
sosreport with static IP address
We just tried the following steps in a clean environment with rhev-hypervisor6-6.7-20150811.0.iso and hit the same issue.

Steps:
1. PXE install rhev-hypervisor6-6.7-20150811.0.iso, configure eth1 with a static IP and VLAN tag 20, then register to RHEV-M 3.5.4.
2. On the RHEV-M web portal the host status is Up; create bond0 from eth0 and eth2, create network testnet1 and drag it to bond0, create network testnet2 and drag it to eth3, save.
3. After all the networks are configured successfully, reboot the system.
4. After the system is up, break bond0, then create bond0 from eth0 and eth3, drag testnet2 to bond0, drag testnet1 to eth2, save.
5. All the networks are up; reboot the system.
6. Reboot the system more than twice, then check the vdsmd and libvirtd service status.

Created attachment 1062957 [details]
sosreport in a clean environment
Hi Chaofeng,

Can you please try to reproduce using this new ifcfg.py file? I am trying to save us the painful turnaround of rebuilding vdsm and rhevh before we have a tested patch.

Steps to deploy on a rhevh server:

1. Remount your root to be writable:
   # mount -o rw,remount /
2. Copy the attached ifcfg.py to /usr/share/vdsm/network/configurators/
3. Compile the .py to .pyc by importing the code from the Python interactive interpreter:
   # python
   >>> import sys
   >>> sys.path.append('/usr/share/vdsm/')
   >>> from network.configurators import ifcfg
   >>> ^D
4. Back in the shell, observe that the .pyc file is indeed newer:
   # ls -l /usr/share/vdsm/network/configurators/
5. Persist the updated .pyc file:
   # persist ifcfg.pyc

Created attachment 1063522 [details]
patched ifcfg.py code
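As an aside, the import in step 3 is just a way of forcing Python to byte-compile the module; the stdlib `py_compile` module does the same thing directly. A minimal sketch against a throwaway file (the paths here are hypothetical stand-ins, not the real /usr/share/vdsm tree):

```python
import os
import py_compile
import tempfile

# Write a throwaway module, then byte-compile it the way the
# "import ifcfg" trick in step 3 would.
src = os.path.join(tempfile.mkdtemp(), 'ifcfg.py')
with open(src, 'w') as f:
    f.write('CONFIG_OK = True\n')

pyc = src + 'c'  # ifcfg.pyc next to ifcfg.py, as "persist ifcfg.pyc" expects
py_compile.compile(src, cfile=pyc)
print(os.path.exists(pyc))
```

On the host itself, step 5 (`persist ifcfg.pyc`) is still required so the compiled file survives a reboot of the stateless image.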
(In reply to Ido Barkan from comment #15)

Hi Ido,

Tried the following steps:
1. Install rhev-hypervisor6-6.7-20150813.0.iso, configure the network, follow your steps to persist the new ifcfg.pyc on rhevh, and reboot the system.
2. Register to RHEV-M, then create bond0 from eth0 and eth2, create network testnet1 and drag it to bond0, create network testnet2 and drag it to eth3, save.
3. After all the networks are configured successfully, reboot the system.

After step 3 we find that some networks are not up. I also find errors in supervdsm.log:

restore-net::WARNING::2015-08-17 02:23:52,082::__init__::491::root.ovirt.node.utils.fs::(_persist_file) File "/etc/sysconfig/network-scripts/ifcfg-eth1 " had already been persisted
restore-net::ERROR::2015-08-17 02:23:52,082::__init__::52::root::(__exit__) Failed rollback transaction last known good network. ERR=%s
Traceback (most recent call last):
  File "/usr/share/vdsm/network/api.py", line 680, in setupNetworks
  File "/usr/share/vdsm/network/api.py", line 213, in wrapped
  File "/usr/share/vdsm/network/api.py", line 302, in addNetwork
  File "/usr/share/vdsm/network/models.py", line 160, in configure
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 87, in configureBridge
  File "/usr/share/vdsm/network/models.py", line 124, in configure
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 93, in configureVlan
  File "/usr/share/vdsm/network/models.py", line 97, in configure
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 154, in configureNic
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 623, in addNic
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 558, in _createConfFile
  File "/usr/share/vdsm/network/configurators/ifcfg.py", line 501, in writeConfFile
  File "/usr/lib/python2.6/site-packages/ovirt/node/utils/fs/__init__.py", line 435, in persist
  File "/usr/lib/python2.6/site-packages/ovirt/node/utils/fs/__init__.py", line 95, in restorecon
  File "/usr/lib/python2.6/site-packages/ovirt/node/utils/security.py", line 105, in restorecon
  File "/usr/lib64/python2.6/site-packages/selinux/__init__.py", line 76, in restorecon
TypeError: in method 'matchpathcon', argument 1 of type 'char const *'
MainProcess::DEBUG::2015-08-17 02:23:55,161::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper) call readMultipathConf with () {}

You can find details in the attachment.

Created attachment 1063659 [details]
sosreport ifcfg.py
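The TypeError in the traceback above comes from handing a unicode path to the selinux C bindings, whose 'char const *' arguments only accept byte strings on Python 2. A minimal sketch of the encode-before-call idea (the helper name is hypothetical; the actual patch lives in vdsm/ovirt-node, not shown here):

```python
# -*- coding: utf-8 -*-
# The selinux bindings (matchpathcon, restorecon, getfilecon, chcon)
# reject unicode paths on Python 2, so encode to UTF-8 bytes first.

def to_selinux_path(path):
    """Return the byte-string form of a path for the selinux C bindings."""
    if isinstance(path, bytes):
        return path
    return path.encode('utf-8')

# Call sites would then use, e.g.:
#   selinux.getfilecon(to_selinux_path(abspath))
# instead of passing abspath directly.
print(to_selinux_path(u'/etc/sysconfig/network-scripts/ifcfg-eth1'))
```

Encoding at the boundary keeps the rest of the code free to work with unicode paths while satisfying the C-level signature.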
The calls to selinux.getfilecon() and selinux.chcon(), too, should be passed the utf8-encoded abspath.

restore-net::ERROR::2015-08-25 05:44:35,835::__init__::432::root.ovirt.node.utils.fs::(persist) Failed to persist "/etc/sysconfig/network-scripts/ifcfg-eth1"
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt/node/utils/fs/__init__.py", line 429, in persist
  File "/usr/lib/python2.6/site-packages/ovirt/node/utils/fs/__init__.py", line 505, in _persist_file
  File "/usr/lib/python2.6/site-packages/ovirt/node/utils/fs/__init__.py", line 451, in copy_attributes
  File "/usr/lib/python2.6/site-packages/ovirt/node/utils/security.py", line 112, in getcon
TypeError: in method 'getfilecon', argument 1 of type 'char const *'

*** Bug 1256742 has been marked as a duplicate of this bug. ***

All of the patches have been merged in the master branch; moving this bug to MODIFIED.

Verified on the rhev-hypervisor7-7.2-20151104 build.

Version-Release number of selected component (if applicable):
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
rhev-hypervisor7-7.2-20151104.0.iso

Steps:
1. Install rhev-hypervisor7-7.1-20151015.0.iso, configure eth0, and register to RHEV-M 3.5.6.
2. Create bond0 with eth1 and eth2, create networks testnet0 and testnet1, drag testnet0 to bond0 and testnet1 to eth2, save the changes.
3. Upgrade to rhev-hypervisor7-7.2-20151104 via RHEV-M.
4. Break bond0, recreate bond0 with eth1 and eth3, drag testnet0 to bond0 and testnet1 to eth2, save the changes.
5. Reboot the system more than two times, check the vdsmd and libvirtd service status, and check ovirt.log and ovirt-node.log.

Result: After step 5 the vdsmd and libvirtd services were running, and there were no errors in either ovirt.log or ovirt-node.log. This bug is fixed, so the status is changed to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0378.html