Bug 860279
Summary: | 3.1 - rhevm interface is removed together with other networks | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Martin Pavlik <mpavlik>
Component: | ovirt-node | Assignee: | Mike Burns <mburns>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | 6.3 | CC: | abaron, acathrow, achan, bazulay, bsarathy, cpelland, cshao, danken, eblake, fdeutsch, gklein, gouyang, hambrose, iheim, jboggs, leiwang, lpeer, mavital, mburns, mjenner, ovirt-maint, ycui, ykaul
Target Milestone: | rc | Keywords: | ZStream
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | network | |
Fixed In Version: | ovirt-node-2.5.0-5.el6 | Doc Type: | Bug Fix
Doc Text: | When networks were removed from host network interfaces via the Setup Network Dialog on a Red Hat Enterprise Virtualization Hypervisor 3.1 host, Red Hat Enterprise Virtualization Manager interfaces were also removed, although the ifcfg file remained on the system. This caused the host to become inaccessible. The cause was a subtle timing issue with ifcfg- files being available from the previous setup. This was fixed to ensure a single point for configuration files throughout the life of the system. | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2013-02-28 16:40:04 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 863165 | |
Attachments: | | |
Description
Martin Pavlik
2012-09-25 13:06:35 UTC
Created attachment 617026
sosreport-dell-r210ii-06-20120925093244-c9e9.tar
Created attachment 617027
sosreport-mp-rhevm31-20120925105832-d754.tar
Created attachment 617028
notes.txt
On net config rollback, Vdsm stops the network service (let's not go into "why"). Right after that, libvirt.networkDefineXML fails in a bizarre way:

MainProcess|Thread-1104::DEBUG::2012-09-25 08:39:48,566::__init__::1164::Storage.Misc.excCmd::(_log) '/etc/init.d/network stop' (cwd None)
MainProcess|Thread-1104::DEBUG::2012-09-25 08:39:50,025::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
MainProcess|Thread-1104::INFO::2012-09-25 08:39:50,030::configNetwork::262::root::(restoreAtomicNetworkBackup) Rolling back logical networks configuration (restoring atomic logical networks backup)
MainProcess|Thread-1104::ERROR::2012-09-25 08:39:50,082::configNetwork::1367::setupNetworks::(setupNetworks) cannot rename file '/etc/libvirt/qemu/networks/vdsm-sw1.xml.new' as '/etc/libvirt/qemu/networks/vdsm-sw1.xml': Device or resource busy
Traceback (most recent call last):
  File "/usr/share/vdsm/configNetwork.py", line 1362, in setupNetworks
  File "/usr/share/vdsm/configNetwork.py", line 367, in restoreBackups
  File "/usr/share/vdsm/configNetwork.py", line 276, in restoreAtomicNetworkBackup
  File "/usr/share/vdsm/configNetwork.py", line 173, in _createNetwork
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2836, in networkDefineXML
libvirtError: cannot rename file '/etc/libvirt/qemu/networks/vdsm-sw1.xml.new' as '/etc/libvirt/qemu/networks/vdsm-sw1.xml': Device or resource busy

In libvirtd.log I see:

2012-09-25 08:39:48.565+0000: 8179: error : virNetworkDeleteConfig:1735 : cannot remove config file '/etc/libvirt/qemu/networks/vdsm-sw1.xml': Device or resource busy
2012-09-25 08:39:50.035+0000: 8178: error : virNetworkDeleteConfig:1735 : cannot remove config file '/etc/libvirt/qemu/networks/vdsm-sw1.xml': Device or resource busy
2012-09-25 08:39:50.069+0000: 8182: error : virFileRewrite:405 : cannot rename file '/etc/libvirt/qemu/networks/vdsm-sw1.xml.new' as '/etc/libvirt/qemu/networks/vdsm-sw1.xml': Device or resource busy
2012-09-25 08:41:32.879+0000: 8161: error : virNetSocketReadWire:999 : End of file while reading data: Input/output error
2012-09-25 08:41:33.207+0000: 8178: error : virNetworkDeleteConfig:1735 : cannot remove config file '/etc/libvirt/qemu/networks/vdsm-rhevm.xml': Device or resource busy
2012-09-25 08:41:33.370+0000: 8184: error : virNetworkDeleteConfig:1735 : cannot remove config file '/etc/libvirt/qemu/networks/vdsm-sw2.xml': Device or resource busy
2012-09-25 08:41:34.049+0000: 8161: error : virNetSocketReadWire:999 : End of file while reading data: Input/output error

where virNetworkDeleteConfig:1735 refers to this piece of code:

    if (unlink(configFile) < 0) {
        virReportSystemError(errno,
                             _("cannot remove config file '%s'"),
                             configFile);
        goto error;
    }

Now why would unlink(3) return EBUSY?

(In reply to comment #4)
> On net config rollback, Vdsm stops the network service (let's not go into
> "why"). Right after that, libvirt.networkDefineXML fails in a bizarre way:
>
> 2012-09-25 08:41:33.370+0000: 8184: error : virNetworkDeleteConfig:1735 :
> cannot remove config file '/etc/libvirt/qemu/networks/vdsm-sw2.xml': Device
> or resource busy
> 2012-09-25 08:41:34.049+0000: 8161: error : virNetSocketReadWire:999 : End
> of file while reading data: Input/output error
>
> where virNetworkDeleteConfig:1735 refers to this piece of code
>
>     if (unlink(configFile) < 0) {
>         virReportSystemError(errno,
>                              _("cannot remove config file '%s'"),
>                              configFile);
>         goto error;
>     }
>
> Now why would unlink(3) return EBUSY?

Does this file live on NFS or on a cifs share? I seem to recall that NFS can fail with EBUSY when a file still has some process holding it open, and that is certainly the semantics that Windows file systems tend to implement. That said, although POSIX permits unlink() to fail with EBUSY in this circumstance, most Unix file systems allow unlink()ing an in-use file, so it's not a common error seen during development.
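
For anyone trying to reproduce the failing unlink() outside libvirt, here is a minimal Python sketch. It is not taken from Vdsm or libvirt; `try_unlink` is a hypothetical helper name. It attempts the removal and reports the symbolic errno name, so an EBUSY can be told apart from other failures such as ENOENT.

```python
import errno
import os
import tempfile


def try_unlink(path):
    """Try to remove path; return None on success, else the errno name."""
    try:
        os.unlink(path)
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. 'EBUSY' or 'ENOENT'.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


if __name__ == "__main__":
    # Demonstration on a throwaway temp file (hypothetical example data).
    fd, path = tempfile.mkstemp()
    os.close(fd)
    print(try_unlink(path))  # first removal succeeds
    print(try_unlink(path))  # second removal fails: file no longer exists
```

On an affected host, pointing the helper at one of the files under /etc/libvirt/qemu/networks/ would be expected to report EBUSY, matching the libvirtd.log entries above.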
(In reply to comment #5)
> > Now why would unlink(3) return EBUSY?
>
> Does this file live on NFS or on a cifs share?

No. But I'm told that the /etc/libvirt/qemu/networks/ directory is bind-mounted. Could this be related?

(In reply to comment #5)
> Does this file live on NFS or on a cifs share? I seem to recall that NFS
> can fail with EBUSY when a file still has some process holding it open, and
> that is certainly the semantics that Windows file systems tend to implement.
> That said, although POSIX permits unlink() to fail with EBUSY in this
> circumstance, most Unix file systems allow unlink()ing an in-use file, so
> it's not a common error seen during development.

It's actually bind-mounted. The reason why this works sometimes and not others is related to whether or not you rebooted since you created the network files. I ran a quick test:

mkdir /etc/mburns
echo test > /etc/mburns/testfile
persist /etc/mburns
unlink /etc/mburns/testfile    <-- works correctly
echo test > /etc/mburns/testfile
reboot
unlink /etc/mburns/testfile    <-- fails

It turns out that they're mounted differently before and after the reboot. Before the reboot, the /etc/mburns directory is bind-mounted; after the reboot, the file /etc/mburns/testfile itself is bind-mounted, and unlinking a path that is itself a mount point fails with EBUSY.

This is a problem in ovirt-node, so I'm moving this bz to ovirt-node in 6.4 and flagging for 6.3.z.

Upstream patch posted: http://gerrit.ovirt.org/#/c/8317/

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0556.html