Bug 1051297
| Field | Value |
|---|---|
| Summary | setupNetworks: nic with dhcp cannot be bonded |
| Product | Red Hat Enterprise Virtualization Manager |
| Component | ovirt-engine |
| Version | 3.2.0 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Reporter | Bryan Yount <byount> |
| Assignee | Lior Vernia <lvernia> |
| QA Contact | Meni Yakove <myakove> |
| CC | acathrow, bazulay, byount, danken, ecohen, iheim, lpeer, lvernia, masayag, nyechiel, ptavares, Rhev-m-bugs, tpoitras, yeylon |
| Keywords | Triaged, ZStream |
| Target Release | 3.4.0 |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | network |
| Fixed In Version | av3 |
| Doc Type | Bug Fix |
| Doc Text | Previously, bonds could not be configured via the Administration Portal on physical host network interface cards configured with DHCP, because validation required that no boot protocol, IP address, subnet mask, or gateway be defined on slave network interfaces. Now it is possible to configure bonds on physical host network interface cards configured with DHCP. |
| Clones | 1082296 (view as bug list) |
| Last Closed | 2014-06-09 15:08:26 UTC |
| Type | Bug |
| oVirt Team | Network |
| Bug Blocks | 1078909, 1082296, 1142926 |
Description
Bryan Yount
2014-01-10 02:07:12 UTC
Patrick, if this issue is easily reproducible, would you lay out clear steps to do it, and include vdsm.log, supervdsm.log, and the engine log of the whole process? Would you make sure to use the latest rhev-3.3 when you do that? (Frankly, I'd prefer it if you could test with the latest ovirt-3.4 beta, but that may be too much to ask.) I see no steps in comment #1, and comment #0 is missing a crucial step 5, which explains the nature of the requested change in networking.

Created attachment 857981 [details]
log files
Dan,
My existing env:
- RHEV-M 3.3.0-0.46 (all-in-one test env with a RHEL 6.5 + KVM box A, though customer had RHEV-M 3.2 managing their RHEL + KVM box)
- A second RHEL 6.5 + KVM box B
- Both boxes were vanilla, default RHEL 6.5 installs. Both have two NICs.
- Box A was added as a hypervisor to the environment via the all-in-one install method. Box B was added from RHEV-M manually (after being subscribed to appropriate channels in Satellite)
- Both boxes have time synchronized for easier correlation :)
- 'Default' datacenter, 'Default' cluster
Steps to reproduce:
1) With box B freshly added as a RHEL 6.5 hypervisor, select it from the Hosts tab and choose 'Setup Host Networks' from the 'Network Interfaces' sub-tab.
2) A fresh box B should show the 'rhevm' network assigned to eth0 by default (it does in my env and did in the customer's). Drag eth1 onto eth0 to attempt to create a bond0 interface for the 'rhevm' network (the bond mode setting does not matter, but I chose mode 4).
3) Click the 'OK' button and see the error message in attachment #855232.
Please reference the requested logs in attachment #857981 [details]. I've included the vdsm/supervdsm logs from box A as well. I triggered the error a few times between 14:58 and 15:00 in the log files.
Please let me know if I can provide any further info.
I will try to duplicate this with an ovirt-3.4 beta environment if I find some time to rebuild my env with those bits.

Moti, can you make something of it? I see nothing damning in the reported getCaps:
{'HBAInventory': {'FC': [],
'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:3818f73e83f9'}]},
'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:3818f73e83f9',
'bondings': {'bond0': {'addr': '',
'cfg': {},
'hwaddr': '00:00:00:00:00:00',
'mtu': '1500',
'netmask': '',
'slaves': []},
'bond1': {'addr': '',
'cfg': {},
'hwaddr': '00:00:00:00:00:00',
'mtu': '1500',
'netmask': '',
'slaves': []},
'bond2': {'addr': '',
'cfg': {},
'hwaddr': '00:00:00:00:00:00',
'mtu': '1500',
'netmask': '',
'slaves': []},
'bond3': {'addr': '',
'cfg': {},
'hwaddr': '00:00:00:00:00:00',
'mtu': '1500',
'netmask': '',
'slaves': []},
'bond4': {'addr': '',
'cfg': {},
'hwaddr': '00:00:00:00:00:00',
'mtu': '1500',
'netmask': '',
'slaves': []}},
'bridges': {'rhevm': {'addr': '192.168.1.14',
'cfg': {'BOOTPROTO': 'dhcp',
'DELAY': '0',
'DEVICE': 'rhevm',
'IPV6INIT': 'yes',
'MTU': '1500',
'NM_CONTROLLED': 'no',
'ONBOOT': 'yes',
'TYPE': 'Bridge',
'UUID': 'b2e23a67-784e-4ff1-b164-f0c88df82ed1'},
'mtu': '1500',
'netmask': '255.255.255.0',
'ports': ['eth0'],
'stp': 'off'}},
'clusterLevels': ['3.0', '3.1', '3.2'],
'cpuCores': '8',
'cpuFlags': u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,aperfmperf,pni,dtes64,monitor,ds_cpl,vmx,est,tm2,ssse3,cx16,xtpr,pdcm,dca,sse4_1,lahf_lm,dts,tpr_shadow,vnmi,flexpriority,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_n270',
'cpuModel': 'Intel(R) Xeon(R) CPU E5440 @ 2.83GHz',
'cpuSockets': '2',
'cpuSpeed': '2833.000',
'cpuThreads': '8',
'emulatedMachines': [u'rhel6.5.0',
u'pc',
u'rhel6.4.0',
u'rhel6.3.0',
u'rhel6.2.0',
u'rhel6.1.0',
u'rhel6.0.0',
u'rhel5.5.0',
u'rhel5.4.4',
u'rhel5.4.0'],
'guestOverhead': '65',
'hooks': {},
'kvmEnabled': 'true',
'lastClient': '192.168.1.13',
'lastClientIface': 'rhevm',
'management_ip': '',
'memSize': '7871',
'netConfigDirty': 'False',
'networks': {'rhevm': {'addr': '192.168.1.14',
'bridged': True,
'cfg': {'BOOTPROTO': 'dhcp',
'DELAY': '0',
'DEVICE': 'rhevm',
'IPV6INIT': 'yes',
'MTU': '1500',
'NM_CONTROLLED': 'no',
'ONBOOT': 'yes',
'TYPE': 'Bridge',
'UUID': 'b2e23a67-784e-4ff1-b164-f0c88df82ed1'},
'gateway': '192.168.1.1',
'iface': 'rhevm',
'mtu': '1500',
'netmask': '255.255.255.0',
'ports': ['eth0'],
'stp': 'off'}},
'nics': {'eth0': {'addr': '',
'cfg': {'BRIDGE': 'rhevm',
'DEVICE': 'eth0',
'HWADDR': '00:e0:81:b5:02:c1',
'IPV6INIT': 'yes',
'MTU': '1500',
'NM_CONTROLLED': 'no',
'ONBOOT': 'yes',
'UUID': 'b2e23a67-784e-4ff1-b164-f0c88df82ed1'},
'hwaddr': '00:e0:81:b5:02:c1',
'mtu': '1500',
'netmask': '',
'speed': 1000},
'eth1': {'addr': '',
'cfg': {'BOOTPROTO': 'dhcp',
'DEVICE': 'eth1',
'HWADDR': '00:E0:81:B5:02:C0',
'NM_CONTROLLED': 'yes',
'ONBOOT': 'no',
'TYPE': 'Ethernet',
'UUID': 'a481615c-c9a2-4e26-92e5-62937bb38584'},
'hwaddr': '00:e0:81:b5:02:c0',
'mtu': '1500',
'netmask': '',
'speed': 0}},
'operatingSystem': {'name': 'RHEL',
'release': '6.5.0.1.el6',
'version': '6Server'},
'packages2': {'kernel': {'buildtime': 1386939500.0,
'release': '431.3.1.el6.x86_64',
'version': '2.6.32'},
'libvirt': {'buildtime': 1386770011L,
'release': '29.el6_5.2',
'version': '0.10.2'},
'qemu-img': {'buildtime': 1384327329L,
'release': '2.415.el6_5.3',
'version': '0.12.1.2'},
'qemu-kvm': {'buildtime': 1384327329L,
'release': '2.415.el6_5.3',
'version': '0.12.1.2'},
'spice-server': {'buildtime': 1385990636L,
'release': '6.el6_5.1',
'version': '0.12.4'},
'vdsm': {'buildtime': 1385472772L,
'release': '28.0.el6ev',
'version': '4.10.2'}},
'reservedMem': '321',
'software_revision': '28.0',
'software_version': '4.10',
'supportedENGINEs': ['3.0', '3.1', '3.2'],
'supportedProtocols': ['2.2', '2.3'],
'supportedRHEVMs': ['3.0'],
'uuid': '5FE23BBC-CDCA-32C1-B040-B22721E40BD6',
'version_name': 'Snow Man',
'vlans': {},
'vmTypes': ['kvm']}
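The detail that matters in this dump is that eth1's reported cfg carries 'BOOTPROTO': 'dhcp' while eth0's does not. A minimal helper for spotting such NICs in a getCaps result (hypothetical code, not part of VDSM; it only assumes the dict layout shown above):

```python
def nics_with_bootproto(caps):
    """Return NICs whose reported ifcfg defines a boot protocol --
    the condition that made the engine refuse them as bond slaves."""
    return dict((name, nic['cfg']['BOOTPROTO'])
                for name, nic in caps['nics'].items()
                if nic.get('cfg', {}).get('BOOTPROTO'))

# Applied to the getCaps output above, this returns {'eth1': 'dhcp'}.
```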
We faced the same issue last week on users@ovirt: http://lists.ovirt.org/pipermail/users/2014-January/020458.html. 'eth1' is configured with the 'dhcp' boot protocol and therefore cannot serve as a slave. Providing it via the REST API without a boot protocol would have worked fine, but setting up the network via the UI doesn't clear any configuration reported from the host. The origin of this issue is Bug 907240. We can assume that configuring a bond via the UI should clear any pre-existing configuration from the NICs that were selected to act as slaves; by doing so, we'll override any user settings the slaves had before they were selected. The alternative is to remove the boot protocol from 'eth1' in this case and restart network & vdsm (the same solution as was suggested on users@). I'm in favor of the first option (clearing the pre-configured settings).

When setting up a network on top of NICs with an existing DHCP/static address, it would be nice if it were easy to copy that address to the new network, though it is less clear what should be done with multiple NICs with different static addresses. In any case, being able to override a pre-configured address is important.

I agree with the solution where configuring a bond via the UI clears any pre-existing configuration from the NICs that were selected to act as slaves. Having said this, I think a warning/confirmation dialog should probably alert the user to the pre-existing config, create a backup file, and possibly add comments to the ifcfg-eth* file stating something to the effect of "This file was updated/modified by vdsm/ovirt-engine/something else on <date>" to inform any CLI junkies that this host is/was being managed by ovirt-engine. Thoughts?

Our ifcfg files begin with the likes of "# Generated by VDSM version 4.14.0". During years of messing with ifcfg files, I do not recall a true need to keep a backup of the pre-oVirt config. However, having something like that seems prudent, and it requires its own RFE.

While testing my patch that tried to clear the boot protocol on the engine side, I found that VDSM apparently doesn't rewrite the boot protocol according to what's sent from the engine (so even if it is removed on the engine, it won't change on the host). Since this needs to be fixed on the VDSM side, that in itself could be a solution to the bug without any intervention on the engine side: if slave interfaces can't have a boot protocol defined, then VDSM can clear it itself (instead of the engine asking to clear it).

Hmm, this is not as simple as I made it sound; that contradicts the solution to Bug 907240. It appears to me one of them will have to remain unsolved.

After some further verification, the only thing causing the problem is validation that is too strict on the engine side.

rhevm-3.4.0-0.5.master.el6ev.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0506.html
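To illustrate the two behaviours discussed above, here is a rough sketch in Python (the actual engine code is Java, and all names here are invented): the over-strict check refused any prospective slave with addressing configuration, while the clearing behaviour Moti proposed drops that configuration when the NIC is enslaved.

```python
SLAVE_BLOCKING_KEYS = ('BOOTPROTO', 'IPADDR', 'NETMASK', 'GATEWAY')

def validate_slave_strict(nic_cfg):
    """The over-strict validation: refuse any NIC that reports a boot
    protocol, IP address, netmask or gateway -- this is what made the
    DHCP-configured eth1 unbondable."""
    for key in SLAVE_BLOCKING_KEYS:
        if nic_cfg.get(key):
            raise ValueError('%s cannot be a slave: %s is set'
                             % (nic_cfg.get('DEVICE'), key))

def enslave(nic_cfg):
    """The proposed clearing behaviour: drop the pre-existing
    addressing instead of rejecting the NIC, overriding whatever the
    user had configured on it."""
    return dict((k, v) for k, v in nic_cfg.items()
                if k not in SLAVE_BLOCKING_KEYS)
```

With the getCaps output above, validate_slave_strict raises for eth1 (its 'BOOTPROTO' is set), while enslave returns eth1's cfg without the BOOTPROTO entry.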
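The manual workaround mentioned above (remove the boot protocol from eth1, then restart network & vdsm) might look roughly like this on a RHEL 6 host; a sketch, with the ifcfg path and the helper name assumed rather than taken from this report:

```python
import os
import re

def clear_bootproto(nic, ifcfg_dir='/etc/sysconfig/network-scripts'):
    """Strip the BOOTPROTO line from ifcfg-<nic> so the NIC can be
    enslaved; afterwards run 'service network restart' and
    'service vdsmd restart' for the change to take effect."""
    path = os.path.join(ifcfg_dir, 'ifcfg-%s' % nic)
    with open(path) as f:
        lines = f.readlines()
    with open(path, 'w') as f:
        f.writelines(l for l in lines
                     if not re.match(r'\s*BOOTPROTO=', l))

clear_bootproto('eth1')
```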