Bug 1184497
Summary: | SetupNetworks> Can't change network state from dhcp to static ip | |
---|---|---|---
Product: | [oVirt] vdsm | Reporter: | Michael Burman <mburman>
Component: | General | Assignee: | Ondřej Svoboda <osvoboda>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Michael Burman <mburman>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | --- | CC: | bazulay, bugs, danken, ecohen, eedri, elevi, gklein, lsurette, mburman, mgoldboi, ogofen, osvoboda, rbalakri, yeylon, ylavi
Target Milestone: | ovirt-3.6.0-rc3 | Keywords: | Regression
Target Release: | --- | Flags: | rule-engine: ovirt-3.6.0+, rule-engine: blocker+, ylavi: Triaged+, ylavi: planning_ack+, rule-engine: devel_ack+, rule-engine: testing_ack+
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | network | |
Fixed In Version: | vdsm-4.17.0-632.git19a83a2 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 1279824 | Environment: |
Last Closed: | 2015-11-04 12:58:59 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1112861, 1242235, 1279824 | |
Attachments: | | |
Verified on 3.6.0-0.0.master.20150412172306.git55ba764.el6.noarch and vdsm-4.17.0-633.git7ad88bc.el7.x86_64.

I see this behavior again on 3.6.0-0.0.master.20150627185750.git6f063c1.el6 and vdsm-4.17.0-1054.git562e711.el7.noarch.

When trying to change a network's bootproto from DHCP to static IP via Setup Networks (not only for the ovirtmgmt network), it looks like the network configuration has been saved successfully; the ifcfg file reports that bootproto changed to none with a static IP, netmask and gateway:

cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
    # Generated by VDSM version 4.17.0-1054.git562e711.el7
    DEVICE=ovirtmgmt
    TYPE=Bridge
    DELAY=0
    STP=off
    ONBOOT=yes
    IPADDR=10.35.128.8
    NETMASK=255.255.255.0
    GATEWAY=10.35.128.254
    BOOTPROTO=none
    MTU=1500
    DEFROUTE=yes
    NM_CONTROLLED=no
    IPV6INIT=no
    HOTPLUG=no

But Setup Networks reports that the boot protocol for this network is still DHCP.

vdsCaps reports:

    'ovirtmgmt': {'addr': '10.35.128.8',
                  'bridged': True,
                  'cfg': {'BOOTPROTO': 'none', 'DEFROUTE': 'yes', 'DELAY': '0', 'DEVICE': 'ovirtmgmt', 'GATEWAY': '10.35.128.254', 'HOTPLUG': 'no', 'IPADDR': '10.35.128.8', 'IPV6INIT': 'no', 'MTU': '1500', 'NETMASK': '255.255.255.0', 'NM_CONTROLLED': 'no', 'ONBOOT': 'yes', 'STP': 'off', 'TYPE': 'Bridge'},
                  'dhcpv4': False,
                  'dhcpv6': False,
                  'gateway': '10.35.128.254',
                  'iface': 'ovirtmgmt',
                  'ipv4addrs': ['10.35.128.8/24'],
                  'ipv6addrs': ['fe80::21d:9ff:fe68:71c1/64'],
                  'ipv6gateway': '::',
                  'mtu': '1500',
                  'netmask': '255.255.255.0',
                  'ports': ['eno1'],
                  'stp': 'off'},

This is a bit different from the original report, because now the ifcfg-'network' file has been updated to static, but SN still reports dhcp. Only manual configuration will bring ifcfg-'network' back to dhcp; the Setup Networks command doesn't have any effect here. So, reopening as a regression. Thanks.

Created attachment 1049888: screenshot
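To make the comparison in the comment above easier to repeat, here is a minimal Python sketch that parses ifcfg-style text and derives the boot protocol it configures. The helper names `parse_ifcfg` and `configured_bootproto` are hypothetical and are not part of vdsm; the sample text is an abridged copy of the ifcfg-ovirtmgmt file quoted in this report.

```python
def parse_ifcfg(text):
    """Parse KEY=VALUE lines of an ifcfg file into a dict, skipping comments."""
    cfg = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted in hand-edited files; strip surrounding quotes.
        cfg[key.strip()] = value.strip().strip("'\"")
    return cfg


def configured_bootproto(cfg):
    """Return 'dhcp', 'static' (an address is set) or 'none' (neither)."""
    if cfg.get("BOOTPROTO", "none").lower() == "dhcp":
        return "dhcp"
    return "static" if cfg.get("IPADDR") else "none"


# Abridged from the ifcfg-ovirtmgmt contents quoted in this bug.
IFCFG_OVIRTMGMT = """\
# Generated by VDSM version 4.17.0-1054.git562e711.el7
DEVICE=ovirtmgmt
TYPE=Bridge
ONBOOT=yes
IPADDR=10.35.128.8
NETMASK=255.255.255.0
GATEWAY=10.35.128.254
BOOTPROTO=none
"""

cfg = parse_ifcfg(IFCFG_OVIRTMGMT)
print(configured_bootproto(cfg), cfg["IPADDR"])
```

For the file above the sketch classifies the configuration as static, which is what the host has on disk even though Setup Networks still shows DHCP.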
Can you attach the ovirtmgmt ifcfg file? I want to find out if this is a race.

Never mind. I have found in the logs:

MainProcess|Thread-20::DEBUG::2015-01-19 15:05:35,251::ifcfg::538::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt configuration:
    # Generated by VDSM version 4.16.8.1-5.el7ev
    DEVICE=ovirtmgmt
    TYPE=Bridge
    DELAY=0
    STP=off
    ONBOOT=yes
    BOOTPROTO=dhcp
    MTU=1500
    DEFROUTE=yes
    NM_CONTROLLED=no
    HOTPLUG=no

So this is indeed a race that might be resolved either by the engine providing blockingdhcp=True or by making it the default behaviour of vdsm. Dan, do you see any reason not to make it the default behaviour in vdsm? If you do, this might be supplied by the engine when it wants.

This was my mistake. vdsm is fine here as it reports the proper caps. This is probably an engine bug.

(In reply to Ido Barkan from comment #5)
> Dan, do you see any reason not to make it the default behaviour in vdsm?
> If you do, this might be supplied by the engine when it wants.

The historic reason is that a DHCP server can stall for long minutes, and Engine needs a quicker response (or else it would have suffered a timeout and attempted a revert).

This bug reproduces on every network configured on the host, which rules out creating iSCSI multipathing bonds. Furthermore, if the creation of such a bond is attempted, storage becomes inaccessible when the network reverts back to dhcp (because of wrong masking). I believe it is a vdsm bug.

The engine is sending the host setup networks VDS command with the following arguments (taken from the engine log):

2015-08-16 14:59:19,559 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (default task-30) [799333a2] START, HostSetupNetworksVDSCommand(HostName = centos7-1, HostSetupNetworksVdsCommandParameters:{runAsync='true', hostId='d836d879-d9e9-4073-aa7b-b74804f0e133', vds='Host[centos7-1,d836d879-d9e9-4073-aa7b-b74804f0e133]', rollbackOnFailure='true', conectivityTimeout='120', hostNetworkQosSupported='true', networks='[HostNetwork:{defaultRoute='false', bonding='false', networkName='NET0', nicName='eth1', vlan='null', mtu='0', vmNetwork='true', stp='false', properties='[]', bootProtocol='STATIC_IP', address='192.168.0.1', netmask='255.255.0.0', gateway=''}]', removedNetworks='[]', bonds='[]', removedBonds='[]'}), log id: 6d595db7

When finished, the engine calls getCaps. The output of vdsGetCaps is attached. vdsm is reporting 'dhcpv4': True, which leads the engine to think the bootproto is DHCP.

Created attachment 1063519: vdsCaps_bootproto_update_failed
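The symptom described in the comment above (a statically configured network whose caps entry still says 'dhcpv4': True) can be spotted mechanically. The following Python sketch is only an illustration: `find_dhcpv4_mismatches` is a hypothetical helper, not a vdsm or engine API, and the sample entries are made up in the shape of the vdsCaps excerpts quoted in this report, because the attached caps file is not reproduced here.

```python
def find_dhcpv4_mismatches(devices):
    """Return names whose entry reports dhcpv4 although cfg is not BOOTPROTO=dhcp."""
    mismatches = []
    for name in sorted(devices):
        entry = devices[name]
        bootproto = entry.get("cfg", {}).get("BOOTPROTO", "").lower()
        if entry.get("dhcpv4") and bootproto != "dhcp":
            mismatches.append(name)
    return mismatches


# Illustrative entries only (assumed values, not taken from the attachment).
reported = {
    "NET0": {
        "cfg": {"BOOTPROTO": "none", "IPADDR": "192.168.0.1"},
        "dhcpv4": True,   # stale: the address was not handed out by DHCP
    },
    "NET1": {
        "cfg": {"BOOTPROTO": "dhcp"},
        "dhcpv4": True,   # consistent
    },
}

print(find_dhcpv4_mismatches(reported))
```

A non-empty result marks the entries that would mislead the engine into showing DHCP for a static network.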
I have to clarify here a bit. While dhclient probably left behind an active DHCP lease (in /var/lib/dhclient/) that VDSM uses to decide the 'dhcpv4' property for devices (here it reports True for the _bridge_ NET0), there is special handling (a workaround) for networks: "If DHCP is not configured now, do not report it even though you found an active lease". You can see that for the _network_ NET0 'dhcpv4' is False because of this. This behaviour was adopted so as not to confuse the engine (and the user). We assumed that the engine used information from the network (and not the bridge) for display in "Edit Management Network". Is this assumption wrong?

At the moment, Vdsm reports the new static IP address for the bridge, but claims that the bridge has an active dhcpv4 lease. This is misleading, as the reported IP address was not provided by a DHCP server. I'm afraid that Vdsm should report dhcpv4==True only if there is a running dhclient process bound to the interface (AND it has a valid lease). Can you tweak the vdsm-side logic?

*** Bug 1192778 has been marked as a duplicate of this bug. ***

This fix is already in 3.6.0-15, moving to ON_QA.

This bug still exists:

    'ovirtmgmt': {'addr': '10.35.128.8',
                  'bridged': True,
                  'cfg': {'BOOTPROTO': 'none', 'DEFROUTE': 'yes', 'DELAY': '0', 'DEVICE': 'ovirtmgmt', 'GATEWAY': '10.35.128.254', 'HOTPLUG': 'no', 'IPADDR': '10.35.128.8', 'IPV6INIT': 'no', 'MTU': '1500', 'NETMASK': '255.255.255.0', 'NM_CONTROLLED': 'no', 'ONBOOT': 'yes', 'STP': 'off', 'TYPE': 'Bridge'},
                  'dhcpv4': True,

The network is configured with a static IP, but dhcpv4 = true and the engine still reports the bootproto as dhcp.

On 3.6, if the bootproto is changed manually, vdsm will not recognize the change (it is not reading from cfg). It means this bug can be verified.

Changing the network state from dhcp to static IP and vice versa via Setup Networks is now working as expected. vdsCaps and the ifcfg-* files are reporting as expected.
Tested and verified on 3.6.0-0.18.el6 with vdsm-4.17.8-1.el7ev.noarch.

Thank you, Michael! The logic we adopted in the end doesn't watch dhclient instances, as that would probably be too intrusive for a backport. Instead, the currently active ("running") network configuration is queried and reported also for network devices that represent a given network (most often, bridges). There is some refactoring work to be merged later, but the behaviour will stay the same. Inspection of running dhclient instances' cmdlines was actually the first idea I had more than a year and a half ago. For 4.0 it will be reconsidered.

oVirt 3.6.0 was released on November 4th, 2015 and should fix this issue. If problems still persist, please open a new BZ and reference this one.
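The difference between the lease-only reporting discussed in this thread and the adopted behaviour can be written down as two small predicates. This is a sketch of one reading of the comments, not the actual vdsm code: both function names are hypothetical, and combining the lease with the running configuration by a logical AND is an assumption based on the quoted workaround ("If DHCP is not configured now, do not report it even though you found an active lease").

```python
def dhcpv4_from_lease_only(has_active_lease):
    """Lease-only rule: a leftover lease alone makes the device report DHCP."""
    return has_active_lease


def dhcpv4_with_running_config(has_active_lease, running_config_uses_dhcp):
    """Assumed adopted rule: a lease counts only if the running config uses DHCP."""
    return has_active_lease and running_config_uses_dhcp


# Print the truth table for both rules side by side.
for lease in (False, True):
    for uses_dhcp in (False, True):
        print(lease, uses_dhcp,
              dhcpv4_from_lease_only(lease),
              dhcpv4_with_running_config(lease, uses_dhcp))
```

The two rules differ only in the case this bug is about: a stale lease left behind after the network was switched to a static address.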
Created attachment 982352: logs

Description of problem:
SetupNetworks> Can't change 'ovirtmgmt' from dhcp to static ip.
When trying to change 'ovirtmgmt' from dhcp to static ip via SN, I get in the event log:
Network changes were saved on host orchid-vds1.qa.lab.tlv.redhat.com
But when checking:
cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
BOOTPROTO=dhcp
And in SN, ovirtmgmt stays with dhcp, even though the event log shows that network changes were saved on the host.

Version-Release number of selected component (if applicable):
3.6.0-0.0.master.20150105124153.git669ddc1.el6
vdsm-4.17.0-304.gitf191666.el7.x86_64

How reproducible:
100

Steps to Reproduce:
1. Hosts>SN> edit 'ovirtmgmt' (with pencil)
2. Change the Boot Protocol from dhcp to static ip
3. Approve operation

Actual results:
'ovirtmgmt' stays with dhcp BootProto, in SN and in ifcfg-ovirtmgmt

Expected results:
Should succeed in changing 'ovirtmgmt' from dhcp to static ip