Description of problem:
Instance networking is intermittently disconnected during a stack update when using BCF. The ivs service is restarted several times, during which the instance network gets disconnected.

Feb 26 12:26:48 XXX.localnet os-collect-config[7070]: [2018/02/26 12:26:48 PM] [INFO] Restart ivs
Feb 26 12:48:34 XXX.localnet os-collect-config[7070]: [2018/02/26 12:48:34 PM] [INFO] Restart ivs
Feb 26 12:49:56 XXX.localnet os-collect-config[7070]: [2018/02/26 12:49:56 PM] [INFO] Restart ivs
Feb 26 12:52:24 XXX.localnet os-collect-config[7070]: [2018/02/26 12:52:24 PM] [INFO] Restart ivs
Feb 26 12:58:22 XXX.localnet os-collect-config[7070]: [2018/02/26 12:58:22 PM] [INFO] Restart ivs
Feb 26 13:04:20 XXX.localnet os-collect-config[7070]: [2018/02/26 01:04:20 PM] [INFO] Restart ivs
Feb 26 13:10:18 XXX.localnet os-collect-config[7070]: [2018/02/26 01:10:18 PM] [INFO] Restart ivs
Feb 26 13:16:25 XXX.localnet os-collect-config[7070]: [2018/02/26 01:16:25 PM] [INFO] Restart ivs
Feb 26 13:22:35 XXX.localnet os-collect-config[7070]: [2018/02/26 01:22:35 PM] [INFO] Restart ivs

Version-Release number of selected component (if applicable):
OSP10
BCF 4.5.1

How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
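To quantify how often the restarts occur, the "Restart ivs" entries in the log above can be pulled out and the gaps between them computed. The following is only a rough sketch: the helper names `restart_times` and `restart_gaps_seconds` are hypothetical, and the regular expression assumes the bracketed timestamp format shown in the log excerpt.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp that precedes "[INFO] Restart ivs"
# in the os-collect-config log lines quoted in this report.
RESTART_RE = re.compile(
    r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} [AP]M)\] \[INFO\] Restart ivs"
)


def restart_times(log_text):
    """Return the timestamp of every "Restart ivs" entry, in log order."""
    return [
        datetime.strptime(stamp, "%Y/%m/%d %I:%M:%S %p")
        for stamp in RESTART_RE.findall(log_text)
    ]


def restart_gaps_seconds(log_text):
    """Return the gaps, in seconds, between consecutive restarts."""
    times = restart_times(log_text)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# First three entries from the log in this report.
sample = """\
Feb 26 12:26:48 XXX.localnet os-collect-config[7070]: [2018/02/26 12:26:48 PM] [INFO] Restart ivs
Feb 26 12:48:34 XXX.localnet os-collect-config[7070]: [2018/02/26 12:48:34 PM] [INFO] Restart ivs
Feb 26 12:49:56 XXX.localnet os-collect-config[7070]: [2018/02/26 12:49:56 PM] [INFO] Restart ivs
"""

print(restart_gaps_seconds(sample))
```

Each reported gap is the time between two consecutive ivs restarts, and therefore between two network interruptions for the instances on that node.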
Hi, Is there a workaround to prevent the network of existing VMs from being disconnected? Best Regards, Chen
> Scaling out compute nodes shouldn't affect existing VMs. Can we just simply reload ivs
> instead of restart ?

Chen - this is really a question for BigSwitch. The restart and all of the IVS support was added here - https://review.openstack.org/#/c/274492/ by Xin Wu from BigSwitch. Adding a NeedInfo.
Hi Xinwu, Can we get any information for this bugzilla ? Best Regards, Chen
Hi Bob, I know nothing about ivs so really not sure whether reloading the service will be a proper workaround here or not. This is a production environment and I'm not sure whether removing the "restart" will impact the function of ivs or not... Do you have any advice Bob ? Best Regards, Chen
We have discussed the ivs restart with Sarath Kumar from BigSwitch and feel it is OK to remove the "ivs restart". Here is the text of the email:

***************
I feel it should be fine to have the 'systemctl restart ivs' removed from os-net-config. We are double checking how and where we 'enable' and 'start' IVS on the Compute nodes to confirm that this change doesn't break any assumptions made in the past (if any).
***************
We have confirmed that we do not make any assumptions about IVS start/enable at our end and do the right thing.

The only concern/question we have is the following - When 'os-net-config' is run, we generate a config file for IVS to consume here[1]. Can we confirm that the first time os-net-config is run, it already has the correct mapping of nicX configs (provided via the RHOSP YAML files) to the correct interface name (i.e. nic1 => eno1, nic3 => p1p1, nic4 => p2p1, etc) [2] ? If this mapping changes between os-net-config calls (or nicX to actual interface name changes), then we would need to restart IVS so that it picks the correct interface configs.

[1] https://github.com/openstack/os-net-config/blob/96d17b251737495be2bae1646debfa0fe44da1da/os_net_config/impl_ifcfg.py#L778
[2] https://github.com/openstack/os-net-config/blob/96d17b251737495be2bae1646debfa0fe44da1da/os_net_config/impl_ifcfg.py#L446
***************

It is our (Red Hat) view that the mapping will be correct when os-net-config runs.
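The condition raised in the email (IVS only needs a restart if the nicX to interface-name mapping changes between os-net-config runs) can be illustrated with a small sketch. This is not code from os-net-config: the functions `changed_nic_aliases` and `ivs_restart_needed` are hypothetical, and the mappings are assumed to be plain dicts that use the example names from the email.

```python
def changed_nic_aliases(previous, current):
    """Return the sorted nicX aliases whose interface name differs between runs.

    Both arguments are assumed to be dicts such as {"nic1": "eno1"}.
    An alias present in only one of the two mappings counts as changed.
    """
    aliases = set(previous) | set(current)
    return sorted(a for a in aliases if previous.get(a) != current.get(a))


def ivs_restart_needed(previous, current):
    """Return True only when at least one alias maps to a different interface."""
    return bool(changed_nic_aliases(previous, current))


# Mapping from the email: nic1 => eno1, nic3 => p1p1, nic4 => p2p1.
first_run = {"nic1": "eno1", "nic3": "p1p1", "nic4": "p2p1"}
second_run = {"nic1": "eno1", "nic3": "p1p1", "nic4": "p2p1"}
# Hypothetical case where two interfaces swap names between runs.
swapped = {"nic1": "eno1", "nic3": "p2p1", "nic4": "p1p1"}

print(ivs_restart_needed(first_run, second_run))
print(changed_nic_aliases(first_run, swapped))
```

With a stable mapping, as expected in this environment, no restart would be triggered, so existing VM traffic would not be interrupted on repeated runs.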
The upstream patch will need to be backported - https://review.openstack.org/#/c/555369/
Hi Bob, Can we get a hotfix for OSP10? Or is it acceptable to just *manually* edit the file on the overcloud nodes to work around the issue? Best Regards, Chen
This is the upstream Ocata patch for this fix - https://review.openstack.org/#/c/561609/. The downstream backport to OSP-10 is still pending. However, I'd like to confirm that this actually fixes the problem. Chen - have you been able to manually edit the file to see if it fixes this issue?
Thanks Chen. As a hotfix is not required, retargeting this bug to OSP-13, where the fix is already available.
We don't have the specific hardware, so all we can do is verify that the new code is in place.

Environment:
(overcloud) [stack@undercloud-0 ~]$ rpm -qa | grep os-net-config
os-net-config-8.4.1-4.el7ost.noarch

It looks like the code for the ivs restart has been removed from impl_ifcfg.py:

        if ivs_uplinks or ivs_interfaces:
            logger.info("Attach to ivs with "
                        "uplinks: %s, "
                        "interfaces: %s" %
                        (ivs_uplinks, ivs_interfaces))
            for ivs_uplink in ivs_uplinks:
                self.ifup(ivs_uplink)
            for ivs_interface in ivs_interfaces:
                self.ifup(ivs_interface)

        if nfvswitch_interfaces or nfvswitch_internal_ifaces:
            logger.info("Attach to nfvswitch with "
                        "interfaces: %s, "
                        "internal interfaces: %s" %
                        (nfvswitch_interfaces, nfvswitch_internal_ifaces))
            for nfvswitch_interface in nfvswitch_interfaces:
                self.ifup(nfvswitch_interface)
            for nfvswitch_internal in nfvswitch_internal_ifaces:
                self.ifup(nfvswitch_internal)

If the submitter finds a problem, please re-open this bug or file a new one.
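The same check can be done mechanically instead of by eye. The sketch below scans source text for any line where "restart" is immediately followed by "ivs". The function name `find_ivs_restart_lines` is hypothetical, and the two "unpatched" lines are only an assumed form of the removed call, based on the "Restart ivs" log message and the 'systemctl restart ivs' command mentioned earlier in this bug.

```python
import re

# "restart" followed by "ivs" with only punctuation or spaces between them,
# e.g. "Restart ivs", "'restart', 'ivs'" or "systemctl restart ivs".
RESTART_IVS = re.compile(r"restart\W+ivs", re.IGNORECASE)


def find_ivs_restart_lines(source_text):
    """Return (line_number, stripped_line) for every line that restarts ivs."""
    return [
        (number, line.strip())
        for number, line in enumerate(source_text.splitlines(), 1)
        if RESTART_IVS.search(line)
    ]


# Excerpt modelled on the patched code quoted in this comment.
patched = (
    "for ivs_uplink in ivs_uplinks:\n"
    "    self.ifup(ivs_uplink)\n"
    "for ivs_interface in ivs_interfaces:\n"
    "    self.ifup(ivs_interface)\n"
)

# ASSUMPTION: illustrative form of the removed restart call, not a verbatim quote.
unpatched = patched + (
    'msg = "Restart ivs"\n'
    "self.execute(msg, '/usr/bin/systemctl', 'restart', 'ivs')\n"
)

print(find_ivs_restart_lines(patched))
print(find_ivs_restart_lines(unpatched))
```

On a real node, the text passed in would be the contents of the installed impl_ifcfg.py; an empty result suggests the restart is no longer present.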
According to our records, this should be resolved by os-net-config-8.4.1-4.el7ost. This build is available now.