Description of problem:
When I run 'openstack server stop <server_uuid>' I don't see any logs for the `network-vif-unplugged` event in the server.log file (checked all the controller nodes), and the port remains in ACTIVE state as per the 'openstack port list' output. If I run 'openstack server start <server_uuid>' I see logs for both the `network-vif-plugged` and `network-vif-unplugged` events in the server.log file:

2021-02-22 13:46:14.616 33 DEBUG neutron.notifiers.nova [-] Sending events: [{'server_uuid': '52928e3f-5cd8-4437-b818-89335ec88b58', 'name': 'network-vif-unplugged', 'status': 'completed', 'tag': '522f0812-fa63-47c6-8c85-12585203673f'}] send_events /usr/lib/python3.6/site-packages/neutron/notifiers/nova.py:246
2021-02-22 13:46:15.191 33 INFO neutron.notifiers.nova [-] Nova event response: {'server_uuid': '52928e3f-5cd8-4437-b818-89335ec88b58', 'name': 'network-vif-unplugged', 'status': 'completed', 'tag': '522f0812-fa63-47c6-8c85-12585203673f', 'code': 200}
2021-02-22 13:46:15.973 32 DEBUG neutron.notifiers.nova [-] Sending events: [{'server_uuid': '52928e3f-5cd8-4437-b818-89335ec88b58', 'name': 'network-vif-plugged', 'status': 'completed', 'tag': '522f0812-fa63-47c6-8c85-12585203673f'}] send_events /usr/lib/python3.6/site-packages/neutron/notifiers/nova.py:246
2021-02-22 13:46:17.056 32 INFO neutron.notifiers.nova [-] Nova event response: {'server_uuid': '52928e3f-5cd8-4437-b818-89335ec88b58', 'name': 'network-vif-plugged', 'status': 'completed', 'tag': '522f0812-fa63-47c6-8c85-12585203673f', 'code': 200}

Version-Release number of selected component (if applicable):
RHOS-16.1-RHEL-8-20210216.n.1

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Hi Alex: Just for the record, the environment you lent me has OVS with the hybrid plugin. That means Nova creates a Linux bridge that is connected to the VM on one side and to OVS, via a veth pair, on the other side: https://docs.openstack.org/neutron/pike/contributor/internals/openvswitch_agent.html. The problem is that when the VM is stopped, the TAP device (light blue in the link above) is deleted, but the OVS port (qvo-xxx) is not. For the OVS agent (and therefore for the Neutron server), this port still exists and is active. When the VM is restarted, the Linux bridge and the veth pair are recreated (deleted and created again). For a short time the OVS agent notifies that the port has been deleted (vif-unplugged) and then attached again (vif-plugged). Of course, this is not the expected behaviour: when the VM is stopped (and the TAP port deleted), the Neutron server should reflect the correct status of this port. But we don't receive any information from Nova, either via an RPC call or, the easiest way, by deleting the Linux bridge and the veth pair. If the veth pair is deleted, the OVS agent will detect this event and inform the Neutron server, which will set the port to DOWN. You should flip this BZ to Nova. Regards.
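For reference, all of the hybrid-plug devices described above are named from the Neutron port ID. A minimal sketch of that naming convention (the 11-character truncation mirrors what Nova/os-vif do, but this helper itself is illustrative, not their code):

```python
NIC_NAME_LEN = 14  # Linux interface names are limited; Nova truncates to this

def hybrid_plug_devices(port_id: str) -> dict:
    """Names of the per-port devices created for a hybrid-plugged VIF (sketch)."""
    stub = port_id[:NIC_NAME_LEN - 3]  # leave room for the 3-char prefix
    return {
        "tap": f"tap{stub}",       # guest-facing TAP; deleted with the domain
        "bridge": f"qbr{stub}",    # intermediate Linux bridge
        "veth_qvb": f"qvb{stub}",  # veth end plugged into the Linux bridge
        "veth_qvo": f"qvo{stub}",  # veth end plugged into br-int (OVS)
    }

# Port tag from the log excerpt in comment 0:
devs = hybrid_plug_devices("522f0812-fa63-47c6-8c85-12585203673f")
print(devs["veth_qvo"])  # → qvo522f0812-fa
```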
Hi Rodolfo, Thank you for debugging and the update.
os-vif should already be deleting the Linux bridge and veth pair: https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L313-L324, and Nova should be calling unplug as part of server stop. Looking at power off, https://github.com/openstack/nova/blob/db666e2118972e501637141e48164a94f9bead54/nova/virt/libvirt/driver.py#L3576-L3580, the libvirt driver just calls destroy; it does not call unplug_vifs, so that is the issue here.
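For illustration, the hybrid unplug path linked above tears down the per-port devices on the host. A hedged sketch of the roughly equivalent host-side cleanup (the command strings are illustrative only; os-vif performs these steps through its own privileged helpers, not by shelling out):

```python
def unplug_bridge_cleanup(port_id: str) -> list[str]:
    """Host commands roughly equivalent to os-vif's hybrid unplug (sketch)."""
    stub = port_id[:11]  # device names use the truncated port UUID
    return [
        f"ip link del qvb{stub}",        # deleting one veth end removes the pair
        f"ip link set qbr{stub} down",
        f"brctl delbr qbr{stub}",        # remove the intermediate Linux bridge
        f"ovs-vsctl --if-exists del-port br-int qvo{stub}",  # detach from OVS
    ]

for cmd in unplug_bridge_cleanup("522f0812-fa63-47c6-8c85-12585203673f"):
    print(cmd)
```

Once the qvo end of the veth pair disappears, the OVS agent notices the port removal and reports it to the Neutron server, which is what would drive the port to DOWN.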
By the way, just to be clear: there is no API contract that says that if you stop a VM its port status will change. That is not a valid expectation to have. For SR-IOV, for example, it won't change, and it never has for OVS with iptables. It may have for OVN, because libvirt was doing a port delete on domain destroy, but that is not part of the Nova or Neutron API guarantees, and it is something that each virt driver is free to implement differently. For example, I would expect that Ironic will not do anything to update the status of the Neutron port when the server is powered off. So at the API level you should not be able to observe this behavior in any way that you can make decisions with; it is not part of the public API contract.
os-vif uses --may-exist when adding the port, so that it won't fail if the port is already there (e.g. when you restart nova-compute) and won't break network connectivity for the guest. libvirt, on the other hand, does ovs-vsctl --if-exists del-port and then an add-port on domain create. Normally this would cause the OVS port to be deleted and recreated, but since we destroy the domain on power off, libvirt has already deleted the port by then, so when the VM powers on it just creates a new port since there is none to delete.

In the libvirt driver, power_on is implemented by calling hard_reboot (start and stop were technically optional originally, but reboot was required, so it was added first). hard_reboot calls destroy, https://github.com/openstack/nova/blob/31889ce296d1e1a62fe5825292479009118ddfab/nova/virt/libvirt/driver.py#L3454-L3455, which eventually calls _unplug_vifs, https://github.com/openstack/nova/blob/31889ce296d1e1a62fe5825292479009118ddfab/nova/virt/libvirt/driver.py#L1496-L1497, which for iptables will call _unplug_bridge, https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L313-L324; this will delete the Linux bridge and veth pair. It then calls _create_guest_with_network, https://github.com/openstack/nova/blob/31889ce296d1e1a62fe5825292479009118ddfab/nova/virt/libvirt/driver.py#L3498-L3500, which will plug the VIFs again using os-vif, https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L203-L225, recreating the bridge and veth pair.

This is the correct behavior for hard reboot, and power on is just hard reboot. We could technically skip the call to destroy for power on, but this has been Nova's behavior since before Neutron was a project, so there has been no regression here. I would suggest deleting the Tobiko test, as this API is an internal API just for inter-service synchronization.
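The two code paths described above can be traced with a toy model; the function names echo the Nova libvirt driver methods linked, but the bodies merely record the order of the steps:

```python
# Toy trace of the power_off and power_on (hard_reboot) paths; illustrative only.
calls = []

def _destroy_domain():
    calls.append("destroy libvirt domain")  # the TAP device goes away with it

def _unplug_vifs():
    calls.append("unplug vifs (delete qbr bridge + qvb/qvo veth pair)")

def _plug_vifs():
    calls.append("plug vifs (recreate bridge + veth pair)")

def power_off():
    _destroy_domain()  # no _unplug_vifs call: the qvo port stays ACTIVE in OVS

def hard_reboot():
    _destroy_domain()
    _unplug_vifs()     # cleanup happens on the reboot path only
    _plug_vifs()
    calls.append("start libvirt domain")

def power_on():
    hard_reboot()      # power_on is implemented as hard_reboot

power_off()
power_on()
print(calls)
```

This makes the observed log pattern plausible: stop emits nothing, while start produces a vif-unplugged followed immediately by a vif-plugged, exactly as in comment 0.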
We even have a warning to that effect: https://docs.openstack.org/api-ref/compute/#create-external-events-os-server-external-events. I'll add this to the PTG agenda for discussion. I think in general it would be valid to unplug VIFs in power_off, but since this is virt-driver- and ML2-driver-specific behavior, we should not be asserting this behavior outside of Nova/Neutron.
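For context, the body that Neutron's nova notifier POSTs to that os-server-external-events API is visible in the comment-0 logs. The payload fields below are taken from those logs; only the surrounding request construction is sketched:

```python
import json

# One entry of the "Sending events" list from the comment-0 log excerpt.
event = {
    "server_uuid": "52928e3f-5cd8-4437-b818-89335ec88b58",
    "name": "network-vif-unplugged",
    "status": "completed",
    "tag": "522f0812-fa63-47c6-8c85-12585203673f",  # the Neutron port ID
}

# Neutron POSTs this body to Nova's /os-server-external-events endpoint.
body = json.dumps({"events": [event]})
print(body)
```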
We discussed this at the PTG and decided this is not a bug and it should not be fixed: https://etherpad.opendev.org/p/r.0a1509fc788d92391f50397f2ee4af9f, line 312:

(sean-k-mooney): Should we unplug VIFs in power off?
(gibi): ping ralonsoh
Currently in the libvirt driver, power_on is implemented by calling hard_reboot. power_off undefines the libvirt domain but does not call unplug_vifs; hard reboot will both destroy the domain and clean up the network interfaces. The current power_off behavior results in ports being left configured on OVS while the VM is off, and then being deleted and recreated on power on. Nova has done this since before Neutron was a project, so it is expected, but should we tear down the backend networking config when we power off? Also, should we do the same for host-mounted Cinder volumes? I assume they are unmounted, but the attachments are not unbound. I don't believe new calls to Neutron or Cinder to unbind the port binding or volume attachments would be correct, but we might want to remove the configuration from the host. Context for this is https://bugzilla.redhat.com/show_bug.cgi?id=1932187. Tobiko is a Tempest alternative that should not be asserting this behavior in tests, since it is not part of the API contract and changes based on both the Neutron ML2 driver and the Nova virt driver, so it is not generic. But it raises the question: should we keep the logic we have always had, or should we unplug in power off?
AGREED: (sean-k-mooney): Not a bug / won't fix
*** Bug 1983937 has been marked as a duplicate of this bug. ***