Bug 1450118 - [Mix versions] compute node delete does not remove vif_port_id from port.extra
Summary: [Mix versions] compute node delete does not remove vif_port_id from port.extra
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Dmitry Tantsur
QA Contact: mlammon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-05-11 14:47 UTC by Raviv Bar-Tal
Modified: 2017-10-25 14:02 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-25 14:02:25 UTC
Target Upstream Version:
Embargoed:


Attachments
sos report (14.55 MB, application/x-xz)
2017-05-11 14:47 UTC, Raviv Bar-Tal
new ironic and nova logs (481.67 KB, application/x-gzip)
2017-05-16 15:07 UTC, Raviv Bar-Tal
new nova log (2.19 MB, application/x-gzip)
2017-05-16 15:07 UTC, Raviv Bar-Tal

Description Raviv Bar-Tal 2017-05-11 14:47:21 UTC
Created attachment 1277914
sos report

Description of problem:
After deleting a compute node from the overcloud, the vif_port_id remains in the port details of the node.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Install a Newton undercloud and overcloud with a standalone networker (composable role).
2. Upgrade the undercloud to Ocata.
3. Scale down: delete a compute node from the overcloud.
4. Scale up: add a compute node to the overcloud.

Actual results:
The scale-up fails with the error "No valid host was found. There are not enough hosts available".

If you check `openstack baremetal node list`, the node used for compute is available
(sos report is attached).

Expected results:


Additional info:

Comment 1 Raviv Bar-Tal 2017-05-11 14:49:25 UTC
The workaround for reusing the node is to manually delete the vif_port_id from the port of the Ironic node.
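
The manual cleanup described above could be expressed as a JSON Patch against the port, which the Ironic API accepts on PATCH /v1/ports/{port_uuid}. The following is only a sketch: the helper name `build_vif_cleanup_patch` is hypothetical, the port is modelled as a plain dict, and it assumes the stale value is stored under extra['vif_port_id'] as the bug title suggests. The sample UUIDs are the ones that appear in the Ironic log in Comment 7.

```python
import json


def build_vif_cleanup_patch(port):
    """Return JSON Patch operations that drop a stale VIF reference.

    `port` is a dict shaped like an Ironic port API response.
    Hypothetical helper: it only builds the patch and sends nothing.
    """
    ops = []
    # Assumption: the leftover VIF ID lives in port.extra['vif_port_id'].
    if "vif_port_id" in port.get("extra", {}):
        ops.append({"op": "remove", "path": "/extra/vif_port_id"})
    return ops


# Sample port using the identifiers reported in Comment 7.
stale_port = {
    "uuid": "60bbaa52-4ad9-48c7-b466-1a8cd67972dc",
    "extra": {"vif_port_id": "d54874a8-1eb7-4609-b940-e39d4c7ad5c7"},
}

if __name__ == "__main__":
    # Print the request body that would go into the PATCH call.
    print(json.dumps(build_vif_cleanup_patch(stale_port), indent=2))
```

A port without the key yields an empty patch, so the helper can be run over every port of the node without first checking which one is stale.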

Comment 2 Dmitry Tantsur 2017-05-11 14:52:18 UTC
I suspect the upgrade is the key to this problem. We probably fail to clean up what was created by an older version.

Comment 3 Dmitry Tantsur 2017-05-15 15:59:27 UTC
I've noticed that Nova logs are missing from the sosreport. Could you please fetch them too?

Comment 4 Raviv Bar-Tal 2017-05-16 15:07:13 UTC
Created attachment 1279348
new ironic and nova logs

New logs are attached.
In this case the node in question is compute-1:
UUID 0a68cec2-09ca-4c8d-aa89-7ebddccc8a7f
Instance UUID c1209a8b-6172-4095-872d-889845402066

Comment 5 Raviv Bar-Tal 2017-05-16 15:07:50 UTC
Created attachment 1279350
new nova log

Comment 6 Bob Fournier 2017-09-19 19:49:29 UTC
From the neutron logs it looks like we're getting a crash in openvswitch.

neutron/openvswitch-agent.log
2017-05-11 08:37:49.124 21629 ERROR neutron.agent.linux.async_process [-] Process [ovsdb-client monitor Interface name,ofport,external_ids --format=json] dies due to the error: None
2017-05-11 08:37:49.131 21629 ERROR ryu.lib.hub [-] hub: uncaught exception: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 54, in _launch
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ryu/base/app_manager.py", line 545, in close
    self.uninstantiate(app_name)
  File "/usr/lib/python2.7/site-packages/ryu/base/app_manager.py", line 528, in uninstantiate
    app = self.applications.pop(name)
KeyError: 'ofctl_service'

After that, we get these timeout errors while processing the VIF ports, which I assume is why the vif_port_id is not removed:
2017-05-11 08:45:28.734 14051 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-8b0f24b8-f69f-4c3f-8f7d-69b5fc94f41f - - - - -] Error while processing VIF ports

2017-05-11 08:47:33.752 14051 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-8b0f24b8-f69f-4c3f-8f7d-69b5fc94f41f - - - - -] Error while processing VIF ports

2017-05-11 08:51:40.724 14051 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-8b0f24b8-f69f-4c3f-8f7d-69b5fc94f41f - - - - -] Error while processing VIF ports

And finally:
2017-05-11 08:51:40.740 14051 ERROR neutron.agent.linux.async_process [-] Error received from [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]: None
2017-05-11 08:51:40.741 14051 ERROR neutron.agent.linux.async_process [-] Process [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json] dies due to the error: None
2017-05-11 08:51:40.827 14051 INFO oslo_rootwrap.client [-] Stopping rootwrap daemon process with pid=14120
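
For anyone re-checking a fresh openvswitch-agent.log for the same signature, the repeated "Error while processing VIF ports" entries can be pulled out with a small script. This is a rough sketch that assumes the oslo-style line layout seen in the excerpts above (timestamp, PID, level, logger, message); the function name `vif_processing_errors` is made up for illustration.

```python
import re

# Layout assumed from the excerpts above: "<date> <time> <pid> <LEVEL> <logger> <rest>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (?P<pid>\d+) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) (?P<rest>.*)$"
)

MARKER = "Error while processing VIF ports"


def vif_processing_errors(lines):
    """Yield (timestamp, pid) for every ERROR line carrying the marker text."""
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") == "ERROR" and MARKER in match.group("rest"):
            yield match.group("ts"), int(match.group("pid"))


# Two lines copied from the log excerpts in this comment.
SAMPLE = [
    "2017-05-11 08:37:49.131 21629 ERROR ryu.lib.hub [-] hub: uncaught exception: "
    "Traceback (most recent call last):",
    "2017-05-11 08:45:28.734 14051 ERROR "
    "neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent "
    "[req-8b0f24b8-f69f-4c3f-8f7d-69b5fc94f41f - - - - -] "
    "Error while processing VIF ports",
]

if __name__ == "__main__":
    for timestamp, pid in vif_processing_errors(SAMPLE):
        print(timestamp, pid)
```

Passing an open file object instead of `SAMPLE` works the same way, since the function only iterates over lines.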

Comment 7 Bob Fournier 2017-09-19 20:07:55 UTC
Because the VIF has not been detached, Ironic ends up failing the attach:

2017-05-11 08:51:18.615 23418 DEBUG wsme.api [req-4a10c1ec-26cc-4a95-a5c3-ee04ac1c1a90 37f6c36867574db78b3f104aa70ab1ff 0727b18a0b6e48bba49fa657877187c9 - - -] Client-side error: Unable to attach VIF because VIF d54874a8-1eb7-4609-b940-e39d4c7ad5c7 is already attached to Ironic Port 60bbaa52-4ad9-48c7-b466-1a8cd67972dc
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 218, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 2546, in vif_attach
    task.driver.network.vif_attach(task, vif_info)

  File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/network/common.py", line 289, in vif_attach
    port_like_obj = get_free_port_like_object(task, vif_id)

  File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/network/common.py", line 124, in get_free_port_like_object
    free_portgroups, free_ports = _get_free_portgroups_and_ports(task, vif_id)

  File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/network/common.py", line 81, in _get_free_portgroups_and_ports
    if _vif_attached(p, vif_id):

  File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/network/common.py", line 52, in _vif_attached
    vif=vif_id, object_uuid=port_like_obj.uuid)

VifAlreadyAttached: Unable to attach VIF because VIF d54874a8-1eb7-4609-b940-e39d4c7ad5c7 is already attached to Ironic Port 60bbaa52-4ad9-48c7-b466-1a8cd67972dc
 format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222


I'd like to see if the Neutron team can take a look at the Neutron errors in Comment 6. It looks like Ironic attempted to detach the port but the detach failed in Neutron. This may be an issue that has already been resolved, although I could not find the exact signature in the bug list.
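
To illustrate why a leftover vif_port_id blocks the next deployment, here is a simplified model of the check that raises in the traceback above. It is not the Ironic source: ports are plain dicts, `VifAlreadyAttached` is a local stand-in for the Ironic exception of the same name, and the sketch assumes the check looks at extra['vif_port_id'].

```python
class VifAlreadyAttached(Exception):
    """Local stand-in for the Ironic exception seen in the traceback."""


def vif_attached(port, vif_id):
    """Return True if any VIF is recorded on the port.

    Raises VifAlreadyAttached when the recorded VIF is the one being attached,
    which is the situation a stale vif_port_id produces.
    """
    # Assumption: the recorded VIF ID is kept in port['extra']['vif_port_id'].
    attached = port.get("extra", {}).get("vif_port_id")
    if attached == vif_id:
        raise VifAlreadyAttached(
            "Unable to attach VIF because VIF %s is already attached to "
            "Ironic Port %s" % (vif_id, port["uuid"])
        )
    return attached is not None


VIF_ID = "d54874a8-1eb7-4609-b940-e39d4c7ad5c7"
PORT_UUID = "60bbaa52-4ad9-48c7-b466-1a8cd67972dc"

stale = {"uuid": PORT_UUID, "extra": {"vif_port_id": VIF_ID}}
clean = {"uuid": PORT_UUID, "extra": {}}

if __name__ == "__main__":
    print(vif_attached(clean, VIF_ID))  # a cleaned port counts as free
    try:
        vif_attached(stale, VIF_ID)
    except VifAlreadyAttached as exc:
        print(exc)
```

In this model the port stays unusable for the same VIF until the stale key is removed, which matches the workaround in Comment 1.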

Comment 8 Assaf Muller 2017-10-11 13:42:04 UTC
Hi Raviv,

The sosreport shows 0 bytes for var/log/neutron/server.log. We need either a complete sosreport with all Neutron logs or access to a reproducing machine.

Comment 11 Jakub Libosvar 2017-10-25 14:02:25 UTC
As we weren't provided with any logs and we have nobody to ask for them, I'm closing this bug. If somebody is still testing this scenario, please feel free to re-open.

