Bug 1612052 - Failed to detach interface from VM in RHOS
Summary: Failed to detach interface from VM in RHOS
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: RHOS Maint
QA Contact: Gurenko Alex
URL:
Whiteboard: libvirt_OSP_INT
Depends On:
Blocks:
 
Reported: 2018-08-03 11:03 UTC by chhu
Modified: 2020-04-10 03:41 UTC (History)
27 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-18 07:16:13 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirtd log, nova-compute log, guest log (2.52 MB, application/x-gzip), 2018-08-03 11:08 UTC, chhu
bug1612052-log (11.16 MB, application/x-gzip), 2018-08-27 08:37 UTC, chhu
sosreport.0 (15.00 MB, application/x-gzip), 2018-09-18 02:18 UTC, chhu
sosreport.1 (10.12 MB, application/octet-stream), 2018-09-18 02:20 UTC, chhu

Description chhu 2018-08-03 11:03:00 UTC
Description of problem:
Failed to detach interface from VM in rhos

Version-Release number of selected component (if applicable):
rhel7.6:
libvirt-4.5.0-3.el7.x86_64
qemu-kvm-rhev-2.12.0-7.el7.x86_64
with rhos13: openstack-nova-compute-17.0.3-0.20180420001141.el7ost.noarch

rhel7.5.z:
libvirt-3.9.0-14.el7_5.7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64
with rhos10: openstack-nova-compute-14.1.0-22.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Start a VM in rhos
2. Attach an interface to the VM successfully, log in to the guest, and check that there are two interfaces:
# openstack server list
+--------------------------------------+-----------------+--------+---------------------------------------+------------+
| ID                                   | Name            | Status | Networks                              | Image Name |
+--------------------------------------+-----------------+--------+---------------------------------------+------------+
| 70834bed-af4b-4382-bed3-b633d7873295 | vm-r7-qcow2     | ACTIVE | net2=192.168.28.3; net1=192.168.32.11 | 7.5-qcow2  |
+--------------------------------------+-----------------+--------+---------------------------------------+------------+

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 9     instance-00000009              running

# virsh dumpxml instance-00000009|grep interface -A 8
    <interface type='bridge'>
      <mac address='fa:16:3e:53:b9:fa'/>
      <source bridge='qbr860d144d-8d'/>
      <target dev='tap860d144d-8d'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:f0:b1:d1'/>
      <source bridge='qbr6ce3f876-90'/>
      <target dev='tap6ce3f876-90'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>

3. Detach one interface. The nova command returns without error, but the interface is not detached:
# nova interface-detach vm-r7-qcow2 8a315184-5e40-4a78-9bc7-58e0ba0cecf8

# openstack server list
+--------------------------------------+-----------------+--------+---------------------------------------+------------+
| ID                                   | Name            | Status | Networks                              | Image Name |
+--------------------------------------+-----------------+--------+---------------------------------------+------------+
| 70834bed-af4b-4382-bed3-b633d7873295 | vm-r7-qcow2     | ACTIVE | net2=192.168.28.3; net1=192.168.32.11 | 7.5-qcow2  |
+--------------------------------------+-----------------+--------+---------------------------------------+------------+
# virsh dumpxml instance-00000009|grep interface -A 8
    <interface type='bridge'>
      <mac address='fa:16:3e:53:b9:fa'/>
      <source bridge='qbr860d144d-8d'/>
      <target dev='tap860d144d-8d'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:f0:b1:d1'/>
      <source bridge='qbr6ce3f876-90'/>
      <target dev='tap6ce3f876-90'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
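Since the detach reports success while the device remains, the live domain XML can be checked programmatically to confirm what libvirt actually sees. A minimal sketch, using the interface XML captured above as sample input (on a real host, substitute the output of `virsh dumpxml instance-00000009`):

```python
import xml.etree.ElementTree as ET

# Sample input: a trimmed copy of the domain XML shown in this bug.
DOMAIN_XML = """
<domain>
  <devices>
    <interface type='bridge'>
      <mac address='fa:16:3e:53:b9:fa'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:f0:b1:d1'/>
    </interface>
  </devices>
</domain>
"""

def interface_macs(domain_xml):
    """Return the MAC addresses of all <interface> devices, in document order."""
    root = ET.fromstring(domain_xml)
    return [mac.get('address') for mac in root.findall('.//interface/mac')]

print(interface_macs(DOMAIN_XML))
# ['fa:16:3e:53:b9:fa', 'fa:16:3e:f0:b1:d1']
# A successful detach would drop 'fa:16:3e:f0:b1:d1' from this list.
```

Note that libvirt device detach is asynchronous, so the MAC may linger briefly even on a healthy detach; here it never disappears at all.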

5. Stop the VM, then detach the interface again; the interface is still present in the VM:
# openstack server stop vm-r7-qcow2
# nova interface-detach vm-r7-qcow2 8a315184-5e40-4a78-9bc7-58e0ba0cecf8
# openstack server list
+--------------------------------------+-----------------+---------+---------------------------------------+------------+
| ID                                   | Name            | Status  | Networks                              | Image Name |
+--------------------------------------+-----------------+---------+---------------------------------------+------------+
| 70834bed-af4b-4382-bed3-b633d7873295 | vm-r7-qcow2     | SHUTOFF | net2=192.168.28.3; net1=192.168.32.11 | 7.5-qcow2  |
+--------------------------------------+-----------------+---------+---------------------------------------+------------+
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     instance-00000009              shut off

# virsh dumpxml instance-00000009| grep interface -A 8
    <interface type='bridge'>
      <mac address='fa:16:3e:53:b9:fa'/>
      <source bridge='qbr860d144d-8d'/>
      <target dev='tap860d144d-8d'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:f0:b1:d1'/>
      <source bridge='qbr6ce3f876-90'/>
      <target dev='tap6ce3f876-90'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>

6. Start the VM, log in to it, and confirm there are still two interfaces.

Actual results:
In steps 4-6: the interface is not detached.

Expected results:
In steps 4-6: the interface is detached.

Additional info:
- libvirtd.log:
  "error : qemuMonitorIO:719 : internal error: End of file from qemu monitor"
- nova.log:
  "ERROR oslo_messaging.rpc.server PortNotFound: Port 8a315184-5e40-4a78-9bc7-58e0ba0cecf8 is not attached"
- qemu/vm.log:
  /var/log/libvirt/qemu/instance-00000009.log

Comment 2 chhu 2018-08-03 11:08:44 UTC
Created attachment 1472954 [details]
libvirtd log, nova-compute log, guest log

Comment 3 yalzhang@redhat.com 2018-08-13 06:25:31 UTC
I cannot reproduce the issue with pure libvirt. The nova-compute.log attached in comment 2 contains the error messages below, so I suggest changing the component to python-nova.
on the problematic system:
# rpm -q python-nova
python-nova-14.1.0-22.el7ost.noarch

check the log in comment 2:
$ grep -i error nova-compute.log 
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server [req-741449d2-45c6-439b-8923-18f88341bde6 4fc5d3e01396461c9feb2c18f2e7f77e 9b9fbbb09a244b808b386dc97dc3b795 - - -] Exception during message handling
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2018-08-03 06:39:49.783 4113 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
.......

Comment 4 melanie witt 2018-08-16 20:58:18 UTC
The error in nova-compute says that the port_id passed to 'nova interface-detach <server> <port_id>' was not found in the instance.info_cache.network_info for the server [1].

Can you share the command line you used to attach the interface to the VM? If you used the 'nova interface-attach' command [2], did you provide a --port-id? If not, did you get the port_id from the 'nova interface-list <server>' command [3]? Can you show the steps you used?

Finally, it would be helpful if you can share the part of the nova-compute.log that shows when you attached the interface.

[1] https://github.com/openstack/nova/blob/df3dd2b5c7f63ca69c9fb5d95ab3c496f729b0d0/nova/compute/manager.py#L5967-L5974
[2] https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-interface-attach
[3] https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-interface-list
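The lookup described above (per the manager.py code in [1]) can be sketched roughly as follows. The helper name and the cache structure are illustrative assumptions, not nova's actual implementation; the point is only that PortNotFound fires when the requested port ID is absent from the instance's cached network info:

```python
# Hypothetical sketch of the PortNotFound check in nova-compute.
def find_port(network_info, port_id):
    """Return the VIF entry whose 'id' matches port_id, or None."""
    for vif in network_info:
        if vif.get('id') == port_id:
            return vif
    return None

# Assumed shape of instance.info_cache.network_info, with the port ID
# and MAC taken from this bug's transcripts.
cache = [{'id': '8a315184-5e40-4a78-9bc7-58e0ba0cecf8',
          'address': 'fa:16:3e:f0:b1:d1'}]

print(find_port(cache, '8a315184-5e40-4a78-9bc7-58e0ba0cecf8') is not None)
# True: the port is in the cache, so detach should proceed.
print(find_port(cache, '00000000-0000-0000-0000-000000000000'))
# None: this is the case that raises PortNotFound in the log above.
```

If the cache is stale (for example, the attach never landed in the info cache), the detach fails this check even though the interface is visibly present in the libvirt XML.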

Comment 5 chhu 2018-08-27 08:31:27 UTC
Tried to reproduce the bug with the latest packages, but hit more serious issues. Details below:


Test on packages:
libvirt-4.5.0-7.el7.x86_64
qemu-kvm-rhev-2.12.0-11.el7.x86_64
openstack-nova-compute-17.0.3-0.20180420001141.el7ost.noarch (rhos13 z1)

Test steps:
1. Installed rhos13 z1 fresh, created two networks, and created one image successfully.
2. Booted an instance from the image with one network successfully:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     instance-00000001              running

# virsh dumpxml instance-00000001|grep interface -A 10
    <interface type='bridge'>
      <mac address='fa:16:3e:67:5f:91'/>
      <source bridge='qbrc55bec8f-02'/>
      <target dev='tapc55bec8f-02'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

3. Tried to attach one network to the instance, but it failed.
4. Tried to check the server list and hit an error:
# openstack server list
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<type 'exceptions.TypeError'> (HTTP 500) (Request-ID: req-b8cbbd84-4aa7-4cd5-984a-6aeddf01f39e)

5. Tried to check the image list; the command hung, so exited with Ctrl+C:
# openstack image list
^CTraceback (most recent call last):
  File "/usr/bin/openstack", line 10, in <module>
    sys.exit(main())
  File "/usr/lib/python2.7/site-packages/openstackclient/shell.py", line 210, in main
    return OpenStackShell().run(argv)
  File "/usr/lib/python2.7/site-packages/osc_lib/shell.py", line 134, in run
    ret_val = super(OpenStackShell, self).run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 279, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/osc_lib/shell.py", line 169, in run_subcommand
    ret_value = super(OpenStackShell, self).run_subcommand(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 393, in run_subcommand
    self.prepare_to_run_command(cmd)
  File "/usr/lib/python2.7/site-packages/openstackclient/shell.py", line 197, in prepare_to_run_command
    return super(OpenStackShell, self).prepare_to_run_command(cmd)
  File "/usr/lib/python2.7/site-packages/osc_lib/shell.py", line 482, in prepare_to_run_command
    self.client_manager.auth_ref
  File "/usr/lib/python2.7/site-packages/openstackclient/common/clientmanager.py", line 99, in auth_ref
    return super(ClientManager, self).auth_ref
  File "/usr/lib/python2.7/site-packages/osc_lib/clientmanager.py", line 256, in auth_ref
    self._auth_ref = self.auth.get_auth_ref(self.session)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 201, in get_auth_ref
    return self._plugin.get_auth_ref(session, **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/v3/base.py", line 177, in get_auth_ref
    authenticated=False, log=False, **rkwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 848, in post
    return self.request(url, 'POST', **kwargs)
  File "/usr/lib/python2.7/site-packages/osc_lib/session.py", line 40, in request
    resp = super(TimingSession, self).request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 698, in request
    resp = send(**kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 756, in _send_request
    resp = self.session.request(method, url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 438, in send
    timeout=timeout
  File "/usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 379, in _make_request
    httplib_response = conn.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
KeyboardInterrupt
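The traceback shows the client blocked indefinitely in socket recv() because no timeout is set on the keystone request. A minimal, self-contained illustration of why a client-side timeout matters here (the unaccepting server below stands in for a wedged API service):

```python
import socket

# A listening socket that never accepts: to a client, the TCP handshake
# completes via the backlog, but no response data ever arrives. This is
# roughly what the hung keystone endpoint looked like to the CLI.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

# With a timeout, the silent server produces a fast, visible error
# instead of an indefinite hang in recv().
cli = socket.create_connection(srv.getsockname(), timeout=0.2)
timed_out = False
try:
    cli.recv(1)  # would block forever without the timeout above
except socket.timeout:
    timed_out = True
print("timed out instead of hanging:", timed_out)

cli.close()
srv.close()
```

This is only an illustration of the failure mode, not a fix; the real remediation was restoring the backend services (see the following comments).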

6. Checked the neutron service status:
# neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+----------------------+----------------------------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type           | host                                   | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+----------------------+----------------------------------------+-------------------+-------+----------------+---------------------------+
| 4c7db43a-c3c5-499e-8b6e-57f5035bfd15 | Loadbalancerv2 agent | dell-per730-65.lab.eng.pek2.redhat.com |                   | :-)   | True           | neutron-lbaasv2-agent     |
| 55ccd28d-e61d-4d40-8a26-46a42e6a3570 | DHCP agent           | dell-per730-65.lab.eng.pek2.redhat.com | nova              | xxx   | True           | neutron-dhcp-agent        |
| 66b528a5-15c9-47bd-a31e-b6710e00311b | Metadata agent       | dell-per730-65.lab.eng.pek2.redhat.com |                   | xxx   | True           | neutron-metadata-agent    |
| 745f1a0a-0e2d-4795-8ad2-751d64c2ba26 | Metering agent       | dell-per730-65.lab.eng.pek2.redhat.com |                   | :-)   | True           | neutron-metering-agent    |
| 96bc9f4e-7a06-4d8b-9b99-ceb0373e7877 | Open vSwitch agent   | dell-per730-65.lab.eng.pek2.redhat.com |                   | :-)   | True           | neutron-openvswitch-agent |
| d4e3f12d-7df4-48c9-9494-7a6964aca53a | L3 agent             | dell-per730-65.lab.eng.pek2.redhat.com | nova              | xxx   | True           | neutron-l3-agent          |
+--------------------------------------+----------------------+----------------------------------------+-------------------+-------+----------------+---------------------------+

Actual results:
In step 3: failed to attach the interface.
In steps 4-6: hit errors.

Expected results:
In step 3: the interface is attached successfully.
In steps 4-6: no errors.

Additional info:
- libvirtd.log
- nova-compute.log
- nova-conductor.log
- openvswitch-agent.log
- server.log

Comment 6 chhu 2018-08-27 08:37:12 UTC
Created attachment 1478890 [details]
bug1612052-log

Comment 7 chhu 2018-08-27 09:57:44 UTC
Hit a similar issue when trying to migrate a VM to another compute node.

The error on the source compute node:
2018-08-27 05:51:59.934 27990 ERROR nova.compute.manager [req-1e925e05-d949-49e0-bf04-67fa668f3227 087b005c097247e59d25082e84bd807a 0dd8465326bd4471b142df7f863b9dcb - default default] [instance: f882cb15-1096-47a0-9498-292a598d11fa] Pre live migration failed at ****: RemoteError: Remote error: RemoteError Remote error: DBError (pymysql.err.InternalError) (23, u'Out of resources when opening file \'/tmp/#sql_6016_0.MAI\' (Errcode: 24 "Too many open files")')
...

Comment 8 chhu 2018-08-28 10:18:17 UTC
After rebooting the host, attaching an interface and live migration both succeed.
As the environment was newly installed and only the operations described in comment 5 and comment 7 were performed (start instance, attach network, live migration), yet the failure brought several services down, I think the "Too many open files" issue should be fixed.
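For diagnosing the EMFILE error from comment 7 (errno 24, "Too many open files"), a Linux-only sketch that compares a process's open descriptor count with its soft limit, read straight from /proc; on the affected host one would point it at the mysqld (or nova/neutron service) PID. The function name is illustrative:

```python
import os

def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process, read from /proc (Linux only)."""
    soft_limit = None
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            # Line format: "Max open files  <soft>  <hard>  files"
            if line.startswith("Max open files"):
                soft_limit = int(line.split()[3])
                break
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft_limit

# Demo on the current process; substitute e.g. the mysqld PID in practice.
print(fd_usage(os.getpid()))
```

A process whose open_fds sits near soft_limit is the likely source of the Errcode 24 failure; raising the limit (ulimit/systemd LimitNOFILE) or finding the descriptor leak would be the follow-up.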

Comment 15 chhu 2018-09-11 02:40:07 UTC
Hi, Michele

I don't have the env to create sosreports now, but I'll reproduce this issue and provide you with the env, thank you!

Comment 16 chhu 2018-09-18 02:18:51 UTC
Created attachment 1484231 [details]
sosreport.0

Comment 17 chhu 2018-09-18 02:20:07 UTC
Created attachment 1484232 [details]
sosreport.1

Comment 18 chhu 2018-09-18 02:22:55 UTC
Hi, Michele


I reproduced the issue and uploaded the sosreports; please take a look, thank you!


Regards,
chhu

