Bug 1561870
| Summary: | OSP10: Support for dpdkvhostuserclient mode | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Saravanan KR <skramaja> |
| Component: | openstack-nova | Assignee: | Sahid Ferdjaoui <sferdjao> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | OSP DFG:Compute <osp-dfg-compute> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 10.0 (Newton) | CC: | amuller, atelang, berrange, chrisw, dasmith, eglynn, fbaudin, jhakimra, kchamart, nyechiel, samccann, sbauza, sferdjao, sgordon, srevivo, tfreger, vchundur, vromanso, yrachman |
| Target Milestone: | async | Keywords: | TestOnly, Triaged, ZStream |
| Target Release: | 10.0 (Newton) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-nova-14.0.3-2.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1557850 | Environment: | |
| Last Closed: | 2018-06-27 17:23:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1557850, 1568355, 1568356, 1572510 | | |
| Bug Blocks: | 1561869 | | |
| Attachments: | | | |
Comment 1
Sahid Ferdjaoui
2018-03-29 07:04:37 UTC
While validating the changes in the OSP10 puddle, an ERROR log appears after the following changes:
* Deploy OSP10 puddle 2018-04-02.1
* Create a VM with a DPDK network
* Integrate all 3 patches on the controller and compute nodes
* On the compute node (see the sketch after this list):
  mkdir /var/lib/vhost_sockets
  chown qemu:qemu /var/lib/vhost_sockets
  Modified /etc/neutron/plugins/ml2/openvswitch_agent.ini:
  vhostuser_socket_dir = /var/lib/vhost_sockets
* Restart the neutron and nova services
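For reference, the compute-node preparation above as a single shell sketch; the [ovs] section name, the SELinux note and the service unit names are assumptions, only the paths and ownership come from this report:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create the dedicated vhost-user socket directory (path and ownership as above)
mkdir -p /var/lib/vhost_sockets
chown qemu:qemu /var/lib/vhost_sockets
# On an SELinux-enforcing host an additional file context change may also be
# needed; that is not covered in this report.

# Point the neutron OVS agent at the new directory; the option is in
# /etc/neutron/plugins/ml2/openvswitch_agent.ini (section assumed to be [ovs]):
#   [ovs]
#   vhostuser_socket_dir = /var/lib/vhost_sockets

# Restart the agents so the change takes effect (unit names assumed)
systemctl restart neutron-openvswitch-agent openstack-nova-compute
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~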
Looking at the log files, the following ERROR shows up in nova-compute.log after the services are restarted (sosreport attached):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [req-abbb0aaf-9c95-4622-9ea4-9ab81e62c351 - - - - -] [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] An error occurred while refreshing the network cache.
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] Traceback (most recent call last):
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5814, in _heal_instance_info_cache
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] self.network_api.get_instance_nw_info(context, instance)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/base_api.py", line 249, in get_instance_nw_info
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] result = self._get_instance_nw_info(context, instance, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 1300, in _get_instance_nw_info
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] preexisting_port_ids)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2220, in _build_network_info_model
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] data = client.list_ports(**search_opts)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 110, in wrapper
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] ret = obj(*args, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 743, in list_ports
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] **_params)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 110, in wrapper
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] ret = obj(*args, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 376, in list
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] for r in self._pagination(collection, path, **params):
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 391, in _pagination
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] res = self.get(path, params=params)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 110, in wrapper
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] ret = obj(*args, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 361, in get
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] headers=headers, params=params)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 110, in wrapper
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] ret = obj(*args, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 338, in retry_request
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] headers=headers, params=params)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 110, in wrapper
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] ret = obj(*args, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 289, in do_request
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] resp, replybody = self.httpclient.do_request(action, method, body=body)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/client.py", line 311, in do_request
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] return self.request(url, method, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/neutronclient/client.py", line 299, in request
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] resp = super(SessionClient, self).request(*args, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 112, in request
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] return self.session.request(url, method, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/positional/__init__.py", line 101, in inner
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] return wrapped(*args, **kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 579, in request
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] resp = send(**kwargs)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 620, in _send_request
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] raise exceptions.ConnectTimeout(msg)
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e] ConnectTimeout: Request to http://10.150.81.25:9696/v2.0/ports.json?tenant_id=6c5e5823e570438b9d770eb0bc097418&device_id=4c6cc3fd-096a-416d-9f55-a7ce093d3a0e timed out
2018-04-03 15:00:19.441 102408 ERROR nova.compute.manager [instance: 4c6cc3fd-096a-416d-9f55-a7ce093d3a0e]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Created attachment 1417082 [details]
sosreport to analyze error log after applying vhostuserclient patches
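The traceback itself is a neutron API connect timeout rather than an explicit vhost-user error. A quick reachability sketch (endpoint taken from the traceback, to be run on the compute node) could help narrow that down:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Check whether the neutron endpoint from the traceback is reachable at all
# from this compute node; -m 10 caps the wait at 10 seconds.
curl -sS -m 10 http://10.150.81.25:9696/ ; echo
# A timeout here as well would point at connectivity toward the controller
# rather than at the vhost-user socket changes themselves.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~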
Setup: 1 controller and 2 compute nodes
* Deployed OSP10 with OVS in server mode (dpdkvhostuser)
* Created a VM on compute-1
* Executed a minor update with the neutron and nova changes required for dpdkvhostuserclient mode and the other TripleO change required for OVS 2.9
* Rebooted compute-0 after the update completed
* Created a new VM; it came up successfully on compute-0 in dpdkvhostuserclient mode
* Migrated the existing VM from compute-1 to compute-0
* Migration failed
* The neutron port is updated with the new socket directory path
* The virsh XML is created with the old directory, whereas OVS is listening on the new directory
* Migration hangs (see the diagnostic sketch below)

Created attachment 1425799 [details]
Migration Hang - compute-0 (migration target)

Created attachment 1425800 [details]
Migration Hang - compute-1 (migration source)
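A hedged diagnostic sketch for the hang: compare the socket path OVS expects for the port with the path the migrated domain was defined with on the target. The port and instance names are placeholders, not values from this report (take them from `ovs-vsctl show` and `virsh list` on compute-0):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Placeholders
PORT=vhuXXXXXXXX-XX
DOMAIN=instance-000000XX

# Socket path OVS associates with the port (vhost-server-path is set when the
# port is of type dpdkvhostuserclient)
ovs-vsctl get Interface "$PORT" type options:vhost-server-path

# Socket path in the domain definition on the migration target
virsh dumpxml "$DOMAIN" | grep "source type='unix'"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~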
This time, instead of changing the vhost socket directory to the new "/var/lib/vhost_sockets" as in comment #6, I tried the migration with the same directory "/var/run/openvswitch". Here is the behavior with 1 controller + 2 computes and 1 VM on each compute.

scenario 1:
-----------
* compute-0 on OVS 2.9 after reboot, compute-1 on OVS 2.6 without reboot
* Migrate the VM from compute-1 to compute-0
* Migration is successful; the VM is migrated to compute-0 in the same mode (dpdkvhostuser)

scenario 2:
-----------
* After the above step, reboot compute-1; compute-1 now comes up with OVS 2.9
* Migrate the same VM from compute-0 to compute-1
* Migration fails: OVS creates the port in dpdkvhostuserclient mode, but qemu is also in client mode (see the role-check sketch at the end of this report)

---------------------------------
Apr 26 09:59:31 overcloud-compute-1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -- --may-exist add-br br-int -- set Bridge br-int datapath_type=netdev
Apr 26 09:59:31 overcloud-compute-1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=120 -- --may-exist add-port br-int vhu41649dbf-6c -- set Interface vhu41649dbf-6c external-ids:iface-id=41649dbf-6cc1-4779-a9c4-fbeff1f2f4b0 external-ids:iface-status=active external-ids:attached-mac=fa:16:3e:8e:dd:59 external-ids:vm-uuid=930eee66-589b-4a60-92f8-ff8f7197e8d7 type=dpdkvhostuserclient options:vhost-server-path=/var/run/openvswitch/vhu41649dbf-6c
Apr 26 09:59:31 overcloud-compute-1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=120 -- set interface vhu41649dbf-6c mtu_request=1496
Apr 26 09:59:33 overcloud-compute-1 nova_migration_wrapper: Allowing connection='10.150.81.23 51804 10.150.81.27 22' command=['nc', '-U', '/var/run/libvirt/libvirt-sock']
Apr 26 09:59:33 overcloud-compute-1 systemd-machined: New machine qemu-3-instance-00000001.
Apr 26 09:59:33 overcloud-compute-1 systemd: Started Virtual Machine qemu-3-instance-00000001.
Apr 26 09:59:33 overcloud-compute-1 systemd: Starting Virtual Machine qemu-3-instance-00000001.
Apr 26 09:59:33 overcloud-compute-1 libvirtd: 2018-04-26 09:59:33.165+0000: 3567: error : qemuMonitorIORead:595 : Unable to read from monitor: Connection reset by peer
Apr 26 09:59:33 overcloud-compute-1 libvirtd: 2018-04-26 09:59:33.165+0000: 3567: error : qemuProcessReportLogError:1912 : internal error: qemu unexpectedly closed the monitor: 2018-04-26T09:59:33.157557Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/openvswitch/vhu41649dbf-6c: Failed to connect socket /var/run/openvswitch/vhu41649dbf-6c: No such file or directory
Apr 26 09:59:33 overcloud-compute-1 systemd-machined: Machine qemu-3-instance-00000001 terminated.
------------------------

A sosreport will be attached for this scenario.

Created attachment 1427124 [details]
Migration fail (comment #9) compute-1 migration target

Created attachment 1427127 [details]
Migration fail (comment #9) compute-0 migration source

According to our records, this should be resolved by openstack-nova-14.1.0-3.el7ost. This build is available now.

Live migration without CPU pinning is working; see BZ https://bugzilla.redhat.com/show_bug.cgi?id=1542107
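For completeness, a hedged sketch of how the role mismatch in scenario 2 can be confirmed on the target compute node. The port and instance names are taken from the log above; the expectation that the guest side must serve the socket for a dpdkvhostuserclient port is an assumption about the intended behavior, not something verified in this report:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# With type=dpdkvhostuserclient, ovs-vswitchd acts as the vhost-user client
# and expects the guest side (qemu) to create and serve the socket at
# vhost-server-path.
ovs-vsctl get Interface vhu41649dbf-6c type options:vhost-server-path

# Check which role libvirt gave qemu for that socket; a client-mode source
# here matches the "Failed to connect socket ... No such file or directory"
# error above, since then neither side creates the socket.
virsh dumpxml instance-00000001 | grep -B1 -A2 vhostuser
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~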