Bug 1583976
| Summary: | Contrail: Support for qemu reconnect in client mode of operation | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Jeya ganesh babu J <jjeya> |
| Component: | openstack-nova | Assignee: | smooney |
| Status: | CLOSED NOTABUG | QA Contact: | OSP DFG:Compute <osp-dfg-compute> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 14.0 (Rocky) | CC: | cylopez, dasmith, egallen, eglynn, jhakimra, jjeya, kchamart, knoel, lyarwood, mbooth, mprivozn, mrussell, sbauza, sferdjao, sgordon, smooney, srevivo, virt-maint, vromanso |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-05-17 14:09:11 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
|
Description
Jeya ganesh babu J
2018-05-30 06:03:36 UTC
I don't see anything specific to OpenStack here. If you want to request changes in 'qemu-kvm-rhev', the Product should be Red Hat Enterprise Linux. So I've changed the bug 'Product'.
Based on the QEMU patch shared, I imagine the request is for Nova to configure the vhostuser interfaces with reconnect enabled [0]. Could you confirm that?
[0] https://libvirt.org/formatdomain.html#elementVhostuser
Yes, ideally it should be done from Nova. But I am not sure if there is a way to set this from Nova.
Michal, can you have a look? Should the reconnect config mentioned above (comment#3) be done by libvirt or Nova?
It seems to be supported in libvirt already: https://libvirt.org/formatdomain.html#elementVhostuser The question is what a good value would be, or whether we have to let operators decide.
(In reply to Amnon Ilan from comment #5)
> Yes, libvirt already supports reconnect (from version 4.1.0 onwards). So the only part that is missing is Nova putting the attribute in the domain XML.
(In reply to Sahid Ferdjaoui from comment #6)
> I think the value should be on the order of a few seconds, with ten seconds being the upper limit. A restart of the vhostuser backend does not take too long, and reconnect basically tells qemu how long to wait between each connect retry. In other words, it's not like qemu gives up reconnecting after the first failed attempt.
(In reply to Kashyap Chamarthy from comment #1)
> I think this bug should be moved back to OpenStack.
The behaviour described, related to restarting the vhost-user server (the Contrail vrouter in this instance), is the expected behaviour of the vhost-user protocol. vhost-user reconnect is not supported by qemu when qemu is the client. The feature was developed specifically to only work when qemu is the server and dpdk is the client. The reason the connection is broken is that when the vhost-user server restarts, the unix sockets it created are closed, but qemu is still holding an open file descriptor to the socket.
When the backend restarts, it creates new unix sockets at the same file paths with different file descriptors. qemu will not detect this and reconnect when it is in client mode. The decision not to support this when qemu is the client was made between the dpdk and qemu communities to simplify both codebases and converge on a common deployment configuration where the lifetime of the unix socket is tied to the lifetime of the vm, by making qemu the server and dpdk the client. From a nova/neutron design perspective, nova requires neutron to pass the vhost-user socket mode as part of the vif binding details from the neutron ml2 driver. Introducing a nova config option for enabling retry violates several previous decisions about not adding networking options to nova. As such, I do not believe this is a bug but rather a misconfiguration of the vrouter. If you wish to use reconnect, you should set the vhost-user mode in the ml2 driver to server, to indicate that qemu is the server, and configure the vrouter as the client.
Sean, I think the context of the request is vhu server, so QEMU in client. QEMU is now providing a reconnect timeout so the socket will not be closed. That looks reasonable for vswitches other than OVS that use vhu server (QEMU in client). That has been accepted upstream and it's exposed by libvirt [0]. It looks reasonable to provide such a tunable in Nova. Or do you see something not right here?
[0] https://www.redhat.com/archives/libvir-list/2017-September/msg00180.html
Allowing the use of the feature is reasonable. Providing a nova config option for it is not. It would be reasonable to add the xml generation code to nova, but enabling it needs to be done via neutron.
*** Bug 1608531 has been marked as a duplicate of this bug. ***
As per the bug triage call, I have closed https://bugzilla.redhat.com/show_bug.cgi?id=1608531 as a duplicate and we will use this BZ to track the delivery of this feature.
As also noted, this has expressly been requested for backport to osp 13, which is the version on which the customer is currently deployed. As we cannot determine how feasible a backport is at this time, it should be reassessed as part of closing this BZ.
(In reply to Sahid Ferdjaoui from comment #10)
> Sean I think the context of the request is vhu server so QEMU in client.
>
> QEMU is now providing a reconnect timeout so the socket will not be closed.
> That looks reasonable for other vswitch than OVS which are using vhu server
> (QEMU in client). That has been accepted upstream and it's exposed by
> libvirt [0]. That looks reasonable to provide such tunable in Nova. Or do
> you see something not right here?
>
> [0] https://www.redhat.com/archives/libvir-list/2017-September/msg00180.html
Trying to be practical here: what is the technical need/justification for this? Why can't we just set the contrail vrouter as a vhost-user client and QEMU as a vhost-user server, the same way OVS/DPDK and QEMU have been re-designed? Why are we trying to fix something that we've already decided to move away from? See Sean's c#9.
As of contrail version 5.0 (https://github.com/Juniper/contrail-controller/releases/tag/r5.0), released in July 2018, the contrail controller no longer uses vhost-user client mode and exclusively uses vhost-user server mode, where qemu is the server and vrouter is the client.
https://github.com/Juniper/contrail-controller/blob/R5.0/src/config/api-server/vnc_cfg_api_server/vnc_cfg_types.py#L2239-L2242
https://github.com/Juniper/contrail-controller/blob/R5.0/src/config/api-server/vnc_cfg_api_server/vnc_cfg_types.py#L2266-L2271
https://github.com/Juniper/contrail-controller/blob/R5.0/src/config/api-server/vnc_cfg_api_server/vnc_cfg_types.py#L2425-L2431
https://github.com/Juniper/contrail-controller/blob/R5.0/src/config/api-server/vnc_cfg_api_server/vnc_cfg_types.py#L2443-L2448
Based on https://www.juniper.net/documentation/en_US/contrail5.0/topics/concept/Deploying-Contrail-with-RedHatOpenStack.html#jd0e214, the supported version of contrail with osp 13 was 5.0.1. As such, this is not required: contrail 4.1 is not supported with osp 13 or later, and features can no longer be backported to osp10, so I am closing as "not a bug". |
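For readers who want to see what the "xml generation code" discussed in the thread might roughly produce, here is a minimal Python sketch that builds a libvirt `<interface type='vhostuser'>` element with an optional `<reconnect>` child under `<source>`, following the element layout documented at https://libvirt.org/formatdomain.html#elementVhostuser. It is not Nova's code: the function name `vhostuser_interface_xml`, its parameters, and the socket path in the usage example are hypothetical, and the check that rejects a reconnect timeout outside client mode is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET


def vhostuser_interface_xml(socket_path, mode="client", reconnect_timeout=None):
    """Return a libvirt vhostuser <interface> element as a string.

    Hypothetical helper for illustration only; it is not part of Nova
    or libvirt. `mode` is the role QEMU plays on the unix socket.
    """
    iface = ET.Element("interface", {"type": "vhostuser"})
    source = ET.SubElement(
        iface, "source", {"type": "unix", "path": socket_path, "mode": mode}
    )
    if reconnect_timeout is not None:
        # Assumption in this sketch: the retry interval only makes sense
        # when QEMU is the client connecting to the backend's socket.
        if mode != "client":
            raise ValueError("reconnect is only emitted for mode='client'")
        # timeout is the number of seconds between reconnect attempts.
        ET.SubElement(
            source,
            "reconnect",
            {"enabled": "yes", "timeout": str(reconnect_timeout)},
        )
    ET.SubElement(iface, "model", {"type": "virtio"})
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical socket path; ten seconds is the upper limit suggested
    # in the thread for the retry interval.
    print(vhostuser_interface_xml("/var/run/vrouter/uvh_vif_tap0",
                                  reconnect_timeout=10))
```

In line with the outcome of the thread, the value of `mode` would come from the Neutron VIF binding details rather than from a Nova config option, and with Contrail 5.0 and later (QEMU as server, vrouter as client) the `<reconnect>` element would not be emitted at all.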