Bug 1597236
| Summary: | [Netvirt] Tempest tests fail indicating FIP connectivity problems, vpnid=-1 | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Waldemar Znoinski <wznoinsk> |
| Component: | opendaylight | Assignee: | Vishal Thapar <vthapar> |
| Status: | CLOSED ERRATA | QA Contact: | Waldemar Znoinski <wznoinsk> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 13.0 (Queens) | CC: | asuryana, joflynn, jschluet, mkolesni, nyechiel, vthapar, wznoinsk |
| Target Milestone: | z2 | Keywords: | Triaged, ZStream |
| Target Release: | 13.0 (Queens) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | Netvirt | | |
| Fixed In Version: | opendaylight-8.3.0-2.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | N/A |
| Last Closed: | 2018-08-29 16:20:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | Attachments: | |
Description
Waldemar Znoinski
2018-07-02 10:29:52 UTC
Created attachment 1455932
sosreport and odltools output from the moment the cold_migration test failed (the resources had not been cleaned up yet)
There are some false alarm logs for routers that don't exist, likely https://bugzilla.redhat.com/show_bug.cgi?id=1519783

These are the different SNAT default flows for this external router:

Table:21, Host:host-192-168-24-6.localdomain, DpnId:497381571937/0x73ce407d61, FlowId:DefaultFibRouteForSNATSNAT.497381571937.21.100184,VpnId:100184/0x30eb0,Reason:None
Table:21, Host:host-192-168-24-7.localdomain, DpnId:127745018911474/0x742ef4795af2, FlowId:DefaultFibRouteForSNATSNAT.127745018911474.21.-1,VpnId:8388607/0xfffffe,Reason:VpnInstance for VpnId not found
Table:21, Host:host-192-168-24-17.localdomain, DpnId:123382746230149/0x703748c2ad85, FlowId:DefaultFibRouteForSNATSNAT.123382746230149.21.100184,VpnId:100184/0x30eb0,Reason:None
Table:21, Host:host-192-168-24-11.localdomain, DpnId:154348852276935/0x8c612482eec7, FlowId:DefaultFibRouteForSNATSNAT.154348852276935.21.100184,VpnId:100184/0x30eb0,Reason:None

Note that the other 3 OVS switches got their flows correctly; it failed on only one of them. That one is likely the selected NAPT switch. Need to confirm from some other models which switch is the selected NAPT switch.
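A quick way to spot the odd switch out in a flow dump like the one above is to split each FlowId into its parts and flag any negative VPN id. This is only a sketch: `parse_flow_id` and `flows_with_invalid_vpn_id` are hypothetical helpers (not part of odltools), and the regex assumes the `<prefix>.<dpnId>.<table>.<vpnId>` shape seen in the four table 21 entries.

```python
import re

# Assumed FlowId layout, taken from the table 21 entries above:
#   <prefix>.<dpnId>.<table>.<vpnId>
FLOW_ID_RE = re.compile(
    r"^(?P<prefix>[A-Za-z]+)\.(?P<dpn>\d+)\.(?P<table>\d+)\.(?P<vpn>-?\d+)$"
)


def parse_flow_id(flow_id):
    """Split a flow id into prefix, DPN id, table and VPN id (hypothetical helper)."""
    match = FLOW_ID_RE.match(flow_id)
    if match is None:
        raise ValueError(f"unrecognised flow id: {flow_id!r}")
    return {
        "prefix": match.group("prefix"),
        "dpn_id": int(match.group("dpn")),
        "table": int(match.group("table")),
        "vpn_id": int(match.group("vpn")),
    }


def flows_with_invalid_vpn_id(flow_ids):
    """Return the flow ids whose VPN id component is negative (e.g. -1)."""
    return [f for f in flow_ids if parse_flow_id(f)["vpn_id"] < 0]


# Flow ids copied from the dump above.
flow_ids = [
    "DefaultFibRouteForSNATSNAT.497381571937.21.100184",
    "DefaultFibRouteForSNATSNAT.127745018911474.21.-1",
    "DefaultFibRouteForSNATSNAT.123382746230149.21.100184",
    "DefaultFibRouteForSNATSNAT.154348852276935.21.100184",
]

bad = flows_with_invalid_vpn_id(flow_ids)
for flow_id in bad:
    parts = parse_flow_id(flow_id)
    print(f"DPN {parts['dpn_id']} ({hex(parts['dpn_id'])}), "
          f"table {parts['table']}: vpn id {parts['vpn_id']}")
```

Flow ids with a different shape raise `ValueError` here instead of being silently skipped, so an unexpected format is noticed.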
Confirmed this is indeed the selected NAPT switch:

{
  "napt-switches": {
    "router-to-napt-switch": [
      {
        "router-name": "75875187-4dd1-4211-b62f-aea702df4f54",
        "primary-switch-id": 127745018911474
      }
    ]
  }
}

Router ID also matches the one in logs:

2018-06-29T02:13:39,407 | WARN | org.opendaylight.yang.gen.v1.urn.opendaylight.neutron.ports.rev150712.ports.attributes.ports.Port_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | NeutronPortChangeListener | 360 - org.opendaylight.netvirt.neutronvpn-impl - 0.6.3.redhat-1 | No router found for router GW port a7494061-ff84-49a3-a898-e7ea5ee0238b for router 75875187-4dd1-4211-b62f-aea702df4f54
2018-06-29T02:13:39,518 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.neutron.l3.rev150712.routers.attributes.routers.Router_AsyncClusteredDataTreeChangeListenerBase-DataTreeChangeHandler-0 | NeutronRouterChangeListener | 354 - org.opendaylight.netvirt.ipv6service-impl - 0.6.3.redhat-1 | Add Router notification handler is invoked Uuid [_value=75875187-4dd1-4211-b62f-aea702df4f54].
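The same cross-check can be scripted against a saved copy of the model. A minimal sketch: it assumes the napt-switches JSON has already been dumped to a string, and `primary_napt_switch` is a hypothetical helper name, not an existing tool.

```python
import json

# The napt-switches model exactly as quoted above.
napt_model = json.loads("""
{ "napt-switches": { "router-to-napt-switch": [
    { "router-name": "75875187-4dd1-4211-b62f-aea702df4f54",
      "primary-switch-id": 127745018911474 } ] } }
""")


def primary_napt_switch(model, router_name):
    """Return the primary NAPT switch DPN id for a router, or None if unknown."""
    entries = model.get("napt-switches", {}).get("router-to-napt-switch", [])
    for entry in entries:
        if entry.get("router-name") == router_name:
            return entry.get("primary-switch-id")
    return None


router = "75875187-4dd1-4211-b62f-aea702df4f54"
failing_dpn = 127745018911474  # DPN whose table 21 flow carries vpn id -1

is_napt_switch = primary_napt_switch(napt_model, router) == failing_dpn
print(f"failing DPN is the primary NAPT switch: {is_napt_switch}")
```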
But the NAPT code does show it has the correct routerId:

2018-06-29T02:13:39,550 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | installSnatSpecificEntriesForNaptSwitch: called for router 75875187-4dd1-4211-b62f-aea702df4f54
2018-06-29T02:13:39,550 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | installSnatSpecificEntriesForNaptSwitch : called for the primary NAPT switch dpnId 127745018911474
2018-06-29T02:13:39,551 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | installTerminatingServiceTblEntry : creating entry for Terminating Service Table for switch 127745018911474, routerId 100184
2018-06-29T02:13:39,551 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | createOutboundTblTrackEntry : called for switch 127745018911474, routerId 100184
2018-06-29T02:13:39,552 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | createOutboundTblEntry : dpId 127745018911474 and routerId 100184
2018-06-29T02:13:39,553 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | installNaptPfibFlow : dpId 127745018911474, extNetId 100000
2018-06-29T02:13:39,553 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | installInboundEntry : dpId 127745018911474 and routerId 100184
2018-06-29T02:13:39,554 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | ConntrackBasedSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | installNaptPfibEntry : called for dpnId 127745018911474 and routerId 100184

And found the smoking gun:

2018-06-29T02:13:39,549 | INFO | org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.natservice.rev160111.napt.switches.RouterToNaptSwitch_AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0 | AbstractSnatService | 358 - org.opendaylight.netvirt.natservice-impl - 0.6.3.redhat-1 | installInboundTerminatingServiceTblEntry : creating entry for Terminating Service Table for switch 127745018911474, routerId -1

This is the flow for this one, note the -1 in the flowId:

Table:36, Host:host-192-168-24-7.localdomain, DpnId:127745018911474/0x742ef4795af2, FlowId:SNAT.127745018911474.36.-1INBOUND,MplsLabel:100001,Reason:None
Flow: {"barrier": false, "flow-name": "SNAT.127745018911474.36.-1INBOUND", "idle-timeout": 0, "installHw": true, "priority": 42, "strict": false, "table_id": 36, "id": "SNAT.127745018911474.36.-1INBOUND", "cookie": "0x8000006", "hard-timeout": 0, "match": {"tunnel": {"tunnel-id": 100001}, "ethernet-match": {"ethernet-type": {"type": 2048}}}, "instructions": {"instruction": [{"order": 0, "apply-actions": {"action": [{"order": 0, "openflowplugin-extension-nicira-action:nx-reg-load": {"dst": {"of-metadata": [null], "start": 0, "end": 23}, "value": "0x30d42"}}, {"order": 1, "openflowplugin-extension-nicira-action:nx-resubmit": {"table": 44}}]}}]}}

Are we checking for the presence of the value in the neutron model [1]? NatUtil tries to retrieve it from a vpnservice model [2]; shouldn't that create an issue?

[1] https://github.com/opendaylight/netvirt/blob/stable/oxygen/neutronvpn/impl/src/main/java/org/opendaylight/netvirt/neutronvpn/NeutronvpnUtils.java#L37
[2] https://github.com/opendaylight/netvirt/blob/stable/oxygen/natservice/impl/src/main/java/org/opendaylight/netvirt/natservice/internal/NatUtil.java#L258

When we retrieve it in AbstractSnatService it seems to be -1, but a while later, when we retrieve it in ConntrackBasedSnatService, it seems to have a valid value.

(In reply to Aswin Suryanarayanan from comment #5)
> we are checking for the presence of the value in neutron model [1]? nat util
> tries to retrieve it from a vpnservice model, shouldn't that create an issue?
>
> [1]https://github.com/opendaylight/netvirt/blob/stable/oxygen/neutronvpn/impl/src/main/java/org/opendaylight/netvirt/neutronvpn/NeutronvpnUtils.java#L37,
>
> [2]https://github.com/opendaylight/netvirt/blob/stable/oxygen/natservice/impl/src/main/java/org/opendaylight/netvirt/natservice/internal/NatUtil.java#L258
>
> when we retrieve it in AbstractSnatService it seems to be -1 , but a while
> later when we retrieve it in ConntrackBasedSnatService it seems to have a
> valid value

Do they retrieve it from different places? One from the neutron model, the other from vpn? BTW, the link in [1] is pointing to an import statement.

Oh, copy-paste error. Yes, at first look it seems to be a different model.
[1] https://github.com/opendaylight/netvirt/blob/stable/oxygen/neutronvpn/impl/src/main/java/org/opendaylight/netvirt/neutronvpn/NeutronvpnUtils.java#L371

If routerId is null, we're not doing anything, even if routerId becomes available later. So the fix is to wait for routerId to be available and, once it is, process it.

*** Bug 1609334 has been marked as a duplicate of this bug. ***

This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible.

If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".

To add draft documentation text:
* Select the documentation type from the "Doc Type" drop down field.
* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

Checked the failures from the last 2 weeks in downstream OSP13 CI (using the latest opendaylight RPM): no sign of this problem, nor of the vpnId messages.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:2598

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days
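For readers landing here later: the fix idea stated earlier in this thread (wait for routerId to become available, then process it) can be illustrated roughly as below. This is a language-neutral sketch in Python, not the actual netvirt change. `DeferredSnatInstaller`, its method names and the callbacks are all hypothetical, and -1 stands in for the invalid id seen in the logs.

```python
INVALID_ID = -1  # sentinel seen in the logs when the router id is not yet known


class DeferredSnatInstaller:
    """Hypothetical illustration: defer flow installation until the id exists."""

    def __init__(self, lookup_router_id, install_flows):
        self._lookup = lookup_router_id  # returns an int id, None or -1
        self._install = install_flows    # called as install_flows(name, id)
        self._pending = set()            # routers waiting for a valid id

    def on_napt_switch_selected(self, router_name):
        """Install flows now if the id is valid, otherwise remember the router."""
        router_id = self._lookup(router_name)
        if router_id is None or router_id == INVALID_ID:
            self._pending.add(router_name)
            return False
        self._pending.discard(router_name)
        self._install(router_name, router_id)
        return True

    def on_router_id_available(self, router_name):
        """Retry a deferred router once its id has been written."""
        if router_name not in self._pending:
            return False
        return self.on_napt_switch_selected(router_name)


# Usage: the first attempt happens before the id exists, the retry succeeds.
router_ids = {}
installed = []
installer = DeferredSnatInstaller(
    router_ids.get, lambda name, rid: installed.append((name, rid))
)
router = "75875187-4dd1-4211-b62f-aea702df4f54"

first_attempt = installer.on_napt_switch_selected(router)  # id not known yet
router_ids[router] = 100184                                # id becomes available
retry = installer.on_router_id_available(router)
print(first_attempt, retry, installed)
```

The point of the pattern is that no flow is ever written with the -1 placeholder: the work is parked and replayed once, instead of being dropped or installed with a bogus id.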