Description of problem:
When running a Browbeat+Rally use case which does the following:

Create a network
Create a subnet
Create a router
Attach router to subnet and public network
Boot VM with floating IP
Ping VM

with concurrency 8 and times set to 50, we see that some VMs remain unpingable even after 300 seconds, and we see a lot of the following exceptions in the karaf logs:

17-09-01 12:19:12,555 | ERROR | Pool-1-worker-28 | DataStoreJobCoordinator | 319 - org.opendaylight.genius.mdsalutil-api - 0.2.2.Carbon | Exception when executing jobEntry: JobEntry{key='FIB-100003-119049511607856-172.21.0.191/32', mainWorker=org.opendaylight.netvirt.fibmanager.VrfEntryListener$$Lambda$655/157999811@5e29fcbe, rollbackWorker=null, retryCount=0, futures=null}
java.lang.IllegalArgumentException: All keys must be specified for class org.opendaylight.yang.gen.v1.urn.opendaylight.netvirt.neutronvpn.rev150602.subnetmaps.SubnetmapKey. Missing key is getId. Supplied key is SubnetmapKey []
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:145)
    at org.opendaylight.yangtools.binding.data.codec.impl.ValueContext.getAndSerialize(ValueContext.java:43)
    at org.opendaylight.yangtools.binding.data.codec.impl.IdentifiableItemCodec.serialize(IdentifiableItemCodec.java:116)
    at org.opendaylight.yangtools.binding.data.codec.impl.IdentifiableItemCodec.serialize(IdentifiableItemCodec.java:30)
    at org.opendaylight.yangtools.binding.data.codec.impl.KeyedListNodeCodecContext.addYangPathArgument(KeyedListNodeCodecContext.java:52)
    at org.opendaylight.yangtools.binding.data.codec.impl.DataObjectCodecContext.bindingPathArgumentChild(DataObjectCodecContext.java:187)
    at org.opendaylight.yangtools.binding.data.codec.impl.BindingCodecContext.getCodecContextNode(BindingCodecContext.java:127)
    at org.opendaylight.yangtools.binding.data.codec.impl.InstanceIdentifierCodec.serialize(InstanceIdentifierCodec.java:29)
    at org.opendaylight.yangtools.binding.data.codec.impl.InstanceIdentifierCodec.serialize(InstanceIdentifierCodec.java:19)
    at org.opendaylight.yangtools.binding.data.codec.impl.BindingNormalizedNodeCodecRegistry.toYangInstanceIdentifier(BindingNormalizedNodeCodecRegistry.java:87)
    at org.opendaylight.controller.md.sal.binding.impl.BindingToNormalizedNodeCodec.toYangInstanceIdentifierBlocking(BindingToNormalizedNodeCodec.java:101)
    at org.opendaylight.controller.md.sal.binding.impl.AbstractForwardedTransaction.doRead(AbstractForwardedTransaction.java:64)
    at org.opendaylight.controller.md.sal.binding.impl.BindingDOMReadTransactionAdapter.read(BindingDOMReadTransactionAdapter.java:31)
    at org.opendaylight.genius.datastoreutils.SingleTransactionDataBroker.syncReadOptional(SingleTransactionDataBroker.java:70)
    at org.opendaylight.genius.mdsalutil.MDSALUtil.read(MDSALUtil.java:564)
    at org.opendaylight.netvirt.fibmanager.FibUtil.getSubnetMap(FibUtil.java:622)
    at org.opendaylight.netvirt.fibmanager.FibUtil.isVxlanNetworkAndInternalRouterVpn(FibUtil.java:704)
    at org.opendaylight.netvirt.fibmanager.FibUtil.enforceVxlanDatapathSemanticsforInternalRouterVpn(FibUtil.java:723)
    at org.opendaylight.netvirt.fibmanager.VrfEntryListener.lambda$checkDeleteLocalFibEntry$8(VrfEntryListener.java:985)
    at org.opendaylight.genius.datastoreutils.DataStoreJobCoordinator$MainTask.run(DataStoreJobCoordinator.java:285)[319:org.opendaylight.genius.mdsalutil-api:0.2.2.Carbon]
    at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)[:1.8.0_141]
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)[:1.8.0_141]
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)[:1.8.0_141]
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)[:1.8.0_141]
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)[:1.8.0_141]

Version-Release number of selected component (if applicable):
ODL Carbon+OSP12
opendaylight/6.2.0-0.1.20170829rel1948.el7.noarch
python-networking-odl-11.0.0-0.20170806093629.2e78dca.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with ODL
2. Run custom Browbeat+Rally plugin which boots vm and pings floatingip
3.

Actual results:
Tracebacks in karaf logs

Expected results:
No tracebacks in karaf

Additional info:
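The "Supplied key is SubnetmapKey []" part of the message suggests the key was built from a subnet id that was null at the time of the read. The following is only a minimal, self-contained sketch of that failure mode and of a defensive guard. It is not the upstream fix, and it does not use the real ODL classes: `SubnetKey`, `STORE`, `serialize`, `readUnguarded` and `readGuarded` are hypothetical stand-ins introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class SubnetmapLookupSketch {

    // Hypothetical stand-in for a keyed-list key; NOT the generated SubnetmapKey class.
    static final class SubnetKey {
        private final String id;

        SubnetKey(String id) {
            this.id = id;
        }

        String getId() {
            return id;
        }
    }

    // Hypothetical stand-in for the datastore: serialized path -> stored value.
    static final Map<String, String> STORE = new HashMap<>();

    // Mimics the kind of check seen in the trace: a key whose id component
    // is null cannot be turned into a path, so the read fails up front.
    static String serialize(SubnetKey key) {
        if (key.getId() == null) {
            throw new IllegalArgumentException(
                "All keys must be specified. Missing key is getId.");
        }
        return "subnetmaps/subnetmap[" + key.getId() + "]";
    }

    // Builds the key without checking the id first; throws when the id is null.
    static Optional<String> readUnguarded(String subnetId) {
        String path = serialize(new SubnetKey(subnetId));
        return Optional.ofNullable(STORE.get(path));
    }

    // Assumed defensive variant: treat a null id as "nothing to look up".
    static Optional<String> readGuarded(String subnetId) {
        if (subnetId == null) {
            return Optional.empty();
        }
        return readUnguarded(subnetId);
    }

    public static void main(String[] args) {
        STORE.put(serialize(new SubnetKey("subnet-1")), "vxlan");

        System.out.println(readGuarded("subnet-1").isPresent());
        System.out.println(readGuarded(null).isPresent());

        try {
            readUnguarded(null);
        } catch (IllegalArgumentException e) {
            System.out.println("unguarded read failed: " + e.getMessage());
        }
    }
}
```

Whether the right behaviour for a null id is to skip the lookup, retry later, or fail loudly depends on why the id is missing, which this report does not establish.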
Sai, upstream discussions in https://lists.opendaylight.org/pipermail/netvirt-dev/2017-September/005434.html have identified that a fix for this one (c/62627, from Sep 5th = 20170905) probably went in about a week after the image you were using (20170829) when you hit this. Could I ask you to keep an eye out for whether you see this again with a newer Carbon image?
Another upstream patch was merged:
https://git.opendaylight.org/gerrit/62470

And another that is not merged yet:
https://git.opendaylight.org/gerrit/63041
Since we lost access to the environment, we have to wait until the next round of testing to confirm.
I can confirm that this is no longer seen when running the same use case on OSP13 + Carbon
I meant OSP13 + Oxygen in the above comment.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2086