Bug 1392819

Summary: [NetKVM] Race between ParaNdis_SetupRSSQueueMap and ParaNdis6_SendNetBufferLists due to the lack of synchronisation
Product: Red Hat Enterprise Linux 7 Reporter: Yvugenfi <yvugenfi>
Component: virtio-winAssignee: ybendito
virtio-win sub component: virtio-win-prewhql QA Contact: Virtualization Bugs <virt-bugs>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: lijin, lmiksik, mfuruta, ybendito
Version: 7.4   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Windows   
URL: https://github.com/YanVugenfirer/kvm-guest-drivers-windows/pull/83
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 12:53:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Yvugenfi@redhat.com 2016-11-08 10:04:34 UTC
Description of problem:

There are patches submitted upstream to fix RSS configuration and TX path race condition.
Those patches need additional work to become part of the code base.

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/pull/83

--->>>
There is a race between ParaNdis_SetupRSSQueueMap and
ParaNdis6_SendNetBufferLists due to the lack of r/w sync:

Processor 1 call stack and code:
NDIS!ndisMInvokeOidRequest calls
ParaNdis6_OidRequest calls (through OidsDB callback table)
OIDENTRYPROC(OID_GEN_RECEIVE_SCALE_PARAMETERS, 0,0,0, ohfSet
| ohfSetMoreOK, RSSSetParameters),
RSSSetParameters calls
ParaNdis_SetupRSSQueueMap

if (pContext->RSS2QueueLength && 
    pContext->RSS2QueueLength < rssTableSize)
{
    DPrintf(0, ("[%s] Freeing RSS2Queue Map\n", __FUNCTION__));
--> NdisFreeMemoryWithTagPriority(pContext->MiniportHandle,
pContext->RSS2QueueMap, PARANDIS_MEMORY_TAG);
pContext->RSS2QueueLength = 0;
}

if (!pContext->RSS2QueueLength)
{
    pContext->RSS2QueueLength = USHORT(rssTableSize);
    pContext->RSS2QueueMap = (CPUPathesBundle **)
        NdisAllocateMemoryWithTagPriority(pContext->MiniportHandle,
        rssTableSize * sizeof(*pContext->RSS2QueueMap),
        PARANDIS_MEMORY_TAG, NormalPoolPriority);
    if (pContext->RSS2QueueMap == nullptr)
--> {
DPrintf(0, ("[%s] - Allocating RSS to queue mapping failed\n",
FUNCTION));
NdisFreeMemoryWithTagPriority(pContext->MiniportHandle,
cpuIndexTable, PARANDIS_MEMORY_TAG);
return NDIS_STATUS_RESOURCES;
}

--> NdisZeroMemory(pContext->RSS2QueueMap,
sizeof(*pContext->RSS2QueueMap) * pContext->RSS2QueueLength);
}

Processor 2 call stack and code:
NDIS!NdisSendNetBufferLists calls
netkvm!ParaNdis6_SendNetBufferLists

--> pContext->RSS2QueueMap[indirectionIndex]->txPath.Send(pNBL);

Another race is indirectionIndex calculation via RSSHashMask.
Mask can be updated before map update. So there is a need to check
that index fits into bounds. (We suppose that there is NO_PROBLEM
if packet will be sent not exactly in the proper queue by its hash value).

This patch set introduces r/w spinlock as well as index inbound validation.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 ybendito 2016-11-18 11:49:19 UTC
Posted to downstream

Comment 2 lijin 2017-03-29 03:23:33 UTC
Hi Yuri,

How could QE verify this bug?
Is "no regression" enough?

Comment 3 ybendito 2017-03-29 10:22:45 UTC
(In reply to lijin from comment #2)
> Hi Yuri,
> 
> How could QE verify this bug?
> Is "no regression" enough?

Yes, as there is no specific failure we were able to reproduce.

Comment 4 lijin 2017-06-07 05:53:58 UTC
change status to verified as all jobs can pass with build139

Comment 7 errata-xmlrpc 2017-08-01 12:53:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2341