Description of problem:

There are patches submitted upstream to fix an RSS configuration / TX path race condition. Those patches need additional work before they can become part of the code base.

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/pull/83

There is a race between ParaNdis_SetupRSSQueueMap and ParaNdis6_SendNetBufferLists due to the lack of read/write synchronization.

Processor 1 call stack and code:

NDIS!ndisMInvokeOidRequest
  calls ParaNdis6_OidRequest,
  which calls (through the OidsDB callback table)
    OIDENTRYPROC(OID_GEN_RECEIVE_SCALE_PARAMETERS, 0, 0, 0, ohfSet | ohfSetMoreOK, RSSSetParameters),
  and RSSSetParameters calls ParaNdis_SetupRSSQueueMap:

    if (pContext->RSS2QueueLength && pContext->RSS2QueueLength < rssTableSize)
    {
        DPrintf(0, ("[%s] Freeing RSS2Queue Map\n", __FUNCTION__));
-->     NdisFreeMemoryWithTagPriority(pContext->MiniportHandle, pContext->RSS2QueueMap, PARANDIS_MEMORY_TAG);
        pContext->RSS2QueueLength = 0;
    }

    if (!pContext->RSS2QueueLength)
    {
        pContext->RSS2QueueLength = USHORT(rssTableSize);
        pContext->RSS2QueueMap = (CPUPathesBundle **)
            NdisAllocateMemoryWithTagPriority(pContext->MiniportHandle,
                                              rssTableSize * sizeof(*pContext->RSS2QueueMap),
                                              PARANDIS_MEMORY_TAG,
                                              NormalPoolPriority);
        if (pContext->RSS2QueueMap == nullptr)
-->     {
            DPrintf(0, ("[%s] - Allocating RSS to queue mapping failed\n", __FUNCTION__));
            NdisFreeMemoryWithTagPriority(pContext->MiniportHandle, cpuIndexTable, PARANDIS_MEMORY_TAG);
            return NDIS_STATUS_RESOURCES;
        }

-->     NdisZeroMemory(pContext->RSS2QueueMap, sizeof(*pContext->RSS2QueueMap) * pContext->RSS2QueueLength);
    }

Processor 2 call stack and code:

NDIS!NdisSendNetBufferLists calls netkvm!ParaNdis6_SendNetBufferLists:

-->     pContext->RSS2QueueMap[indirectionIndex]->txPath.Send(pNBL);

If processor 2 dereferences RSS2QueueMap between the free and the subsequent reallocation on processor 1, it touches freed memory.

Another race is in the indirectionIndex calculation via RSSHashMask: the mask can be updated before the map is updated, so the computed index must be validated against the bounds of the current map. (We assume it is not a problem if a packet is sent on a queue other than the exact one its hash value maps to.)

This patch set introduces a read/write spinlock as well as index bounds validation; a minimal sketch of the approach is given under Additional info below.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
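For illustration only, a minimal sketch of the fix described above, using the generic NDIS 6.20+ read/write lock API rather than whatever the actual patch in PR #83 uses. The field RSS2QueueMapLock and the helpers RebuildRSSQueueMap / SendOnRSSQueue are invented names for this sketch, not identifiers from the driver; PARANDIS_ADAPTER and the other fields are as quoted in the description.

    #include <ndis.h>

    // Assumed new adapter-context member, allocated once at init time:
    //     pContext->RSS2QueueMapLock = NdisAllocateRWLock(pContext->MiniportHandle);

    // Writer side: rebuild the queue map while holding the lock exclusive,
    // so no concurrent sender can dereference a freed or half-built table.
    static void RebuildRSSQueueMap(PARANDIS_ADAPTER *pContext)
    {
        LOCK_STATE_EX lockState;

        NdisAcquireRWLockWrite(pContext->RSS2QueueMapLock, &lockState, 0);
        /* The existing free / NdisAllocateMemoryWithTagPriority /
           NdisZeroMemory sequence quoted in the description runs here,
           now invisible to concurrent readers until it completes. */
        NdisReleaseRWLock(pContext->RSS2QueueMapLock, &lockState);
    }

    // Reader side: take the lock shared and validate the index, because
    // RSSHashMask may already describe a newer, larger indirection table
    // than the currently installed RSS2QueueMap.
    static void SendOnRSSQueue(PARANDIS_ADAPTER *pContext,
                               ULONG indirectionIndex,
                               PNET_BUFFER_LIST pNBL)
    {
        LOCK_STATE_EX lockState;

        NdisAcquireRWLockRead(pContext->RSS2QueueMapLock, &lockState, 0);
        if (indirectionIndex >= pContext->RSS2QueueLength)
        {
            // Out-of-bounds hash slot: fall back to queue 0. Per the
            // analysis above, sending on a non-ideal queue is acceptable.
            indirectionIndex = 0;
        }
        pContext->RSS2QueueMap[indirectionIndex]->txPath.Send(pNBL);
        NdisReleaseRWLock(pContext->RSS2QueueMapLock, &lockState);
    }

Holding the read lock across txPath.Send keeps a concurrent map rebuild from freeing the table mid-send; the actual patch may scope the lock more tightly.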
Posted to downstream
Hi Yuri,

How could QE verify this bug?
Is "no regression" enough?
(In reply to lijin from comment #2)
> Hi Yuri,
>
> How could QE verify this bug?
> Is "no regression" enough?

Yes, as there is no specific failure we were able to reproduce.
Changing status to VERIFIED, as all jobs pass with build139.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2341