Bug 1486758
Summary: testpmd pmd threads getting page faults

| Field | Value | Field | Value |
|---|---|---|---|
| Product | Red Hat Enterprise Linux 7 | Reporter | Luiz Capitulino <lcapitulino> |
| Component | dpdk | Assignee | Eelco Chaudron <echaudro> |
| Status | CLOSED UPSTREAM | QA Contact | Jean-Tsung Hsiao <jhsiao> |
| Severity | medium | Priority | medium |
| Version | 7.5 | CC | bhu, dgilbert, fleitner, linville, maxime.coquelin, pezhang |
| Target Milestone | rc | Keywords | Extras |
| Target Release | --- | Hardware | Unspecified |
| OS | Unspecified | Type | Bug |
| Last Closed | 2017-10-10 07:47:38 UTC | Doc Type | If docs needed, set a value |
Description (Luiz Capitulino, 2017-08-30 13:45:07 UTC)

Created attachment 1320079 [details]: full trace output

This is the full trace output. Note that there is even page allocation going on while the PMD thread is in kernel space.
I took a quick look at this today. I was hoping there would be an option in the DPDK config that would allow me to build testpmd statically but still provide the dynamic versions of the libraries (which I think is what we want for RHEL, no?). But the only option I found was to have everything dynamic or everything static.

Do we only provide dynamic versions of the libraries in RHEL7? Does this mean that apps developed on RHEL7 will have to handle prefaulting themselves? Is that even possible, given that the dynamic libs are loaded by the EAL lib itself? I feel I'm missing something important, because my findings seem to indicate that we may want to do the opposite of what we're doing today.

Anyway, since I can't be blocked by this, I've built a dpdk package for my own testing which is statically linked, and I've also added a call to mlockall() in testpmd initialization. I can't reproduce my issue with that one.

Another very interesting test to compare static vs. dynamic testpmd would be to check whether the statically built one can achieve higher throughput in the zero-loss test case in the PVP scenario. But that will have to wait several days, as I have other tests to run.

Eelco,

The most important thing is having testpmd statically linked. Getting page faults because of the missing mlockall() should be extremely hard: you'd need memory pressure, and the kernel would have to evict pages containing code testpmd is using. Still, having mlockall() is necessary and correct.

I think the first step for this BZ is to acknowledge (or confirm) that having testpmd dynamically linked is the root cause of the page faults I'm seeing. Once that is done, the next step would be to fix the dpdk package to ship a statically linked testpmd. Finally, I think we have to offer static libs for customers developing dpdk apps on RHEL7.
The "feeling of missing something important" I mentioned in comment 4 is because there's a big machinery in the DPDK build system to allow shared libs; if they can incur page faults, why offer them in the first place? Anyway, I'm CC'ing the package maintainers, as this is mostly about packaging.

Did a lot of experiments, and I see these page faults with both a statically and a dynamically linked version of testpmd.

Linux will load pages on demand, and probably for the statically linked version we are lucky and the pages get loaded due to other "close" code being executed.

Adding the mlockall() will fix the page faults for both versions, as the .text pages get loaded and locked at startup. You can see this behaviour by examining /proc/<pid>/smaps.

I do think DPDK should not force the mlockall() in its libraries; the NFV application should do it as part of its startup code. Also, for tasks other than performance testing, mlockall() might not make much sense, as over time the correct pages end up in memory and will probably stay there on a well-dimensioned system.

However, for testpmd I have sent out a patch to include mlockall(); let's see the upstream response on this. The patch can be found here:

http://dpdk.org/ml/archives/dev/2017-September/075176.html

(In reply to Eelco Chaudron from comment #7)
> Did a lot of experiments, and I see these page faults with both a statically
> and dynamically linked version of testpmd.
>
> Linux will load pages on demand and probably for the statically linked
> version we are lucky and the pages get loaded due to other "close" code
> being executed.
>
> Adding the mlockall() will fix the pagefaults for both versions as the .text
> pages get loaded and locked at startup. You can see this behaviour by
> examining /proc/<id>/smap.

That's an awesome finding! When doing the static vs. dynamic comparisons, I always had mlockall() for the static version of testpmd.
But I assumed I wasn't getting page faults because it was statically linked and hence the kernel would load the whole text section at once at startup. I did not think mlockall() had anything to do with it. I think I've learned something today :)

> I do think DPDK should not force the mlockall() in their libraries, but the
> NFV application should do it as part of their startup code. Also for other
> tasks than performance testing mlockall() might not make much sense, as over
> time the correct pages are in memory, and will probably stay in on a well
> dimensioned system.

I disagree with that part. I think that at the very least EAL should have a command-line option to call mlockall() from the DPDK application. But that's an upstream discussion. The most important thing for the immediate term is to get testpmd fixed for those using it for testing.

> However for testpmd I have send out a patch to include mlockall(), lets see
> the upstream response on this. Patch can be found here:
>
> http://dpdk.org/ml/archives/dev/2017-September/075176.html

Thanks a lot, that patch certainly solves this BZ.

Hi Eelco,

Thanks for the patch. I have one question about using mlockall() with regard to postcopy live migration (adding David in cc:).

When mlockall(MCL_CURRENT | MCL_FUTURE) is called, do you know whether later mmap() calls on shared memory without the MAP_POPULATE flag will or will not prefault the mmap'ed pages?

(In reply to Maxime Coquelin from comment #9)
> Hi Eelco,
>
> Thanks for the patch.
> I have one question about using mlockall() with regard to postcopy
> live-migration (adding David in cc:).
>
> When mlockall(CL_CURRENT | MCL_FUTURE) is called, do you know whether later
> mmap() calls on shared memory without MAP_POPULATE flag will or not prefault
> the mmap'ed pages?

I do not know the behaviour, but found this online (assuming the code has not changed):

http://www.spinics.net/lists/linux-man/msg09821.html

(In reply to Maxime Coquelin from comment #9)
> Hi Eelco,
>
> Thanks for the patch.
> I have one question about using mlockall() with regard to postcopy
> live-migration (adding David in cc:).
>
> When mlockall(CL_CURRENT | MCL_FUTURE) is called, do you know whether later
> mmap() calls on shared memory without MAP_POPULATE flag will or not prefault
> the mmap'ed pages?

I'm not sure (I always ask aarcange for this type of thing). (Note that in the postcopy code in qemu I do an munlock before the postcopy and turn it back on later.)

(In reply to Eelco Chaudron from comment #10)
> (In reply to Maxime Coquelin from comment #9)
> > Hi Eelco,
> >
> > Thanks for the patch.
> > I have one question about using mlockall() with regard to postcopy
> > live-migration (adding David in cc:).
> >
> > When mlockall(CL_CURRENT | MCL_FUTURE) is called, do you know whether later
> > mmap() calls on shared memory without MAP_POPULATE flag will or not prefault
> > the mmap'ed pages?
>
> I do not know the behaviour, but found this online (assuming code has not
> changed)
>
> http://www.spinics.net/lists/linux-man/msg09821.html

Thanks for the link, that's interesting. IIUC, MCL_ONFAULT could do the trick, but we may want to do:

mlockall(MCL_CURRENT);
mlockall(MCL_FUTURE | MCL_ONFAULT);

But it may be better to keep it as you proposed and do an munlock()/mlock() at postcopy time, as David suggests:

(In reply to Dr. David Alan Gilbert from comment #11)
> I'm not sure (I always ask aarcange for this type of thing).
> (Note in the postcopy code in qemu I do an munlock before the postcopy and
> turn it back on later)

I'll keep this in mind when implementing postcopy; for now your patch is fine by me.

Thanks,
Maxime

The patch for testpmd and the documentation got accepted:

http://dpdk.org/ml/archives/dev/2017-October/078316.html
http://dpdk.org/ml/archives/dev/2017-October/078317.html