Bug 1575956
Summary: | rpc.rquotad version 4.04 burns 100% CPU | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Bert DeKnuydt <Bert.Deknuydt> | ||||
Component: | quota | Assignee: | Petr Pisar <ppisar> | ||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 28 | CC: | ovasik, ppisar, steved | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
URL: | https://sourceforge.net/p/linuxquota/feature-requests/16/ | ||||||
Whiteboard: | |||||||
Fixed In Version: | quota-4.04-6.fc29 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2019-05-03 11:36:04 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Bert DeKnuydt
2018-05-08 12:03:24 UTC
Thanks for the report. I can reproduce it. This happens inside tirpc library: #0 0x00007f86ab2e3254 in accept () from /lib64/libc.so.6 #1 0x00007f86ab5c1274 in rendezvous_request () from /lib64/libtirpc.so.3 #2 0x00007f86ab5bd6b8 in svc_getreq_common () from /lib64/libtirpc.so.3 #3 0x00007f86ab5bd947 in svc_getreq_poll () from /lib64/libtirpc.so.3 #4 0x00007f86ab5c016e in svc_run () from /lib64/libtirpc.so.3 #5 0x000055e6f0a3d6c7 in main (argc=2, argv=0x7ffc9afd7d18) at rquota_svc.c:538 The polled descriptors are: rpc.rquot 21981 root 4u IPv4 80449159 0t0 UDP *:rquotad rpc.rquot 21981 root 5u sock 0,9 0t0 80449162 protocol: TCP rpc.rquot 21981 root 6u IPv6 80449165 0t0 UDP *:rquotad rpc.rquot 21981 root 7u sock 0,9 0t0 80449168 protocol: TCPv6 One can use "/usr/sbin/rpc.rquotad -F" command to prevent from daemonization. Fedora 27 is not affected, because it does not use tirpc there. I can also reproduce it on Fedora 29. Either it's a bug in libtirpc-1.0.3-1.rc1.fc28.x86_64 that wants to accept raw sockets, or rpc.rquotad setups the tirpc service inappropriately. Maybe libtirpc maintainer will know more. rpc.rquotad code is in rquota_svc.c (In reply to Petr Pisar from comment #2) > > Either it's a bug in libtirpc-1.0.3-1.rc1.fc28.x86_64 that wants to accept > raw sockets, or rpc.rquotad setups the tirpc service inappropriately. Maybe > libtirpc maintainer will know more. rpc.rquotad code is in rquota_svc.c Well it is spinning in rendezvous_request due to this loop: again: len = sizeof addr; sock = accept(xprt->xp_fd, (struct sockaddr *)(void *)&addr, &len); if (sock < 0) { if (errno == EINTR) goto again; return (FALSE); } There was the same loop in the old glibc code as well so nothing there has changed there... It definitly has something to do with creating the TCP listening socket since the UDP socket seems to be fine: # rpcinfo -T udp localhost 100011 program 100011 version 1 ready and waiting program 100011 version 2 ready and waiting # rpcinfo -T tcp localhost 100011 rpcinfo: RPC: Remote system error - Connection refused In Comment 3 I was mistaken with that again loop since the errno being returned from accept is EINVAL not EINTR. The reason for this looping and accept returning EINVAL is a listen(2) call is not being done the the fd used to create the SVCXPRT pointer. The reason a listen is not done is because svc_tli_create() is given pre-created socket and a bound socket. In this cause svc_tli_create() will not call listen(). In the glibc code, a listen was called, I proved that with strace, but I'm not sure from where since I can not find a svc_til_create() in the older (Fedora 27) glibc code. Since there does not seem to be an upstream git tree, I can not tell when that call was added. Was that call recently added? So the assumption is made by the tirpc version of svc_tli_create() is when handed a socket that is bound, its assumed the listen call has also been made. In other apps using it (in nfs-utils) this is the case. Created attachment 1435510 [details]
Patch that fixes the problem
The attached patch adds a listen() call to svc_create_sock()
routine which is needed with libtirpc version of svc_tli_create()
as well as a needed IPv6 setsockopt().
It appears the glibc version of that call must have done
the listen() call. Not clear how that ever worked since
other versions of svc_tli_create() (aka BSD) don't
do the bind either plus the IPV6_V6ONLY is needed
to allow the binds to work.
I can conform that this solves the issue, at least here (no ipv6 in use). Thanks for the quick fix ... Now wait for the rpms... The git tree is at <git://git.code.sf.net/p/linuxquota/code>. Fedora 28 rebased quota to 4.04 that migrated from NIS RPC to tirpc. Hence Fedora 28 is the first one that uses tirpc and supports IPv6. Thanks for the patch. I will review it, forward to the upstream and apply to Fedora. The patch looks good and I confirm it fixes this issue. The RPC calls still work over both IPv4 and IPv6. I sent the patch to <https://sourceforge.net/p/linuxquota/feature-requests/16/> quota-4.04-6.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-4adeec12ed quota-4.04-6.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-4adeec12ed quota-4.04-6.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report. This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. |