| Summary: | epoll appears to be unfair in rawhide | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Steven Dake <sdake> |
| Component: | libqb | Assignee: | Ryan McCabe <rmccabe> |
| Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | urgent | Priority: | urgent |
| Version: | 19 | CC: | fdinitto, jbrouer, jfriesse, jpokorny, knoel, lhh |
| Hardware: | Unspecified | OS: | Unspecified |
| Doc Type: | Bug Fix | Clone Of: | 800865 |
| Bug Depends On: | 800865 | Last Closed: | 2015-02-17 14:10:00 UTC |
Description
Steven Dake
2012-03-10 03:27:22 UTC
Ryan, if you have some interest in getting involved in libqb and more systems programming, getting to the root cause of this bug (is the implementation in libqb broken, or is kernel.org broken, and if so, in what way?) would be of great help. Regards -steve

Hi Jason, I'm adding you to this case, as it seems you have more insight into epoll, and have some recent upstream commits/fixes that might solve this epoll starvation issue? --Jesper Brouer (new RH network kernel guy)

Perhaps this upstream commit fixes the issue?
UPSTREAM commit 28d82dc1c4edbc352129f97f4ca22624d1fe61de
Author: Jason Baron <jbaron>
Date: Thu Jan 12 17:17:43 2012 -0800
epoll: limit paths
The current epoll code can be tickled to run basically indefinitely in
both loop detection path check (on ep_insert()), and in the wakeup paths.
The programs that tickle this behavior set up deeply linked networks of
epoll file descriptors that cause the epoll algorithms to traverse them
indefinitely. A couple of these sample programs have been previously
posted in this thread: https://lkml.org/lkml/2011/2/25/297.
[...cut...]
git describe --contains 28d82dc1c4edbc352129f97f4ca22624d1fe61de
v3.3-rc1~81^2~80
Or perhaps it's a regression introduced in upstream commit 0884d7aa24 (v3.2-rc5~56^2~23), which was fixed by Eric Dumazet in upstream commit 6f01fd6e6f680 (v3.3-rc2~6^2).
commit 6f01fd6e6f6809061b56e78f1e8d143099716d70
Author: Eric Dumazet <eric.dumazet>
Date: Sat Jan 28 16:11:03 2012 +0000
af_unix: fix EPOLLET regression for stream sockets
Commit 0884d7aa24 (AF_UNIX: Fix poll blocking problem when reading from
a stream socket) added a regression for epoll() in Edge Triggered mode
(EPOLLET)
Appropriate fix is to use skb_peek()/skb_unlink() instead of
skb_dequeue(), and only call skb_unlink() when skb is fully consumed.
This removes the need to requeue a partial skb into sk_receive_queue head
and the extra sk->sk_data_ready() calls that added the regression.
This is safe because once skb is given to sk_receive_queue, it is not
modified by a writer, and readers are serialized by u->readlock mutex.
This also reduces the number of spinlock acquisitions for small reads or
MSG_PEEK users, so it should improve overall performance.
Reported-by: Nick Mathewson <nickm>
Signed-off-by: Eric Dumazet <eric.dumazet>
Cc: Alexey Moiseytsev <himeraster>
Signed-off-by: David S. Miller <davem>
I tested with kernel 3.3.0-rc7+ (HEAD: fde7d9049e55ab85a390be7f415d74c9f62dd0f9), which appears to include the above fixes, and I can still reproduce the problem. libqb is libqb-0.11.1-1.fc18. The version included in Fedora builds without epoll support and it works. Rebuilding the same source with epoll triggers the issue again. corosync is at git head: c00502a70a8a0b4907f9daf370e3ff1e54c7100a. Contrary to what Angus saw in his tests, starting corosync with real-time priority does not help.

Further investigation:

I tested kernel 3.2.9-x from F16 on rawhide and I can still reproduce the problem. Running the same test in pure F16 does not trigger the issue. At this point I suspect there are either multiple issues in the whole epoll chain or something is going wrong with the toolchain. This kind of debugging is far too advanced for me.

Angus, do we have a simple test case to reproduce the problem that does not involve running multiple nodes and corosync?

(In reply to comment #6)
> Angus do we have a simple test case to reproduce the problem that does not
> involve running multiple nodes and corosync?

Nope, sorry.

Re comment 6: Fabio, did you use the respective fresh versions of the modules as well? I mean, e.g., virtio_net, if you used a VM for your test.

(In reply to comment #8)
> Re comment 6:
> Fabio, did you used respective fresh version of modules as well?
> I mean. e.g. virtio_net, if you used VM for your test.

The host is running F16 and the VM Fedora rawhide, but the problem is not on the network.
I've tried to cover one of many possibilities to reduce the search space by trying rawhide kernel 3.3.0-0.rc7.git0.5 [1] built on my F16 box (details below) back on rawhide VMs, and I cannot reproduce either of the two testcases from 2012-03-09 above (asalkeld, fdinitto). You can test it on your own; please find the built kernel at a temporary location [2] (I rebooted back to the old anaconda kernel, uninstalled the current kernel via yum, installed all these packages via rpm, then rebooted on top of it [at least as per uname]). If this kernel (meaning the complete one, incl. modules, see comment 8; I have no idea whether, e.g., drivers can affect epoll) proves to work well, then the rawhide toolchain may suffer from a defect.

Details: toolchain used to build [2] on the F16 box

    $ rpm -q gcc glibc binutils
    gcc-4.6.3-2.fc16.x86_64
    glibc-2.14.90-24.fc16.6.x86_64
    binutils-2.21.53.0.1-6.fc16.x86_64

Details: rawhide packages used for testing

    $ rpm -q corosync libqb corosynclib corosynclib corosynclib-devel \
        corosynclib-devel libqb-devel
    corosync-1.99.8-1.fc18.x86_64
    libqb-0.11.1-1.fc18.x86_64
    corosynclib-1.99.8-1.fc18.x86_64
    corosynclib-1.99.8-1.fc18.x86_64
    corosynclib-devel-1.99.8-1.fc18.x86_64
    corosynclib-devel-1.99.8-1.fc18.x86_64
    libqb-devel-0.11.1-1.fc18.x86_64

Details: testing rawhide VMs

- uname -r -v: 3.3.0-0.rc7.git0.5.fc16.x86_64 #1 SMP Fri Mar 16 23:29:59 CET 2012
- single CPU, 1024 MB RAM, virtio NIC
- corosync: corosync.conf.example with adjusted bindnetaddr (to match the network), mcastaddr and mcastport

[1] http://kojipkgs.fedoraproject.org/packages/kernel/3.3.0/0.rc7.git0.5.fc18/src/kernel-3.3.0-0.rc7.git0.5.fc18.src.rpm
[2] http://jpokorny.fedorapeople.org/temp/kernel_F16_toolchain/

Created attachment 571214 [details]
corosync_glibc_backtrace_VM_quick_pause_resume.txt
Aside: when cleaning up the environment on the VMs, I paused one of them
briefly, realized I wanted to continue, and resumed it shortly after;
in the terminal of the respective node running "corosync -f",
I got "double free or corruption" (see attached backtrace).
Clearly, pausing and resuming a VM in quick succession is probably quite
careless, but one might expect better robustness from whichever side
caused this.
Re comment 11: these are /var/log/messages records from around that time (haven't checked the exact time of the incident, sorry):

    [TOTEM ] A processor failed, forming new configuration.
    [QB    ] poll_mod : can't modify entry already deleted
    [TOTEM ] A processor joined or left the membership and a new membership was formed.
    [MAIN  ] Completed service synchronization, ready to provide service.
    [QB    ] poll_mod : can't modify entry already deleted
    (after 19 seconds)
    [SERV  ] Unloading all Corosync service engines.
    [QB    ] withdrawing server sockets
    [SERV  ] Service engine unloaded: corosync configuration map access
    [QB    ] Could not accept client connection from fd:11: Bad file descriptor (9)
    [QB    ] withdrawing server sockets
    [SERV  ] Service engine unloaded: corosync configuration service
    [QB    ] Could not accept client connection from fd:12: Bad file descriptor (9)

I've tried the same as in comment 10 with the original 3.3.0-0.rc7.git0.5.fc18.x86_64 kernel Fedora build [3], this time with both one and two CPUs per VM, and still was not able to reproduce the issue. The newly initiated node always joined successfully and immediately. The only thing I noticed was that while running cpgbench, 100x more UDP datagrams were interchanged per second as per tcpdump (I have no idea whether this is expected or whether these were forced retransmits due to timeouts). Probably cannot help more here. Maybe the "double free" as per comment 11 would deserve a check on whether it is by any chance apparent in the code (qb_ipcs_disconnect).

[3] http://koji.fedoraproject.org/koji/buildinfo?buildID=306760

This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could also affect pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during the Fedora 19 End Of Life. Thank you.)
More information and the reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

This message is a notice that Fedora 19 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 19. It is Fedora's policy to close all bug reports from releases that are no longer maintained. Approximately four weeks from now this bug will be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 19 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result, we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.