Created attachment 1211993: Result of strace -Cvfr -p `pidof ovs-vswitchd` -o vswitchd.strace
When the openvswitch service is started, the ovs-vswitchd process consumes 100% of CPU time.
Top dump:
1227 root 10 -10 48208 11240 9908 R 100.0 0.1 1534:17 ovs-vswitchd
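If the TIME+ column is read as minutes:seconds, 1534:17 is roughly 25.5 hours of accumulated CPU time, consistent with a continuous busy loop rather than a short spike.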
The error appeared without any identifiable cause. Once it first occurred, it began happening every time the openvswitch service is started.
The output of strace -Cvfr -p `pidof ovs-vswitchd` -o vswitchd.strace shows the following sequence repeating at an enormous rate:
1227 0.000013 recvfrom(3, 0x55f5a48f6dcd, 339, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
1227 0.000013 recvmsg(11, {msg_namelen=0}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
1227 0.000012 accept(10, 0x7ffc6ea263d0, [128]) = -1 EAGAIN (Resource temporarily unavailable)
1227 0.000014 recvmsg(14, {msg_namelen=0}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
1227 0.000012 recvmsg(13, {msg_namelen=0}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
1227 0.000013 recvmsg(13, {msg_namelen=0}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
1227 0.000012 recvmsg(13, {msg_namelen=0}, MSG_DONTWAIT) = -1 EAGAIN (Resource temporarily unavailable)
1227 0.000014 poll([{fd=13, events=POLLIN}, {fd=14, events=POLLIN}, {fd=11, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=3, events=POLLIN}], 6, 0) = 0 (Timeout)
1227 0.000015 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={53405, 379161}, ru_stime={38423, 776487}, ru_maxrss=11240, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=496, ru_majflt=0, ru_nswap=0, ru_inblock=1688, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=53779241, ru_nivcsw=260447}) = 0
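For illustration only (this is not OVS source): a minimal C sketch of the degenerate event loop the trace suggests, with a hypothetical fd set and the 339-byte buffer size taken from the trace. Every non-blocking read fails with EAGAIN, yet poll() is invoked with a 0 ms timeout, so it returns immediately and the loop spins at 100% CPU.

#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

void spin_loop(const int *fds, int nfds)
{
    struct pollfd pfds[16];            /* assume nfds <= 16 for the sketch */
    char buf[339];                     /* same buffer size as in the trace */

    for (;;) {
        for (int i = 0; i < nfds; i++) {
            /* Drain attempt fails: nothing is actually readable. */
            if (recv(fds[i], buf, sizeof buf, MSG_DONTWAIT) < 0
                && errno != EAGAIN) {
                return;                /* real error; bail out */
            }
            pfds[i].fd = fds[i];
            pfds[i].events = POLLIN;
        }
        /* The bug pattern: the computed wakeup deadline is always "now",
         * so poll() is given a timeout of 0, returns 0 ("Timeout" in the
         * trace) without blocking, and the loop repeats immediately.  A
         * healthy loop would pass the time until the next real timer, or
         * -1 to block until an fd becomes ready. */
        (void) poll(pfds, (nfds_t) nfds, 0);
    }
}

The fix for a loop like this is to compute the poll timeout from the nearest pending timer instead of an already-expired one, so the process actually blocks between events.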
Comment 2 - Thadeu Lima de Souza Cascardo, 2016-10-30 09:09:35 UTC
Created attachment 1215516: patched ovs with fix and debug
Hi.
Can you test this package and let me know if it fixes the problem? If it doesn't, can you enable debug for poll-loop and send the logs, please?
ovs-appctl vlog/set poll_loop:DBG
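(For reference, assuming a standard install: the resulting per-module levels can be confirmed with "ovs-appctl vlog/list", and the debug output typically lands in /var/log/openvswitch/ovs-vswitchd.log.)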
Thanks.
Cascardo.
Created attachment 1216532: openvswitch logs after patch from 2016-10-30
The patch seems to have fixed the issue.
I attach the collected openvswitch logs.
Comment 4 - Thadeu Lima de Souza Cascardo, 2016-11-03 18:09:50 UTC
Hi, Marcin.
The fix has been applied upstream. Can you test master and see if it fixes your problem? Just to make sure it is really the fix, and not the debug messages I added, that resolves it.
Thanks.
Cascardo.
Hi guys,
I have reproduced the problem in my environment on Open vSwitch 2.6.90.
Comment 7 - Thadeu Lima de Souza Cascardo, 2016-11-14 12:22:26 UTC
The logs seem to indicate timeouts every 500 ms for the revalidator and every 5 s for the stats update; all normal timeouts. Also, only 0% CPU use is shown. Either the logs do not cover the time when the issue happened, or something is not being accounted for.
Cascardo.
I grabbed the log only after verifying that the ovs-vswitchd process was consuming high CPU. I also set the OVS logging level to debug (ovs-appctl vlog/set dbg). I will try to upgrade to OVS 2.6.1.
Thanks,
Mor
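(Note: "ovs-appctl vlog/set dbg" raises every module to debug, while the earlier suggestion, "ovs-appctl vlog/set poll_loop:DBG", limits the extra verbosity to the poll-loop module, which keeps the log easier to correlate with the spin.)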
Hi Marcin,
I'm network QE reviewing this BZ, but I cannot reproduce the issue. Could you please give more info, like kernel version, openvswitch package, reproduction steps, etc.?
Thanks
QJ
Hi,
It's not always reproducible in my environment, and I have no steps to reproduce it.
I am attaching additional debug level logs taken from OVN host.
Kernel version:
3.10.0-512.el7.x86_64
OVS packages:
openvswitch-2.6.90-1.fc24.x86_64
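(For anyone trying to reproduce: the versions above can be collected with "uname -r" and "rpm -q openvswitch", respectively.)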
(In reply to Mor from comment #10)
>
> It's not always reproducible in my environment, and I have no steps to
> reproduce it.
>
> I am attaching additional debug level logs taken from OVN host.
>
> Kernel version:
> 3.10.0-512.el7.x86_64
>
> OVS packages:
> openvswitch-2.6.90-1.fc24.x86_64
Thank you for the update.
Without steps, could you please provide something like your command-line history? I'd like to know what you did before the issue happened, to help me reproduce it.
thanks
QJ
In our case, the changes are made by a software provider (which provides CMS access to OVN). This software component uses the OVS API, so unfortunately the command-line history won't help us track the changes.
(In reply to Mor from comment #16)
> In our case, the changes are made by a software provider (which provides CMS
> access to OVN). This software component uses the OVS API, so unfortunately
> the command-line history won't help us track the changes.
The issue now appears to be fixed in openvswitch-2.5.0-22.git20160727.el7fdb; I can no longer reproduce it. Could you please help verify it?
thank you
QJ
(In reply to Mor from comment #21)
> Hi QJ,
>
> In our environment we require OVS 2.6.0 or higher, so I am unable to verify it
> on 2.5.0.
>
It's ok. Thank you anyway.