Bug 1121175

Summary: ospf6d hogs CPU
Product: [Fedora] Fedora
Reporter: Pete Zaitcev <zaitcev>
Component: quagga
Assignee: Michal Sekletar <msekleta>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 20
CC: balajig81, msekleta, vonsch
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2014-12-08 17:50:05 UTC
Type: Bug

Description Pete Zaitcev 2014-07-18 14:42:19 UTC
Description of problem:

After switching from initscripts to NetworkManager, ospf6d started
burning a lot of CPU. Note the 293 minutes of CPU time for ospf6d
below, against 29 seconds for ospfd:

[zaitcev@elanor ~]$ ps auxw | grep ospf
quagga     520  0.0  0.3  19848  3144 ?        Ss   Jul17   0:29 /usr/sbin/ospfd -d -A 127.0.0.1 -f /etc/quagga/ospfd.conf
quagga     521 20.9  4.0  56988 40764 ?        Rs   Jul17 293:02 /usr/sbin/ospf6d -d -A ::1 -f /etc/quagga/ospf6d.conf
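
To put the two TIME columns side by side, here is a minimal sketch that converts them to seconds. It assumes the "MMM:SS" layout that ps uses for TIME in this output style; the helper name bsd_time_to_seconds is made up for illustration.

```python
def bsd_time_to_seconds(value):
    """Convert a ps 'MMM:SS' TIME field to whole seconds (assumed format)."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + int(seconds)


# TIME fields copied from the ps output above.
ospfd_cpu = bsd_time_to_seconds("0:29")
ospf6d_cpu = bsd_time_to_seconds("293:02")

print("ospfd CPU seconds: ", ospfd_cpu)
print("ospf6d CPU seconds:", ospf6d_cpu)
print("ratio (rounded):   ", round(ospf6d_cpu / ospfd_cpu))
```

Both daemons were started at the same time (Jul17), so the ratio is a rough measure of how much more work ospf6d is doing.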

Version-Release number of selected component (if applicable):

quagga-0.99.22.4-4.fc20.i686

How reproducible:

Unknown

Steps to Reproduce:
1. Install quagga
2. Configure OSPFv3 for IPv6 (ospf6d)

Actual results:

ospf6d burns CPU

Expected results:

Idling like ospfd

Additional info:

It looks like ospf6d is getting a constant flap through its Netlink
socket:

[root@elanor zaitcev]# strace -p 521
Process 521 attached   
select(1024, [8 11 12 13], [], [], {2, 837289}) = 1 (in [13], left {1, 408018})
clock_gettime(CLOCK_MONOTONIC, {84070, 702498212}) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 702706268}) = 0
getrusage(RUSAGE_SELF, {ru_utime={17566, 289000}, ru_stime={63, 815000}, ...}) = 0
gettimeofday({1405691811, 266787}, NULL) = 0
read(13, "\0 \377\2\0\n", 6)            = 6
read(13, "\1\20\3\0\1\376\200\0\0\0\0\0\0\306q\376\377\376v \342\1\0\0\0\6", 26) = 26
clock_gettime(CLOCK_MONOTONIC, {84070, 705993471}) = 0
getrusage(RUSAGE_SELF, {ru_utime={17566, 289000}, ru_stime={63, 815000}, ...}) = 0
gettimeofday({1405691811, 269944}, NULL) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 708144362}) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 708764829}) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 709307772}) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 710438357}) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 711054284}) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 711535558}) = 0
getrusage(RUSAGE_SELF, {ru_utime={17566, 289000}, ru_stime={63, 816000}, ...}) = 0
gettimeofday({1405691811, 275072}, NULL) = 0
clock_gettime(CLOCK_MONOTONIC, {84070, 713230388}) = 0
getrusage(RUSAGE_SELF, {ru_utime={17566, 289000}, ru_stime={63, 816000}, ...}) = 0
gettimeofday({1405691811, 277267}, NULL) = 0
select(1024, [8 11 12 13], [], [], {0, 0}) = 1 (in [13], left {0, 0})
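
The two reads on fd 13 look like one framed message: a 6-byte header followed by a 26-byte body. A minimal sketch for decoding the header follows. It assumes fd 13 is the daemon's client socket to zebra and that the header uses the Quagga zserv layout (2-byte length, 1-byte marker, 1-byte version, 2-byte command, network byte order). Neither assumption is confirmed by the trace itself, and the function name is made up for illustration.

```python
import struct

# Header bytes copied from the first read(13, ...) in the strace output.
header = b"\0 \377\2\0\n"
# Size of the second read(13, ...), i.e. the message body.
body_len = 26


def parse_zserv_header(data):
    """Split a 6-byte header into its fields (assumed zserv layout)."""
    length, marker, version, command = struct.unpack("!HBBH", data)
    return {"length": length, "marker": marker,
            "version": version, "command": command}


fields = parse_zserv_header(header)
print(fields)

# If the layout assumption holds, the length field covers header plus body.
print("length matches reads:", fields["length"] == len(header) + body_len)
```

If the length check holds, the command number can be looked up in lib/zebra.h of the installed Quagga version to see which kind of update is arriving repeatedly. Note also that the final select() was called with a zero timeout and returned with fd 13 readable again, so another message was already queued.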

It's possible that NetworkManager does something bad. But without
knowing what exactly, I cannot put this issue on Dan's plate.

And again, ospfd is not affected.

Comment 1 Michal Sekletar 2014-12-08 15:20:27 UTC
Sorry for not getting to this earlier. Can you still reproduce the issue? In case you can, please attach your /etc/quagga/ospf6d.conf while at it. Thanks!

Comment 2 Pete Zaitcev 2014-12-08 17:50:05 UTC
I worked around that, so I'm not interested anymore.

For the record, in some ancient version of Quagga, "redistribute connected"
did not work: it flatly refused to advertise the default route to
peers. I worked around that with this construct:

router ospf6
  redistribute kernel route-map TBD

ipv6 prefix-list TBD-prefix permit ::/0

route-map TBD permit 10
  match ipv6 address prefix-list TBD-prefix

I noticed that NM produces a flap on the netlink socket and that ospf6d
amplifies it. So I went back to "redistribute connected", which seems
to work now, so there is no need to redistribute kernel routes.
Using quagga-0.99.22.4-4.fc20.i686 now.