Bug 1272596

Summary:	goferd taking 100% CPU after reconnection to qdrouterd failed
Product:	Red Hat Satellite	Reporter:	Pavel Moravec <pmoravec>
Component:	katello-agent	Assignee:	Jeff Ortel <jortel>
Status:	CLOSED ERRATA	QA Contact:	Tazim Kolhar <tkolhar>
Severity:	high	Docs Contact:
Priority:	high
Version:	6.1.1	CC:	aupadhye, bbuckingham, chrobert, cwelton, ddevra, gsim, iboverma, jswensso, kshravag, me, mmccune, mshimura, mtaru, nmiao, nstiasni, peter.vreman, sauchter, sreber, tkolhar
Target Milestone:	Unspecified	Keywords:	Triaged
Target Release:	Unused
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	qpid-proton-0.9-10	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2015-12-15 09:32:09 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1122832

Description Pavel Moravec 2015-10-16 20:34:45 UTC

Description of problem:
There is a race condition causing goferd process taking 100% CPU but not attempting to reconnect to qdrouterd. In nutshell, goferd is busy waiting to get an event that remote end is not further active, in order to gracefully close timeouted connection.

The bug lies in proton reactor API or in proton - see details below.


Version-Release number of selected component (if applicable):
python-qpid-proton-0.9-7.el7.x86_64
python-gofer-qpid-2.6.6-2.el7sat.noarch
qpid-proton-debuginfo-0.9-7.el7.x86_64
qpid-proton-c-0.9-7.el7.x86_64
python-gofer-proton-2.6.6-2.el7sat.noarch
gofer-2.6.6-2.el7sat.noarch
python-gofer-2.6.6-2.el7sat.noarch


How reproducible:
100% after some time


Steps to Reproduce:
1. Just to speed up reproducer, shorten reconnect delays of goferd in a content host. Update in /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py:

DELAY = 1
MAX_DELAY = 10

2. Restart goferd service on the content host.
3. On Satellite, let qdrouterd be flapping in operation, by running simple script:
while true; do
  date
  service qdrouterd stop
  sleep 10; sleep $(($((RANDOM))%20))
  date
  service qdrouterd start
  sleep 10; sleep $(($((RANDOM))%20))
done

4. Monitor CPU usage of goferd on the content host, and TCP connections from it to Satellite's port 5647 (i.e. run tcpdump).


Actual results:
After several minutes:
- CPU usage starts to grow
- no TCP connection attempt being made - tcpdump wont show any activity, all connections are closed or in CLOSE_WAIT state
- the situation does not change even when letting qdrouterd running all the time (after hitting the CPU usage growth already)


Expected results:
- no CPU usage growth
- new TCP connection attempts made all the time, per connection retries specified in connection.py


Additional info:
- does not depend on using SSL or not. I.e. when I disabled SSL per https://access.redhat.com/solutions/1594323 , I was able to reproduce as well
- adding some debugs, goferd is busy waiting in /usr/lib64/python2.7/site-packages/proton/utils.py :

    def wait(self, condition, timeout=False, msg=None):
        """Call process until condition() is true"""
        if timeout is False:
            timeout = self.timeout
        if timeout is None:
            while not condition() and not self.disconnected:
                self.container.process()

after some time, proton reactor enters the while loop and never finishes it (checked by adding debugs outside and inside the loop)

- backtrace of relevant thread is often like below:
(gdb) bt
#0  0x00007fff8ad5e7d2 in clock_gettime ()
#1  0x00007ff890681dfd in clock_gettime () from /lib64/libc.so.6
#2  0x00007ff83514365e in pn_i_now () at /usr/src/debug/qpid-proton-0.9/proton-c/src/platform.c:31
#3  0x00007ff83513b039 in pn_reactor_mark (reactor=reactor@entry=0x7ff85403fb10)
    at /usr/src/debug/qpid-proton-0.9/proton-c/src/reactor/reactor.c:58
#4  0x00007ff83513b8e5 in pn_reactor_process (reactor=reactor@entry=0x7ff85403fb10)
    at /usr/src/debug/qpid-proton-0.9/proton-c/src/reactor/reactor.c:387
#5  0x00007ff835399c33 in _wrap_pn_reactor_process (self=<optimized out>, args=<optimized out>)
    at /usr/src/debug/qpid-proton-0.9/proton-c/bindings/python/cprotonPYTHON_wrap.c:23694
#6  0x00007ff89133eb74 in call_function (oparg=<optimized out>, pp_stack=0x7ff833fd2f50)
    at /usr/src/debug/Python-2.7.5/Python/ceval.c:4098
#7  PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f0d0c6748e0, for file /usr/lib64/python2.7/site-packages/proton/reactor.py, line 143, in process (self=<Container(_impl=<SwigPyObject at remote 0x7f0d0c6c38d0>, _attrs={'ssl': <SSLConfig(client=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3cf0>) at remote 0x7f0d0c6cd910>, server=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3210>) at remote 0x7f0d0c6d2390>) at remote 0x7f0d0c6cd8d0>, 'errors': [], 'subclass': <type at remote 0x7f0d0c1eb010>, 'container_id': '6c7286d5-cdb1-47ad-a74c-832733df3fcb', 'trigger': None}) at remote 0x7f0d0c6cd590>), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#8  0x00007f0d51550930 in fast_function (nk=<optimized out>, na=1, n=1, pp_stack=0x7f0cf3fd30b0, 
    func=<function at remote 0x7f0d0c12ac80>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4184
#9  call_function (oparg=<optimized out>, pp_stack=0x7f0cf3fd30b0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
#10 PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f0d0c3826a0, for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 237, in wait (self=<BlockingConnection(url=<Url(_url=<SwigPyObject at remote 0x7f0d0c6c32a0>) at remote 0x7f0d0c6d2410>, container=<Container(_impl=<SwigPyObject at remote 0x7f0d0c6c38d0>, _attrs={'ssl': <SSLConfig(client=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3cf0>) at remote 0x7f0d0c6cd910>, server=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3210>) at remote 0x7f0d0c6d2390>) at remote 0x7f0d0c6cd8d0>, 'errors': [], 'subclass': <type at remote 0x7f0d0c1eb010>, 'container_id': '6c7286d5-cdb1-47ad-a74c-832733df3fcb', 'trigger': None}) at remote 0x7f0d0c6cd590>, disconnected=False, timeout=None, conn=<Connection(_impl=<SwigPyObject at remote 0x7f0d0c6c3ba0>, _attrs={'_session_policy': <SessionPerConnection(_default_session=<Session(_impl=<SwigPyObject at remote 0x7f0d0c6c3ae0>, _attrs={'context': <...>, 'condition': None}) at remote 0x7f0d0c6d27d0>) at remote 0x7f0d0c6d2690>, 'desired_capabilities': No...(truncated), 
    throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#11 0x00007f0d5155218d in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, 
    args=<optimized out>, argcount=argcount@entry=2, kws=0x7f0d0c367c08, kwcount=1, defs=0x7f0d0c09a338, defcount=2, 
    closure=closure@entry=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#12 0x00007f0d5155083f in fast_function (nk=<optimized out>, na=2, n=4, pp_stack=0x7f0cf3fd32b0, 
    func=<function at remote 0x7f0d0c130230>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4194
#13 call_function (oparg=<optimized out>, pp_stack=0x7f0cf3fd32b0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
#14 PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f0d0c367a70, for file /usr/lib64/python2.7/site-packages/proton/utils.py, line 220, in close (self=<BlockingConnection(url=<Url(_url=<SwigPyObject at remote 0x7f0d0c6c32a0>) at remote 0x7f0d0c6d2410>, container=<Container(_impl=<SwigPyObject at remote 0x7f0d0c6c38d0>, _attrs={'ssl': <SSLConfig(client=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3cf0>) at remote 0x7f0d0c6cd910>, server=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3210>) at remote 0x7f0d0c6d2390>) at remote 0x7f0d0c6cd8d0>, 'errors': [], 'subclass': <type at remote 0x7f0d0c1eb010>, 'container_id': '6c7286d5-cdb1-47ad-a74c-832733df3fcb', 'trigger': None}) at remote 0x7f0d0c6cd590>, disconnected=False, timeout=None, conn=<Connection(_impl=<SwigPyObject at remote 0x7f0d0c6c3ba0>, _attrs={'_session_policy': <SessionPerConnection(_default_session=<Session(_impl=<SwigPyObject at remote 0x7f0d0c6c3ae0>, _attrs={'context': <...>, 'condition': None}) at remote 0x7f0d0c6d27d0>) at remote 0x7f0d0c6d2690>, 'desired_capabilities': N...(truncated), 
    throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#15 0x00007f0d5155218d in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, 
    args=<optimized out>, argcount=argcount@entry=1, kws=0x7f0d0c377748, kwcount=0, defs=0x0, defcount=0, closure=closure@entry=0x0)
    at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#16 0x00007f0d5155083f in fast_function (nk=<optimized out>, na=1, n=1, pp_stack=0x7f0cf3fd34b0, 
    func=<function at remote 0x7f0d0c1300c8>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4194
#17 call_function (oparg=<optimized out>, pp_stack=0x7f0cf3fd34b0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
#18 PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f0d0c3775b0, for file /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py, line 156, in close (self=<Connection(url='proton+amqp://pmoravec-sat61-rhel7.gsslab.brq.redhat.com:5647', _impl=None) at remote 0x7f0d0c316690>, connection=<BlockingConnection(url=<Url(_url=<SwigPyObject at remote 0x7f0d0c6c32a0>) at remote 0x7f0d0c6d2410>, container=<Container(_impl=<SwigPyObject at remote 0x7f0d0c6c38d0>, _attrs={'ssl': <SSLConfig(client=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3cf0>) at remote 0x7f0d0c6cd910>, server=<SSLDomain(_domain=<SwigPyObject at remote 0x7f0d0c6c3210>) at remote 0x7f0d0c6d2390>) at remote 0x7f0d0c6cd8d0>, 'errors': [], 'subclass': <type at remote 0x7f0d0c1eb010>, 'container_id': '6c7286d5-cdb1-47ad-a74c-832733df3fcb', 'trigger': None}) at remote 0x7f0d0c6cd590>, disconnected=False, timeout=None, conn=<Connection(_impl=<SwigPyObject at remote 0x7f0d0c6c3ba0>, _attrs={'_session_policy': <SessionPerConnection(_default_session=<Session(_impl=<SwigPyO...(truncated), 
    throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740

- tcpdump of one such event (on nonSSL traffic) to be attached

Comment 1 Pavel Moravec 2015-10-16 20:38:51 UTC

Created attachment 1083808 [details]
tcpdump during reproducer test

See latest 10 frames there - self-explanatory what circumstances trigger the bug.

Comment 2 Pavel Moravec 2015-10-16 20:43:36 UTC

(In reply to Pavel Moravec from comment #1)
> Created attachment 1083808 [details]
> tcpdump during reproducer test
> 
> See latest 10 frames there - self-explanatory what circumstances trigger the
> bug.

Hi Gordon,
would you be so kind to check this bug? (let me know if you prefer standalone reproducer without Satellite bits, should not be complex)

Comment 4 Gordon Sim 2015-10-19 13:22:05 UTC

(In reply to Pavel Moravec from comment #2)
> (In reply to Pavel Moravec from comment #1)
> > Created attachment 1083808 [details]
> > tcpdump during reproducer test
> > 
> > See latest 10 frames there - self-explanatory what circumstances trigger the
> > bug.
> 
> Hi Gordon,
> would you be so kind to check this bug? (let me know if you prefer
> standalone reproducer without Satellite bits, should not be complex)

Is it possible that this is caused by https://issues.apache.org/jira/browse/PROTON-1023 (now fixed upstream)? The protocol trace shows that the last attach attempt is failing due to the router not being able to route it, and depending exactly how that gets processed it could result in an AttributeError being raised instead of the expected LinkDetached or LinkException.

If not, a standalone reproducer would indeed be very useful.

Comment 5 Pavel Moravec 2015-10-19 14:27:06 UTC

(In reply to Gordon Sim from comment #4)
> (In reply to Pavel Moravec from comment #2)
> > (In reply to Pavel Moravec from comment #1)
> > > Created attachment 1083808 [details]
> > > tcpdump during reproducer test
> > > 
> > > See latest 10 frames there - self-explanatory what circumstances trigger the
> > > bug.
> > 
> > Hi Gordon,
> > would you be so kind to check this bug? (let me know if you prefer
> > standalone reproducer without Satellite bits, should not be complex)
> 
> Is it possible that this is caused by
> https://issues.apache.org/jira/browse/PROTON-1023 (now fixed upstream)? The
> protocol trace shows that the last attach attempt is failing due to the
> router not being able to route it, and depending exactly how that gets
> processed it could result in an AttributeError being raised instead of the
> expected LinkDetached or LinkException.
> 
> If not, a standalone reproducer would indeed be very useful.

Applying that patch helped but did not fully prevented the bug. I will prepare standalone reproducer, should be matter of very few (working) hours.

Comment 6 Pavel Moravec 2015-10-19 20:53:57 UTC

(In reply to Gordon Sim from comment #4)
> (In reply to Pavel Moravec from comment #2)
> > (In reply to Pavel Moravec from comment #1)
> > > Created attachment 1083808 [details]
> > > tcpdump during reproducer test
> > > 
> > > See latest 10 frames there - self-explanatory what circumstances trigger the
> > > bug.
> > 
> > Hi Gordon,
> > would you be so kind to check this bug? (let me know if you prefer
> > standalone reproducer without Satellite bits, should not be complex)
> 
> Is it possible that this is caused by
> https://issues.apache.org/jira/browse/PROTON-1023 (now fixed upstream)? The
> protocol trace shows that the last attach attempt is failing due to the
> router not being able to route it, and depending exactly how that gets
> processed it could result in an AttributeError being raised instead of the
> expected LinkDetached or LinkException.
> 
> If not, a standalone reproducer would indeed be very useful.

Here is a reproducer script - interestingly un-commenting either consumer close line, I was unable to reproduce (or I was not patient enough):

(in parallel, bounce qdrouterd every 5 seconds)


#!/usr/bin/python

from time import sleep
from uuid import uuid4

from proton import ConnectionException

from proton.utils import BlockingConnection

import traceback
import random

while True:
  sleep(random.uniform(0.3,3))
  try:
    conn = BlockingConnection("proton+amqp://localhost:5672", ssl_domain=None, heartbeat=2)
    rec = conn.create_receiver("another_address", name=str(uuid4()), dynamic=False, options=None)
    print "sleeping.."
    sleep(random.uniform(0.3,3))
    rec2 = conn.create_receiver("some_address", name=str(uuid4()), dynamic=False, options=None)
  except ConnectionException:
    try:
#      rec.close()
#      rec2.close()
      if conn:
        conn.close()
    except Exception, e:
      print e
      print(traceback.format_exc())

Comment 7 Gordon Sim 2015-10-20 17:27:54 UTC

(In reply to Pavel Moravec from comment #5)
> (In reply to Gordon Sim from comment #4)
> > (In reply to Pavel Moravec from comment #2)
> > > (In reply to Pavel Moravec from comment #1)
> > > > Created attachment 1083808 [details]
> > > > tcpdump during reproducer test
> > > > 
> > > > See latest 10 frames there - self-explanatory what circumstances trigger the
> > > > bug.
> > > 
> > > Hi Gordon,
> > > would you be so kind to check this bug? (let me know if you prefer
> > > standalone reproducer without Satellite bits, should not be complex)
> > 
> > Is it possible that this is caused by
> > https://issues.apache.org/jira/browse/PROTON-1023 (now fixed upstream)? The
> > protocol trace shows that the last attach attempt is failing due to the
> > router not being able to route it, and depending exactly how that gets
> > processed it could result in an AttributeError being raised instead of the
> > expected LinkDetached or LinkException.
> > 
> > If not, a standalone reproducer would indeed be very useful.
> 
> Applying that patch helped but did not fully prevented the bug. I will
> prepare standalone reproducer, should be matter of very few (working) hours.

I've committed a further fix for a slightly different case now: https://issues.apache.org/jira/browse/PROTON-1024

This was what I was hitting when running your standalone reproducer. With that fix in place I haven't yet been able to reproduce.

Comment 8 Pavel Moravec 2015-10-20 20:52:42 UTC

(In reply to Gordon Sim from comment #7)
> > Applying that patch helped but did not fully prevented the bug. I will
> > prepare standalone reproducer, should be matter of very few (working) hours.
> 
> I've committed a further fix for a slightly different case now:
> https://issues.apache.org/jira/browse/PROTON-1024
> 
> This was what I was hitting when running your standalone reproducer. With
> that fix in place I haven't yet been able to reproduce.

Testing both standalone reproducer and/or goferd with both fixes of PROTON-1023 and PROTON-1024 does not trigger this bug further. Good catch!

Anyway, when running the standalone reproducer regardless of the patches applied or not, I spotted https://issues.apache.org/jira/browse/PROTON-1025. This close_wait leak is _not_ observed in goferd reproducer, therefore I do not plan to raise Satellite bugzilla for it - just reported to proton upstream.

I.e. for Satellite purposes, backport of PROTON-1023 and PROTON-1024 is the proper/sufficient fix.

Comment 13 Tazim Kolhar 2015-12-08 12:34:53 UTC

VERIFIED:

# rpm -qa | grep foreman
foreman-compute-1.7.2.48-1.el7sat.noarch
ruby193-rubygem-foreman-redhat_access-0.2.4-1.el7sat.noarch
rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el7sat.noarch
ruby193-rubygem-foreman_bootdisk-4.0.2.13-1.el7sat.noarch
ruby193-rubygem-foreman_docker-1.2.0.24-1.el7sat.noarch
rubygem-hammer_cli_foreman_docker-0.0.3.10-1.el7sat.noarch
foreman-debug-1.7.2.48-1.el7sat.noarch
foreman-postgresql-1.7.2.48-1.el7sat.noarch
ruby193-rubygem-foreman_openscap-0.3.2.10-1.el7sat.noarch
ruby193-rubygem-foreman_hooks-0.3.7-2.el7sat.noarch
foreman-gce-1.7.2.48-1.el7sat.noarch
ruby193-rubygem-foreman_discovery-2.0.0.22-1.el7sat.noarch
rubygem-hammer_cli_foreman_tasks-0.0.3.5-1.el7sat.noarch
puppet-foreman_scap_client-0.3.3-9.el7sat.noarch
foreman-selinux-1.7.2.17-1.el7sat.noarch
foreman-vmware-1.7.2.48-1.el7sat.noarch
ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el7sat.noarch
foreman-proxy-1.7.2.6-1.el7sat.noarch
ibm-x3655-04.ovirt.rhts.eng.bos.redhat.com-foreman-proxy-client-1.0-1.noarch
foreman-discovery-image-3.0.5-3.el7sat.noarch
foreman-1.7.2.48-1.el7sat.noarch
ruby193-rubygem-foreman-tasks-0.6.15.7-1.el7sat.noarch
rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el7sat.noarch
foreman-libvirt-1.7.2.48-1.el7sat.noarch
ibm-x3655-04.ovirt.rhts.eng.bos.redhat.com-foreman-client-1.0-1.noarch
ibm-x3655-04.ovirt.rhts.eng.bos.redhat.com-foreman-proxy-1.0-1.noarch
foreman-ovirt-1.7.2.48-1.el7sat.noarch
rubygem-hammer_cli_foreman-0.1.4.14-1.el7sat.noarch

1. Just to speed up reproducer, shorten reconnect delays of goferd in a content host. Update in /usr/lib/python2.7/site-packages/gofer/messaging/adapter/proton/connection.py:

DELAY = 1
MAX_DELAY = 10

2. Restart goferd service on the content host.
3. On Satellite, let qdrouterd be flapping in operation, by running simple script:
while true; do
  date
  service qdrouterd stop
  sleep 10; sleep $(($((RANDOM))%20))
  date
  service qdrouterd start
  sleep 10; sleep $(($((RANDOM))%20))
done

4. Monitor CPU usage of goferd on the content host
# while true; do date; ps aux | grep python | grep goferd; slp 5; done
Tue Dec  8 07:26:30 EST 2015
root     30877  0.5  1.1 1936004 43172 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:26:35 EST 2015
root     30877  0.5  1.1 1936004 43172 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:26:40 EST 2015
root     30877  0.5  1.1 1936004 43172 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:26:45 EST 2015
root     30877  0.5  1.1 1936004 43172 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:26:50 EST 2015
root     30877  0.5  1.1 1936004 43292 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:26:55 EST 2015
root     30877  0.5  1.1 1936004 43396 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:00 EST 2015
root     30877  0.5  1.1 1936004 43396 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:05 EST 2015
root     30877  0.5  1.1 1936004 43396 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:10 EST 2015
root     30877  0.5  1.1 1936004 43396 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:15 EST 2015
root     30877  0.5  1.1 1936004 43396 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:20 EST 2015
root     30877  0.5  1.1 1936004 43396 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:25 EST 2015
root     30877  0.5  1.1 1936004 43396 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:30 EST 2015
root     30877  0.5  1.1 1936004 43516 ?       Ssl  07:21   0:01 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:36 EST 2015
root     30877  0.5  1.1 1936004 43732 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:41 EST 2015
root     30877  0.5  1.1 1936004 43872 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:46 EST 2015
root     30877  0.5  1.1 1936004 43936 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:51 EST 2015
root     30877  0.5  1.1 1936004 43996 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground
Tue Dec  8 07:27:56 EST 2015
root     30877  0.5  1.1 1936004 44076 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground
Tue Dec  8 07:28:01 EST 2015
root     30877  0.5  1.1 1936004 44076 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground
Tue Dec  8 07:28:06 EST 2015
root     30877  0.5  1.1 1936004 44076 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground
Tue Dec  8 07:28:11 EST 2015
root     30877  0.5  1.1 1936004 44308 ?       Ssl  07:21   0:02 python /usr/bin/goferd --foreground


So, as per the expected result :
- no CPU usage growth
it is constant

Comment 15 errata-xmlrpc 2015-12-15 09:32:09 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:2621

Comment 17 Pavel Moravec 2016-03-08 08:52:17 UTC

*** Bug 1276176 has been marked as a duplicate of this bug. ***