Bug 1250029 - capsule sync fails as Host did not respond within 20 seconds
Status: CLOSED NOTABUG
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Capsule
Version: 6.1.0
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: Unspecified
Target Release: --
Assigned To: Stephen Benjamin
QA Contact: Sachin Ghai
Docs Contact:
Depends On:
Blocks:
 
Reported: 2015-08-04 07:44 EDT by Pradeep Kumar Surisetty
Modified: 2017-02-23 14:48 EST
CC List: 4 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-08-20 12:55:41 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
satellite server foreman logs (1.89 MB, text/plain)
2015-08-04 08:05 EDT, Pradeep Kumar Surisetty
/var/log/messages of satellite server (7.09 MB, text/plain)
2015-08-04 08:15 EDT, Pradeep Kumar Surisetty
capsule logs (205.03 KB, text/plain)
2015-08-04 08:18 EDT, Pradeep Kumar Surisetty

Description Pradeep Kumar Surisetty 2015-08-04 07:44:14 EDT
Description of problem:


1) Sync one of the capsules:

[root@perfc-380g8-01 ~]# time hammer -u admin -p changeme capsule content synchronize  --id 82
[..................................................................................................................] [100%]
Host did not respond within 20 seconds. Is katello-agent installed and goferd running on the Host?

real	0m21.661s
user	0m0.805s
sys	0m0.110s
[root@perfc-380g8-01 ~]# 


The sync fails with the error shown above.

Although the sync is reported as failed on the satellite server, the capsule sync actually completes on the capsule server, as the disk usage below shows.

2) Disk usage on the capsule before the sync:

[root@vmcapsule081 ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/rhel_dhcp23--55-root  8.5G  1.8G  6.7G  22% /
devtmpfs                          911M     0  911M   0% /dev
tmpfs                             920M     0  920M   0% /dev/shm
tmpfs                             920M  8.4M  912M   1% /run
tmpfs                             920M     0  920M   0% /sys/fs/cgroup
/dev/vda1                         497M  161M  336M  33% /boot
[root@vmcapsule081 ~]# 



3) Disk usage on the capsule after the sync:

[root@vmcapsule081 ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/rhel_dhcp23--55-root  8.5G  5.8G  2.7G  69% /
devtmpfs                          911M     0  911M   0% /dev
tmpfs                             920M     0  920M   0% /dev/shm
tmpfs                             920M  8.4M  912M   1% /run
tmpfs                             920M     0  920M   0% /sys/fs/cgroup
/dev/vda1                         497M  161M  336M  33% /boot
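
The extra ~4 GB of used space confirms that content did land on the capsule. To see where it went, a quick check (a sketch, assuming the default Pulp and MongoDB locations on the capsule):

# show how much the synced content and its metadata occupy (assumed default paths)
du -sh /var/lib/pulp /var/lib/mongodb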


4) goferd status on the capsule:

[root@vmcapsule081 ~]# service goferd status
Redirecting to /bin/systemctl status  goferd.service
goferd.service - Gofer Agent
   Loaded: loaded (/usr/lib/systemd/system/goferd.service; enabled)
   Active: active (running) since Tue 2015-08-04 07:01:24 EDT; 1min 28s ago
 Main PID: 12706 (python)
   CGroup: /system.slice/goferd.service
           └─12706 python /usr/bin/goferd --foreground

Aug 04 07:01:26 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][worker-0] gofer.messaging.adapter.proton.connection:100 - connecting: URL: amqps://perfc-...on: None
Aug 04 07:01:26 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][worker-0] root:485 - connecting to perfc-380g8-01.perf.lab.eng.rdu.redhat.com:5647...
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][worker-0] gofer.messaging.adapter.proton.connection:103 - connected: amqps://perfc-380g8-...com:5647
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][worker-0] root:505 - connected to perfc-380g8-01.perf.lab.eng.rdu.redhat.com:5647
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][worker-0] gofer.messaging.adapter.proton.connection:154 - closed: amqps://perfc-380g8-01....com:5647
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][pulp.agent.10210573-dd51-4200-ac93-189643f128bb] gofer.messaging.adapter.proton.connectio...on: None
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][worker-0] gofer.agent.plugin:318 - plugin:katelloplugin queue:pulp.agent.10210573-dd51-42...attached
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][pulp.agent.10210573-dd51-4200-ac93-189643f128bb] root:485 - connecting to perfc-380g8-01....:5647...
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][pulp.agent.10210573-dd51-4200-ac93-189643f128bb] gofer.messaging.adapter.proton.connectio...com:5647
Aug 04 07:01:27 vmcapsule081.perf.lab.eng.rdu.redhat.com goferd[12706]: [INFO][pulp.agent.10210573-dd51-4200-ac93-189643f128bb] root:505 - connected to perfc-380g8-01.p...com:5647
Hint: Some lines were ellipsized, use -l to show in full.
[root@vmcapsule081 ~]# 

5) Katello agent packages on the capsule:

[root@vmcapsule081 ~]# rpm -qa| grep katello
katello-debug-2.2.0.14-1.el7sat.noarch
katello-default-ca-1.0-1.noarch
katello-ca-consumer-perfc-380g8-01.perf.lab.eng.rdu.redhat.com-1.0-1.noarch
katello-installer-base-2.3.17-1.el7sat.noarch
katello-agent-2.2.5-1.el7sat.noarch
katello-server-ca-1.0-1.noarch
pulp-katello-0.5-1.el7sat.noarch
katello-service-2.2.0.14-1.el7sat.noarch
katello-certs-tools-2.2.1-1.el7sat.noarch
[root@vmcapsule081 ~]# 
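
A quick way to confirm the agent path from the capsule side (a minimal sketch, assuming the default 5647/tcp qdrouterd listener that goferd connects to in the log above):

# on the capsule: goferd should be running and holding a connection to the satellite on 5647/tcp
systemctl is-active goferd
ss -tnp | grep 5647
# basic reachability check of the satellite's messaging port
nc -zv perfc-380g8-01.perf.lab.eng.rdu.redhat.com 5647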


  
 

Version-Release number of selected component (if applicable):

capsule:   3.10.0-229.7.2.el7.x86_64
satellite: 3.10.0-229.7.2.el7.x86_64



How reproducible:


Steps to Reproduce:
1. Sync one of the capsules (hammer capsule content synchronize --id <capsule-id>).

Actual results:

The sync fails with this error:
Host did not respond within 20 seconds. Is katello-agent installed and goferd running on the Host?

However, the sync does complete on the capsule server.

Expected results:

The sync should complete without any errors.

Additional info:
Comment 1 Pradeep Kumar Surisetty 2015-08-04 08:05:21 EDT
Created attachment 1059056 [details]
satellite server foreman logs
Comment 2 Pradeep Kumar Surisetty 2015-08-04 08:15:46 EDT
Created attachment 1059058 [details]
/var/log/messages of satellite server
Comment 3 Pradeep Kumar Surisetty 2015-08-04 08:18:31 EDT
Created attachment 1059060 [details]
capsule logs
Comment 4 Pradeep Kumar Surisetty 2015-08-06 05:18:49 EDT
The capsule tries to reset the dropped connection multiple times:


Aug  6 04:17:07 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:10 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:10 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:12 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:17 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:18 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:22 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:23 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:27 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:28 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:30 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:32 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:33 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:33 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:36 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:37 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:41 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:41 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:44 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:45 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:47 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:47 vmcapsule002 pulp: requests.packages.urllib3.connectionpool:INFO: Resetting dropped connection: perfc-380g8-01.perf.lab.eng.rdu.redhat.com
Aug  6 04:17:56 vmcapsule002 pulp: celery.worker.job:INFO: Task pulp.server.managers.repo.sync.sync[775eac06-89ce-4524-a08f-5b3d8cb93df8] succeeded in 84.852102816s: <pulp.server.async.tasks.TaskResult object at 0x290d810>
Aug  6 04:17:56 vmcapsule002 pulp: celery.worker.strategy:INFO: Received task: pulp.server.async.tasks._queue_reserved_task[528a0a80-f693-4d48-b224-6f9d417e86fb]
Aug  6 04:17:56 vmcapsule002 pulp: celery.worker.job:INFO: Task pulp.server.async.tasks._release_resource[e06ae0c2-7b6b-4063-9055-945a01900d58] succeeded in 0.012236169s: None
Aug  6 04:17:56 vmcapsule002 pulp: celery.worker.strategy:INFO: Received task: pulp.server.managers.repo.publish.publish[27786258-b1dd-4ddb-b276-353aeb2d1613]
Aug  6 04:17:56 vmcapsule002 pulp: celery.worker.strategy:INFO: Received task: pulp.server.async.tasks._release_resource[1fb72430-5cfc-49b5-bc0c-23eadd0d396c]
Aug  6 04:17:56 vmcapsule002 pulp: celery.worker.job:INFO: Task pulp.server.async.tasks._queue_reserved_task[528a0a80-f693-4d48-b224-6f9d417e86fb] succeeded in 0.033959336s: None
Aug  6 04:17:59 vmcapsule002 goferd: [INFO][worker-0] gofer.agent.rmi:128 - sn=693a4424-9185-498f-9432-cbb6c841dbbc processed in: 1.616 (minutes)
Aug  6 04:18:07 vmcapsule002 pulp: celery.worker.job:INFO: Task pulp.server.managers.repo.publish.publish[27786258-b1dd-4ddb-b276-353aeb2d1613] succeeded in 11.123642728s: {'exception': None, 'repo_id':...
Aug  6 04:18:07 vmcapsule002 pulp: celery.worker.job:INFO: Task pulp.server.async.tasks._release_resource[1fb72430-5cfc-49b5-bc0c-23eadd0d396c] succeeded in 0.013110976s: None


http://pastebin.test.redhat.com/303282
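
The repeated resets suggest something between the capsule and the satellite is dropping the sessions. A quick check of the host firewall on the satellite (a sketch, assuming firewalld/iptables on RHEL 7 and the 5646/5647 messaging ports; 5647 is the port goferd connects to in the logs above):

# on the satellite
firewall-cmd --list-all
iptables -S | grep -E '5646|5647'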
Comment 5 Sachin Ghai 2015-08-07 05:59:26 EDT
I'm getting the same message after a disconnected (ISO) upgrade.

[root@cloud-qe-6 ~]# hammer -u admin -p changeme capsule content synchronize --id=2
[............................................................................................................................................] [100%]
Host did not respond within 20 seconds. Is katello-agent installed and goferd running on the Host?
----------

2015-08-07 05:46:23 [E] Host did not respond within 20 seconds. Is katello-agent installed and goferd running on the Host? (RuntimeError)
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.65/app/lib/actions/pulp/consumer/sync_node.rb:35:in `process_timeout'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action/polling.rb:23:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action/cancellable.rb:9:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.65/app/lib/actions/pulp/abstract_async_task.rb:57:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/action.rb:487:in `block (3 levels) in execute_run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `call'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:26:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware.rb:16:in `pass'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware.rb:25:in `run'
/opt/rh/ruby193/root/usr/share/gems/gems/dynflow-0.7.7.9/lib/dynflow/middleware/stack.rb:22:in `call'
-----
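
The trace shows the failure coming from process_timeout in Katello's sync_node action, i.e. the satellite gave up waiting for the capsule to accept the task. If the capsule is reachable but just slow to answer, the 20-second window could be widened (a sketch, assuming the window maps to Katello's accept_action_timeout setting):

# raise the timeout from 20 to 60 seconds (assumed setting name)
hammer -u admin -p changeme settings set --name accept_action_timeout --value 60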
Comment 7 Sachin Ghai 2015-08-07 08:53:12 EDT
Firewall rules were enabled on the satellite server. I flushed the rules and re-ran the sync, and this time it completed successfully without any errors:

[root@cloud-qe-6 ~]# hammer -u admin -p changeme capsule content synchronize --id=2
[............................................................................................................................................] [100%]
[

So I'm removing the blocker flag. Thanks.
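
Rather than flushing all rules, opening the agent messaging port should be enough (a sketch, assuming firewalld and the default 5647/tcp listener goferd uses in the logs above; other blocked Satellite/Capsule ports would need the same treatment):

# on the satellite
firewall-cmd --permanent --add-port=5647/tcp
firewall-cmd --reload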
Comment 8 Pradeep Kumar Surisetty 2015-08-07 09:25:58 EDT
I have already flushed the firewall rules on the satellite server, and I still see this issue.
Comment 10 Pradeep Kumar Surisetty 2015-08-20 12:55:41 EDT
After installing a capsule server, if the user reboots the capsule server, they will hit this issue for sure.

Disabling the firewall and restarting all of the services below helps (a shorter alternative is sketched after the list).

Redirecting to /bin/systemctl start  foreman-proxy.service
Redirecting to /bin/systemctl start  goferd.service
Redirecting to /bin/systemctl start  qdrouterd.service
Redirecting to /bin/systemctl start  qpidd.service
Redirecting to /bin/systemctl start  pulp_workers.service
Redirecting to /bin/systemctl start  pulp_resource_manager.service
Redirecting to /bin/systemctl start  pulp_celerybeat.service
Redirecting to /bin/systemctl start  httpd.service
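
The katello-service wrapper (the katello-service package is installed on the capsule, per the rpm list in the description) should restart the same set of services in one step; a sketch:

katello-service restart
# or, equivalently, restart the units listed above one by one:
for s in foreman-proxy goferd qdrouterd qpidd pulp_workers pulp_resource_manager pulp_celerybeat httpd; do
    systemctl restart "$s"
done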
Comment 12 Pradeep Kumar Surisetty 2015-09-03 05:43:11 EDT
Stephen,

Please help me understand: is there something I need to provide here?

--Pradeep
Comment 14 Og Maciel 2015-10-13 10:19:02 EDT
I spoke to Pradeep and he told me that in the end he re-enabled the firewall and restarted all services. He did not see the issue anymore.
