Bug 1413745 - Read timeout at beginning of CFME deployment
Summary: Read timeout at beginning of CFME deployment
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - CloudForms
Version: 1.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 1.1
Assignee: John Matthews
QA Contact: Sudhir Mallamprabhakara
Docs Contact: Dan Macpherson
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-01-16 20:59 UTC by James Olin Oden
Modified: 2017-01-17 20:23 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-17 20:23:02 UTC
Target Upstream Version:
Embargoed:



Description James Olin Oden 2017-01-16 20:59:35 UTC
Description of problem:
I was doing an OSP (Ceph) + CFME deployment. It got past OSP fine, but when it reached the CFME task, it hung at 0% for about 30 minutes, then jumped to 31.8% and errored out. The task error was:

   read timeout reached

In the production.log the following error can be found:

   2017-01-16 15:34:22 [foreman-tasks/action] [E] read timeout reached (Excon::Errors::Timeout)

The deployment log had no errors but had this to say about the CFME deployment:

   I, [2017-01-16T15:07:58.408467 #19518]  INFO -- : Planning CFME Deployment 
   D, [2017-01-16T15:07:59.149979 #19518] DEBUG -- : ====== CFME Upload run method ======
   D, [2017-01-16T15:33:13.412775 #19518] DEBUG -- : ====== Leaving CFME Upload run method ======
   D, [2017-01-16T15:33:13.464958 #19518] DEBUG -- : ====== CFME Security Group run method ======
   D, [2017-01-16T15:33:19.212941 #19518] DEBUG -- : ====== Leaving CFME Security Group run method ======
   D, [2017-01-16T15:33:19.254404 #19518] DEBUG -- : ====== CFME Flavor run method ======
   D, [2017-01-16T15:33:21.507910 #19518] DEBUG -- : ====== Leaving CFME Flavor run method ======
   D, [2017-01-16T15:33:21.550262 #19518] DEBUG -- : ====== CFME Launch run method ======

Version-Release number of selected component (if applicable):
QCI-1.1-RHEL-7-20170112.t.0

How reproducible:
First occurrence.

Steps to Reproduce:
1. Do an OSP (Ceph) + CFME deployment

Actual results:
The CFME task hung and then failed with the read timeout error described above.

Expected results:
No errors

Additional info:

Here is the full stack trace from the production.log:

    2017-01-16 15:34:22 [foreman-tasks/action] [E] read timeout reached (Excon::Errors::Timeout)
     | /opt/theforeman/tfm/root/usr/share/gems/gems/excon-0.45.3/lib/excon/socket.rb:299:in `raise_timeout_error'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/excon-0.45.3/lib/excon/socket.rb:49:in `rescue in readline'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/excon-0.45.3/lib/excon/socket.rb:44:in `readline'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/excon-0.45.3/lib/excon/response.rb:53:in `parse'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/excon-0.45.3/lib/excon/middlewares/response_parser.rb:6:in `response_call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/docker-api-1.28.0/lib/excon/middlewares/hijack.rb:45:in `response_call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/excon-0.45.3/lib/excon/connection.rb:372:in `response'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/excon-0.45.3/lib/excon/connection.rb:236:in `request'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fog-core-1.36.0/lib/fog/core/connection.rb:81:in `request'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fog-openstack-0.1.3.5/lib/fog/openstack/compute.rb:359:in `request'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fog-openstack-0.1.3.5/lib/fog/openstack/requests/compute/list_images_detail.rb:6:in `list_images_detail'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fog-openstack-0.1.3.5/lib/fog/openstack/models/compute/images.rb:21:in `all'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fog-core-1.36.0/lib/fog/core/collection.rb:113:in `lazy_load'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fog-core-1.36.0/lib/fog/core/collection.rb:28:in `select'
     | /usr/share/foreman/app/models/compute_resources/foreman/model/openstack.rb:55:in `available_images'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fusor_server-1.1.28/app/lib/actions/fusor/deployment/open_stack/cfme_launch.rb:58:in `create_image'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/fusor_server-1.1.28/app/lib/actions/fusor/deployment/open_stack/cfme_launch.rb:33:in `run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:506:in `block (3 levels) in execute_run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/stack.rb:26:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/stack.rb:26:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware.rb:17:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action/progress.rb:30:in `with_progress_calculation'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action/progress.rb:16:in `run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/stack.rb:22:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/stack.rb:26:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware.rb:17:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.90/app/lib/actions/middleware/keep_locale.rb:11:in `block in run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.90/app/lib/actions/middleware/keep_locale.rb:22:in `with_locale'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.90/app/lib/actions/middleware/keep_locale.rb:11:in `run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/stack.rb:22:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/stack.rb:26:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware.rb:17:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware.rb:30:in `run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/stack.rb:22:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/middleware/world.rb:30:in `execute'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:505:in `block (2 levels) in execute_run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:504:in `catch'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:504:in `block in execute_run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:419:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:419:in `block in with_error_handling'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:419:in `catch'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:419:in `with_error_handling'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:499:in `execute_run'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/action.rb:260:in `execute'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:9:in `block (2 levels) in execute'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/execution_plan/steps/abstract.rb:155:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/execution_plan/steps/abstract.rb:155:in `with_meta_calculation'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:8:in `block in execute'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:22:in `open_action'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:7:in `execute'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/executors/parallel/worker.rb:15:in `block in on_message'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matchers/abstract.rb:74:in `block in assigns'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matchers/abstract.rb:73:in `tap'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matchers/abstract.rb:73:in `assigns'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matching.rb:56:in `match_value'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matching.rb:36:in `block in match?'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matching.rb:35:in `each'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matching.rb:35:in `match?'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/algebrick-0.7.3/lib/algebrick/matching.rb:23:in `match'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/executors/parallel/worker.rb:12:in `on_message'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/context.rb:46:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/executes_context.rb:7:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.13.3/lib/dynflow/actor.rb:26:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/awaits.rb:15:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/buffer.rb:38:in `process_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/buffer.rb:31:in `process_envelopes?'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/buffer.rb:20:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/termination.rb:55:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/removes_child.rb:10:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/core.rb:161:in `process_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/core.rb:95:in `block in on_envelope'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/core.rb:118:in `block (2 levels) in schedule_execution'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in synchronize'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.0/lib/concurrent/actor/core.rb:115:in `block in schedule_execution'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/serialized_execution.rb:18:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/serialized_execution.rb:18:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/serialized_execution.rb:96:in `work'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/serialized_execution.rb:77:in `block in call_job'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/ruby_thread_pool_executor.rb:333:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/ruby_thread_pool_executor.rb:333:in `run_task'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/ruby_thread_pool_executor.rb:322:in `block (3 levels) in create_worker'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/ruby_thread_pool_executor.rb:305:in `loop'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/ruby_thread_pool_executor.rb:305:in `block (2 levels) in create_worker'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/ruby_thread_pool_executor.rb:304:in `catch'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.0/lib/concurrent/executor/ruby_thread_pool_executor.rb:304:in `block in create_worker'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `call'
     | /opt/theforeman/tfm/root/usr/share/gems/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `block in create_with_logging_context'
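
For context, the timeout is raised while fog-openstack lists images (list_images_detail) from inside available_images in the CFME Launch step, immediately after the CFME image upload; the roughly 25-minute gap between the "CFME Upload run method" log entries (15:07:59 to 15:33:13) matches the hang at 0%. A hedged way to check whether the overcloud image API itself is slow to answer that call (the overcloudrc path and its reachability from the Fusor host are assumptions about this environment):

    # Source the overcloud credentials (path is an assumption; adjust as needed)
    source ~/overcloudrc
    # Time the same image-listing request that fog makes; Excon's default read
    # timeout is roughly 60 seconds, so a response slower than that produces
    # "read timeout reached" in the CFME Launch step.
    time openstack image list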

Comment 3 Jason Montleon 2017-01-17 13:45:25 UTC
If the installation hangs, it's because the overcloud deployment can't communicate with the Ceph storage cluster for some reason. I agree it's not ideal, but a failure to connect doesn't cause the deployment to fail or the image uploads to fail outright; it just blocks until the storage becomes available again.

Make sure you're using Ceph Storage 2.0, not Ceph 1.3 as we were using with OSP 8; they are not compatible. If I recall correctly, Ceph Storage 1.3 packages have a version of roughly 0.97 and Ceph Storage 2.0 packages have a version of roughly 10.x.

Also, if you're using a virt environment to test, make sure no firewall rules on your host would prevent the overcloud hosts from communicating with the Ceph host, and try again.
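
For example (the Ceph host address below is a placeholder; 6789 is the default Ceph monitor port):

    # From one of the overcloud controller/compute nodes:
    ping -c 3 <ceph-host>            # replace with the actual Ceph host address
    nc -zv <ceph-host> 6789          # the Ceph monitor port must be reachable

    # On the virt host, confirm nothing in the FORWARD chain drops traffic
    # between the overcloud bridge and the Ceph host:
    iptables -L FORWARD -n -v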

Comment 6 James Olin Oden 2017-01-17 14:26:49 UTC
Concerning the version of ceph, here is what I have:

[vagrant@ceph ~]$ rpm -qa | grep ceph
ceph-selinux-10.2.3-17.el7cp.x86_64
ceph-deploy-1.5.36-20.el7cp.noarch
python-cephfs-10.2.3-17.el7cp.x86_64
ceph-base-10.2.3-17.el7cp.x86_64
ceph-mon-10.2.3-17.el7cp.x86_64
ceph-mds-10.2.3-17.el7cp.x86_64
libcephfs1-10.2.3-17.el7cp.x86_64
ceph-common-10.2.3-17.el7cp.x86_64
ceph-osd-10.2.3-17.el7cp.x86_64
ceph-radosgw-10.2.3-17.el7cp.x86_64
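
These are 10.2.x (Jewel) packages, i.e. Ceph Storage 2.0. The running binaries can be double-checked directly, for example:

    ceph --version        # should report 10.2.x for Ceph Storage 2.0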

Comment 7 James Olin Oden 2017-01-17 14:30:29 UTC
And here are the firewall rules in question:

1340K 4168M ACCEPT     all  --  *      virbr9  0.0.0.0/0            192.168.175.0/24     ctstate RELATED,ESTABLISHED
3542K  227M ACCEPT     all  --  virbr9 *       192.168.175.0/24     0.0.0.0/0           

These basically allow connections on the 192.168.175.0/24 network, which is the provisioning network for the Fusor host and where the Ceph server lives (its address falls within the provisioning range used to hand out addresses via DHCP).

Comment 8 James Olin Oden 2017-01-17 14:35:39 UTC
My second run of the same deployment failed the same way, so I guess this is reproducible.

Comment 9 Jason Montleon 2017-01-17 16:08:12 UTC
ceph status reports:
ceph status
    cluster bc5e37bc-3cf8-4823-a7c1-5109ddeefe63
     health HEALTH_ERR
     ...
            no osds

I'd guess this is why it is failing.

Mine reads:
[root@ceph20 ~]# ceph status
    cluster 42e66d84-53f0-4283-9d63-7077e1b566f4
     health HEALTH_OK
     ...
     osdmap e18: 3 osds: 3 up, 3 in


Please double-check your changes to hostnames, etc., and review the output for errors when Vagrant runs the setup.sh script.
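
A few commands that may help show why no OSDs came up on the Ceph host (service unit names assume the systemd layout used by Ceph 10.2/Jewel):

    # On the Ceph host:
    ceph osd tree                      # which OSDs exist, and whether they are up/in
    systemctl --all | grep ceph-osd    # are any ceph-osd@N units present and running?
    ceph-disk list                     # do the data disks show prepared/active partitions?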

Comment 10 Jason Montleon 2017-01-17 20:21:06 UTC
I set up a Ceph host. It started with:
/dev/vdb1                                                      1.0T  5.1G 1019G   1% /storage1
/dev/vdc1                                                      1.0T  5.1G 1019G   1% /storage2
/dev/vdd1                                                      1.0T  5.1G 1019G   1% /storage3

After the entire deployment was complete:
/dev/vdb1                                                      1.0T  9.1G 1015G   1% /storage1
/dev/vdc1                                                      1.0T  9.1G 1015G   1% /storage2
/dev/vdd1                                                      1.0T  9.1G 1015G   1% /storage3

[root@ceph20 ~]# ceph status
    cluster 42e66d84-53f0-4283-9d63-7077e1b566f4
     health HEALTH_OK
     monmap e1: 1 mons at {ceph20=192.168.240.254:6789/0}
            election epoch 3, quorum 0 ceph20
     osdmap e23: 3 osds: 3 up, 3 in
            flags sortbitwise
      pgmap v530: 124 pgs, 4 pools, 4083 MB data, 916 objects
            27756 MB used, 3043 GB / 3070 GB avail
                 124 active+clean

I will look at adding a validation so we can catch inaccessible and unhealthy clusters in 1.2.
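
As a rough sketch of what such a pre-flight validation might look like (the host name is a placeholder, and the exact integration point in Fusor is still to be decided):

    # Run against the Ceph host before starting the deployment:
    ssh root@<ceph-host> 'ceph health' | grep -q HEALTH_OK \
        || echo "Ceph cluster is not healthy; aborting"
    ssh root@<ceph-host> 'ceph osd stat'   # expect at least one OSD up and in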

