Bug 1734517

Summary: [DCN][Spine & Leaf] failed to deploy Direct Interface OC computes Unable to write image to /tmp/{uuid}
Product: Red Hat OpenStack Reporter: bjacot
Component: documentation Assignee: RHOS Documentation Team <rhos-docs>
Status: CLOSED NOTABUG QA Contact: RHOS Documentation Team <rhos-docs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 15.0 (Stein) CC: bfournie, dtantsur, mburns
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-08-07 12:47:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
nova-compute log none

Description bjacot 2019-07-30 18:10:07 UTC
Description of problem:
The OC deployment is failing on OSP15 on the compute nodes. The controllers deployed successfully. In nova-compute.log I see an image download error: 'Download of image 17433796-7590-4780-bf3c-59b8a1b0ca93 failed: Unable to write image to /tmp/17433796-7590-4780-bf3c-59b8a1b0ca93. Error: [Errno 28] No space left on device'. The computes have a 50GB hard drive.

3: controller leaf0 <-- Successful
1: compute leaf0 <-- ERROR
1: compute leaf1 <-- Failed
1: compute leaf2 <-- Failed

Version-Release number of selected component (if applicable):
15  -p RHOS_TRUNK-15.0-RHEL-8-20190725.n.1

[root@site-undercloud-0 stack]# rpm -qa | grep ironic
puppet-ironic-14.4.1-0.20190423121513.cd9417e.el8ost.noarch
python3-ironic-inspector-client-3.5.0-0.20190313131319.9bb1150.el8ost.noarch
python3-ironicclient-2.7.2-0.20190529060404.266a700.el8ost.noarch
[root@site-undercloud-0 stack]# rpm -qa | grep heat
openstack-heat-engine-12.0.1-0.20190704050403.bf16acc.el8ost.noarch
puppet-heat-14.4.1-0.20190420110320.4425351.el8ost.noarch
python3-heat-agent-hiera-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-common-12.0.1-0.20190704050403.bf16acc.el8ost.noarch
python3-tripleoclient-heat-installer-11.5.1-0.20190723181704.f54216d.el8ost.noarch
python3-heatclient-1.17.0-0.20190312144725.8af5deb.el8ost.noarch
python3-heat-agent-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-json-file-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-docker-cmd-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-ansible-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-puppet-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-agents-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-api-12.0.1-0.20190704050403.bf16acc.el8ost.noarch
python3-heat-agent-apply-config-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-tripleo-heat-templates-10.6.1-0.20190725000448.e49b8db.el8ost.noarch
heat-cfntools-1.4.2-6.el8ost.noarch
openstack-heat-monolith-12.0.1-0.20190704050403.bf16acc.el8ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy the UC on OSP15
2. Prepare the templates for the OC deployment
3. Try to deploy the OC

Actual results:
Controllers succeed; computes fail.

Expected results:
All succeed

Additional info:
No workaround

Note:
There has been no change in the templates. This deployment succeeded in puddle RHOS_TRUNK-15.0-RHEL-8-20190716.n.0.

Errors:
2019-07-30 17:21:04.617 8 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'nova.virt.ironic.driver.IronicDriver._wait_for_active' failed: nova.exception.InstanceDeployFailure: Failed to provision instance c238b73a-d3a8-4599-af17-48efa357a187: node 081c30b8-ee88-4d7b-b073-41c0faa05ae5 command status errored: {'type': 'ImageDownloadError', 'code': 500, 'message': 'Error downloading image: Download of image 17433796-7590-4780-bf3c-59b8a1b0ca93 failed: Unable to write image to /tmp/17433796-7590-4780-bf3c-59b8a1b0ca93. Error: [Errno 28] No space left on device', 'details': 'Download of image 17433796-7590-4780-bf3c-59b8a1b0ca93 failed: Unable to write image to /tmp/17433796-7590-4780-bf3c-59b8a1b0ca93. Error: [Errno 28] No space left on device'}
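
Editorial note: Errno 28 is ENOSPC ("No space left on device"). As a minimal sketch for triaging similar reports, the following pulls the target path and errno out of a message shaped like the one above. The regular expression is written against the text quoted in this report only, not against any documented message format, and the function name is hypothetical.

```python
import errno
import re

# Assumption: the pattern mirrors the message text quoted in this bug report.
# It is not an official or stable format of the deploy agent.
IMAGE_WRITE_ERROR = re.compile(
    r"Unable to write image to (?P<path>\S+)\. "
    r"Error: \[Errno (?P<errno>\d+)\] (?P<reason>[^']+)"
)


def parse_image_write_error(message):
    """Return path, errno and reason from the message, or None if it does not match."""
    match = IMAGE_WRITE_ERROR.search(message)
    if match is None:
        return None
    code = int(match.group("errno"))
    return {
        "path": match.group("path"),
        "errno": code,
        "reason": match.group("reason"),
        # True when the failure is "No space left on device".
        "is_enospc": code == errno.ENOSPC,
    }


if __name__ == "__main__":
    sample = (
        "Download of image 17433796-7590-4780-bf3c-59b8a1b0ca93 failed: "
        "Unable to write image to /tmp/17433796-7590-4780-bf3c-59b8a1b0ca93. "
        "Error: [Errno 28] No space left on device"
    )
    print(parse_image_write_error(sample))
```

Note that the path being written is under /tmp on the node being deployed, so the size of the node's hard drive is not necessarily the limit that was hit.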

Comment 1 bjacot 2019-07-30 18:17:43 UTC
Created attachment 1594767 [details]
nova-compute log

Comment 2 bjacot 2019-07-31 19:00:57 UTC
Hello, BTW I am deploying via the direct deploy interface.

Comment 3 bjacot 2019-08-02 12:24:31 UTC
I ended up increasing the RAM from 6GB to 8GB and was able to provision. It looks like a minimum requirement may have changed.

Comment 4 bjacot 2019-08-05 17:14:42 UTC
Re-opening the bug because the documentation states that the minimum requirement is 6GB, not 8GB. Please advise.

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15-beta/html/director_installation_and_usage/planning-your-overcloud#compute-node-requirements

Comment 5 Dmitry Tantsur 2019-08-06 14:42:28 UTC
With the direct deploy interface, you either use image streaming (I can elaborate) or the image needs to fit into RAM so that it can be converted. The minimum requirement of 6GB has nothing to do with that; it is calculated for iSCSI deployment. We probably need to update the DCN docs to mention this behavior or use streaming.

Comment 6 Bob Fournier 2019-08-06 18:27:36 UTC
Removing blocker and setting as a doc bug, to document that when using direct deploy the node needs enough RAM to fit the image (8GB in this case).

Comment 7 Bob Fournier 2019-08-06 20:10:32 UTC
Note that overcloud images are built requiring a minimum of 7 GB for tmpfs - https://github.com/openstack/tripleo-common/blob/master/image-yaml/overcloud-images.yaml#L37
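
Editorial note: the arithmetic behind comments 3 to 7 can be sketched as below. This is a rough lower-bound check only, assuming the 7 GB tmpfs figure from comment 7 is the amount of RAM the image needs during a direct deploy without streaming; it ignores whatever memory the deploy ramdisk itself uses. The function name and the GiB interpretation of "GB" are assumptions made for illustration.

```python
GIB = 1024 ** 3  # Assumption: the "GB" values in this report are treated as GiB.


def ram_fits_image(node_ram_bytes, required_tmpfs_bytes):
    """Lower-bound check: can the node hold the image in RAM at all?

    The deploy ramdisk also needs memory for itself, so passing this
    check does not guarantee that provisioning succeeds.
    """
    return node_ram_bytes >= required_tmpfs_bytes


if __name__ == "__main__":
    required = 7 * GIB  # tmpfs minimum cited in comment 7
    for ram_gib in (6, 8):  # the two compute node sizes tried in this report
        verdict = "fits" if ram_fits_image(ram_gib * GIB, required) else "too small"
        print(f"{ram_gib} GiB RAM vs 7 GiB tmpfs: {verdict}")
```

Under these assumptions the 6GB nodes fall short of the tmpfs minimum and the 8GB nodes clear it, which is consistent with the behavior reported in comment 3.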

Comment 8 bjacot 2019-08-07 12:47:29 UTC
Thank you, Bob. I am closing this for now.