Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1585916

Summary: give a clear error message when ceph-ansible package is missing
Product: Red Hat OpenStack
Component: openstack-tripleo-validations
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Status: CLOSED EOL
Target Milestone: ---
Target Release: ---
Reporter: Udi Kalifon <ukalifon>
Assignee: John Fulton <johfulto>
QA Contact: Yogev Rabl <yrabl>
Docs Contact:
CC: gfidente, jjoyce, jschluet, mburns, mgarciac, slinaber, stchen, tvignaud, ukalifon
Keywords: Triaged, ZStream
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-09-30 19:43:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  ceph-ansible workflow (no flags)

Description Udi Kalifon 2018-06-05 05:47:41 UTC
Description of problem:
When you try to deploy Ceph after forgetting to install the separate ceph-ansible package, the error message you get looks like this:

    resources.WorkflowTasks_Step2_Execution: ERROR

From this, the user is somehow expected to know to check /var/log/mistral/ceph-install-workflow.log, where the error points to the missing package.

We have hit this error many times already, and it still wastes huge amounts of time. Please handle this error better: make sure the message states what went wrong, the usual cause, and how to fix it.
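The kind of improvement the reporter is asking for can be sketched as a lookup from an opaque workflow failure to an actionable hint. This is a hypothetical illustration, not actual TripleO code; the KNOWN_FAILURES table and explain_failure() helper are invented for this sketch:

```python
# Hypothetical sketch, not TripleO code: attach an actionable hint to
# an otherwise opaque workflow failure. KNOWN_FAILURES and
# explain_failure() are invented names for illustration only.
KNOWN_FAILURES = {
    "resources.WorkflowTasks_Step2_Execution": (
        "the ceph-ansible run failed; a common cause is that the "
        "ceph-ansible package is not installed on the undercloud. "
        "See /var/log/mistral/ceph-install-workflow.log for details."
    ),
}

def explain_failure(resource_name: str) -> str:
    """Return the raw error, plus a hint when the failure is a known one."""
    message = f"{resource_name}: ERROR"
    hint = KNOWN_FAILURES.get(resource_name)
    if hint:
        message += f"\nHint: {hint}"
    return message
```

With a table like this, the user would see the probable cause and the log to check instead of a bare "ERROR".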


Version-Release number of selected component (if applicable):
openstack-tripleo-common-8.6.1-18.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-29.el7ost.noarch


How reproducible:
100%


Steps to Reproduce:
1. Deploy Ceph without having ceph-ansible on the undercloud


Additional info:
There is a validator that catches this error, but in my case I hit the ceph-ansible problem on a deployment where I did not even intend to deploy Ceph (it was enabled by mistake), so I did not look in that direction and did not understand the error I got. I was also deploying from the CLI, where the validators are almost never run.

Comment 1 Giulio Fidente 2018-06-08 08:57:23 UTC
As pointed out in the report, we already have a check in tripleo-validations [1]. I don't think there is much more we can do, so I'd be tempted to close this as WORKSFORME.

1. https://github.com/openstack/tripleo-validations/blob/stable/queens/validations/ceph-ansible-installed.yaml
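In spirit, the referenced validation reduces to checking whether the ceph-ansible RPM is present and reporting the result clearly. A minimal sketch of that decision, assuming the exit code of `rpm -q ceph-ansible` is available (the report() helper is hypothetical, not the actual validation code):

```python
# Minimal sketch of the validation's decision, not the actual
# tripleo-validations code. `rpm -q <pkg>` exits 0 when the package
# is installed and non-zero otherwise.
def report(rpm_query_rc: int) -> str:
    if rpm_query_rc == 0:
        return "PASS: ceph-ansible is installed on the undercloud"
    return ("FAIL: ceph-ansible is not installed on the undercloud; "
            "install it before deploying Ceph")
```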

Comment 2 Udi Kalifon 2018-06-10 07:54:53 UTC
All error messages must be clear. If all you get is "ERROR" with no hint beyond "Step 2", you have to do a lot more work to find out what is going on, instead of the source of the problem being immediately clear.

Comment 3 Udi Kalifon 2018-06-10 11:04:22 UTC
Created attachment 1449653 [details]
ceph-ansible workflow

By the way, I just hit this error again, and as you can see in the attached screenshot the validator passed, so the error must have more than one cause. It is crucial to get error messages with clear information on the exact problem.

Comment 4 Udi Kalifon 2018-06-10 13:36:53 UTC
I got this error even though I had ceph-ansible installed, apparently because I didn't pass the following parameters:

parameter_defaults:
    CinderEnableIscsiBackend: false
    CinderEnableRbdBackend: true
    CinderEnableNfsBackend: false
    NovaEnableRbdBackend: true
    GlanceBackend: rbd
    CinderRbdPoolName: "volumes"
    NovaRbdPoolName: "vms"
    GlanceRbdPoolName: "images"
    CephPoolDefaultPgNum: 32
    CephAnsibleDisksConfig:
        devices:
            - '/dev/vdb'
        journal_size: 512
        osd_scenario: collocated

Some of the above parameters are the defaults anyway; I am not sure which one made the difference between a successful deployment and a failure...

Comment 11 stchen 2020-09-30 19:43:30 UTC
Closing as EOL; OSP 15 was retired as of Sept 19, 2020.