Bug 1575115

Summary: OSP13 - Container configuration generation fails if the host file system is xfs that was created with ftype=0
Product: Red Hat OpenStack
Reporter: Carlos Camacho <ccamacho>
Component: openstack-tripleo-validations
Assignee: Carlos Camacho <ccamacho>
Status: CLOSED ERRATA
QA Contact: Marius Cornea <mcornea>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 13.0 (Queens)
CC: agurenko, aschultz, augol, ccamacho, dwalsh, emacchi, esandeen, gcharot, jcoufal, jjoyce, jschluet, mbracho, mburns, mcornea, morazi, mszeredi, ohochman, pablo.iranzo, rhel-osp-director-maint, rhos-flags, roxenham, rscarazz, sbaker, slinaber, tvignaud, vcojot, vgoyal
Target Milestone: beta
Keywords: Triaged
Target Release: 13.0 (Queens)
Flags: mcornea: needinfo? (ccamacho)
Hardware: All
OS: All
Whiteboard:
Fixed In Version: openstack-tripleo-validations-8.4.1-4.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1564671
: 1580463
Environment:
Last Closed: 2018-06-27 13:55:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1564671
Bug Blocks: 1580463, 1580469, 1580476

Comment 9 Marius Cornea 2018-05-16 20:54:30 UTC
Hey Carlos,

I see this ticket has a tripleo-validations patch attached. Are we going to document the need to run the validation before the upgrade, or does this validation get triggered automatically before the upgrade? Please let me know so I can proceed to verify it.

Thanks!

Comment 10 Carlos Camacho 2018-05-21 14:51:31 UTC
Hey Marius, currently the only thing we can do is to validate that there are no volumes affected by this issue.

This validation is part of the pre-upgrade tasks, so if the validations are not disabled this should run.
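
For reference, the condition this validation looks for can also be checked by hand on a node. The sketch below is not the tripleo-validations implementation. It assumes the usual xfs_info output, in which the "naming" line carries an ftype=N field, and both function names are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of tripleo-validations.
# Reads xfs_info output on stdin; exits 0 if it reports ftype=0.
xfs_ftype_is_zero() {
    grep -Eq '(^|[[:space:]])ftype=0([[:space:]]|$)'
}

# Hypothetical helper: prints every mounted XFS filesystem whose ftype is 0.
# Assumes findmnt (util-linux) and xfs_info (xfsprogs) are installed.
list_ftype0_mounts() {
    findmnt -l -n -t xfs -o TARGET | while IFS= read -r target; do
        if xfs_info "$target" | xfs_ftype_is_zero; then
            printf '%s\n' "$target"
        fi
    done
}
```

Calling list_ftype0_mounts on a node prints one mount point per affected volume. No output means no mounted XFS volume reports ftype=0.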

I have cloned this bug into the following bugs to track all the backports:

OSP12 - https://bugzilla.redhat.com/show_bug.cgi?id=1580463
OSP11 - https://bugzilla.redhat.com/show_bug.cgi?id=1580469
OSP10 - https://bugzilla.redhat.com/show_bug.cgi?id=1580476

Comment 11 Marius Cornea 2018-05-21 23:08:10 UTC
(In reply to Carlos Camacho from comment #10)
> Hey Marius, currently the only thing we can do is to validate that there are
> no volumes affected by this issue.
> 
> This validation is part of the pre-upgrade tasks, so if the validations are
> not disabled this should run.
> 

Since this bug is filed against OSP13, I believe it should be used to track running the validation before the overcloud fast forward upgrade. If so, at which step in the fast forward upgrade procedure do we expect this validation to run? I'd assume before 'openstack overcloud ffwd-upgrade prepare', but I want to make sure I get it right before trying to verify it. Thanks!

Comment 12 Marius Cornea 2018-05-25 02:33:55 UTC
I tried to verify this BZ: I upgraded the undercloud of an affected environment to OSP13 and then tried to run this validation via:

(undercloud) [stack@undercloud-0 ~]$ mistral execution-get-output $(openstack workflow execution create -f value -c ID tripleo.validations.v1.run_groups '{"group_names": ["pre-upgrade"]}')

and only got this output:

{}

Comment 13 Marius Cornea 2018-05-25 14:44:44 UTC
Steps to run the validation:

(undercloud) [stack@undercloud-0 ~]$ uuid=$(openstack workflow execution create tripleo.validations.v1.run_validation '{"validation_name": "check-ftype"}'  -f json | jq -r -c '.ID');

(undercloud) [stack@undercloud-0 ~]$ mistral execution-get-output $uuid

{
    "status": "FAILED", 
    "result": null, 
    "stderr": "[DEPRECATION WARNING]: DEFAULT_SUDO_FLAGS option, In favor of become which is a\n generic framework . This feature will be removed in version 2.8. Deprecation \nwarnings can be disabled by setting deprecation_warnings=False in ansible.cfg.\n [WARNING]: Could not match supplied host pattern, ignoring: overcloud\n", 
    "stdout": "Task 'Check ftype' failed:\nHost: localhost\nMessage: XFS volumes formatted using ftype=0 are incompatible with the docker overlayfs driver. Run xfs_info in localhost.localdomain and fix those volumes before proceeding with the upgrade.\n\n\nFailure! The validation failed for all hosts:\n* localhost\n"
}
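
The result above is easier to read once the escaped newlines in "stdout" are expanded. A minimal sketch, assuming jq is installed (it is already used in the command above) and that the output is a single JSON object with "status" and "stdout" fields as shown; the function names are hypothetical:

```shell
# Hypothetical helpers; both read the JSON printed by
# `mistral execution-get-output` on stdin. Assumes jq is installed.

# Exits 0 when the "status" field is FAILED, as in the output above.
validation_failed() {
    [ "$(jq -r '.status')" = "FAILED" ]
}

# Prints the "stdout" field with its \n escapes expanded to real newlines.
validation_message() {
    jq -r '.stdout'
}
```

For example, `mistral execution-get-output $uuid | validation_message` would print the failure text one line at a time.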

We can see that the validation doesn't run on the overcloud nodes. I suspect this is because the tripleo ansible inventory doesn't contain the overcloud nodes yet: they get added after running overcloud ffwd-upgrade prepare, but at that point the upgrade process has already started, whereas a failed validation should prevent the upgrade process from starting.
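
A run that silently skipped the overcloud can be spotted from the Ansible warning in "stderr". The following is only a sketch: it assumes jq is installed and that the warning text matches the one in the output above, and the function name is hypothetical:

```shell
# Hypothetical helper: reads the validation output JSON on stdin and exits 0
# if Ansible warned that the given host pattern (default: overcloud) matched
# no hosts. Assumes jq is installed.
host_pattern_unmatched() {
    local pattern="${1:-overcloud}"
    jq -r '.stderr' |
        grep -Fq "Could not match supplied host pattern, ignoring: ${pattern}"
}
```

If `mistral execution-get-output $uuid | host_pattern_unmatched` succeeds, the result only reflects the hosts that were in the inventory (here, localhost).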

Comment 16 errata-xmlrpc 2018-06-27 13:55:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086