Bug 1391554 - Deployment FAILED: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0] [NEEDINFO]
Summary: Deployment FAILED: /usr/bin/clustercheck >/dev/null returned 1 instead of one...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: mariadb-galera
Version: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Damien Ciabrini
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks: 1520573
Reported: 2016-11-03 14:28 UTC by Francisco Javier Lopez Y Grueber
Modified: 2017-12-12 15:29 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1520573
Environment:
Last Closed: 2017-12-12 15:29:54 UTC
Target Upstream Version:
chjones: needinfo? (flg)


Attachments
Filtered Messages on failed deployment (16.21 KB, text/plain)
2016-11-03 14:28 UTC, Francisco Javier Lopez Y Grueber

Description Francisco Javier Lopez Y Grueber 2016-11-03 14:28:40 UTC
Created attachment 1217018
Filtered Messages on failed deployment

Description of problem:


    "deploy_stderr": "Could not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\nCould not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\n\u001b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.\u001b[0m\n\u001b[1;31mWarning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.\u001b[0m\n\u001b[1;31mError: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Main/Exec[galera-ready]/returns: change from notrun to 0 failed: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]\u001b[0m\n", 
    "deploy_status_code": 6
  }, 
  "creation_time": "2016-11-03T08:52:58", 
  "updated_time": "2016-11-03T09:25:49", 
  "input_values": {}, 
  "action": "CREATE", 
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 6", 
  "id": "7770a8d7-288b-4e9e-9106-31cca8cf855c"

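The "deploy_stderr" value above is hard to read because the Puppet output is wrapped in ANSI colour codes and joined into one string. The following is a minimal sketch, not part of the original report, of how the error lines could be pulled out once the JSON string has been decoded. The function name `puppet_errors` is hypothetical, and the sample input is shortened from the output above.

```python
import re

# Matches ANSI SGR colour sequences such as ESC[1;31m and ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def puppet_errors(deploy_stderr):
    """Return the lines starting with 'Error:' from a decoded deploy_stderr string."""
    clean = ANSI_SGR.sub("", deploy_stderr)
    return [line for line in clean.splitlines() if line.startswith("Error:")]


if __name__ == "__main__":
    # Shortened sample taken from the deploy_stderr value in this report.
    sample = (
        "\u001b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, "
        "but no replset_members or replset_config provided.\u001b[0m\n"
        "\u001b[1;31mError: /usr/bin/clustercheck >/dev/null returned 1 "
        "instead of one of [0]\u001b[0m\n"
    )
    for line in puppet_errors(sample):
        print(line)
```

Filtering this way separates the two Error lines (both about clustercheck) from the Facter and Warning noise that precedes them.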

Version-Release number of selected component (if applicable):

openstack-tripleo-heat-templates-0.8.14-14.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-14.el7ost.noarch

How reproducible:
Always 

Steps to Reproduce:
1. 
openstack overcloud deploy ${DEBUGON} --templates \
  -e ${TEMPLATEDIR7}/nodeuserdata_env.yaml -e ${TEMPLATEDIR7}/cloudname.yaml \
  -e ${TEMPLATEDIR7}/environments/network-isolation.yaml -e ${TEMPLATEDIR7}/puppet-ceph-external.yaml \
  -e ${TEMPLATEDIR7}/ips-from-pool-all.yaml -e ${TEMPLATEDIR7}/timezone.yaml \
  -e ${TEMPLATEDIR7}/network-environment.yaml -e ${TEMPLATEDIR7}/network-management.yaml \
  -e ${TEMPLATEDIR7}/puppet-ceph-external.yaml -e ${TEMPLATEDIR7}/scheduler_hints.yaml \
  -e ${TEMPLATEDIR7}/parameters/customer.yaml --control-scale 3 --compute-scale 4 --ceph-storage-scale 0 \
  --control-flavor control --compute-flavor compute --ntp-server ${NTPSRV} --validation-errors-fatal --block-storage-scale 0 --swift-storage-scale 0


2. Wait for the deployment to finish
3. Watch Resources during deployment

Actual results:

The deployment fails. The nodes come up, but post-deployment configuration does not finish, and the environment is not operational.

Expected results:

Successful OSP deployment.


Additional info:

A Puppet ticket exists for the same message:

https://tickets.puppetlabs.com/browse/MODULES-3476

In that ticket, Puppet 3.8.6 was in use.

The current versions on OSP 8 are:

puppet-3.6.2-4.el7sat.noarch

openstack-puppet-modules-7.0.19-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
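
The failing resource, Exec[galera-ready], runs /usr/bin/clustercheck and expects exit status 0. To watch by hand how long Galera takes to become ready on a controller, a command can be polled until it exits 0. This is a minimal sketch under assumptions: the helper name `wait_until_ok` is hypothetical, and the retry count and sleep interval are illustrative values, not the ones used by the Puppet manifest.

```python
import subprocess
import time


def wait_until_ok(cmd, tries=5, sleep_s=10.0):
    """Run cmd up to `tries` times.

    Returns the 1-based attempt number on which cmd exited 0,
    or None if every attempt returned a non-zero status.
    """
    for attempt in range(1, tries + 1):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode == 0:
            return attempt
        if attempt < tries:
            time.sleep(sleep_s)  # wait before the next attempt
    return None
```

On a controller node this could be called as `wait_until_ok(["/usr/bin/clustercheck"])`. A result of None means the check never succeeded within the chosen number of tries, which matches the failure reported here.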

Comment 1 Francisco Javier Lopez Y Grueber 2016-11-03 14:40:47 UTC
Hi, 

There was a change to the switch configuration. The network has been verified: all nodes are able to ping each other on all interfaces, so I assume this is OK.

I am not investigating the irritating os-collect-config messages described in

https://bugs.launchpad.net/os-collect-config/+bug/1437952

Comment 2 Chris Jones 2017-11-14 14:26:09 UTC
How reproducible is this issue? Do you have full deployment logs from a failed deployment? Or a deployment we could access that has this issue?

Comment 3 Dan Trainor 2017-11-30 21:02:05 UTC
I am able to consistently reproduce this in my environment, though for a different fact:

"stderr: \u001b[1;33mWarning: Facter: Could not retrieve fact='erl_ssl_path', resolution='<anonymous>': undefined method `gsub!' for false:FalseClass\u001b[0m",

Full 'openstack stack failures list overcloud --long' at http://pastebin.test.redhat.com/536719


I'm using the 2017-11-28.3 puddle, deploying via UI, with the following deployment plan options:

Base resources configuration, Containerized Deployment, environments/containers-default-parameters.yaml, environments/docker-ha.yaml, High Availability (Pacemaker)

The Overcloud deployment contains three controllers and one compute node.

I'll leave the environment up and allow access to it for Damien Ciabrini, on the suggestion of Chris Jones.

Comment 4 Chris Jones 2017-12-12 15:29:54 UTC
After some discussion, we feel this is related to network configuration issues rather than a bug. Please re-open if your understanding of this is different or changes.
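
Since the conclusion points at network configuration, note that the ping test mentioned in comment 1 only shows ICMP reachability. It does not show that the TCP ports Galera uses are open between the controllers. The following is a minimal sketch of a TCP-level check, assuming the usual Galera default ports (3306, 4567, 4568, 4444); a given deployment may use different ones, and the helper names are hypothetical.

```python
import socket

# Usual Galera-related default TCP ports (an assumption; check the deployment).
GALERA_PORTS = {
    3306: "MySQL client connections",
    4567: "Galera group replication",
    4568: "incremental state transfer (IST)",
    4444: "state snapshot transfer (SST)",
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_hosts(hosts, ports=GALERA_PORTS, timeout=3.0):
    """Map each (host, port) pair to True/False reachability."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}
```

Run from each controller against the internal API addresses of the other two, for example `check_hosts(["<controller-1-ip>", "<controller-2-ip>"])`, any False entry would point to a firewall, VLAN, or switch problem that ping alone does not reveal.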

