Bug 1379177 - When deploying a new overcloud, db_sync will fail to run properly
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assignee: Angus Thomas
QA Contact: Omri Hochman
Reported: 2016-09-25 19:14 UTC by David Hill
Modified: 2017-03-01 17:04 UTC (History)

Last Closed: 2017-03-01 17:03:56 UTC

Description David Hill 2016-09-25 19:14:53 UTC
Description of problem:
When deploying a new overcloud, db_sync fails to run properly because MySQL does not appear to be available, as shown by this error message:

Sep 25 18:25:47 overcloud-controller-0.localdomain os-collect-config[8404]: in 71.99 seconds\u001b[0m\n", "deploy_stderr": "\u001b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.\u001b[0m\n\u001b[1;31mWarning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.\u001b[0m\n\u001b[1;31mError: Could not prefetch mysql_user provider 'mysql': Execution of '/usr/bin/mysql -NBe SELECT CONCAT(User, '@',Host) AS User FROM mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)\u001b[0m\n\u001b[1;31mError: Could not prefetch mysql_database provider 'mysql': Execution of '/usr/bin/mysql -NBe show databases' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)\u001b[0m\n", "deploy_status_code": 0}
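The deploy_stderr value in the log above is wrapped in ANSI color codes, which makes the real Puppet errors hard to pick out from the warnings. A minimal sketch of how one might filter it, assuming the JSON has already been decoded so that each \u001b is a real escape character; the function name extract_errors is hypothetical and the sample contains only two of the messages quoted above:

```python
import re

# Matches ANSI color sequences such as ESC[1;31m and ESC[0m.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def extract_errors(deploy_stderr):
    """Strip ANSI color codes and return only the lines starting with 'Error:'."""
    clean = ANSI_RE.sub("", deploy_stderr)
    return [line for line in clean.splitlines() if line.startswith("Error:")]


# Two messages copied from the log above (already JSON-decoded).
sample = (
    "\x1b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, "
    "but no replset_members or replset_config provided.\x1b[0m\n"
    "\x1b[1;31mError: Could not prefetch mysql_database provider 'mysql': "
    "Execution of '/usr/bin/mysql -NBe show databases' returned 1: "
    "ERROR 2002 (HY000): Can't connect to local MySQL server through socket "
    "'/var/lib/mysql/mysql.sock' (2)\x1b[0m\n"
)

if __name__ == "__main__":
    for line in extract_errors(sample):
        print(line)
```

Run against the full deploy_stderr, this should leave only the two "Could not prefetch" errors, both of which point at the missing /var/lib/mysql/mysql.sock socket.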


Version-Release number of selected component (if applicable):


How reproducible:
Almost every time

Steps to Reproduce:
1. Deploy a new overcloud

Actual results:
Fails at step 2 of the controller deployment

Expected results:
Succeeds

Additional info:

Comment 1 David Hill 2016-09-27 17:50:07 UTC
This issue starts with RHOSP 8.x. As soon as the undercloud VM is under a bit of stress, step 4 of overcloudcontrollerdeployment almost always fails.

Comment 2 James Slagle 2017-03-01 16:05:32 UTC
Can't reproduce based on the steps:
1. Deploy a new overcloud

Please reopen if this is reproducible.

Comment 3 David Hill 2017-03-01 17:03:27 UTC
2. Make sure that at least one compute node fails to deploy and that nova/ironic puts it in an error state, then delete that compute node and retry creating one.

I found a workaround for this: delete the compute/controller entries that have "deleted_at" != null in the "instances" table of the "nova" database.
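As a rough illustration of that cleanup, the sketch below lists and then deletes rows whose deleted_at is set. It is not the exact procedure used here: it runs against an in-memory SQLite stand-in table with only three columns, the helper names and the demo rows are made up, and the real nova database is MySQL with many more columns and related tables, so a backup and an inspection of the selected rows would be needed first. Note that SQL needs IS NOT NULL rather than != null.

```python
import sqlite3


def make_demo_db():
    """Build a tiny stand-in for nova's instances table (hypothetical demo rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances (uuid TEXT, hostname TEXT, deleted_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO instances VALUES (?, ?, ?)",
        [
            ("uuid-1", "overcloud-controller-0", None),
            ("uuid-2", "overcloud-compute-0", "2016-09-25 18:00:00"),
        ],
    )
    conn.commit()
    return conn


def find_soft_deleted(conn):
    """Return (uuid, hostname, deleted_at) for rows whose deleted_at is set."""
    cur = conn.execute(
        "SELECT uuid, hostname, deleted_at FROM instances "
        "WHERE deleted_at IS NOT NULL ORDER BY uuid"
    )
    return cur.fetchall()


def purge_soft_deleted(conn):
    """Delete rows whose deleted_at is set and return how many were removed."""
    cur = conn.execute("DELETE FROM instances WHERE deleted_at IS NOT NULL")
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    db = make_demo_db()
    print(find_soft_deleted(db))   # inspect before deleting anything
    print(purge_soft_deleted(db))  # number of rows removed
```

Listing the rows before deleting them is the important part of the design: it lets you confirm that only the stale compute/controller entries are selected.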

You might have to do this on an older version of RHOSP (such as 8), because the original environment was RHOSP 8, which was updated to RHOSP 9 and then RHOSP 10.

