Hide Forgot
[OSP12] overcloud deployment fails on Bare-metal due to "ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use" Environment (puddle is 2017-08-18.3, images TAG=2017-08-18.2) -------------------------------------------------------------- openstack-tripleo-heat-templates-7.0.0-0.20170805163048.el7ost.noarch steps : ------- attempt deployment on BM : 3 controler , 1 compute results : ----------- - overcloud deployment fails: | AllNodesDeploySteps | e79423e8-dba6-4c0e-b350-0ed7869fd225 | OS::TripleO::PostDeploySteps | CREATE_FAILED | 2017-06-13T16:10:47Z | overcloud | | ControllerDeployment_Step1 | aa764d47-2a8b-4377-ad9f-4bfe7428527a | OS::Heat::StructuredDeploymentGroup | CREATE_FAILED | 2017-06-13T16:38:43Z | overcloud-AllNodesDeploySteps-3rqma3wcyrum | | 2 | 7c4c33eb-b4db-466f-b2b8-ac222f0e861a | OS::Heat::StructuredDeployment | CREATE_FAILED | 2017-06-13T16:41:48Z | overcloud-AllNodesDeploySteps-3rqma3wcyrum-ControllerDeployment_Step1-r3rbji33anuj | -------------- failure view : -------------- 2:36:00 2017-08-09 21:23:44Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step1]: CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2 02:36:00 2017-08-09 21:23:45Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step1]: CREATE_FAILED Error: resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2 02:36:00 2017-08-09 21:23:45Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2 02:36:00 2017-08-09 21:23:46Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2 02:36:00 2017-08-09 21:23:46Z [overcloud]: CREATE_FAILED Resource CREATE failed: Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2 02:36:00 02:36:00 Stack overcloud CREATE_FAILED 02:36:00 02:36:00 overcloud.AllNodesDeploySteps.ControllerDeployment_Step1.1: 02:36:00 resource_type: OS::Heat::StructuredDeployment 02:36:00 physical_resource_id: d8769363-2a8f-4cab-a78a-77ea3e92dd85 02:36:00 status: CREATE_FAILED 02:36:00 status_reason: | 02:36:00 Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2 02:36:00 deploy_stdout: | 02:36:00 ... 02:36:00 ], 02:36:00 "changed": false, 02:36:00 "failed": true, 02:36:00 "failed_when_result": true 02:36:00 } 02:36:00 to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/1f2a23d0-5e97-4e7a-9a3a-4fdaf0ba918d_playbook.retry 02:36:00 02:36:00 PLAY RECAP ********************************************************************* 02:36:00 localhost : ok=8 changed=1 unreachable=0 failed=1 02:36:00 02:36:00 (truncated, view all with --long) 02:36:00 deploy_stderr: | 02:36:00 02:36:00 overcloud.AllNodesDeploySteps.ControllerDeployment_Step1.0: 02:36:00 resource_type: OS::Heat::StructuredDeployment 02:36:00 physical_resource_id: de764d1f-1256-4446-ad8c-f5e0b2a62a7a 02:36:00 status: CREATE_FAILED 02:36:00 status_reason: | 02:36:00 Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2 02:36:00 deploy_stdout: | 02:36:00 ... 02:36:00 ], 02:36:00 "changed": false, 02:36:00 "failed": true, 02:36:00 "failed_when_result": true 02:36:00 } 02:36:00 to retry, use: --limit @/var/lib/heaHeat Stack create failed. 02:36:00 Heat Stack create failed. 02:36:00 t-config/heat-config-ansible/34fd2f6f-de26-4db7-993b-f1b9765ac2fa_playbook.retry 02:36:00 02:36:00 PLAY RECAP ********************************************************************* 02:36:00 localhost : ok=8 changed=1 unreachable=0 failed=1 02:36:00 02:36:00 (truncated, view all with --long) 02:36:00 deploy_stderr: | 02:36:00 02:36:00 Build step 'Virtualenv Builder' marked build as failure 02:36:01 Build step 'Groovy Postbuild' marked build as failure 02:36:01 Build step 'Groovy Postbuild' marked build as failure 02:36:01 [BFA] Scanning build for known causes... 02:36:02 .[BFA] Found failure cause(s): 02:36:02 [BFA] no-such-file-or-directory 02:36:02 [BFA] command not found from category shell 02:36:02 [BFA] Done. 1s 02:36:02 Started calculate disk usage of build 02:36:02 Finished Calculation of disk usage of build in 0 seconds 02:36:02 Started calculate disk usage of workspace 02:36:02 Finished Calculation of disk usage of workspace in 0 seconds 02:36:03 Finished: FAILURE (undercloud) [stack@undercloud ~]$ heat deployment-show 7c4c33eb-b4db-466f-b2b8-ac222f0e861a | less WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead { "status": "FAILED", "server_id": "10d956cc-9468-426c-b309-d078e081be90", "config_id": "8a882786-c197-4b73-b981-ea6339ddcdc3", "output_values": { "deploy_stdout": "\nPLAY [localhost] ***************************************************************\n\nTASK [Gathering Facts] *********************************************************\nok: [localhost]\n\nTASK [Write the config_step hieradata] *****************************************\nchanged: [localhost]\n\nTASK [Run puppet host configuration for step 1] ********************************\nok: [localhost]\n\nTASK [debug] *******************************************************************\nok: [localhost] => {\n \"(outputs.stderr|default('')).split('\\n')|union(outputs.stdout_lines|default([]))\": [\n \"exception: connect failed\", \n \"Warning: Undefined variable 'deploy_config_name'; \", \n \" (file & line not available)\", \n \"Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Bool. There is further documentation for validate_legacy function in the README. at [\\\"/etc/puppet/modules/ntp/manifests/init.pp\\\", 54]:[\\\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\\\", 29]\", \n \" (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')\", \n \"Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Absolute_Path. There is further documentation for validate_legacy function in the README. at [\\\"/etc/puppet/modules/ntp/manifests/init.pp\\\", 55]:[\\\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\\\", 29]\", \n \"Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::String. There is further documentation for validate_legacy function in the README. at [\\\"/etc/puppet/modules/ntp/manifests/init.pp\\\", 56]:[\\\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\\\", 29]\", \n \"Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Array. There is further documentation for validate_legacy function in the README. at [\\\"/etc/puppet/modules/ntp/manifests/init.pp\\\", 66]:[\\\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\\\", 29]\", \n \"Warning: This method is deprecated, please use the stdlib validate_legacy function, with Pattern[]. There is further documentation for validate_legacy function in the README. at [\\\"/etc/puppet/modules/ntp/manifests/init.pp\\\", 68]:[\\\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\\\", 29]\", \n \"Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Numeric. There is further documentation for validate_legacy function in the README. at [\\\"/etc/puppet/modules/ntp/manifests/init.pp\\\", 89]:[\\\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\\\", 29]\", \n \"Warning: ModuleLoader: module 'ssh' has unresolved dependencies - it will only see those that are resolved. Use 'puppet module list --tree' to see information about modules\", \n \"Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Hash. There is further documentation for validate_legacy function in the README. at [\\\"/etc/puppet/modules/ssh/manifests/server.pp\\\", 12]:[\\\"/var/lib/tripleo-config/puppet_step_config.pp\\\", 31]\", \n \"Warning: ModuleLoader: module 'timezone' has unresolved dependencies - it will only see those that are resolved. Use 'puppet module list --tree' to see information about modules\", \n \"Notice: hiera(): Cannot load backend module_data: cannot load such file -- hiera/backend/module_data_backend\", \n \"Notice: Scope(Class[Tripleo::Firewall::Post]): At this stage, all network traffic is blocked.\", \n \"Notice: Compiled catalog for overcloud-controller-2.localdomain in environment production in 4.78 seconds\", \n \"Notice: /Stage[main]/Tripleo::Profile::Base::Docker/Augeas[docker-sysconfig-options]/returns: executed successfully\", \n \"Notice: /Stage[main]/Tripleo::Profile::Base::Docker/Augeas[docker-sysconfig-registry]/returns: executed successfully\", \n -------------------------------------------------------------------- connecting to controller-2 : --------------------------- [heat-admin@overcloud-controller-2 ~]$ sudo journalctl -u os-collect-config | grep ERROR Aug 21 01:56:47 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 21:56:47,258] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 01:57:11 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 21:57:11,117] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 01:57:32 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 21:57:32,312] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 01:58:08 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 21:58:08,319] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 01:58:51 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 21:58:51,792] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 02:01:12 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 22:01:12,922] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 02:02:30 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 22:02:30,453] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 02:03:18 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 22:03:18,549] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 02:04:32 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 22:04:32,321] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 02:06:21 overcloud-controller-2 os-collect-config[3368]: [2017-08-20 22:06:21,841] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None Aug 21 02:16:11 overcloud-controller-2 os-collect-config[3368]: , \n \"OK, successfully used password, moving on...\\r\", \n \"Setting the root password ensures that nobody can log into the MariaDB\\r\", \n \"root user without the proper authorisation.\\r\", \n \"Set root password? [Y/n] y\\r\", \n \"New password: \\r\", \n \"Re-enter new password: \\r\", \n \"ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use\\r\", \n \"Password update failed!\\r\", \n \"Cleaning up...\\r\", \n \"Failed to get 'Remove anonymous users?' prompt\", \n \"stderr: INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json\", \n \"INFO:__main__:Validating config file\", \n \"INFO:__main__:Kolla config strategy set to: COPY_ALWAYS\", \n \"INFO:__main__:Copying service configuration files\", \n \"INFO:__main__:Copying /dev/null to /etc/libqb/force-filesystem-sockets\", \n \"INFO:__main__:Setting permission for /etc/libqb/force-filesystem-sockets\", \n \"INFO:__main__:Deleting /etc/my.cnf.d/galera.cnf\", \n \"INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/galera.cnf to /etc/my.cnf.d/galera.cnf\", \n \"INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/sysconfig/clustercheck to /etc/sysconfig/clustercheck\", \n \"INFO:__main__:Copying /var/lib/kolla/config_files/src/root/.my.cnf to /root/.my.cnf\", \n \"INFO:__main__:Writing out command to execute\", \n \"170821 2:15:59 [Warning] option 'open_files_limit': unsigned value 18446744073709551615 adjusted to 4294967295\", \n \"170821 2:16:01 [Warning] option 'open_files_limit': unsigned value 18446744073709551615 adjusted to 4294967295\", \n \"stdout: 1aadbb2545d1701665c7a1516d5dd60469779a4b67e5b4473038ffd3c3a7de48\"\n ], \n \"changed\": false, \n \"failed\": true, \n \"failed_when_result\": true\n}\n\tto retry, use: --limit @/var/lib/heat-config/heat-config-ansible/8a882786-c197-4b73-b981-ea6339ddcdc3_p Aug 21 02:16:12 overcloud-controller-2 os-collect-config[3368]: "ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use\r", Aug 21 02:16:12 overcloud-controller-2 os-collect-config[3368]: [2017-08-21 02:16:11,806] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-ansible/8a882786-c197-4b73-b981-ea6339ddcdc3_playbook.yaml. [2] Aug 21 02:16:13 overcloud-controller-2 os-collect-config[3368]: [2017-08-21 02:16:13,816] (heat-config) [ERROR] Skipping group os-apply-config with no hook script None
Adding notes from IRC : ------------------------ (1) <bandini> ohochman, mandre: I think the reason is we are missing https://github.com/openstack/tripleo-heat-templates/commit/7968f37f6e65ed542340437f23b6c267b8b1f2ce from this tht (2) [10:32:26] <tvignaud> ohochman: ahrechan dciabrin|pto https://code.engineering.redhat.com/gerrit/115956 MariaDB: create clustercheck user at container bootstrap
above mentioned patch 489277 is in downstream build openstack-tripleo-heat-templates-7.0.0-0.20170901051303.0rc1.el7ost
looks like the issue reproduced with: openstack-tripleo-heat-templates-7.0.0-0.20170901051303.0rc1.el7ost.noarch \"In order to log into MariaDB to secure it, we'll need the current\\r\", \n \"password for the root user. If you've just installed MariaDB, and\\r\", \n \"you haven't set the root password yet, the password will be blank,\\r\", \n \"so you should just press enter here.\\r\", \n \"Enter current password for root (enter for none): \\r\", \n \"OK, successfully used password, moving on...\\r\", \n \"Setting the root password ensures that nobody can log into the MariaDB\\r\", \n \"root user without the proper authorisation.\\r\", \n \"Set root password? [Y/n] y\\r\", \n \"New password: \\r\", \n \"Re-enter new password: \\r\", \n \"ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use\\r\", \n \"Password update failed!\\r\", \n \"Cleaning up...\\r\", \n \"Failed to get 'Remove anonymous users?' prompt\", \n \"stderr: + '[' -e /var/lib/mysql/mysql ']'\", \n \"+ kolla_start\", \n \"INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json\", \n \"INFO:__main__:Validating config file\", \n \"INFO:__main__:Kolla config strategy set to: COPY_ALWAYS\", \n \"INFO:__main__:Copying service configuration files\", \n \"INFO:__main__:Copying /dev/null to /etc/libqb/force-filesystem-sockets\", \n \"INFO:__main__:Setting permission for /etc/libqb/force-filesystem-sockets\", \n \"INFO:__main__:Deleting /etc/my.cnf.d/galera.cnf\", \n \"INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/galera.cnf to /etc/my.cnf.d/galera.cnf\", \n \"INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/sysconfig/clustercheck to /etc/sysconfig/clustercheck\", \n \"INFO:__main__:Copying /var/lib/kolla/config_files/src/root/.my.cnf to /root/.my.cnf\", \n \"INFO:__main__:Writing out command to execute\", \n \"170913 17:51:25 [Warning] option 'open_files_limit': unsigned value 18446744073709551615 adjusted to 4294967295\", \n \"170913 17:51:27 [Warning] option 'open_files_limit': unsigned value 18446744073709551615 adjusted to 4294967295\", \n \"stdout: 386224818d266b6f82532fad1ac3fe0ea45ecd04c1db754cb5f80dd8a7474a19\"\n ], \n \"changed\": false, \n \"failed\": true, \n \"failed_when_result\": true\n}\n\tto retry, use: --limit @/var/lib/heat-config/heat-config-ansible/2de16e6c-e601-4c18-bd1b-ce5a267accf8_playbook.retry\n\nPLAY RECAP *********************************************************************\nlocalhost : ok=8 changed=1 unreachable=0 failed=1 \n\n",
controller-2 failed in mysql_bootstrap: Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Error running ['docker', 'run', '--name', 'mysql_bootstrap', '--label', 'config_id=tripleo_step1', '--label', 'container_name=mysql_bootstrap', '--label', 'managed_by=paunch', '--label', 'config_data={\"environment\": [\"KOLLA_CONFIG_STRATEGY=COPY_ALWAYS\", \"KOLLA_BOOTSTRAP=True\", \"KOLLA_KUBERNETES=True\", \"DB_MAX_TIMEOUT=60\", \"DB_CLUSTERCHECK_PASSWORD=wNZ6dj6DHpQg38Mn3wH4Mv9sj\", \"DB_ROOT_PASSWORD=kZlnL8BpMl\"], \"start_order\": 1, \"command\": [\"bash\", \"-ecx\", \"if [ -e /var/lib/mysql/mysql ]; then exit 0; fi\\\\nkolla_start\\\\nmysqld_safe --skip-networking --wsrep-on=OFF --wsrep-provider=none &\\\\ntimeout ${DB_MAX_TIMEOUT} /bin/bash -c \\'until mysqladmin -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" ping 2>/dev/null; do sleep 1; done\\'\\\\nmysql -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" -e \\\\\"CREATE USER \\'clustercheck\\'@\\'localhost\\' IDENTIFIED BY \\'${DB_CLUSTERCHECK_PASSWORD}\\';\\\\\"\\\\nmysql -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" -e \\\\\"GRANT PROCESS ON *.* TO \\'clustercheck\\'@\\'localhost\\' WITH GRANT OPTION;\\\\\"\\\\ntimeout ${DB_MAX_TIMEOUT} mysqladmin -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" shutdown\"], \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/var/lib/kolla/config_files/mysql.json:/var/lib/kolla/config_files/config.json\", \"/var/lib/config-data/puppet-generated/mysql/:/var/lib/kolla/config_files/src:ro\", \"/var/lib/mysql:/var/lib/mysql\"], \"image\": \"192.168.24.1:8787/rhosp12/openstack-mariadb-docker:2017-09-08.3\", \"detach\": false, \"net\": \"host\"}', '--env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS', '--env=KOLLA_BOOTSTRAP=True', '--env=KOL Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: LA_KUBERNETES=True', '--env=DB_MAX_TIMEOUT=60', '--env=DB_CLUSTERCHECK_PASSWORD=wNZ6dj6DHpQg38Mn3wH4Mv9sj', '--env=DB_ROOT_PASSWORD=kZlnL8BpMl', '--net=host', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/var/lib/kolla/config_files/mysql.json:/var/lib/kolla/config_files/config.json', '--volume=/var/lib/config-data/puppet-generated/mysql/:/var/lib/kolla/config_files/src:ro', '--volume=/var/lib/mysql:/var/lib/mysql', '192.168.24.1:8787/rhosp12/openstack-mariadb-docker:2017-09-08.3', 'bash', '-ecx', 'if [ -e /var/lib/mysql/mysql ]; then exit 0; fi\\nkolla_start\\nmysqld_safe --skip-networking --wsrep-on=OFF --wsrep-provider=none &\\ntimeout ${DB_MAX_TIMEOUT} /bin/bash -c \\'until mysqladmin -uroot -p\"${DB_ROOT_PASSWORD}\" ping 2>/dev/null; do sleep 1; done\\'\\nmysql -uroot -p\"${DB_ROOT_PASSWORD}\" -e \"CREATE USER \\'clustercheck\\'@\\'localhost\\' IDENTIFIED BY \\'${DB_CLUSTERCHECK_PASSWORD}\\';\"\\nmysql -uroot -p\"${DB_ROOT_PASSWORD}\" -e \"GRANT PROCESS ON *.* TO \\'clustercheck\\'@\\'localhost\\' WITH GRANT OPTION;\"\\ntimeout ${DB_MAX_TIMEOUT} mysqladmin -uroot -p\"${DB_ROOT_PASSWORD}\" shutdown']. [1]", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "stdout: Installing MariaDB/MySQL system tables in '/var/lib/mysql' ...", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "OK", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Filling help tables...", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "To start mysqld at boot time you have to copy", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "support-files/mysql.server to the right place for your system", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER !", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "To do so, start the server, then issue the following commands:", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "'/usr/bin/mysqladmin' -u root password 'new-password'", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "'/usr/bin/mysqladmin' -u root -h overcloud-controller-2 password 'new-password'", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Alternatively you can run:", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "'/usr/bin/mysql_secure_installation'", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "which will also give you the option of removing the test", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "databases and anonymous user created by default. This is", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "strongly recommended for production servers.", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "See the MariaDB Knowledgebase at http://mariadb.com/kb or the", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "MySQL manual for more instructions.", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "You can start the MariaDB daemon with:", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "cd '/usr' ; /usr/bin/mysqld_safe --datadir='/var/lib/mysql'", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "You can test the MariaDB daemon with mysql-test-run.pl", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "cd '/usr/mysql-test' ; perl mysql-test-run.pl", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Please report any problems at http://mariadb.org/jira", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "The latest information about MariaDB is available at http://mariadb.org/.", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "You can find additional information about the MySQL part at:", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "http://dev.mysql.com", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Support MariaDB development by buying support/new features from MariaDB", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Corporation Ab. You can contact us about this at sales.", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Alternatively consider joining our community based development effort:", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "http://mariadb.com/kb/en/contributing-to-the-mariadb-project/", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "170913 17:51:29 mysqld_safe Logging to '/var/log/mariadb/mariadb.log'.", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "170913 17:51:29 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "170913 17:51:29 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.xYeqx3' --pid-file='/var/lib/mysql/overcloud-controller-2-recover.pid'", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "170913 17:51:32 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "spawn mysql_secure_installation\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "/usr/bin/mysql_secure_installation: line 379: find_mysql_client: command not found\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MariaDB\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: " SERVERS IN PRODUCTION USE! PLEASE READ EACH STEP CAREFULLY!\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "In order to log into MariaDB to secure it, we'll need the current\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "password for the root user. If you've just installed MariaDB, and\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "you haven't set the root password yet, the password will be blank,\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "so you should just press enter here.\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Enter current password for root (enter for none): \r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "OK, successfully used password, moving on...\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Setting the root password ensures that nobody can log into the MariaDB\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "root user without the proper authorisation.\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Set root password? [Y/n] y\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "New password: \r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Re-enter new password: \r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Password update failed!\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Cleaning up...\r", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "Failed to get 'Remove anonymous users?' prompt", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "stderr: + '[' -e /var/lib/mysql/mysql ']'", Sep 13 17:51:40 overcloud-controller-2 os-collect-config[3191]: "+ kolla_start", Interestingly enough it did work on controller-0: 11c5320a7b9c 192.168.24.1:8787/rhosp12/openstack-mariadb-docker:2017-09-08.3 "bash -ecx 'if [ -e /" 11 hours ago Exited (0) 11 hours ago mysql_bootstrap Same on controller-1: 4c8be359a5db 192.168.24.1:8787/rhosp12/openstack-mariadb-docker:2017-09-08.3 "bash -ecx 'if [ -e /" 11 hours ago Exited (0) 11 hours ago mysql_bootstrap Only contorller-2 failed. If I start the container by hand I get the following: [root@overcloud-controller-2 ~]# docker run --rm -it -e "KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" -e "KOLLA_BOOTSTRAP=True" -e "KOLLA_KUBERNETES=True" -e "DB_MAX_TIMEOUT=60" -e "DB_CLUSTERCHECK_PASSWORD=wNZ6dj6DHpQg38Mn3wH4Mv9sj" -e "DB_ROOT_PASSWORD=kZlnL8BpMl" -e "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" -e "container=oci" -e "KOLLA_BASE_DISTRO=rhel" -e "KOLLA_INSTALL_TYPE=binary" -e "KOLLA_INSTALL_METATYPE=rhos" -e "PS1=$(tput bold)($(printenv KOLLA_SERVICE_NAME))$(tput sgr0)[$(id -un)@$(hostname -s) $(pwd)]$ " -v /dev/log:/dev/log -v /etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro -v /var/lib/mysql:/var/lib/mysql -v /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro -v /etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro -v /var/lib/config-data/puppet-generated/mysql:/var/lib/kolla/config_files/src:ro -v /etc/hosts:/etc/hosts:ro -v /etc/puppet:/etc/puppet:ro -v /etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro -v /var/lib/kolla/config_files/mysql.json:/var/lib/kolla/config_files/config.json -v /etc/localtime:/etc/localtime:ro -v /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro -ti 192.168.24.1:8787/rhosp12/openstack-mariadb-docker:2017-09-08.3 /bin/bash ()[root@overcloud-controller-2 /root]$ id uid=42434(mysql) gid=42434(mysql) groups=42434(mysql),42400(kolla) ()[root@overcloud-controller-2 /root]$ export TERM=xterm Now if I run the same command that THT runs: ()[root@overcloud-controller-2 /root]$ mysqld_safe --skip-networking --wsrep-on=OFF --wsrep-provider=none & [1] 10 ()[root@overcloud-controller-2 /root]$ 170914 05:50:58 mysqld_safe Logging to '/var/log/mariadb/mariadb.log'. 170914 05:50:58 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql 170914 05:50:58 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.G83dzc' --pid-file='/var/lib/mysql/003dbc6362fb-recover.pid' 170914 05:51:00 mysqld_safe WSREP: Recovered position 2db1852d-98ac-11e7-93b0-824a02ed3674:0 ()[root@overcloud-controller-2 /root]$ more /var/log/mariadb/mariadb.log 170914 05:50:58 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql 170914 05:50:58 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.G83dzc' --pid-file='/var/lib/mysql/003dbc6362fb-recover.pid' nohup: ignoring input 170914 05:51:00 mysqld_safe WSREP: Recovered position 2db1852d-98ac-11e7-93b0-824a02ed3674:0 170914 5:51:00 [Note] WSREP: wsrep_start_position var submitted: '2db1852d-98ac-11e7-93b0-824a02ed3674:0' 170914 5:51:00 [Warning] You need to use --log-bin to make --binlog-format work. 170914 5:51:00 InnoDB: The InnoDB memory heap is disabled 170914 5:51:00 InnoDB: Mutexes and rw_locks use GCC atomic builtins 170914 5:51:00 InnoDB: Compressed tables use zlib 1.2.7 170914 5:51:00 InnoDB: Using Linux native AIO 170914 5:51:00 InnoDB: Initializing buffer pool, size = 128.0M 170914 5:51:00 InnoDB: Completed initialization of buffer pool 170914 5:51:00 InnoDB: highest supported file format is Barracuda. 170914 5:51:00 InnoDB: Waiting for the background threads to start 170914 5:51:01 Percona XtraDB (http://www.percona.com) 5.5.41-MariaDB-37.0 started; log sequence number 1598283 170914 5:51:01 [Note] Plugin 'FEEDBACK' is disabled. 170914 5:51:01 [Note] Event Scheduler: Loaded 0 events 170914 5:51:01 [Note] WSREP: Initial position: 2db1852d-98ac-11e7-93b0-824a02ed3674:0 170914 5:51:01 [Note] WSREP: wsrep_load(): loading provider library 'none' 170914 5:51:01 [Note] /usr/libexec/mysqld: ready for connections. Version: '5.5.42-MariaDB-wsrep' socket: '/var/lib/mysql/mysql.sock' port: 0 MariaDB Server, wsrep_25.11.r4026 What I don't get is is this WSREP recovering position stuff is normal. Will chat with Damien once he is up. (leaving needinfo on me)
From the logs, this seems to happen during the regular kolla bootstrap shell script, so _before_ we add the clustercheck user. I'm currently looking at the kolla steps in detail to find out why it would randomly fail.
Omri, So it looks to me that some of the calls to msqld_safe were still happening with WSREP enabled while they shouldn't during the bootstrap of the DB. Could you maybe try to apply this patch from https://review.openstack.org/#/c/504049/ and see if it fixes your issues?
Hi Damien thanks for your help , I saw the deployment fails again on the BM env after applying the patch. I assume you're still reviewing the issue, just wanted to show the output I'm getting on the controller : [root@overcloud-controller-0 ~]# journalctl -u os-collect-config | grep ERROR Sep 14 19:34:12 overcloud-controller-0 os-collect-config[3395]: O:__main__:Copying service configuration files\", \n \"INFO:__main__:Copying /dev/null to /etc/libqb/force-filesystem-sockets\", \n \"INFO:__main__:Setting permission for /etc/libqb/force-filesystem-sockets\", \n \"ERROR:__main__:MissingRequiredSource: /var/lib/kolla/config_files/src/* file is not found\", \n \"stdout: 4b35cbd91e0a094b5c07be5a1d156914d89ba2cab1f13b737b2e6f4f567bc936\"\n ], \n \"changed\": false, \n \"failed\": true, \n \"failed_when_result\": true\n}\n\tto retry, use: --limit @/var/lib/heat-config/heat-config-ansible/b6072a53-edef-4116-933f-43dc524b6aeb_playbook.retry\n\nPLAY RECAP *********************************************************************\nlocalhost : ok=8 changed=1 unreachable=0 failed=1 \n\n", "deploy_stderr": "", "deploy_status_code": 2} Sep 14 19:34:12 overcloud-controller-0 os-collect-config[3395]: "ERROR:__main__:MissingRequiredSource: /var/lib/kolla/config_files/src/* file is not found", Sep 14 19:34:12 overcloud-controller-0 os-collect-config[3395]: [2017-09-14 19:34:12,671] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-ansible/b6072a53-edef-4116-933f-43dc524b6aeb_playbook.yaml. [2] not sure why this playbook fails : [root@overcloud-controller-0 ~]# less /var/lib/heat-config/heat-config-ansible/b6072a53-edef-4116-933f-43dc524b6aeb_playbook.yaml http://pastebin.test.redhat.com/516337
Yes, the version you tested was still under review. I've fixed that error and did a deploy on a downstream env to validate it. The review upstream has all the +2 needed, I'm waiting for the jobs to pass green.
*** Bug 1492314 has been marked as a duplicate of this bug. ***
fix merged in Master and Pike upstream
Unable to reproduce with : OpenStack-12.0-RHEL-7 Puddle: 2017-10-31.1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3462