Bug 1592505 - OVN - Deploying OSP13 fails in the overcloud deployment phase
Summary: OVN - Deploying OSP13 fails in the overcloud deployment phase
Keywords:
Status: CLOSED DUPLICATE of bug 1578849
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: James Slagle
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-18 17:31 UTC by Daniel Alvarez Sanchez
Modified: 2018-06-20 16:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-20 16:33:05 UTC
Target Upstream Version:
Embargoed:


Description Daniel Alvarez Sanchez 2018-06-18 17:31:47 UTC
I'm hitting this issue when attempting to deploy OSP13 with OVN as the backend, using the latest passed phase1 puddle:

    TASK [Start containers for step 1] *********************************************
    ok: [localhost]

    TASK [Debug output for task which failed: Start containers for step 1] *********
    fatal: [localhost]: FAILED! => {
        "changed": false,
        "failed_when_result": true,
        "outputs.stdout_lines|default([])|union(outputs.stderr_lines|default([]))": [
            "stdout: Trying to pull repository 192.168.24.1:8787/rhosp13/openstack-cinder-volume ... ",
            "2018-06-15.2: Pulling from 192.168.24.1:8787/rhosp13/openstack-cinder-volume",
            "e0f71f706c2a: Already exists",
            "121ab4741000: Already exists",
            "2824d3a8e244: Already exists",
            "f574ab7e56a4: Already exists",
            "d48a80b9272a: Already exists",
            "36a0c8e1d9d9: Pulling fs layer",
            "36a0c8e1d9d9: Verifying Checksum",
            "36a0c8e1d9d9: Download complete",
            "36a0c8e1d9d9: Pull complete",
            "Digest: sha256:3208d63a7b9efbcfdcd3e6aaf48037764548a09590f9d000a82aafe46a51e2f1",
            "Status: Downloaded newer image for 192.168.24.1:8787/rhosp13/openstack-cinder-volume:2018-06-15.2",
            "",
            "stderr: ",
            "stdout: ",
            "stdout: 6d2d7be10850597b8a6b06c56f5824397d09d9209a9ef751a0ed9f04d523c90f",
            "Error running ['docker', 'run', '--name', 'mysql_bootstrap', '--label', 'config_id=tripleo_step1', '--label', 'container_name=mysql_bootstrap', '--label', 'managed_b
y=paunch', '--label', 'config_data={\"start_order\": 1, \"image\": \"192.168.24.1:8787/rhosp13/openstack-mariadb:2018-06-15.2\", \"environment\": [\"KOLLA_CONFIG_STRATEGY=COPY_AL
WAYS\", \"KOLLA_BOOTSTRAP=True\", \"DB_MAX_TIMEOUT=60\", \"DB_CLUSTERCHECK_PASSWORD=GCd4kYCPCMTF82K3zdcycXkE6\", \"DB_ROOT_PASSWORD=dvesBgF4Jg\", \"TRIPLEO_CONFIG_HASH=0541b52d76
f205c6977f8bc68d9068bd\"], \"command\": [\"bash\", \"-ec\", \"if [ -e /var/lib/mysql/mysql ]; then exit 0; fi\\\\necho -e \\\\\"\\\\\\\\n[mysqld]\\\\\\\\nwsrep_provider=none\\\\\
" >> /etc/my.cnf\\\\nkolla_set_configs\\\\nsudo -u mysql -E kolla_extend_start\\\\nmysqld_safe --skip-networking --wsrep-on=OFF &\\\\ntimeout ${DB_MAX_TIMEOUT} /bin/bash -c \\'un
til mysqladmin -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" ping 2>/dev/null; do sleep 1; done\\'\\\\nmysql -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" -e \\\\\"CREATE USER \\'clusterch
eck\\'@\\'localhost\\' IDENTIFIED BY \\'${DB_CLUSTERCHECK_PASSWORD}\\';\\\\\"\\\\nmysql -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" -e \\\\\"GRANT PROCESS ON *.* TO \\'clustercheck\
\'@\\'localhost\\' WITH GRANT OPTION;\\\\\"\\\\ntimeout ${DB_MAX_TIMEOUT} mysqladmin -uroot -p\\\\\"${DB_ROOT_PASSWORD}\\\\\" shutdown\"], \"user\": \"root\", \"volumes\": [\"/et
c/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/cer
ts/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/l
og\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/kolla/config_files/mysql.json:/var/lib/kolla/config_files/config.json\"
, \"/var/lib/config-data/puppet-generated/mysql/:/var/lib/kolla/config_files/src:ro\", \"/var/lib/mysql:/var/lib/mysql\"], \"net\": \"host\", \"detach\": false}', '--env=KOLLA_CO
NFIG_STRATEGY=COPY_ALWAYS', '--env=KOLLA_BOOTSTRAP=True', '--env=DB_MAX_TIMEOUT=60', '--env=DB_CLUSTERCHECK_PASSWORD=GCd4kYCPCMTF82K3zdcycXkE6', '--env=DB_ROOT_PASSWORD=dvesBgF4J
g', '--env=TRIPLEO_CONFIG_HASH=0541b52d76f205c6977f8bc68d9068bd', '--net=host', '--user=root', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/kolla/config_files/mysql.json:/var/lib/kolla/config_files/config.json', '--volume=/var/lib/config-data/puppet-generated/mysql/:/var/lib/kolla/config_files/src:ro', '--volume=/var/lib/mysql:/var/lib/mysql', '192.168.24.1:8787/rhosp13/openstack-mariadb:2018-06-15.2', 'bash', '-ec', 'if [ -e /var/lib/mysql/mysql ]; then exit 0; fi\\necho -e \"\\\\n[mysqld]\\\\nwsrep_provider=none\" >> /etc/my.cnf\\nkolla_set_configs\\nsudo -u mysql -E kolla_extend_start\\nmysqld_safe --skip-networking --wsrep-on=OFF &\\ntimeout ${DB_MAX_TIMEOUT} /bin/bash -c \\'until mysqladmin -uroot -p\"${DB_ROOT_PASSWORD}\" ping 2>/dev/null; do sleep 1; done\\'\\nmysql -uroot -p\"${DB_ROOT_PASSWORD}\" -e \"CREATE USER \\'clustercheck\\'@\\'localhost\\' IDENTIFIED BY \\'${DB_CLUSTERCHECK_PASSWORD}\\';\"\\nmysql -uroot -p\"${DB_ROOT_PASSWORD}\" -e \"GRANT PROCESS ON *.* TO \\'clustercheck\\'@\\'localhost\\' WITH GRANT OPTION;\"\\ntimeout ${DB_MAX_TIMEOUT} mysqladmin -uroot -p\"${DB_ROOT_PASSWORD}\" shutdown']. [2]",
            "stderr: INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json",
            "INFO:__main__:Validating config file",
            "INFO:__main__:Kolla config strategy set to: COPY_ALWAYS",
            "INFO:__main__:Copying service configuration files",
            "INFO:__main__:Copying /dev/null to /etc/libqb/force-filesystem-sockets",
            "INFO:__main__:Setting permission for /etc/libqb/force-filesystem-sockets",
            "INFO:__main__:Deleting /etc/my.cnf.d/galera.cnf",
            "INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/galera.cnf to /etc/my.cnf.d/galera.cnf",
            "INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/sysconfig/clustercheck to /etc/sysconfig/clustercheck",
            "INFO:__main__:Deleting /etc/hosts",
            "ERROR:__main__:Unexpected error:",
            "Traceback (most recent call last):",
            "  File \"/usr/local/bin/kolla_set_configs\", line 411, in main",
            "    execute_config_strategy(config)",
            "  File \"/usr/local/bin/kolla_set_configs\", line 377, in execute_config_strategy",
            "    copy_config(config)",
            "  File \"/usr/local/bin/kolla_set_configs\", line 306, in copy_config",
            "    config_file.copy()",
            "  File \"/usr/local/bin/kolla_set_configs\", line 150, in copy",
            "    self._merge_directories(source, dest)",
            "  File \"/usr/local/bin/kolla_set_configs\", line 97, in _merge_directories",
            "    os.path.join(dest, to_copy))",
            "  File \"/usr/local/bin/kolla_set_configs\", line 99, in _merge_directories",
            "    self._copy_file(source, dest)",
            "  File \"/usr/local/bin/kolla_set_configs\", line 75, in _copy_file",
            "    self._delete_path(dest)",
            "  File \"/usr/local/bin/kolla_set_configs\", line 108, in _delete_path",
            "    os.remove(path)",
            "OSError: [Errno 16] Device or resource busy: '/etc/hosts'",
            "stdout: 31283b13088bfd5cca6f56861cc07009dbae3bdf68c4d75cc38e0d0ea60ed264"
        ]
    }

Comment 2 David Peacock 2018-06-18 20:23:26 UTC
Can you please reproduce this in a non-infrared env, that is to say, a regular deployment?

Looks like an env problem at first glance.

Comment 3 Daniel Alvarez Sanchez 2018-06-18 20:40:35 UTC
It's a freshly provisioned BM with CentOS. You want me to use Director instead?
What kind of env problem are you thinking of? 
Thanks!

Comment 5 Daniel Alvarez Sanchez 2018-06-19 21:27:46 UTC
Through docker logs I see the following:

INFO:__main__:Creating directory /etc/rabbitmq/ssl
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rabbitmq/inetrc to /etc/rabbitmq/inetrc
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rabbitmq/rabbitmq-env.conf to /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq.config
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rabbitmq/rabbitmq.config to /etc/rabbitmq/rabbitmq.config
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rabbitmq/rabbitmqadmin.conf to /etc/rabbitmq/rabbitmqadmin.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/security/limits.d/rabbitmq-server.conf to /etc/security/limits.d/rabbitmq-server.conf
INFO:__main__:Creating directory /etc/systemd/system/rabbitmq-server.service.d
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/systemd/system/rabbitmq-server.service.d/limits.conf to /etc/systemd/system/rabbitmq-server.service.d/limits.conf
INFO:__main__:Deleting /etc/hosts
ERROR:__main__:Unexpected error:
Traceback (most recent call last):
  File "/usr/local/bin/kolla_set_configs", line 411, in main
    execute_config_strategy(config)
  File "/usr/local/bin/kolla_set_configs", line 377, in execute_config_strategy
    copy_config(config)
  File "/usr/local/bin/kolla_set_configs", line 306, in copy_config
    config_file.copy()
  File "/usr/local/bin/kolla_set_configs", line 150, in copy
    self._merge_directories(source, dest)
  File "/usr/local/bin/kolla_set_configs", line 97, in _merge_directories
    os.path.join(dest, to_copy))
  File "/usr/local/bin/kolla_set_configs", line 99, in _merge_directories
    self._copy_file(source, dest)
  File "/usr/local/bin/kolla_set_configs", line 75, in _copy_file
    self._delete_path(dest)
  File "/usr/local/bin/kolla_set_configs", line 108, in _delete_path
    os.remove(path)
OSError: [Errno 16] Device or resource busy: '/etc/hosts'






[root@controller-2 ~]# docker logs 7e3686b1f5ef
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Copying /dev/null to /etc/libqb/force-filesystem-sockets
INFO:__main__:Setting permission for /etc/libqb/force-filesystem-sockets
INFO:__main__:Deleting /etc/my.cnf.d/galera.cnf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/galera.cnf to /etc/my.cnf.d/galera.cnf
ERROR:__main__:Unexpected error:
Traceback (most recent call last):
  File "/usr/local/bin/kolla_set_configs", line 411, in main
    execute_config_strategy(config)
  File "/usr/local/bin/kolla_set_configs", line 377, in execute_config_strategy
    copy_config(config)
  File "/usr/local/bin/kolla_set_configs", line 306, in copy_config
    config_file.copy()
  File "/usr/local/bin/kolla_set_configs", line 150, in copy
    self._merge_directories(source, dest)
  File "/usr/local/bin/kolla_set_configs", line 97, in _merge_directories
    os.path.join(dest, to_copy))
  File "/usr/local/bin/kolla_set_configs", line 97, in _merge_directories
    os.path.join(dest, to_copy))
  File "/usr/local/bin/kolla_set_configs", line 97, in _merge_directories
    os.path.join(dest, to_copy))
  File "/usr/local/bin/kolla_set_configs", line 92, in _merge_directories
    self._set_properties(source, dest)
  File "/usr/local/bin/kolla_set_configs", line 117, in _set_properties
    self._set_properties_from_file(source, dest)
  File "/usr/local/bin/kolla_set_configs", line 122, in _set_properties_from_file
    shutil.copystat(source, dest)
  File "/usr/lib64/python2.7/shutil.py", line 98, in copystat
    os.utime(dst, (st.st_atime, st.st_mtime))
OSError: [Errno 30] Read-only file system: '/etc/pki/ca-trust/extracted'

Comment 8 Damien Ciabrini 2018-06-20 14:48:44 UTC
For some reason on that deployment, docker-puppet.py seems to copy file /etc/hosts and directory /etc/pki into every service's /var/lib/config-data/puppet-generated/{service}, which is unexpected and invalid.

[...]
/var/lib/config-data/puppet-generated/redis/etc/pki
/var/lib/config-data/puppet-generated/redis/etc/hosts
/var/lib/config-data/puppet-generated/heat/etc/pki
/var/lib/config-data/puppet-generated/heat/etc/hosts
/var/lib/config-data/puppet-generated/nova/etc/pki
/var/lib/config-data/puppet-generated/nova/etc/hosts
/var/lib/config-data/puppet-generated/glance_api/etc/pki
/var/lib/config-data/puppet-generated/glance_api/etc/hosts
/var/lib/config-data/puppet-generated/rabbitmq/etc/pki
/var/lib/config-data/puppet-generated/rabbitmq/etc/hosts
/var/lib/config-data/puppet-generated/heat_api_cfn/etc/pki
/var/lib/config-data/puppet-generated/heat_api_cfn/etc/hosts
/var/lib/config-data/puppet-generated/keystone/etc/pki
/var/lib/config-data/puppet-generated/keystone/etc/hosts
[...]


This is not specific to the HA services. Still trying to figure out why it is happening.
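
For anyone who wants to check their own nodes for the same symptom, here is a rough sketch that lists the per-service trees containing the entries shown in the listing above. The function name find_spurious is made up for illustration, and the two relative paths are taken only from that listing; other spurious files may exist that this does not look for.

```python
import os

# Relative paths reported above as unexpectedly present in every service's
# puppet-generated tree. Assumption: only these two are checked.
SPURIOUS = ("etc/hosts", "etc/pki")


def find_spurious(config_root="/var/lib/config-data/puppet-generated"):
    """Return paths under config_root/<service>/ that match SPURIOUS."""
    hits = []
    if not os.path.isdir(config_root):
        return hits
    for service in sorted(os.listdir(config_root)):
        for rel in SPURIOUS:
            candidate = os.path.join(config_root, service, rel)
            # lexists() also reports dangling symlinks, which would still be
            # an unexpected entry in the generated config tree.
            if os.path.lexists(candidate):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for path in find_spurious():
        print(path)
```

An empty result only means these two entries are absent. It does not prove the generated config is otherwise clean.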

Comment 9 Damien Ciabrini 2018-06-20 15:47:13 UTC
It seems that when the docker-puppet-{service} containers were run, some timestamps from the container image were more recent than the current time in the timezone the containers were started in (so those files appeared to be in the future). When docker-puppet.sh touched the file /var/lib/config-data/{service}.origin_of_time, some files in the container still had more recent timestamps than that marker, so when puppet finished, docker-puppet.sh ended up copying spurious files into /var/lib/config-data/puppet-generated/{service}. The subsequent kolla_init then failed and the deployment went into error.

This seems like another occurrence of https://bugzilla.redhat.com/show_bug.cgi?id=1578849
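
To illustrate the mechanism described above, here is a minimal, self-contained sketch of a "copy everything newer than a marker file" selection. It is not the actual docker-puppet.sh logic; the helper names (newer_than_marker, demo) and the file names are hypothetical, and the one-hour skew is an arbitrary value chosen for the demonstration.

```python
import os
import tempfile
import time


def newer_than_marker(root, marker):
    """Return relative paths under root whose mtime is later than marker's."""
    cutoff = os.stat(marker).st_mtime
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_mtime > cutoff:
                found.append(os.path.relpath(path, root))
    return sorted(found)


def _touch(path, mtime):
    """Create an empty file and force its access/modification time."""
    with open(path, "w"):
        pass
    os.utime(path, (mtime, mtime))


def demo():
    now = time.time()
    with tempfile.TemporaryDirectory() as top:
        fs = os.path.join(top, "fs")
        os.mkdir(fs)
        # Shipped in the image with a sane timestamp: correctly ignored.
        _touch(os.path.join(fs, "old.conf"), now - 3600)
        # Shipped in the image but stamped one hour ahead of the host clock.
        _touch(os.path.join(fs, "hosts"), now + 3600)
        # Marker touched just before the configuration run.
        marker = os.path.join(top, "origin_of_time")
        _touch(marker, now)
        # File genuinely written by the configuration run.
        _touch(os.path.join(fs, "my.cnf"), now + 60)
        return newer_than_marker(fs, marker)


if __name__ == "__main__":
    print(demo())  # ['hosts', 'my.cnf']
```

The untouched "hosts" file is selected alongside the genuinely generated "my.cnf" purely because its timestamp lies after the marker, which matches the spurious copies seen in this deployment.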

Comment 10 Damien Ciabrini 2018-06-20 15:51:19 UTC
(In reply to Damien Ciabrini from comment #9)
> This seems like another occurrence of
> https://bugzilla.redhat.com/show_bug.cgi?id=1578849

Either this is a time issue caused by the NTP settings, or the container image was consumed in a timezone where the files in the image appeared to be in the future.

Comment 12 Alex Schultz 2018-06-20 16:33:05 UTC
Marking this as a dupe of Bug 1578849 as we'll be addressing time sync issues with that bug. That should resolve the underlying cause for this as well.

*** This bug has been marked as a duplicate of bug 1578849 ***

