Bug 1847507 - [OSP13] Docker containers stuck in restarting. OSError: [Errno 16] Device or resource busy: '/etc/hosts'
Keywords:
Status: CLOSED DUPLICATE of bug 1794119
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-containers
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Dan Prince
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-06-16 14:09 UTC by Irina Petrova
Modified: 2023-10-06 20:39 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-01 13:00:46 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-14256 | 0 | None | None | None | 2022-03-24 18:57:18 UTC

Description Irina Petrova 2020-06-16 14:09:41 UTC
This bug was initially created as a copy of Bug #1794119

I am copying this bug because: 
The issue might be similar if not the same. Cloning/Opening a new Bug for verification. 


Description of problem:
What we know so far:
The client upgraded from OSP10 to OSP13 on December 6.
After that, everything worked fine until six containers on one compute node could no longer be restarted.

The error we had was:
Jan 20 13:26:38 compute13 journal: INFO:__main__:Deleting /etc/neutron/neutron.conf
Jan 20 13:26:38 compute13 journal: INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/neutron/neutron.conf to /etc/neutron/neutron.conf
Jan 20 13:26:38 compute13 journal: ERROR:__main__:Unexpected error:
Jan 20 13:26:38 compute13 journal: Traceback (most recent call last):
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 411, in main
Jan 20 13:26:38 compute13 journal:    execute_config_strategy(config)
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 377, in execute_config_strategy
Jan 20 13:26:38 compute13 journal:    copy_config(config)
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 306, in copy_config
Jan 20 13:26:38 compute13 journal:    config_file.copy()
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 150, in copy
Jan 20 13:26:38 compute13 journal:    self._merge_directories(source, dest)
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 97, in _merge_directories
Jan 20 13:26:38 compute13 journal:    os.path.join(dest, to_copy))
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 97, in _merge_directories
Jan 20 13:26:38 compute13 journal:    os.path.join(dest, to_copy))
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 97, in _merge_directories
Jan 20 13:26:38 compute13 journal:    os.path.join(dest, to_copy))
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 92, in _merge_directories
Jan 20 13:26:38 compute13 journal:    self._set_properties(source, dest)
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 117, in _set_properties
Jan 20 13:26:38 compute13 journal:    self._set_properties_from_file(source, dest)
Jan 20 13:26:38 compute13 journal:  File "/usr/local/bin/kolla_set_configs", line 122, in _set_properties_from_file
Jan 20 13:26:38 compute13 journal:    shutil.copystat(source, dest)
Jan 20 13:26:38 compute13 journal:  File "/usr/lib64/python2.7/shutil.py", line 98, in copystat
Jan 20 13:26:38 compute13 journal:    os.utime(dst, (st.st_atime, st.st_mtime))
Jan 20 13:26:38 compute13 journal: OSError: [Errno 30] Read-only file system: '/etc/pki/ca-trust/extracted'

The containers with the issue:
19f836ead8ba sat:5000/osp-osp13_containers-neutron-sriov-agent:13.0-89 "kolla_start" 12 days ago Restarting (2) 26 minutes ago neutron_sriov_agent
b156ed0fbe14 sat:5000/osp-osp13_containers-nova-compute-hotfix:13.0-98-1703225 "kolla_start" 12 days ago Restarting (2) 26 minutes ago nova_compute
44e83ede071d sat:5000/osp-osp13_containers-nova-compute-hotfix:13.0-98-1703225 "kolla_start" 12 days ago Restarting (2) 26 minutes ago nova_migration_target
f851155eaafd sat:5000/osp-osp13_containers-cron:13.0-90 "kolla_start" 12 days ago Restarting (2) 26 minutes ago logrotate_crond
1e4426c5ead3 sat:5000/osp-osp13_containers-nova-libvirt:13.0-101 "kolla_start" 12 days ago Restarting (2) 26 minutes ago nova_libvirt
e43197378dc1 sat:5000/osp-osp13_containers-nova-libvirt:13.0-101 "kolla_start" 12 days ago Restarting (2) 26 minutes ago nova_virtlogd

The solution was found with the help of engineering.

For an unknown reason, the /etc/pki directory had been copied into subfolders of /var/lib/config-data/puppet-generated/<container>/.
As a result, when the containers were started, the bind that should have used /etc/pki/ca-trust/extracted used the copy inside /var/lib/config-data/puppet-generated/<container>/etc/pki/ca-trust/extracted instead.

Removing that folder from each subfolder solved the issue and allowed us to start the containers.
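
To locate the affected subfolders before removing anything, a check along the following lines may help. This is only a sketch: the function name find_stray_dirs and its rel_path parameter are made up for illustration, and the script only lists candidates; it deletes nothing.

```python
import os


def find_stray_dirs(root, rel_path="etc/pki"):
    """Return the sorted names of subfolders of root that contain rel_path as a directory.

    find_stray_dirs and rel_path are hypothetical names used for this sketch only.
    """
    hits = []
    if not os.path.isdir(root):
        return hits
    for name in sorted(os.listdir(root)):
        candidate = os.path.join(root, name, rel_path)
        if os.path.isdir(candidate):
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Path taken from the description above.
    root = "/var/lib/config-data/puppet-generated"
    for name in find_stray_dirs(root):
        # Print each copy so it can be reviewed before it is removed by hand.
        print(os.path.join(root, name, "etc/pki"))
```

Listing first and removing by hand keeps the cleanup reviewable, since it is not yet known why the directory was copied there.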

Note that the directory was not copied into /var/lib/config-data/puppet-generated/iscsid/, and that container was running just fine.

The theory so far is that /etc/pki was modified for some reason on that single compute node and, because nothing in the code restricts it, the folder was copied into the puppet-generated folder to be fed to the containers.

I have logs from that compute node for analysis.
If anything else is needed please let me know.

I'm also asking the client to check internally which OpenStack operations were performed around January 8th, to explain why this directory was changed.

Version-Release number of selected component (if applicable):
puppet-tripleo-8.4.1-14.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. cp -r /etc/pki /var/lib/config-data/puppet-generated/<subfolder>/etc/
2. docker restart <container>
3. docker container stuck in Restarting state

Actual results:
docker container stuck in Restarting state

Expected results:
docker containers start normally

Additional info:

Comment 3 Alex Schultz 2020-07-01 13:00:46 UTC
Please ensure that /var/lib/config-data/puppet-generated/*/etc/hosts does not exist. This is bug 1794119, the fix for which was released just last week.
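
One way to run that check is sketched below, using the glob pattern quoted above. The function name stray_hosts_files is hypothetical, and the script only reports matches; it removes nothing.

```python
import glob
import os


def stray_hosts_files(root="/var/lib/config-data/puppet-generated"):
    """Return the sorted paths matching <root>/*/etc/hosts.

    stray_hosts_files is a hypothetical helper name used for this sketch only.
    """
    return sorted(glob.glob(os.path.join(root, "*", "etc", "hosts")))


if __name__ == "__main__":
    matches = stray_hosts_files()
    if matches:
        for path in matches:
            print("unexpected:", path)
    else:
        print("no etc/hosts found under puppet-generated")
```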

*** This bug has been marked as a duplicate of bug 1794119 ***

