Bug 1803198 - docker-puppet failing because /opt/puppetlabs is read-only
Summary: docker-puppet failing because /opt/puppetlabs is read-only
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Alex Schultz
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-14 16:25 UTC by Lars Kellogg-Stedman
Modified: 2023-09-07 21:52 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.4.1-60.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-28 18:23:41 UTC
Target Upstream Version:
Embargoed:


Links
System                             ID              Private  Priority  Status  Summary                           Last Updated
OpenStack gerrit                   733147          0        None      MERGED  Improve facter cache reliability  2021-01-27 07:14:50 UTC
Red Hat Issue Tracker              OSP-18865       0        None      None    None                              2022-09-22 06:34:50 UTC
Red Hat Knowledge Base (Solution)  5154501         0        None      None    None                              2020-06-12 15:22:30 UTC
Red Hat Product Errata             RHBA-2020:4388  0        None      None    None                              2020-10-28 18:23:57 UTC

Description Lars Kellogg-Stedman 2020-02-14 16:25:38 UTC
Description of problem:

During an OSP 13 deploy, the docker-puppet step is failing with:

  puppetlabs.facter - unhandled exception: boost::filesystem::remove: Read-only file system: "/opt/puppetlabs/facter/cache/cached_facts/kernel"

That path is mounted inside the container by docker-puppet.py like this:

  --volume /var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro

It is explicitly read-only. We could fix that error by mounting a tmpfs filesystem on /opt/puppetlabs/facter: i.e., by modifying docker-puppet.py so that the final command line looks like:

  --volume /var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro --tmpfs /opt/puppetlabs/facter

But it seems like if that were the problem everyone would be hitting it, so I assume that something else is going on.
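
For illustration only, here is a minimal sketch of how the proposed mount arguments could be assembled. The function name and its parameters are hypothetical and are not taken from docker-puppet.py; only the paths and the --volume/--tmpfs options come from the command lines quoted above.

```python
def puppet_mount_args(host_dir="/var/lib/container-puppet/puppetlabs/",
                      container_dir="/opt/puppetlabs/",
                      writable_facter=True):
    """Return the docker run arguments for the puppetlabs mount.

    Hypothetical helper, not part of docker-puppet.py.
    """
    # The host directory stays read-only inside the container.
    args = ["--volume", "{}:{}:ro".format(host_dir, container_dir)]
    if writable_facter:
        # Overlay a tmpfs so facter can write its cache directory.
        args += ["--tmpfs", container_dir.rstrip("/") + "/facter"]
    return args


print(" ".join(puppet_mount_args()))
```

One caveat with this approach: a fresh tmpfs starts empty, so it would hide the pre-cached facts under /opt/puppetlabs/facter and facter would have to resolve them again inside the container.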

Version-Release number of selected component (if applicable):

On the undercloud:

openstack-tripleo-heat-templates-8.4.1-16.el7ost.noarch

We are currently hitting this with tag 13.0-96 of the openstack-keystone image, which has:

puppet-4.8.2-3.el7ost.noarch
facter-3.9.3-7.el7ost.x86_64

Comment 2 Lars Kellogg-Stedman 2020-02-14 16:58:38 UTC
Kevin,

I guess I wasn't clear in the description.  This is an error coming from puppet running *inside the container* on an *overcloud* node. The read-only status is due entirely to the :ro flag on the --volume mount.

The local filesystem is not read-only and is not out of space.

[root@neu-15-39-control1 container-puppet]# cd /var/lib/container-puppet/puppetlabs/
[root@neu-15-39-control1 puppetlabs]# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       558G   85G  473G  16% /
[root@neu-15-39-control1 puppetlabs]# touch a_test_file
[root@neu-15-39-control1 puppetlabs]# rm a_test_file
rm: remove regular empty file ‘a_test_file’? y
[root@neu-15-39-control1 puppetlabs]#

Comment 4 Kevin Carter 2020-02-14 17:38:04 UTC
Thanks for clarifying where the issue is happening. 

Does "/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/kernel" exist on the host?

In inspecting one of my local environments I see the following.

[tripleo-admin@overcloud-controller-0 ~]$ find /var/lib/container-puppet/puppetlabs/facter/
/var/lib/container-puppet/puppetlabs/facter/
/var/lib/container-puppet/puppetlabs/facter/cache
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/kernel
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/memory
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/networking
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/operating system
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/processor

[tripleo-admin@overcloud-controller-0 ~]$ cat /var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/kernel
{
  "kernel": "Linux",
  "kernelversion": "3.10.0",
  "kernelrelease": "3.10.0-1062.12.1.el7.x86_64",
  "kernelmajversion": "3.10"
}

Can you confirm that you have something similar?

Comment 5 Lars Kellogg-Stedman 2020-02-14 18:10:59 UTC
Our configuration looks pretty much identical:

[root@neu-15-39-control1 ~]# find /var/lib/container-puppet/puppetlabs/ -type f
/var/lib/container-puppet/puppetlabs/facter.conf
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/kernel
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/memory
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/networking
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/operating system
/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts/processor


Michele has pointed at a bug that may be related, and has suggested the following workaround:

rm /opt/puppetlabs/facter/cache/ -rf
rm -fr /var/lib/container-puppet/puppetlabs/
mkdir /var/lib/container-puppet/puppetlabs/
cat > /var/lib/container-puppet/puppetlabs/facter.conf <<EOF
facts : {
  ttls: [
    { "kernel" : 8 hour },
    { "memory" : 8 hour },
    { "networking" : 8 hour },
    { "operating system" : 8 hour },
    { "processor" : 8 hour },
  ]
}
EOF
facter --config /var/lib/container-puppet/puppetlabs/facter.conf
cp -rdp /opt/puppetlabs/ /var/lib/container-puppet/


The workaround seems to resolve the problem. I think the issue is that when docker-puppet runs, the facter cache is stale, so facter inside the container attempts to update the cache and fails because of the read-only filesystem. Following the procedure that Michele suggested refreshes the cache so that facter in the container won't attempt to update it.  It's not clear to me that either of Alex's patches ([1], [2]) actually addresses this problem.

It seems that the solution would be either to (a) ensure that we refresh the facter cache immediately before running docker-puppet, or (b) just let the cache directory be writable inside the container. I hear that Alex is on PTO right now, but hopefully he can weigh in when he returns. Michele's workaround seems fine for now.

[1]: https://review.opendev.org/#/c/695769/
[2]: https://review.opendev.org/#/c/695758/
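
A rough way to test the stale-cache theory on a host is to compare the age of each cached fact file with the 8 hour TTL from facter.conf. The sketch below assumes that staleness is judged by file modification time; that is an assumption about facter's behavior, and the function name is hypothetical.

```python
import os
import time


def stale_cached_facts(cache_dir, ttl_seconds=8 * 3600, now=None):
    """Return the names of cached fact files older than ttl_seconds.

    Assumption: a cached fact counts as stale once its mtime is more
    than the TTL in the past.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > ttl_seconds:
            stale.append(name)
    return stale


# Example call on an overcloud node (path taken from this bug):
# stale_cached_facts("/var/lib/container-puppet/puppetlabs/facter/cache/cached_facts")
```

A non-empty result just before docker-puppet runs would be consistent with facter trying to rewrite the cache inside the container.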

Comment 6 Alex Schultz 2020-02-17 15:53:17 UTC
Will need the current failures list. I'm not sure what's going on here.  docker-puppet.py executions shouldn't generate any new cached facts, because that's handled prior to the docker-puppet execution. If you're running things by hand and not pre-caching the facts before running docker-puppet.py, you'll get this kind of error. I've also seen cases where the pre-cache fails (because of bad facter versions on the host), which results in a similar error because the cache is empty.

Comment 7 David Vallee Delisle 2020-06-12 14:49:04 UTC
[1] I've reproduced this problem in one of our internal labs with ConfigDebug enabled, if that helps, during a z2 to z11 update.

The issue I had was that I changed the Satellite activation key to point to the z11 content view, but I didn't run "yum clean all" afterward, so facter wasn't updated [2].

After a "yum clean all", facter got updated and the update got past this error.

I'll make a KCS about this.

[1]
~~~
        "2020-06-12 13:51:08,364 DEBUG: 649337 -- Running docker command: /usr/bin/docker run --user root --name docker-puppet-neutron-e8ssd8x6 --env PUPPET_TAGS=file,file_line,concat,augeas,cron,neutron_plugin_ml2,neutron_config,neutron_agent_ovs,neutron_plugin_ml2 --env NAME=neutron --env HOSTNAME=ess13z2-scpu-1 --env NO_ARCHIVE= --env STEP=6 --volume /etc/localtime:/etc/localtime:ro --volume /tmp/tmpegBWjp:/etc/config.pp:ro,z --volume /etc/puppet/:/tmp/puppet-etc/:ro,z --volume /usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro --volume /var/lib/config-data:/var/lib/config-data/:z --volume tripleo_logs:/var/log/tripleo/ --volume /dev/log:/dev/log --volume /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume /etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro --volume /etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume /var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro --volume /var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro --volume /var/lib/docker-puppet/docker-puppet.sh:/var/lib/docker-puppet/docker-puppet.sh:z --volume /lib/modules:/lib/modules:ro --volume /run/openvswitch:/run/openvswitch --entrypoint /var/lib/docker-puppet/docker-puppet.sh --net host --volume /etc/hosts:/etc/hosts:ro docker-registry.upshift.redhat.com/ess-rhosp13/openstack-neutron-server:z11",
        "2020-06-12 13:51:08,394 DEBUG: 649335 -- Trying to pull repository docker-registry.upshift.redhat.com/ess-rhosp13/openstack-nova-compute ... ",
        "z11: Pulling from docker-registry.upshift.redhat.com/ess-rhosp13/openstack-nova-compute",
        "Digest: sha256:14b602cab1715b5a1135f7ebc4272e10fe5a271b317d58d01374ad4bf1af95fb",
        "Status: Image is up to date for docker-registry.upshift.redhat.com/ess-rhosp13/openstack-nova-compute:z11",
        "2020-06-12 13:51:08,399 DEBUG: 649335 -- NET_HOST enabled",
        "2020-06-12 13:51:08,399 DEBUG: 649335 -- Running docker command: /usr/bin/docker run --user root --name docker-puppet-nova_libvirt-2n518ql9 --env PUPPET_TAGS=file,file_line,concat,augeas,cron,nova_config,nova_paste_api_ini,libvirtd_config,nova_config,file,libvirt_tls_password --env NAME=nova_libvirt --env HOSTNAME=ess13z2-scpu-1 --env NO_ARCHIVE= --env STEP=6 --volume /etc/localtime:/etc/localtime:ro --volume /tmp/tmpJGruYZ:/etc/config.pp:ro,z --volume /etc/puppet/:/tmp/puppet-etc/:ro,z --volume /usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro --volume /var/lib/config-data:/var/lib/config-data/:z --volume tripleo_logs:/var/log/tripleo/ --volume /dev/log:/dev/log --volume /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume /etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro --volume /etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume /var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro --volume /var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro --volume /var/lib/docker-puppet/docker-puppet.sh:/var/lib/docker-puppet/docker-puppet.sh:z --entrypoint /var/lib/docker-puppet/docker-puppet.sh --net host --volume /etc/hosts:/etc/hosts:ro docker-registry.upshift.redhat.com/ess-rhosp13/openstack-nova-compute:z11",
        "2020-06-12 13:51:16,356 ERROR: 649335 -- Failed running docker-puppet.py for nova_libvirt",
        "2020-06-12 13:51:16,357 ERROR: 649335 -- Notice: hiera(): Cannot load backend module_data: cannot load such file -- hiera/backend/module_data_backend",
        "Notice: hiera(): Cannot load backend module_data: cannot load such file -- hiera/backend/module_data_backend",
        "2020-06-12 13:51:16,357 ERROR: 649335 -- + mkdir -p /etc/puppet",
        "+ cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet",
        "+ rm -Rf /etc/puppet/ssl",
        "+ echo '{\"step\": 6}'",
        "+ TAGS=",
        "+ '[' -n file,file_line,concat,augeas,cron,nova_config,nova_paste_api_ini,libvirtd_config,nova_config,file,libvirt_tls_password ']'",
        "+ TAGS='--tags file,file_line,concat,augeas,cron,nova_config,nova_paste_api_ini,libvirtd_config,nova_config,file,libvirt_tls_password'",
        "+ origin_of_time=/var/lib/config-data/nova_libvirt.origin_of_time",
        "+ touch /var/lib/config-data/nova_libvirt.origin_of_time",
        "+ sync",
        "+ set +e",
        "+ export FACTER_deployment_type=containers",
        "+ FACTER_deployment_type=containers",
        "++ tr '[:upper:]' '[:lower:]'",
        "++ cat /sys/class/dmi/id/product_uuid",
        "+ export FACTER_uuid=4c4c4544-0030-4610-8037-b1c04f485132",
        "+ FACTER_uuid=4c4c4544-0030-4610-8037-b1c04f485132",
        "+ FACTER_hostname=ess13z2-scpu-1",
        "+ /usr/bin/puppet apply --summarize --detailed-exitcodes --color=false --logdest syslog --logdest console --modulepath=/etc/puppet/modules:/usr/share/openstack-puppet/modules --tags file,file_line,concat,augeas,cron,nova_config,nova_paste_api_ini,libvirtd_config,nova_config,file,libvirt_tls_password /etc/config.pp",
        "Error: Facter: Facter.value uncaught exception: boost::filesystem::create_directories: Read-only file system: \"/opt/puppetlabs/facter/cache/cached_facts\"",
        "Error: Could not autoload puppet/provider/service/init: undefined method `downcase' for nil:NilClass",
        "Error: Could not autoload puppet/provider/service/bsd: Could not autoload puppet/provider/service/init: undefined method `downcase' for nil:NilClass",
        "Error: Facter: error while resolving custom facts in /usr/share/openstack-puppet/modules/stdlib/lib/facter/service_provider.rb: Could not autoload puppet/provider/service/bsd: Could not autoload puppet/provider/service/init: undefined method `downcase' for nil:NilClass",
        "Error: Facter: Facter.add uncaught exception: boost::filesystem::create_directories: Read-only file system: \"/opt/puppetlabs/facter/cache/cached_facts\"",
        "Error: Could not autoload puppet/provider/service/debian: Could not autoload puppet/provider/service/init: undefined method `downcase' for nil:NilClass",
        "Error: Facter: error while resolving custom facts in /usr/share/openstack-puppet/modules/stdlib/lib/facter/service_provider.rb: Could not autoload puppet/provider/service/debian: Could not autoload puppet/provider/service/init: undefined method `downcase' for nil:NilClass",
        "Error: Facter: Facter.fact uncaught exception: boost::filesystem::create_directories: Read-only file system: \"/opt/puppetlabs/facter/cache/cached_facts\"",
        "Error: Facter: error while resolving custom fact \"java_version\": undefined method `downcase' for nil:NilClass",
        "Warning: Found multiple default providers for package: norpm, yum, pip3; using norpm",
        "Failed to get D-Bus connection: Operation not permitted",
        "Warning: Could not retrieve fact fqdn",
        "Warning: Could not retrieve fact ipaddress",
        "Warning: Undefined variable 'deploy_config_name'; ",
        "   (file & line not available)",
        "Warning: Undefined variable 'osfamily'; ",
        "Warning: Unknown variable: '::osfamily'. at /etc/puppet/modules/tripleo/manifests/packages.pp:39:10",
        "Warning: Scope(Class[Tripleo::Packages]): enable_install option not supported for this distro.",
        "Warning: Unknown variable: '::hostname'. at /etc/puppet/modules/tripleo/manifests/profile/base/nova.pp:94:6",
        "Warning: This method is deprecated, please use match expressions with Stdlib::Compat::Ipv6 instead. They are described at https://docs.puppet.com/puppet/latest/reference/lang_data_type.html#match-expressions. at [\"/etc/puppet/modules/tripleo/manifests/profile/base/nova.pp\", 100]:[\"/etc/puppet/modules/tripleo/manifests/profile/base/nova/compute.pp\", 59]",
        "   (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:28:in `deprecation')",
        "Warning: ModuleLoader: module 'nova' has unresolved dependencies - it will only see those that are resolved. Use 'puppet module list --tree' to see information about modules",
~~~

[2]
~~~
[root@ess13z2-scpu-1 log]# rpm -qa | grep facter
facter-2.4.4-4.el7.x86_64
[root@ess13z2-ctrl-0 ~]# rpm -qa | grep facter
ruby-facter-3.9.3-7.el7ost.x86_64
facter-3.9.3-7.el7ost.x86_64
~~~
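
The version difference in [2] can be checked mechanically. This is a minimal sketch that parses `rpm -qa | grep facter` output lines like the ones above; it assumes that fact pre-caching needs facter 3 or later, which is inferred from the facter2/facter3 mismatch discussed in this bug. The function names are hypothetical.

```python
import re


def facter_major_versions(rpm_lines):
    """Map each facter package line to its major version number."""
    versions = {}
    for line in rpm_lines:
        name = line.strip()
        # Matches e.g. facter-2.4.4-4.el7.x86_64 or ruby-facter-3.9.3-7.el7ost.x86_64
        match = re.match(r"^(?:ruby-)?facter-(\d+)\.", name)
        if match:
            versions[name] = int(match.group(1))
    return versions


def can_precache(rpm_lines, required_major=3):
    """True if facter is installed and every package is at least required_major.

    Assumption: required_major=3 is the minimum for the facter cache.
    """
    majors = list(facter_major_versions(rpm_lines).values())
    return bool(majors) and all(m >= required_major for m in majors)
```

With the two hosts above, the compute node (facter-2.4.4) would fail this check and the controller (facter-3.9.3) would pass.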

Comment 8 David Vallee Delisle 2020-06-12 15:22:30 UTC
Created: https://access.redhat.com/solutions/5154501

Comment 9 Alex Schultz 2020-06-12 17:47:50 UTC
We have a patch to improve the reliability of this so it will explicitly fail if facter fails.  The read-only /opt/puppetlabs error is caused by facter not pre-caching the facts (usually related to a facter2/facter3 mismatch).
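
To illustrate the "explicitly fail" idea, here is a hedged sketch of a pre-cache step that raises as soon as the command exits non-zero. It is not the merged change; the function name is hypothetical, and the facter command line in the closing comment is taken from the workaround earlier in this bug.

```python
import subprocess


def precache_facts(command):
    """Run the pre-cache command and raise if it exits non-zero.

    Hypothetical helper: surfaces the failure at the pre-cache step
    instead of later, inside the container.
    """
    result = subprocess.run(command,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    if result.returncode != 0:
        raise RuntimeError(
            "fact pre-cache failed with exit code {}: {}".format(
                result.returncode, result.stderr.strip()))
    return result.stdout


# Example call (command from the workaround in comment 5):
# precache_facts(["facter", "--config",
#                 "/var/lib/container-puppet/puppetlabs/facter.conf"])
```

Failing here points the operator at the host's facter installation rather than at a read-only filesystem error inside the container.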

Comment 24 errata-xmlrpc 2020-10-28 18:23:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 13.0 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4388

