Bug 1654064

Summary: AMIs built with 3.11.44 for CRIO contain invalid /etc/containers/registries.conf
Product: OpenShift Container Platform Reporter: Justin Pierce <jupierce>
Component: Installer Assignee: Urvashi Mohnani <umohnani>
Installer sub component: openshift-ansible QA Contact: Gaoyun Pei <gpei>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: brad.williams, dornelas, gpei, mwoodson, rbohne, umohnani, vrutkovs
Version: 3.11.0   
Target Milestone: ---   
Target Release: 3.11.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-01-30 15:19:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1186913    

Description Justin Pierce 2018-11-28 00:38:11 UTC
Description of problem:

Docker fails to start on these AMIs because of the following entry in /etc/containers/registries.conf:
>>>>
# If you need to access insecure registries, add the registry's fully-qualified name.
# An insecure registry is one that does not have a valid SSL certificate or only does HTTP.
[registries.insecure]
registries = [""]
<<<<

When this entry is present, /var/run/containers/registries.conf is computed by docker as follows:
>>>>
REGISTRIES="--add-registry registry.redhat.io --add-registry docker.io --insecure-registry  "
<<<< 

Because nothing follows --insecure-registry, dockerd rejects the flag and docker fails to start:
>>>>
Nov 14 12:58:48 ip-172-16-0-24.ec2.internal systemd[1]: Starting Docker Application Container Engine...
Nov 14 12:58:48 ip-172-16-0-24.ec2.internal dockerd-current[4419]: Status: flag needs an argument: --insecure-registry
Nov 14 12:58:48 ip-172-16-0-24.ec2.internal dockerd-current[4419]: See 'dockerd --help'.
Nov 14 12:58:48 ip-172-16-0-24.ec2.internal dockerd-current[4419]: Usage:        dockerd COMMAND
Nov 14 12:58:48 ip-172-16-0-24.ec2.internal dockerd-current[4419]: A self-sufficient runtime for containers.
Nov 14 12:58:48 ip-172-16-0-24.ec2.internal dockerd-current[4419]: Options:
....
<<<<
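A minimal Python sketch of the failure mode described above. It is not the actual script that writes /var/run/containers/registries.conf; the function names are hypothetical, and the shell-style word splitting via shlex is only an approximation of how the unit file expands $REGISTRIES. It shows how a single empty-string entry yields an --insecure-registry flag with nothing after it.

```python
import shlex


def registries_options(search, insecure):
    """Build an option string shaped like the REGISTRIES value in this report.

    Hypothetical helper for illustration only.
    """
    parts = [f"--add-registry {name}" for name in search]
    parts += [f"--insecure-registry {name}" for name in insecure]
    return " ".join(parts) + " "


def flags_missing_argument(options):
    """Return flags that are followed by another flag or by nothing at all."""
    tokens = shlex.split(options)  # whitespace splitting drops the empty value
    missing = []
    for i, tok in enumerate(tokens):
        if tok.startswith("--"):
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt is None or nxt.startswith("--"):
                missing.append(tok)
    return missing


bad = registries_options(["registry.redhat.io", "docker.io"], [""])
good = registries_options(["registry.redhat.io", "docker.io"], [])
print(f'REGISTRIES="{bad}"')
print(flags_missing_argument(bad))   # ['--insecure-registry']
print(flags_missing_argument(good))  # []
```

With `[""]` the rendered string matches the one quoted in the description; with `[]` no --insecure-registry flag is emitted at all.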

Version-Release number of the following components:
v3.11.44. This issue did not exist in v3.11.16.

How reproducible:
100%


Actual results:
Docker fails to start.

Expected results:
If the /etc/containers/registries.conf entry is changed to the following (the "" is removed from [""]), docker no longer includes the invalid --insecure-registry argument in the systemd invocation:
>>>>
[registries.insecure]
registries = []
<<<<
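One way to guarantee the expected output is to drop empty entries before the section is rendered. The Python sketch below is an assumption about how that could be done, not the change made in openshift-ansible; `render_insecure_section` and the example hostname are hypothetical.

```python
import json


def render_insecure_section(registries):
    """Render the [registries.insecure] section, skipping blank entries.

    json.dumps is used because a JSON array of plain strings is also a
    valid TOML array of strings.
    """
    cleaned = [name.strip() for name in registries if name and name.strip()]
    return "[registries.insecure]\nregistries = " + json.dumps(cleaned)


print(render_insecure_section([""]))
print(render_insecure_section(["registry.example.com:5000"]))
```

The first call prints `registries = []`, the form requested above; the second shows that real entries pass through unchanged.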

Additional info:
This configuration results from building a CRI-O AMI using openshift-ansible (openshift-ansible/playbooks/aws/openshift-cluster/build_ami.yml), but it is possible other flows are affected. Docker is required on CRI-O nodes for OpenShift builds.

Comment 4 brad.williams 2018-12-04 19:19:59 UTC
We have created an AMI with the fix contained in the following PR:
https://github.com/openshift/openshift-ansible/pull/10799

We are seeing the following content in /etc/containers/registries.conf:

[registries.search]
registries = [["registry.redhat.io", "docker.io"]]


# If you need to access insecure registries, add the registry's fully-qualified name.
# An insecure registry is one that does not have a valid SSL certificate or only does HTTP.
[registries.insecure]
registries = [[]]

The extra set of embedded brackets is preventing docker from starting properly.
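Both malformed shapes seen in this bug (an empty string entry and a nested list) can be caught before an AMI ships. The following Python sketch is a hypothetical check, not part of openshift-ansible. It assumes each `registries = ...` value sits on a single line and reads it with `ast.literal_eval`, which works here only because these simple TOML string arrays happen to be valid Python literals.

```python
import ast


def find_bad_registry_lists(conf_text):
    """Return (section, problem) pairs for malformed registries entries."""
    problems = []
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]") and "=" not in line:
            section = line[1:-1]  # e.g. "registries.insecure"
        elif line.startswith("registries") and "=" in line:
            value = ast.literal_eval(line.split("=", 1)[1].strip())
            for item in value:
                if isinstance(item, list):
                    problems.append((section, "nested list"))
                elif item == "":
                    problems.append((section, "empty string"))
    return problems


conf = """
[registries.search]
registries = [["registry.redhat.io", "docker.io"]]

# If you need to access insecure registries, add the registry's fully-qualified name.
[registries.insecure]
registries = [[]]
"""
print(find_bad_registry_lists(conf))
```

Run against the content quoted in this comment, the check reports a nested list in both sections; a file containing `registries = []` produces no findings.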

Comment 5 Scott Dodson 2019-01-03 18:39:07 UTC
Fixed in openshift-ansible-3.11.58-1 and later.

Comment 6 Gaoyun Pei 2019-01-04 07:50:05 UTC
Reproduced this bug with openshift-ansible-3.11.44-1.git.0.11d174e.el7.noarch.rpm.

When setting up a mixed cri-o and docker environment with no openshift_docker_insecure_registries specified in the ansible inventory file,
the installer leaves the value '[""]' in [registries.insecure] of /etc/containers/registries.conf:

[root@ip-172-18-10-241 ~]# grep '\[registries.insecure\]' -A 1 /etc/containers/registries.conf
[registries.insecure]
registries = [""]

The docker service then fails to start because of the invalid argument in /var/run/containers/registries.conf:

Jan 04 02:03:15 ip-172-18-10-241.ec2.internal systemd[1]: Starting Docker Application Container Engine...
Jan 04 02:03:15 ip-172-18-10-241.ec2.internal dockerd-current[22185]: Status: flag needs an argument: --insecure-registry


[root@ip-172-18-10-241 ~]# cat /var/run/containers/registries.conf
REGISTRIES="--add-registry registry.redhat.io --add-registry docker.io --insecure-registry  "



Verified with openshift-ansible-3.11.65-1.git.0.6a0837b.el7.noarch.rpm.
With PR https://github.com/openshift/openshift-ansible/pull/10799 applied, the default value of [registries.insecure] turns into [].

[root@ip-172-18-5-125 ~]# grep '\[registries.insecure\]' -A 1 /etc/containers/registries.conf
[registries.insecure]
registries = []
[root@ip-172-18-5-125 ~]# cat /var/run/containers/registries.conf
REGISTRIES="--add-registry registry.access.redhat.com --add-registry docker.io --add-registry registry.fedoraproject.org --add-registry quay.io --add-registry registry.centos.org "

The cri-o and docker services both run well.

Comment 8 errata-xmlrpc 2019-01-30 15:19:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0096