Bug 1452939

Summary: [3.5] Should use "imagePullPolicy: IfNotPresent" instead of "imagePullPolicy: Always" in logging and metrics deployer images
Product: OpenShift Container Platform
Reporter: Antonio Gallego <agallego>
Component: Installer
Assignee: Jan Wozniak <jwozniak>
Status: CLOSED ERRATA
QA Contact: Anping Li <anli>
Severity: urgent
Docs Contact:
Priority: high
Version: 3.5.0
CC: aos-bugs, bmcelvee, jokerman, mmccomas, sdodson, tatanaka, xiazhao
Target Milestone: ---
Target Release: 3.5.z
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
With this bug fix, the `imagePullPolicy` for logging and metrics images is now set to `IfNotPresent` rather than `Always`, which prevents unnecessary image pulls.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-21 05:41:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Antonio Gallego 2017-05-20 18:59:23 UTC
Description of problem:

The page https://docs.openshift.com/container-platform/3.5/install_config/install/disconnected_install.html states:

Pull all of the required OpenShift Container Platform containerized components for the additional centralized log aggregation and metrics aggregation components. Replace <tag> with 3.5.0 for the latest version.

# docker pull registry.access.redhat.com/openshift3/logging-deployer:<tag>
# docker pull registry.access.redhat.com/openshift3/logging-elasticsearch:<tag>
# docker pull registry.access.redhat.com/openshift3/logging-kibana:<tag>
# docker pull registry.access.redhat.com/openshift3/logging-fluentd:<tag>
# docker pull registry.access.redhat.com/openshift3/logging-curator:<tag>
# docker pull registry.access.redhat.com/openshift3/logging-auth-proxy:<tag>
# docker pull registry.access.redhat.com/openshift3/metrics-deployer:<tag>
# docker pull registry.access.redhat.com/openshift3/metrics-hawkular-metrics:<tag>
# docker pull registry.access.redhat.com/openshift3/metrics-cassandra:<tag>
# docker pull registry.access.redhat.com/openshift3/metrics-heapster:<tag>

The logging-deployer:3.5.0 and metrics-deployer:3.5.0 images don't exist on the registry, so these two lines should be removed from that list.


Version-Release number of selected component (if applicable):

3.5.0


How reproducible:

Try to pull both images. They don't exist.

Searching for either image on the Red Hat registry also shows that they don't exist.


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:

Remove the lines:
# docker pull registry.access.redhat.com/openshift3/logging-deployer:<tag>
# docker pull registry.access.redhat.com/openshift3/metrics-deployer:<tag>

Additional info:


Document URL: 

https://docs.openshift.com/container-platform/3.5/install_config/install/disconnected_install.html#disconnected-syncing-images

Section Number and Name: 

Describe the issue: 

logging-deployer and metrics-deployer images don't exist but they are listed on that page.

Suggestions for improvement: 

Additional information:

Comment 1 Takayoshi Tanaka 2017-08-30 02:34:27 UTC
It appears the document has been changed to use "v3.5". However, the customer fails to deploy metrics in a disconnected environment even though he follows the document.

He is installing OpenShift 3.5 and has pulled these images:

************************:
# docker images | grep metrics
registry.access.redhat.com/openshift3/metrics-hawkular-metrics     3.5.0               e0d108bd9b0c        6 weeks ago         1.27 GB
registry.access.redhat.com/openshift3/metrics-hawkular-metrics     v3.5                e0d108bd9b0c        6 weeks ago         1.27 GB
registry.access.redhat.com/openshift3/metrics-cassandra            3.5.0               042236fd907e        6 weeks ago         540.6 MB
registry.access.redhat.com/openshift3/metrics-cassandra            v3.5                042236fd907e        6 weeks ago         540.6 MB
registry.access.redhat.com/openshift3/metrics-heapster             3.5.0               4e29df6bda85        8 weeks ago         318.5 MB
registry.access.redhat.com/openshift3/metrics-heapster             v3.5                4e29df6bda85        8 weeks ago         318.5 MB
registry.access.redhat.com/openshift3/metrics-deployer             v3.5                f5c500d7a624        8 weeks ago         892.9 MB
#
************************:

Here are the parameters:

openshift_hosted_metrics_deploy=true
openshift_hosted_metrics_storage_kind=nfs
openshift_hosted_metrics_storage_access_modes=['ReadWriteOnce']
openshift_hosted_metrics_storage_host=XXX.XXX.XXX.XXX
openshift_hosted_metrics_storage_nfs_directory=/metrics
openshift_hosted_metrics_storage_volume_name=metrics
openshift_hosted_metrics_storage_volume_size=10Gi

He then got the errors below:

************************:
# oc get pod --all-namespaces
NAMESPACE         NAME                         READY     STATUS             RESTARTS   AGE
default           docker-registry-2-x6tr8      1/1       Running            2          2d
default           registry-console-1-wbnsp     1/1       Running            1          2d
default           router-1-1d3wq               1/1       Running            1          2d
default           router-1-c4wmq               1/1       Running            1          2d
default           router-1-p8s0n               1/1       Running            2          2d
openshift-infra   hawkular-cassandra-1-d8mrk   0/1       ImagePullBackOff   0          2d
openshift-infra   hawkular-metrics-29tpv       0/1       ImagePullBackOff   0          2d
openshift-infra   heapster-2m7rc               0/1       ImagePullBackOff   0          2d
#
************************:

After he connected his environment to the Internet, metrics was installed successfully. After that, he found that a newer image had been pulled:

[root@master-02-XXX ~]#  docker images | grep metrics
registry.access.redhat.com/openshift3/metrics-hawkular-metrics                        3.5.0               b12e45828aad        4 weeks ago         1.456 GB
registry.access.redhat.com/openshift3/metrics-hawkular-metrics                        v3.5                e0d108bd9b0c        6 weeks ago         1.27 GB
registry.access.redhat.com/openshift3/metrics-cassandra                               3.5.0               042236fd907e        6 weeks ago         540.6 MB
registry.access.redhat.com/openshift3/metrics-cassandra                               v3.5                042236fd907e        6 weeks ago         540.6 MB
registry.access.redhat.com/openshift3/metrics-heapster                                3.5.0               4e29df6bda85        8 weeks ago         318.5 MB
registry.access.redhat.com/openshift3/metrics-heapster                                v3.5                4e29df6bda85        8 weeks ago         318.5 MB
registry.access.redhat.com/openshift3/metrics-deployer                                v3.5                f5c500d7a624        8 weeks ago         892.9 MB

The customer wants to know the correct procedure for a disconnected installation, and the case severity is now Sev1. Could you check the document and provide a workaround soon?

Comment 2 Takayoshi Tanaka 2017-08-31 00:54:10 UTC
The metrics template has imagePullPolicy: Always. Is it related to this issue? Also, neither the customer nor I can set up a disconnected installation right now. However, the customer's Severity is Sev1. I'll escalate this case.

# cat /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/hawkular_metrics_rc.j2 | grep imagePull
        imagePullPolicy: Always

Comment 3 Xia Zhao 2017-08-31 07:05:18 UTC
(In reply to Takayoshi Tanaka from comment #2)
> The metrics template has a imagePullPolicy: Always. Is it related to this
> issue? Also, the customer as well as I can't set up disconnected
> installation right now. However, the customer's Severity is Sev1. I'll
> escalate this case.
> 
> # cat
> /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/
> templates/hawkular_metrics_rc.j2 | grep imagePull
>         imagePullPolicy: Always

Yes, exactly -- Tested on openshift v3.5.5.31.19, the imagePullPolicy matters: 

1. First, reproduced the original issue as follows:
-- Deployed the metrics stack with "imagePullPolicy: Always", which is the default setting in the template, then disconnected the network by changing to the "no-internet" security group. The metrics pods went to "ImagePullBackOff" status, while the other infra pods (e.g. the router and registry pods) stayed in "Running" status.

2. Edited the rc for each metrics pod, changing to "imagePullPolicy: IfNotPresent", then redeployed them. The metrics pods became "Running".

3. Edited the rc for each metrics pod, changing to "imagePullPolicy: Never", then redeployed them. The metrics pods became "Running".
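
The three results above match the documented meaning of each pull policy. As a rough illustration only, the decision can be modelled as below. This is a simplified sketch, not the kubelet's actual code; the function name pull_decision and its output words are made up for this example.

```shell
#!/bin/sh
# Hypothetical model of how a node treats each imagePullPolicy value.
# $1 = policy, $2 = "yes" if the image is already in the node's local store.
pull_decision() {
    policy=$1
    present=$2
    case "$policy" in
        Always)
            # The registry is contacted on every container start,
            # so a disconnected node cannot start the pod.
            echo registry ;;
        IfNotPresent)
            # The registry is only needed when the image is missing locally.
            if [ "$present" = yes ]; then echo local; else echo registry; fi ;;
        Never)
            # The registry is never contacted; a missing image is an error.
            if [ "$present" = yes ]; then echo local; else echo fail; fi ;;
        *)
            echo "unknown policy: $policy" >&2
            return 2 ;;
    esac
}
```

With the images pre-pulled (second argument "yes"), only Always still depends on the registry, which is consistent with the ImagePullBackOff seen in step 1.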

Comment 4 Takayoshi Tanaka 2017-08-31 07:08:53 UTC
Thanks a lot! I'll reply to the customer.

Comment 5 Vikram Goyal 2017-08-31 07:10:43 UTC
Thanks Xia! Much appreciated!!

Takayoshi - please let us know the outcome with the customer and if we need to put this fix in the docs.

Comment 6 Takayoshi Tanaka 2017-08-31 07:19:57 UTC
Can I confirm one thing? Is the workaround to modify all the imagePullPolicy settings in the logging and metrics templates?

# grep imagePullPolicy /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/* 
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/hawkular_cassandra_rc.j2:        imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/hawkular_metrics_rc.j2:        imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/hawkular_openshift_agent_ds.j2:        imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/heapster.j2:        imagePullPolicy: Always

# grep imagePullPolicy /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging*/*/*
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging_curator/templates/curator.j2:          imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging_elasticsearch/templates/es.j2:          imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging_fluentd/templates/fluentd.j2:        imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging_kibana/templates/kibana.j2:          imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging_kibana/templates/kibana.j2:          imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging_mux/templates/mux.j2:        imagePullPolicy: Always
/usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging/templates/jks_pod.j2:    imagePullPolicy: Always

Comment 7 Takayoshi Tanaka 2017-08-31 13:11:21 UTC
The customer wants Red Hat to provide a patch command until the issue is fixed. Are these commands enough?

# grep -l 'imagePullPolicy: Always' /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/* | xargs sed -i.bak -e 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/g'

# grep -l 'imagePullPolicy: Always' /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging*/*/* | xargs sed -i.bak -e 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/g'

Also, I found that the same issue exists in OCP 3.6.
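
For anyone reusing the two commands above, here is a minimal sketch that wraps the same sed substitution in a function, so it can be re-run without touching the .bak backups. The function name patch_pull_policy is hypothetical, and the sketch assumes a grep that supports -r and --exclude (as GNU grep does).

```shell
#!/bin/sh
# Hypothetical helper: switch imagePullPolicy from Always to IfNotPresent
# in every file under the given directories, keeping a .bak copy of each.
patch_pull_policy() {
    for dir in "$@"; do
        # List only files that still contain the old policy; skip earlier
        # backups so a second run does not create .bak.bak files.
        grep -rl --exclude='*.bak' 'imagePullPolicy: Always' "$dir" |
        while IFS= read -r file; do
            sed -i.bak -e 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/g' "$file"
            echo "patched: $file"
        done
    done
}
```

It would be called with the template directories from this report, for example the openshift_metrics/templates directory and the openshift_logging* role directories. Because the search is recursive, the role directory itself can be passed instead of the */* glob.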

Comment 8 Takayoshi Tanaka 2017-09-01 05:18:06 UTC
Could you confirm whether the workaround is sufficient?

Comment 9 Xia Zhao 2017-09-01 09:21:11 UTC
Confirmed that the workaround steps in comment #7 worked fine in a disconnected installation environment for logging & metrics deployments on openshift v3.5.5.31.19. All logging & metrics pods reach Running status after deployment.

Pod statuses, for more detail:
openshift-infra   hawkular-cassandra-1-rpxj9       1/1       Running   0          4m
openshift-infra   hawkular-metrics-7d006           1/1       Running   0          4m
openshift-infra   heapster-x5bgm                   1/1       Running   0          4m
default           hawkular-openshift-agent-7jm6h   1/1       Running   0          2m
logging           logging-curator-1-wpv32          1/1       Running   1          31m
logging           logging-es-oe4fy2fg-1-kxkbh      1/1       Running   0          15m
logging           logging-fluentd-8x746            1/1       Running   0          15m
logging           logging-fluentd-z3vjz            1/1       Running   0          15m
logging           logging-kibana-1-fwcn5           2/2       Running   0          31m

Comment 10 Xia Zhao 2017-09-01 09:26:50 UTC
One more thing: for hawkular-openshift-agent, you have to modify the imagePullPolicy here: https://github.com/openshift/origin-metrics/blob/enterprise/hawkular-openshift-agent/hawkular-openshift-agent.yaml#L67, and make sure the hawkular-openshift-agent image is present on all the OpenShift nodes.

Comment 11 Takayoshi Tanaka 2017-09-08 00:00:44 UTC
The customer confirmed the workaround went fine. As the customer doesn't use hawkular-openshift-agent, the executed steps are as follows:

# grep -l 'imagePullPolicy: Always' /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_metrics/templates/* | xargs sed -i.bak -e 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/g'
# grep -l 'imagePullPolicy: Always' /usr/share/ansible/openshift-ansible/playbooks/byo/roles/openshift_logging*/*/* | xargs sed -i.bak -e 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/g'


Also, note that I recommended executing these two commands on all masters. The first master alone seems to be enough as far as the playbook is concerned, but I suggested all of them just in case.

Comment 12 Matt Wringe 2017-09-13 19:17:38 UTC
The only recommended approach to install metrics and logging is to use ansible. 

If we want to have a disconnected installation process, it should be done by the playbooks, which should configure the pull policy.

Comment 15 Anping Li 2017-10-12 04:34:28 UTC
The fix is not in the latest package, openshift-ansible:v3.5.132. Waiting for new packages.

Comment 17 Anping Li 2017-10-16 02:56:12 UTC
@scott,  
1) I think the correct errata for this bug should be https://errata.devel.redhat.com/advisory/30242,

2) The fix is in openshift-ansible-roles-3.5.134-1.git.0.e5f4029.el7.noarch. But the latest packages openshift-ansible-roles-3.5.132  is not in 30242 




[1]
with openshift-ansible-roles-3.5.134-1.git.0.e5f4029.el7.noarch

[root@131c8e9a37a7 roles]# grep -r imagePullPolicy |grep openshift_logging
openshift_logging/README.md:- Default imagePullPolicy changed from Always to IfNotPresent 
openshift_logging/templates/curator.j2:          imagePullPolicy: IfNotPresent
openshift_logging/templates/es.j2:          imagePullPolicy: IfNotPresent
openshift_logging/templates/fluentd.j2:        imagePullPolicy: IfNotPresent
openshift_logging/templates/jks_pod.j2:    imagePullPolicy: IfNotPresent
openshift_logging/templates/kibana.j2:          imagePullPolicy: IfNotPresent
openshift_logging/templates/kibana.j2:          imagePullPolicy: IfNotPresent

Comment 19 Scott Dodson 2017-10-16 14:44:06 UTC
I've moved it to the next errata.

Comment 21 openshift-github-bot 2017-10-25 01:27:13 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/862f50ff66324d7d1f23fe9bedd5d9d664578302
Bug 1452939 - change Logging & Metrics imagePullPolicy

- all images logging and metrics change their default imagePullPolicy
  from Always to IfNotPresent

https://github.com/openshift/openshift-ansible/commit/f0da12b7292cddabfc7c33206cabf0ff34aa9852
Merge pull request #5700 from wozniakjan/bz_1452939

Automatic merge from submit-queue.

Bug 1452939 - change imagePullPolicy in logging and metrics

cc: @jcantrill

Comment 22 Anping Li 2017-10-26 09:15:50 UTC
Verified; this passes with openshift-ansible-roles-3.5.137.

Once deployed by openshift-ansible-roles-3.5.137,  the imagePullPolicy is IfNotPresent.
# oc get ds -n logging -o yaml |grep imagePullPolicy
          imagePullPolicy: IfNotPresent
# oc get dc -n logging -o yaml |grep imagePullPolicy
          imagePullPolicy: IfNotPresent
          imagePullPolicy: IfNotPresent
          imagePullPolicy: IfNotPresent
          imagePullPolicy: IfNotPresent
[root@host-8-241-7 ~]#
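
The manual grep check above could also be scripted. The following is only a sketch: the function name check_pull_policy is hypothetical, and it simply inspects whatever YAML text is piped into it.

```shell
#!/bin/sh
# Hypothetical checker: reads YAML on stdin and fails if any
# imagePullPolicy line is set to something other than IfNotPresent.
check_pull_policy() {
    # Keep only the policy lines, then drop the ones with the expected value.
    bad=$(grep 'imagePullPolicy:' | grep -v 'imagePullPolicy: IfNotPresent' || true)
    if [ -n "$bad" ]; then
        echo "unexpected pull policy:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}
```

It could be fed from the same commands used in this comment, for example by piping the output of "oc get dc -n logging -o yaml" into check_pull_policy. Note that input with no imagePullPolicy lines at all also passes, so an empty result from oc should be checked separately.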

Comment 25 errata-xmlrpc 2017-11-21 05:41:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3255