Bug 1404193
| Summary: | excluded docker packages should not be removed when update atomic-docker-excluder | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | liujia <jiajliu> |
| Component: | Cluster Version Operator | Assignee: | Troy Dawson <tdawson> |
| Status: | CLOSED ERRATA | QA Contact: | liujia <jiajliu> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.4.0 | CC: | aos-bugs, jiajliu, jokerman, mmccomas |
| Target Milestone: | --- | ||
| Target Release: | 3.4.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | An error in the atomic-openshift-docker-excluder package led to packages being removed from the exclusion list when the package was upgraded. This error has been resolved, ensuring that the proper packages are excluded from yum operations. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-01-31 20:19:22 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
liujia
2016-12-13 10:01:11 UTC
I cannot reproduce this error using the "Steps to Reproduce" in your report. After step 1, your "before update:" results are different from mine.
- Fresh EC2 instance registered with subscription manager
> # oc version
> oc v3.3.1.7
> ...
>
> # yum -y install atomic-openshift-docker-excluder atomic-openshift-excluder
> ...
>
> # rpm -q atomic-openshift-docker-excluder atomic-openshift-excluder
> atomic-openshift-docker-excluder-3.3.1.7-1.git.0.0988966.el7.noarch
> atomic-openshift-excluder-3.3.1.7-1.git.0.0988966.el7.noarch
>
> # grep exclude /etc/yum.conf
> excludes=
> excludes=
I do not have `openshift_additional_repos` set to any value in my inventory. I am not sure how to reproduce the error you described. Please advise.

Hi Tim,
Your "before update:" output differs from mine because of a bug in 3.3 (1403696) that has been fixed in 3.4. You can work around it as follows:
1) Install atomic-openshift-docker-excluder (do not install atomic-openshift-excluder first, because of bug 1403655).
2) Edit /usr/sbin/atomic-openshift-docker-excluder (refer to https://github.com/openshift/ose/pull/499):
> - sed -i 's|\(obsoletes=.*\)|\1\nexcludes=|' "${CONF_FILE}"
> + sed -i 's|\(installonly_limit=.*\)|\1\nexclude=|' "${CONF_FILE}"
3) Run "atomic-openshift-docker-excluder exclude".
Then you can follow the remaining steps to reproduce.

liujia, great. Following those instructions worked for me:
> [root@m01 ~]# atomic-openshift-docker-excluder exclude
> Adding exclude= to /etc/yum.conf
> [root@m01 ~]# grep exclude /etc/yum.conf
> exclude= docker*1.20* docker*1.19* docker*1.18* docker*1.17*
> docker*1.16* docker*1.15* docker*1.14* docker*1.13*
Working on reproducing the rest of the issue now.

This does indeed reproduce.
PRE UPGRADE:
> [root@m01 ~]# grep exclude /etc/yum.conf
> exclude= docker*1.20* docker*1.19* docker*1.18* docker*1.17*
> docker*1.16* docker*1.15* docker*1.14* docker*1.13*
> [root@m01 ~]# oc version
> oc v3.3.1.7
> [root@m01 ~]# date
> Wed Dec 14 12:52:19 EST 2016
Then apply the upgrade playbook:
> [root@m01 ~]# oc version && date
> oc v3.4.0.36+ca20a16
> Wed Dec 14 13:05:33 EST 2016
Current docker excluder:
> [root@m01 ~]# grep exclude /etc/yum.conf
> exclude= docker*1.20* docker*1.19* docker*1.18* docker*1.17*
> docker*1.16* docker*1.15* docker*1.14* docker*1.13*
> [root@m01 ~]# rpm -q atomic-openshift-docker-excluder.noarch
> atomic-openshift-docker-excluder-3.3.1.7-1.git.0.0988966.el7.noarch
Upgrade the docker excluder:
> [root@m01 ~]# rpm -q atomic-openshift-docker-excluder.noarch
> atomic-openshift-docker-excluder-3.4.0.36-1.git.0.ca20a16.el7.noarch
> [root@m01 ~]# grep exclude /etc/yum.conf
> exclude=
This behavior is caused by the order in which `rpm` runs package scripts during upgrades. The scripts **of the new (upgrade) package** are run first, followed by those **of the old package**:
1. Run the %pre section of the RPM being installed.
2. Install the files that the RPM provides.
3. Run the %post section of the RPM.
4. Run the %preun of the old package.
5. Delete any old files not overwritten by the newer version. (This step deletes files that the new package does not require.)
6. Run the %postun hook of the old package.
* Source [1].
Verify this yourself with "rpm -U -vv atomic-openshift-docker-excluder-3.4.0.36-1.git.0.ca20a16.el7.noarch.rpm" and read the upgrade process step by step: https://gist.github.com/tbielawa/f74b34103dec4195c8c484c735c0ab6f
----
1: N/A: The new excluder package does not have a %pre (install) section
2: The updated excluder files (/usr/sbin/atomic-openshift-docker-excluder) are installed on the filesystem
3: The %post (install) section of the new rpm is run
> atomic-openshift-docker-excluder exclude
4: The %preun script of the **old** package is run
> /usr/sbin/atomic-openshift-docker-excluder unexclude
5: N/A
6: N/A: Old package has no %postun script
----
What does this mean to us?
3: The new rpm's post-installation script is run. This happens when you install a package for the first time, or upgrade an existing package. This means that the line
> exclude= docker*1.20 docker*1.19* ....
is added to yum.conf. (In reality, since this is an upgrade and presumably you have already run the 'exclude' command, that line **is already present**.)
4: The OLD rpm's pre-uninstall script is run. This action runs the excluder's **unexclude** command (all those packages we just added to the exclude list are removed).
---
In effect RPM is:
- Ensuring that the packages are present in the "exclude=" list.
- Running the old rpm's uninstall process, which results in the excluder's unexclude command REMOVING all of the packages we wanted listed in the exclude list.
We need to patch the order in which these are run or add some logic to the excluder script.
* [1] http://www.ibm.com/developerworks/library/l-rpm2/

Troy,
We came to the conclusion that we should probably make the excluder script ensure that the excludes for docker are correct on install, regardless of current state, and then only fire the postun scriptlet when $1 == 0.

Scott,
The problem with your statement is that the incoming docker-excluder script has no way of knowing what the previous docker-excluder script was excluding. Adding that type of smarts to the scripts will most likely create more bugs.
Comment #5 left out %posttrans, which gets run after everything else. I propose we leave postun as it is and change %post to %posttrans.
I ran some quick tests and it looks like this does the job.
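
The ordering problem discussed in this thread, and the effect of moving the `exclude` call from %post to %posttrans, can be sketched with a small shell simulation. This is only an illustration under stated assumptions: `set_exclude_line`, `conf_exclude`, `conf_unexclude`, and `simulate_upgrade` are hypothetical helpers, not code from the excluder package; they edit a scratch file rather than /etc/yum.conf, and the pattern list is shortened. The sketch also assumes that the old package's %preun keeps calling `unexclude` on upgrade, as described above.

```shell
#!/bin/sh
# Simulation only. conf_exclude/conf_unexclude are hypothetical stand-ins for
# the excluder's "exclude" and "unexclude" subcommands; they edit a scratch
# file, never /etc/yum.conf. The pattern list is shortened for readability.
PATTERNS='docker*1.20* docker*1.19*'

# Rewrite the exclude= line of file $1 so that its value is $2.
set_exclude_line() {
    grep -v '^exclude=' "$1" > "$1.tmp" || true
    printf 'exclude=%s\n' "$2" >> "$1.tmp"
    mv "$1.tmp" "$1"
}
conf_exclude()   { set_exclude_line "$1" " $PATTERNS"; }
conf_unexclude() { set_exclude_line "$1" ""; }

# Replay an upgrade's scriptlets in rpm's order. $1 names the scriptlet in
# which the NEW package calls "exclude": post or posttrans.
simulate_upgrade() {
    conf=$(mktemp)
    printf '[main]\ninstallonly_limit=3\n' > "$conf"
    conf_exclude "$conf"                      # state before the upgrade
    if [ "$1" = post ]; then
        conf_exclude "$conf"                  # new package: %post
    fi
    conf_unexclude "$conf"                    # old package: %preun
    if [ "$1" = posttrans ]; then
        conf_exclude "$conf"                  # new package: %posttrans
    fi
    grep '^exclude=' "$conf"                  # print the final exclude line
    rm -f "$conf"
}

simulate_upgrade post
simulate_upgrade posttrans
```

With `post`, the final line is an empty `exclude=`, matching the behavior reproduced in this bug. With `posttrans`, the patterns are re-added after the old package's scriptlet has run, so the exclude line survives the upgrade.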
For OCP (for this bug), this pull request will fix the problem:
https://github.com/openshift/ose/pull/525
Will get the fix upstream.

Version: atomic-openshift-utils-3.4.55-1.git.0.9cb1f40.el7.noarch
Steps:
1. Install atomic-openshift-docker-excluder-3.3.1.9-1.git.0.a7f5265.el7.noarch
2. Update docker-excluder to 3.4
> # yum update atomic-openshift-docker-excluder
Result:
->before update
> exclude= docker*1.20* docker*1.19* docker*1.18* docker*1.17* docker*1.16* docker*1.15* docker*1.14* docker*1.13*
->after update
> exclude= docker*1.20* docker*1.19* docker*1.18* docker*1.17* docker*1.16* docker*1.15* docker*1.14* docker*1.13*
Verified the bug and changed the status.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:0218