Bug 1374796

Summary: Controller upgrade fails for dependencies, grub2-efi-modules-2.02-0.34.el7_2.x86_64 (@rhelosp-rhel-7.2-z) Requires: grub2-tools = 1:2.02-0.34.el7_2
Product: Red Hat OpenStack
Reporter: Marios Andreou <mandreou>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED NEXTRELEASE
QA Contact: Omri Hochman <ohochman>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: dbecker, jschluet, lmartins, markmc, mburns, morazi, rhel-osp-director-maint, srevivo, ykawada
Target Milestone: rc
Keywords: TestOnly, Triaged
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1375679
Environment:
Last Closed: 2016-10-13 13:44:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1292317, 1375679
Bug Blocks: 1337794

Description Marios Andreou 2016-09-09 16:14:59 UTC
Description of problem:

When trying to upgrade from the OSP 9 poodle to the latest OSP 10 poodle, i.e. with an upgrade init like:


cat > overcloud-repos.yaml <<EOF
parameter_defaults:
  UpgradeInitCommand: |
    set -e
    yum localinstall -y http://rhos-release.virt.bos.redhat.com/repos/rhos-release/rhos-release-latest.noarch.rpm
    # You need Red-Hat 7.3, see
    # https://bugzilla.redhat.com/show_bug.cgi?id=1373140
    rhos-release -P 10 -d -r 7.3
    ! [ -e /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.d ] || rm /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.d
EOF


(so -d for the poodle), after the upgrade init (and repo switch), the controller upgrade fails like this:

        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: Error: Package: 1:grub2-efi-modules-2.02-0.34.el7_2.x86_64 (@rhelosp-rhel-7.2-z)
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: Requires: grub2-tools = 1:2.02-0.34.el7_2
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: Removing: 1:grub2-tools-2.02-0.34.el7_2.x86_64 (installed)
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: grub2-tools = 1:2.02-0.34.el7_2
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: Updated By: 1:grub2-tools-2.02-0.41.el7.x86_64 (rhelosp-rhel-7.3-server)
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: grub2-tools = 1:2.02-0.41.el7
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: [2016-09-08 15:25:53,479] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-script/37d2e8d5-e39a-443c-b31b-6a15a7a13632. [1]
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: [2016-09-08 15:25:53,482] (heat-config) [INFO] Completed /var/lib/heat-config/hooks/script
        Sep 08 15:25:53 overcloud-controller-0.localdomain os-collect-config[5855]: [2016-09-08 15:25:53,483] (heat-config) [DEBUG] Running heat-config-notify /var/lib/heat-config/deployed/37d2e8d5-e39a-443c-b31b-6a15a7a13632.json < /var/lib/heat-config/deployed/37d2e8d5-e39a-443c-b31b-6a15a7a13632.notify.json
        Sep 08 15:25:54 overcloud-controller-0.localdomain os-collect-config[5855]: [2016-09-08 15:25:54,588] (heat-config) [INFO]
        Sep 08 15:25:54 overcloud-controller-0.localdomain os-collect-config[5855]: [2016-09-08 15:25:54,588] (heat-config) [DEBUG] [2016-09-08 15:25:53,775] (heat-config-notify) [DEBUG] Signaling to http://192.0.2.1:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Abd6a8c1f981d41b49edf002f2d9973bc%3Astacks%2Fovercloud-UpdateWorkflow-ldke352vklak-ControllerPacemakerUpgradeDeployment_Step1-mgxcos22dcqm%2Fbee03177-a31f-4bc8-ae72-84d9cdd5b1a5%2Fresources%2F0?Timestamp=2016-09-08T15%3A14%3A52Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=573e6354a9774c60bf8bfefcb89ea7ec&SignatureVersion=2&Signature=z1wmEX%2BMIyZSVfevyRjHQuv3lunxdvbAGKUpv2mOK1w%3D via POST


This seems to be a dependency issue. Today I also tried without the -d (so the latest puddle) and hit the same issue:



cat > overcloud-repos.yaml <<EOF
parameter_defaults:
  UpgradeInitCommand: |
    set -e
    yum localinstall -y http://rhos-release.virt.bos.redhat.com/repos/rhos-release/rhos-release-latest.noarch.rpm
    # You need Red-Hat 7.3, see
    # https://bugzilla.redhat.com/show_bug.cgi?id=1373140
    # NO -d today
    rhos-release -P 10 -r 7.3
    ! [ -e /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.d ] || rm /usr/share/openstack-dashboard/openstack_dashboard/local/local_settings.d
EOF

*=* 18:28:42 *=*=*= "UPGRADE INIT" 
openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates \
  -e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-init.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/updates/update-from-overcloud-compute-hostnames.yaml \
  -e overcloud-repos.yaml


*=* 18:51:27 *=*=*= "CONTROLLER UPGRADE" 
openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates \
  -e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
  -e network_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml



The error from controller-0 looks like:



Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + for S in '${services[@]}'
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + systemctl stop openstack-swift-proxy
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: ++ date +%s
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + tstart=1473437187
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + systemctl is-active pacemaker
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + '[' 0 -eq 1 ']'
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + yum -y install python-zaqarclient
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + yum -y install yum-plugin-versionlock
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + yum versionlock openvswitch
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: + yum -y -q update
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: Error: Package: 1:grub2-efi-modules-2.02-0.44.el7.x86_64 (rhelosp-rhel-7.3-server-opt)
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: Requires: grub2-tools = 1:2.02-0.44.el7
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: Removing: 1:grub2-tools-2.02-0.34.el7_2.x86_64 (installed)
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: grub2-tools = 1:2.02-0.34.el7_2
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: Updated By: 1:grub2-tools-2.02-0.41.el7.x86_64 (rhelosp-rhel-7.3-server)
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: grub2-tools = 1:2.02-0.41.el7
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: [2016-09-09 16:06:37,970] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-script/e7875e05-d166-4997-ab87-8ec89a4b9ae7. [1]
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: [2016-09-09 16:06:37,974] (heat-config) [INFO] Completed /var/lib/heat-config/hooks/script
Sep 09 16:06:38 overcloud-controller-0.localdomain os-collect-config[5507]: [2016-09-09 16:06:37,974] (heat-config) [DEBUG] Running heat-config-notify /var/lib/heat-config/deployed/e7875e05-d166-4997-ab87-8ec89a4b9ae7.json < /var/lib/heat-config/deployed/e7875e05-d166-4997-ab87-8ec89a4b9ae7.notify.json
Sep 09 16:06:39 overcloud-controller-0.localdomain os-collect-config[5507]: [2016-09-09 16:06:39,088] (heat-config) [INFO]
Sep 09 16:06:39 overcloud-controller-0.localdomain os-collect-config[5507]: [2016-09-09 16:06:39,088] (heat-config) [DEBUG] [2016-09-09 16:06:38,290] (heat-config-notify) [DEBUG] Signaling to http://192.0.2.1:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Abd6a8c1f981d41b49edf002f2d9973bc%3Astacks%2Fovercloud-UpdateWorkflow-omudm3sqhhad-ControllerPacemakerUpgradeDeployment_Step1-c32krcs6q33u%2Fa057355c-293a-43ba-917f-8896e75a71d2%2Fresources%2F0?Timestamp=2016-09-09T15%3A54%3A16Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=526e4cb8f1684408abb7f48b3e8dbe8e&SignatureVersion=2&Signature=c49juxDJaJ6JHRs5ZsCGHdnshx4YjFLao4JNt2EzZrw%3D via POST
Sep 09 16:06:39 overcloud-controller-0.localdomain os-collect-config[5507]: [2016-09-09 16:06:39,064] (heat-config-notify) [DEBUG] Response <Response [200]>
Sep 09 16:06:39 overcloud-controller-0.localdomain os-collect-config[5507]: dib-run-parts Fri Sep 9 16:06:39 UTC 2016 55-heat-config completed
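
The dependency error is buried in the middle of the os-collect-config output, so it can help to pull out just the three relevant fields. The following is a minimal sketch, not part of the original report: the function name summarize_yum_dep_error is hypothetical, and it assumes the yum error lines look like the ones logged above (one "Error: Package:", one "Requires:" and one "Updated By:" line).

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a yum dependency error read from stdin.
# Assumes journal lines shaped like the os-collect-config output above.
summarize_yum_dep_error() {
    awk '
        # Keep only the text after each marker; the journal prefix is dropped.
        /Error: Package: / { sub(/.*Error: Package: /, ""); pkg = $0; next }
        /Requires: /       { sub(/.*Requires: /, "");       req = $0; next }
        /Updated By: /     { sub(/.*Updated By: /, "");     upd = $0; next }
        END {
            # No dependency error found: signal that with a non-zero status.
            if (pkg == "") exit 1
            printf "package:  %s\nrequires: %s\noffered:  %s\n", pkg, req, upd
        }
    '
}
```

On a controller this could be fed from the journal, for example `sudo journalctl -u os-collect-config | summarize_yum_dep_error`, assuming the service logs under that unit name. The function returns a non-zero status when no dependency error is present in the input.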






Comment 2 Marios Andreou 2016-09-09 16:16:17 UTC
Version info requested earlier from mburns:


[stack@instack ~]$ for i in $(nova list|grep ctlplane|awk -F' ' '{ print $12 }'|awk -F'=' '{ print $2 }'); do ssh heat-admin@$i "hostname; echo ''; sudo yum list installed grub2*"; done
overcloud-compute-0.localdomain

Loaded plugins: search-disabled-repos
Installed Packages
grub2.x86_64                      1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi.x86_64                  1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi-modules.x86_64          1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-tools.x86_64                1:2.02-0.34.el7_2          installed          
overcloud-controller-0.localdomain

Loaded plugins: search-disabled-repos, versionlock
Installed Packages
grub2.x86_64                      1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi.x86_64                  1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi-modules.x86_64          1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-tools.x86_64                1:2.02-0.34.el7_2          installed          
overcloud-controller-1.localdomain

Loaded plugins: search-disabled-repos, versionlock
Installed Packages
grub2.x86_64                      1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi.x86_64                  1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi-modules.x86_64          1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-tools.x86_64                1:2.02-0.34.el7_2          installed          
overcloud-controller-2.localdomain

Loaded plugins: search-disabled-repos, versionlock
Installed Packages
grub2.x86_64                      1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi.x86_64                  1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-efi-modules.x86_64          1:2.02-0.34.el7_2          @rhelosp-rhel-7.2-z
grub2-tools.x86_64                1:2.02-0.34.el7_2          installed          
[stack@instack ~]$

Comment 3 Mike Burns 2016-09-09 17:11:59 UTC
Ok, after a lot of debugging and image exploration, I've found the reason for this.  

tl;dr:  self-built images with optional repos enabled are the cause of this problem.

Full Summary:

When building the overcloud-full images, one of the included elements is grub2.  In that element, the relevant pkg-map includes:

"signed_grub_efi": "efibootmgr grub2-efi-modules grub2-efi"

For those unfamiliar with dib, this essentially results in 

yum install efibootmgr grub2-efi-modules grub2-efi

efibootmgr and grub2-efi are in the core RHEL repositories so they are included in the official images.  grub2-efi-modules is in the optional channel.  Since we intentionally don't include any optional repos in the official image builds, grub2-efi-modules does not get included.
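
Since official images do not carry grub2-efi-modules, its presence on a node hints that the node was deployed from a self-built image. A rough sketch of that check follows, assuming `yum list installed` output in the three-column form shown in comment 2; the function name listing_has_package is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the `yum list installed` output on stdin
# lists package $1 (compared against the name before the ".arch" suffix),
# and 1 otherwise.
listing_has_package() {
    awk -v name="$1" '
        NF == 3 {
            pkg = $1
            sub(/\.[^.]+$/, "", pkg)   # strip the trailing ".x86_64" etc.
            if (pkg == name) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}
```

For example, `sudo yum list installed 'grub2*' | listing_has_package grub2-efi-modules && echo "self-built image suspected"` could be run on each overcloud node. Matching on the whole package name keeps grub2-efi from being mistaken for grub2-efi-modules.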

There are a few possible paths forward for this as well as some temporary workarounds:

Temporary:
yum erase grub2-efi-modules

Longer term:
* remove grub2-efi-modules from the dib config
* get RHEL to move the package to a non-optional channel
* carry grub2-efi-modules in OSP channels (highly undesirable)
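
The temporary workaround could be made safe to run on every node by erasing the package only when it is present. This is a sketch under assumptions, not a tested fix: the function name erase_if_installed is hypothetical, and it assumes rpm and yum are on PATH and that the caller has root privileges.

```shell
#!/usr/bin/env bash
# Hypothetical helper: erase a package only when it is actually installed,
# so the step is a no-op on images that never shipped it.
# Assumes rpm and yum are available and the caller runs as root.
erase_if_installed() {
    local pkg="$1"
    if rpm -q "$pkg" >/dev/null 2>&1; then
        yum -y erase "$pkg"
    else
        echo "$pkg is not installed; nothing to do"
    fi
}
```

Calling `erase_if_installed grub2-efi-modules` before the package update would apply the workaround from this comment. Whether removing the package is acceptable on UEFI-booted nodes is not settled in this report.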

One additional concern is why dib is bubbling up an error for the missing package.

Comment 5 Mike Burns 2016-09-09 17:22:29 UTC
@lucas -- you put the grub2 element into dib -- do we really need the grub2-efi-modules package?

I'll note that we haven't had any bug reports that we're missing the package, but maybe it's a hidden thing that causes other instability...

Comment 6 Lucas Alvares Gomes 2016-09-12 11:05:46 UTC
(In reply to Mike Burns from comment #5)
> @lucas -- you put the grub2 element into dib -- do we really need the
> grub2-efi-modules package?
> 
> I'll note that we haven't had any bug reports that we're missing the
> package, but maybe it's a hidden thing that causes other instability...

Hi Mike,

This element has required this package for a long time now, and it's needed in order for UEFI to be supported. That said, grub2-efi-modules should be present in RHEL/CentOS/Fedora. Is it not?

I can confirm it's present in RHEL 7.2 (I didn't try 7.3 tho): http://pastebin.test.redhat.com/410815

Comment 8 Mike Burns 2016-09-13 16:29:30 UTC
Lucas -- It's in RHEL 7.2 and RHEL 7.3 but in the optional channel which we're not allowed to depend on.  

In that pastebin, it's from -optional

If you're telling me that it's 100% required then I'll chase it down.

Comment 11 Mike Burns 2016-10-12 13:34:11 UTC
RHEL has moved the grub2-efi-modules package to the base channel, so this should be resolved now.

Comment 13 Mike Burns 2016-10-13 13:44:44 UTC
With RHEL 7.3, we're shipping the package in base RHEL channels, so this bug is irrelevant from an OSP perspective.