Bug 1372040 - [Docs]OSP-Director-10: upgrading undercloud from: osp9 to osp10, the yum update command hangs for about 20min over: 'Yum Cleanup: 1:openstack-nova' .
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: Dan Macpherson
QA Contact: Martin Lopes
URL:
Whiteboard:
Duplicates: 1391686
Depends On:
Blocks: 1367466
 
Reported: 2016-08-31 18:38 UTC by Omri Hochman
Modified: 2017-02-23 08:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-23 07:59:36 UTC
Target Upstream Version:
Embargoed:


Links
System     ID       Private  Priority  Status  Summary  Last Updated
Launchpad  1632330  0        None      None    None     2016-10-11 13:27:38 UTC

Description Omri Hochman 2016-08-31 18:38:44 UTC
OSP-Director-10: when upgrading the undercloud from osp9 to osp10, the yum update command hangs for about 20 minutes at 'Yum Cleanup: 1:openstack-nova'.


environments: 
--------------
instack-undercloud-5.0.0-0.20160818065636.41ef775.el7ost.noarch
instack-5.0.0-0.20160802165724.5aabf5c.el7ost.noarch
openstack-heat-api-cfn-7.0.0-0.20160823082523.1106458.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-33.el7ost.noarch
openstack-heat-templates-0.0.1-0.20160822094546.1ac2823.el7ost.noarch
python-heat-tests-7.0.0-0.20160823082523.1106458.el7ost.noarch
openstack-heat-engine-7.0.0-0.20160823082523.1106458.el7ost.noarch
puppet-heat-9.1.0-0.20160815142726.d364553.el7ost.noarch
python-heatclient-1.3.0-0.20160802194627.44dfe53.el7ost.noarch
openstack-heat-common-7.0.0-0.20160823082523.1106458.el7ost.noarch
openstack-heat-api-7.0.0-0.20160823082523.1106458.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20160823140311.072404b.el7ost.noarch

The cleanup issue seems to be with: 
openstack-nova-13.1.1-4.el7ost.noarch    

Description : 
--------------
When following the steps to upgrade the undercloud from osp9 to osp10, running 'yum update' just after fixing the repos causes the yum process to hang for about 15 to 20 minutes at the step:
'yum cleanup for openstack-nova'

Info about upgrade process: https://gitlab.cee.redhat.com/sathlang/ospd-9-to-10-upgrade#controller-and-block-storage-upgrade


How to reproduce :
------------------
(1) Deploy osp9
(2) Update the repos on the undercloud to point to osp10 
(3) Run 'yum update'

logs: 
-----
19:41:32   Cleanup    : 1:openstack-nova-13.1.1-4.el7ost.noarch                  360/504 
20:02:08   Cleanup    : 1:openstack-nova-compute-13.1.1-4.el7ost.noarch          361/504 
20:02:11   Cleanup    : 1:openstack-nova-api-13.1.1-4.el7ost.noarch              362/504       <- rabbitmq-server was restarted manually here

in nova-compute.log
---------------------
AMQP server on 192.0.2.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in
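
The length of the hang can be read off the yum timestamps in the logs above. A minimal sketch of that arithmetic, assuming both timestamps fall on the same day; the function names to_seconds and elapsed_seconds are made up for this illustration:

```shell
#!/bin/bash
# Hypothetical helpers, not part of yum or the director tooling.

# Convert an HH:MM:SS timestamp to seconds since midnight.
# awk does the arithmetic so that fields such as "08" are read as decimal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the seconds elapsed between two timestamps on the same day.
elapsed_seconds() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# Cleanup of openstack-nova started at 19:41:32; the next step began at 20:02:08.
elapsed_seconds 19:41:32 20:02:08   # prints 1236, i.e. roughly 20.6 minutes
```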

Comment 2 Sofer Athlan-Guyot 2016-09-06 16:36:13 UTC
Hi,

Doesn't this workaround fix it?

Before running the undercloud upgrade:

sudo systemctl stop 'openstack-*'
sudo systemctl stop 'neutron-*'

If so, the upstream bug is https://bugs.launchpad.net/tripleo/+bug/1593182

Comment 4 Omri Hochman 2016-09-08 19:50:50 UTC
(In reply to Sofer Athlan-Guyot from comment #2)
> Hi,
> 
> is this workaround not able to fix this: 
> 
> before running the undercloud upgrade:
> 
> sudo systemctl stop 'openstack-*'
> sudo systemctl stop 'neutron-*'
> 
> if such, then the upstream bug is
> https://bugs.launchpad.net/tripleo/+bug/1593182

Workaround seems valid:
19:40:04   Cleanup    : 1:openstack-nova-13.1.1-4.el7ost.noarch                 702/1118 
19:40:05   Cleanup    : 1:openstack-nova-compute-13.1.1-4.el7ost.noarch         703/1118 
19:40:06   Cleanup    : libvirt-python-1.2.17-2.el7.x86_64                      704/1118

Comment 5 Marios Andreou 2016-10-10 12:03:23 UTC
Adding info here from my environment. I was about to file a new BZ, so I had already collected the info (and then found this Bugzilla):


Doing this:
sudo yum localinstall -y http://rhos-release.virt.bos.redhat.com/repos/rhos-release/rhos-release-latest.noarch.rpm
sudo yum -y update
sudo rhos-release -P 10 -r 7.3
sudo yum-config-manager --disable 'rhelosp-9.0*'
openstack undercloud upgrade

At some point during the cleanup phase of the yum update that runs before the openstack undercloud install [1], the update hangs on 'Cleanup' for openstack-nova-compute:

     "   Cleanup    : 1:openstack-nova-compute-13.1.1-10.el7ost.noarch        801/1156 " 

systemctl status says "activating": 
    ● openstack-nova-compute.service - OpenStack Nova Compute Server
       Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled; vendor preset: disabled)
       Active: activating (start) since Fri 2016-10-07 04:05:04 EDT; 1h 0min ago
     Main PID: 25500 (nova-compute)
       CGroup: /system.slice/openstack-nova-compute.service
               └─25500 /usr/bin/python2 /usr/bin/nova-compute

    Oct 07 04:05:04 instack.localdomain systemd[1]: Starting OpenStack Nova Compute Server...
    Oct 07 04:05:06 instack.localdomain nova-compute[25500]: Option "rpc_backend" from group "DEFAULT" is deprecated for removal.  Its value may be silently ignored in the future.
    Oct 07 04:05:06 instack.localdomain nova-compute[25500]: Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications".
    Oct 07 04:05:06 instack.localdomain nova-compute[25500]: Option "notification_topics" from group "DEFAULT" is deprecated. Use option "topics" from group "oslo_messaging_notifications".


    [stack@instack ~]$ systemctl | grep nova
      openstack-nova-api.service                                                               loaded active     running         OpenStack Nova API Server
      openstack-nova-cert.service                                                              loaded active     running         OpenStack Nova Cert Server
      openstack-nova-compute.service                                                           loaded activating start     start OpenStack Nova Compute Server
      openstack-nova-conductor.service                                                         loaded active     running         OpenStack Nova Conductor Server

As soon as I "sudo systemctl stop openstack-nova-compute" the update continues and the undercloud upgrade eventually completes OK.
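
A unit stuck in the "activating" state, as shown above, can be picked out of the systemctl listing mechanically. The following is only a sketch: stuck_units and sample are names invented here, and it assumes the ACTIVE state is the third whitespace-separated column, as in the listing above.

```shell
#!/bin/bash
# Hypothetical helper: read `systemctl list-units`-style rows on stdin and
# print the unit names whose ACTIVE column (third field) is "activating".
stuck_units() {
    awk '$3 == "activating" { print $1 }'
}

# Sample rows taken from the listing above, so the sketch runs without systemd.
sample='openstack-nova-api.service loaded active running OpenStack Nova API Server
openstack-nova-compute.service loaded activating start start OpenStack Nova Compute Server
openstack-nova-conductor.service loaded active running OpenStack Nova Conductor Server'

printf '%s\n' "$sample" | stuck_units
```

On a live undercloud the same filter could be fed with something like `systemctl list-units --no-legend --plain 'openstack-*' 'neutron-*' | stuck_units`. A unit is briefly "activating" during any normal start, so only one that stays in that state (an hour in the status output above) points to this problem.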

The openstack-nova-compute package versions are:

    [m@m PACKAGES_FOR_BZ_STOP_SERVICES]$ grepr openstack-nova-compute ./*
    ./osp10_upgraded_packages:66:openstack-nova-compute-14.0.0-1.el7ost.noarch
    ./osp9_updated_packages:739:openstack-nova-compute-13.1.1-10.el7ost.noarch
    ./osp9_deployed_packages:246:openstack-nova-compute-13.1.0-6.el7ost.noarch


The workaround is to stop the services before running openstack undercloud upgrade:

    sudo yum localinstall -y http://rhos-release.virt.bos.redhat.com/repos/rhos-release/rhos-release-latest.noarch.rpm
    sudo yum -y update
    sudo rhos-release -P 10 -r 7.3
    sudo yum-config-manager --disable 'rhelosp-9.0*'
    #STOP services as workaround
    sudo systemctl stop 'openstack-*'
    sudo systemctl stop 'neutron-*'
    openstack undercloud upgrade
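
The stop-then-upgrade part of the workaround above could be wrapped so that the upgrade never starts if a stop command fails. This is a sketch under assumptions, not an official procedure: run, stop_then_upgrade and DRY_RUN are hypothetical names, and the default dry-run mode only prints the commands instead of executing them.

```shell
#!/bin/bash
# Hypothetical wrapper around the workaround commands from this report.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the OpenStack and Neutron services, then start the undercloud upgrade.
# Each stop must succeed before the next command runs.
stop_then_upgrade() {
    run sudo systemctl stop 'openstack-*' || return 1
    run sudo systemctl stop 'neutron-*' || return 1
    run openstack undercloud upgrade
}

stop_then_upgrade
```

Setting DRY_RUN=0 would execute the three commands for real; the repository setup steps shown above are deliberately left out of the wrapper.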


I am filing the BZ for now to capture this information but we aren't sure yet if it is confined to these specific package versions or if we need a more permanent fix to stop the services before the undercloud upgrade. 

My development env is OSP9 poodle being upgraded to OSP10 puddle; this may be a significant factor since afaik I am the only person hitting this.

[1] https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/undercloud.py#L50

Comment 6 Marios Andreou 2016-10-11 16:06:52 UTC
Update after spending some time trying to progress this issue today: I filed a Launchpad bug and also a quick fix at https://review.openstack.org/385012 (linked in related changes above). However, during discussion of this issue in the upstream tripleo meeting today http://eavesdrop.openstack.org/meetings/tripleo/2016/tripleo.2016-10-11-14.00.log.txt the consensus was that we deal with this as a documentation fix.

Upstream tripleo docs already document a stop for the openstack-* and neutron-* services before running the undercloud upgrade, like at http://tripleo.org/installation/installation.html#updating-undercloud-components 

Re-assigning this to the docs team for now. For clarity: the OSP10 undercloud upgrade documentation needs to state that, before running the "openstack undercloud upgrade" command, the operator should stop services like this:


        sudo systemctl stop 'openstack-*'
        sudo systemctl stop 'neutron-*'
        openstack undercloud upgrade

Comment 7 Lucy Bopf 2016-10-13 06:21:01 UTC
Changing the component to 'documentation' for tracking purposes.

Comment 8 Marios Andreou 2016-10-13 10:29:14 UTC
Removing this as a blocker of the upgrades RFE https://bugzilla.redhat.com/show_bug.cgi?id=1337794 since this is now a docs bug.

Comment 11 Sofer Athlan-Guyot 2016-12-02 07:55:32 UTC
*** Bug 1391686 has been marked as a duplicate of this bug. ***

Comment 18 Dan Macpherson 2017-02-03 03:01:43 UTC
Hi Omri,

This content is now live:

https://access.redhat.com/documentation/en/red-hat-openstack-platform/10/single/upgrading-red-hat-openstack-platform/#sect-Major-Updating_Director_Packages

Was there anything else to add for this BZ? If not, I'll close this BZ.

Comment 19 Dan Macpherson 2017-02-23 07:59:36 UTC
No response in over 2 weeks. If nothing else to add to this BZ, I'm closing it. If further changes are required for this issue, please feel free to reopen it.

