Bug 1388675 - Osp-Director-10 : Upgrade Ceph script fails over syntax error in /root/tripleo_upgrade_node.sh: line 42 # Special-case OVS
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: mathieu bultel
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-10-25 21:35 UTC by Omri Hochman
Modified: 2016-12-29 16:58 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-5.0.0-1.3.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-14 16:25:24 UTC


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2016:2948 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 10 enhancement update | 2016-12-14 19:55:27 UTC
OpenStack gerrit | 389753 | None | None | None | 2016-10-31 14:21:05 UTC
OpenStack gerrit | 390792 | None | None | None | 2016-10-31 13:46:43 UTC
Launchpad | 1636748 | None | None | None | 2016-10-31 13:45:40 UTC

Description Omri Hochman 2016-10-25 21:35:29 UTC
Osp-Director-10 :  Upgrade Ceph script fails over syntax error in  /root/tripleo_upgrade_node.sh: line 42 # Special-case OVS for 

https://bugs.launchpad.net/tripleo/+bug/1635205


Steps: 
-------
(1) Deploy the latest osp9.
(2) Attempt to upgrade osp9 to osp10 according to the guide.
(3) Run the command to upgrade ceph: upgrade-non-controller.sh --upgrade overcloud-cephstorage-0

Results: 
---------
The first attempt fails with:

 21:08:51 === osd.0 === 
21:08:53 Stopping Ceph osd.0 on overcloud-cephstorage-0...kill 18042...kill 18042...done
21:08:53 /root/tripleo_upgrade_node.sh: line 42: syntax error in conditional expression: unexpected token `('


Looking at the code that fails with the syntax error on the ceph node:
---------------------------------------------------------------
[heat-admin@overcloud-cephstorage-0 ~]$ sudo su -
Last login: Tue Oct 25 21:17:35 UTC 2016 on pts/0
[root@overcloud-cephstorage-0 ~]#
vi /root/tripleo_upgrade_node.sh +42


# Special-case OVS for https://bugs.launchpad.net/tripleo/+bug/1635205
if [[ -n \$(rpm -q --scripts openvswitch | awk '/postuninstall/,/*/' | grep "systemctl.*try-restart") ]]; then
    echo "Manual upgrade of openvswitch - restart in postun detected"
    mkdir OVS_UPGRADE || true
    pushd OVS_UPGRADE
    echo "Attempting to downloading latest openvswitch with yumdownloader"
    yumdownloader --resolve openvswitch
    echo "Updating openvswitch with nopostun"
    rpm -U --nopostun ./*.rpm
    popd
else
    echo "Skipping manual upgrade of openvswitch - no restart in postun detected"
fi

Comment 1 Omri Hochman 2016-10-25 21:36:36 UTC
The issue affects upgrade automation from osp9 to osp10. More info:
https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/Director/view/9.0/job/BM_upgrade_9.0_to_10_on_rhos18/21/console

Comment 2 Omri Hochman 2016-10-26 01:26:00 UTC
After commenting out the problematic section, another attempt failed at /root/tripleo_upgrade_node.sh: line 79:
--------------------------------------------------------------------
chown: cannot access ‘/var/run/ceph’: No such file or directory
WARNING: chown of /var/run/ceph failed
/root/tripleo_upgrade_node.sh: line 79: [: too many arguments
Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-osd.target to /usr/lib/systemd/system/ceph-osd.target.
Created symlink from /etc/systemd/system/ceph.target.wants/ceph-osd.target to /usr/lib/systemd/system/ceph-osd.target.
Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@0.service to /usr/lib/systemd/system/ceph-osd@.service.
Job for ceph-osd@0.service failed because a timeout was exceeded. See "systemctl status ceph-osd@0.service" and "journalctl -xe" for details.
There was an error running ### UPGRADE CEPH ###. Exiting....


vi tripleo_upgrade_node.sh +79:
--------------------------------
    # If on ext4, we need to enforce lower values for name and namespace len
    # or ceph-osd will refuse to start, see: http://tracker.ceph.com/issues/16187
    for OSD_ID in $OSD_IDS; do
      OSD_FS=$(findmnt -n -o FSTYPE -T /var/lib/ceph/osd/ceph-${OSD_ID})
      if [ ${OSD_FS} = ext4 ]; then
        crudini --set /etc/ceph/ceph.conf global osd_max_object_name_len 256
        crudini --set /etc/ceph/ceph.conf global osd_max_object_namespace_len 64
      fi
    done
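
The "[: too many arguments" message is what `[` reports when an unquoted expansion splits into more than one word. Whether findmnt actually returned several values on this node is not confirmed in this report, so the following is only a sketch of the defensive quoting, using a hypothetical helper (is_ext4) and simulated values rather than the real script:

```shell
#!/bin/bash
# Hypothetical helper, not part of tripleo_upgrade_node.sh.
# Succeeds only when the given filesystem type string is exactly "ext4".
is_ext4() {
    # Quoting keeps an empty or multi-word value as a single argument to [,
    # so the test cannot fail with "too many arguments".
    [ "$1" = ext4 ]
}

# Simulated findmnt results (assumed values, for illustration only).
for fs in "ext4" "" "xfs xfs"; do
    if is_ext4 "$fs"; then
        echo "'$fs' -> ext4 tuning needed"
    else
        echo "'$fs' -> no ext4 tuning"
    fi
done
```

In the loop quoted above, the equivalent change would be to write the comparison as [ "${OSD_FS}" = ext4 ].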

Comment 4 Marios Andreou 2016-10-26 10:13:23 UTC
Thanks Omri, this is just a nit in the escaping here (we need to remove the '\' before the $):

-if [[ -n \$(rpm -q --scripts openvswitch | awk '/postuninstall/,/*/' | grep "systemctl.*try-restart") ]]; then
+if [[ -n $(rpm -q --scripts openvswitch | awk '/postuninstall/,/*/' | grep "systemctl.*try-restart") ]]; then


I'll put a review out later (matbu is also fixing another related issue here; we will try to combine them later into a single review).


thanks

Comment 10 Marios Andreou 2016-10-31 14:20:32 UTC
Adding a note to mark this as related to https://bugzilla.redhat.com/show_bug.cgi?id=1364540 . The fixes here are a follow on to the ovs upgrade workaround which was delivered as the fix for that BZ. 

The review linked above fixes the ceph upgrade script syntax nit (no need to escape \$ if you have 'HEREDOC' vs HEREDOC, apparently) and also adds --replacepkgs in case you already had the latest OVS.
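
The 'HEREDOC' vs HEREDOC difference can be reproduced in isolation. The sketch below is not the tripleo-heat-templates code; the file names and the echo command are made up for illustration. It writes the same escaped line through a quoted and an unquoted here-document delimiter, then checks each result with bash -n (parse only, no execution):

```shell
#!/bin/bash
# Hypothetical demo; temp files and commands are illustrative only.
workdir=$(mktemp -d)

# Quoted delimiter: the body is written verbatim, so the backslash survives
# and the generated script contains "\$(" inside [[ ... ]].
cat > "$workdir/quoted.sh" <<'EOF'
if [[ -n \$(echo found) ]]; then echo yes; fi
EOF

# Unquoted delimiter: \$ is required here to stop $( ... ) from running at
# write time, and it is written to the file as a plain $.
cat > "$workdir/unquoted.sh" <<EOF
if [[ -n \$(echo found) ]]; then echo yes; fi
EOF

# bash -n parses without executing; a nonzero status means a syntax error.
if bash -n "$workdir/quoted.sh" 2>/dev/null; then quoted_status=ok; else quoted_status=broken; fi
if bash -n "$workdir/unquoted.sh" 2>/dev/null; then unquoted_status=ok; else unquoted_status=broken; fi
unquoted_output=$(bash "$workdir/unquoted.sh")

echo "quoted heredoc:   $quoted_status"
echo "unquoted heredoc: $unquoted_status ($unquoted_output)"
rm -rf "$workdir"
```

Running bash -n against a generated script like this is a cheap way to catch an escaping mistake of this kind before the script reaches a node.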

So for the full fixup/workaround for ovs 2.4 to 2.5 upgrade you need both https://review.openstack.org/#/c/389753/ and https://review.openstack.org/#/c/390792/ (added both above)

Comment 11 Omri Hochman 2016-10-31 16:03:56 UTC
Marios - I've tested the new patches that are in the Readme; they seem to work and solved this problem on my env.

Comment 12 Marios Andreou 2016-11-01 15:41:23 UTC
Landed in newton at https://review.openstack.org/#/c/390826/3 today, so waiting for it to appear in a puddle/package.

Comment 14 Omri Hochman 2016-11-15 18:46:50 UTC
verified with openstack-tripleo-heat-templates-5.0.0-1.7.el7ost.noarch

Comment 16 errata-xmlrpc 2016-12-14 16:25:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

