Bug 1393800 - OSP-9/10 upgrades make blockstorage (LVM) nodes unusable and need workaround
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: Upstream M3
Target Release: 11.0 (Ocata)
Assignee: Jiri Stransky
QA Contact: Tzach Shefi
URL:
Whiteboard:
Depends On:
Blocks: 1396308
 
Reported: 2016-11-10 11:11 UTC by Sofer Athlan-Guyot
Modified: 2017-04-28 19:30 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-6.0.0-4.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1396308 (view as bug list)
Environment:
Last Closed: 2017-04-19 13:29:30 UTC
Target Upstream Version:
scohen: needinfo+




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1640730 0 None None None 2016-11-10 11:11:10 UTC

Description Sofer Athlan-Guyot 2016-11-10 11:11:11 UTC
Description of problem: After the upgrade, the blockstorage node does not work, so creating a volume attached to an instance fails.


Version-Release number of selected component (if applicable): OSP10, puddle Tue Nov  8 20:12:35 2016

How reproducible: always


Steps to Reproduce:
1. Install OSP9 (the last four options, the storage scale settings and storage-environment.yaml, are the important part):
openstack overcloud deploy --templates --libvirt-type qemu \
          --control-flavor baremetal \
          --compute-flavor baremetal \
          --block-storage-flavor baremetal \
          --ceph-storage-flavor baremetal \
          --swift-storage-flavor baremetal \
          --control-scale 3 \
          --compute-scale 1 \
          --neutron-network-type vxlan --neutron-tunnel-types vxlan \
          --ntp-server pool.ntp.org \
          --timeout 90 \
          -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
          -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml \
          -e $HOME/network-environment.yaml \
          -e /usr/share/openstack-tripleo-heat-templates/environments/config-debug.yaml \
          -e /home/stack/deploy_env.yaml \
          --block-storage-scale 1 \
          --swift-storage-scale 1 \
          --ceph-storage-scale 1 \
          -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml

2. Run the upgrade.
3. Try to create an instance after a successful upgrade using:

   tripleo-ci.sh --overcloud-pingtest --skip-pingtest-cleanup 

Actual results: VM creation fails.


Expected results: VM creation succeeds.

Comment 1 Sofer Athlan-Guyot 2016-11-10 15:08:30 UTC
After some discussion with Giulio Fidente, the issue is that part of the controller's cinder Ceph configuration is spilling into the blockstorage cinder configuration.

This is what is added to cinder.conf on the blockstorage node:


    enabled_backends = tripleo_ceph,
    [tripleo_ceph]
    volume_driver=cinder.volume.drivers.rbd.RBDDriver
    rbd_pool=volumes
    backend_host=hostgroup
    rbd_secret_uuid=
    volume_backend_name=tripleo_ceph
    rbd_user=openstack
    rbd_ceph_conf=/etc/ceph/ceph.conf

As there is no Ceph configuration on the blockstorage node, it fails.
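
To spot this condition on a node, something like the following rough sketch could be used. It parses cinder.conf text and reports which enabled backends use the RBD driver. The function name rbd_backends and the [DEFAULT] header in the sample are assumptions for illustration (the snippet above does not show which section enabled_backends lives in).

```python
import configparser

# Sample mirroring the snippet above; the [DEFAULT] header is an assumption.
SAMPLE = """\
[DEFAULT]
enabled_backends = tripleo_ceph,

[tripleo_ceph]
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
backend_host=hostgroup
rbd_secret_uuid=
volume_backend_name=tripleo_ceph
rbd_user=openstack
rbd_ceph_conf=/etc/ceph/ceph.conf
"""


def rbd_backends(conf_text):
    """Return {backend_name: rbd_ceph_conf} for enabled backends using the RBD driver."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    enabled = parser.defaults().get("enabled_backends", "")
    # The value may carry a trailing comma, so drop empty entries.
    names = [name.strip() for name in enabled.split(",") if name.strip()]
    found = {}
    for name in names:
        if not parser.has_section(name):
            continue
        driver = parser.get(name, "volume_driver", fallback="")
        if driver.endswith("RBDDriver"):
            found[name] = parser.get(name, "rbd_ceph_conf", fallback="")
    return found
```

On a blockstorage node, a non-empty result whose rbd_ceph_conf path does not exist would match the failure described here.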

Comment 3 Sofer Athlan-Guyot 2016-11-10 17:07:20 UTC
As a workaround, you can add this environment file during the blockstorage and controller upgrade:

parameter_defaults:
  BlockStorageExtraConfig:
    tripleo::profile::base::cinder::volume::cinder_enable_iscsi_backend: true
    tripleo::profile::base::cinder::volume::cinder_enable_rbd_backend: false

which makes the blockstorage nodes work again.
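
As a minimal sketch of how the workaround could be scripted, the helper below writes the environment file shown above and returns the extra -e arguments to append to the upgrade command. The helper name write_workaround_env and any file name passed to it are hypothetical and not part of the original report.

```python
# Content of the workaround environment file, copied from the comment above.
WORKAROUND_ENV = """\
parameter_defaults:
  BlockStorageExtraConfig:
    tripleo::profile::base::cinder::volume::cinder_enable_iscsi_backend: true
    tripleo::profile::base::cinder::volume::cinder_enable_rbd_backend: false
"""


def write_workaround_env(path):
    """Write the workaround file to `path` and return the matching CLI arguments."""
    with open(path, "w") as handle:
        handle.write(WORKAROUND_ENV)
    # These arguments are meant to be appended to the existing deploy/upgrade command.
    return ["-e", path]
```

The returned list would be added to the same openstack overcloud deploy invocation used for the blockstorage and controller upgrade steps, alongside the other -e files.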

Comment 7 Giulio Fidente 2016-11-15 18:40:06 UTC
(In reply to Sofer Athlan-Guyot from comment #3)
> As a workaround, you can add this environment file during the blockstorage
> and controller upgrade:
> 
> parameter_defaults:
>   BlockStorageExtraConfig:
>     tripleo::profile::base::cinder::volume::cinder_enable_iscsi_backend: true
>     tripleo::profile::base::cinder::volume::cinder_enable_rbd_backend: false
> 
> which makes the blockstorage nodes work again.

FWIW, these settings would be needed (and sufficient) for both new deployments and upgrades.

Comment 8 Paul Grist 2016-11-16 15:34:28 UTC
Converting this to a documentation bug. The issue only occurs if LVM is in use, which is not supported for production with RHOS, but is used for POCs. We will document the workaround and let the upstream Launchpad bug take care of the fix.

For anyone else wondering about the option: in the OSP9 install in #c1, --block-storage-scale is used to add nodes where cinder is configured with the LVM driver only.

Comment 9 Don Domingo 2016-11-17 02:08:58 UTC
Thanks Paul, I've edited the release note to provide more information. Let me know if any changes are required.

Do we need to add this same blurb to the Ceph documentation? Given that we don't recommend LVM as a back end anyway, I'm inclined not to.

