Bug 1572515 - Cannot complete FFU upgrade due to a bug in cinder
Summary: Cannot complete FFU upgrade due to a bug in cinder
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-cinder
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 13.0 (Queens)
Assignee: Alan Bishop
QA Contact: Avi Avraham
Docs Contact: Kim Nylander
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-27 08:09 UTC by Yolanda Robla
Modified: 2018-12-24 11:40 UTC (History)

Fixed In Version: puppet-cinder-12.4.1-0.20180628102250.641e036.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-29 16:35:54 UTC
Target Upstream Version:


Attachments
sosreport on failing controller (18.48 MB, application/x-xz)
2018-04-27 08:23 UTC, Yolanda Robla


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1768833 0 None None None 2018-05-03 13:08:22 UTC
OpenStack gerrit 566056 0 None MERGED Revert "Restore iscsi loopback LVM volume group on startup" 2020-03-07 19:46:16 UTC
OpenStack gerrit 566349 0 None MERGED Revert "Restore iscsi loopback LVM volume group on startup" 2020-03-07 19:46:16 UTC
Red Hat Product Errata RHBA-2018:2574 0 None None None 2018-08-29 16:36:50 UTC

Description Yolanda Robla 2018-04-27 08:09:24 UTC
Description of problem:

When executing FFU, I always see it stop with a cinder error:

        "Error: losetup -f /var/lib/cinder/cinder-volumes && udevadm settle && vgchange -a y cinder-volumes returned 5 instead of one of [0]", 
        "Error: /Stage[main]/Cinder::Setup_test_volume/Exec[losetup -f /var/lib/cinder/cinder-volumes && udevadm settle && vgchange -a y cinder-volumes]/returns: change from notrun to 0 failed: losetup -f /var/lib/cinder/cinder-volumes && udevadm settle && vgchange -a y cinder-volumes returned 5 instead of one of [0]", 
        "Warning: /Stage[main]/Cinder::Deps/Anchor[cinder::service::begin]: Skipping because of failed dependencies", 
        "Warning: /Stage[main]/Cinder::Volume/Service[cinder-volume]: Skipping because of failed dependencies", 
        "Warning: /Stage[main]/Tripleo::Profile::Base::Cinder::Volume::Iscsi/Cinder::Backend::Iscsi[tripleo_iscsi]/Service[target]: Skipping because of failed dependencies",
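For triage, the failing command chain and its exit status can be pulled out of the first "Error:" line with a short script. This is only a minimal sketch: the pattern and the function name `parse_exec_failure` are illustrative assumptions, not part of Puppet or puppet-cinder, and the pattern targets the short form of the message (the longer form that starts with the resource path is not handled).

```python
import re

# Matches the short Puppet Exec failure message quoted above:
#   Error: <command> returned <code> instead of one of [<expected>]
# (Illustrative pattern, not taken from Puppet itself.)
EXEC_FAILURE = re.compile(
    r"Error: (?P<command>.+?) returned (?P<code>\d+) "
    r"instead of one of \[(?P<expected>[\d, ]*)\]"
)


def parse_exec_failure(line):
    """Return (command, exit_code, expected_codes), or None if no match."""
    match = EXEC_FAILURE.search(line)
    if match is None:
        return None
    expected = [int(c) for c in match.group("expected").split(",") if c.strip()]
    return match.group("command"), int(match.group("code")), expected


# The first error line from the report, copied verbatim.
line = ('"Error: losetup -f /var/lib/cinder/cinder-volumes && udevadm settle '
        '&& vgchange -a y cinder-volumes returned 5 instead of one of [0]", ')

command, code, expected = parse_exec_failure(line)
print(code, expected)
# Split the chain into its individual steps for inspection.
print(command.split(" && "))
```

The message reports only the status of the whole `&&` chain, so the split into steps is a starting point for checking each command by hand.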

After this failure, I can log in to the controllers and I see:

 losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE
/dev/loop0         0      0         0  0 /var/lib/cinder/cinder-volumes

vgs
  VG             #PV #LV #SN Attr   VSize   VFree  
  cinder-volumes   1   0   0 wz--n- <10,04g <10,04g

Comment 1 Yolanda Robla 2018-04-27 08:12:09 UTC
When I execute the command that fails, I get:

[root@overcloud-controller-2 heat-admin]# losetup -f /var/lib/cinder/cinder-volumes && udevadm settle && vgchange -a y cinder-volumes
  WARNING: Not using lvmetad because duplicate PVs were found.
  WARNING: Use multipath or vgimportclone to resolve duplicate PVs?
  WARNING: After duplicates are resolved, run "pvscan --cache" to enable lvmetad.
  WARNING: PV jYDAOP-SOtz-UeGN-cEM1-5mOk-DNjk-Al69ev on /dev/loop1 was already found on /dev/loop0.
  WARNING: PV jYDAOP-SOtz-UeGN-cEM1-5mOk-DNjk-Al69ev on /dev/loop2 was already found on /dev/loop0.
  WARNING: PV jYDAOP-SOtz-UeGN-cEM1-5mOk-DNjk-Al69ev prefers device /dev/loop0 because device was seen first.
  WARNING: PV jYDAOP-SOtz-UeGN-cEM1-5mOk-DNjk-Al69ev prefers device /dev/loop0 because device was seen first.
  0 logical volume(s) in volume group "cinder-volumes" now active
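The warnings above say that the same PV was found on /dev/loop0, /dev/loop1 and /dev/loop2, which is consistent with the backing file having been attached to a loop device more than once. A minimal sketch for summarizing such warnings follows; the function name `duplicate_pv_devices` and the regular expression are assumptions made for illustration, based only on the message format shown in this comment.

```python
import re
from collections import defaultdict

# Matches LVM lines of the form shown above:
#   WARNING: PV <uuid> on <device> was already found on <device>.
DUPLICATE_PV = re.compile(
    r"WARNING: PV (\S+) on (\S+) was already found on (\S+)\."
)


def duplicate_pv_devices(output):
    """Map each PV UUID to the sorted list of devices it was seen on."""
    devices = defaultdict(set)
    for line in output.splitlines():
        match = DUPLICATE_PV.search(line)
        if match:
            uuid, duplicate, first_seen = match.groups()
            devices[uuid].update((first_seen, duplicate))
    return {uuid: sorted(found) for uuid, found in devices.items()}


# Warning lines copied from the command output in this comment.
output = """\
  WARNING: PV jYDAOP-SOtz-UeGN-cEM1-5mOk-DNjk-Al69ev on /dev/loop1 was already found on /dev/loop0.
  WARNING: PV jYDAOP-SOtz-UeGN-cEM1-5mOk-DNjk-Al69ev on /dev/loop2 was already found on /dev/loop0.
  WARNING: PV jYDAOP-SOtz-UeGN-cEM1-5mOk-DNjk-Al69ev prefers device /dev/loop0 because device was seen first.
"""

print(duplicate_pv_devices(output))
```

Lines that do not match the "was already found" form, such as the "prefers device" line, are ignored.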

Comment 2 Yolanda Robla 2018-04-27 08:23:46 UTC
Created attachment 1427578
sosreport on failing controller

Comment 3 Marius Cornea 2018-04-30 13:07:00 UTC
This appears to be related to a known bug for LVM deployments where the volume is backed by a loopback device which gets wiped after a controller reboot: BZ#1412661

Comment 4 Alan Bishop 2018-05-02 13:07:27 UTC
There are a couple of things to consider. First, as Marius noted in comment
#3, Cinder's LVM backend has a known problem that probably makes it unsuitable
for full FFU testing. The LVM backend uses a loopback device that isn't
restored when the node is rebooted.

I attempted to fix the loopback problem with [1]. However, the fix is not
sufficient because I failed to realize at the time that TripleO does not
trigger puppet to run on boot-up.

[1] https://review.openstack.org/465731

Now there's evidence that this patch is actually causing problems. The logs in
the sosreport show that the "vgchange -a y cinder-volumes" command is the
source of the exit code 5 error. Here's a similar example:

[root@overcloud-controller-0 ~]# vgchange -a y foo
  WARNING: Not using lvmetad because duplicate PVs were found.
  WARNING: Use multipath or vgimportclone to resolve duplicate PVs?
  WARNING: After duplicates are resolved, run "pvscan --cache" to enable lvmetad.
  Volume group "foo" not found
  Cannot process volume group foo
[root@overcloud-controller-0 ~]# echo $?
5

I'm going to open a new Launchpad bug and revert my patch [1]. The LVM backend
will still suffer from the loopback device not being restored on boot, but it
should no longer trigger this BZ's problem.
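A shell `&&` chain stops at the first command that exits non-zero and returns that command's status, which is why the Puppet exec reports 5 for the whole chain when only vgchange fails. The sketch below mimics that behavior so the failing step can be identified. The function `first_failing_step` and the stand-in runner `fake_run` are hypothetical names introduced for this illustration. Note that running the real chain again on a node would attach the backing file to yet another loop device via `losetup -f`, so the example uses a fake runner instead of calling the commands.

```python
import subprocess


def first_failing_step(chain, run=subprocess.call):
    """Mimic `a && b && c`: run steps in order, stop at the first failure.

    Returns (step, exit_code) for the failing step, or None if all succeed.
    `run` takes an argv list and returns an exit status.
    """
    for step in chain.split("&&"):
        step = step.strip()
        code = run(step.split())
        if code != 0:
            return step, code
    return None


def fake_run(argv):
    # Hypothetical stand-in for the real commands: only vgchange fails,
    # using the exit status reported in this bug.
    return 5 if argv[0] == "vgchange" else 0


chain = ("losetup -f /var/lib/cinder/cinder-volumes && udevadm settle "
         "&& vgchange -a y cinder-volumes")

print(first_failing_step(chain, run=fake_run))
```

Passing the runner as a parameter keeps the sketch safe to execute anywhere; the default `subprocess.call` would run the real commands and should only be used deliberately.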

Comment 5 Benjamin Schmaus 2018-06-28 19:44:28 UTC
Just as a note, I am seeing this on a minor upgrade in OSP12:

 u'        "Error: losetup -f /var/lib/cinder/cinder-volumes && udevadm settle && vgchange -a y cinder-volumes returned 5 instead of one of [0]", ',
 u'        "Error: /Stage[main]/Cinder::Setup_test_volume/Exec[losetup -f /var/lib/cinder/cinder-volumes && udevadm settle && vgchange -a y cinder-volumes]/returns: change from notrun to 0 failed: losetup -f /var/lib/cinder/cinder-volumes && udevadm settle && vgchange -a y cinder-volumes returned 5 instead of one of [0]", ',
 u'        "Warning: /Stage[main]/Cinder::Deps/Anchor[cinder::service::begin]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Cinder::Api/Service[cinder-api]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Cinder::Scheduler/Service[cinder-scheduler]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Cinder::Volume/Service[cinder-volume]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Tripleo::Profile::Base::Cinder::Volume::Iscsi/Cinder::Backend::Iscsi[tripleo_iscsi]/Service[target]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Cinder::Deps/Anchor[cinder::service::end]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Apache::Service/Service[httpd]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Tripleo::Firewall::Post/Tripleo::Firewall::Rule[998 log all]/Firewall[998 log all ipv4]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Tripleo::Firewall::Post/Tripleo::Firewall::Rule[998 log all]/Firewall[998 log all ipv6]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Tripleo::Firewall::Post/Tripleo::Firewall::Rule[999 drop all]/Firewall[999 drop all ipv4]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Tripleo::Firewall::Post/Tripleo::Firewall::Rule[999 drop all]/Firewall[999 drop all ipv6]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Firewall::Linux::Redhat/File[/etc/sysconfig/iptables]: Skipping because of failed dependencies", ',
 u'        "Warning: /Stage[main]/Firewall::Linux::Redhat/File[/etc/sysconfig/ip6tables]: Skipping because of failed dependencies"',
 u'    ]',


If I log in to the controller, I see:

[root@overcloud-controller-0 heat-admin]# losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE
/dev/loop0         0      0         0  0 /var/lib/cinder/cinder-volumes
[root@overcloud-controller-0 heat-admin]# vgs
  VG             #PV #LV #SN Attr   VSize   VFree  
  cinder-volumes   1   0   0 wz--n- <10.04g <10.04g


[root@overcloud-controller-0 heat-admin]# vgchange -a y foo
  Volume group "foo" not found
  Cannot process volume group foo
[root@overcloud-controller-0 heat-admin]# echo $?
5

Comment 15 Joanne O'Flynn 2018-08-15 08:07:04 UTC
This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible.

If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".


To add draft documentation text:

* Select the documentation type from the "Doc Type" drop down field.

* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

Comment 17 errata-xmlrpc 2018-08-29 16:35:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2574

