Bug 1273194
Summary: | Cinder cannot create volumes after Ceph packages are updated | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | nalmond |
Component: | openstack-cinder | Assignee: | Jon Bernard <jobernar> |
Status: | CLOSED EOL | QA Contact: | nlevinki <nlevinki> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 5.0 (RHEL 6) | CC: | creynold, cschwede, eharney, jdurgin, lhh, nlevine, sathlang, sgotliv, yeylon |
Target Milestone: | --- | Keywords: | Reopened, ZStream |
Target Release: | 5.0 (RHEL 6) | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-09-07 13:31:57 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
nalmond
2015-10-19 22:02:54 UTC
An unresolved symbol in any dynamically linked library suggests a packaging bug. If librbd cannot be loaded as a result, any user of it (cinder in this case) will fail. We need to look closer at the librbd packaging for that particular version. Josh, I guess we have to reassign it to Ceph.

There are internals in librbd and librados like this that are accidentally exposed in firefly. These internal ABIs are not stable, so this kind of problem occurs when mismatched versions are loaded. Since cinder may have the old version of librados in memory and then try to load the new version of librbd, this sort of error can happen. These internal symbols are not exported in hammer (downstream 1.3.0), and for upgrades of older versions like this one we may need to document a workaround, i.e. restart cinder-volume (and nova-compute if using rbd for ephemeral disks) after upgrading librbd. Other librbd users like qemu are much less likely to be affected, since they only open librbd/librados once, at start up. The python bindings are effectively using dlopen(), so there are larger windows during which a conflict can arise as packages are installed and cinder-volume or nova-compute re-load new versions of the libraries.

Is there a documented workaround for this, or will it be addressed in a later version?

Since there are no further releases of RHCS 1.2, where the bug is present, it does not make sense to document workarounds for the issue. We should be pushing for customers to upgrade to RHCS 1.3, which will not have this problem.

Nick, please recommend that your customer upgrade to RHCS 1.3 or restart the relevant services as described in comment #4.

Hi, re-opening this bug because we've got a new instance of it [1] with:
- librbd1-10.2.10-28.el7cp.x86_64
- openstack-nova-compute-14.1.0-26.el7ost.noarch
on a compute node of a director installation during the upgrade from OSP9 to OSP10. This happens after running yum upgrade on all nodes.
The exact error is:

Build of instance cc5b7484-e201-496c-af5b-75297a7f8870 aborted: /lib64/librbd.so.1: undefined symbol: _ZN8librados5Rados15aio_watch_flushEPNS_13AioCompletionE', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1787,

A workaround is to restart nova-compute, but I wonder if there could be a more "permanent" fix.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1625166
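For anyone who wants to check whether a service needs the restart workaround, here is a rough sketch, not something taken from this bug. It assumes a Linux host, and it assumes the package upgrade replaced the library files on disk, so that the old copies a long-running process still has mapped show up with a " (deleted)" suffix in /proc/<pid>/maps. The function names and the watched-prefix list are made up for illustration.

```python
# Library name prefixes to look for; chosen from the libraries named in
# this bug. This list is an assumption, adjust it as needed.
WATCHED = ("librbd", "librados")


def stale_mappings(maps_text, watched=WATCHED):
    """Return sorted paths of watched libraries that are mapped from
    files which have since been deleted or replaced on disk.

    maps_text is the content of /proc/<pid>/maps.
    """
    suffix = " (deleted)"
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5]
        if not path.endswith(suffix):
            continue  # backing file still exists on disk
        real_path = path[: -len(suffix)]
        name = real_path.rsplit("/", 1)[-1]
        if name.startswith(watched):
            stale.add(real_path)
    return sorted(stale)


def stale_mappings_for_pid(pid):
    """Read /proc/<pid>/maps (Linux only) and report stale mappings."""
    with open("/proc/%d/maps" % pid) as handle:
        return stale_mappings(handle.read())
```

Calling stale_mappings_for_pid() with the PID of cinder-volume or nova-compute and getting a non-empty list back would suggest the process still holds a pre-upgrade librbd or librados and should be restarted. An empty list does not prove the process is safe; it only means this particular check found nothing.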