Bug 1573870
| Summary: | Attaching an encrypted Cinder volume created from an image (created from another encrypted volume upload) fails | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Tzach Shefi <tshefi> |
| Component: | openstack-cinder | Assignee: | Eric Harney <eharney> |
| Status: | CLOSED ERRATA | QA Contact: | Avi Avraham <aavraham> |
| Severity: | high | Docs Contact: | Kim Nylander <knylande> |
| Priority: | high | | |
| Version: | 13.0 (Queens) | CC: | berrange, cschwede, dasmith, eglynn, eharney, jhakimra, jschluet, kchamart, lyarwood, mbooth, pgrist, sbauza, scohen, sferdjao, sgordon, srevivo, thiago, tshefi, vromanso |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 13.0 (Queens) | Flags: | eharney: needinfo- |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-cinder-12.0.1-0.20180418194614.c476898.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-06-27 13:54:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description (Tzach Shefi, 2018-05-02 12:08:35 UTC)

Nova compute does:
2018-05-02 11:12:18.362 [nova-compute.log] 1 DEBUG nova.virt.libvirt.host [req-97b2a5a2-170c-43f3-89cb-6588e39c4a03 17143d85ea234df191e5b5ac06a6489f 4e0b66c6ff994c8f8e306cca69781c87 - default default] Secret XML: <secret ephemeral="no" private="no">
<usage type="volume">
<volume>a23f5c23-1dbc-413e-afac-e2b6285d0aa6</volume>
</usage>
</secret>
create_secret /usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py:732
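For context, here is a minimal python-libvirt sketch of the kind of call create_secret makes at this point, assuming a local qemu:///system connection and illustrative key material (this is not the exact Nova code):

```python
import libvirt

# Same volume-secret XML as logged above.
SECRET_XML = """<secret ephemeral="no" private="no">
  <usage type="volume">
    <volume>a23f5c23-1dbc-413e-afac-e2b6285d0aa6</volume>
  </usage>
</secret>"""

conn = libvirt.open("qemu:///system")
secret = conn.secretDefineXML(SECRET_XML)    # register the secret object
secret.setValue(b"illustrative-passphrase")  # store the actual key material
print(secret.UUIDString())                   # UUID later referenced in disk XML
```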
2018-05-02 11:12:18.389 [nova-compute.log] 1 DEBUG nova.virt.libvirt.guest [req-97b2a5a2-170c-43f3-89cb-6588e39c4a03 17143d85ea234df191e5b5ac06a6489f 4e0b66c6ff994c8f8e306cca69781c87 - default default] attach device xml: <disk type="network" device="disk">
<driver name="qemu" type="raw" cache="writeback" discard="unmap"/>
<source protocol="rbd" name="volumes/volume-a23f5c23-1dbc-413e-afac-e2b6285d0aa6">
<host name="172.17.3.16" port="6789"/>
<host name="172.17.3.18" port="6789"/>
<host name="172.17.3.20" port="6789"/>
</source>
<auth username="openstack">
<secret type="ceph" uuid="cd20e710-4b99-11e8-bfec-525400688018"/>
</auth>
<target bus="virtio" dev="vdb"/>
<serial>a23f5c23-1dbc-413e-afac-e2b6285d0aa6</serial>
<encryption format="luks">
<secret type="passphrase" uuid="8859017b-e4b8-4387-b7cb-215db09a3730"/>
</encryption>
</disk>
attach_device /usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py:302
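And a similar sketch of the attach_device call that follows, showing how persistent=True and live=True map onto libvirt flags (the domain name is an assumption; this is not the exact Nova code):

```python
import libvirt

# disk_xml holds the <disk> element logged above, read from a file here.
disk_xml = open("disk.xml").read()

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("instance-00000002")  # assumed domain name

# persistent=True maps to AFFECT_CONFIG, live=True to AFFECT_LIVE.
flags = libvirt.VIR_DOMAIN_AFFECT_CONFIG | libvirt.VIR_DOMAIN_AFFECT_LIVE
dom.attachDeviceFlags(disk_xml, flags)  # raises libvirtError on failure
```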
which results in:
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [req-97b2a5a2-170c-43f3-89cb-6588e39c4a03 17143d85ea234df191e5b5ac06a6489f 4e0b66c6ff994c8f8e306cca69781c87 - default default] [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] Failed to attach volume at mountpoint: /dev/vdb: libvirtError: internal error: unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-virtio-disk1' could not be initialized
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] Traceback (most recent call last):
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1461, in attach_volume
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] guest.attach_device(conf, persistent=True, live=live)
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 303, in attach_device
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] self._domain.attachDeviceFlags(device_xml, flags=flags)
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] result = proxy_call(self._autowrap, f, *args, **kwargs)
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] rv = execute(f, *args, **kwargs)
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] six.reraise(c, e, tb)
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] rv = meth(*args, **kwargs)
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 594, in attachDeviceFlags
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea] libvirtError: internal error: unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-virtio-disk1' could not be initialized
2018-05-02 11:12:20.813 [nova-compute.log] 1 ERROR nova.virt.libvirt.driver [instance: 4dcc222e-8522-4ca3-badb-1ae9ca3bb3ea]
The logs in Comment 1 indicate that qemu can't add the disk. This may or may not be the root cause, but I'd like to start here anyway.

Possible cause: bug 1406796 suggests that we don't yet support native LUKS encryption on RBD.

Please could you provide libvirt debug logs as detailed here:

https://kashyapc.fedorapeople.org/virt/openstack/request-nova-libvirt-qemu-debug-logs.txt

Also, sosreports would be preferable to cherry-picked logs. Please could you provide sosreports from the relevant controller(s) and compute?

(In reply to Matthew Booth from comment #2)
> The logs in Comment 1 indicate that qemu can't add the disk. This may or
> may not be the root cause, but I'd like to start here anyway.
>
> Possible cause: bug 1406796 suggests that we don't yet support native LUKS
> encryption on RBD.

That's covering full support for all blockdev operations, something that isn't required here to simply attach and decrypt the new rbd volume.

> Please could you provide libvirt debug logs as detailed here:
>
> https://kashyapc.fedorapeople.org/virt/openstack/request-nova-libvirt-qemu-debug-logs.txt
>
> Also, sosreports would be preferable to cherry-picked logs. Please could
> you provide sosreports from the relevant controller(s) and compute?

Agreed. Can we also get the output of the following commands from a host with the required rbd keys:

$ qemu-img info volumes/volume-23ce20c9-fa03-4638-ae6e-452cffb884fa
$ qemu-img info volumes/volume-a23f5c23-1dbc-413e-afac-e2b6285d0aa6

Finally, can you also try attaching the volume to another instance?

I've reproduced the issue; logs are available on a new system.
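As an aside, the two requested probes could be scripted as below. The rbd: prefix is included because, as the output later in this report shows, qemu-img needs it to address an image in a Ceph pool (a sketch using the volume IDs quoted above):

```python
import json
import subprocess

VOLUMES = (
    "23ce20c9-fa03-4638-ae6e-452cffb884fa",
    "a23f5c23-1dbc-413e-afac-e2b6285d0aa6",
)

for vol in VOLUMES:
    # Equivalent to: qemu-img info --output=json rbd:volumes/volume-<id>
    out = subprocess.run(
        ["qemu-img", "info", "--output=json", f"rbd:volumes/volume-{vol}"],
        capture_output=True, text=True, check=True)
    info = json.loads(out.stdout)
    print(vol, info.get("format"), "encrypted:", info.get("encrypted"))
```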
Also noticed another oddity on this new system; it might have happened on the original system and I just didn't notice it back then. The new volume I created in step 5 (ID a6591d92-e838-4ab1-8644-81aba6f746c6) is listed as bootable, which is false, as it only contains a text file. Suspect there is another Glance/Cinder bug lurking in the shadows here.
I've enabled libvirt logs and now tried to attach this second new volume -> d8b360fa-6348-4d7c-9714-dfbeee88fd8d. It also fails, same as step 5, just another volume.
Versions on the new system:
openstack-cinder-12.0.1-0.20180418194613.c476898.el7ost.noarch
openstack-nova-compute-17.0.3-0.20180420001139.el7ost.noarch
openstack-glance-16.0.1-2.el7ost.noarch
openstack-nova-common-17.0.3-0.20180420001139.el7ost.noarch
openstack-nova-api-17.0.3-0.20180420001139.el7ost.noarch
On this new system, qemu-img info of the three volumes:
qemu-img info volumes/volume-17276f8b-c21d-479e-a58a-522319d01ff8
qemu-img: Could not open 'volumes/volume-17276f8b-c21d-479e-a58a-522319d01ff8': Could not open 'volumes/volume-17276f8b-c21d-479e-a58a-522319d01ff8': No such file or directory
[root@controller-0 tmp]# qemu-img info rbd:volumes/volume-17276f8b-c21d-479e-a58a-522319d01ff8
image: json:{"driver": "luks", "file": {"pool": "volumes", "image": "volume-17276f8b-c21d-479e-a58a-522319d01ff8", "driver": "rbd"}}
file format: luks
virtual size: 1.0G (1073741824 bytes)
disk size: unavailable
encrypted: yes
cluster_size: 4194304
Format specific information:
ivgen alg: plain64
hash alg: sha256
cipher alg: aes-256
uuid: 8d41fc3e-d134-480c-a35e-a42bf8a5763b
cipher mode: xts
slots:
[0]:
active: true
iters: 1038604
key offset: 4096
stripes: 4000
[1]:
active: false
key offset: 262144
[2]:
active: false
key offset: 520192
[3]:
active: false
key offset: 778240
[4]:
active: false
key offset: 1036288
[5]:
active: false
key offset: 1294336
[6]:
active: false
key offset: 1552384
[7]:
active: false
key offset: 1810432
payload offset: 2068480
master key iters: 259425
qemu-img info rbd:volumes/volume-a6591d92-e838-4ab1-8644-81aba6f746c6
image: json:{"driver": "luks", "file": {"pool": "volumes", "image": "volume-a6591d92-e838-4ab1-8644-81aba6f746c6", "driver": "rbd"}}
file format: luks
virtual size: 2.0G (2145415168 bytes)
disk size: unavailable
encrypted: yes
cluster_size: 4194304
Format specific information:
ivgen alg: plain64
hash alg: sha256
cipher alg: aes-256
uuid: 8d41fc3e-d134-480c-a35e-a42bf8a5763b
cipher mode: xts
slots:
[0]:
active: true
iters: 1038604
key offset: 4096
stripes: 4000
[1]:
active: false
key offset: 262144
[2]:
active: false
key offset: 520192
[3]:
active: false
key offset: 778240
[4]:
active: false
key offset: 1036288
[5]:
active: false
key offset: 1294336
[6]:
active: false
key offset: 1552384
[7]:
active: false
key offset: 1810432
payload offset: 2068480
master key iters: 259425
qemu-img info rbd:volumes/volume-d8b360fa-6348-4d7c-9714-dfbeee88fd8d
image: json:{"driver": "luks", "file": {"pool": "volumes", "image": "volume-d8b360fa-6348-4d7c-9714-dfbeee88fd8d", "driver": "rbd"}}
file format: luks
virtual size: 2.0G (2145415168 bytes)
disk size: unavailable
encrypted: yes
cluster_size: 4194304
Format specific information:
ivgen alg: plain64
hash alg: sha256
cipher alg: aes-256
uuid: 8d41fc3e-d134-480c-a35e-a42bf8a5763b
cipher mode: xts
slots:
[0]:
active: true
iters: 1038604
key offset: 4096
stripes: 4000
[1]:
active: false
key offset: 262144
[2]:
active: false
key offset: 520192
[3]:
active: false
key offset: 778240
[4]:
active: false
key offset: 1036288
[5]:
active: false
key offset: 1294336
[6]:
active: false
key offset: 1552384
[7]:
active: false
key offset: 1810432
payload offset: 2068480
master key iters: 259425
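(Worth noting: all three volumes report the same LUKS header UUID, 8d41fc3e-d134-480c-a35e-a42bf8a5763b, and identical key-slot parameters, consistent with the new volumes carrying byte copies of the original encrypted payload while Cinder tracks a different Barbican secret for each. This lines up with the analysis in the next comment.)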
Thanks for the logs and access to the underlying host, Tzach! I've updated the upstream bug with my findings and am moving this downstream bug across to Cinder.

AFAICT Cinder is creating a fresh Barbican secret that is itself unable to unlock the original LUKS encryption on the volume. The entire flow needs to be reviewed by the Cinder team, as we appear to be doing duplicate downloads of the image from Glance and conversions on the c-vol host. I assume this is where the new secret is also created. Let me know if I can help any further with this!

One indicator of what's happening here is this message in the qemu log for the instance when the attach fails (the last line):

# cat /var/log/libvirt/qemu/instance-00000002.log
'serial' is deprecated, please use the corresponding option of '-device' instead
'serial' is deprecated, please use the corresponding option of '-device' instead
Invalid password, cannot unlock any keyslot

Waiting to rebuild my server, as the latest deployment fell short of the fixed-in version -> openstack-cinder-12.0.1-0.20180418194613.c476898.el7ost.noarch

Are we still waiting for an rpm/package? We have a fixed-in version, but the status isn't ON_QA yet.

Verified on openstack-cinder-12.0.1-0.20180418194614.c476898.el7ost
Puddle date 2018-06-04.2

Create a LUKS (Barbican) volume:
# cinder create 1 --volume-type LUKS
Created -> | 9230d56a-3098-44b3-a5fb-c55403734896 | available | - | 1 | LUKS

Boot an instance -> inst1 | ACTIVE

Attach the volume to the instance, mkfs, write some data, detach the volume. All passed fine.

Upload the volume to Glance:
# openstack image create --disk-format raw --container-format bare --volume 9230d56a-3098-44b3-a5fb-c55403734896 EncVolPlusData

Image created; now create a new LUKS volume from the image:
# cinder create 2 --image 0b900303-3c62-4a97-8eda-20186eea9dbf --volume-type LUKS --name NewVol
Volume created -> | d26d3271-0d75-47e1-9f41-f4dde42bfec1 | available | NewVol | 2 | LUKS | true

Now attach the new volume to the instance (the step that failed before); it works now:
# nova volume-attach 0fb7b7e1-a675-481b-8bed-d7bb73a43ae2 d26d3271-0d75-47e1-9f41-f4dde42bfec1 auto

Cinder list -> | d26d3271-0d75-47e1-9f41-f4dde42bfec1 | in-use | NewVol | 2 | LUKS | true | 0fb7b7e1-a675-481b-8bed-d7bb73a43ae2 |

Mount the volume inside the instance and validate the original data exists. Drums...
# cat flrd/tshefi.txt
hello

Yep, working as expected. OK to verify.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086
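For completeness, the verification flow above could be scripted by shelling out to the same CLI commands (a sketch reusing the IDs captured in this report; a real run would substitute its own freshly created IDs):

```python
import subprocess

def run(*cmd):
    """Run one CLI step and return its stdout, echoing the command first."""
    print("$ " + " ".join(cmd))
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout

# 1. Create an encrypted volume, upload it to Glance, re-create from image.
run("cinder", "create", "1", "--volume-type", "LUKS")
run("openstack", "image", "create", "--disk-format", "raw",
    "--container-format", "bare",
    "--volume", "9230d56a-3098-44b3-a5fb-c55403734896", "EncVolPlusData")
run("cinder", "create", "2",
    "--image", "0b900303-3c62-4a97-8eda-20186eea9dbf",
    "--volume-type", "LUKS", "--name", "NewVol")

# 2. The attach step that failed before the fix.
run("nova", "volume-attach", "0fb7b7e1-a675-481b-8bed-d7bb73a43ae2",
    "d26d3271-0d75-47e1-9f41-f4dde42bfec1", "auto")
```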