Bug 1319987 - Storage activities are failing with error "Image is not a legal chain"
Summary: Storage activities are failing with error "Image is not a legal chain"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.6.3
Hardware: Unspecified
OS: Linux
urgent
high
Target Milestone: ovirt-4.0.0-beta
: 4.0.0
Assignee: Adam Litke
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard: storage
Depends On:
Blocks: 1321444
TreeView+ depends on / blocked
 
Reported: 2016-03-22 03:03 UTC by nijin ashok
Modified: 2019-12-16 05:33 UTC (History)
25 users (show)

Fixed In Version: v4.18.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1321444 (view as bug list)
Environment:
Last Closed: 2016-08-23 20:15:15 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2204741 0 None None None 2016-03-22 14:50:08 UTC
Red Hat Product Errata RHEA-2016:1671 0 normal SHIPPED_LIVE VDSM 4.0 GA bug fix and enhancement update 2016-09-02 21:32:03 UTC
oVirt gerrit 55164 0 ovirt-3.6 MERGED storage: Clean up image links during teardownImage 2020-11-16 16:18:47 UTC
oVirt gerrit 55175 0 master MERGED blockVolume: Remove stale links to image directory 2020-11-16 16:18:47 UTC
oVirt gerrit 55227 0 master MERGED storage: Clean up image links during teardownImage 2020-11-16 16:18:47 UTC
oVirt gerrit 55231 0 ovirt-3.6 MERGED blockVolume: Remove stale links to image directory 2020-11-16 16:19:09 UTC

Description nijin ashok 2016-03-22 03:03:31 UTC
Description of problem:

In the vdsm-4.17.23 , many storage related activities are failing with error "Image is not a legal chain" because of the incorrect handling of files under /rhev/data-center/

In a RHEV environment with more than 1 host, created a VM having below disk

engine=# select image_guid,image_group_id,parentid from images where image_group_id in   
(select image_group_id from vm_images_view where disk_id in   
(select device_id from vm_device where vm_id = (select vm_guid from vm_static where vm_name = 'test-vm') and device = 'disk'))   
order by image_group_id,parentid;
              image_guid              |            image_group_id            |               parentid               
--------------------------------------+--------------------------------------+--------------------------------------
 b93d9217-f7d4-4f3f-8b0e-14a267c63b70 | 6a5efd25-ebb6-404d-bfc3-c02f1383ff4a | 00000000-0000-0000-0000-000000000000


After creating the disk, a directory was created in the  host A which is the SPM host.

==
root@HOST-A~]# ls -l /rhev/data-center/00000001-0001-0001-0001-0000000003bd/d9b512c6-fe14-4b69-9757-ec707fdd1231/images/ |grep 6a5efd25-ebb6-404d-bfc3-c02f1383ff4a
drwxr-xr-x. 2 vdsm kvm 60 Mar 21 18:53 6a5efd25-ebb6-404d-bfc3-c02f1383ff4a
==


Started the VM in the host B which creates a symlink in HOST B.

==
root@HOST-B~]#ls -l /rhev/data-center/00000001-0001-0001-0001-0000000003bd/d9b512c6-fe14-4b69-9757-ec707fdd1231/images/ |grep 6a5efd25-ebb6-404d-bfc3-c02f1383ff4a
lrwxrwxrwx. 1 vdsm kvm 95 Mar 22 00:29 6a5efd25-ebb6-404d-bfc3-c02f1383ff4a -> /var/run/vdsm/storage/d9b512c6-fe14-4b69-9757-ec707fdd1231/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a
==

Set HOST-B as SPM host manually.

Shutdown the VM.

Try to take template of this VM. It fails with below error.

===
jsonrpc.Executor/1::DEBUG::2016-03-22 00:41:21,997::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Volume.copy' in bridge with {u'preallocate': 1, u'volFormat': 5, u'volType': 6, u'dstImgUUID': u'5b6a3cd3-6b7b-4446-815c-9c2c9a6c85da', u'dstVolUUID': u'9015e879-224c-40f0-8b22-1d2a452b1951', u'dstSdUUID': u'd9b512c6-fe14-4b69-9757-ec707fdd1231', u'volumeID': u'b93d9217-f7d4-4f3f-8b0e-14a267c63b70', u'imageID': u'6a5efd25-ebb6-404d-bfc3-c02f1383ff4a', u'postZero': u'false', u'storagepoolID': u'00000001-0001-0001-0001-0000000003bd', u'storagedomainID': u'd9b512c6-fe14-4b69-9757-ec707fdd1231', u'force': u'false', u'desc': u'{"DiskAlias":"test-vm_Disk1","DiskDescription":""}'}
jsonrpc.Executor/1::INFO::2016-03-22 00:41:22,099::image::298::Storage.Image::(isLegal) image 6a5efd25-ebb6-404d-bfc3-c02f1383ff4a in domain d9b512c6-fe14-4b69-9757-ec707fdd1231 has vollist ['b93d9217-f7d4-4f3f-8b0e-14a267c63b70']
jsonrpc.Executor/1::ERROR::2016-03-22 00:41:22,099::blockVolume::459::Storage.Volume::(validateImagePath) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/blockVolume.py", line 457, in validateImagePath
    os.mkdir(imageDir, 0o755)
OSError: [Errno 17] File exists: '/rhev/data-center/00000001-0001-0001-0001-0000000003bd/d9b512c6-fe14-4b69-9757-ec707fdd1231/images/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a'
jsonrpc.Executor/1::ERROR::2016-03-22 00:41:22,100::task::866::Storage.TaskManager.Task::(_setError) Task=`93d1adac-2970-4d50-9b1e-d4481cab9ae6`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1815, in copyImage
    pool.validateVolumeChain(sdUUID, srcImgUUID)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 2016, in validateVolumeChain
    image.Image(self.poolPath).validateVolumeChain(sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 734, in validateVolumeChain
    raise se.ImageIsNotLegalChain(imgUUID)
ImageIsNotLegalChain: Image is not a legal chain: (u'6a5efd25-ebb6-404d-bfc3-c02f1383ff4a',)
===

Extending the disk also fails with same error.

===
cdaf11c6-6c6a-4012-b149-44da3c290b1f::DEBUG::2016-03-22 00:34:01,441::task::905::Storage.TaskManager.Task::(_runJobs) Task=`cdaf11c6-6c6a-4012-b149-44da3c290b1f`::Task.run: running job 0: extendVolumeSize: <bound method StoragePool.extendVolumeSize of <storage.sp.StoragePool object at 0x7fce800ae1d0>> (args: (u'd9b512c6-fe14-4b69-9757-ec707fdd1231', u'6a5efd25-ebb6-404d-bfc3-c02f1383ff4a', u'b93d9217-f7d4-4f3f-8b0e-14a267c63b70', 8388608) kwargs: {})
cdaf11c6-6c6a-4012-b149-44da3c290b1f::ERROR::2016-03-22 00:34:01,524::blockVolume::459::Storage.Volume::(validateImagePath) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/blockVolume.py", line 457, in validateImagePath
    os.mkdir(imageDir, 0o755)
OSError: [Errno 17] File exists: '/rhev/data-center/00000001-0001-0001-0001-0000000003bd/d9b512c6-fe14-4b69-9757-ec707fdd1231/images/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a'
===

Also other storage activity like exporting the VM , live storage migration is failing.
  
Version-Release number of selected component (if applicable):

Red Hat Enterprise Virtualization 3.6
vdsm-4.17.23-0.el7ev.noarch
Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160302.1.el7ev)

How reproducible:
100%

Steps to Reproduce:

1. Create a VM and start this VM in any host other than the SPM host.
2. Set the host where we start the VM as SPM host.
3. Shutdown the VM and try to do any storage activity like creating template , extending disk, exporting the VM 

Actual results:

The storage activity on the VM fails with "Image is not a legal chain" error.

Expected results:

The storage activity on the VM should work.

Additional info:

From the code it seems that it checks for the path and if it's not a directory, the same will be created.

==
if not os.path.isdir(imageDir):
  try:
    os.mkdir(imageDir, 0o755)
  except Exception:
    self.log.error("Unexpected error", exc_info=True)
                raise se.ImagePathError(imageDir)
==

This will be a symlink to a directory if the VM is running on the host which will work good. However in this case, an broken symlink is left behind after we shutdown the VM.

file /rhev/data-center/00000001-0001-0001-0001-0000000003bd/d9b512c6-fe14-4b69-9757-ec707fdd1231/images/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a
/rhev/data-center/00000001-0001-0001-0001-0000000003bd/d9b512c6-fe14-4b69-9757-ec707fdd1231/images/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a: broken symbolic link to `/var/run/vdsm/storage/d9b512c6-fe14-4b69-9757-ec707fdd1231/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a'

file /var/run/vdsm/storage/d9b512c6-fe14-4b69-9757-ec707fdd1231/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a
/var/run/vdsm/storage/d9b512c6-fe14-4b69-9757-ec707fdd1231/6a5efd25-ebb6-404d-bfc3-c02f1383ff4a: cannot open (No such file or directory)

If this host becomes SPM and any storage related activity is done for this  VM, it fails because the creation of directory fails.

Comment 4 Adam Litke 2016-03-23 18:45:57 UTC
I was able to reproduce this locally.  Note that it only affects Block-based storage.

Comment 6 Adam Litke 2016-03-23 20:06:48 UTC
As a temporary workaround, you can safely remove the broken symlink /rhev/data-center/00000002-0002-0002-0002-0000000002b6/17a3c34d-c48b-48ba-bafe-14450be367f5/images/d7650319-bf64-4030-95d4-6bee86aca526 .

Comment 9 Yaniv Kaul 2016-05-03 17:34:28 UTC
This has been MODIFIED for >1M - isn't it already available in some build and can be moved to ON_QA?

Comment 10 Allon Mureinik 2016-05-05 12:28:13 UTC
(In reply to Yaniv Kaul from comment #9)
> This has been MODIFIED for >1M - isn't it already available in some build
> and can be moved to ON_QA?

There's are no tags on VDSM past v4.17.999, so I guess the answer is sadly no.

Comment 11 Kevin Alon Goldblatt 2016-06-22 16:45:20 UTC
Verified with the following code:
-----------------------------------------------
rhevm-4.0.0.4-0.1.el7ev.noarch
vdsm-4.18.2-0.el7ev.x86_64

Verified with the following scenario:
-----------------------------------------------
1. On a system with 2 hosts created VM8 with 1 iscis disk
2. Started vm8 on the host which is not the SPM
3. Manually made the above host the SPM
4. Powered off the VM
5. Created a template of the VM successfully
6. Extended the disk of the VM successfully



MOVING TO VERIFIED

Comment 13 errata-xmlrpc 2016-08-23 20:15:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1671.html


Note You need to log in before you can comment on or make changes to this bug.