Bug 1292092 - Bogus log when unlink operation fails during deleteImage on file storage.
Summary: Bogus log when unlink operation fails during deleteImage on file storage.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.17.10
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ovirt-3.6.2
: 4.17.16
Assignee: Idan Shaby
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-12-16 12:57 UTC by Nir Soffer
Modified: 2016-02-18 11:05 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1291207
Environment:
Last Closed: 2016-02-18 11:05:11 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-3.6.z+
ylavi: planning_ack+
amureini: devel_ack+
rule-engine: testing_ack+


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 50589 0 master MERGED fileSD: Do not try to remove non-existent lease file 2015-12-17 19:08:16 UTC
oVirt gerrit 50804 0 master MERGED fileSD: improve error handling in deleteImage 2015-12-24 07:10:43 UTC
oVirt gerrit 51229 0 ovirt-3.6 MERGED fileSD: improve error handling in deleteImage 2016-01-03 20:01:03 UTC
oVirt gerrit 51299 0 ovirt-3.6 MERGED fileSD: improve error handling in deleteImage 2016-01-05 09:52:44 UTC

Description Nir Soffer 2015-12-16 12:57:54 UTC
+++ This bug was initially created as a clone of Bug #1291207 +++

Description of problem:

When .meta or .lease file unlink fail, we log that the volume file could
not be removed instead of logging about the failure to remove the actual
file, making debugging harder.

Version-Release number of selected component (if applicable):
rhevm-3.6.1.3-0.1.el6.noarch
vdsm-4.17.13-1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Export a VM with disk to export domain
2. Remove the VM from export domain

Actual results:
Trying to remove the non-existent .lease file, we log about the volume file:

jsonrpc.Executor/1::ERROR::2015-12-14 10:44:56,320::fileSD::409::Storage.StorageDomain::(deleteImage) vol: /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_elad_3/e89bff1b-b661-419a-85cc-1b083219879b/images/_re
move_me_e5b31639-bf4d-475d-b286-fbd09ae835c0/6f3b44a7-60f4-4c5a-8e33-5c0bd592d04d can't be removed.
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/fileSD.py", line 406, in deleteImage
    self.oop.os.remove(leaseFile)
  File "/usr/share/vdsm/storage/outOfProcess.py", line 246, in remove
    self._iop.unlink(path)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 491, in unlink
    return self._sendCommand("unlink", {"path": path}, self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 427, in _sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 2] No such file or directory

The failed operation was:

    self.oop.os.remove(leaseFile)

But the logged message is:

  vol: /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_elad_3/e89bff1b-b661-419a-85cc-1b083219879b/images/_re
move_me_e5b31639-bf4d-475d-b286-fbd09ae835c0/6f3b44a7-60f4-4c5a-8e33-5c0bd592d04d can't be removed.

This is a lie, as the file "6f3b44a7-60f4-4c5a-8e33-5c0bd592d04d" was removed just before
trying to remove the file "6f3b44a7-60f4-4c5a-8e33-5c0bd592d04d.lease".

Expected result:

Log the error with the actual path that failed.
    
The issue of deleting non-existent lease file is handled in bug 1291207
this bug is about logging correctly.

Comment 1 Nir Soffer 2015-12-16 12:59:34 UTC
Setting to medium because bogus logs make it harder to debug real issues.

Comment 2 Nir Soffer 2015-12-16 13:16:59 UTC
The unlink operation failed with ENOENT. This means that the delete operation was
not needed, as the file does not exist.

In this case we should log a debug message, since we do not expect a missing
volume file in an image directory (maybe we do when removing garbage volumes),
and continue the operation.

And there is even worse problem, that can fail the entire operation if deleting
the vol-uuid file or vol-uuid.meta file failed.

In this flow, we are removing these files:

    image-uuid/vol-uuid
    image-uuid/vol-uuid.meta
    image-uuid/vol-uuid.lease

We have one error handler for the 3 unlink calls, so if removing vol-uuid 
fails, we will not try to remove vol-uuid.meta and vol-uuid.lease, and
deleting the image directory will fail because rmdir works only for
empty directories.

The fix should log errors (or debug message for ENOENT) for each unlink
operation, so we always delete all the vol-uuid* files.

Comment 3 Elad 2016-01-18 10:20:24 UTC
Tested the following over file data and export domain:
1) Manually removed the volume file from /rhev/data-center/pool/sd/images/image/
2) Removed the disk from the domain

For both export and data domains, the delete operation all the image content proceeded and the images got removed successfully.

jsonrpc.Executor/6::DEBUG::2016-01-18 12:13:16,512::fileSD::410::Storage.StorageDomain::(_deleteVolumeFile) Removing file: /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_elad_6/36f31a8d-55aa-4c3e-881c-02544bc
5d213/images/_remove_me_833762ef-2597-44f8-a8f5-41ad5276698b/b9a13fe9-296a-4685-8038-f5299f0a650e
jsonrpc.Executor/6::WARNING::2016-01-18 12:13:16,514::fileSD::415::Storage.StorageDomain::(_deleteVolumeFile) File u'/rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_elad_6/36f31a8d-55aa-4c3e-881c-02544bc5d213/
images/_remove_me_833762ef-2597-44f8-a8f5-41ad5276698b/b9a13fe9-296a-4685-8038-f5299f0a650e' does not exist: [Errno 2] No such file or directory
jsonrpc.Executor/6::DEBUG::2016-01-18 12:13:16,514::fileSD::410::Storage.StorageDomain::(_deleteVolumeFile) Removing file: /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_elad_6/36f31a8d-55aa-4c3e-881c-02544bc
5d213/images/_remove_me_833762ef-2597-44f8-a8f5-41ad5276698b/b9a13fe9-296a-4685-8038-f5299f0a650e.meta
jsonrpc.Executor/6::DEBUG::2016-01-18 12:13:16,516::fileSD::410::Storage.StorageDomain::(_deleteVolumeFile) Removing file: /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_elad_6/36f31a8d-55aa-4c3e-881c-02544bc
5d213/images/_remove_me_833762ef-2597-44f8-a8f5-41ad5276698b/b9a13fe9-296a-4685-8038-f5299f0a650e.lease
jsonrpc.Executor/6::DEBUG::2016-01-18 12:13:16,517::fileSD::402::Storage.StorageDomain::(deleteImage) Removing directory: /rhev/data-center/mnt/10.35.64.11:_vol_RHEV_Storage_elad_6/36f31a8d-55aa-4c3e-881c-02544bc5
d213/images/_remove_me_833762ef-2597-44f8-a8f5-41ad5276698b




Verified using:
vdsm-4.17.17-0.el7ev.noarch
rhevm-3.6.2.5-0.1.el6.noarch


Note You need to log in before you can comment on or make changes to this bug.