Bug 814321 - logging into faulty block devices will cause unknown device error and we will be unable to login to new targets on the storage
Summary: logging into faulty block devices will cause unknown device error and we will...
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.1.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assignee: Ayal Baron
QA Contact:
URL:
Whiteboard: storage
Depends On:
Blocks:
Reported: 2012-04-19 15:16 UTC by Dafna Ron
Modified: 2016-02-10 17:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-06-06 13:01:12 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
logs (637.50 KB, application/x-gzip)
2012-04-19 15:18 UTC, Dafna Ron

Description Dafna Ron 2012-04-19 15:16:55 UTC
Description of problem:

I have several faulty LUNs. When I discovered the devices and logged in to them, I got an error in the GUI saying that LUNs cannot be retrieved.
As long as these LUNs are logged in (not connected, just logged in), I cannot add any other targets from the storage.

Version-Release number of selected component (if applicable):

vdsm-4.9.6-7.el6.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Have some faulty LUNs (dirty ones will be enough) and a new LUN
2. Try to log in to the faulty LUNs
3. Try to log in to the good LUN
  
Actual results:

After trying to connect the 3rd faulty LUN, we get an error from vdsm and can no longer connect to any other LUNs.

Expected results:

Faulty LUNs are filtered in the GUI.
We should still be allowed to connect to operational LUNs.

Additional info: full logs attached. 


vdsm error: 

Thread-324::ERROR::2012-04-19 18:04:48,882::task::853::TaskManager.Task::(_setError) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1488, in getDeviceList
    includePartitioned=options.get('includePartitioned', False))
  File "/usr/share/vdsm/storage/hsm.py", line 1513, in _getDeviceList
    for pv in lvm.getAllPVs():
  File "/usr/share/vdsm/storage/lvm.py", line 756, in getAllPVs
    return _lvminfo.getAllPvs()
  File "/usr/share/vdsm/storage/lvm.py", line 529, in getAllPvs
    pvs = self._reloadpvs()
  File "/usr/share/vdsm/storage/lvm.py", line 346, in _reloadpvs
    pv = makePV(*fields)
  File "/usr/share/vdsm/storage/lvm.py", line 200, in makePV
    name = fixPVName(args[1])
  File "/usr/share/vdsm/storage/lvm.py", line 195, in fixPVName
    dmId = devicemapper.getDmIdFromFile(devPath)
  File "/usr/share/vdsm/storage/devicemapper.py", line 35, in getDmIdFromFile
    raise OSError(errno.ENODEV, "Could not find dm device named `%s`" % path)
OSError: [Errno 19] Could not find dm device named `unknown device`
Thread-324::DEBUG::2012-04-19 18:04:48,883::task::872::TaskManager.Task::(_run) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::Task._run: c6a7fc90-de32-4086-bf9d-971a8d4e9e15 (3,) {} failed - stopping task
Thread-324::DEBUG::2012-04-19 18:04:48,884::task::1199::TaskManager.Task::(stop) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::stopping in state preparing (force False)
Thread-324::DEBUG::2012-04-19 18:04:48,884::task::978::TaskManager.Task::(_decref) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::ref 1 aborting True
Thread-324::INFO::2012-04-19 18:04:48,885::task::1157::TaskManager.Task::(prepare) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::aborting: Task is aborted: u'[Errno 19] Could not find dm device named `unknown device`' - code 100
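
The traceback above shows a single unresolvable PV name (`unknown device`) aborting the whole getDeviceList call. As a rough illustration only, and not the actual vdsm code or the fix that later made this unreproducible, the sketch below shows how a listing loop could skip entries that fail with ENODEV while still returning the healthy ones. The names `get_dm_id`, `list_usable_pvs`, and `dm_table` are hypothetical stand-ins invented for this example.

```python
import errno


def get_dm_id(dev_path, dm_table):
    """Hypothetical stand-in for a device-mapper lookup.

    dm_table is an assumed dict of device path -> dm id; a missing entry
    raises the same kind of error seen in the traceback.
    """
    try:
        return dm_table[dev_path]
    except KeyError:
        raise OSError(errno.ENODEV,
                      "Could not find dm device named `%s`" % dev_path)


def list_usable_pvs(pv_names, dm_table):
    """Resolve each PV name, skipping those that cannot be resolved."""
    usable, skipped = [], []
    for name in pv_names:
        try:
            usable.append((name, get_dm_id(name, dm_table)))
        except OSError as e:
            # Only tolerate "no such device"; re-raise anything else.
            if e.errno != errno.ENODEV:
                raise
            skipped.append(name)
    return usable, skipped
```

With this approach a faulty entry would end up in `skipped` (and could be logged or filtered in the GUI) instead of failing the entire device list.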


backend: 

2012-04-19 18:09:21,993 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp--0.0.0.0-8009-7) Error code BlockDeviceActionError and error message VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: ()
2012-04-19 18:09:21,995 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp--0.0.0.0-8009-7) Command org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand return value 
 
Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.LUNListReturnForXmlRpc
lunList                       Null
mStatus                       Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode                         600
mMessage                      Error block device action: ()



2012-04-19 18:09:21,998 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp--0.0.0.0-8009-7) Vds: blond-vdsh
2012-04-19 18:09:21,999 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (ajp--0.0.0.0-8009-7) Command GetDeviceListVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: ()
2012-04-19 18:09:22,001 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp--0.0.0.0-8009-7) FINISH, GetDeviceListVDSCommand, log id: 7a3fc2e
2012-04-19 18:09:22,002 ERROR [org.ovirt.engine.core.bll.storage.GetDeviceListQuery] (ajp--0.0.0.0-8009-7) Query GetDeviceListQuery failed. Exception message is VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: ()
2012-04-19 18:10:00,001 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-39) Checking autorecoverable hosts

Comment 1 Dafna Ron 2012-04-19 15:18:36 UTC
Created attachment 578675 [details]
logs

Comment 3 Dan Kenigsberg 2012-05-09 09:45:34 UTC
Why is it marked a regression? Since which vdsm version?

Comment 4 Haim 2012-05-10 20:59:39 UTC
(In reply to comment #3)
> Why is it marked a regression? Since which vdsm version?

Removing regression flag - managed to reproduce on z-stream.

Comment 6 RHEL Program Management 2012-07-10 07:51:56 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 7 RHEL Program Management 2012-07-11 01:55:38 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 8 RHEL Program Management 2012-12-14 07:53:02 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 9 Ayal Baron 2013-02-16 20:30:12 UTC
Haim, since there have been many changes in this area, could you try to reproduce?

Comment 10 Dafna Ron 2013-06-06 13:01:12 UTC
Since they fixed the --force bug, we cannot reproduce this any more.
Closing bug.

