Bug 814321 - Logging in to faulty block devices causes an "unknown device" error and prevents logging in to new targets on the storage
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.1.0
Hardware: x86_64 Linux
Priority: high Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Ayal Baron
storage
Depends On:
Blocks:
 
Reported: 2012-04-19 11:16 EDT by Dafna Ron
Modified: 2016-02-10 12:06 EST
CC List: 6 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-06-06 09:01:12 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
logs (637.50 KB, application/x-gzip)
2012-04-19 11:18 EDT, Dafna Ron

Description Dafna Ron 2012-04-19 11:16:55 EDT
Description of problem:

I have several faulty LUNs. When I discovered devices and logged in to them, I got an error in the GUI saying that the LUNs could not be retrieved.
As long as these LUNs are logged in (not connected, just logged in) I cannot add any other targets from the storage.

Version-Release number of selected component (if applicable):

vdsm-4.9.6-7.el6.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Have some faulty LUNs (dirty ones will be enough) and a new LUN.
2. Try to log in to the faulty LUNs.
3. Try to log in to the good LUN (see the scripted sketch after these steps).
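
Scripted sketch of the reproduction steps above, assuming iSCSI targets and the standard iscsiadm client; the portal address and target IQNs are placeholders, not values from this bug. In the bug itself the logins were driven through the GUI; these are only the equivalent host-side commands, wrapped in Python.

# Hypothetical reproduction sketch (not taken from the attached logs):
# discover the targets on a portal, log in to the faulty targets first,
# then try the good one. Portal and IQNs below are placeholders.
import subprocess

PORTAL = "10.0.0.1:3260"                      # placeholder portal address
FAULTY_TARGETS = ["iqn.2012-04.example:faulty1",
                  "iqn.2012-04.example:faulty2",
                  "iqn.2012-04.example:faulty3"]
GOOD_TARGET = "iqn.2012-04.example:good"

def run(*argv):
    print("$", " ".join(argv))
    return subprocess.call(argv)

# 1. discover all targets exposed by the portal
run("iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL)

# 2. log in to the faulty targets (the error shows up around the 3rd login)
for target in FAULTY_TARGETS:
    run("iscsiadm", "-m", "node", "-T", target, "-p", PORTAL, "--login")

# 3. now try the good target; with the faulty sessions still logged in,
#    getDeviceList fails and the good LUN cannot be added either
run("iscsiadm", "-m", "node", "-T", GOOD_TARGET, "-p", PORTAL, "--login")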
  
Actual results:

After trying to connect the 3rd faulty LUN we get an error from vdsm and can no longer connect to any other LUNs.

Expected results:

Faulty LUNs should be filtered in the GUI.
We should still be allowed to connect to operational LUNs.

Additional info: full logs attached. 


vdsm error: 

Thread-324::ERROR::2012-04-19 18:04:48,882::task::853::TaskManager.Task::(_setError) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1488, in getDeviceList
    includePartitioned=options.get('includePartitioned', False))
  File "/usr/share/vdsm/storage/hsm.py", line 1513, in _getDeviceList
    for pv in lvm.getAllPVs():
  File "/usr/share/vdsm/storage/lvm.py", line 756, in getAllPVs
    return _lvminfo.getAllPvs()
  File "/usr/share/vdsm/storage/lvm.py", line 529, in getAllPvs
    pvs = self._reloadpvs()
  File "/usr/share/vdsm/storage/lvm.py", line 346, in _reloadpvs
    pv = makePV(*fields)
  File "/usr/share/vdsm/storage/lvm.py", line 200, in makePV
    name = fixPVName(args[1])
  File "/usr/share/vdsm/storage/lvm.py", line 195, in fixPVName
    dmId = devicemapper.getDmIdFromFile(devPath)
  File "/usr/share/vdsm/storage/devicemapper.py", line 35, in getDmIdFromFile
    raise OSError(errno.ENODEV, "Could not find dm device named `%s`" % path)
OSError: [Errno 19] Could not find dm device named `unknown device`
Thread-324::DEBUG::2012-04-19 18:04:48,883::task::872::TaskManager.Task::(_run) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::Task._run: c6a7fc90-de32-4086-bf9d-971a8d4e9e15 (3,) {} failed - stopping task
Thread-324::DEBUG::2012-04-19 18:04:48,884::task::1199::TaskManager.Task::(stop) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::stopping in state preparing (force False)
Thread-324::DEBUG::2012-04-19 18:04:48,884::task::978::TaskManager.Task::(_decref) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::ref 1 aborting True
Thread-324::INFO::2012-04-19 18:04:48,885::task::1157::TaskManager.Task::(prepare) Task=`c6a7fc90-de32-4086-bf9d-971a8d4e9e15`::aborting: Task is aborted: u'[Errno 19] Could not find dm device named `unknown device`' - code 100
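
The traceback above shows that getDeviceList aborts as soon as a single PV row from LVM carries the literal device name `unknown device`: fixPVName passes it to devicemapper.getDmIdFromFile, which raises ENODEV, and the whole task fails. Below is a minimal standalone sketch of the behaviour asked for under "Expected results" - skipping unresolvable PVs instead of failing the entire call. It is illustrative only, not the vdsm code; dm_id_from_path, filter_pv_rows and the row layout are assumptions.

# Standalone sketch -- NOT the vdsm implementation -- of filtering out PV
# rows whose device is the LVM placeholder "unknown device" instead of
# letting one bad row abort the whole device list.
import errno
import os

UNKNOWN_DEV = "unknown device"   # literal emitted by LVM for missing PVs

def dm_id_from_path(dev_path):
    """Resolve a /dev/mapper path to its dm-N kernel name (sketch)."""
    name = os.path.basename(os.path.realpath(dev_path))   # e.g. "dm-3"
    if not name.startswith("dm-"):
        raise OSError(errno.ENODEV,
                      "Could not find dm device named `%s`" % dev_path)
    return name

def filter_pv_rows(pv_rows):
    """Yield only PV rows whose device can be resolved; skip faulty ones."""
    for row in pv_rows:
        dev = row["pv_name"]
        if dev == UNKNOWN_DEV:
            # LVM could not find the underlying device; skip, don't fail.
            continue
        try:
            row["dm_id"] = dm_id_from_path(dev)
        except OSError as e:
            if e.errno != errno.ENODEV:
                raise
            continue                      # unresolvable device, skip it
        yield row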


backend: 

2012-04-19 18:09:21,993 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp--0.0.0.0-8009-7) Error code BlockDeviceActionError and error message VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: ()
2012-04-19 18:09:21,995 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp--0.0.0.0-8009-7) Command org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand return value 
 
Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.LUNListReturnForXmlRpc
lunList                       Null
mStatus                       Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode                         600
mMessage                      Error block device action: ()



2012-04-19 18:09:21,998 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (ajp--0.0.0.0-8009-7) Vds: blond-vdsh
2012-04-19 18:09:21,999 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (ajp--0.0.0.0-8009-7) Command GetDeviceListVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: ()
2012-04-19 18:09:22,001 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (ajp--0.0.0.0-8009-7) FINISH, GetDeviceListVDSCommand, log id: 7a3fc2e
2012-04-19 18:09:22,002 ERROR [org.ovirt.engine.core.bll.storage.GetDeviceListQuery] (ajp--0.0.0.0-8009-7) Query GetDeviceListQuery failed. Exception message is VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetDeviceListVDS, error = Error block device action: ()
2012-04-19 18:10:00,001 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-39) Checking autorecoverable hosts
Comment 1 Dafna Ron 2012-04-19 11:18:36 EDT
Created attachment 578675 [details]
logs
Comment 3 Dan Kenigsberg 2012-05-09 05:45:34 EDT
Why is it marked a regression? Since which vdsm version?
Comment 4 Haim 2012-05-10 16:59:39 EDT
(In reply to comment #3)
> Why is it marked a regression? Since which vdsm version?

Removing regression flag - managed to reproduce on z-stream.
Comment 6 RHEL Product and Program Management 2012-07-10 03:51:56 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 7 RHEL Product and Program Management 2012-07-10 21:55:38 EDT
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.
Comment 8 RHEL Product and Program Management 2012-12-14 02:53:02 EST
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 9 Ayal Baron 2013-02-16 15:30:12 EST
Haim, since there have been many changes in this area, could you try to reproduce?
Comment 10 Dafna Ron 2013-06-06 09:01:12 EDT
Since they fixed the --force bug, we cannot reproduce this any more.
Closing bug.
