Bug 922515 - vdsm: vdsm fails to recover after restart with 'AttributeError: 'list' object has no attribute 'split'' error
Summary: vdsm: vdsm fails to recover after restart with 'AttributeError: 'list' object...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.2.0
Assignee: Yaniv Bronhaim
QA Contact: Dafna Ron
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2013-03-17 15:33 UTC by Dafna Ron
Modified: 2022-07-09 05:59 UTC (History)

Fixed In Version: vdsm-4.10.2-13.0.el6ev
Doc Type: Bug Fix
Doc Text:
Previously, VDSM failed to recover after restarts, and reported an error "AttributeError: 'list' object has no attribute 'split'". The function storage.fuser.fuser() was patched, and VDSM now recovers as expected after restarts.
Clone Of:
Environment:
Last Closed: 2013-06-10 20:45:57 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
logs (1.19 MB, application/x-gzip), 2013-03-17 15:33 UTC, Dafna Ron
logs (693.22 KB, application/x-gzip), 2013-03-18 20:08 UTC, Dafna Ron


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-47081 0 None None None 2022-07-09 05:59:03 UTC
Red Hat Product Errata RHSA-2013:0886 0 normal SHIPPED_LIVE Moderate: rhev 3.2 - vdsm security and bug fix update 2013-06-11 00:25:02 UTC
oVirt gerrit 13302 0 None None None Never

Description Dafna Ron 2013-03-17 15:33:08 UTC
Created attachment 711441 [details]
logs

Description of problem:

To verify bug 910013, I deleted 150 VMs with wipe=true.
At some point I started getting "Exception: No free file handlers in pool", and then vdsm restarted and could not recover, failing with "AttributeError: 'list' object has no attribute 'split'".

I had to restart vdsm manually.

Version-Release number of selected component (if applicable):

sf10
4.10-11.0 

How reproducible:


Steps to Reproduce:
1. Create a two-host iSCSI pool with 3 domains, 100G each.
2. Create a wipe=true template (1GB disk).
3. Create 3 pools from the template, with 50 VMs in each pool.
4. Detach and remove the VMs from each pool (I detached and then removed one pool at a time, without waiting for the delete on the previous pool to finish).
  
Actual results:

We get "Exception: No free file handlers in pool", then vdsm suddenly restarts and fails to recover.

Expected results:

vdsm should recover

Additional info: logs

Comment 2 Dafna Ron 2013-03-18 20:05:54 UTC
I also reproduced this issue with a much simpler scenario.

1. Run two VMs, each with two thin-provisioned disks.
2. Live migrate the disks on both VMs twice (move disks -> wait to finish -> move again).

vdsm crashed:

MainThread::ERROR::2013-03-18 21:55:51,991::clientIF::263::vds::(_initIRS) Error initializing IRS
Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 261, in _initIRS
    self.irs = Dispatcher(HSM())
  File "/usr/share/vdsm/storage/hsm.py", line 344, in __init__
    sp.StoragePool.cleanupMasterMount()
  File "/usr/share/vdsm/storage/sp.py", line 356, in cleanupMasterMount
    blockSD.BlockStorageDomain.doUnmountMaster(master)
  File "/usr/share/vdsm/storage/blockSD.py", line 1128, in doUnmountMaster
    pids = fuser(masterMount.fs_file, mountPoint=True)
  File "/usr/share/vdsm/storage/fuser.py", line 34, in fuser
    return [int(pid) for pid in out.split()]
AttributeError: 'list' object has no attribute 'split'
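
The last frame above suggests that `out` reaches fuser.py as a list of output lines rather than a single string. A minimal sketch of that failure mode and of one tolerant way to parse such output follows. The helper name `parse_pids` is hypothetical and this is not the patch that was merged; it only assumes that the command helper returns its output as a list of lines.

```python
def parse_pids(out):
    """Return the PIDs found in fuser-style output.

    `out` may be a single string or a list of output lines. Accepting a
    list is an assumption drawn from the traceback, not from vdsm code.
    """
    if isinstance(out, str):
        out = [out]
    # Split each line on whitespace and convert every token to an int.
    return [int(pid) for line in out for pid in line.split()]


# The expression from the traceback fails as soon as `out` is a list:
try:
    [int(pid) for pid in ["1234 5678"].split()]
except AttributeError as err:
    print(err)

print(parse_pids(["1234 5678"]))
```

Iterating over the lines before splitting keeps the parser independent of whether the caller hands over raw text or pre-split lines.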


We also have an AttributeError from the VM channel:

Thread-15::ERROR::2013-03-18 21:55:52,999::guestIF::103::vm.Vm::(__init__) vmId=`8df501ee-12eb-4f21-b709-0a44b2d33051`::Failed to prepare vmchannel
Traceback (most recent call last):
  File "/usr/share/vdsm/guestIF.py", line 101, in __init__
    self._prepare_socket()
  File "/usr/share/vdsm/guestIF.py", line 113, in _prepare_socket
    supervdsm.getProxy().prepareVmChannel(self._socketName)
  File "/usr/share/vdsm/supervdsm.py", line 76, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 66, in <lambda>
    getattr(self._supervdsmProxy._svdsm, self._funcName)(*args,
AttributeError: 'ProxyCaller' object has no attribute 'prepareVmChannel'

clientIFinit::ERROR::2013-03-18 21:55:55,263::clientIF::409::vds::(_recoverExistingVms) Vm's recovery failed
Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 395, in _recoverExistingVms
    not self.irs.getConnectedStoragePoolsList()['poollist']:
AttributeError: 'NoneType' object has no attribute 'getConnectedStoragePoolsList'

Comment 3 Dafna Ron 2013-03-18 20:08:10 UTC
Created attachment 712225 [details]
logs

Comment 4 Dan Kenigsberg 2013-03-24 09:47:20 UTC
Goodness. storage.fuser.fuser() has never worked. When solving this bug, please write a unit test for the function.
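
A unit test along the lines requested here might look like the following sketch. It is not the test that was added to vdsm: the `fuser` function below is a hypothetical stand-in whose command runner is injected, and the `(rc, out_lines, err_lines)` return shape of that runner is an assumption made so the example runs without the real `fuser` binary.

```python
import unittest


def fuser(path, run_cmd):
    """Hypothetical stand-in for storage.fuser.fuser().

    `run_cmd` is an injected callable returning (rc, out_lines, err_lines);
    that signature is assumed for this sketch only.
    """
    rc, out, err = run_cmd(["fuser", path])
    # `out` is a list of lines, so split each line rather than the list.
    return [int(pid) for line in out for pid in line.split()]


class FuserTests(unittest.TestCase):
    def test_pids_from_list_of_lines(self):
        # Fake runner: one output line holding two PIDs.
        fake = lambda cmd: (0, [" 1234  5678"], [])
        self.assertEqual(fuser("/mnt/master", fake), [1234, 5678])

    def test_no_users(self):
        # Fake runner: no process is using the path.
        fake = lambda cmd: (1, [], [])
        self.assertEqual(fuser("/mnt/master", fake), [])
```

Injecting the runner keeps the test independent of the host: it exercises only the parsing step, which is where the traceback in comment 2 shows the failure.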

Comment 5 Cheryn Tan 2013-04-03 07:01:47 UTC
This bug is currently attached to errata RHBA-2012:14332. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.

* Consequence: What happens when the bug presents.

* Fix: What was done to fix the bug.

* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.

Comment 6 Dafna Ron 2013-04-07 12:23:47 UTC
verified on vdsm-4.10.2-14.0.el6ev.x86_64
vdsm did not crash. I also tested a storage issue in which vdsm had to restart, and it was able to recover without the split issue.

Comment 8 errata-xmlrpc 2013-06-10 20:45:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0886.html

