Description of problem:
VDSM gets stuck during the startup process if an NFS storage domain is unreachable.

Version-Release number of selected component (if applicable):
vdsm-4.9.0-0

How reproducible:
100%

Steps to Reproduce:
1. connect vdsm to an NFS storage domain (as SPM)
2. block the connection to the NFS storage
3. wait for vdsm to restart
4. vdsm gets stuck on __cleanStorageRepository

Actual results:
VDSM gets stuck on __cleanStorageRepository.

Expected results:
VDSM shouldn't get stuck on __cleanStorageRepository.

Additional info:
The problem occurs when vdsm descends into the pool directory:

  /rhev/data-center/<spUUID>

The links there (e.g. mastersd and the sdUUIDs) point to unreachable files, so os.walk() gets stuck running os.path.isdir() on them.

Thread-11::DEBUG::2011-10-20 17:46:52,868::hsm::239::Storage.HSM::(__cleanStorageRepository) Cleaning leftovers.

# ps -Lf $(cat /var/run/vdsm/vdsmd.pid)
UID        PID  PPID   LWP  C NLWP STIME TTY      STAT   TIME CMD
[...]
vdsm     13888 12805 13988  0   15 17:46 ?        D<l    0:00 /usr/bin/python /usr/share/vdsm//vdsm
[...]

Thread 4 (Thread 0x7f59517fb700 (LWP 13988)):
#1 <built-in function stat>
#3 file '/usr/lib64/python2.6/genericpath.py', in 'isdir'
#6 file '/usr/lib64/python2.6/os.py', in 'walk'
#8 file '/usr/lib64/python2.6/os.py', in 'walk'
#10 file '/usr/share/vdsm/storage/hsm.py', in '__cleanStorageRepository'
#14 file '/usr/share/vdsm/storage/hsm.py', in 'storageRefresh'
#19 file '/usr/lib64/python2.6/threading.py', in 'run'
#22 file '/usr/lib64/python2.6/threading.py', in '__bootstrap_inner'
#25 file '/usr/lib64/python2.6/threading.py', in '__bootstrap'
commit 80840522ffade1714791cf7ea2b3c1758f181090
Author: Federico Simoncelli <fsimonce>
Date:   Fri Oct 21 10:50:01 2011 +0000

    BZ#747917 Don't get information about mountpoints

    The regular os.walk() function tries to identify the files present in
    the given path. Avoiding to descend into the mountpoint is not enough
    to prevent vdsm from getting stuck if a NFS mount is unreachable, we
    should also prevent any other operation, eg: os.path.isdir().

    Change-Id: I16a9e54586daa766e420fa8571b19a0b744b602d

http://gerrit.usersys.redhat.com/1050
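For readers without access to the patch, the idea in the commit message can be illustrated roughly as follows. This is only a sketch, not the code that was merged: the function name walk_no_follow and its skip parameter are hypothetical. It assumes the caller already knows which paths are mountpoints (for example from the mount table) and that the symlinks themselves live on a local filesystem. Entries listed in skip are never passed to any stat call, and everything else is classified with os.lstat(), which inspects the link itself rather than its possibly unreachable target.

```python
import os
import stat


def walk_no_follow(top, skip=()):
    """Yield (dirpath, dirnames, othernames) top-down, similar to os.walk().

    Hypothetical helper: paths listed in `skip` (e.g. known mountpoints) are
    never stat'ed, and symlinks are never followed, so a dead link target is
    not touched.
    """
    skip = set(os.path.normpath(p) for p in skip)
    try:
        names = os.listdir(top)
    except OSError:
        return

    dirs, others = [], []
    for name in names:
        path = os.path.join(top, name)
        # Drop blacklisted paths before any stat()/lstat() call is made.
        if os.path.normpath(path) in skip:
            continue
        try:
            # lstat() does not resolve symlinks, unlike os.path.isdir().
            mode = os.lstat(path).st_mode
        except OSError:
            continue
        if stat.S_ISDIR(mode):
            dirs.append(name)
        else:
            # Regular files, symlinks (even to directories), sockets, etc.
            others.append(name)

    yield top, dirs, others
    for name in dirs:
        for entry in walk_no_follow(os.path.join(top, name), skip):
            yield entry
```

With this approach a link such as mastersd is reported as a non-directory entry and can be removed with os.unlink() without ever resolving its target.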
Posted upstream: http://gerrit.ovirt.org/202
Closing ON_QA bugs, as oVirt 3.1 has been released: http://www.ovirt.org/get-ovirt/