Bug 1264195

Summary: Typo in ovirtnode/storage.py reread_partitions causes RHEV-H install to fail
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-node
Version: 3.5.0
Hardware: All
OS: Linux
Severity: medium
Priority: high
Status: CLOSED ERRATA
Reporter: Derrick Ornelas <dornelas>
Assignee: Fabian Deutsch <fdeutsch>
QA Contact: Ying Cui <ycui>
CC: bgraveno, cshao, dornelas, fdeutsch, gklein, lsurette, mgoldboi, ycui, ykaul
Keywords: ZStream
Target Milestone: ovirt-3.6.1
Target Release: 3.6.0
Fixed In Version: ovirt-node-3.6.0-0.24.20151209gitc0fa931.el7ev
Doc Type: Bug Fix
Doc Text: Fixed an error in the code that caused device discovery to take longer than it should when installing Red Hat Enterprise Virtualization.
Type: Bug
oVirt Team: Node
Bug Blocks: 1280213
Last Closed: 2016-03-09 14:38:52 UTC
Attachments:
- screenshot of traceback
- simple patch
- screenshot.png

Description Derrick Ornelas 2015-09-17 20:15:11 UTC
Created attachment 1074577 [details]
screenshot of traceback

Description of problem:  

A typo in ovirtnode/storage.py causes the RHEV-H install to fail randomly with this traceback:

  File "/usr/lib/python2.6/site-packages/ovirtnode/storage.py", line 251, in reread_partitions
    "secs") % (timeout -i)
TypeError: unsupported operand type(s) for %: 'NoneType' and 'int'
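
For illustration only (not RHEV-H code): Logger.error() returns None, so when the "%" is applied to its return value rather than to the format string, you get exactly this TypeError. A minimal standalone reproducer:

---
import logging

logging.basicConfig()
logger = logging.getLogger(__name__)

timeout, i = 15, 0
try:
    # The '%' binds to the return value of logger.error(), which is None
    logger.error("sdb is not available, waiting %s more " +
                 "secs") % (timeout - i)
except TypeError as e:
    print(e)  # unsupported operand type(s) for %: 'NoneType' and 'int'
---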



ovirt-node-3.2.3/src/ovirtnode/storage.py:
---
 243         if "dev/mapper" in drive:
 244             _functions.system("service multipathd reload")
 245             _functions.system("multipath -r &>/dev/null")
 246             # wait for device to exit
 247             i = 0
 248             timeout = 15
 249             while not os.path.exists(drive):
 250                 logger.error(drive + " is not available, waiting %s more " +
 251                              "secs") % (timeout - i)
 252                 i = i + i
 253                 time.sleep(1)
 254                 if i == timeout:
 255                     logger.error("Timed out waiting for: %s" % drive)
 256                     return False
---


Version-Release number of selected component (if applicable):
ovirt-node-3.2.3-20.el6


How reproducible:  unknown


Steps to Reproduce:
1.  Start install of RHEV-H 6.7


Actual results:
Sometimes the install fails with the traceback shown in the attached screenshot

Expected results:
Install finishes successfully


Additional info:

The line:

 251                              "secs") % (timeout - i)

should be:

 251                              "secs" % (timeout - i))



It's likely that the line:

 252                 i = i + i

should be:

 252                 i = i + 1

as well. Since i starts at 0, "i = i + i" leaves i at 0 forever, so the "if i == timeout" check never fires and the wait loop can never time out.
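
For reference, a sketch of the loop with both corrections applied (not necessarily the committed patch; note that the "%" must apply to the entire concatenated message, which needs an extra pair of parentheses, see comment 2):

---
 249             while not os.path.exists(drive):
 250                 logger.error((drive + " is not available, waiting %s more " +
 251                              "secs") % (timeout - i))
 252                 i = i + 1
 253                 time.sleep(1)
 254                 if i == timeout:
 255                     logger.error("Timed out waiting for: %s" % drive)
 256                     return False
---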

Comment 1 Derrick Ornelas 2015-09-17 20:37:33 UTC
Created attachment 1074578 [details]
simple patch

Comment 2 Fabian Deutsch 2015-09-18 07:50:16 UTC
Thanks for the patch. I had to modify it slightly, because the format string needed another pair of braces.
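
Presumably the pitfall the extra parentheses address is operator precedence: "%" binds tighter than "+", so in the patch as originally proposed it applies only to the trailing "secs" literal, which contains no %s placeholder. A standalone illustration (not the actual patch):

---
import logging

logging.basicConfig()
logger = logging.getLogger(__name__)

drive, timeout, i = "/dev/mapper/example", 15, 0
try:
    # '%' applies only to "secs" here, which has no %s placeholder
    logger.error(drive + " is not available, waiting %s more " +
                 "secs" % (timeout - i))
except TypeError as e:
    print(e)  # not all arguments converted during string formatting
---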

Comment 3 Ying Cui 2015-10-21 12:13:53 UTC
I am changing the QA Contact to myself to follow this bug further.
I tested clean automatic installation and TUI reinstallation six times with builds rhev-hypervisor6-6.7-20150828.0.el6ev and rhev-hypervisor6-6.7-20150911.0.el6ev on ibm-3650m4-02 (PXE, not remote media via IMM) and dell-R910 (PXE) servers with FC HBAs.
I did not encounter the installation failure and did not see the traceback error during my testing.

Based on this code I ran the following (see the attached screenshot for details), but the device is recreated so quickly by the forced multipath reload that it is not easy to trigger the "while not os.path.exists" wait:
# systemctl reload multipathd; multipath -r; ll /dev/mapper/3600a0b80005ada7600004b9455a5ddb0
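
One way to exercise the wait loop deterministically, without racing multipathd, is to stub out the existence check. A hypothetical test sketch (wait_for_drive is an illustrative stand-in for the fixed loop in reread_partitions, not real ovirt-node code):

---
import os
import time

def wait_for_drive(drive, timeout=15, exists=os.path.exists):
    # Hypothetical helper mirroring the fixed loop in reread_partitions
    i = 0
    while not exists(drive):
        print("%s is not available, waiting %s more secs" % (drive, timeout - i))
        i = i + 1
        time.sleep(1)
        if i == timeout:
            print("Timed out waiting for: %s" % drive)
            return False
    return True

# Simulate a device that never appears: the loop now times out
assert wait_for_drive("/dev/mapper/missing", timeout=3,
                      exists=lambda path: False) is False
---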

Comment 4 Ying Cui 2015-10-21 12:14:59 UTC
Created attachment 1085105 [details]
screenshot.png

Comment 5 Ying Cui 2015-10-21 12:25:26 UTC
Hi Derrick,
  We cannot reproduce this issue in QE's test environment, but QE verified that the patch modification is valid. Could you help provide the RHEV-H 3.5.6 build to the customer to verify this bug once the build is ready from devel? QE will also do sanity testing for this bug fix.

Thanks
Ying

Comment 12 Ying Cui 2015-12-29 05:09:43 UTC
According to comment 9, the code patch was checked and verification passed.

Sanity testing of this bug on an FC machine PASSED.
Sanity coverage scenarios:
1. clean automatic installation PASS
2. TUI clean installation PASS
3. TUI reinstallation PASS

# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151221.1.el7ev)
# rpm -qa ovirt-node
ovirt-node-3.6.0-0.24.20151209gitc0fa931.el7ev.noarch

Comment 14 errata-xmlrpc 2016-03-09 14:38:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0378.html