Bug 754054

Summary: [ovirt] [vdsm] Fedora host installation fails as 'reboot' exec happens too fast
Product: [Retired] oVirt Reporter: Haim <hateya>
Component: vdsm Assignee: Dan Kenigsberg <danken>
Status: CLOSED NEXTRELEASE QA Contact: yeylon <yeylon>
Severity: high Docs Contact:
Priority: unspecified    
Version: unspecified CC: abaron, acathrow, bazulay, iheim, mgoldboi, srevivo, yeylon, ykaul
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-11-15 13:36:36 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description: vdsm installation logs Flags: none

Description Haim 2011-11-15 09:31:03 UTC
Created attachment 533716
vdsm installation logs

Description of problem:

Fedora host installation fails on the engine side because the reboot happens too fast, and the host fails to report to the engine that the installation completed successfully.

Please note that the host installation does complete and the host is actually rebooted; furthermore, if you activate the host, it comes up in the active state.

Problem:

Take a look at vds_bootstrap_complete.py: after adding some print statements to the code, I can see that Reboot (line 84) is executed, then the system shuts down and never gets to line 89; hence, the engine fails because it didn't get a valid response.

git vdsm: 6b0a82392c2e4e3095b11787eb324e35eeecbd4f
git engine: 1aef0c633663dd487c926f1a5006c0ada261d8ac

The problem is consistent and happens on every installation that uses a Fedora host.


     50 def main():
     51     """Usage: vds_bootstrap_complete.py  [-c vds_config_str] <random_num> [reboot]"""
     52     try:
     53         vds_config_str = None
     54         opts, args = getopt.getopt(sys.argv[1:], "c:")
     55         for o,v in opts:
     56             if o == "-c":
     57                 # it should look like: 'ssl=true;ksm_nice=5;images=/images/irsd'
     58                 # without white spaces in it.
     59                 vds_config_str = v
     60 
     61         rnum = args[0]
     62     except:
     63         print main.__doc__
     64         return 0
     65     try:
     66         arg = args[1]
     67     except:
     68         arg = 1
     69 
     70     res = True
     71     try:
     72         res = deployUtil.instCert(rnum, VDSM_CONF_FILE)
     73         if res:
     74             res = deployUtil.setCoreDumpPath()
     75 
     76         if res:
     77             res = deployUtil.cleanAll(rnum)
     78 
     79         if res:
     80             res = deployUtil.setVdsConf(vds_config_str, VDSM_CONF_FILE)
     81 
     82         deployUtil.setService("vdsmd", "reconfigure")
     83 
     84         Reboot(arg)
     85     except:
     86         logging.error('bootstrap complete failed', exc_info=True)
     87         res = False
     88 
     89     if res:
     90         print "<BSTRAP component='RHEV_INSTALL' status='OK'/>"
     91     else:
     92         print "<BSTRAP component='RHEV_INSTALL' status='FAIL'/>"
     93     sys.stdout.flush()
     94 
     95 if __name__ == "__main__":
     96     sys.exit(main())
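
The ordering problem in the listing above can be sketched in isolation. This is only a minimal illustration of one possible way to close the race, assuming the status line may be written and flushed before the reboot is triggered; it is not taken from vdsm and may differ from whatever fix is eventually applied. The names report_status, complete_bootstrap, steps, and reboot are hypothetical stand-ins for the deployUtil calls and Reboot(arg). Note one deliberate difference: this sketch reboots only when every step succeeded, whereas the listing calls Reboot(arg) whenever no exception was raised.

```python
import io
import sys


def report_status(ok, out):
    """Write the bootstrap status line and flush it right away."""
    status = "OK" if ok else "FAIL"
    out.write("<BSTRAP component='RHEV_INSTALL' status='%s'/>\n" % status)
    out.flush()


def complete_bootstrap(steps, reboot, out=None):
    """Run the steps, report the result, and only then trigger the reboot.

    steps  -- hypothetical list of zero-argument callables returning True
              on success (stand-ins for the deployUtil calls)
    reboot -- hypothetical zero-argument callable (stand-in for Reboot(arg))
    """
    if out is None:
        out = sys.stdout
    ok = True
    try:
        for step in steps:
            if not step():
                ok = False
                break
    except Exception:
        ok = False
    # The status is reported BEFORE the reboot, so a fast shutdown
    # can no longer prevent the line from being written.
    report_status(ok, out)
    if ok:
        reboot()
    return ok


# Demo with in-memory streams and a fake reboot that records what had
# already been written at the moment it was called.
seen_at_reboot = []
buf_ok = io.StringIO()
result_ok = complete_bootstrap(
    [lambda: True],
    lambda: seen_at_reboot.append(buf_ok.getvalue()),
    buf_ok,
)

buf_fail = io.StringIO()
reboots_on_failure = []
result_fail = complete_bootstrap(
    [lambda: False],
    lambda: reboots_on_failure.append(True),
    buf_fail,
)
```

The trade-off in this ordering is that a failure inside the reboot call itself would happen after an OK status had already been reported, so a caller relying on this sketch would need to treat the reboot as best effort.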

Comment 1 Dan Kenigsberg 2011-11-15 10:50:31 UTC
heh, what a lovely race.

Would this do?

http://gerrit.ovirt.org/232

Comment 2 Haim 2011-11-15 12:43:24 UTC
(In reply to comment #1)
> heh, what a lovely race.
> 
> Would this do?
> 
> http://gerrit.ovirt.org/232

Tried it, still fails; clearing needinfo.

Comment 3 Dan Kenigsberg 2011-11-15 13:36:36 UTC
Second failure fixed; patch pushed upstream.