Bug 927162
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | bootstrap: incorrect progress to next state if phase1 of vdsm-bootstrap fails | | |
| Product | Red Hat Enterprise Virtualization Manager | Reporter | Dafna Ron <dron> |
| Component | ovirt-engine | Assignee | Alon Bar-Lev <alonbl> |
| Status | CLOSED CURRENTRELEASE | QA Contact | Pavel Stehlik <pstehlik> |
| Severity | high | Docs Contact | |
| Priority | unspecified | | |
| Version | 3.1.3 | CC | acathrow, dyasny, hateya, iheim, lpeer, mgoldboi, Rhev-m-bugs, sgrinber, yeylon, ykaul, yzaslavs |
| Target Milestone | --- | | |
| Target Release | 3.2.0 | | |
| Hardware | x86_64 | | |
| OS | Linux | | |
| Whiteboard | infra | | |
| Fixed In Version | sf2 | Doc Type | Bug Fix |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | Infra | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | 875528 | | |
| Bug Blocks | | | |
| Attachments | | | |
As far as I can see, these are all timeout-related issues.

2013-03-25 10:50:22,028 ERROR [org.ovirt.engine.core.utils.hostinstall.VdsInstallerSSH] (pool-3-thread-46) SSH error running command cougar01.scl.lab.tlv.redhat.com:'umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar -C "${MYTMP}" --no-same-permissions -o -x && "${MYTMP}"/setup -c 'ssl=true;management_port=54321' -O 'RedHat' -t 2013-03-25T08:40:22 -f /tmp/firewall.conf.6d69c622-a4e6-4a50-ac4e-5f33765c342f -S /tmp/ovirt-id_rsa_6d69c622-a4e6-4a50-ac4e-5f33765c342f -p 80 -b -B rhevm http://dafna-31.scl.lab.tlv.redhat.com:80/Components/vds/ http://dafna-31.scl.lab.tlv.redhat.com:80/Components/vds/ cougar01.scl.lab.tlv.redhat.com 6d69c622-a4e6-4a50-ac4e-5f33765c342f False': javax.naming.TimeLimitExceededException: SSH session hard timeout host 'cougar01.scl.lab.tlv.redhat.com:22'

The problem is that even in this case the engine still tries to download the certificate request.

I see where the error is in the code. Is it worth fixing? All of this code is dead in the next version.

3.2 has been released
Created attachment 715924: log

Description of problem:
This has happened to me more than once now: when I try to install two hosts at the same time, they fail on "Downloading certificate request from Host". After I reinstall them, the install succeeds. This is in 3.1.3.

Version-Release number of selected component (if applicable):
3.1.3

How reproducible:
100%

Steps to Reproduce:
1. Install two hosts at the same time.
2. After the install fails on "Downloading certificate request from Host", reinstall.

Actual results:
The install fails in the step "Downloading certificate request from Host" when installing more than one host at the same time. If we reinstall the host, the step succeeds and the install continues.

Expected results:
The install should not fail.

Additional info:
log
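
The summary and comments describe the engine moving on to "Downloading certificate request from Host" even though the phase-1 bootstrap command had already failed with an SSH hard timeout. The sketch below illustrates the expected behaviour in general terms: a step runner that stops at the first failing step and reports that step, rather than advancing to the next state. It is not the ovirt-engine code (which is Java); `StepFailed`, `InstallResult`, `run_install`, and the step names are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


class StepFailed(Exception):
    """Raised by a step to signal that the installation cannot continue."""


@dataclass
class InstallResult:
    completed: List[str] = field(default_factory=list)
    failed_step: Optional[str] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.failed_step is None


def run_install(steps: List[Tuple[str, Callable[[], None]]]) -> InstallResult:
    """Run steps in order and stop at the first one that fails."""
    result = InstallResult()
    for name, action in steps:
        try:
            action()
        except StepFailed as exc:
            # Record the step that actually failed and do NOT advance.
            result.failed_step = name
            result.error = str(exc)
            break
        result.completed.append(name)
    return result


# Hypothetical steps that mimic the scenario in this report.
calls: List[str] = []


def bootstrap_phase1() -> None:
    calls.append("bootstrap_phase1")
    raise StepFailed("SSH session hard timeout")


def download_certificate_request() -> None:
    calls.append("download_certificate_request")


result = run_install([
    ("bootstrap_phase1", bootstrap_phase1),
    ("download_certificate_request", download_certificate_request),
])
```

With this structure the failure is attributed to `bootstrap_phase1`, and `download_certificate_request` is never invoked, so the reported error would point at the timeout rather than at the certificate download.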