Description of problem:
The scale team is experiencing some strange behavior: once the engine is installed and populated, after we restart or reinstall the engine, our hosts show up as non-responsive. Reinstalling the hosts does not work.

Version-Release number of selected component (if applicable):
3.6.5.1

How reproducible:
Unclear

Steps to Reproduce:
1. Set up an environment with some hosts
2. Restart the engine or reinstall it
3. Hosts become non-responsive

Actual results:
Hosts are non-responsive after running reinstall host

Expected results:
Reinstalling hosts should pass with no issues

Additional info:
What do you mean by engine reinstall? You mean upgrade? Please attach complete logs.
Created attachment 1147600 [details] server log at debug level
Created attachment 1147601 [details] engine log at debug level
Do you have the host deploy logs?
This might be an environment issue, also related to fake hosts, as the environment is mixed. As no one else had that, I don't think it should be marked as blocker. Gil?
Eldad, is this happening on fake hosts only or on bare metal hosts?
(In reply to Oved Ourfali from comment #11)
> This might be an environment issue, also related to fake hosts, as the
> environment is mixed.
>
> As no one else had that, I don't think it should be marked as blocker.
>
> Gil?

Based on comment #4, this was reproduced with a bare metal host.
Oved, can we give this some attention to confirm whether it is an environment issue or a real regression?
Can you try to reproduce on a clean environment, with only real hosts? We're examining the logs, but want a cleaner reproduction, if any.
Besides having steps to reproduce, it would be great to have SSL debug logs. You can enable them by passing the parameter -Djavax.net.debug=all to the engine JVM.
It seems the issue disappeared after we found two engine processes running. Hosts can now be reinstalled with no issues.
It happened to me again. I found two engine processes once I ran 'service ovirt-engine restart'.
Based on comment #16, this looks like an environment issue.
There shouldn't be two engine processes running. If there are, perhaps it is because the service was killed but not the engine itself, so restarting the service starts the engine again. So this might be related to Bug 1320903. Changing to NOTABUG.
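For anyone hitting the same symptom, here is a minimal sketch of a check for duplicate engine processes before restarting the service. The marker string "ovirt-engine" and both function names are assumptions made for illustration, not part of oVirt; the marker may also match service wrapper processes, so adjust it to whatever uniquely identifies the engine JVM command line on your system.

```python
import subprocess

# Assumption: this substring appears in the engine's command line.
# It is a hypothetical default; tune it for your installation.
ENGINE_MARKER = "ovirt-engine"


def find_matching_pids(ps_output, marker=ENGINE_MARKER):
    """Return the PIDs whose command line contains `marker`.

    `ps_output` is text with one process per line: the PID first,
    then the full command line.
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip blank or malformed lines.
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        if marker in parts[1]:
            pids.append(int(parts[0]))
    return pids


def running_engine_pids(marker=ENGINE_MARKER):
    """List matching PIDs on this machine using `ps` (header suppressed)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "args="],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_matching_pids(result.stdout, marker)


if __name__ == "__main__":
    pids = running_engine_pids()
    if len(pids) > 1:
        print("More than one matching process:", pids)
    else:
        print("Matching processes:", pids)
```

If more than one PID is reported, check which of them is a leftover from a previous run before restarting the service.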