Bug 1438640 - Host added to virt+gluster cluster displays a failed event message when checking for available updates for host with error message 'Command returned failure code 1 during SSH session'
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 4.1.1.2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: bugs@ovirt.org
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks: Gluster-HC-3
 
Reported: 2017-04-04 02:09 UTC by RamaKasturi
Modified: 2017-04-30 11:46 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-30 11:46:02 UTC
oVirt Team: Infra
Embargoed:



Description RamaKasturi 2017-04-04 02:09:58 UTC
Description of problem:
A host that is added to a virt+gluster cluster logs the error "Failed to check for available updates on host <host_name> with message 'Command returned failure code 1 during SSH session'".

Version-Release number of selected component (if applicable):
Red Hat Virtualization Manager Version: 4.1.1.2-0.1.el7

How reproducible:
Always

Steps to Reproduce:
1. Install HC with three hosts

Actual results:
An event is logged in the Events tab which reads "Failed to check for available updates on host <host_name> with message 'Command returned failure code 1 during SSH session'".

Expected results:
There should not be any failure message while checking for updates.

Additional info:
Following error is seen in the engine.log
2017-04-03 06:47:50,865-04 ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (pool-7-thread-3) [77d49287] SSH error running command root.eng.blr.redhat.com:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/ovirt-host-mgmt DIALOG/dialect=str:machine DIALOG/customization=bool:True': Command returned failure code 1 during SSH session 'root.eng.blr.redhat.com'
2017-04-03 06:47:50,865-04 ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (pool-7-thread-3) [77d49287] Exception: java.io.IOException: Command returned failure code 1 during SSH session 'root.eng.blr.redhat.com'
        at org.ovirt.engine.core.uutils.ssh.SSHClient.executeCommand(SSHClient.java:503) [uutils.jar:]
        at org.ovirt.engine.core.uutils.ssh.SSHDialog.executeCommand(SSHDialog.java:317) [uutils.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.execute(VdsDeployBase.java:563) [bll.jar:]
        at org.ovirt.engine.core.bll.host.HostUpgradeManager.checkForUpdates(HostUpgradeManager.java:48) [bll.jar:]
        at org.ovirt.engine.core.bll.host.AvailableUpdatesFinder.checkForUpdates(AvailableUpdatesFinder.java:40) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker.checkForUpdates(HostUpdatesChecker.java:49) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.HostUpdatesCheckerService.lambda$submitCheckUpdatesForHost$1(HostUpdatesCheckerService.java:67) [bll.jar:]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]

2017-04-03 06:47:50,865-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (pool-7-thread-3) [77d49287] Error during host rhsqa-grafton2.lab.eng.blr.redhat.com install: java.io.IOException: Command returned failure code 1 during SSH session 'root.eng.blr.redhat.com'
        at org.ovirt.engine.core.uutils.ssh.SSHClient.executeCommand(SSHClient.java:503) [uutils.jar:]
        at org.ovirt.engine.core.uutils.ssh.SSHDialog.executeCommand(SSHDialog.java:317) [uutils.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.execute(VdsDeployBase.java:563) [bll.jar:]
        at org.ovirt.engine.core.bll.host.HostUpgradeManager.checkForUpdates(HostUpgradeManager.java:48) [bll.jar:]
        at org.ovirt.engine.core.bll.host.AvailableUpdatesFinder.checkForUpdates(AvailableUpdatesFinder.java:40) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker.checkForUpdates(HostUpdatesChecker.java:49) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.HostUpdatesCheckerService.lambda$submitCheckUpdatesForHost$1(HostUpdatesCheckerService.java:67) [bll.jar:]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]

2017-04-03 06:47:50,866-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-7-thread-3) [77d49287] Failed to check if updates are available for host 'rhsqa-grafton2.lab.eng.blr.redhat.com' with error message 'Command returned failure code 1 during SSH session 'root.eng.blr.redhat.com''
2017-04-03 06:47:50,869-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-7-thread-3) [77d49287] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host rhsqa-grafton2.lab.eng.blr.redhat.com with message 'Command returned failure code 1 during SSH session 'root.eng.blr.redhat.com''.
2017-04-03 06:47:51,201-04 INFO  [

Comment 2 RamaKasturi 2017-04-04 09:13:41 UTC
Hi yaniv,
  
  I have copied engine and vdsm logs to the link below.

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1438640/

Thanks
kasturi

Comment 3 Yaniv Kaul 2017-04-04 09:37:50 UTC
(In reply to RamaKasturi from comment #2)
> Hi yaniv,
>   
>   I have copied engine and vdsm logs to the link below.
> 
> http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1438640/
> 
> Thanks
> kasturi

Excellent, since now we can see in Engine the following:
 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [77d49287] Yum: Cannot queue package ovirt-node-ng-image-update: Package ovirt-node-ng-image-update cannot be found
2017-04-03 06:47:50,520-04 INFO  [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [77d49287] Yum: Performing yum transaction rollback
2017-04-03 06:47:50,521-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [77d49287] Failed to execute stage 'Package installation': Package ovirt-node-ng-image-update cannot be found

Is that the case?

Comment 4 RamaKasturi 2017-04-04 09:59:59 UTC
(In reply to Yaniv Kaul from comment #3)
> (In reply to RamaKasturi from comment #2)
> > Hi yaniv,
> >   
> >   I have copied engine and vdsm logs to the link below.
> > 
> > http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1438640/
> > 
> > Thanks
> > kasturi
> 
> Excellent, since now we can see in Engine the following:
>  ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy)
> [77d49287] Yum: Cannot queue package ovirt-node-ng-image-update: Package
> ovirt-node-ng-image-update cannot be found
> 2017-04-03 06:47:50,520-04 INFO 
> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [77d49287]
> Yum: Performing yum transaction rollback
> 2017-04-03 06:47:50,521-04 ERROR
> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [77d49287]
> Failed to execute stage 'Package installation': Package
> ovirt-node-ng-image-update cannot be found
> 
> Is that the case?

Hi Yaniv,

  I changed the title of the bug to reflect the full error message where command returns failure during SSH session.

Thanks
kasturi

Comment 5 RamaKasturi 2017-04-04 10:47:22 UTC
Hi Yaniv,

  If the event message were simply "failed to check updates", I would be fine with it, because no repos are enabled on the node and that is why the update check failed. But the event message makes it appear that the check fails because 'Command returned failure code 1 during SSH session'.


Thanks
kasturi

Comment 7 Martin Perina 2017-04-19 07:59:12 UTC
Every host (whether CentOS/Fedora or NGN) needs to have the oVirt repositories installed. If they are missing, or if one of the required packages is not available, the check for upgrade fails. The error message itself is shown in the Events tab, and the details are in the specific host-deploy log (the exact name is shown in Events). On the engine side, however, we don't know exactly why the host-deploy process failed, which is why we show a stack trace on the engine for all premature host-deploy SSH session exits.
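
To illustrate the point above, here is a minimal sketch of how the generic SSH failure can be tied back to its underlying cause in engine.log. It assumes the line layout seen in the excerpts in this bug (timestamp, level, logger, thread, correlation ID, message) and that host-deploy messages are logged under the "VdsDeploy" thread, as in comment 3. The function name root_causes and the regular expression are illustrative only and are not part of ovirt-engine.

```python
import re

# Assumed engine.log layout, inferred from the excerpts in this bug:
#   [timestamp] LEVEL [logger] (thread) [correlation id] message
# The timestamp is optional because one excerpt line lacks it.
LINE_RE = re.compile(
    r"^(?:\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2}\s+)?"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[0-9a-f]+)\]\s+"
    r"(?P<msg>.*)$"
)


def root_causes(log_lines, correlation_id):
    """Return ERROR messages from the VdsDeploy thread for one correlation ID.

    Hypothetical helper: it only filters log text and does not talk to the
    engine or to the host.
    """
    causes = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("corr") != correlation_id:
            continue
        # Assumption: host-deploy output is relayed under the "VdsDeploy"
        # thread, while the generic SSH failure comes from a pool thread.
        if match.group("level") == "ERROR" and match.group("thread") == "VdsDeploy":
            causes.append(match.group("msg"))
    return causes


if __name__ == "__main__":
    sample = [
        "2017-04-03 06:47:50,520-04 INFO  [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [77d49287] Yum: Performing yum transaction rollback",
        "2017-04-03 06:47:50,521-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [77d49287] Failed to execute stage 'Package installation': Package ovirt-node-ng-image-update cannot be found",
    ]
    for cause in root_causes(sample, "77d49287"):
        print(cause)
```

Filtering by the correlation ID shown in the failed event (77d49287 here) narrows the log to the single update check, so the "cannot be found" message can be read next to the SSH failure it caused.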

Comment 8 Oved Ourfali 2017-04-30 11:46:02 UTC
As Martin stated, if everything is configured properly, there won't be any error.
For other cases, the events and the host-deploy logs are the place to look when troubleshooting.
Closing as wontfix.

