Bug 1419853 - [BLOCKED] Image upload fails when one of the ovirt-imageio-daemons was not running
Summary: [BLOCKED] Image upload fails when one of the ovirt-imageio-daemons was not ru...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: ---
Assignee: Daniel Erez
QA Contact: Natalie Gavrielov
URL:
Whiteboard:
Depends On: 1435988
Blocks:
Reported: 2017-02-07 09:00 UTC by Natalie Gavrielov
Modified: 2017-07-10 08:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1435988 (view as bug list)
Environment:
Last Closed: 2017-07-10 08:13:52 UTC
oVirt Team: Storage
ylavi: ovirt-4.2+


Attachments
logs: engine, imageio-proxy, imageio-daemon, vdsm (1.11 MB, application/zip)
2017-02-07 09:00 UTC, Natalie Gavrielov


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 71947 0 master MERGED webadmin: upload image dialog - use host select-box 2017-02-09 14:13:04 UTC
oVirt gerrit 72027 0 ovirt-engine-4.1 MERGED webadmin: clean ReadOnlyDiskModel for upload image 2017-02-12 09:06:27 UTC
oVirt gerrit 72028 0 ovirt-engine-4.1 MERGED webadmin: upload image dialog - use host select-box 2017-02-12 09:06:14 UTC

Description Natalie Gavrielov 2017-02-07 09:00:13 UTC
Created attachment 1248315
logs: engine, imageio-proxy, imageio-daemon, vdsm

Description of problem:
When uploading an image using the GUI while the ovirt-imageio-daemon on one host is down and the daemon on the other host is up, the operation fails and the disk is left in "paused by the system" status.

Version-Release number of selected component:

Engine:
ovirt-engine-4.1.0.3-0.1.el7.noarch
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-imageio-proxy-1.0.0-0.el7ev.noarch

Hosts:
vdsm-4.19.4-21.git310b0a0.el7.centos.x86_64
ovirt-imageio-common-1.0.0-1.el7.noarch
ovirt-imageio-daemon-1.0.0-1.el7.noarch


How reproducible:
100% (2 out of 2)

Steps to Reproduce:
1. Have an environment with two hosts: one with ovirt-imageio-daemon running and one with the daemon stopped.
2. Try to upload an image using the UI.

Actual results:
The transfer failed:
engine.log:
2017-02-07 10:34:48,307+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.AddImageTicketVDSCommand] (DefaultQuartzScheduler2) [edb72b48-af00-46dc-8ac9-3bb224608421] Failed in 'AddImageTicketVDS' method
2017-02-07 10:34:48,320+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler2) [edb72b48-af00-46dc-8ac9-3bb224608421] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM green-vdsb.qa.lab.tlv.redhat.com command AddImageTicketVDS failed: Cannot communicate with image daemon: 'reason=Error communicating with ovirt-imageio-daemon: [Errno 111] Connection refused'

Expected results:
The upload succeeds, since there is another host with a running daemon.

Additional info:
Tried to resume the upload twice: the first attempt failed with the same error, and the second succeeded because it used the other daemon.
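
The successful retry hints at the fallback that the expected result describes. A minimal sketch in Python, assuming a hypothetical `add_ticket(host)` callable that raises `ConnectionRefusedError` when that host's daemon is down. None of these names come from ovirt-engine or vdsm, and this is not how the engine selects a host:

```python
import errno


def add_ticket_with_fallback(hosts, add_ticket):
    """Try each host in order and return the first one that accepts the ticket.

    `hosts` and `add_ticket` are hypothetical stand-ins for the engine's
    host list and its AddImageTicket call.
    Returns (chosen_host, hosts_that_refused).
    """
    refused = []
    for host in hosts:
        try:
            add_ticket(host)
        except ConnectionRefusedError:
            # Daemon on this host is not accepting connections; try the next.
            refused.append(host)
            continue
        return host, refused
    # Every host refused the connection, so surface a single clear error.
    raise ConnectionRefusedError(
        errno.ECONNREFUSED,
        "no host with a reachable image daemon: %s" % ", ".join(refused),
    )
```

With two hosts where only the first daemon is down, the call would return the second host together with the list of hosts that refused, which could then be reported to the user.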

Comment 1 Daniel Erez 2017-02-09 07:42:51 UTC
ovirt-imageio-daemon is a required service and should be running; if it isn't, that's an issue the user should address. To make the behavior more consistent and clear, we'll add a host select-box to the upload image dialog, so uploading will be done only through the specified host.

Comment 2 Natalie Gavrielov 2017-03-14 14:59:20 UTC
(In reply to Daniel Erez from comment #1)
> ovirt-imageio-daemon is a required service and should be running, otherwise,
> it's an issue that should be addressed by the user. To make the behavior
> more consistent and clear, we'll add an host select-box in upload image
> dialog, so uploading will be done only using the specified host.

Does this select-box just list all hosts, or does it apply some logic first (checking which hosts have a running ovirt-imageio-daemon and listing only those)?

Comment 3 Daniel Erez 2017-03-14 15:17:40 UTC
(In reply to Natalie Gavrielov from comment #2)
> Does this select-box just lists all hosts? or does it have some logic prior
> to that (checks which hosts have running ovirt-imageio-daemons and lists
> only them)?

We list the active hosts. ovirt-imageio-daemon should be running on every active host. The engine doesn't have any indication of whether the daemon is actually running.
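
As an illustration of what such an indication could look like, here is a minimal Python sketch of a reachability probe. It assumes the daemon accepts TCP connections on a known port; the function name and the `port` parameter are hypothetical, and the engine exposes no such check today:

```python
import socket


def daemon_reachable(host, port, timeout=2.0):
    """Return True if something accepts a TCP connection on host:port.

    A refused connection, timeout, or resolution failure all count as
    "not reachable". This only shows that a listener exists, not that it
    is a healthy ovirt-imageio-daemon.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError and socket timeouts are OSError subclasses.
        return False
```

A probe like this could be used to filter the host list before an upload starts, at the cost of one extra connection per host.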

Comment 4 Natalie Gavrielov 2017-03-14 16:30:04 UTC
(In reply to Daniel Erez from comment #3)
> We list the active hosts. ovirt-imageio-daemon should be running on every
> active host. The engine doesn't have any indication if the daemon is
> actually running.

I agree that ovirt-imageio-daemon should be running on every active host.
But there might be some cases in which the imageio-daemon is not running - mainly because of bugs (which we've already seen).
So, let's say, for instance, a user has 5 hosts in total, and for some reason (emmm..bug..) two of the ovirt-imageio-daemons are not running - but the other 3 are - why can't we use them, automatically?
Or at least give the user the relevant info about what can be used..

I don't see how this select-box solves the issue here.. 
Should the user connect to some of the hosts and check whether daemon process is active before performing an upload? 
Or if he doesn't, and the upload fails, do we expect the user to select a different host next time he tries?

Comment 5 Daniel Erez 2017-03-14 16:54:40 UTC
(In reply to Natalie Gavrielov from comment #4)
> I agree that ovirt-imageio-daemon should be running on every active host.
> But there might be some cases in which the imageio-daemon is not running -
> mainly because of bugs (which we've already seen).
> So, let's say, for instance, a user has 5 hosts in total, and for some reason
> (emmm..bug..) two of the ovirt-imageio-daemons are not running - but the
> other 3 are - why can't we use them, automatically?

Bugs should be fixed :) We shouldn't just assume there's a bug... a bug can cause just about anything in the system, so we can't protect ourselves against everything.

> Or at least supply the user the relevant info for what can be used..

We currently don't expose the status of the daemon to the engine. You may open an RFE for that, although I think it's redundant.

> 
> I don't see how this select-box solves the issue here.. 
> Should the user connect to some of the hosts and check whether daemon
> process is active before performing an upload? 

The user shouldn't check it, the daemon should be running.

> Or if he doesn't, and the upload fails, do we expect the user to select a
> different host next time he tries?

If it fails for some reason, the user may try another host. Or, regardless, the user can select the host just to balance traffic, or any other reason.

Comment 6 Natalie Gavrielov 2017-03-15 15:56:40 UTC
I'm sorry, I don't see how this fix addresses the issue.
The expected result (IMHO), when a user tries to upload a disk and there is a running daemon on some host, is to use that host (and not fail because there is a dead daemon somewhere else).
I don't see how listing the hosts helps.

Comment 7 Red Hat Bugzilla Rules Engine 2017-03-15 15:56:48 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 8 Daniel Erez 2017-03-15 17:04:49 UTC
The ovirt-imageio-daemon service should be running on every host. Otherwise, it's a bug that should be addressed. Hence, I don't think we need to expose the status of the daemon service to the engine. 

@Yaniv - do we want to open an RFE for auto-detecting whether a daemon is running on a host, so the upload would start auto-magically on that host (instead of / in addition to selecting the host manually)? Though that seems inconsistent with the other flows in the system (and probably overkill).

Comment 9 Yaniv Lavi 2017-03-19 13:36:25 UTC
(In reply to Daniel Erez from comment #8)
> The ovirt-imageio-daemon service should be running on every host. Otherwise,
> it's a bug that should be addressed. Hence, I don't think we need to expose
> the status of the daemon service to the engine. 
> 
> @Yaniv - do we want to open an RFE for auto-detecting whether a daemon is
> running on an host, so the upload would start auto-magically on that host
> (instead of / in addition to selecting the host manually). Though it seems
> inconsistent behavior comparing the other flows in the system (and probably
> an overkill)

Reopening for discussion.
Do we have any way to let the user know that the agent is down on any of the hosts?

Comment 10 Red Hat Bugzilla Rules Engine 2017-03-19 13:36:31 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 11 Daniel Erez 2017-03-19 14:30:11 UTC
(In reply to Yaniv Dary from comment #9)
> Reopening for discussion.
> Do we have any way to let the user know that the agent is down on any of the
> hosts?

No, we don't have any way. I think it's redundant.

Comment 12 Yaniv Lavi 2017-03-19 14:49:50 UTC
(In reply to Daniel Erez from comment #11)
> No, we don't have any way. I think it's redundant.

How will the user know what the issue is? Or how to fix it without such a prompt?

Comment 13 Daniel Erez 2017-03-19 14:53:36 UTC
(In reply to Yaniv Dary from comment #12)
> How will the user know what the issue is? Or how to fix it without such a
> prompt?

There's a warning in the events tab.

Comment 14 Yaniv Lavi 2017-03-19 15:04:32 UTC
(In reply to Daniel Erez from comment #13)
> There's a warning in the events tab.

What is the text that is displayed?

Comment 15 Daniel Erez 2017-03-19 17:56:03 UTC
Unable to upload image to disk 6c844b2d-e864-4e1b-8c34-7c3199f64da7 due to a network error. Make sure ovirt-imageio-proxy service is installed and configured, and ovirt-engine's certificate is registered as a valid CA in the browser.

Comment 16 Yaniv Lavi 2017-03-20 11:34:23 UTC
Closing, seems good enough.

Comment 17 Daniel Erez 2017-03-22 09:45:29 UTC
@Yaniv - reopened by mistake I guess?

Comment 18 Yaniv Lavi 2017-03-30 09:13:38 UTC
Please see the bug listed under "Depends On".

Comment 19 Allon Mureinik 2017-07-03 12:20:53 UTC
(In reply to Yaniv Lavi from comment #18)
> Please see the bug on depends on.

Yaniv, bug 1435988, which this one depends on, was CLOSED WONTFIX. As we aren't going to develop something special for these daemons, I think this one should be closed too.

Comment 20 Yaniv Lavi 2017-07-10 08:13:52 UTC
Matching the status of that bug. QE, if you feel this should be reopened, please start with the dependent RFE.

