Bug 1203417 - [RFE] migration failures with routing errors should give better logs
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Dan Kenigsberg
QA Contact: Gil Klein
URL:
Whiteboard: virt
Depends On: 1090626
Blocks:
Reported: 2015-03-18 18:52 UTC by Markus Stockhausen
Modified: 2015-04-02 10:01 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-02 09:59:33 UTC
oVirt Team: ---
Embargoed:

Description Markus Stockhausen 2015-03-18 18:52:44 UTC
Description of problem:

After setting up a new host in oVirt 3.5.2, we encountered a routing error on the host. Specifically, the migration network was not reachable (although its IP address was up). As a result, all migrations to and from the host failed. The cause could not be identified in the web interface, and we had to crawl through several logs until the issue was found.

Version-Release number of selected component (if applicable):

oVirt 3.5.2 / vdsm 4.16.10

How reproducible:

100% (whenever the routing error is present)

Steps to Reproduce:
1. Define a separate migration network for the hosts of an oVirt cluster
2. Simulate an error on that host (e.g. detach the cable, close the firewall)
3. Start a migration to/from that host

Actual results:

Migration fails with the web interface message "Migration failed due to error: Fatal error during migration".

The real error is only visible in the vdsm logs: libvirtError: Could not connect to server X.X.X.X:49152: No route to target host
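
For anyone hitting the same symptom, the connection failure can be checked by hand from the source host. The following is only a minimal sketch, not part of vdsm or libvirt: the function name check_migration_endpoint is made up for illustration, and the default port 49152 is simply the one that appears in the log line above.

```python
import os
import socket


def check_migration_endpoint(host, port=49152, timeout=3.0):
    """Try a plain TCP connection to the destination's migration address.

    Hypothetical helper for manual troubleshooting; returns (ok, reason).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        return False, "timed out"
    except OSError as exc:
        # A missing route typically surfaces here as EHOSTUNREACH;
        # report the operating system's own description of the error.
        if exc.errno is not None:
            return False, os.strerror(exc.errno)
        return False, str(exc)
```

Called with the destination's migration network address, this returns a short reason string that can be compared with what the vdsm log reports.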

Expected results:

The reason should be visible in the web interface.
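
As an illustration of what is being asked for, here is a rough sketch of how the connection failure could be turned into a readable message. It is not how vdsm or the engine work today: the helper name summarize_migration_error and the output wording are assumptions, and the pattern is based only on the single log line quoted in this report.

```python
import re

# Pattern derived from the one log line quoted in this report (assumption:
# other libvirt connection errors may be worded differently).
_CONNECT_ERROR = re.compile(
    r"Could not connect to server (?P<host>\S+):(?P<port>\d+): (?P<reason>.+)$"
)


def summarize_migration_error(log_message):
    """Return a short user-facing summary, or None if the message is unrelated."""
    match = _CONNECT_ERROR.search(log_message)
    if match is None:
        return None
    return "Migration network unreachable: {host} port {port} ({reason})".format(
        **match.groupdict()
    )
```

A message that does not match the pattern yields None, so the existing generic "Fatal error during migration" text could remain the fallback.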

Comment 1 Michal Skrivanek 2015-03-27 09:11:22 UTC
Dan, apart from the improvement in error reporting from libvirt...

If the network is defined as required (not optional), shouldn't its failure bring the host to non-operational?
And a network designated for migration should always be mandatory, IMO.

Comment 2 Dan Kenigsberg 2015-03-27 11:38:42 UTC
Yes; if a required network is missing, the host goes to non-operational. However, there is an inherent race: Engine can initiate a migration before it has become aware that the migration network is down.

I don't think we should "nanny the user". Some users need migration very, very rarely. For them, the migration network can be off most of the time, but they still want to fire up VM instances. The concept of a "non-required network" was designed exactly for such cases: when the user KNOWS that he can live with the network being down. We should not restrain him.

Comment 3 Michal Skrivanek 2015-04-02 09:59:33 UTC
(In reply to Dan Kenigsberg from comment #2)
> Yes; if a required network is missing, the host goes to non-operational.
> However, there is an inherent race: Engine can initiate a migration before
> it has become aware that the migration network is down.
> 
> I don't think we should "nanny the user". Some users need migration very,
> very rarely. For them, the migration network can be off most of the time,
> but they still want to fire up VM instances. The concept of a "non-required
> network" was designed exactly for such cases: when the user KNOWS that he
> can live with the network being down. We should not restrain him.

I would agree. For these cases we can assume the user is aware of the concepts and behavior, and when there is an error he/she would check connectivity. Normal users will have the network mandatory.
I think it's a fair assumption, hence I don't think this RFE makes much sense.

Note: for improving migration error reporting there is an existing bug 1090626.

