Bug 982686 - engine [logger]: same exception is logged multiple times in engine log
engine [logger]: same exception is logged multiple times in engine log
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine (Show other bugs)
3.2.0
x86_64 Linux
unspecified Severity low
: ---
: 3.4.0
Assigned To: Piotr Kliczewski
sefi litmanovich
infra
:
Depends On: 960635
Blocks: rhev3.4beta 1142926
  Show dependency treegraph
 
Reported: 2013-07-09 10:54 EDT by Alissa
Modified: 2016-02-10 14:40 EST (History)
13 users (show)

See Also:
Fixed In Version: ovirt-3.4.0-alpha1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 960635
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Alissa 2013-07-09 10:54:33 EDT
+++ This bug was initially created as a clone of Bug #960635 +++

Description of problem:

I ran a vm on hsm host and restarted the vdsm on the hsm during the create volume step.
the live storage migration command fails and we report it 5 (not counting the INFO which also reports it)

(see more info)

Version-Release number of selected component (if applicable):

sf15
vdsm-4.10.2-17.0.el6ev.x86_64

How reproducible:

100%

Steps to Reproduce:
1. create and run a vm on iscsi pool with 2 hosts
2. live migrate the vm's disk
3. restart the vdsm on the hsm when task createVolume appears in spm
 
Actual results:

we fail the move and engine reports it 5 times in the log which would make debugging deficult in case we have multiple vms

Expected results:

we should diminish the amount of reported errors in the log


Additional info: logs

2013-05-07 16:14:10,043 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-46) [105ad651] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, e
rror = Drive image file %s could not be found

2013-05-07 16:14:10,044 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-46) [105ad651] Command VmReplicateDiskStartVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed
to VmReplicateDiskStartVDS, error = Drive image file %s could not be found

2013-05-07 16:14:10,044 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-4-thread-46) [105ad651] Command org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand throw Vdc Bll exception. With error message VdcBLLException:
 Drive image file %s could not be found

2013-05-07 16:14:10,070 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-46) [105ad651] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS,
error = Drive image file %s could not be found

2013-05-07 16:14:10,070 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-46) [105ad651] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed
 to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found

--- Additional comment from Alissa on 2013-07-09 08:04:05 EDT ---

First problem that needs a fix related to the specific error that the error doesn't contain the name of the image file, it's not properly replaced and will make troubleshooting more difficult: 
"Drive image file %s could not be found".

Second - regarding the multiple logging, it looks as this is not a storage related problem, but rather happens for many different commands.
Perhaps the bug should be splitted to handle the first issue in storage area, and the multiple logging to be handled on infra level - Ayal, your input on this?
The multiple logging is done in the following places:

BrokerCommandBase --> ProceedProxyReturnValue
VDSCommandBase --> setVdsRuntimeError

--- Additional comment from Ayal Baron on 2013-07-09 09:52:12 EDT ---

(In reply to Alissa from comment #1)
> First problem that needs a fix related to the specific error that the error
> doesn't contain the name of the image file, it's not properly replaced and
> will make troubleshooting more difficult: 
> "Drive image file %s could not be found".

This means that we're lacking this information in the log?

> 
> Second - regarding the multiple logging, it looks as this is not a storage
> related problem, but rather happens for many different commands.
> Perhaps the bug should be splitted to handle the first issue in storage
> area, and the multiple logging to be handled on infra level - Ayal, your
> input on this?
> The multiple logging is done in the following places:
> 
> BrokerCommandBase --> ProceedProxyReturnValue
> VDSCommandBase --> setVdsRuntimeError

Sounds reasonable to me.
Please clone it, rename both to reflect the relevant problems etc.

--- Additional comment from Alissa on 2013-07-09 10:43:06 EDT ---

(In reply to Ayal Baron from comment #2)
> (In reply to Alissa from comment #1)
> > First problem that needs a fix related to the specific error that the error
> > doesn't contain the name of the image file, it's not properly replaced and
> > will make troubleshooting more difficult: 
> > "Drive image file %s could not be found".
> 
> This means that we're lacking this information in the log?
yes. I think the incomplete message is originated like this in vdsm so the bug should be moved from engine to vdsm component - this is how it's declared in define.py (and it's not defined like this in engine code):
  'imageErr': {'status': {
        'code': 13,
        'message': 'Drive image file %s could not be found'}},

> > Second - regarding the multiple logging, it looks as this is not a storage
> > related problem, but rather happens for many different commands.
> > Perhaps the bug should be splitted to handle the first issue in storage
> > area, and the multiple logging to be handled on infra level - Ayal, your
> > input on this?
> > The multiple logging is done in the following places:
> > 
> > BrokerCommandBase --> ProceedProxyReturnValue
> > VDSCommandBase --> setVdsRuntimeError
> 
> Sounds reasonable to me.
> Please clone it, rename both to reflect the relevant problems etc.
Comment 1 Alissa 2013-07-09 10:56:59 EDT
The multiple logging is done in the following places:

BrokerCommandBase --> ProceedProxyReturnValue
VDSCommandBase --> setVdsRuntimeError
Comment 2 Barak 2013-08-13 07:56:50 EDT
Yaniv, don't try to reproduce with live storage migration.
Just fix/understand the issue described in comment #1
Comment 3 Yair Zaslavsky 2013-08-15 03:15:39 EDT
Yaniv,  following Comment #2 the log at BrokerCommandBase happens in the "default" part of the large switch.
error code 13 is imageErr at VdcBllErrors.
There are many other errors in VdcBllErrors which are handdled by the default part (so indeed you can use other flows to reproduce this).

ProccedProxyReturnValue is being called from VmReplicateDiskStartVdsCommand and
VmReplaicateDiskFinishVdsCommand  - both extend the BrokerCommndBase.
These VDS broker commands are called from the task handlers of LiveMigrateDisk, see
LiveMigrateDiskCommand.initTaskHandlers()
Since each one of these VDS commands also extend VdsCommandBase, you will see that setVdsRuntimeError(ex) is also being called (see VdsCommandBase.executeCommand

One of the problems is that all the exceptions generated by ProceedProxyReturnValue extend (directly or indirectly) RuntimeException,
and you can see what's going on in VDSCommandBase.executeCommand  - the catch for RuntimeException performs the setVdsRuntimeError

What i would suggest (of course other suggestions are posible :) is to wrap
the exceptions generated at ProceedProxyReturnValue with Let's say a ProxyReturnValueException which extends RuntimeException, and then at the VdsCommandBase.executeCommand code, add a catch to ProxyReturnValueException which does not log (in contrast to catch RuntimeException)
Comment 4 Yaniv Bronhaim 2013-09-02 09:36:35 EDT
Yair, I can catch VDSExceptionBase and call VDSCommandBase::setVdsRuntimeError with a parameter not to report to log. Isn't that enough?
Comment 5 Yair Zaslavsky 2013-09-16 00:27:46 EDT
Yaniv, did you check this out? How will you determine the value of the boolean flag?
Anyway, I'm postponing to 3.4
Comment 6 Eli Mesika 2013-11-20 04:53:43 EST
fixed in commit : afc4d50
Comment 7 Sandro Bonazzola 2014-01-14 03:44:43 EST
ovirt 3.4.0 alpha has been released
Comment 8 sefi litmanovich 2014-02-17 08:22:35 EST
Verified on Ovirt 3.4.0-0.7beta2, vdsm-4.14.1-3.el6.x86_64.

engine.log doesn't have multiple error messages at the same time.
Comment 10 Itamar Heim 2014-06-12 10:07:21 EDT
Closing as part of 3.4.0

Note You need to log in before you can comment on or make changes to this bug.