Bug 1440549

Summary: When failing to fence volume jobs the number of commands in the db grows till the execution ends.
Product: [oVirt] ovirt-engine
Reporter: Liron Aravot <laravot>
Component: BLL.Storage
Assignee: Liron Aravot <laravot>
Status: CLOSED CURRENTRELEASE
QA Contact: Carlos Mestre González <cmestreg>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1.1
CC: amureini, bugs, lveyde
Target Milestone: ovirt-4.1.2
Flags: rule-engine: ovirt-4.1+
Target Release: 4.1.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-05-23 08:19:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Liron Aravot 2017-04-09 21:53:05 UTC
Description of problem:
FenceVolumeJob is executed in order to fence operations that were submitted and
are supposed to be performed on the volume.
If fencing an operation fails, the engine attempts to fence it again. When the
fencing keeps failing, the number of commands (in the command_entities table)
grows until the execution finally ends successfully.
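
The growth described above can be illustrated with a small model. This is only a sketch under stated assumptions: it is not the engine's code and not necessarily how the fix was implemented. CommandTable and both fence_* functions are hypothetical names, and the model assumes that rows created by failed attempts are cleaned up only once fencing succeeds.

```python
import itertools


class CommandTable:
    """Hypothetical in-memory stand-in for the command_entities table."""

    def __init__(self):
        self.rows = {}
        self._ids = itertools.count(1)

    def insert(self, name):
        command_id = next(self._ids)
        self.rows[command_id] = name
        return command_id

    def delete(self, command_id):
        self.rows.pop(command_id, None)


def fence_with_new_command_per_retry(table, try_fence, max_attempts):
    """Problematic pattern: every retry persists another command row.

    Assumption: rows are removed only after fencing finally succeeds.
    """
    created = []
    for _ in range(max_attempts):
        created.append(table.insert("FenceVolumeJob"))
        if try_fence():
            for command_id in created:
                table.delete(command_id)
            return True
    return False


def fence_reusing_one_command(table, try_fence, max_attempts):
    """Bounded pattern: one row per logical fence operation, however many retries."""
    command_id = table.insert("FenceVolumeJob")
    for _ in range(max_attempts):
        if try_fence():
            table.delete(command_id)
            return True
    return False


def always_fails():
    # Models a blocked connection to the destination domain.
    return False


growing = CommandTable()
fence_with_new_command_per_retry(growing, always_fails, max_attempts=10)

bounded = CommandTable()
fence_reusing_one_command(bounded, always_fails, max_attempts=10)

print(len(growing.rows), len(bounded.rows))  # prints "10 1"
```

In the first variant the row count is proportional to the number of failed attempts; in the second it stays constant while fencing keeps failing.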

How reproducible:
Always

Steps to Reproduce:
1. Attempt to add a server VM based on a template, with the destination of one of the disks set to a domain which isn't the master domain (and which is on a different server than the master domain).
2. Block the connection to the destination domain when the CopyVolumeData command starts.

Actual results:
The engine attempts to execute FenceVolumeJob every few seconds, so the number of records in the command_entities table in the db keeps growing until the execution ends successfully.

Expected results:
The number of records in the command_entities table in the db shouldn't keep growing until the execution ends.

Additional info:

Comment 1 Carlos Mestre González 2017-04-28 18:21:41 UTC
I tested this in an env with multiple hosts and multiple domains.

I waited until the CopyVolume operation started and then blocked the connection to the domain.

The host status was connecting (and Handling non responsive Host host_mixed_2). The command_entities table didn't grow indefinitely; the only entries were the unique ones for the copy:

engine=> select count(*) from command_entities;
 count 
-------
     5

version: rhevm-4.1.2-0.1.el7.noarch
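
A single count shows the table size at one moment only; to check that it does not keep growing, the same query can be sampled several times. The following is a minimal sketch, not the procedure used in the verification above. It runs against an in-memory SQLite table so that it is self-contained; the helper names, the demo table layout and its command_id column are assumptions for illustration. Against the real engine database a PostgreSQL connection would be needed instead.

```python
import sqlite3
import time


def sample_counts(conn, samples, interval_seconds):
    """Run the count query several times and return the observed counts."""
    counts = []
    for i in range(samples):
        (count,) = conn.execute("select count(*) from command_entities").fetchone()
        counts.append(count)
        if i < samples - 1:
            time.sleep(interval_seconds)
    return counts


def keeps_growing(counts):
    """True if every sample is larger than the one before it."""
    return len(counts) > 1 and all(b > a for a, b in zip(counts, counts[1:]))


# Demo on a stand-in table (hypothetical schema, 5 rows as in the output above).
conn = sqlite3.connect(":memory:")
conn.execute("create table command_entities (command_id integer primary key)")
conn.executemany(
    "insert into command_entities (command_id) values (?)",
    [(i,) for i in range(5)],
)

counts = sample_counts(conn, samples=3, interval_seconds=0)
print(counts, keeps_growing(counts))  # prints "[5, 5, 5] False"
```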