Bug 1575984 - fail_over script falsely reports failed DR failover operations as successful
Summary: fail_over script falsely reports failed DR failover operations as successful
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-ansible-collection
Classification: oVirt
Component: disaster-recovery
Version: 1.1.4
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.4
Target Release: ---
Assignee: Maor
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard: DR
Depends On:
Blocks: 1582073
Reported: 2018-05-08 13:31 UTC by Elad
Modified: 2018-06-26 08:40 UTC (History)

Fixed In Version: ovirt-ansible-disaster-recovery-1.1.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-26 08:40:42 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.2+


Attachments
ovirt-dr.log (14.21 KB, text/plain)
2018-05-08 13:31 UTC, Elad


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | oVirt ovirt-ansible-disaster-recovery pull 43 | 0 | None | None | None | 2018-05-10 18:51:05 UTC

Description Elad 2018-05-08 13:31:26 UTC
Created attachment 1433218
ovirt-dr.log

Description of problem:
In case of a failure in DR fail over, fail_over.py reports it as a success.

Version-Release number of selected component (if applicable):
ovirt-ansible-disaster-recovery-0.4-1.el7ev.noarch
ansible-2.5.2-1.el7ae.noarch

How reproducible:
Always

Steps to Reproduce:
1. Execute DR fail_over.py while the engine (source or target) is unreachable


Actual results:
The fail over fails. From /var/log/ovirt-dr/ovirt-dr.log:

Traceback (most recent call last):
  File "/tmp/ansible_TVKfm8/ansible_module_ovirt_auth.py", line 272, in main
    token = connection.authenticate()
  File "/usr/lib64/python2.7/site-packages/ovirtsdk4/__init__.py", line 384, in authenticate
    self.__parse_error(e)
  File "/usr/lib64/python2.7/site-packages/ovirtsdk4/__init__.py", line 932, in __parse_error
    six.reraise(clazz, clazz(error_msg), sys.exc_info()[2])
  File "/usr/lib64/python2.7/site-packages/ovirtsdk4/__init__.py", line 381, in authenticate
    self._sso_token = self._get_access_token()
  File "/usr/lib64/python2.7/site-packages/ovirtsdk4/__init__.py", line 617, in _get_access_token
    sso_response = self._get_sso_response(self._sso_url, post_data)
  File "/usr/lib64/python2.7/site-packages/ovirtsdk4/__init__.py", line 694, in _get_sso_response
    curl.perform()
ConnectionError: Error while sending HTTP request: (7, 'Failed to connect to rhv-dr2.scl.lab.tlv.redhat.com port 443: Connection timed out')
fatal: [localhost]: FAILED! => {
    "changed": false, 
    "invocation": {
        "module_args": {
            "ca_file": "/home/ebenahar/rhv-dr2-ca", 
            "compress": true, 
            "headers": null, 
            "insecure": null, 
            "kerberos": false, 
            "ovirt_auth": null, 
            "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER", 
            "state": "present", 
            "timeout": 0, 
            "token": null, 
            "url": "https://rhv-dr2.scl.lab.tlv.redhat.com/ovirt-engine/api", 
            "username": "admin@internal"
        }
    }, 
    "msg": "Error while sending HTTP request: (7, 'Failed to connect to rhv-dr2.scl.lab.tlv.redhat.com port 443: Connection timed out')"
}





fail_over.py output:
====================================

[Failover] Start failover operation...

[Failover] target_host: secondary 
[Failover] source_map: primary 
[Failover] var_file: /var/lib/ovirt-ansible-disaster-recovery/mapping_vars.yml 
[Failover] vault: /usr/share/ansible/roles/oVirt.disaster-recovery/ovirt_passwords.yml 
[Failover] ansible_play: ../examples/dr_play.yml 

Vault password: 
cat: ../files/report.log: No such file or directory

[Failover] Finished failover operation for oVirt ansible disaster recovery

====================================

Expected results:
fail_over.py output should contain an error message for the fail over failure
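The output above shows the script printing "Finished failover operation" even though the report file was never written. A minimal sketch of one way to guard against that, assuming the absence of the report file is treated as a failure. The function names `read_report` and `summarize` and the message text are hypothetical and are not taken from fail_over.py:

```python
import os


def read_report(report_path):
    """Return the report text, or None when the report file does not exist."""
    if not os.path.isfile(report_path):
        return None
    with open(report_path) as handle:
        return handle.read()


def summarize(report_path):
    """Build the final message; a missing report is treated as a failure (assumption)."""
    report = read_report(report_path)
    if report is None:
        return "[Failover] No report found at %s, failover operation failed" % report_path
    return report + "\n[Failover] Finished failover operation"
```

With this shape, the "Finished" line is only printed when a report actually exists, instead of after a failed `cat`.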

Additional info:
ovirt-dr.log

Comment 2 Maor 2018-05-17 05:05:51 UTC
(In reply to Maor from comment #1)
> Fixed in commit
> https://github.com/oVirt/ovirt-ansible-disaster-recovery/pull/43/commits/
> 712bb4f669eb569b742ac476d450184eac1fac3c

Here is an example of the output with that fix while the engine is unreachable:

[Failover] Please enter the vault password: 1

TASK [Gathering Facts] *******************************************************************************************************



TASK [oVirt.disaster-recovery : Recover target engine] ***********************************************************************



TASK [oVirt.disaster-recovery : Obtain SSO token] ****************************************************************************


 [WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

Exception: Command '['ansible-playbook', '/usr/share/doc/ovirt-ansible-disaster-recovery/examples/dr_play.yml', '-t', 'fail_over', '-e', '@/var/lib/ovirt-ansible-disaster-recovery/mapping_vars.yml', '-e', '@/usr/share/doc/ovirt-ansible-disaster-recovery/examples/ovirt_passwords.yml', '-e', ' dr_target_host=secondary dr_source_map=primary dr_report_file=report-1526533446084.log', '--vault-password-file', 'vault_secret.sh', '-vvv']' returned non-zero exit status 2

failover operation failed, please check log file for further details.
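The "returned non-zero exit status 2" text above matches the message of Python's `subprocess.CalledProcessError`. A minimal sketch of that pattern follows, assuming the wrapper runs the playbook through `subprocess.check_call`; the function name `run_failover` is hypothetical, and a small child Python process stands in for the real `ansible-playbook` command:

```python
import subprocess
import sys


def run_failover(command):
    """Run command; return True on success, False on a non-zero exit status."""
    try:
        # check_call raises CalledProcessError when the exit status is non-zero.
        subprocess.check_call(command)
    except subprocess.CalledProcessError as exc:
        print("Exception: %s" % exc)
        print("failover operation failed, please check log file for further details.")
        return False
    print("Finished failover operation")
    return True


# Stand-ins for the ansible-playbook invocation (assumption, for illustration only).
failing = [sys.executable, "-c", "import sys; sys.exit(2)"]
passing = [sys.executable, "-c", "pass"]
```

Calling `run_failover(failing)` prints the exception and the failure message and returns False, while `run_failover(passing)` returns True, so the caller can also exit with a non-zero status of its own.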

Comment 3 Kevin Alon Goldblatt 2018-06-07 15:42:14 UTC
Verified with the following code:
-----------------------------------------
ovirt-ansible-disaster-recovery-1.1.0-1.el7ev.noarch

Verified with the following scenario:
-----------------------------------------
Ran sudo ./ovirt-dr failover

Due to an error in the password file, the DR failover failed with an error:
Exception: Command '['ansible-playbook', '/usr/share/doc/ovirt-ansible-disaster-recovery-1.1.0/examples/dr_play.yml', '-t', 'fail_over', '-e', '@/var/lib/ovirt-ansible-disaster-recovery/mapping_vars.yml', '-e', '@/usr/share/doc/ovirt-ansible-disaster-recovery-1.1.0/examples/ovirt_passwords.yml', '-e', ' dr_target_host=secondary dr_source_map=primary dr_report_file=report-1528385666059.log', '--vault-password-file', 'vault_secret.sh', '-vvv']' returned non-zero exit status 2

failover operation failed, please check log file for further details.


Moving to VERIFIED!

Comment 4 Sandro Bonazzola 2018-06-26 08:40:42 UTC
This bugzilla is included in oVirt 4.2.4 release, published on June 26th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.4 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

