Bug 1560419

Summary: Deploy HE with nfs storage failed under the [Copy local VM disk to shared storage] task
Product: [oVirt] cockpit-ovirt
Reporter: Yihui Zhao <yzhao>
Component: Hosted Engine
Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED DUPLICATE
QA Contact: Yihui Zhao <yzhao>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 0.11.19
CC: bugs, cshao, huzhao, phbailey, qiyuan, rbarry, sbonazzo, stirabos, weiwang, yaniwang, ycui, yturgema
Target Milestone: ---
Flags: rule-engine: ovirt-4.2+, yzhao: testing_ack+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-26 12:23:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
vdsm_log none

Description Yihui Zhao 2018-03-26 06:39:30 UTC
Description of problem:
Deploying HE with NFS storage fails at the [Copy local VM disk to shared storage] task.

Cockpit reports the following error:
[ INFO ] TASK [Copy local VM disk to shared storage]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["qemu-img", "convert", "-n", "-O", "raw", "/var/tmp/localvmgSGNOR/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c", "/rhev/data-center/mnt/10.66.148.11:_home_yzhao_nfs3/9cd9fa2c-aeec-4104-b700-178e8d13bfa5/images/962e04f4-75b8-4009-80c8-5cf5ec9e63d8/1e745505-d49c-4b91-910a-f4992a39a7b0"], "delta": "0:00:00.188629", "end": "2018-03-26 11:40:56.391288", "msg": "non-zero return code", "rc": 1, "start": "2018-03-26 11:40:56.202659", "stderr": "qemu-img: Could not open '/var/tmp/localvmgSGNOR/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c': Failed to get shared \"write\" lock\nIs another process using the image?", "stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmgSGNOR/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c': Failed to get shared \"write\" lock", "Is another process using the image?"], "stdout": "", "stdout_lines": []}
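
For triage, a failed task result like the one above can be classified mechanically. The following is only a minimal sketch: the helper name is_image_lock_failure is hypothetical, and it assumes the result has already been parsed into a dict with the "rc" and "stderr_lines" keys shown in the error.

```python
# Marker text taken verbatim from the qemu-img stderr in this report.
LOCK_MARKER = 'Failed to get shared "write" lock'


def is_image_lock_failure(result):
    """Return True if a failed command result points at qemu image locking.

    `result` is assumed to be the parsed Ansible task result (a dict).
    The function name is hypothetical and used for illustration only.
    """
    if result.get("rc", 0) == 0:
        # The command succeeded, so there is nothing to classify.
        return False
    return any(LOCK_MARKER in line for line in result.get("stderr_lines", []))
```

A result that fails for any other reason, or that succeeded, returns False, so the check can be used to separate this failure from unrelated qemu-img errors.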


Version-Release number of selected component (if applicable): 
rhvh-4.2.2.0-0.20180322.0+1
cockpit-ovirt-dashboard-0.11.19-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.14-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.7-1.el7ev.noarch
rhvm-appliance-4.2-20180322.0.el7.noarch


How reproducible: 
60%


Steps to Reproduce: 
1. Deploy HE with NFS storage via cockpit
 
Actual results:  
The same as the description.

Expected results: 
HE deploys successfully.

Additional info:

Comment 1 Yihui Zhao 2018-03-26 06:45:34 UTC
Created attachment 1412971 [details]
vdsm_log

Comment 2 Ryan Barry 2018-03-26 08:58:26 UTC
Is this reproducible on the CLI?

Comment 3 Simone Tiraboschi 2018-03-26 12:23:55 UTC
I think it's a duplicate of https://bugzilla.redhat.com/1559750 with additional side effects on RHEL 7.5.

On RHEL 7.5, qemu introduces an additional locking mechanism that also prevents qemu-img from writing to an image while it is in use, and that is exactly what we got here:

"stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmgSGNOR/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c': Failed to get shared \"write\" lock", "Is another process using the image?"]

Because of https://bugzilla.redhat.com/1559750, we can start copying that image while the VM is still running, since the ansible virt module is basically async.
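
To illustrate the race, here is a minimal sketch of the kind of guard that would avoid it: poll until the local VM is reported as shut off, and only then start the copy. This is not the actual fix, which is tracked in bug 1559750 and may look different. The helper names wait_until and vm_is_shut_off are hypothetical, and the sketch assumes `virsh domstate` is available on the host and prints "shut off" for a stopped domain.

```python
import subprocess
import time


def wait_until(predicate, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate became true in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def vm_is_shut_off(domain, run=subprocess.run):
    """Return True if `virsh domstate <domain>` reports "shut off".

    `run` is injectable so the check can be exercised without libvirt.
    """
    result = run(
        ["virsh", "domstate", domain],
        capture_output=True, text=True, check=False,
    )
    return result.returncode == 0 and result.stdout.strip() == "shut off"
```

With a placeholder domain name, a caller would run something like wait_until(lambda: vm_is_shut_off("local-vm"), timeout=300) and proceed with the qemu-img convert step only if it returns True.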

*** This bug has been marked as a duplicate of bug 1559750 ***