Bug 1564873

Summary: Bad permission of qcow file inside appliance OVA prevents hosted-engine deployment
Product: [oVirt] ovirt-appliance
Reporter: Nikolai Sednev <nsednev>
Component: Packaging.rpm
Assignee: Yuval Turgeman <yturgema>
Status: CLOSED CURRENTRELEASE
QA Contact: Nikolai Sednev <nsednev>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.2
CC: bugs, cshao, dfediuck, mavital, nsednev, ratamir, stirabos, weiwang, ycui, yturgema, yzhao
Target Milestone: ovirt-4.2.2
Keywords: Regression, Triaged
Target Release: ---
Flags: rule-engine: ovirt-4.2?, ratamir: blocker?, nsednev: planning_ack?, rule-engine: devel_ack+, mavital: testing_ack+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: rhvm-appliance-4.2-20180410.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-18 12:25:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1455169, 1547479
Attachments:
Description: sosreport from alma03; Flags: none

Description Nikolai Sednev 2018-04-08 12:42:49 UTC
Created attachment 1418897
sosreport from alma03

Description of problem:
iSCSI Node 0 deployment failed.

[ INFO  ] changed: [localhost]
[ INFO  ] TASK [Copy local VM disk to shared storage]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["qemu-img", "convert", "-n", "-O", "raw", "/var/tmp/localvmn0FXpa/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad", "/rhev/data-center/mnt/blockSD/50bad090-44a8-4097-9ce5-7cbb9660ec23/images/a317c1cd-dea5-42e8-a68b-5cb3cf0998ff/8d079c74-b715-4afa-9c90-9563561caa22"], "delta": "0:00:00.167448", "end": "2018-04-08 15:32:19.138283", "msg": "non-zero return code", "rc": 1, "start": "2018-04-08 15:32:18.970835", "stderr": "qemu-img: Could not open '/var/tmp/localvmn0FXpa/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmn0FXpa/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied", "stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmn0FXpa/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmn0FXpa/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied"], "stdout": "", "stdout_lines": []}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO  ] Stage: Clean up
[ INFO  ] Cleaning temporary resources
[ INFO  ] TASK [Gathering Facts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Remove local vm dir]
[ INFO  ] changed: [localhost]
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180408153226.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180408150712-d76v90.log


Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.16-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.10-1.el7ev.noarch
rhvm-appliance-4.2-20180404.0.el7.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

How reproducible:
100%

Steps to Reproduce:
1. Deploy SHE using the ansible flow over an iSCSI LUN of 80GB.

Actual results:
Could not open '/var/tmp/localvmn0FXpa/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmn0FXpa/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied

Expected results:
Deployment should succeed.

Additional info:
Sosreport from host is attached.

Comment 1 Nikolai Sednev 2018-04-08 12:43:52 UTC
Deployment attempt was made using CLI.

Comment 2 Yaniv Kaul 2018-04-08 14:03:53 UTC
Nikolai,

1. Does it work with Cockpit?
2. The failure seems to be at a very early stage. Any hint why it happened? Lack of disk space, selinux permission, is /var/tmp on a network mount, something like that?

Comment 3 Martin Sivák 2018-04-08 14:50:24 UTC
Yaniv, the disk copy is actually pretty late (almost finished). What is interesting is the fact that setup can't read the local disk file the VM was using in the first place.

Might this be related to the new qemu image locking feature on 7.5?

Comment 4 Nikolai Sednev 2018-04-08 14:52:35 UTC
(In reply to Yaniv Kaul from comment #2)
> Nikolai,
> 
> 1. Does it work with Cockpit?
I have not tried it on Cockpit, although AFAIK if it fails on otopi it will fail the same way on Cockpit, as it's based on the same mechanism; at least that was the case until now.
> 2. The failure seems to be at a very early stage. Any hint why it happened?
No, it's not an early stage; it's almost at the end of the deployment.
> Lack of disk space, selinux permission, is /var/tmp on a network mount,
> something like that?
alma04 ~]# getenforce 
Enforcing

Disk space is fine: there was 100GB on the LUN, which is sufficient.

/var/tmp is not a separate mount; it sits on the local root filesystem:
alma04 ~]# df -ha /var/tmp
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       915G  5.2G  863G   1% /


alma04 ~]# lsblk
NAME                                                                                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                                                                     8:0    0 931.5G  0 disk  
├─sda1                                                                                  8:1    0   700M  0 part  /boot
├─sda2                                                                                  8:2    0     2G  0 part  [SWAP]
└─sda3                                                                                  8:3    0 928.8G  0 part  /
sdb                                                                                     8:16   0   100G  0 disk  
└─3514f0c5a51601722                                                                   253:0    0   100G  0 mpath 
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-metadata                                 253:1    0   512M  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-outbox                                   253:2    0   128M  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-xleases                                  253:3    0     1G  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-leases                                   253:4    0     2G  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-ids                                      253:5    0   128M  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-inbox                                    253:6    0   128M  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-master                                   253:7    0     1G  0 lvm   /rhev/data-
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-7228c64a--9572--4948--9d98--315e8fa6e237 253:8    0    58G  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-ec58290f--0f93--4064--bcd2--a8521c65d37e 253:9    0     1G  0 lvm   
  ├─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-d19774bb--d0a0--4898--9da3--bef88dcbcf25 253:10   0     1G  0 lvm   
  └─3cd7fbb0--0ac7--45b9--b3a1--c92541e97771-5bbf9f83--a716--49e5--a1c5--199393ca6315 253:11   0     1G  0 lvm

Comment 5 Nikolai Sednev 2018-04-08 14:58:19 UTC
Attaching the whole deployment run.
http://pastebin.test.redhat.com/573247
Please note that I used a different portal for discovery than for the actual deployment, in order to verify http://pastebin.test.redhat.com/573247.
Any space issues were fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1522737; I saw that it worked fine in the CLI prior to this deployment.

Comment 6 Nikolai Sednev 2018-04-08 15:03:19 UTC
Any chance that https://bugzilla.redhat.com/show_bug.cgi?id=1547479 might be related to this bug?

Comment 7 Nikolai Sednev 2018-04-09 10:54:52 UTC
(In reply to Yaniv Kaul from comment #2)
> Nikolai,
> 
> 1. Does it work with Cockpit?
It also happens on Cockpit:


[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["qemu-img", "convert", "-n", "-O", "raw", "/var/tmp/localvmIoWhRP/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad", "/rhev/data-center/mnt/blockSD/56db013b-1e30-4f38-889e-64694d7fcc37/images/01899787-8b0d-4e3b-bc8b-dcfae54efec2/799d6f67-ec08-47f4-82a9-3886fa74b4ed"], "delta": "0:00:00.166840", "end": "2018-04-09 13:53:11.443789", "msg": "non-zero return code", "rc": 1, "start": "2018-04-09 13:53:11.276949", "stderr": "qemu-img: Could not open '/var/tmp/localvmIoWhRP/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmIoWhRP/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied", "stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmIoWhRP/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmIoWhRP/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied"], "stdout": "", "stdout_lines": []}

Comment 8 Martin Sivák 2018-04-09 11:19:33 UTC
Nikolai, just to prove or disprove the locking possibility, can you attach the output of "lslocks -o COMMAND,PID,TYPE,SIZE,MODE,M,START,END,PATH,BLOCKER" please?

Comment 9 Nikolai Sednev 2018-04-09 11:21:52 UTC
Deployment from Cockpit also fails over NFS:
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["qemu-img", "convert", "-n", "-O", "raw", "/var/tmp/localvmoFPZeR/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad", "/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Compute__NFS_nsednev__he__1/0cb22e7e-f793-46d4-abf5-33346ccd2c28/images/eecee05d-4572-44bf-9e01-b492e064efcc/47f86258-01d1-401f-ba8a-c6a1ff7571f1"], "delta": "0:00:00.167128", "end": "2018-04-09 14:16:08.005990", "msg": "non-zero return code", "rc": 1, "start": "2018-04-09 14:16:07.838862", "stderr": "qemu-img: Could not open '/var/tmp/localvmoFPZeR/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmoFPZeR/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied", "stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmoFPZeR/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmoFPZeR/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied"], "stdout": "", "stdout_lines": []}

alma04 ~]# lslocks -o COMMAND,PID,TYPE,SIZE,MODE,M,START,END,PATH,BLOCKER
COMMAND           PID  TYPE SIZE MODE  M START END PATH                                BLOCKER
wdmd              629 POSIX   4B WRITE 0     0   0 /run/wdmd/wdmd.pid                  
libvirtd        15946 POSIX   5B WRITE 0     0   0 /run/libvirtd.pid                   
vdsmd           16210 FLOCK   0B WRITE 0     0   0 /run/vdsm/vdsmd.lock                
ovsdb-server    16962 POSIX   6B WRITE 0     0   0 /run/openvswitch/ovsdb-server.pid   
ovsdb-server    16962 POSIX   0B WRITE 0     0   0 /etc/openvswitch/.conf.db.~lock~    
iscsid           1141 POSIX   5B WRITE 0     0   0 /run/iscsid.pid                     
atd              1186 POSIX   5B WRITE 0     0   0 /run/atd.pid                        
python           1126 FLOCK   4B WRITE 0     0   0 /run/goferd.pid                     
master           1670 FLOCK  33B WRITE 0     0   0 /var/spool/postfix/pid/master.pid   
master           1670 FLOCK  33B WRITE 0     0   0 /var/lib/postfix/master.lock        
abrtd           13827 POSIX   6B WRITE 0     0   0 /run/abrt/abrtd.pid                 
ovn-controller  17086 POSIX   6B WRITE 0     0   0 /run/openvswitch/ovn-controller.pid 
virtlogd         4306 POSIX   4B WRITE 0     0   0 /run/virtlogd.pid                   
sanlock          5446 POSIX   5B WRITE 0     0   0 /run/sanlock/sanlock.pid            
multipathd       5684 POSIX   4B WRITE 0     0   0 /run/multipathd/multipathd.pid      
rhsmcertd        1128 FLOCK   0B WRITE 0     0   0 /run/lock/subsys/rhsmcertd          
crond            1184 FLOCK   5B WRITE 0     0   0 /run/crond.pid                      
supervdsmd      16064 FLOCK   0B WRITE 0     0   0 /run/vdsm/supervdsmd.lock           
ovs-vswitchd    17012 POSIX   6B WRITE 0     0   0 /run/openvswitch/ovs-vswitchd.pid

Comment 10 Nikolai Sednev 2018-04-09 11:24:50 UTC
Tested on latest components as follows:
cockpit-ovirt-dashboard-0.11.20-1.el7ev.noarch
rhvm-appliance-4.2-20180404.0.el7.noarch
ovirt-hosted-engine-setup-2.2.16-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.10-1.el7ev.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Comment 11 Simone Tiraboschi 2018-04-09 11:43:40 UTC
Upstream 4.2 on CentOS 7.4 runs fine:
http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_he-basic-suite-4.2/

We have to understand whether this is RHEL 7.5 specific.

Comment 12 Yaniv Kaul 2018-04-09 11:54:40 UTC
Nikolai, anything with selinux?

Comment 13 Nikolai Sednev 2018-04-09 12:11:45 UTC
(In reply to Yaniv Kaul from comment #12)
> Nikolai, anything with selinux?

Nothing that I can see on my own, other than that it's enforcing, as I previously answered.

Comment 14 Simone Tiraboschi 2018-04-09 12:29:32 UTC
SELinux seems clean:
[root@alma04 3f023247-9d1a-4b51-a7d6-fcf75cd3bc01]# ausearch -m avc
<no matches>

It seems to be an ownership issue:
[root@alma04 3f023247-9d1a-4b51-a7d6-fcf75cd3bc01]# ls -l
total 3374760
-rw-------. 1 qemu qemu 3452370944  9 apr 14.16 fe1c4b0a-89f7-4beb-954d-d1d6e61954ad
-rw-r--r--. 1 root root        330  4 apr 19.47 fe1c4b0a-89f7-4beb-954d-d1d6e61954ad.meta
[root@alma04 3f023247-9d1a-4b51-a7d6-fcf75cd3bc01]# sudo -u vdsm dd if=/var/tmp/localvmoFPZeR/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad bs=4k count=1
dd: failed to open ‘/var/tmp/localvmoFPZeR/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad’: Permission denied

Comment 15 Simone Tiraboschi 2018-04-09 12:36:48 UTC
vdsm is still part of the qemu group

[root@alma04 tmp]# sudo -u vdsm groups
kvm qemu sanlock

but the source image is not readable by the qemu group, only by the qemu user.
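
The reasoning above can be sketched in a few lines of Python. This is only an illustration of the classic POSIX owner/group/other check, not code from vdsm or hosted-engine-setup; the numeric IDs are hypothetical placeholders, and the sketch ignores root privileges, ACLs and SELinux.

```python
import stat


def can_read(mode, file_uid, file_gid, uid, gids):
    """Classic POSIX read check: exactly one permission class applies."""
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# Hypothetical numeric IDs, for illustration only.
QEMU_UID, QEMU_GID = 107, 107
VDSM_UID = 36
VDSM_GIDS = {36, 107, 179}  # stands in for: kvm, qemu, sanlock

# Image as seen in comment 14: -rw------- qemu qemu
vdsm_reads_0600 = can_read(0o600, QEMU_UID, QEMU_GID, VDSM_UID, VDSM_GIDS)
# Same file if group read were granted: -rw-r----- qemu qemu
vdsm_reads_0640 = can_read(0o640, QEMU_UID, QEMU_GID, VDSM_UID, VDSM_GIDS)
print(vdsm_reads_0600, vdsm_reads_0640)
```

With mode 0600 the group membership does not help, because the group class carries no read bit. That matches the failed dd run as vdsm in comment 14.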

Comment 16 Simone Tiraboschi 2018-04-09 12:46:16 UTC
Upstream and downstream appliance permissions are different:

- Upstream -

[root@c74he20180302h1 ~]# tar tvf /usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.2-20180408.1.el7.centos.ova
drwxr-xr-x root/root         0 2018-04-08 13:18 images/
drwxr-xr-x root/root         0 2018-04-08 13:18 images/8da54388-5d66-4c34-9884-995eae4aee5f/
-rwxr-xr-x root/root 2706374656 2018-04-08 13:18 images/8da54388-5d66-4c34-9884-995eae4aee5f/3cfadf2f-8fc9-45a4-b304-d5fad962bc9a
-rw-r--r-- root/root        330 2018-04-08 13:18 images/8da54388-5d66-4c34-9884-995eae4aee5f/3cfadf2f-8fc9-45a4-b304-d5fad962bc9a.meta
drwxr-xr-x root/root          0 2018-04-08 13:18 master/
drwxr-xr-x root/root          0 2018-04-08 13:18 master/vms/
drwxr-xr-x root/root          0 2018-04-08 13:18 master/vms/64aee80e-bbe8-48cc-bccd-c6d848cf53a8/
-rw-r--r-- root/root       3695 2018-04-08 13:18 master/vms/64aee80e-bbe8-48cc-bccd-c6d848cf53a8/64aee80e-bbe8-48cc-bccd-c6d848cf53a8.ovf

- Downstream -
[root@alma04 tmp]# tar tvf /usr/share/ovirt-engine-appliance/rhvm-appliance-4.2-20180404.0.el7.ova
drwxr-xr-x root/root         0 2018-04-04 19:47 master/
drwxr-xr-x root/root         0 2018-04-04 19:47 master/vms/
drwxr-xr-x root/root         0 2018-04-04 19:47 master/vms/91dd1dc8-601c-407a-8ca0-0d45fb667893/
-rw-r--r-- root/root      3698 2018-04-04 19:47 master/vms/91dd1dc8-601c-407a-8ca0-0d45fb667893/91dd1dc8-601c-407a-8ca0-0d45fb667893.ovf
drwxr-xr-x root/root         0 2018-04-04 19:47 images/
drwxr-xr-x root/root         0 2018-04-04 19:47 images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/
-rw-r--r-- root/root       330 2018-04-04 19:47 images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad.meta
-rw------- mockbuild/mockbuild 3245342720 2018-04-04 20:30 images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad


The issue is that images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad is readable only by the mockbuild user.
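
A check like the one done by eye on the tar listings could be automated roughly as follows. This is a sketch under assumptions: the helper names and the tiny in-memory archive are made up for the example, and it only uses Python's standard tarfile module to flag regular files that lack group or other read permission.

```python
import io
import stat
import tarfile


def unreadable_members(tar):
    """Return (name, octal mode, owner) for regular files lacking group or other read."""
    needed = stat.S_IRGRP | stat.S_IROTH
    return [
        (m.name, oct(m.mode & 0o777), m.uname)
        for m in tar.getmembers()
        if m.isfile() and (m.mode & needed) != needed
    ]


def demo_archive():
    """Build a tiny in-memory tar that mimics the listing (member names are made up)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, mode, owner in [
            ("images/disk.meta", 0o644, "root"),
            ("images/disk", 0o600, "mockbuild"),
        ]:
            info = tarfile.TarInfo(name)
            info.mode = mode
            info.uname = info.gname = owner
            info.size = 4
            tar.addfile(info, io.BytesIO(b"data"))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=demo_archive(), mode="r") as tar:
    flagged = unreadable_members(tar)
print(flagged)
```

Against a real appliance, the same function could be pointed at the OVA path shown in the listings above via tarfile.open(path); an empty result would suggest that every file in the archive is readable by group and others.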

Comment 17 Nikolai Sednev 2018-04-09 13:46:59 UTC
Just in case.
I've tried an older appliance rhvm-appliance-4.2-20180322.0.el7.noarch and the result was just the same:
[ INFO  ] TASK [Copy local VM disk to shared storage]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["qemu-img", "convert", "-n", "-O", "raw", "/var/tmp/localvmnZIxfN/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c", "/rhev/data-center/mnt/blockSD/c5789a99-4b45-4b68-9e8c-3fe428a00b55/images/ba82f2b0-73f3-4bfd-a8d8-7726aad7d3aa/d9fcc0fe-e9de-44dd-b189-be9f4e9b516b"], "delta": "0:00:00.168064", "end": "2018-04-09 16:43:31.183718", "msg": "non-zero return code", "rc": 1, "start": "2018-04-09 16:43:31.015654", "stderr": "qemu-img: Could not open '/var/tmp/localvmnZIxfN/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c': Could not open '/var/tmp/localvmnZIxfN/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c': Permission denied", "stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmnZIxfN/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c': Could not open '/var/tmp/localvmnZIxfN/images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c': Permission denied"], "stdout": "", "stdout_lines": []}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO  ] Stage: Clean up
[ INFO  ] Cleaning temporary resources
[ INFO  ] TASK [Gathering Facts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Remove local vm dir]
[ INFO  ] changed: [localhost]
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180409164338.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180409162308-qkzjvs.log
[root@alma03 ~]# hostname
alma03.qa.lab.tlv.redhat.com
[root@alma03 ~]# lslocks -o COMMAND,PID,TYPE,SIZE,MODE,M,START,END,PATH,BLOCKER
COMMAND           PID  TYPE SIZE MODE  M START END PATH                                BLOCKER
master           1643 FLOCK  33B WRITE 0     0   0 /var/spool/postfix/pid/master.pid   
master           1643 FLOCK  33B WRITE 0     0   0 /var/lib/postfix/master.lock        
abrtd            9556 POSIX   5B WRITE 0     0   0 /run/abrt/abrtd.pid                 
vdsmd           11988 FLOCK   0B WRITE 0     0   0 /run/vdsm/vdsmd.lock                
crond            1200 FLOCK   5B WRITE 0     0   0 /run/crond.pid                      
sanlock          5227 POSIX   5B WRITE 0     0   0 /run/sanlock/sanlock.pid            
anacron          8298 POSIX   0B WRITE 0     0   0 /var/spool/anacron/cron.weekly      
anacron          8298 POSIX   0B WRITE 0     0   0 /var/spool/anacron/cron.monthly     
libvirtd        11744 POSIX   5B WRITE 0     0   0 /run/libvirtd.pid                   
supervdsmd      11841 FLOCK   0B WRITE 0     0   0 /run/vdsm/supervdsmd.lock           
ovs-vswitchd    12765 POSIX   6B WRITE 0     0   0 /run/openvswitch/ovs-vswitchd.pid   
rhsmcertd        1126 FLOCK   0B WRITE 0     0   0 /run/lock/subsys/rhsmcertd          
atd              1198 POSIX   5B WRITE 0     0   0 /run/atd.pid                        
multipathd       5470 POSIX   4B WRITE 0     0   0 /run/multipathd/multipathd.pid      
wdmd              626 POSIX   4B WRITE 0     0   0 /run/wdmd/wdmd.pid                  
iscsid           1168 POSIX   5B WRITE 0     0   0 /run/iscsid.pid                     
python           1127 FLOCK   4B WRITE 0     0   0 /run/goferd.pid                     
virtlogd         4133 POSIX   4B WRITE 0     0   0 /run/virtlogd.pid                   
ovsdb-server    12716 POSIX   6B WRITE 0     0   0 /run/openvswitch/ovsdb-server.pid   
ovsdb-server    12716 POSIX   0B WRITE 0     0   0 /etc/openvswitch/.conf.db.~lock~    
ovn-controller  12839 POSIX   6B WRITE 0     0   0 /run/openvswitch/ovn-controller.pid 


alma03 ~]# tar tvf /usr/share/ovirt-engine-appliance/rhvm-appliance-4.2-20180322.0.el7.ova 
drwxr-xr-x root/root         0 2018-03-22 22:41 images/
drwxr-xr-x root/root         0 2018-03-22 22:41 images/50755da7-5742-4753-9d26-b862340e7bff/
-rw-r--r-- root/root       330 2018-03-22 22:41 images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c.meta
drwxr-xr-x root/root         0 2018-03-22 22:41 master/
drwxr-xr-x root/root         0 2018-03-22 22:41 master/vms/
drwxr-xr-x root/root         0 2018-03-22 22:41 master/vms/93ef2f11-368b-4a20-a2df-ea99a371c57a/
-rw-r--r-- root/root      3698 2018-03-22 22:41 master/vms/93ef2f11-368b-4a20-a2df-ea99a371c57a/93ef2f11-368b-4a20-a2df-ea99a371c57a.ovf
-rw------- mockbuild/mockbuild 3217817600 2018-03-22 23:25 images/50755da7-5742-4753-9d26-b862340e7bff/50b84698-99d5-4bd7-a90d-3e15a13ad41c

Comment 18 Nikolai Sednev 2018-04-09 13:48:23 UTC
alma03 ~]# ls -lsha  /var/tmp/localvmbcKrY4/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad
3.3G -rw-------. 1 root root 3.3G Apr  9 16:25 /var/tmp/localvmbcKrY4/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad

Comment 19 Nikolai Sednev 2018-04-09 15:39:53 UTC
Deployment also fails with rhvm-appliance-4.2-20180202.0.el7.noarch:
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["qemu-img", "convert", "-n", "-O", "raw", "/var/tmp/localvmTMEMEz/images/44b5a089-6ce7-4a23-b4d1-b9dab3fe2687/5ef59632-5a32-4285-95d9-54c59673e2f3", "/rhev/data-center/mnt/blockSD/8c3680ba-0e71-4a62-adf9-d071099a8d8a/images/8e63056b-1fc5-4e74-8b2d-3d31e42096c3/dcd89fb4-f3a0-471a-9ea7-bb2aab62a934"], "delta": "0:00:00.166741", "end": "2018-04-09 18:39:13.378009", "msg": "non-zero return code", "rc": 1, "start": "2018-04-09 18:39:13.211268", "stderr": "qemu-img: Could not open '/var/tmp/localvmTMEMEz/images/44b5a089-6ce7-4a23-b4d1-b9dab3fe2687/5ef59632-5a32-4285-95d9-54c59673e2f3': Could not open '/var/tmp/localvmTMEMEz/images/44b5a089-6ce7-4a23-b4d1-b9dab3fe2687/5ef59632-5a32-4285-95d9-54c59673e2f3': Permission denied", "stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmTMEMEz/images/44b5a089-6ce7-4a23-b4d1-b9dab3fe2687/5ef59632-5a32-4285-95d9-54c59673e2f3': Could not open '/var/tmp/localvmTMEMEz/images/44b5a089-6ce7-4a23-b4d1-b9dab3fe2687/5ef59632-5a32-4285-95d9-54c59673e2f3': Permission denied"], "stdout": "", "stdout_lines": []}

Comment 20 Yihui Zhao 2018-04-10 11:25:54 UTC
Deployment fails on NFS also with rhvm-appliance-4.2-20180404.0.el7.noarch:

[ INFO ] TASK [Copy local VM disk to shared storage]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["qemu-img", "convert", "-n", "-O", "raw", "/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad", "/rhev/data-center/mnt/10.66.148.11:_home_yzhao_nfs2/75086210-dc24-486c-a6ed-557aca4b4602/images/06a857af-0101-4ff8-a53f-d31a12b92ceb/4b0ba433-51dc-40b2-94bb-85ae200da919"], "delta": "0:00:00.176871", "end": "2018-04-10 17:10:35.623934", "msg": "non-zero return code", "rc": 1, "start": "2018-04-10 17:10:35.447063", "stderr": "qemu-img: Could not open '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied", "stderr_lines": ["qemu-img: Could not open '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Could not open '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-89f7-4beb-954d-d1d6e61954ad': Permission denied"], "stdout": "", "stdout_lines": []}

Comment 21 Nikolai Sednev 2018-04-10 11:44:27 UTC
(In reply to Yihui Zhao from comment #20)
> Deployment fails on NFS also with rhvm-appliance-4.2-20180404.0.el7.noarch:
> 
> [ INFO ] TASK [Copy local VM disk to shared storage]
> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd":
> ["qemu-img", "convert", "-n", "-O", "raw",
> "/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-
> 89f7-4beb-954d-d1d6e61954ad",
> "/rhev/data-center/mnt/10.66.148.11:_home_yzhao_nfs2/75086210-dc24-486c-a6ed-
> 557aca4b4602/images/06a857af-0101-4ff8-a53f-d31a12b92ceb/4b0ba433-51dc-40b2-
> 94bb-85ae200da919"], "delta": "0:00:00.176871", "end": "2018-04-10
> 17:10:35.623934", "msg": "non-zero return code", "rc": 1, "start":
> "2018-04-10 17:10:35.447063", "stderr": "qemu-img: Could not open
> '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-
> 89f7-4beb-954d-d1d6e61954ad': Could not open
> '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-
> 89f7-4beb-954d-d1d6e61954ad': Permission denied", "stderr_lines":
> ["qemu-img: Could not open
> '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-
> 89f7-4beb-954d-d1d6e61954ad': Could not open
> '/var/tmp/localvmu0N2Dy/images/3f023247-9d1a-4b51-a7d6-fcf75cd3bc01/fe1c4b0a-
> 89f7-4beb-954d-d1d6e61954ad': Permission denied"], "stdout": "",
> "stdout_lines": []}

I already reported the NFS failure with the same appliance here:
https://bugzilla.redhat.com/show_bug.cgi?id=1564873#c10

Comment 22 Red Hat Bugzilla Rules Engine 2018-04-10 14:04:10 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 23 Yaniv Kaul 2018-04-11 09:18:53 UTC
Yuval, why is it in MODIFIED state? Where's the fix?
Where's the build containing this fix?

Comment 27 Nikolai Sednev 2018-04-11 13:57:37 UTC
Deployment passes if the FC LUN ID from "multipath -ll" is filled in manually; deployment then worked fine on these components:
cockpit-storaged-160-3.el7.noarch
cockpit-dashboard-160-3.el7.x86_64
cockpit-system-160-3.el7.noarch
cockpit-bridge-160-3.el7.x86_64
cockpit-160-3.el7.x86_64
cockpit-ovirt-dashboard-0.11.20-1.el7ev.noarch
cockpit-ws-160-3.el7.x86_64
rhvm-appliance-4.2-20180410.0.el7.noarch
ovirt-hosted-engine-setup-2.2.16-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.10-1.el7ev.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Comment 28 Nikolai Sednev 2018-04-11 14:00:02 UTC
puma19 ~]# tar tvf /usr/share/ovirt-engine-appliance/rhvm-appliance-4.2-20180410.0.el7.ova 
drwxr-xr-x root/root         0 2018-04-10 18:13 images/
drwxr-xr-x root/root         0 2018-04-10 18:13 images/dc72d2dc-9ab0-4da0-94b4-bf9550c1a292/
-rw-r--r-- root/root       330 2018-04-10 18:13 images/dc72d2dc-9ab0-4da0-94b4-bf9550c1a292/4dd41dc4-0df8-46af-a149-dbad8649ca75.meta
drwxr-xr-x root/root         0 2018-04-10 18:13 master/
drwxr-xr-x root/root         0 2018-04-10 18:13 master/vms/
drwxr-xr-x root/root         0 2018-04-10 18:13 master/vms/03c95913-fa93-4da7-99d4-5b798973e712/
-rw-r--r-- root/root      3698 2018-04-10 18:13 master/vms/03c95913-fa93-4da7-99d4-5b798973e712/03c95913-fa93-4da7-99d4-5b798973e712.ovf
-rw-r--r-- mockbuild/mockbuild 3229286400 2018-04-10 18:58 images/dc72d2dc-9ab0-4da0-94b4-bf9550c1a292/4dd41dc4-0df8-46af-a149-dbad8649ca75
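
Compared with the listing in comment 16, the disk image mode changed from -rw------- to -rw-r--r--. As a small illustration (the parse_mode helper is hypothetical and handles only plain r/w/x characters), the difference between the two mode strings can be computed like this:

```python
import stat


def parse_mode(symbolic):
    """Convert the nine permission characters of an ls / tar tv mode string to bits.

    Only plain r/w/x and '-' are handled; setuid, setgid and sticky markers are not.
    """
    perms = symbolic[1:10]
    if len(perms) != 9:
        raise ValueError("expected a 10-character mode string")
    bits = 0
    for ch, expected in zip(perms, "rwxrwxrwx"):
        if ch not in (expected, "-"):
            raise ValueError(f"unsupported mode character: {ch!r}")
        bits = (bits << 1) | (ch == expected)
    return bits


before = parse_mode("-rw-------")  # disk image in rhvm-appliance-4.2-20180404.0
after = parse_mode("-rw-r--r--")   # disk image in rhvm-appliance-4.2-20180410.0
newly_granted = after & ~before
print(oct(before), oct(after), stat.filemode(stat.S_IFREG | newly_granted))
```

The newly granted bits are group read and other read, which is what a user other than the file owner needs in order to open the image.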

Comment 29 Yihui Zhao 2018-04-12 05:34:45 UTC
Works for me on the latest RHVH from cockpit with NFS:

rhvh-4.2.2.1-0.20180410.0+1
cockpit-bridge-160-3.el7.x86_64
cockpit-160-3.el7.x86_64
cockpit-dashboard-160-3.el7.x86_64
cockpit-ws-160-3.el7.x86_64
cockpit-system-160-3.el7.noarch
cockpit-storaged-160-3.el7.noarch
cockpit-ovirt-dashboard-0.11.20-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.16-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.10-1.el7ev.noarch
rhvm-appliance-4.2-20180410.0.el7.noarch

Comment 30 Yaniv Kaul 2018-04-12 10:10:22 UTC
Doron, is it in the errata?

Comment 31 Nikolai Sednev 2018-04-12 10:37:07 UTC
Moving to verified per comments #28 and #29.

Comment 34 Sandro Bonazzola 2018-04-18 12:25:08 UTC
This bugzilla is included in the oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in the oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.