Bug 1479747

Summary: The hosted-engine --deploy fails on first attempt on 'Permission denied'
Product: Red Hat Enterprise Virtualization Manager
Reporter: Jaroslav Spanko <jspanko>
Component: ovirt-hosted-engine-setup
Assignee: Ido Rosenzwig <irosenzw>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Nikolai Sednev <nsednev>
Severity: high
Docs Contact:
Priority: medium
Version: 4.0.0
CC: jspanko, lsurette, ykaul, ylavi
Target Milestone: ovirt-4.1.7
Keywords: Unconfirmed
Target Release: ---
Flags: lsvaty: testing_plan_complete-
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-15 04:52:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                                      Flags
vdsm.log first attempt                           none
supervdsm first attempt                          none
deploy log                                       none
hosted-engine-setup second successful attempt    none

Description Jaroslav Spanko 2017-08-09 10:28:24 UTC
Description of problem:
The hosted-engine (HE) deployment fails on the first attempt with:
[ ERROR ] Failed to execute stage 'Misc Configuration' : Connection to storage server failed.
The local server is used as the NFS server; all permissions and ownerships are set correctly.
RHV_STORAGE/DATA_DOMAIN
                <world>(rw,async,wdelay,root_squash,no_subtree_check,sec=sys,rw,secure,root_squash,no_all_squash)
/RHV_STORAGE/RHVM_DATA
                <world>(rw,async,wdelay,root_squash,no_subtree_check,sec=sys,rw,secure,root_squash,no_all_squash)

After the first deployment fails, a second deployment attempt succeeds without any issue.

Version-Release number of selected component (if applicable):
RHEL 7.3 and 7.4

How reproducible:
100% in user env

Actual results:
StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmpdjRwix'


Additional info:
We added a debug line to the code. The result below shows that the directory's uid:gid is root:root instead of vdsm:kvm.
MainProcess|jsonrpc/7::DEBUG::2017-08-07 18:58:17,432::supervdsmServer::93::SuperVdsm.ServerCallback::(wrapper) call validateAccess with ('vdsm', ('kvm',), '/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmpadUXrT', 7) {}

MainProcess|jsonrpc/7::DEBUG::2017-08-07 18:58:17,436::fileUtils::138::storage.fileUtils::(validateAccess) Permissions for directory '/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmpadUXrT' are '0755', stat output is 'posix.stat_result(st_mode=16877, st_ino=2, st_dev=1793L, st_nlink=3, st_uid=0, st_gid=0, st_size=4096, st_atime=1502112496, st_mtime=1502112496, st_ctime=1502112496)'

MainProcess|jsonrpc/7::WARNING::2017-08-07 18:58:17,436::fileUtils::141::storage.fileUtils::(validateAccess) Permission denied for directory: /rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmpadUXrT with permissions: 7
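
For reference, the stat output in the debug line above can be decoded by hand. The following is only a sketch of a classic POSIX mode-bit check, not vdsm's actual validateAccess code. The helper names are hypothetical, and uid/gid 36 for vdsm:kvm is an assumption (any non-root id gives the same result). The requested permissions value 7 equals os.R_OK | os.W_OK | os.X_OK.

```python
import os
import stat

# Values copied from the supervdsm debug line above.
ST_MODE = 16877
ST_UID = 0
ST_GID = 0

# Assumption: vdsm:kvm map to 36:36 on this host; not stated in the logs.
VDSM_UID = 36
KVM_GIDS = (36,)

# The "7" passed to validateAccess, expressed with the os constants.
WANTED = os.R_OK | os.W_OK | os.X_OK


def granted_bits(st_mode, st_uid, st_gid, uid, gids):
    """Return the rwx bits (0-7) that plain mode bits grant to uid/gids.

    This models only owner/group/other bits. It ignores ACLs and the
    kernel's special handling of root.
    """
    perms = stat.S_IMODE(st_mode)
    if uid == st_uid:
        return (perms >> 6) & 7  # owner bits
    if st_gid in gids:
        return (perms >> 3) & 7  # group bits
    return perms & 7  # other bits


def has_access(st_mode, st_uid, st_gid, uid, gids, wanted):
    """True if every requested bit is granted."""
    return granted_bits(st_mode, st_uid, st_gid, uid, gids) & wanted == wanted


if __name__ == "__main__":
    print("is directory:", stat.S_ISDIR(ST_MODE))
    print("mode:", oct(stat.S_IMODE(ST_MODE)))
    print("vdsm:kvm has rwx:",
          has_access(ST_MODE, ST_UID, ST_GID, VDSM_UID, KVM_GIDS, WANTED))
```

Under these assumptions the check fails: vdsm is neither the owner (uid 0) nor a member of the owning group (gid 0), so only the "other" bits r-x apply and write access is missing.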

Permissions during deploy
/var/lib/ovirt-hosted-engine-setup/:
drwxr-xr-x root root ?                                ..
drwxr-xr-x root root ?                                .
drwxr-xr-x root root ?                                answers
-rw------- vdsm kvm  ?                                tmp9SjQGH

/rhev/data-center/mnt:
drwxr-xr-x vdsm kvm ?                                ..
drwxrwxrwx vdsm kvm ?                                10.53.197.129:_RHV__STORAGE_RHVM__DATA
drwxr-xr-x vdsm kvm ?                                .
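
To relate the directory names under /rhev/data-center/mnt to their source paths, here is a small sketch of the naming pattern as it can be inferred from the listings and log lines in this report ("_" becomes "__", "/" becomes "_"). The function name is hypothetical and this is not taken from the vdsm source.

```python
def mount_dir_name(path, server=None):
    """Build the directory name seen under /rhev/data-center/mnt.

    Inferred from this report only: underscores are doubled first,
    then slashes are turned into single underscores.
    """
    escaped = path.replace("_", "__").replace("/", "_")
    if server:
        return "{}:{}".format(server, escaped)
    return escaped


if __name__ == "__main__":
    print(mount_dir_name("/RHV_STORAGE/RHVM_DATA", server="10.53.197.129"))
    print(mount_dir_name("/var/lib/ovirt-hosted-engine-setup/tmpadUXrT"))
```

Read this way, the failing path corresponds to a temporary file under /var/lib/ovirt-hosted-engine-setup/ rather than to the NFS export itself.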

Comment 1 Jaroslav Spanko 2017-08-09 10:29:20 UTC
Created attachment 1311155 [details]
vdsm.log first attempt

Comment 2 Jaroslav Spanko 2017-08-09 10:30:25 UTC
Created attachment 1311156 [details]
supervdsm first attempt

Comment 3 Jaroslav Spanko 2017-08-09 10:31:11 UTC
Created attachment 1311157 [details]
deploy log

Comment 4 Sandro Bonazzola 2017-08-09 11:19:49 UTC
Please check the NFS configuration, ensure it's not exported with root_squash.
Looks like a not supported configuration of the NFS storage.
See also bug #1466234
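
A minimal sketch of how the export lines quoted in this report could be checked for root squashing. The function names are hypothetical. It assumes the one-line "client(options)" form shown in the description, and it treats a list that names neither keyword as squashing, because root_squash is the default according to exports(5).

```python
def parse_export_options(entry):
    """Split '<client>(opt1,opt2,...)' into (client, [options])."""
    client, _, rest = entry.strip().partition("(")
    options = [opt for opt in rest.rstrip(")").split(",") if opt]
    return client, options


def squashes_root(options):
    """True unless no_root_squash is explicitly present."""
    return "no_root_squash" not in options


if __name__ == "__main__":
    # Option string copied from the description of this bug.
    entry = ("<world>(rw,async,wdelay,root_squash,no_subtree_check,"
             "sec=sys,rw,secure,root_squash,no_all_squash)")
    client, options = parse_export_options(entry)
    print(client, "squashes root:", squashes_root(options))
```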

Comment 5 Jaroslav Spanko 2017-08-09 11:41:59 UTC
Hi Sandro

You're right, it is exported with root_squash, which is the default on RHEL:
/RHV_STORAGE/RHVM_DATA
<world>(rw,async,wdelay,root_squash,no_subtree_check,sec=sys,rw,secure,root_squash,no_all_squash)

What is strange is that I'm not able to reproduce this in our env with the same export.
I will ask them to change it to no_root_squash and let you know.

Thanks

Comment 6 Ido Rosenzwig 2017-08-15 11:00:55 UTC
Hi Jaroslav,

Can you please provide an update?
Did you try with no_root_squash?

Thanks

Comment 7 Jaroslav Spanko 2017-08-16 10:23:39 UTC
Hi Ido
Unfortunately not yet; we're waiting for an update from the customer.
As I wrote, it is 100% reproducible in the user's environment, but we have not been able to reproduce it ourselves yet.
Thanks

Comment 8 Jaroslav Spanko 2017-08-23 08:41:10 UTC
Hi 
We got an answer from the customer regarding this; unfortunately, no_root_squash did not help:
~~~
The problem still persists with RHEL 7.4.
During the past week, we deployed the self-hosted engine setup on 5 servers, of which 3 were deployed remotely (from PuTTY) and 2 were deployed directly from the host itself.
The remote deployment (via PuTTY) on the 3 servers succeeded on the very first attempt, whereas the deployment performed directly on the 2 servers failed on the first attempt as usual.
~~~

Thanks

Comment 12 Jaroslav Spanko 2017-08-30 09:58:05 UTC
Created attachment 1319974 [details]
hosted-engine-setup second successful attempt

Comment 13 Ido Rosenzwig 2017-08-30 12:56:58 UTC
Hi Jaroslav,

I didn't manage to reproduce it so far.
I've tried to deploy it via GUI on RHEL 7.3 with the following NFS configuration:

   *(rw,async,wdelay,no_subtree_check,sec=sys,rw,secure,no_all_squash)


which is exactly like the customer's configuration (without root_squash).

I believe this is a problem with the customer's configuration.

Please provide the customer's answer after they have tried the configuration suggested in comment 10.

Thank you.

Comment 14 Ido Rosenzwig 2017-10-15 04:52:46 UTC
No answer was given in more than a month.
Closing the bug.

Comment 15 Jaroslav Spanko 2017-10-15 07:46:51 UTC
Hi Ido
As I wrote, the customer still has not replied, so I'm OK with closing the bug.
Thanks for your help.