Bug 1637014

Summary: Cinder fails to mount NFS shares when a single NFS server supports both Glance and Cinder requiring different mount options (like SELinux ones)
Product: Red Hat OpenStack
Reporter: Tzach Shefi <tshefi>
Component: openstack-tripleo-heat-templates
Assignee: Giulio Fidente <gfidente>
Status: CLOSED ERRATA
QA Contact: David Rosenfeld <drosenfe>
Severity: medium
Docs Contact:
Priority: medium
Version: 14.0 (Rocky)
CC: abishop, eharney, gfidente, ltoscano, mburns, tenobreg
Target Milestone: z2
Keywords: Triaged, ZStream
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20200914170155.29a02c1.el8ost
Doc Type: If docs needed, set a value
Last Closed: 2020-10-28 15:36:47 UTC
Type: Bug
Attachments:
Description Flags
Cinder and selinux audit logs. none

Description Tzach Shefi 2018-10-08 13:09:11 UTC
Created attachment 1491661
Cinder and selinux audit logs.

Description of problem: This is a clone of bz 1491597, an old OSP11 bug that was closed as EOL. When a single NFS server serves both Glance and Cinder at the same time, Cinder's NFS mount fails due to SELinux. I'm pretty sure this should be cloned for OSP12-13 as well if it still fails on OSP14.

u'mount -t nfs -o retry=1,vers=4,minorversion=1 10.35.160.111:/export/ins_cinder /var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38' failed. Not Retrying.


Version-Release number of selected component (if applicable):

python-cinder-13.0.1-0.20180917193045.c56591a.el7ost.noarch
puppet-cinder-13.3.1-0.20180917145846.550e793.el7ost.noarch
openstack-cinder-13.0.1-0.20180917193045.c56591a.el7ost.noarch
python2-cinderclient-4.0.1-0.20180809133302.460229c.el7ost.noarch
openstack-selinux-0.8.15-0.20180823061238.b63283a.el7ost.noarch
RHEL 7.5 


How reproducible:
Every time

Steps to Reproduce:
1. I used OSPD to deploy NFS as the back end for both Glance and Cinder by adding these lines to overcloud_deploy.sh:
-e /usr/share/openstack-tripleo-heat-templates/environments/storage/cinder-nfs.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage/glance-nfs.yaml \
-e /home/stack/virt/extra_templates.yaml \


[stack@undercloud-0 ~]$ cat /home/stack/virt/extra_templates.yaml
parameter_defaults:
  CinderEnableIscsiBackend: false
  CinderEnableRbdBackend: false
  CinderEnableNfsBackend: true
  CinderNfsMountOptions: 'retry=1'
  CinderNfsServers: '10.35.160.111:/export/ins_cinder'

  GlanceBackend: 'file'
  GlanceNfsEnabled: true
  GlanceNfsShare: '10.35.160.111:/export/ins_glance'


Deployment completed successfully, "looks" fine.

2. Uploading an image to Glance works fine; the image is found on Glance's NFS share.

3. Created a Cinder volume. Its status is reported as available and the service state as up, but both are wrong.
No such volume was found on the Cinder NFS share.
The back end should also be reported as down, not up, because the NFS mount fails due to SELinux.

Actual results:
A tip from Alan helped spot another bug lurking here: the volume is in fact created locally and is therefore reported as available, which is wrong. So is the fact that the service is reported as up while the NFS mount failed.

cinder service-list
+------------------+-----------------------+------+---------+-------+----------------------------+-----------------+
| Binary           | Host                  | Zone | Status  | State | Updated_at                 | Disabled Reason |
+------------------+-----------------------+------+---------+-------+----------------------------+-----------------+
| cinder-scheduler | controller-0          | nova | enabled | up    | 2018-10-08T12:46:47.000000 | -               |
| cinder-volume    | hostgroup@tripleo_nfs | nova | enabled | up    | 2018-10-08T12:46:55.000000 | -               |
+------------------+-----------------------+------+---------+-------+----------------------------+-----------------+
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
| ID                                   | Status    | Name         | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
| 6e909b3f-96b3-4777-8092-28867dbb6f16 | available | -            | 1    | tripleo     | false    |                                      |


Here is the error from the volume log:

/var/log/containers/cinder/cinder-volume.log:2018-10-08 12:36:48.922 60 DEBUG oslo_concurrency.processutils [req-39ad3e9d-aeb2-4c55-b36b-b74e485473f7 - - - - -] CMD "mount -t nfs -o retry=1,vers=4,minorversion=1 10.35.160.111:/export/ins_cinder /var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38" returned: 32 in 0.419s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409
/var/log/containers/cinder/cinder-volume.log:2018-10-08 12:36:48.923 60 DEBUG oslo_concurrency.processutils [req-39ad3e9d-aeb2-4c55-b36b-b74e485473f7 - - - - -] u'mount -t nfs -o retry=1,vers=4,minorversion=1 10.35.160.111:/export/ins_cinder /var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38' failed. Not Retrying. execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:457
/var/log/containers/cinder/cinder-volume.log:2018-10-08 12:36:48.923 60 INFO os_brick.remotefs.remotefs [req-39ad3e9d-aeb2-4c55-b36b-b74e485473f7 - - - - -] Already mounted: 10.35.160.111:/export/ins_cinder
/var/log/containers/cinder/cinder-volume.log:2018-10-08 12:36:48.924 60 DEBUG os_brick.remotefs.remotefs [req-39ad3e9d-aeb2-4c55-b36b-b74e485473f7 - - - - -] Mounted 10.35.160.111:/export/ins_cinder using pnfs. _mount_nfs /usr/lib/python2.7/site-
Expected results:


Additional info: Original bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1491597#c4
There I reported that adding nosharecache resolved the issue. I haven't tested this here yet.
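For reference, the nosharecache workaround would presumably be applied through the same environment file used in the steps to reproduce. A minimal, untested sketch, assuming the option is simply appended to the existing CinderNfsMountOptions value:

```yaml
# Sketch only: adds nosharecache to the mount options from extra_templates.yaml.
# Not verified on this deployment.
parameter_defaults:
  CinderNfsMountOptions: 'retry=1,nosharecache'
```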

Comment 2 Alan Bishop 2018-10-08 13:53:32 UTC
Lowering the priority because this is a long-standing issue that has not (yet) been a significant issue in the field.

Comment 12 Giulio Fidente 2020-07-03 11:26:11 UTC
Tentatively using container_file_t as default for CinderNfsMountOptions in https://review.opendev.org/739214
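A sketch of what such a value might look like if set explicitly in an environment file. The full context string below is an assumption (only the container_file_t type is named here); the linked review is the authoritative source for the actual default:

```yaml
parameter_defaults:
  # Assumed context string; only the container_file_t type is named in this bug.
  CinderNfsMountOptions: 'context=system_u:object_r:container_file_t:s0'
```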

Comment 20 errata-xmlrpc 2020-10-28 15:36:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4284