Bug 1440700
Summary: Unable to live migrate Nova instance with attached NFS backed Cinder volume
Product: Red Hat OpenStack
Component: openstack-tripleo-heat-templates
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: medium
Reporter: Marius Cornea <mcornea>
Assignee: Alan Bishop <abishop>
QA Contact: Amit Ugol <augol>
CC: abishop, ccollett, cschwede, dbecker, eharney, jjoyce, mburns, morazi, pgrist, rhel-osp-director-maint, tshefi
Target Milestone: z3
Target Release: 11.0 (Ocata)
Keywords: TestOnly, Triaged, ZStream
Fixed In Version: openstack-tripleo-heat-templates-6.1.0-1.el7ost, puppet-tripleo-6.5.0-1.el7ost, puppet-cinder-10.3.1-1.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, some cinder volume operations would fail when using the NFS backend. This was because cinder's NFS backend driver implements enhanced NAS security features that are enabled by default. These features require non-standard configuration changes in nova's libvirt, and without these changes, some cinder volume operations would fail.
This update introduces TripleO settings to control the NFS driver's NAS secure features, and disables the features by default. As a result, cinder volume operations no longer fail when using the NFS backend.
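The fix described above exposes the driver's NAS secure options as deployment-time settings. A minimal sketch of an environment file using them, assuming the TripleO parameter names `CinderNasSecureFileOperations` and `CinderNasSecureFilePermissions` introduced by the fixed-in builds (valid values mirror the driver's own options: 'auto', 'true', or 'false'):

```yaml
parameter_defaults:
  # Hypothetical usage sketch. Per the Doc Text, the fix defaults
  # these to 'false', which keeps cinder volume operations working
  # without the non-standard nova/libvirt changes that secure mode
  # would otherwise require.
  CinderNasSecureFileOperations: 'false'
  CinderNasSecureFilePermissions: 'false'
```

Setting either value to 'auto' would restore the driver's upstream auto-detection behavior.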
Last Closed: 2017-10-31 17:37:35 UTC
Type: Bug
Description
Marius Cornea
2017-04-10 10:10:57 UTC
As a workaround, add this environment file to the deployment:

    parameter_defaults:
      ControllerExtraConfig:
        cinder::config::cinder_config:
          tripleo_nfs/nas_secure_file_operations:
            value: false

Removing the blocker flag.

According to our records, this should be resolved by openstack-tripleo-heat-templates-6.1.0-2.el7ost. This build is available now.

According to our records, this should be resolved by puppet-tripleo-6.5.0-5.el7ost. This build is available now.

According to our records, this should be resolved by puppet-cinder-10.3.1-1.el7ost. This build is available now.

Alan, I hit an issue and failed to verify: I got stuck on cinder create and didn't even reach migration yet.

Versions:

    openstack-tripleo-heat-templates-6.2.0-3.el7ost.noarch
    puppet-tripleo-6.5.0-8.el7ost.noarch
    puppet-cinder-10.3.1-1.el7ost.noarch

This is the file I added to overcloud_deploy to enable NFS for Cinder (and Glance by mistake; not needed for this bug):

    parameter_defaults:
      CinderEnableIscsiBackend: false
      CinderEnableRbdBackend: false
      CinderEnableNfsBackend: true
      CinderNfsMountOptions: 'retry=1'
      CinderNfsServers: '10.35.160.111:/export/ins_cinder'
      GlanceBackend: 'file'
      GlanceNfsEnabled: true
      GlanceNfsShare: '10.35.160.111:/export/ins_glance'

The shares work and the deployment finished, but cinder create fails. Relevant bits of cinder.conf:

    enabled_backends = tripleo_nfs

    [tripleo_nfs]
    volume_backend_name=tripleo_nfs
    volume_driver=cinder.volume.drivers.nfs.NfsDriver
    nfs_shares_config=/etc/cinder/shares-nfs.conf
    nfs_mount_options=retry=1
    nas_secure_file_operations=False  <- good, these are added by default
    nas_secure_file_permissions=False

Volume in error state:

    [stack@undercloud-0 ~]$ cinder list
    | 9270f2b6-2bbb-4be8-9d56-ba484a2dd722 | error

volume.log errors:

    2017-09-11 08:23:11.243 101043 ERROR cinder.service [-] Manager for service cinder-volume hostgroup@tripleo_nfs is reporting problems, not sending heartbeat. Service will appear "down".
    2017-09-11 08:23:21.252 101043 ERROR cinder.service [-] Manager for service cinder-volume hostgroup@tripleo_nfs is reporting problems, not sending heartbeat. Service will appear "down".
    2017-09-11 08:23:29.242 101043 DEBUG oslo_service.periodic_task [req-3fd33190-6c7f-4b47-9649-db0965a0e9b9 - - - - -] Running periodic task VolumeManager._publish_service_capabilities run_periodic_tasks /usr/lib/python2.7/site-packages/oslo_service/periodic_task.py:215
    2017-09-11 08:23:29.242 101043 DEBUG oslo_service.periodic_task [req-3fd33190-6c7f-4b47-9649-db0965a0e9b9 - - - - -] Running periodic task VolumeManager._report_driver_status run_periodic_tasks /usr/lib/python2.7/site-packages/oslo_service/periodic_task.py:215
    2017-09-11 08:23:29.243 101043 WARNING cinder.volume.manager [req-3fd33190-6c7f-4b47-9649-db0965a0e9b9 - - - - -] Update driver status failed: (config name tripleo_nfs) is uninitialized.
    2017-09-11 08:23:31.254 101043 ERROR cinder.service [-] Manager for service cinder-volume hostgroup@tripleo_nfs is reporting problems, not sending heartbeat. Service will appear "down".

Only Glance is mounted; there is no Cinder mount:

    # mount | grep 10.35.160
    10.35.160.111:/export/ins_glance on /var/lib/glance/images type nfs4 (rw,relatime,context=system_u:object_r:glance_var_lib_t:s0,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.108,local_lock=none,addr=10.35.160.111)

The exports on the server are alive:

    # showmount -e 10.35.160.111
    Export list for 10.35.160.111:
    /export/ins_cinder *
    /export/ins_glance *

A manual mount to /mnt/cinder fails; oddly, it claims the target is already mounted, but I don't see it:

    # mount 10.35.160.111:/export/ins_cinder /mnt/cinder
    mount.nfs: /mnt/cinder is busy or already mounted
    # mount | grep cinder

Nada, nothing. I figured that with a single backend there was no need to manually define a type with cinder type-create, but maybe there is.
That didn't work, so I created a type and set its backend; that didn't help either:

    [stack@undercloud-0 ~]$ cinder type-create nfs
    | 3f1636c0-94ac-4ebb-9612-a0524d815b07 | nfs | - | True |
    [stack@undercloud-0 ~]$ cinder type-key nfs set volume_backend_name=tripleo_nfs
    [stack@undercloud-0 ~]$ cinder extra-specs-list
    | 3f1636c0-94ac-4ebb-9612-a0524d815b07 | nfs | {'volume_backend_name': 'tripleo_nfs'} |
    [stack@undercloud-0 ~]$ cinder create 1 --volume-type nfs

That volume is also in error state.

    grep -ir 9270f2b6-2bbb-4be8-9d56-ba484a2dd722 /var/log/cinder/

scheduler.log shows the volume erroring out because no backend is up:

    /var/log/cinder/scheduler.log:2017-09-11 08:09:52.170 78621 DEBUG cinder.volume.flows.common [req-1cae1c50-bf66-4a08-9b96-07da3a559872 b9da1699e5524941bcadf3f2393a2792 f780603a47df4840aad0b77583907364 - default default] Setting Volume 9270f2b6-2bbb-4be8-9d56-ba484a2dd722 to error due to: No valid backend was found. No weighed backends available error_out /usr/lib/python2.7/site-packages/cinder/volume/flows/common.py:85

cinder-api.log shows the create task succeeding and the subsequent GETs returning HTTP 200:

    /var/log/cinder/cinder-api.log:2017-09-11 08:09:52.008 109865 DEBUG cinder.volume.api [req-1cae1c50-bf66-4a08-9b96-07da3a559872 b9da1699e5524941bcadf3f2393a2792 f780603a47df4840aad0b77583907364 - default default] Task 'cinder.volume.flows.api.create_volume.EntryCreateTask;volume:create' (7ab55d59-0058-47a7-98dc-58935833e55b) transitioned into state 'SUCCESS' from state 'RUNNING' with result '{'volume': Volume(_name_id=None,admin_metadata=<?>,attach_status='detached',availability_zone='nova',bootable=False,cluster=<?>,cluster_name=None,consistencygroup=<?>,consistencygroup_id=None,created_at=2017-09-11T08:09:51Z,deleted=False,deleted_at=None,display_description=None,display_name=None,ec2_id=None,encryption_key_id=None,glance_metadata=<?>,group=<?>,group_id=None,host=None,id=9270f2b6-2bbb-4be8-9d56-ba484a2dd722,launched_at=None,metadata={},migration_status=None,multiattach=False,previous_status=None,project_id='f780603a47df4840aad0b77583907364',provider_auth=None,provider_geometry=None,provider_id=None,provider_location=None,replication_driver_data=None,replication_extended_status=None,replication_status=None,scheduled_at=None,size=1,snapshot_id=None,snapshots=<?>,source_volid=None,status='creating',terminated_at=None,updated_at=None,user_id='b9da1699e5524941bcadf3f2393a2792',volume_attachment=<?>,volume_type=<?>,volume_type_id=None), 'volume_properties': VolumeProperties(attach_status='detached',availability_zone='nova',cgsnapshot_id=None,consistencygroup_id=None,display_description=None,display_name=None,encryption_key_id=None,group_id=None,group_type_id=<?>,metadata={},multiattach=False,project_id='f780603a47df4840aad0b77583907364',qos_specs=None,replication_status=<?>,reservations=['7e316b1c-f354-4e08-9332-b374c89cde0c','b712d2b8-d0a3-4883-b299-42cfba3eac2c'],size=1,snapshot_id=None,source_replicaid=None,source_volid=None,status='creating',user_id='b9da1699e5524941bcadf3f2393a2792',volume_type_id=None), 'volume_id': '9270f2b6-2bbb-4be8-9d56-ba484a2dd722'}' _task_receiver /usr/lib/python2.7/site-packages/taskflow/listeners/logging.py:183
    /var/log/cinder/cinder-api.log:2017-09-11 08:09:52.075 109865 INFO cinder.api.openstack.wsgi [req-1a84f97b-7821-4c19-bd26-8660ecc5a8bd b9da1699e5524941bcadf3f2393a2792 f780603a47df4840aad0b77583907364 - default default] GET http://10.0.0.104:8776/v2/f780603a47df4840aad0b77583907364/volumes/9270f2b6-2bbb-4be8-9d56-ba484a2dd722
    /var/log/cinder/cinder-api.log:2017-09-11 08:09:52.147 109865 INFO cinder.api.openstack.wsgi [req-1a84f97b-7821-4c19-bd26-8660ecc5a8bd b9da1699e5524941bcadf3f2393a2792 f780603a47df4840aad0b77583907364 - default default] http://10.0.0.104:8776/v2/f780603a47df4840aad0b77583907364/volumes/9270f2b6-2bbb-4be8-9d56-ba484a2dd722 returned with HTTP 200
    /var/log/cinder/cinder-api.log:2017-09-11 08:22:18.431 109865 INFO cinder.api.openstack.wsgi [req-2432adf6-96a9-4e52-81cf-570fd415b608 b9da1699e5524941bcadf3f2393a2792 f780603a47df4840aad0b77583907364 - default default] GET http://10.0.0.104:8776/v2/f780603a47df4840aad0b77583907364/volumes/9270f2b6-2bbb-4be8-9d56-ba484a2dd722
    /var/log/cinder/cinder-api.log:2017-09-11 08:22:18.495 109865 INFO cinder.api.openstack.wsgi [req-2432adf6-96a9-4e52-81cf-570fd415b608 b9da1699e5524941bcadf3f2393a2792 f780603a47df4840aad0b77583907364 - default default] http://10.0.0.104:8776/v2/f780603a47df4840aad0b77583907364/volumes/9270f2b6-2bbb-4be8-9d56-ba484a2dd722 returned with HTTP 200

And this smoking gun bit:

    /var/log/cinder/volume.log:2017-09-11 07:42:09.966 101043 ERROR cinder.volume.drivers.remotefs [req-90bc1c8d-5424-44f3-917b-773bc84dcd38 - - - - -] Exception during mounting NFS mount failed for share 10.35.160.111:/export/ins_cinder.
    Error - {'pnfs': u"Unexpected error while running command.\nCommand: sudo cinder-rootwrap /etc/cinder/rootwrap.conf mount -t nfs -o retry=1,vers=4,minorversion=1 10.35.160.111:/export/ins_cinder /var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38\nExit code: 32\nStdout: u''\nStderr: u'mount.nfs: /var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38 is busy or already mounted\\n'", 'nfs': u"Unexpected error while running command.\nCommand: sudo cinder-rootwrap /etc/cinder/rootwrap.conf mount -t nfs -o retry=1 10.35.160.111:/export/ins_cinder /var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38\nExit code: 32\nStdout: u''\nStderr: u'mount.nfs: /var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38 is busy or already mounted\\n'"}
    /var/log/secure:Sep 11 03:

Let me know, should I keep the system up for you to access?

Created attachment 1324370 [details]
Cinder config files and logs
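The hashed directory in the mount error above (/var/lib/cinder/mnt/47266020eacec99097bdec49f2451d38) is how cinder's remotefs-based NFS driver names mount points: each share is mounted under the mount point base in a directory derived from the share string. A small sketch for locating the directory to inspect when a mount reports "busy or already mounted"; it assumes the md5-of-share-string scheme used by cinder.volume.drivers.remotefs and cinder's default base path:

```python
import hashlib

NFS_MOUNT_POINT_BASE = "/var/lib/cinder/mnt"  # cinder's default nfs_mount_point_base


def share_mount_point(share, base=NFS_MOUNT_POINT_BASE):
    """Map an NFS share string (e.g. 'host:/export/path') to the
    directory cinder mounts it under, assuming the driver names the
    mount point after the md5 hex digest of the share string."""
    digest = hashlib.md5(share.encode("utf-8")).hexdigest()
    return "{}/{}".format(base, digest)


if __name__ == "__main__":
    # Directory to check against /proc/mounts for the share from the
    # failing deployment above.
    print(share_mount_point("10.35.160.111:/export/ins_cinder"))
```

Comparing this path against /proc/mounts (rather than relying on the mount command's summary alone) helps determine whether the "already mounted" claim reflects a mount visible in the namespace you are inspecting.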
Verified. A booted instance with an attached NFS-backed Cinder volume was successfully migrated to a second compute. I retested on the same undercloud.

Versions:

    openstack-tripleo-heat-templates-6.2.0-3.el7ost.noarch
    puppet-tripleo-6.5.0-8.el7ost.noarch
    puppet-cinder-10.3.1-1.el7ost.noarch

This time I configured neither Glance NFS nor Cinder's nfs_mount_options=retry=1.

NFS heat template used:

    $ cat nfs11Cinder.yaml
    parameter_defaults:
      CinderEnableIscsiBackend: false
      CinderEnableRbdBackend: false
      CinderEnableNfsBackend: true
      CinderNfsMountOptions: ''
      CinderNfsServers: 'W.X.Y.Z:/export/ins_cinder'

cinder.conf:

    [tripleo_nfs]
    volume_backend_name=tripleo_nfs
    volume_driver=cinder.volume.drivers.nfs.NfsDriver
    nfs_shares_config=/etc/cinder/shares-nfs.conf
    nfs_mount_options=
    nas_secure_file_operations=False
    nas_secure_file_permissions=False

Basic Cinder sanity passed, moving on.

1. cinder create worked:

    $ cinder list
    | ID | Status | Name | Size | Volume Type | Bootable | Attached to |
    | f15212c4-94f9-4dad-a40e-253f82412ffa | available | - | 1 | - | false | |

2. Boot an instance:

    $ nova list
    | ID | Name | Status | Task State | Power State | Networks |
    | 8e7d2ee0-e106-45f6-8bc0-5d544035997d | inst1 | ACTIVE | - | Running | internal=192.168.0.3, 10.10.10.12 |

3. Attach the volume to the instance:

    $ nova volume-attach 8e7d2ee0-e106-45f6-8bc0-5d544035997d f15212c4-94f9-4dad-a40e-253f82412ffa auto
    | Property | Value |
    | device   | /dev/vdb |
    | id       | f15212c4-94f9-4dad-a40e-253f82412ffa |
    | serverId | 8e7d2ee0-e106-45f6-8bc0-5d544035997d |
    | volumeId | f15212c4-94f9-4dad-a40e-253f82412ffa |

4. Now we see an attached volume:

    $ cinder list
    | ID | Status | Name | Size | Volume Type | Bootable | Attached to |
    | f15212c4-94f9-4dad-a40e-253f82412ffa | in-use | - | 1 | - | false | 8e7d2ee0-e106-45f6-8bc0-5d544035997d |

5. Migrate the instance with its attached volume (the actual verification):

    $ openstack server migrate inst1
    $ nova list
    | ID | Name | Status | Task State | Power State | Networks |
    | 8e7d2ee0-e106-45f6-8bc0-5d544035997d | inst1 | RESIZE | resize_migrating | Running | internal=192.168.0.3, 10.10.10.12 |
    $ nova list
    | ID | Name | Status | Task State | Power State | Networks |
    | 8e7d2ee0-e106-45f6-8bc0-5d544035997d | inst1 | VERIFY_RESIZE | - | Running | internal=192.168.0.3, 10.10.10.12 |
    $ nova resize-confirm inst1

6. Post migrate, inst1 is alive and still has an attached volume. Ergo, verified :)

    [stack@undercloud-0 ~]$ nova list
    | ID | Name | Status | Task State | Power State | Networks |
    | 8e7d2ee0-e106-45f6-8bc0-5d544035997d | inst1 | ACTIVE | - | Running | internal=192.168.0.3, 10.10.10.12 |
    [stack@undercloud-0 ~]$ cinder list
    | ID | Status | Name | Size | Volume Type | Bootable | Attached to |
    | f15212c4-94f9-4dad-a40e-253f82412ffa | in-use | - | 1 | - | false | 8e7d2ee0-e106-45f6-8bc0-5d544035997d |

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3098