Bug 1710034
| Summary: | [RHOS-15] Tempest tests failing on "Cannot access storage file '/var/lib/nova/instances/<uuid>/disk' (as uid:107, gid:107): Permission denied"}" error | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Archit Modi <amodi> | ||||
| Component: | puppet-tripleo | Assignee: | Martin Schuppert <mschuppe> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Joe H. Rahme <jhakimra> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | urgent | ||||||
| Version: | 15.0 (Stein) | CC: | apevec, dasmith, eglynn, jhakimra, jjoyce, jschluet, kchamart, mbooth, mschuppe, ratailor, sbauza, sgordon, slinaber, tvignaud, vromanso, whayutin | ||||
| Target Milestone: | beta | Keywords: | Triaged | ||||
| Target Release: | 15.0 (Stein) | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | puppet-tripleo-10.4.2-0.20190524100405.61a73d1.el8ost | Doc Type: | No Doc Update | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2019-09-21 11:22:03 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
Description
Archit Modi
2019-05-14 19:19:48 UTC
*** Bug 1710439 has been marked as a duplicate of this bug. ***

https://bugzilla.redhat.com/show_bug.cgi?id=1710049 is most likely related, or even a duplicate.

The target files in the above cases appear to have been copied there over SSH. There are no relevant SELinux denials in the audit logs, so I assume that this is a file permission issue. I'm going to need a reproducer system, but I suspect this is going to be a TripleO thing to fix.

*** Bug 1710049 has been marked as a duplicate of this bug. ***

(overcloud) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+-------+--------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| ID                                   | Name  | Status | Task State | Power State | Networks             | Image Name | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                  | Properties |
+--------------------------------------+-------+--------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| de54e5dd-c3cc-49f0-ae6c-442512552d28 | test1 | ACTIVE | None       | Running     | private=192.168.0.36 | cirros     | 936f80af-bedf-46c2-a511-014d14421487 | m1.small    | b23e6419-9886-4ca3-bfcb-9ec17d4e9f89 | nova              | compute-0.localdomain |            |
+--------------------------------------+-------+--------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
(overcloud) [stack@undercloud-0 ~]$ openstack server stop test1
(overcloud) [stack@undercloud-0 ~]$ openstack server migrate --wait test1
Complete
(overcloud) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+-------+---------------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| ID                                   | Name  | Status        | Task State | Power State | Networks             | Image Name | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                  | Properties |
+--------------------------------------+-------+---------------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| de54e5dd-c3cc-49f0-ae6c-442512552d28 | test1 | VERIFY_RESIZE | None       | Shutdown    | private=192.168.0.36 | cirros     | 936f80af-bedf-46c2-a511-014d14421487 | m1.small    | b23e6419-9886-4ca3-bfcb-9ec17d4e9f89 | nova              | compute-1.localdomain |            |
+--------------------------------------+-------+---------------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
(overcloud) [stack@undercloud-0 ~]$ openstack server resize --confirm test1
(overcloud) [stack@undercloud-0 ~]$ openstack server list --long
+--------------------------------------+-------+---------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| ID                                   | Name  | Status  | Task State | Power State | Networks             | Image Name | Image ID                             | Flavor Name | Flavor ID                            | Availability Zone | Host                  | Properties |
+--------------------------------------+-------+---------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
| de54e5dd-c3cc-49f0-ae6c-442512552d28 | test1 | SHUTOFF | None       | Shutdown    | private=192.168.0.36 | cirros     | 936f80af-bedf-46c2-a511-014d14421487 | m1.small    | b23e6419-9886-4ca3-bfcb-9ec17d4e9f89 | nova              | compute-1.localdomain |            |
+--------------------------------------+-------+---------+------------+-------------+----------------------+------------+--------------------------------------+-------------+--------------------------------------+-------------------+-----------------------+------------+
(overcloud) [stack@undercloud-0 ~]$ openstack server start test1
=> fail

2019-05-21 07:02:25.896 6 ERROR nova.virt.libvirt.driver [req-e61e8f5c-b6ce-464d-9724-f5a8d3e7b885 2347cdd46e804f19b8dedb6e8d0bb44a 184feb4b990f41489806127bdfa6df19 - default default] [instance: de54e5dd-c3cc-49f0-ae6c-442512552d28] Failed to start libvirt guest: libvirt.libvirtError: Cannot access storage file '/var/lib/nova/instances/de54e5dd-c3cc-49f0-ae6c-442512552d28/disk.eph0' (as uid:107, gid:107): Permission denied

* wrong permissions on the instance directory, should be 42436:42436 / 755

[root@compute-1 instances]# ll /var/lib/nova/instances
...
drwx------. 2 42436 root 52 May 21 07:34 de54e5dd-c3cc-49f0-ae6c-442512552d28

* instance disk and disk.eph0 should be owned by root:root when powered off and qemu:qemu (644) when switched on

[root@compute-1 instances]# ll /var/lib/nova/instances/de54e5dd-c3cc-49f0-ae6c-442512552d28/
total 2376
-rw-r--r--. 1 42436 42436 0 May 21 07:00 console.log
-rw-------. 1 42436 root 2228224 May 21 07:00 disk
-rw-------. 1 42436 root 196624 May 21 07:00 disk.eph0
-rw-------. 1 42436 root 162 May 21 07:00 disk.info

For comparison, a freshly started instance:

[root@compute-0 ~]# ll /var/lib/nova/instances/51c39ade-f6ba-4034-9459-f489791b0999/
total 2408
-rw-------. 1 root root 25288 May 21 07:17 console.log
-rw-r--r--. 1 root root 2293760 May 21 07:17 disk
-rw-r--r--. 1 root root 196624 May 21 07:16 disk.eph0
-rw-r--r--. 1 42436 42436 162 May 21 07:16 disk.info

[root@compute-0 ~]# ll /var/lib/nova/instances/51c39ade-f6ba-4034-9459-f489791b0999/
total 1380
-rw-------. 1 root root 20927 May 21 07:16 console.log
-rw-r--r--. 1 qemu qemu 1245184 May 21 07:16 disk
-rw-r--r--. 1 qemu qemu 196624 May 21 07:16 disk.eph0
-rw-r--r--. 1 42436 42436 162 May 21 07:16 disk.info

Adding some debug output to nova_migration_wrapper shows that it runs with a umask of 63 (decimal), which is 077 octal, while we should have 022 octal:

May 21 08:44:40 compute-1 nova_migration_wrapper[140456]: old umask = '63'
May 21 08:44:40 compute-1 nova_migration_wrapper[140456]: Allowing connection='172.17.1.112 43042 172.17.1.58 2022' command=['/usr/bin/nova-rootwrap', '/etc/nova/migration/rootwrap.conf', 'scp', '-r', '-t', '/var/lib/nova/instances/ba632f2b-a388-467e-b337-0986d52b0d24/disk.info']

The behavior differs between RHEL7 and RHEL8.

RHEL8:
[root@compute-0 ~]# sudo su -c umask
0077

RHEL7 (even if we do not specify to use a login shell):
[root@compute-0 ~]# sudo su -c umask
0022

This has landed upstream.

*** Bug 1715499 has been marked as a duplicate of this bug. ***

(overcloud) [stack@undercloud-0 tempestdir]$ tempest run --whitelist-file tempestlist.txt
{1} tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_verify_resize_state [76.296827s] ... ok
{2} tempest.api.compute.servers.test_disk_config.ServerDiskConfigTestJSON.test_resize_server_from_auto_to_manual [80.066982s] ... ok
{0} tempest.api.compute.admin.test_migrations.MigrationsAdminTest.test_cold_migration [86.336120s] ... ok
{2} tempest.api.compute.servers.test_disk_config.ServerDiskConfigTestJSON.test_resize_server_from_manual_to_auto [83.380423s] ... ok
{0} tempest.api.compute.admin.test_migrations.MigrationsAdminTest.test_list_migrations_in_flavor_resize_situation [77.863636s] ... ok
{2} setUpClass (tempest.api.compute.volumes.test_attach_volume.AttachVolumeMultiAttachTest) ... SKIPPED: Volume multi-attach is not available.
{1} tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_resize [88.593727s] ... ok
{0} tempest.api.compute.admin.test_migrations.MigrationsAdminTest.test_resize_server_revert_deleted_flavor [74.262430s] ... ok
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_confirm [34.079580s] ... ok
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_confirm_from_stopped [52.781385s] ... ok
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert_with_volume_attached [73.127325s] ... ok
======
Totals
======
Ran: 11 tests in 442.0713 sec.
- Passed: 10
- Skipped: 1
- Expected Fail: 0
- Unexpected Success: 0
- Failed: 0
Sum of execute time for each test: 726.7884 sec.
==============
Worker Balance
==============
- Worker 0 (6 tests) => 0:07:19.942135
- Worker 1 (2 tests) => 0:02:53.737000
- Worker 2 (3 tests) => 0:02:47.642804
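
The umask finding from the nova_migration_wrapper debugging earlier in this report can be checked with a little arithmetic. The Python sketch below is illustrative only and is not part of the fix: `mode_under_umask` is a hypothetical helper, and the requested modes 0666 and 0777 are the usual defaults for new files and directories, assumed here rather than taken from the wrapper or from scp.

```python
import stat


def mode_under_umask(requested: int, umask: int) -> int:
    """Permission bits a newly created object ends up with: requested & ~umask."""
    return requested & ~umask & 0o777


# The wrapper logged the umask in decimal: 63 decimal is 077 octal.
logged_umask = 63
assert logged_umask == 0o077

for label, umask in (("observed", logged_umask), ("expected", 0o022)):
    file_mode = mode_under_umask(0o666, umask)  # assumed default for files
    dir_mode = mode_under_umask(0o777, umask)   # assumed default for directories
    print(
        f"{label} umask {umask:03o}: "
        f"file {file_mode:03o} {stat.filemode(stat.S_IFREG | file_mode)}, "
        f"dir {dir_mode:03o} {stat.filemode(stat.S_IFDIR | dir_mode)}"
    )
```

Under these assumptions a umask of 077 gives 600 for files and 700 for directories, which matches the `-rw-------` and `drwx------` entries seen on compute-1, while 022 gives the 644 and 755 that the report says are expected.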
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2019:2811
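
For readers retracing the failure, the "Permission denied" for uid:107, gid:107 follows from the ordinary owner/group/other permission check applied to the listings in this report. The Python sketch below is a simplified model, not libvirt's or the kernel's actual code: `can_access` is a hypothetical helper, and it deliberately ignores root privileges, supplementary groups, ACLs, and SELinux. The group of the files listed as `root` is assumed to be gid 0.

```python
READ = 0o4    # read permission bit
SEARCH = 0o1  # execute bit, which means "search" on a directory


def can_access(mode: int, owner_uid: int, owner_gid: int,
               uid: int, gid: int, want: int) -> bool:
    """Pick the owner, group, or other permission triplet, then test `want`."""
    if uid == owner_uid:
        bits = (mode >> 6) & 0o7
    elif gid == owner_gid:
        bits = (mode >> 3) & 0o7
    else:
        bits = mode & 0o7
    return (bits & want) == want


QEMU_UID = QEMU_GID = 107  # uid/gid quoted in the libvirt error message

# Values as listed on compute-1 after the migration (group root assumed gid 0).
print(can_access(0o700, 42436, 0, QEMU_UID, QEMU_GID, SEARCH))  # instance directory
print(can_access(0o600, 42436, 0, QEMU_UID, QEMU_GID, READ))    # disk.eph0

# Values the report says are expected: directory 42436:42436 / 755, files 644.
print(can_access(0o755, 42436, 42436, QEMU_UID, QEMU_GID, SEARCH))
print(can_access(0o644, 0, 0, QEMU_UID, QEMU_GID, READ))
```

In this model the first two checks fail and the last two succeed, which is consistent with the guest failing to start only after the files were copied under the restrictive umask.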