Description of problem:

When starting the nova_compute container on a compute node whose /var/lib/nova/instances is shared between multiple nodes, all running instances get their disks put into read-only mode.

Version-Release number of selected component (if applicable):

How reproducible:

Test environment:
1 Hypervisor Host for Infrastructure
2 Compute Nodes
1 NetApp (other NFS servers would probably work, too)

All nodes in our setup were predeployed, but this should be reproducible on normal installations as well.

Install: RHOSP12, 3 Controllers, 2 Computes, containerized environment. The NFS share was added before the actual RHOSP installation, but from my tests it should also work when adding it later.

After the installation, we prepare just one node to be able to receive VMs, so that the test behaves like a scale-out (it's just faster). So, if not predeployed: stop all docker containers, add the NFS share to /var/lib/nova/instances on compute1, start the containers on compute1 (a rough shell sketch of these steps is included below). After that, and a bit of preparing OpenStack, you should be able to start a VM. Inside the VM, we executed bonnie++ to put I/O load on the root disk:

bonnie++ -d /root -c 1 -s 8100 -x 2000 -f -b -u root

Now add the second compute node by adding the share to the node (should work without issues). After that, start the containers one by one, but leave nova_compute as the last one to start. Before you start nova_compute, you can run something like:

while true; do date; dmesg | grep -i error; sleep 1; done

in the VM's console. This prints the time each second and checks dmesg for errors. When you're ready, start nova_compute and watch the VM go read-only.

Steps to Reproduce:
If you did the above, an easy way to reproduce:
1. stop nova_compute on node2
2. start a new VM as described above
3. start nova_compute on node2

Actual results:
The guest (we tested with different guest images!) gets I/O errors and the guest root disk goes read-only until you power it off and back on.

Expected results:
Guests on other hypervisors (but the same share) are not affected at all.

Additional info:
We hit a problem in one of our customers' environments when we scaled two new compute nodes into the environment. The OpenStack environment consists of four availability zones; two of them got a node added. After the scaling, we got reports about VMs reporting their root disks read-only. We noticed that out of the four zones only two were affected, but it affected all VMs in each of those zones. After some investigation, we noticed that the NFS share is the only commonality between all nodes/VMs. The NFS share is added before RHOSP is installed, so it is not part of the RHOSP installation. For that reason, we removed the share from a new compute node, scaled that node into the environment, and nothing happened. So it must somehow be related to the share. We then started to test only tasks related to the share, and finally we were able to reproduce it by simply starting nova_compute.

The customer plans to put RHOSP12 into production at the beginning of July. This one would currently be a killer.

Some info:

# docker start nova_compute
nova_compute
#

/var/log/messages:
2018-06-22 14:07:57 +02:00 compute02 kern.warning kernel: [79615.684979] overlayfs: upperdir is in-use by another mount, accessing files from both mounts will result in undefined behavior.
2018-06-22 14:07:57 +02:00 compute02 kern.warning kernel: [79615.684992] overlayfs: workdir is in-use by another mount, accessing files from both mounts will result in undefined behavior.
2018-06-22 14:07:57 +02:00 compute02 user.debug oci-systemd-hook[304421]: systemdhook <debug>: ae399472d6c8: Skipping as container command is kolla_start, not init or systemd
2018-06-22 14:07:57 +02:00 compute02 user.debug oci-umount[304422]: umounthook <debug>: prestart container_id:ae399472d6c8 rootfs:/var/lib/docker/overlay2/6324b25985ec8262ad5bf750e9ae3fb96dda4b45a15901ff5242b71c77d93498/merged

Same moment on the guest, /var/log/messages:

2018-06-22 14:07:57 +02:00 (none) kern.err kernel: [ 1760.124936] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:57 +02:00 (none) kern.err kernel: [ 1760.124940] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:57 +02:00 (none) kern.warning kernel: [ 1760.124942] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.353043] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.353046] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.353048] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.353498] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.353500] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.353501] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.353916] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.353917] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.353918] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.357556] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.357558] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.357559] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.357963] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.357965] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.357966] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.358420] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.358423] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.358425] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.358932] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.358935] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.358937] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.359378] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.359380] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.359381] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.359827] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.359830] Buffer I/O error on device vda1, logical block 2223895
2018-06-22 14:07:58 +02:00 (none) kern.warning kernel: [ 1760.359832] lost page write due to I/O error on vda1
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.360345] end_request: I/O error, dev vda, sector 17793208
2018-06-22 14:07:58 +02:00 (none) kern.err kernel: [ 1760.360790] end_request: I/O error, dev vda, sector 17793208
The NetApp in this case is a FAS6290 running Ontap Release 8.2.3P6
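For reference, a minimal shell sketch of the scale-in step that triggers the problem, under stated assumptions: the export name "nfs-server:/nova" and the use of /etc/fstab are placeholders for illustration (the original environment used a NetApp export configured outside of RHOSP).

# On compute2 (the node joining the shared instance store) - run as root.
# 1. Stop all containers on this node.
docker stop $(docker ps -q)

# 2. Mount the shared instance store (export path is an assumption).
echo "nfs-server:/nova  /var/lib/nova/instances  nfs  defaults  0 0" >> /etc/fstab
mount /var/lib/nova/instances

# 3. Start the other containers first, then nova_compute last.
#    On an affected build, this last command is the moment guests on the
#    other compute node (same share) get I/O errors and go read-only.
for c in $(docker ps -a --filter status=exited --format '{{.Names}}' | grep -v '^nova_compute$'); do
    docker start "$c"
done
docker start nova_compute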
(In reply to Sven Michels from comment #0)
> /var/log/messages:
> 2018-06-22 14:07:57 +02:00 compute02 kern.warning kernel: [79615.684979]
> overlayfs: upperdir is in-use by another mount, accessing files from both
> mounts will result in undefined behavior.

I'm pretty sure that's the problem right there. overlayfs can't be involved here or, as it says, the behaviour is undefined. This needs to be mounted directly. Presumably docker on the source is detecting that the directory is mounted elsewhere and is marking it read-only for safety. When qemu writes to a disk, that write needs to go direct to the filer with no overlayfs involved. However, the docker containers need to be configured to achieve that, and that's what we need to do. Presumably this is still a compute bug, but nova isn't involved.
As Irina pointed out, we're using RHEL 7.5, and according to this doc:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.5_release_notes/technology_previews_file_systems

overlayfs is a Technology Preview and only supported under certain circumstances. One note I found: "Only XFS is currently supported for use as a lower layer file system." The nodes are running on ext4. I'm not sure how this is related to the issue, but Irina and I agreed that it might be interesting.
Can you retest with the nfs export mounted on /var/lib/nova instead of /var/lib/nova/instances? This appears to be safe, looking at the state_path docs in https://docs.openstack.org/ocata/config-reference/compute/config-options.html: "In some scenarios (for example migrations) it makes sense to use a storage location which is shared between multiple compute hosts (for example via NFS)"
Hey Ollie, I tested the suggestion (a rough sketch of the steps follows this comment). I stopped all the docker containers and VMs, created an instances directory on the share, moved everything in there, unmounted the share, moved /var/lib/nova away, created a new directory, set the permissions, changed fstab, mounted /var/lib/nova and started docker again. I also made sure a fresh new VM was used for testing. After the VM was up and running, I started the containers on the second compute node again. The result was the same: everything was normal until I started the nova_compute container, then the VM on the other node immediately got its disk set read-only. So the change didn't change much, sorry :(
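For clarity, a minimal sketch of the re-mount procedure described above. The export name "nfs-server:/nova", the fstab-managed mount, and the 42436 uid/gid are assumptions for illustration; adjust to the actual NetApp export and deployment.

# Run as root on the compute node; VMs are already stopped.
docker stop $(docker ps -q)

# Move the existing instance data into an "instances" subdirectory on the share
# (the share is still mounted at the old mount point here).
cd /var/lib/nova/instances
mkdir instances
for f in *; do [ "$f" != "instances" ] && mv "$f" instances/; done
cd /
umount /var/lib/nova/instances

# Recreate /var/lib/nova as the new mount point and remount the share there.
mv /var/lib/nova /var/lib/nova.old
mkdir /var/lib/nova
chown 42436:42436 /var/lib/nova                                # nova uid/gid used by the kolla containers (assumed)
sed -i 's|/var/lib/nova/instances|/var/lib/nova|' /etc/fstab   # move the mount point in fstab
mount /var/lib/nova

docker start $(docker ps -aq --filter status=exited)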
I believe a conflict between overlayfs and NFS is causing this issue. Adding Vivek to see what he thinks.
I've been able to reproduce this with a fresh deployment (1 controller + 2 computes) using the latest docker images (12.0-20180529.1) and a RHEL NFS server:

(undercloud) [stack@undercloud-12 ~]$ cat /etc/exports
/nfs/nova *(rw,sync,no_root_squash,no_all_squash)

Here is what happens to a cirros instance running on compute-0:

$ while true; do echo "1" > /tmp/iotest && sleep 2; done
[ 1989.829606] end_request: I/O error, dev vda, sector 64899
[ 1989.833039] Buffer I/O error on device vda1, logical block 24417
[ 1989.833039] lost page write due to I/O error on vda1
[ 1989.833039] end_request: I/O error, dev vda, sector 46791
[ 1989.833039] Buffer I/O error on device vda1, logical block 15363
[ 1989.833039] lost page write due to I/O error on vda1
[ 1989.910484] JBD: Detected IO errors while flushing file data on vda1
[ 1989.932504] end_request: I/O error, dev vda, sector 51025
[ 1989.945224] Aborting journal on device vda1.
[ 1989.955549] EXT3-fs (vda1): error: ext3_journal_start_sb: Detected aborted journal
[ 1989.970686] end_request: I/O error, dev vda, sector 49351
[ 1989.974598] Buffer I/O error on device vda1, logical block 16643
[ 1989.974598] lost page write due to I/O error on vda1
[ 1990.006547] EXT3-fs (vda1): error: remounting filesystem read-only
[ 1990.019265] JBD: I/O error detected when updating journal superblock for vda1.
[ 1990.038305] end_request: I/O error, dev vda, sector 26931
[ 1990.042271] Buffer I/O error on device vda1, logical block 5433
[ 1990.042271] lost page write due to I/O error on vda1
[ 1990.072201] JBD: Detected IO errors while flushing file data on vda1
-sh: can't create /tmp/iotest: Read-only file system

The filesystem remains accessible read/write on the underlying host:

[root@overcloud-compute-0 ~]# touch /var/lib/nova/instances/a
[root@overcloud-compute-0 ~]# echo a > /var/lib/nova/instances/a
[root@overcloud-compute-0 ~]#

My deployment is using standard images, so XFS on the overcloud (and on the undercloud/NFS server, fwiw). Stopping and starting the VM restores read/write access to the root disk. There are no entries in dmesg/messages that could correlate to this issue.
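For completeness, a rough sketch of the NFS setup used in this kind of reproducer. The export path matches the /etc/exports above; the server hostname "undercloud-12" and the mount options are assumptions for illustration.

# On the NFS server (the undercloud host in this reproducer):
mkdir -p /nfs/nova
echo '/nfs/nova *(rw,sync,no_root_squash,no_all_squash)' >> /etc/exports
systemctl enable nfs-server && systemctl start nfs-server
exportfs -ra

# On each overcloud compute node:
echo 'undercloud-12:/nfs/nova  /var/lib/nova/instances  nfs  defaults  0 0' >> /etc/fstab
mount /var/lib/nova/instances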
I've run this on compute-0 (where the VM runs) from within /var/lib/nova/instances/:

[root@overcloud-compute-0 instances]# while true; do echo `date` >> /tmp/output && ls -lart * >> /tmp/output && sleep 1; done

Sat Jun 23 15:04:24 UTC 2018
locks:
total 0
-rw-r--r--. 1 42436 42436 0 Jun 23 12:43 nova-e0aa1cba172506b12f79eb056c4c9ea0ae9442b7
drwxr-xr-x. 2 42436 42436 59 Jun 23 12:43 .
drwxrwxrwx. 5 42436 42436 94 Jun 23 15:01 ..

_base:
total 18176
drwxr-xr-x. 2 42436 42436 54 Jun 23 12:44 .
-rw-r--r--. 1 qemu qemu 41126400 Jun 23 12:44 e0aa1cba172506b12f79eb056c4c9ea0ae9442b7
drwxrwxrwx. 5 42436 42436 94 Jun 23 15:01 ..

49ef880a-d19a-46ab-91f9-239975a31ddc:
total 2588
-rw-r--r--. 1 42436 42436 79 Jun 23 12:43 disk.info
drwxrwxrwx. 5 42436 42436 94 Jun 23 15:01 ..
drwxr-xr-x. 2 42436 42436 54 Jun 23 15:01 .
-rw-r--r--. 1 qemu qemu 2686976 Jun 23 15:02 disk
-rw-------. 1 root root 16635 Jun 23 15:04 console.log

Here the VM is running fine; notice the disk is owned by qemu:qemu while console.log is root:root.

I then started the nova_compute container on compute-1, and here is what happens:

Sat Jun 23 15:04:56 UTC 2018
locks:
total 0
-rw-r--r--. 1 42436 42436 0 Jun 23 12:43 nova-e0aa1cba172506b12f79eb056c4c9ea0ae9442b7
drwxr-xr-x. 2 42436 42436 59 Jun 23 12:43 .
drwxrwxrwx. 5 42436 42436 94 Jun 23 15:01 ..

_base:
total 18176
drwxr-xr-x. 2 42436 42436 54 Jun 23 12:44 .
-rw-r--r--. 1 42436 42436 41126400 Jun 23 12:44 e0aa1cba172506b12f79eb056c4c9ea0ae9442b7
drwxrwxrwx. 5 42436 42436 94 Jun 23 15:01 ..

49ef880a-d19a-46ab-91f9-239975a31ddc:
total 2588
-rw-r--r--. 1 42436 42436 79 Jun 23 12:43 disk.info
drwxrwxrwx. 5 42436 42436 94 Jun 23 15:01 ..
drwxr-xr-x. 2 42436 42436 54 Jun 23 15:01 .
-rw-r--r--. 1 42436 42436 2686976 Jun 23 15:02 disk
-rw-------. 1 42436 42436 16657 Jun 23 2018 console.log

disk and console.log have been chown'd to 42436:42436 (the id of the nova user inside the container). Stopping nova_compute and then stopping/starting the VM brings things back on track:

-rw-------. 1 root root 16684 Jun 23 15:13 console.log
-rw-r--r--. 1 qemu qemu 2686976 Jun 23 15:13 disk
-rw-r--r--. 1 42436 42436 79 Jun 23 12:43 disk.info

Sven, can you check whether this is actually happening in the customer environment?
nova_compute container logs:

INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/libvirt/libvirtd.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/libvirt/libvirtd.conf to /etc/libvirt/libvirtd.conf
INFO:__main__:Deleting /etc/libvirt/passwd.db
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/libvirt/passwd.db to /etc/libvirt/passwd.db
INFO:__main__:Deleting /etc/libvirt/qemu.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/libvirt/qemu.conf to /etc/libvirt/qemu.conf
INFO:__main__:Deleting /etc/my.cnf.d/tripleo.cnf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/tripleo.cnf to /etc/my.cnf.d/tripleo.cnf
INFO:__main__:Deleting /etc/nova/migration/authorized_keys
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/nova/migration/authorized_keys to /etc/nova/migration/authorized_keys
INFO:__main__:Deleting /etc/nova/migration/identity
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/nova/migration/identity to /etc/nova/migration/identity
INFO:__main__:Deleting /etc/nova/nova.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/nova/nova.conf to /etc/nova/nova.conf
INFO:__main__:Deleting /etc/sasl2/libvirt.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/sasl2/libvirt.conf to /etc/sasl2/libvirt.conf
INFO:__main__:Deleting /etc/ssh/sshd_config
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/ssh/sshd_config to /etc/ssh/sshd_config
INFO:__main__:Deleting /var/lib/nova/.ssh/config
INFO:__main__:Copying /var/lib/kolla/config_files/src/var/lib/nova/.ssh/config to /var/lib/nova/.ssh/config
INFO:__main__:Deleting /etc/ceph/rbdmap
INFO:__main__:Copying /var/lib/kolla/config_files/src-ceph/rbdmap to /etc/ceph/rbdmap
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/log/nova
INFO:__main__:Setting permission for /var/log/nova/nova-compute.log
INFO:__main__:Setting permission for /var/lib/nova
INFO:__main__:Setting permission for /var/lib/nova/buckets
INFO:__main__:Setting permission for /var/lib/nova/networks
INFO:__main__:Setting permission for /var/lib/nova/.ssh
INFO:__main__:Setting permission for /var/lib/nova/tmp
INFO:__main__:Setting permission for /var/lib/nova/keys
INFO:__main__:Setting permission for /var/lib/nova/instances
INFO:__main__:Setting permission for /var/lib/nova/.ssh/config
INFO:__main__:Setting permission for /var/lib/nova/instances/49ef880a-d19a-46ab-91f9-239975a31ddc
INFO:__main__:Setting permission for /var/lib/nova/instances/_base
INFO:__main__:Setting permission for /var/lib/nova/instances/locks
INFO:__main__:Setting permission for /var/lib/nova/instances/a
INFO:__main__:Setting permission for /var/lib/nova/instances/b
INFO:__main__:Setting permission for /var/lib/nova/instances/49ef880a-d19a-46ab-91f9-239975a31ddc/disk.info
INFO:__main__:Setting permission for /var/lib/nova/instances/49ef880a-d19a-46ab-91f9-239975a31ddc/disk
INFO:__main__:Setting permission for /var/lib/nova/instances/49ef880a-d19a-46ab-91f9-239975a31ddc/console.log
INFO:__main__:Setting permission for /var/lib/nova/instances/_base/e0aa1cba172506b12f79eb056c4c9ea0ae9442b7
INFO:__main__:Setting permission for /var/lib/nova/instances/locks/nova-e0aa1cba172506b12f79eb056c4c9ea0ae9442b7
Running command: '/usr/bin/nova-compute --config-file /etc/nova/nova.conf --config-file /etc/nova/rootwrap.conf'

IMHO kolla_start should check whether permissions are actually OK before setting them recursively. Thoughts?
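For illustration only, a minimal shell sketch of that idea, assuming the nova uid/gid inside the kolla image is 42436. This is not kolla's actual implementation, and as the follow-up comments point out, files legitimately owned by qemu/root would still need to be excluded.

NOVA_UID=42436   # assumed uid/gid of the nova user inside the container
NOVA_GID=42436
STATEDIR=/var/lib/nova

# Only chown entries whose ownership is actually wrong, instead of blindly
# recursing over everything on every container start.
find "$STATEDIR" \( ! -uid "$NOVA_UID" -o ! -gid "$NOVA_GID" \) -print0 \
  | xargs -0 -r chown -h "$NOVA_UID:$NOVA_GID"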
Hey Luca, I can confirm. I saw this yesterday during my tests but didn't pay much attention to it. To be 100% sure, I rechecked:

[root@compute01 41ed0b4b-3445-4757-a6ff-3e13f5f43a86]# ls -la
total 70652
drwxr-xr-x. 2 42436 42436     4096 Jun 23 17:33 .
drwxr-xr-x. 6 42436 42436     4096 Jun 23 17:32 ..
-rw-------. 1 root  root         0 Jun 23 17:33 console.log
-rw-r--r--. 1 qemu  qemu  71630848 Jun 23 17:36 disk
-rw-r--r--. 1 qemu  qemu    475136 Jun 23 17:33 disk.config
-rw-r--r--. 1 42436 42436       79 Jun 23 17:32 disk.info

[root@compute02 ~]# docker start nova_compute
nova_compute
[root@compute02 ~]#

[root@compute01 41ed0b4b-3445-4757-a6ff-3e13f5f43a86]# ls -la
total 70716
drwxr-xr-x. 2 42436 42436     4096 Jun 23 17:33 .
drwxr-xr-x. 6 42436 42436     4096 Jun 23 17:32 ..
-rw-------. 1 42436 42436        0 Jun 23 17:33 console.log
-rw-r--r--. 1 42436 42436 71696384 Jun 23 17:36 disk
-rw-r--r--. 1 42436 42436   475136 Jun 23 17:33 disk.config
-rw-r--r--. 1 42436 42436       79 Jun 23 17:32 disk.info
[root@d100siul0552 41ed0b4b-3445-4757-a6ff-3e13f5f43a86]#

So the permissions are changed after starting the compute service. I agree that this should be done carefully, but I'm also not sure the first set of permissions is actually the correct one. The 42436 id is usually the nova id from the container, and everything being owned by that id made more sense to me at first, because otherwise we have three different owners here: the nova container id for the info file, qemu for the disk, and root for the log.

Cheers,
Sven
The first set is correct: qemu runs the VMs so it needs to own the disk, and libvirt (root) manages the logs.

This is the culprit - https://github.com/openstack/tripleo-heat-templates/blob/stable/pike/docker/services/nova-compute.yaml#L122

I assume this was added to handle upgrades from baremetal OSP11 to docker OSP12, as the nova uid/gids are not the same on the host and in the kolla images. It should be safe to remove this to work around the issue.
(In reply to Ollie Walsh from comment #13)
> The first set is correct: qemu runs the VMs so it needs to own the disk,
> and libvirt (root) manages the logs.
> 
> This is the culprit -
> https://github.com/openstack/tripleo-heat-templates/blob/stable/pike/docker/services/nova-compute.yaml#L122
> 
> I assume this was added to handle upgrades from baremetal OSP11 to docker
> OSP12, as the nova uid/gids are not the same on the host and in the kolla
> images. It should be safe to remove this to work around the issue.

Thanks Ollie! If it makes sense for upgrades, we could maybe limit the recursion depth to 1 (not sure if that's feasible)? If we prevent the chown from reaching disk/console.log, it should be safe to keep.
kolla doesn't currently have a max-depth option.

Also, I don't think it would be sufficient - we need a recursive chown, but only on the files currently owned by the host nova user. We also don't need to do this on every nova-compute start, just once during upgrade.

I'll figure something out on Monday...
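As a rough illustration of that approach (not the eventual patch), something along these lines would re-own only what the host's old nova user owned, leaving qemu- and root-owned files alone. The uid/gid values are assumptions for illustration.

HOST_NOVA_UID=162      # assumed uid of the baremetal OSP11 nova user on the host
HOST_NOVA_GID=162
KOLLA_NOVA_UID=42436   # assumed uid/gid of the nova user inside the kolla image
KOLLA_NOVA_GID=42436

# One-shot migration step: chown only entries owned by the old host nova user.
find /var/lib/nova -uid "$HOST_NOVA_UID" \
  -exec chown -h "$KOLLA_NOVA_UID:$KOLLA_NOVA_GID" {} +
find /var/lib/nova -gid "$HOST_NOVA_GID" \
  -exec chgrp -h "$KOLLA_NOVA_GID" {} +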
(In reply to Ollie Walsh from comment #15)
> kolla doesn't currently have a max-depth option.
> 
> Also, I don't think it would be sufficient - we need a recursive chown, but
> only on the files currently owned by the host nova user. We also don't need
> to do this on every nova-compute start, just once during upgrade.
> 
> I'll figure something out on Monday...

Awesome, thank you very much! And have a nice rest of the weekend :)

@Sven: for a quick & dirty workaround, without redeployment and without touching the containers, you can edit /var/lib/kolla/config_files/nova_compute.json.

The original should look similar to:

{"config_files": [{"dest": "/", "merge": true, "source": "/var/lib/kolla/config_files/src/*", "preserve_properties": true}, {"dest": "/etc/ceph/", "merge": true, "source": "/var/lib/kolla/config_files/src-ceph/", "preserve_properties": true}], "command": "/usr/bin/nova-compute --config-file /etc/nova/nova.conf --config-file /etc/nova/rootwrap.conf", "permissions": [{"owner": "nova:nova", "path": "/var/log/nova", "recurse": true}, {"owner": "nova:nova", "path": "/var/lib/nova", "recurse": true}, {"owner": "nova:nova", "path": "/etc/ceph/ceph.client.openstack.keyring", "perm": "0600"}]}

I've modified it like this:

{"config_files": [{"dest": "/", "merge": true, "source": "/var/lib/kolla/config_files/src/*", "preserve_properties": true}, {"dest": "/etc/ceph/", "merge": true, "source": "/var/lib/kolla/config_files/src-ceph/", "preserve_properties": true}], "command": "/usr/bin/nova-compute --config-file /etc/nova/nova.conf --config-file /etc/nova/rootwrap.conf", "permissions": [{"owner": "nova:nova", "path": "/var/log/nova", "recurse": true}, {"owner": "nova:nova", "path": "/var/lib/nova", "recurse": false}, {"owner": "nova:nova", "path": "/var/lib/nova/instances/*", "recurse": false}, {"owner": "nova:nova", "path": "/var/lib/nova/buckets", "recurse": false}, {"owner": "nova:nova", "path": "/var/lib/nova/keys", "recurse": false}, {"owner": "nova:nova", "path": "/var/lib/nova/networks", "recurse": false}, {"owner": "nova:nova", "path": "/var/lib/nova/tmp", "recurse": false}, {"owner": "nova:nova", "path": "/etc/ceph/ceph.client.openstack.keyring", "perm": "0600"}]}

Alternatively, I guess you could remove the /var/lib/nova paths altogether, or just set recurse to false. I included all the subdirectories of /var/lib/nova for testing purposes.
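If applying that workaround, a small sketch of how one might back up, sanity-check, and pick up the edited file (the backup filename is an assumption; kolla re-reads the config when the container starts):

# Back up the original, then edit the permissions section as shown above.
cp /var/lib/kolla/config_files/nova_compute.json{,.orig}
vi /var/lib/kolla/config_files/nova_compute.json

# Check the result is still valid JSON before restarting anything.
python -m json.tool < /var/lib/kolla/config_files/nova_compute.json > /dev/null && echo "JSON OK"

# Restart the container so kolla_start re-reads the config.
docker restart nova_compute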
Hey there,

as the json fix would be at risk when they run a stack update, I would like to adjust the template instead. My proposal would be to modify docker/services/nova-compute.yaml:

103             permissions:
104               - path: /var/log/nova
105                 owner: nova:nova
106                 recurse: true
107               - path: /var/lib/nova
108                 owner: nova:nova
109                 recurse: true

and remove line 109 / change it to false. This way we can deploy and the nasty part should be gone. We will test it with the affected customer, and if it works for them, we're done there, because they can't wait for an official fix.

Ollie: would you agree with this?

Thanks and cheers,
Sven
I have reproduced this from #16 and can confirm that dropping the recursive chown on /var/lib/nova/instances fixes this problem.
After looking at a reproducer system, firstly I can confirm that overlayfs isn't involved here. This was my primary concern, as that would be a data integrity issue.

The issue appears to relate to how NFS manages open file handles when permissions change. I did the following quick reproducer:

$ touch foo; tail -f foo
# date >> foo

Observe that the unprivileged tail can read the data written by root.

# chown root.root foo; chmod 600 foo; date >> foo

Observe that even though the unprivileged tail no longer has permission to read the file, it can still read the new data written by root.

The above test fails on NFS, though, with:

tail: error reading 'foo': Input/output error

which matches what we see from VMs when we change the file permissions.
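The same check, wrapped as a self-contained script that can be pointed first at a local filesystem and then at the NFS mount for comparison. The target directory, the temporary filename, and the use of the nobody user as the unprivileged reader are assumptions for illustration.

#!/bin/bash
# Usage (as root): ./open-fd-test.sh /var/lib/nova/instances
# On ext4/xfs the reader keeps seeing new data after the chown/chmod;
# on NFS it gets "Input/output error" instead.
DIR=${1:?usage: $0 <directory>}
cd "$DIR" || exit 1

touch fd-test-foo
chmod 644 fd-test-foo

# Unprivileged reader holds the file open.
sudo -u nobody tail -f fd-test-foo &
READER=$!
sleep 1

date >> fd-test-foo              # visible to the reader on any filesystem
sleep 1

chown root:root fd-test-foo      # take permissions away while the fd is still open
chmod 600 fd-test-foo
date >> fd-test-foo              # local fs: still visible; NFS: reader errors out
sleep 2

kill "$READER" 2>/dev/null
rm -f fd-test-foo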
With the patch from https://review.openstack.org/577855:

[root@overcloud-novacompute-0 nova]# docker logs nova_statedir_owner
ownership of '/var/lib/nova' retained as nova:nova
ownership of '/var/lib/nova/buckets' retained as nova:nova
ownership of '/var/lib/nova/.ssh' retained as nova:nova
ownership of '/var/lib/nova/.ssh/config' retained as nova:nova
ownership of '/var/lib/nova/keys' retained as nova:nova
ownership of '/var/lib/nova/instances' retained as nova:nova
ownership of '/var/lib/nova/instances/_base' retained as nova:nova
ownership of '/var/lib/nova/instances/locks' retained as nova:nova
ownership of '/var/lib/nova/instances/0d0dd47b-0354-404a-9c20-a6049d5ac103' retained as nova:nova
ownership of '/var/lib/nova/instances/0d0dd47b-0354-404a-9c20-a6049d5ac103/disk.info' retained as nova:nova
ownership of '/var/lib/nova/tmp' retained as nova:nova
ownership of '/var/lib/nova/networks' retained as nova:nova
ownership of '/var/lib/nova/.bash_history' retained as nova:nova
changed ownership of '/var/lib/nova/foo' from root:root to nova:nova
(In reply to Matthew Booth from comment #2)
> (In reply to Sven Michels from comment #0)
> > /var/log/messages:
> > 2018-06-22 14:07:57 +02:00 compute02 kern.warning kernel: [79615.684979]
> > overlayfs: upperdir is in-use by another mount, accessing files from both
> > mounts will result in undefined behavior.
> 
> I'm pretty sure that's the problem right there. overlayfs can't be involved
> here or, as it says, the behaviour is undefined. This needs to be mounted
> directly. Presumably docker on the source is detecting that the directory is
> mounted elsewhere and is marking it read-only for safety.

I doubt that this is causing the problem you are seeing, the reason being that overlayfs either denies the mount or just warns (it does not make the mount read-only). So while a leaked mount is a problem, that's a different issue.

What about all these errors from the disk (vda)? I don't understand the configuration fully, but that seems to be part of the problem. I am assuming the VM images are on NFS and show up as vda in the guest. So is that error happening because NFS is read-only?

Also, overlayfs should not have anything to do with an NFS mount. Can somebody explain what the correlation here is between NFS and overlayfs?
(In reply to Vivek Goyal from comment #27)
> (In reply to Matthew Booth from comment #2)
> > (In reply to Sven Michels from comment #0)
> > > /var/log/messages:
> > > 2018-06-22 14:07:57 +02:00 compute02 kern.warning kernel: [79615.684979]
> > > overlayfs: upperdir is in-use by another mount, accessing files from both
> > > mounts will result in undefined behavior.
> > 
> > I'm pretty sure that's the problem right there. overlayfs can't be involved
> > here or, as it says, the behaviour is undefined. This needs to be mounted
> > directly. Presumably docker on the source is detecting that the directory is
> > mounted elsewhere and is marking it read-only for safety.
> 
> I doubt that this is causing the problem you are seeing, the reason being
> that overlayfs either denies the mount or just warns (it does not make the
> mount read-only). So while a leaked mount is a problem, that's a different
> issue.
> 
> What about all these errors from the disk (vda)? I don't understand the
> configuration fully, but that seems to be part of the problem. I am assuming
> the VM images are on NFS and show up as vda in the guest. So is that error
> happening because NFS is read-only?
> 
> Also, overlayfs should not have anything to do with an NFS mount. Can
> somebody explain what the correlation here is between NFS and overlayfs?

Did you read on? It had nothing to do with overlayfs. The root cause was a recursive chown, combined with the fact that NFS isn't POSIX (it results in I/O errors on open files).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045