Created attachment 1407632 [details] screenshot virtual machine see screenshot, i face that on most Fedora 27 installs over months at reboot/shutdown while with Fedora 26 that was not the case
How does /proc/self/mountinfo look like (before shutdown) on that machine?
[root@testserver:~]$ cat /proc/self/mountinfo 18 63 0:18 / /sys rw,nosuid,nodev,noexec,relatime shared:2 - sysfs sysfs rw 19 63 0:4 / /proc rw,nosuid,nodev,noexec,relatime shared:23 - proc proc rw 20 63 0:6 / /dev rw,nosuid shared:19 - devtmpfs devtmpfs rw,size=3549752k,nr_inodes=887438,mode=755 21 18 0:7 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:3 - securityfs securityfs rw 22 20 0:19 / /dev/shm rw,nosuid,nodev shared:20 - tmpfs tmpfs rw 23 20 0:20 / /dev/pts rw,nosuid,noexec,relatime shared:21 - devpts devpts rw,gid=5,mode=620,ptmxmode=000 24 63 0:21 / /run rw,nosuid,nodev shared:22 - tmpfs tmpfs rw,mode=755 25 18 0:22 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:4 - tmpfs tmpfs ro,mode=755 26 25 0:23 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:5 - cgroup2 cgroup rw 27 25 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup rw,xattr,name=systemd 28 18 0:25 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:17 - pstore pstore rw 29 25 0:26 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:7 - cgroup cgroup rw,cpu,cpuacct 30 25 0:27 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:8 - cgroup cgroup rw,perf_event 31 25 0:28 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,freezer 32 25 0:29 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,hugetlb 33 25 0:30 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,cpuset 34 25 0:31 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,net_cls,net_prio 35 25 0:32 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,memory 36 25 0:33 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,pids 37 25 0:34 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,blkio 38 25 0:35 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,devices 61 18 0:36 / /sys/kernel/config rw,relatime shared:18 - configfs configfs rw 63 0 8:17 / / rw,relatime shared:1 - ext4 /dev/sdb1 rw,lazytime,commit=30,inode_readahead_blks=16 39 18 0:8 / /sys/kernel/debug rw,relatime shared:24 - debugfs debugfs rw 40 20 0:37 / /dev/hugepages rw,relatime shared:25 - hugetlbfs hugetlbfs rw,pagesize=2M 41 19 0:38 / /proc/sys/fs/binfmt_misc rw,relatime shared:26 - autofs systemd-1 rw,fd=32,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14807 42 20 0:17 / /dev/mqueue rw,relatime shared:27 - mqueue mqueue rw 43 63 0:39 / /var/www/sessiondata rw,nosuid,noexec,noatime shared:28 - tmpfs tmpfs rw,size=4194304k 75 63 0:40 / /usr/share/mysql-test/var rw,noatime shared:29 - tmpfs tmpfs rw,size=4194304k,mode=750,uid=27,gid=27 77 63 8:33 / /Volumes/dune rw,relatime shared:30 - ext4 /dev/sdc1 rw,lazytime,commit=30,inode_readahead_blks=16 79 63 8:33 /builduser /home/builduser rw,relatime shared:30 - ext4 /dev/sdc1 rw,lazytime,commit=30,inode_readahead_blks=16 81 77 0:41 / /Volumes/dune/mysql_data/autotest rw,nosuid,noexec,noatime shared:31 - tmpfs tmpfs rw,size=4194304k,mode=700,uid=27,gid=27 83 63 8:33 /.tmp /var/tmp rw,relatime shared:30 - ext4 /dev/sdc1 rw,lazytime,commit=30,inode_readahead_blks=16 85 77 0:42 / /Volumes/dune/www-servers/cms/cms/temp rw,nosuid,noexec,noatime shared:32 - tmpfs tmpfs rw,size=4194304k,mode=770,uid=48,gid=48 87 77 8:33 /buildserver/autotest /Volumes/dune/www-servers/autotest rw,relatime shared:30 - ext4 /dev/sdc1 rw,lazytime,commit=30,inode_readahead_blks=16 89 63 8:33 /.tmp /tmp rw,relatime shared:30 - ext4 /dev/sdc1 rw,lazytime,commit=30,inode_readahead_blks=16 91 77 0:43 / /Volumes/dune/www-servers/cms/cms/logs rw,nosuid,noexec,noatime shared:33 - tmpfs tmpfs rw,size=4194304k,mode=770,uid=48,gid=48 93 63 8:33 /www-servers/cms /usr/local/sftp-homes/flow/cms rw,relatime shared:30 - ext4 /dev/sdc1 rw,lazytime,commit=30,inode_readahead_blks=16 95 63 8:1 / /boot rw,relatime shared:34 - ext4 /dev/sda1 rw,data=ordered 97 79 0:44 / /home/builduser/rpmbuild/PHP-PGO/logs rw,nosuid,noatime shared:35 - tmpfs tmpfs rw,size=4194304k,mode=770,uid=5000,gid=5000 98 77 0:44 / /Volumes/dune/builduser/rpmbuild/PHP-PGO/logs rw,nosuid,noatime shared:35 - tmpfs tmpfs rw,size=4194304k,mode=770,uid=5000,gid=5000 101 63 9:0 / /mnt/raid10 rw,relatime shared:36 - ext4 /dev/md0 rw,lazytime,commit=30,stripe=128,inode_readahead_blks=16 558 24 0:52 / /run/user/48 rw,nosuid,nodev,relatime shared:83 - tmpfs tmpfs rw,size=711220k,mode=700,uid=48,gid=48 588 24 0:53 / /run/user/5000 rw,nosuid,nodev,relatime shared:135 - tmpfs tmpfs rw,size=711220k,mode=700,uid=5000,gid=5000 613 24 0:54 / /run/user/4503 rw,nosuid,nodev,relatime shared:85 - tmpfs tmpfs rw,size=711220k,mode=700,uid=4503,gid=4503 653 24 0:45 / /run/user/4500 rw,nosuid,nodev,relatime shared:87 - tmpfs tmpfs rw,size=711220k,mode=700,uid=4500,gid=48 699 24 0:55 / /run/user/0 rw,nosuid,nodev,relatime shared:89 - tmpfs tmpfs rw,size=711220k,mode=700
I pulled the code that does this parsing, and it parses this file without any issues. I wasn't expecting to find the error there, but it was worth checking. The code that we currently have is buggy: if it fails on a line, it won't process subsequent line correctly. I'll prep a build with some patches, I'd appreciate if you test if that'll fix your issue.
I prepared a scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=25832929 This is the F28 package with patches from https://github.com/systemd/systemd/issues/8452 added. Can you test if this fixes the issue? Unfortunately there were additional changes between systemd-234 which is in F27 and systemd-237 in F28 that make it hard to backport those patches. It would be great if you could first test if it helps at all. Please don't install this on any "important" machiens, since F27 with systemd-237 hasn't had much testing.
Created attachment 1410429 [details] issue fixed - new screenshot looks way better, see attached screenshot at the momewnt before restart happens, rebootet the machine around 10 times without ever see that issue of this bugreport that virtual machine has nearly anything configured and installed we use on dozens of servers and our systemd-units make *heavy* use of security-options/namespaces [root@testserver:~]$ rpm -qa | grep systemd | sort | grep 237 systemd-237-1.git04a361e.fc27.x86_64 systemd-devel-237-1.git04a361e.fc27.x86_64 systemd-libs-237-1.git04a361e.fc27.x86_64 systemd-pam-237-1.git04a361e.fc27.x86_64 systemd-udev-237-1.git04a361e.fc27.x86_64 [root@testserver:~]$ rpm -q kernel-core kernel-core-4.15.10-300.fc27.x86_64 _________________________ part of our httpd-service and most others look similar which indicates no regressions other than in some previous Fedora GA releases which broke namespaces multiple times (symlinks, ReadOnly/ReadWrite combined....) CapabilityBoundingSet=CAP_DAC_OVERRIDE CAP_IPC_LOCK CAP_NET_BIND_SERVICE CAP_SETGID CAP_SETUID MemoryDenyWriteExecute=yes NoNewPrivileges=yes PrivateDevices=yes PrivateTmp=yes ProtectControlGroups=yes ProtectKernelModules=yes ProtectKernelTunables=yes RestrictAddressFamilies=AF_INET AF_INET6 AF_LOCAL AF_UNIX RestrictRealtime=yes SystemCallArchitectures=x86-64 SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap acct modify_ldt add_key adjtimex clock_adjtime delete_module fanotify_init finit_module get_mempolicy init_module io_destroy io_getevents iopl ioperm io_setup io_submit io_cancel kcmp kexec_load keyctl lookup_dcookie mbind migrate_pages mount move_pages open_by_handle_at perf_event_open pivot_root process_vm_readv process_vm_writev ptrace remap_file_pages request_key set_mempolicy swapoff swapon umount2 uselib vmsplice ReadOnlyDirectories=/ ReadWriteDirectories=-/run ReadWriteDirectories=-/tmp ReadWriteDirectories=-/Volumes/dune/modsec-upload ReadWriteDirectories=-/Volumes/dune/tmp ReadWriteDirectories=-/Volumes/dune/www-servers _________________________
revert snapshot to the state with systemd-234-10.git5f8984e.fc27.x86_64 and at reboot the error immediately is back is it possible that this triggers sometimes not 100% clean shutdown/unmounts filesystem-checks on machine with nested bind-mounts? after upgrade the host-machine to F27 i had some checks at boot which where not expected (repeated within few days)
Yeah, it's certainly possible that this causes unclean fs shutdowns. OK, I'll try to backport this to F27.
ping - a issue which can cause unclean fs shutdowns is a showstopper when consider upgrade production servers to Fedora 27
Apr 12 14:04:18 testserver systemd-fsck[328]: system: recovering journal this is *not* funny every random reboot
dont' get me wrong but guven that F28 will be release in a few days - will this ever be backported and fixed in the lifetime of F27 or will it end like every other bugreport from me for systemd in Fedora only get fixed in the next Fedora release and now we have to upgrade every machine from F26 to F27 with a known major issue?
Do you still plan to backport this to F27? The filesystem bug triage is a bit annoying as well, people are noticing this and are concernedby it. Please fix this if at all possible.
> his message is a reminder that Fedora 26 is nearing its end of life. > Approximately 4 (four) weeks from now Fedora will stop maintaining > and issuing updates for Fedora 26 don't you think it's time to make F27 production ready? this is a bug which can lead to silent data loss!
You don't need to backport a big commit! There's a one-line fix which is already present upstream in v235: https://github.com/systemd/systemd/issues/6796 https://github.com/systemd/systemd/commit/3d4ec01269244c2d35a781abf748ea9ba57666e2
systemd-234-11.git5f8984e.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-73193bc5bb
Created attachment 1431161 [details] fixed - systemd-234-11.git5f8984e.fc27.x86_64 screenshot from a vmware-guest which *always* had unclean shutdowns systemd-234-11.git5f8984e.fc27.x86_64 systemd-devel-234-11.git5f8984e.fc27.x86_64 systemd-libs-234-11.git5f8984e.fc27.x86_64 systemd-pam-234-11.git5f8984e.fc27.x86_64 systemd-udev-234-11.git5f8984e.fc27.x86_64
Thanks for fixing this.
systemd-234-11.git5f8984e.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-73193bc5bb
systemd-234-11.git5f8984e.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.