Bug 1726612
Summary: | Stratisd starts whit fail messages after reboot | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Florian van Oudgaarden <florianvanoudgaarden> | ||||
Component: | stratisd | Assignee: | John Baublitz <jbaublitz> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 30 | CC: | amulhern, andy, bgurney, igor.raits, jbaublitz, rust-sig, yaneti | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | 1.0.6 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2020-04-15 13:21:53 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Florian van Oudgaarden
2019-07-03 09:39:02 UTC
Note: the information provided by "stratis --version" may be incorrect due to https://bugzilla.redhat.com/show_bug.cgi?id=1652124. This was fixed in https://github.com/stratis-storage/stratis-cli/pull/265, and included in version 1.0.3. So, the actual version of stratis-cli being run is probably 1.0.2. Note: We can rule out the problem of the missing application binary due to ordering problems as the error message would unambiguously identify this problem. So, the error "Jul 03 10:50:27 f30-2.test.org stratisd[559]: IO error: No such file or directory (os error 2)" refers definitely to something else. We will attempt to reproduce this problem. Re-assigning to stratis team member, as this seems unlikely to be a packaging problem. I have a couple of questions: How often is this issue reproducible? Is it reproducible 100% of the time? Most of the time? Sometimes? (In the Bugzilla template, this is the question being asked by the "How reproducible:" field.) Also, in your reproduction step "4.Configure some disks", can you elaborate on the approximate setup of the pools and/or filesystems? On a test VM, I set up Fedora 30 Workstation, with the following package versions: stratisd-1.0.4-1.module_f30+4201+e7a8c4f2.x86_64 stratis-cli-1.0.2-1.module_f30+4065+aaaad0b9.x86_64 After installing stratis-cli and stratisd, and running "systemctl enable stratisd" and "systemctl start stratisd", I created one filesystem in one pool, with the following commands: stratis pool create spool1 /dev/vdb1 stratis fs create spool1 sfs1 After restarting, stratisd was running, and the stratis device was running. To test the same pool with automatic mounting, I added this line to /etc/fstab: /stratis/spool1/sfs1 /mnt/sfs1/ xfs defaults 0 0 After restarting, stratisd was running, and the filesystem was mounted on /mnt/sfs1. For your tests (where stratisd is failing to start) are there any additional steps that are different from the steps in my test? I have one pool # stratis pool list Name Total Physical Size Total Physical Used greaterpool 8 GiB 1.65 GiB in that pool i made three filesystem points # stratis filesystem list Pool Name Name Used Created Device UUID greaterpool log2 546 MiB Jul 02 2019 14:54 /stratis/greaterpool/log2 3a781428dea149ada3c2c2b86f7d88df greaterpool log1 546 MiB Jul 02 2019 14:54 /stratis/greaterpool/log1 558eebd593d7449a802eb177d8cd5d00 greaterpool log3 546 MiB Jul 02 2019 14:54 /stratis/greaterpool/log3 acb1b3c3fa9f40ac8e3db4d9096839db And i added the following lines to my /etc/fstab UUID=acb1b3c3-fa9f-40ac-8e3d-b4d9096839db /mount/log3 xfs defaults 0 0 UUID=3a781428-dea1-49ad-a3c2-c2b86f7d88df /mount/log2 xfs defaults 0 0 UUID=558eebd5-93d7-449a-802e-b177d8cd5d00 /mount/log1 xfs defaults 0 0 Strange thing is wen I remove the lines from my fs tab the stratis deamon starts normal after reboot. wen I add them, and do the mount -a command the mount points work fine. wen I reboot again the boot log will give the following message : Jul 04 17:11:04 f30-2.test.org systemd[1]: dev-disk-by\x2duuid-558eebd5\x2d93d7\x2d449a\x2d802e\x2db177d8cd5d00.device: Job dev-disk-by\x2duuid-558eebd5\x2d93d7\x2d449a\x2d802e\x2db177d8cd5d00.device/start timed out. Jul 04 17:11:04 f30-2.test.org systemd[1]: Timed out waiting for device /dev/disk/by-uuid/558eebd5-93d7-449a-802e-b177d8cd5d00. Jul 04 17:11:04 f30-2.test.org systemd[1]: Dependency failed for /mount/log1. Jul 04 17:11:04 f30-2.test.org systemd[1]: Dependency failed for Local File Systems. Jul 04 17:11:04 f30-2.test.org systemd[1]: Dependency failed for Mark the need to relabel after reboot. Jul 04 17:11:04 f30-2.test.org systemd[1]: selinux-autorelabel-mark.service: Job selinux-autorelabel-mark.service/start failed with result 'dependency'. Jul 04 17:11:04 f30-2.test.org systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'. Jul 04 17:11:04 f30-2.test.org systemd[1]: local-fs.target: Triggering OnFailure= dependencies. Jul 04 17:11:04 f30-2.test.org systemd[1]: mount-log1.mount: Job mount-log.mount/start failed with result 'dependency'. Jul 04 17:11:04 f30-2.test.org systemd[1]: dev-disk-by\x2duuid-558eebd5\x2d93d7\x2d449a\x2d802e\x2db177d8cd5d00.device: Job dev-disk-by\x2duuid-558eebd5\x2d93d7\x2d449a\x2d802e\x2db177d8cd5d00.device/start failed with result 'timeout'. Jul 04 17:11:04 f30-2.test.org systemd[1]: dev-disk-by\x2duuid-3a781428\x2ddea1\x2d49ad\x2da3c2\x2dc2b86f7d88df.device: Job dev-disk-by\x2duuid-3a781428\x2ddea1\x2d49ad\x2da3c2\x2dc2b86f7d88df.device/start timed out. Jul 04 17:11:04 f30-2.test.org systemd[1]: Timed out waiting for device /dev/disk/by-uuid/3a781428-dea1-49ad-a3c2-c2b86f7d88df. Jul 04 17:11:04 f30-2.test.org systemd[1]: Dependency failed for /mount/log2. Jul 04 17:11:04 f30-2.test.org systemd[1]: mount-log2.mount: Job mount-log2.mount/start failed with result 'dependency'. Jul 04 17:11:04 f30-2.test.org systemd[1]: dev-disk-by\x2duuid-3a781428\x2ddea1\x2d49ad\x2da3c2\x2dc2b86f7d88df.device: Job dev-disk-by\x2duuid-3a781428\x2ddea1\x2d49ad\x2da3c2\x2dc2b86f7d88df.device/start failed with result 'timeout'. After that I come in the emergency mode. So only wen I have my mount points configured in the fstab the failure whit the stratis deamon appears. Got the same configuration and the same packages : [root@f30-2 ~]# rpm -q stratisd stratisd-1.0.4-1.module_f30+4201+e7a8c4f2.x86_64 [root@f30-2 ~]# rpm -q stratis-cli stratis-cli-1.0.2-1.module_f30+4065+aaaad0b9.x86_64 What was the base block device used for creating the Stratis pool? I used a partition on an NVMe solid state drive, with the following commands: # stratis pool create spool1 /dev/nvme0n1p5 # stratis fs create spool1 sfs1 # stratis fs create spool1 sfs2 # stratis fs create spool1 sfs3 I used the following /etc/fstab lines: Test 1 (obtained via grepping for the relevant lines in /proc/mounts, and then replacing the xfs mount parameters with "defaults"): /dev/mapper/stratis-1-60027c704fc14cbaaf7254da329ce044-thin-fs-2087ea789725457a9b8a1a6c751e0fa4 /mnt/sfs1 xfs defaults 0 0 /dev/mapper/stratis-1-60027c704fc14cbaaf7254da329ce044-thin-fs-1ad9e5d623124ccbb21ae0003033fd42 /mnt/sfs2 xfs defaults 0 0 /dev/mapper/stratis-1-60027c704fc14cbaaf7254da329ce044-thin-fs-9ae62c03ec4e44578df51b45d4cda605 /mnt/sfs3 xfs defaults 0 0 Test 2 (obtained by running "blkid /stratis/spool1/sfs1", etc.): UUID=2087ea78-9725-457a-9b8a-1a6c751e0fa4 /mnt/sfs1 xfs defaults 0 0 UUID=1ad9e5d6-2312-4ccb-b21a-e0003033fd42 /mnt/sfs2 xfs defaults 0 0 UUID=9ae62c03-ec4e-4457-8df5-1b45d4cda605 /mnt/sfs3 xfs defaults 0 0 In both cases, my test system booted successfully, without any problems. To clarify my question about the block device being used for the Stratis pool: do you know if it's a device that may take longer to initialize than a standard SATA or NVMe storage device? (Even if it isn't, this may be a useful data point.) The block devices that are used are virtual block devices run whit virtual box on my apple notebook. they are vdi files used by virtualbox and the block device that is buildup whit lvm has no problems while booting. I will test this on my kvm environment on my centos 7 server tonight (I'm in the CET timezone) so i can rule out Virtualbox. never thought of that to be honest. Seeing the same thing on RHEL 8.0 Fails on Boot, then when I login and start stratisd normally from the command line. It starts fine. I have `noauto,x-systemd.automount,defaults,x-systemd.requires=stratisd.service` in the `/etc/fstab file` To let it boot without going to emergency.target Else with normal fstab options `defaults,x-systemd.requires=stratisd.service 0 0` it drops to emergency.target. After failing to mount. # journalctl -xb ... Sep 11 15:42:32 rhel8-9999.local stratisd[676]: IO error: No such file or directory (os error 2) Sep 11 15:42:32 rhel8-9999.local systemd[1]: stratisd.service: Main process exited, code=exited, status=1/FAILURE Sep 11 15:42:32 rhel8-9999.local systemd[1]: stratisd.service: Failed with result 'exit-code'. ... Sep 11 15:42:54 rhel8-9999.local dracut[3102]: dracut module 'stratis' will not be installed, because command 'stratisd-init' could not be found! # [devops@rhel8-9999 ~]$ cat /etc/os-release NAME="Red Hat Enterprise Linux" VERSION="8.0 (Ootpa)" ID="rhel" ID_LIKE="fedora" VERSION_ID="8.0" PLATFORM_ID="platform:el8" PRETTY_NAME="Red Hat Enterprise Linux 8.0 (Ootpa)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:redhat:enterprise_linux:8.0:GA" HOME_URL="https://www.redhat.com/" BUG_REPORT_URL="https://bugzilla.redhat.com/" REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8" REDHAT_BUGZILLA_PRODUCT_VERSION=8.0 REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" REDHAT_SUPPORT_PRODUCT_VERSION="8.0" [devops@rhel8-9999 ~]$ // ================================================= [devops@rhel8-9999 ~]$ rpm -qi stratis-cli stratisd Name : stratis-cli Version : 1.0.2 Release : 1.el8 Architecture: noarch Install Date: Fri 06 Sep 2019 09:59:12 AEST Group : Unspecified Size : 129260 License : ASL 2.0 Signature : RSA/SHA256, Fri 14 Dec 2018 15:35:00 AEDT, Key ID 199e2f91fd431d51 Source RPM : stratis-cli-1.0.2-1.el8.src.rpm Build Date : Wed 12 Dec 2018 10:26:04 AEDT Build Host : x86-vm-08.build.eng.bos.redhat.com Relocations : (not relocatable) Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla> Vendor : Red Hat, Inc. URL : https://github.com/stratis-storage/stratis-cli Summary : Command-line tool for interacting with the Stratis daemon Description : stratis provides a command-line interface (CLI) for interacting with the Stratis daemon, stratisd. stratis interacts with stratisd via D-Bus. Name : stratisd Version : 1.0.3 Release : 1.el8 Architecture: x86_64 Install Date: Fri 06 Sep 2019 09:59:11 AEST Group : Unspecified Size : 4215640 License : MPLv2.0 Signature : RSA/SHA256, Tue 15 Jan 2019 07:33:12 AEDT, Key ID 199e2f91fd431d51 Source RPM : stratisd-1.0.3-1.el8.src.rpm Build Date : Fri 11 Jan 2019 15:50:20 AEDT Build Host : x86-vm-07.build.eng.bos.redhat.com Relocations : (not relocatable) Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla> Vendor : Red Hat, Inc. URL : https://github.com/stratis-storage/stratisd Summary : Daemon that manages block devices to create filesystems Description : Daemon that manages block devices to create filesystems. [devops@rhel8-9999 ~]$ // ================================================= [devops@rhel8-9999 ~]$ cat /etc/fstab # # /etc/fstab # Created by anaconda on Fri Sep 6 09:42:49 2019 # # Accessible filesystems, by reference, are maintained under '/dev/disk/'. # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info. # # After editing this file, run 'systemctl daemon-reload' to update systemd # units generated from this file. # /dev/mapper/rhel-root / xfs defaults 0 0 UUID=2fd90641-d8a1-4cb2-b3f9-397cd8e3a8d1 /boot xfs defaults 0 0 /dev/mapper/rhel-home /home xfs defaults 0 0 /dev/mapper/rhel-var /var xfs defaults 0 0 /dev/mapper/rhel-swap swap swap defaults 0 0 #UUID=22ef2306-2bd4-406a-aefa-e1b71c918341 /mnt/data/1 xfs nofail,defaults,x-systemd.requires=stratisd.service 0 0 #UUID=5dd9b7b5-4589-457f-ac49-f51d3aa7cd56 /mnt/data/2 xfs noauto,x-systemd.automount,defaults,x-systemd.requires=stratisd.service 0 0 UUID=22ef2306-2bd4-406a-aefa-e1b71c918341 /mnt/data/1 xfs noauto,x-systemd.automount,defaults,x-systemd.requires=stratisd.service 0 0 UUID=5dd9b7b5-4589-457f-ac49-f51d3aa7cd56 /mnt/data/2 xfs noauto,x-systemd.automount,defaults,x-systemd.requires=stratisd.service 0 0 // ================================================= [devops@rhel8-9999 ~]$ sudo stratis blockdev list Pool Name Device Node Physical Size State Tier stratis_test /dev/sda4 1.48 TiB In-use Data stratis_test /dev/sda5 1.48 TiB In-use Data [devops@rhel8-9999 ~]$ // ================================================== [devops@rhel8-9999 ~]$ df -h Filesystem Size Used Avail Use% Mounted on devtmpfs 16G 0 16G 0% /dev tmpfs 16G 0 16G 0% /dev/shm tmpfs 16G 8.8M 16G 1% /run tmpfs 16G 0 16G 0% /sys/fs/cgroup /dev/mapper/rhel-root 50G 2.2G 48G 5% / /dev/sda2 1014M 232M 783M 23% /boot /dev/mapper/rhel-home 200G 2.0G 198G 1% /home /dev/mapper/rhel-var 1.0T 11G 1013G 2% /var tmpfs 3.1G 0 3.1G 0% /run/user/1000 [devops@rhel8-9999 ~]$ sudo mount /stratis/stratis_test/ fs_howto my_precious [devops@rhel8-9999 ~]$ sudo mount /stratis/stratis_test/fs_howto /mnt/data/1/ [devops@rhel8-9999 ~]$ df -h Filesystem Size Used Avail Use% Mounted on devtmpfs 16G 0 16G 0% /dev tmpfs 16G 0 16G 0% /dev/shm tmpfs 16G 8.8M 16G 1% /run tmpfs 16G 0 16G 0% /sys/fs/cgroup /dev/mapper/rhel-root 50G 2.2G 48G 5% / /dev/sda2 1014M 232M 783M 23% /boot /dev/mapper/rhel-home 200G 2.0G 198G 1% /home /dev/mapper/rhel-var 1.0T 11G 1013G 2% /var tmpfs 3.1G 0 3.1G 0% /run/user/1000 /dev/mapper/stratis-1-b165796c3eff42cb8decc0083fdad600-thin-fs-5dd9b7b54589457fac49f51d3aa7cd56 1.0T 14G 1011G 2% /mnt/data/1 [devops@rhel8-9999 ~]$ sudo mount /stratis/stratis_test/my_precious /mnt/data/2 [devops@rhel8-9999 ~]$ df -h Filesystem Size Used Avail Use% Mounted on devtmpfs 16G 0 16G 0% /dev tmpfs 16G 0 16G 0% /dev/shm tmpfs 16G 8.8M 16G 1% /run tmpfs 16G 0 16G 0% /sys/fs/cgroup /dev/mapper/rhel-root 50G 2.2G 48G 5% / /dev/sda2 1014M 232M 783M 23% /boot /dev/mapper/rhel-home 200G 2.0G 198G 1% /home /dev/mapper/rhel-var 1.0T 11G 1013G 2% /var tmpfs 3.1G 0 3.1G 0% /run/user/1000 /dev/mapper/stratis-1-b165796c3eff42cb8decc0083fdad600-thin-fs-5dd9b7b54589457fac49f51d3aa7cd56 1.0T 14G 1011G 2% /mnt/data/1 /dev/mapper/stratis-1-b165796c3eff42cb8decc0083fdad600-thin-fs-22ef23062bd4406aaefae1b71c918341 1.0T 7.2G 1017G 1% /mnt/data/2 [devops@rhel8-9999 ~]$ // ========================================================================================= I have an sosreport if required. Yes, please do send an sosreport, if you can. The output from "sos_commands/systemd/systemd-analyze*" may help to diagnose any timing-related issues. One other question: before the stratisd error of "No such file or directory", was there a systemd error with the text "failed with result 'timeout'"? We can not rule out the possibility that this occurs due to failure to open the pid file. The pid file itself is opened with the create flag, but if the directory "/var/run" for some reason does not exist at the time that the attempt is made to open the pid file, then the IO error that we see will result. Created attachment 1614657 [details]
SOS Report from Stratisd Not Starting at boot
SOS Report from Stratisd Not Starting at boot
https://github.com/stratis-storage/stratisd/issues/1630 is the issue on Github that we'll be using to track the resolution of this problem. stratisd >= 1.0.6 is available in Fedora >= 32 |