Description of problem:
mdmonitor service fails to start after upgrade to Fedora 35

Version-Release number of selected component (if applicable):
mdadm-4.2-rc2.fc35.x86_64

How reproducible:
always

Steps to Reproduce:
1. boot/re-boot

Actual results:
from logs
Starting Software RAID monitoring and management...
mdmonitor.service: Can't open PID file /run/mdadm/mdadm.pid (yet?) after start: Operation not permitted
mdmonitor.service: Failed with result 'protocol'.
Failed to start Software RAID monitoring and management.

Expected results:
Starting Software RAID monitoring and management...
Started Software RAID monitoring and management.

Additional info:
the service starts ok when restarted manually after the machine boot completes
Hi Jan

I tried to reproduce this in my environment but could not. The steps I followed:

1. Install f35
2. Create some loop devices and create a raid1
3. mdadm -Es > /etc/mdadm.conf (mdmonitor service needs the config file)
4. systemctl start mdmonitor
5. systemctl status mdmonitor

● mdmonitor.service - Software RAID monitoring and management
     Loaded: loaded (/usr/lib/systemd/system/mdmonitor.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2021-11-15 20:10:36 EST; 5s ago
    Process: 81616 ExecStart=/sbin/mdadm --monitor --scan --syslog -f --pid-file=/run/mdadm/mdadm.pid (code=exited, status=0/>
   Main PID: 81617 (mdadm)
      Tasks: 1 (limit: 14247)
     Memory: 456.0K
        CPU: 6ms
     CGroup: /system.slice/mdmonitor.service
             └─81617 /sbin/mdadm --monitor --scan --syslog -f --pid-file=/run/mdadm/mdadm.pid

The mdadm.pid file is created automatically after running `systemctl start mdmonitor`.

What happens if you restart the mdmonitor service? Does it succeed?

Thanks
Xiao
From the opening post:

"the service starts ok when restarted manually after the machine boot completes"

so yes, manually restarting the mdmonitor service works.
(In reply to Jan Vesely from comment #2)
> From the opening post:
>
> "the service starts ok when restarted manually after the machine boot
> completes"
>
> so yes, manual restarting the mdmonitor service works.

Could you share the lsblk output after boot, so I can try to reproduce this problem on my machine?
NAME                                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                                             8:0      0 931.5G  0 disk
├─md126                                         9:126    0   1.8T  0 raid0
│ ├─md126p1                                     259:4    0   260M  0 part
│ ├─md126p2                                     259:5    0    16M  0 part
│ ├─md126p3                                     259:6    0 325.8G  0 part
│ ├─md126p4                                     259:7    0   995M  0 part
│ └─md126p5                                     259:8    0   1.5T  0 part
│   └─luks-f3e4f35e-5a93-4c2a-a1b2-56da7dc057b6 253:4    0   1.5T  0 crypt /mnt/big-data
└─md127                                         9:127    0     0B  0 md
sdb                                             8:16     0 931.5G  0 disk
├─md126                                         9:126    0   1.8T  0 raid0
│ ├─md126p1                                     259:4    0   260M  0 part
│ ├─md126p2                                     259:5    0    16M  0 part
│ ├─md126p3                                     259:6    0 325.8G  0 part
│ ├─md126p4                                     259:7    0   995M  0 part
│ └─md126p5                                     259:8    0   1.5T  0 part
│   └─luks-f3e4f35e-5a93-4c2a-a1b2-56da7dc057b6 253:4    0   1.5T  0 crypt /mnt/big-data
└─md127                                         9:127    0     0B  0 md
zram0                                           252:0    0  15.5G  0 disk  [SWAP]
nvme0n1                                         259:0    0 238.5G  0 disk
├─nvme0n1p1                                     259:1    0   200M  0 part  /boot/efi
├─nvme0n1p2                                     259:2    0     1G  0 part  /boot
└─nvme0n1p3                                     259:3    0 237.3G  0 part
  └─luks-a99fc9de-32dd-43f7-9133-699710a861ef   253:0    0 237.3G  0 crypt
    ├─fedora-root                               253:1    0    50G  0 lvm   /
    ├─fedora-swap                               253:2    0  15.7G  0 lvm   [SWAP]
    └─fedora-home                               253:3    0 171.6G  0 lvm   /home

Sorry for the delay.
I have this same problem on a virtual private server (VPS) at OVH that I just upgraded from Fedora 34 to 35.

NAME    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda     8:0      0   1.8T  0 disk
├─sda1  8:1      0   511M  0 part
├─sda2  8:2      0   1.8T  0 part
│ └─md2 9:2      0   3.6T  0 raid0 /var/lib/containers/storage/overlay
│                                  /
├─sda3  8:3      0   512M  0 part  [SWAP]
└─sda4  8:4      0     2M  0 part
sdb     8:16     0   1.8T  0 disk
├─sdb1  8:17     0   511M  0 part  /boot/efi
├─sdb2  8:18     0   1.8T  0 part
│ └─md2 9:2      0   3.6T  0 raid0 /var/lib/containers/storage/overlay
│                                  /
└─sdb3  8:19     0   512M  0 part  [SWAP]
still fails the same way in f36
I see similar failures in at least one more service:

dnsmasq[1675]: chown of PID file /run/nm-dnsmasq-wlo1.pid failed: Operation not permitted

but in the case of dnsmasq it doesn't result in service failure. It looks like the "chown of PID file" is coming from systemd instead of the mdadm daemon.

I tried to follow the "yet?" part and changed the mdmonitor service file to include:

ExecStart=sh -c "/sbin/mdadm --monitor --scan --syslog -f --pid-file=/run/mdadm/mdadm.pid && sleep 5"

This instead makes the issue 100% reproducible, even when starting mdmonitor manually. The experiment points to a race between systemd accessing the file set up in "PIDFile=" and mdadm deleting the pidfile after exiting.

Removing the "PIDFile=" part of the mdmonitor.service file works around the issue.
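One way to look at the suspected race by hand is to inspect the PID file right after a start attempt. The following is only a minimal sketch: `check_pidfile` is a hypothetical helper (not part of mdadm or systemd), and it assumes a Linux `/proc` filesystem to decide whether the recorded process is still alive.

```shell
#!/bin/sh
# check_pidfile PATH
# Hypothetical diagnostic helper: prints one of
#   "missing"     - PATH does not exist
#   "unreadable"  - PATH exists but does not contain a plain number
#   "alive PID"   - PATH names a process that currently exists in /proc
#   "stale PID"   - PATH names a process that is gone
# Returns 0 only in the "alive" case.
check_pidfile() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "missing"
        return 1
    fi
    pid=$(cat "$path" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*)
            echo "unreadable"
            return 1
            ;;
    esac
    # Assumption: /proc is mounted, as on a normal Fedora system.
    if [ -d "/proc/$pid" ]; then
        echo "alive $pid"
        return 0
    fi
    echo "stale $pid"
    return 1
}
```

For example, running `systemctl start mdmonitor; check_pidfile /run/mdadm/mdadm.pid` a few times in a row should show whether the file is missing or stale at the moment the start job fails. It only observes the file from outside, so it cannot prove which process removed it.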
This message is a reminder that Fedora Linux 35 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '35'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 35 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13. Fedora Linux 35 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.
reopening, the issue is still present in fedora 37
This message is a reminder that Fedora Linux 37 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 37 on 2023-12-05. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '37'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see it. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 37 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 37 entered end-of-life (EOL) status on 2023-12-05. Fedora Linux 37 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.
Hello all. I am writing in this ticket as I have experienced this issue today in Fedora 41. I am running on a Mac Pro 5,1 and have 4x 1TB HDDs in each SATA2 port.

>>> Removing the "PIDFile=" part of mdmonitor.service file works around the issue.

Jan's advice still holds to this day. It seems there is a race condition when the service is being started, which then causes the service to fail because the file seemingly cannot be written and read at the same time.

systemd[1]: mdmonitor.service: Can't open PID file /run/mdadm/mdadm.pid (yet?) after start: No such file or directory

Here are the steps for reproduction:

# Create the RAID
sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sd[a-d]1
# Create ext4 fs
sudo mkfs.ext4 /dev/md0
# Create /etc/mdadm.conf from assembled array
sudo mdadm --detail --scan --verbose | sudo tee -a /etc/mdadm.conf
# Add to fstab and set nofail otherwise feel pain
sudo blkid /dev/md0
UUID=36149ec5-d782-4ad7-9dc0-962537b9870b /mnt/data ext4 defaults,nofail 0 0
sudo mount -a
# Make it mine :)
sudo chown $(whoami) /mnt/data
# I am unsure if this helps really but I've seen that it should be done
sudo dracut --force --add mdraid
# Reboot
sudo systemctl reboot

Intended result: the 'ConditionPathExists=/etc/mdadm.conf' line in the service config should kick the service into running at next boot, therefore the RAID should mount.

Problem: The service starts now that the config is there. However, the service dies:

Feb 15 16:07:07 hostname systemd[1]: Starting mdmonitor.service - Software RAID monitoring and management...
Feb 15 16:07:07 hostname systemd[1]: mdmonitor.service: Can't open PID file /run/mdadm/mdadm.pid (yet?) after start: No such file or directory
Feb 15 16:07:07 hostname systemd[1]: mdmonitor.service: Failed with result 'protocol'.
Feb 15 16:07:07 hostname systemd[1]: Failed to start mdmonitor.service - Software RAID monitoring and management.
Workaround/Solution:

sudo systemctl edit --full mdmonitor.service

Comment out this line:

#PIDFile=/run/mdadm/mdadm.pid

The service will start on next boot and mount the RAID in the correct mountpoint. :D)

However, if something goes wrong I assume systemd not knowing the PID will cause trouble.
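The same workaround can probably be expressed as a drop-in instead of a full copy of the unit. This is a sketch under assumptions, not a tested fix: it assumes that an empty `PIDFile=` assignment in a drop-in resets the value from the packaged unit on the installed systemd version. The helper name `write_pidfile_override`, the `DROPIN_ROOT` variable and the file name `10-no-pidfile.conf` are made up for illustration.

```shell
#!/bin/sh
# write_pidfile_override UNIT
# Hypothetical helper: writes a drop-in that clears PIDFile= for UNIT and
# prints the path of the file it wrote.
# DROPIN_ROOT defaults to /etc/systemd/system; point it at a scratch
# directory to try the helper without touching the real system.
write_pidfile_override() {
    unit=$1
    root=${DROPIN_ROOT:-/etc/systemd/system}
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    # Assumption: an empty assignment resets PIDFile= set by the packaged unit.
    printf '[Service]\nPIDFile=\n' > "$dir/10-no-pidfile.conf" || return 1
    printf '%s\n' "$dir/10-no-pidfile.conf"
}
```

After calling `write_pidfile_override mdmonitor.service` as root, `systemctl daemon-reload` followed by `systemctl show -p PIDFile mdmonitor.service` should show whether the setting is really cleared. Unlike `systemctl edit --full`, a drop-in leaves the packaged unit file in place, so later mdadm package updates to the unit still apply.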