Bug 1375002
Summary: Intel IMSM raid starts resync after each reboot
Product: Fedora
Component: mdadm
Version: 24
Hardware: x86_64
OS: Linux
Status: CLOSED EOL
Severity: high
Priority: unspecified
Reporter: Oleg Samarin <osamarin68>
Assignee: XiaoNi <xni>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: agk, dledford, Jes.Sorensen, osamarin68, xni
Target Milestone: ---
Target Release: ---
Type: Bug
Last Closed: 2017-08-08 17:17:23 UTC
Hi Oleg,

As far as I know, it should do a resync after creating the fake raid. What happens if you let the resync finish?

Thanks
Xiao

In the attached log the resync has finished, but it starts again after reboot. More tests: there was no such problem in F23. But if I boot F24, wait for the resync to finish, then reboot into F23, the resync starts again in F23. So the root problem is not at start time, but at shutdown time. It seems F24 unmounts some filesystems incorrectly, which triggers a resync after the next boot. How can I force mdadm to log the reason why it starts a resync?

I checked the dmesg. It starts a resync because the raid is not clean.

Are the underlying devices OK? Can you try with other devices?

Sep 11 19:52:15 oleg2.oleghome kernel: md: bind<sda>
Sep 11 19:52:15 oleg2.oleghome systemd-udevd[499]: inotify_add_watch(8, /dev/sda2, 10) failed: No such file or directory
Sep 11 19:52:15 oleg2.oleghome systemd-udevd[518]: inotify_add_watch(8, /dev/sda1, 10) failed: No such file or directory
Sep 11 19:52:15 oleg2.oleghome kernel: md: bind<sdb>
Sep 11 19:52:15 oleg2.oleghome kernel: md: bind<sdb>
Sep 11 19:52:15 oleg2.oleghome kernel: md: bind<sda>
Sep 11 19:52:15 oleg2.oleghome kernel: md: raid1 personality registered for level 1
Sep 11 19:52:15 oleg2.oleghome kernel: md/raid1:md126: not clean -- starting background reconstruction
Sep 11 19:52:15 oleg2.oleghome kernel: md/raid1:md126: active with 2 out of 2 mirrors
Sep 11 19:52:15 oleg2.oleghome kernel: md126: detected capacity change from 0 to 2000395698176

All devices are OK. F23 on the same machine does not trigger a resync. What does "clean" mean? Why can the devices be not clean?

There is a judgement in raid1.c:

	if (mddev->recovery_cp != MaxSector)
		printk(KERN_NOTICE "md/raid1:%s: not clean"
		       " -- starting background reconstruction\n",
		       mdname(mddev));

If the raid device has finished the resync, recovery_cp should be MaxSector. After rebooting, the raid device is created and it checks recovery_cp.
And it goes into this judgement.

I have tried this on my workstation. I have just two disks. First I created a RAID1 in the Intel BIOS tool. Then I installed F24 on the RAID1 device, using the default partitioning (I clicked the automatic option). I can't reproduce this problem. Can you add some logging to the code and build a new kernel to check where and when recovery_cp is modified?

Maybe this behavior is a consequence of https://bugzilla.redhat.com/show_bug.cgi?id=1371991 ?

Did you see the same log "mdadm: error waiting for XXX to become clean" during shutdown?

Yes, I did.

Created attachment 1203001 [details]
screenshot of shutdown -H with Fedora24
Created attachment 1203002 [details]
screenshot of shutdown -H with Fedora23
Another hypothesis: the red message 'Failed unmounting /var'. Maybe it marks the raid as not clean? But the same message is present in the Fedora 23 screenshot (white instead of red), and there it did not force a resync on the next startup.

I added more debugging to mdadm and repeated shutdown -H. Monitor.c, WaitClean:

------------------------------------------------
	/* wait for array_state to be clean */
	while (1) {
		rv = read(state_fd, buf, sizeof(buf));
		if (rv < 0)
			break;
		if (sysfs_match_word(buf, clean_states) <= 4)
			break;
		rv = sysfs_wait(state_fd, &delay);
		if (rv < 0 && errno != EINTR)
			break;
		lseek(state_fd, 0, SEEK_SET);
	}
	if (rv < 0)
		rv = 1;
	else if (fping_monitor(sock) == 0 ||
		 ping_monitor(mdi->text_version) == 0) {
		/* we need to ping to close the window between array
		 * state transitioning to clean and the metadata being
		 * marked clean */
		rv = 0;
	} else
		rv = 1;
	if (rv && verbose)
		pr_err("Error waiting for %s to be clean\n", dev);
-------------------------------------------------

It breaks the loop on the condition 'if (sysfs_match_word(buf, clean_states) <= 4)', because in my case buf == "clean". It then tries to call fping_monitor(sock) and ping_monitor(mdi->text_version). They both returned -1, so the rv variable remains equal to 1. So waiting for clean was successful, but (f)ping_monitor was not. Maybe dracut had killed the remaining processes before that. But it did the same in F23. I think we are close to the key reason.

I can't see the message during the shutdown. The function WaitClean is called in misc_scan and misc_list. I don't know who calls them at shutdown time. Can you help me add some logging to check this? And the reason you guess may be right. Can you add some logging in ping_monitor to check where it returns -1? We don't need to check fping_monitor; the sock it receives is always -1.

I've already added logging. ping_monitor returns -1 in my case. Where should I add logging for misc_scan and misc_list?
Both misc_scan and misc_list are called from mdadm.c::main while processing an 'mdadm misc' command. Something like this:

@@ -227,29 +227,50 @@ int main(int argc, char *argv[])
 			break;
 		case MiscOpt:
+			printf("MISC : MiscOpt\n");
 		case 'D':
+			printf("MISC : D\n");
 		case 'E':
+			printf("MISC : E\n");
 		case 'X':
+			printf("MISC : X\n");
 		case 'Q':
+			printf("MISC : Q\n");
 		case ExamineBB:
+			printf("MISC : ExamineBB\n");
 		case Dump:
+			printf("MISC : Dump\n");
 		case Restore:
+			printf("MISC : Restore\n");
 		case Action:
+			printf("MISC : Action\n");
 			newmode = MISC;
 			break;
 		case 'R':
+			printf("MISC : R\n");
 		case 'S':
+			printf("MISC : S\n");
 		case 'o':
+			printf("MISC : o\n");
 		case 'w':
+			printf("MISC : w\n");
 		case 'W':
+			printf("MISC : W\n");
 		case WaitOpt:
+			printf("MISC : WaitOpt\n");
 		case Waitclean:
+			printf("MISC : Waitclean\n");
 		case DetailPlatform:
+			printf("MISC : DetailPlatform\n");
 		case KillSubarray:
+			printf("MISC : KillSubarray\n");
 		case UpdateSubarray:
+			printf("MISC : UpdateSubarray\n");
 		case UdevRules:
+			printf("MISC : UdevRules\n");
 		case KillOpt:
+			printf("MISC : KillOpt\n");

@@ -1511,16 +1532,19 @@ int main(int argc, char *argv[])
 		if (devmode == 'S' && c.scan)
 			rv = stop_scan(c.verbose);
 		else if ((devmode == 'D' || devmode == Waitclean) &&
-			 c.scan)
+			 c.scan) {
+			printf("Now call misc_scan\n");
 			rv = misc_scan(devmode, &c);
-		else if (devmode == UdevRules)
+		} else if (devmode == UdevRules)
 			rv = Write_rules(udev_filename);
 		else {
 			pr_err("No devices given.\n");
 			exit(2);
 		}
-	} else
+	} else {
+		printf("Now call misc_list\n");
 		rv = misc_list(devlist, &ident, dump_directory, ss, &c);
+	}
 		break;

I know it's not a smart way to check the problem, but it's a way to find out from where Waitclean is called.

mdadm was run twice during shutdown:

1. [ 1474.787226] dracut: Waiting for mdraid devices to be clean.
[ 1474.788096] dracut: MISC : Waitclean
[ 1474.788159] dracut: MISC : DetailPlatform
[ 1474.788214] dracut: MISC : KillSubarray
[ 1474.788266] dracut: MISC : UpdateSubarray
[ 1474.788317] dracut: MISC : UdevRules
[ 1474.788367] dracut: MISC : KillOpt
[ 1474.788418] dracut: Now call misc_scan

2. [ 1474.788653] dracut: Disassembling mdraid devices.
[ 1474.789517] dracut: MISC : S
[ 1474.789596] dracut: MISC : o
[ 1474.789664] dracut: MISC : w
[ 1474.789730] dracut: MISC : W
[ 1474.789796] dracut: MISC : WaitOpt
[ 1474.789878] dracut: MISC : Waitclean
[ 1474.789949] dracut: MISC : DetailPlatform
[ 1474.790016] dracut: MISC : KillSubarray
[ 1474.790084] dracut: MISC : UpdateSubarray
[ 1474.790150] dracut: MISC : UdevRules
[ 1474.790216] dracut: MISC : KillOpt

Created attachment 1204261 [details]
dmesg with 'setenforce 0'
If I run 'setenforce 0' before shutdown, the raid remains clean across reboot.
The attached dmesg output was captured just before the system halted.
The reason for the 'Error waiting for %s to be clean' message is a wrong SELinux context on md127.sock.
The reason the raid is marked not clean is a wrong SELinux context on map.new.
So, I'd like the following:

1. Change the error message from 'Error waiting for %s to be clean' to 'Unable to update metadata: permission denied on md127.sock'
2. Add an error message 'Error modifying map.new: permission denied'
3. Correct the SELinux policy for md127.sock and map.new

These are the commands run by dracut during shutdown (from /usr/lib/dracut/modules.d/90mdraid/md-shutdown.sh):

	local _offroot=$(strstr "$(mdadm --help-options 2>&1)" offroot && echo --offroot)
	info "Waiting for mdraid devices to be clean."
	mdadm $_offroot -vv --wait-clean --scan | vinfo
	ret=$?
	info "Disassembling mdraid devices."
	mdadm $_offroot -vv --stop --scan | vinfo
	ret=$(($ret+$?))

I saw the following in the dmesg:

	[    4.365599] systemd[1]: Successfully loaded SELinux policy in 89.640ms.
	[    4.381281] systemd[1]: RTC configured in localtime, applying delta of 180 minutes to system time.
	[    4.402737] systemd[1]: Unable to fix SELinux security context of /run/mdadm/md127.sock: Permission denied

It seems this causes the problems with communication to mdmon at shutdown.

I've created a separate bug 1379044 for correcting the SELinux policy. If it solves the issue, this bug will relate only to the incorrect error message at shutdown.

A solution was proposed at https://bugzilla.redhat.com/show_bug.cgi?id=1379044

Thanks for the analysis. So now it's a SElinux bug. Can we close this now?

(In reply to XiaoNi from comment #25)
> Thanks for the analysis. So now it's a SElinux bug. Can we close this now?

You probably need to reassign it to selinux-policy instead of mdadm

Cheers,
Jes

> Thanks for the analysis. So now it's a SElinux bug. Can we close this now?

Could you change the error message 'Error waiting for %s to be clean' to 'Unable to update metadata: permission denied on /run/mdadm/md127.sock' before closing this bug? The message 'Error waiting for %s to be clean' is not correct, because the array is in the clean state. It directs the troubleshooting in the wrong direction.
> You probably need to reassign it to selinux-policy instead of mdadm

The separate bug 1379044 has already been assigned to selinux-policy.

I made a mistake in comment 17: I forgot the break; in each case. But you gave me the answer of who calls WaitClean in comment 21 :) And I read the related code. I think it's better to add the error message in the function ping_monitor, like this:

[xni@xiao mdadm]$ git diff
diff --git a/msg.c b/msg.c
index 45cd450..ffde439 100644
--- a/msg.c
+++ b/msg.c
@@ -209,8 +209,10 @@ int ping_monitor(char *devname)
 	if (sfd >= 0) {
 		err = fping_monitor(sfd);
 		close(sfd);
-	} else
+	} else {
+		pr_err("Error connecting mdmonfor %s to be clean\n",
 		err = -1;
+	}
 
 	return err;
 }

Jes, what's your opinion?

Sorry, sending the patch again; please ignore the patch in comment 28.

diff --git a/msg.c b/msg.c
index 45cd450..dda091f 100644
--- a/msg.c
+++ b/msg.c
@@ -209,8 +209,10 @@ int ping_monitor(char *devname)
 	if (sfd >= 0) {
 		err = fping_monitor(sfd);
 		close(sfd);
-	} else
+	} else {
+		pr_err("Error connecting monitor\n");
 		err = -1;
+	}
 
 	return err;
 }

Thanks
Xiao

Wouldn't it be better to print the error message not here, but in connect_monitor, with more information? Something like this:

=============================================================================
	if (connect(sfd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
		pr_err("Error connecting monitor with %s: %s\n",
		       addr.sun_path, strerror(errno));
		close(sfd);
		return -1;
	}
=============================================================================

Xiao, I think it's fair to make the error message a little more elaborate.

Jes

This message is a reminder that Fedora 24 is nearing its end of life. Approximately two weeks from now Fedora will stop maintaining and issuing updates for Fedora 24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '24'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 24 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Hi Oleg,

Sorry for this. I missed this one. We already fixed this in F26. You can update to F26 to resolve this problem.

Thanks
Xiao

Fedora 24 changed to end-of-life (EOL) status on 2017-08-08. Fedora 24 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.

The absence of this fix blocks: https://github.com/dracutdevs/dracut/issues/342
Can the patch file please be applied?

Please disregard comment #35 -- applied to wrong bug ;(
Created attachment 1199902 [details]
a part of the output of journalctl -a

Description of problem:
After installing Fedora 24 on a BIOS fake-raid device, it starts rebuilding the array after each reboot. The BIOS shows the raid state as Normal, but after booting mdadm starts a resync.

I have two non-raided SSD devices and two hard disks assembled into one RAID1 volume. I have an LVM installation; some volumes are on the SSDs (/ and /boot) and some are on the raid volume (/var, /home and some other filesystems). I previously had Fedora 23 on the same box with the same configuration, without such problems.

Version-Release number of selected component (if applicable):
fedora 24
systemd-229-13.fc24.x86_64
mdadm-3.4-2.fc24.x86_64
lvm2-2.02.150-2.fc24.x86_64
dracut-044-20.fc24.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create an imsm raid1 in the BIOS
2. Install Fedora 24 with LVM on top of the imsm raid1
3. Reboot

Actual results:
mdadm starts a resync of /dev/md126

Expected results:
/dev/md126 is in the clean state

Additional info:
What more information should I provide?