Bug 1537845
| Summary: | Won't power off or reboot | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Todd <ToddAndMargo> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Priority: | unspecified |
| Version: | 27 | CC: | airlied, ajax, bskeggs, ewk, hdegoede, ichavero, itamar, jarodwilson, jcline, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, mchehab, mjg59, steved |
| Hardware: | x86_64 | OS: | Linux |
| Last Closed: | 2018-03-22 17:37:04 UTC | Type: | Bug |
Description
Todd
2018-01-24 00:41:17 UTC
This data may help:

# lsblk --list
NAME    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda       8:0    0 894.3G  0 disk
sdb       8:16   0 894.3G  0 disk
sdc       8:32   0 894.3G  0 disk
sdd       8:48   0   1.8T  0 disk
sdd1      8:49   0   1.8T  0 part
md126     9:126  0 849.5G  0 raid1
md126     9:126  0 849.5G  0 raid1
md126p1 259:0    0   200M  0 md    /boot/efi
md126p1 259:0    0   200M  0 md    /boot/efi
md126p2 259:1    0     1G  0 md    /boot
md126p2 259:1    0     1G  0 md    /boot
md126p3 259:2    0  15.7G  0 md    [SWAP]
md126p3 259:2    0  15.7G  0 md    [SWAP]
md126p4 259:3    0 832.7G  0 md    /
md126p4 259:3    0 832.7G  0 md    /

# find / -type d -iname md127
/sys/devices/virtual/block/md127

# ls -al /sys/devices/virtual/block/md127
total 0
drwxr-xr-x. 9 root root    0 Jan 23 16:14 .
drwxr-xr-x. 5 root root    0 Jan 23 16:14 ..
-r--r--r--. 1 root root 4096 Jan 23 19:41 alignment_offset
lrwxrwxrwx. 1 root root    0 Jan 23 19:41 bdi -> ../../bdi/9:127
-r--r--r--. 1 root root 4096 Jan 23 19:41 capability
-r--r--r--. 1 root root 4096 Jan 23 16:14 dev
-r--r--r--. 1 root root 4096 Jan 23 19:41 discard_alignment
-r--r--r--. 1 root root 4096 Jan 23 19:41 ext_range
drwxr-xr-x. 2 root root    0 Jan 23 19:10 holders
-r--r--r--. 1 root root 4096 Jan 23 19:41 inflight
drwxr-xr-x. 2 root root    0 Jan 23 19:10 integrity
drwxr-xr-x. 5 root root    0 Jan 23 16:14 md
drwxr-xr-x. 2 root root    0 Jan 23 19:10 power
drwxr-xr-x. 2 root root    0 Jan 23 19:06 queue
-r--r--r--. 1 root root 4096 Jan 23 19:41 range
-r--r--r--. 1 root root 4096 Jan 23 19:41 removable
-r--r--r--. 1 root root 4096 Jan 23 19:41 ro
-r--r--r--. 1 root root 4096 Jan 23 19:06 size
drwxr-xr-x. 2 root root    0 Jan 23 19:10 slaves
-r--r--r--. 1 root root 4096 Jan 23 19:41 stat
lrwxrwxrwx. 1 root root    0 Jan 23 16:14 subsystem -> ../../../../class/block
drwxr-xr-x. 2 root root    0 Jan 23 19:10 trace
-rw-r--r--. 1 root root 4096 Jan 23 16:14 uevent

# lsof | grep md127
mdmon 784     root 3u unix 0xffff9c9d18561800 0t0 18189 /run/mdadm/md127.sock type=STREAM
mdmon 784 786 root 3u unix 0xffff9c9d18561800 0t0 18189 /run/mdadm/md127.sock type=STREAM

Installation media:
https://download.fedoraproject.org/pub/fedora/linux/releases/27/Spins/x86_64/iso/Fedora-Xfce-Live-x86_64-27-1.6.iso

Created attachment 1390426 [details]
Server #1 shutdown free at kill all
I now have two servers doing this. Both seize up at the killall stage. This screenshot is the clearer of the two.
Created attachment 1390427 [details]
Server #2 shutdown free at kill all
This screen shot is blurry, but if you squint right, the issue is also at the killall stage (I confirmed it with the customer).
Extra information, possibly a clue. I now have a server in my office with the identical motherboard, memory sticks, processor, and operating system installed from the same medium. My server has no such shutdown issue. The difference is that my server is using SATA and not RSTe RAID.

If of interest, my server is set to boot from either a (BIOS) SATA drive or a (UEFI) NVMe drive. The issue does not occur from either drive.

Looking at the screenshot and comparing it to my non-software-RAID computers, I see both have the following:

Mar 07 02:16:23 rn6.foo.local systemd[1]: Reached target Shutdown.

But the following is missing from the RAID ones:

Mar 07 02:16:23 rn6.foo.local systemd[1]: Reached target Final Step.
Mar 07 02:16:23 rn6.foo.local systemd[1]: Starting Power-Off...
Mar 07 02:16:23 rn6.foo.local audit[1]: USER_AVC pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='avc: received setenforce notice (enforcing=1) exe="/usr/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
Mar 07 02:16:23 rn6.foo.local systemd[1]: Shutting down.
Mar 07 02:16:24 rn6.foo.local systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Mar 07 02:16:24 rn6.foo.local systemd-journald[1003]: Journal stopped

The RAID ones get the following instead:

mount: /oldsys/sys: filesystem was mounted, but subsequent operations failed: Permission denied
mount: /oldsys/proc: filesystem was mounted, but subsequent operations failed: Permission denied
mount: /oldsys/run: filesystem was mounted, but subsequent operations failed: Permission denied
mount: /oldsys/dev: filesystem was mounted, but subsequent operations failed: Permission denied
mdmon: Error connecting monitor with /run/mdadm/md127.sock: Permission denied
dracut Warning: Killing all remaining processes

Ran updates and retested. The worst-offending server, which used to freeze every time, went through about ten reboots and three poweroffs without a hitch. The second server, which froze about 50% of the time, went through two reboots without a hitch. And now they are both working perfectly. THANK YOU!

$ uname -r
4.15.8-300.fc27.x86_64

Great, thanks for letting us know.
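For anyone repeating the log comparison made earlier in this report, here is a minimal sketch of how it could be automated. It assumes the shutdown log has already been saved as plain text (for example, transcribed from a screenshot or copied from the journal of a previous boot). The milestone strings are taken from the healthy poweroff log quoted in this report; the names `EXPECTED_MILESTONES` and `missing_milestones` are hypothetical, and a reboot would likely show a different "Starting ..." line than a poweroff.

```python
# Milestones copied from the healthy (non-RAID) poweroff log quoted in this
# report, in the order they appeared. This list is an assumption about what a
# "good" shutdown looks like on this Fedora 27 setup, not a systemd guarantee.
EXPECTED_MILESTONES = [
    "Reached target Shutdown.",
    "Reached target Final Step.",
    "Starting Power-Off...",
    "Shutting down.",
    "Sending SIGTERM to remaining processes...",
]


def missing_milestones(log_text):
    """Return the expected milestones that never appear in log_text."""
    return [m for m in EXPECTED_MILESTONES if m not in log_text]


if __name__ == "__main__":
    # Sample built from the lines the RAID servers were reported to show.
    raid_log = (
        "Mar 07 02:16:23 rn6.foo.local systemd[1]: Reached target Shutdown.\n"
        "mdmon: Error connecting monitor with /run/mdadm/md127.sock: "
        "Permission denied\n"
        "dracut Warning: Killing all remaining processes\n"
    )
    for milestone in missing_milestones(raid_log):
        print("missing:", milestone)
```

A log that stops right after "Reached target Shutdown." would report the remaining four milestones as missing, which matches the symptom described for the RAID servers.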