Bug 692248 - Job dev-disk-by\x2duuid-... timing out (/boot on /dev/md0)
Job dev-disk-by\x2duuid-... timing out (/boot on /dev/md0)
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: mdadm (Show other bugs)
15
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: Doug Ledford
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2011-03-30 15:39 EDT by Ian Pilcher
Modified: 2011-04-15 17:11 EDT (History)
9 users (show)

See Also:
Fixed In Version: mdadm-3.1.5-2.fc15
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-04-15 17:11:51 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
kernel command line (666 bytes, text/plain)
2011-03-30 15:39 EDT, Ian Pilcher
no flags Details
dmesg (163.29 KB, text/plain)
2011-03-30 15:40 EDT, Ian Pilcher
no flags Details
/etc/fstab (1.04 KB, text/plain)
2011-03-30 15:40 EDT, Ian Pilcher
no flags Details
/var/log/messages (88.05 KB, text/plain)
2011-03-30 15:44 EDT, Ian Pilcher
no flags Details
systemctl dump (681.19 KB, text/plain)
2011-03-30 15:45 EDT, Ian Pilcher
no flags Details
/bin/systemd --test --system --log-level=debug (as non-root) (752.88 KB, text/plain)
2011-03-30 15:46 EDT, Ian Pilcher
no flags Details
dmesg with systemd-21-2.fc15.x86_64 and udev-164-9.fc15.x86_64 (221.48 KB, text/plain)
2011-03-30 17:37 EDT, Ian Pilcher
no flags Details

  None (edit)
Description Ian Pilcher 2011-03-30 15:39:29 EDT
Description of problem:
Booting with the (anaconda-created) entry for my /boot filesystem in
/etc/fstab, the boot process pauses for 3 minutes, and the /boot
filesystem is not mounted.  I have tried using "LABEL=/boot ..." in
/etc/fstab, and that also gives me the delay/timeout.

Booting with systemd.log_level=debug gives me the following messages
related to this device/filesystem:

[    9.463964] systemd[1]: Installed new job fsck@dev-disk-by\x2duuid-67a74931\x2d48ac\x2d4bdd\x2d91d2\x2d3e788398d472.service/start as 35
[    9.463968] systemd[1]: Installed new job dev-disk-by\x2duuid-67a74931\x2d48ac\x2d4bdd\x2d91d2\x2d3e788398d472.device/start as 36
[  189.123507] systemd[1]: Job dev-disk-by\x2duuid-67a74931\x2d48ac\x2d4bdd\x2d91d2\x2d3e788398d472.device/start timed out.
[  189.124769] systemd[1]: Job dev-disk-by\x2duuid-67a74931\x2d48ac\x2d4bdd\x2d91d2\x2d3e788398d472.device/start finished, result=timeout
[  189.127221] systemd[1]: Job fsck@dev-disk-by\x2duuid-67a74931\x2d48ac\x2d4bdd\x2d91d2\x2d3e788398d472.service/start finished, result=dependency

Other things to note:

*  md0 is being started.
*  If I specify the device (/dev/md0) in /etc/fstab, boot proceeds normally
   and the filesystem is mounted.
*  All of the attached files were generated with SELinux in permissive mode.
*  The attached files show a boot into runlevel 1.

Version-Release number of selected component (if applicable):
systemd-20-2.fc15.x86_64

How reproducible:
100%

Steps to Reproduce:
See above
  
Actual results:
3 minute pause during boot (systemd timeout); /boot not mounted.

Expected results:
No pause; /boot mounted.

Additional info:
I've seen a very similar problem
Comment 1 Ian Pilcher 2011-03-30 15:39:58 EDT
Created attachment 488882 [details]
kernel command line
Comment 2 Ian Pilcher 2011-03-30 15:40:28 EDT
Created attachment 488883 [details]
dmesg
Comment 3 Ian Pilcher 2011-03-30 15:40:49 EDT
Created attachment 488884 [details]
/etc/fstab
Comment 4 Ian Pilcher 2011-03-30 15:44:36 EDT
Created attachment 488885 [details]
/var/log/messages
Comment 5 Ian Pilcher 2011-03-30 15:45:16 EDT
Created attachment 488886 [details]
systemctl dump
Comment 6 Ian Pilcher 2011-03-30 15:46:11 EDT
Created attachment 488887 [details]
/bin/systemd --test --system --log-level=debug (as non-root)
Comment 7 Michal Schmidt 2011-03-30 17:03:48 EDT
What information does udev have about md0?:
udevadm info --query=all --name=/dev/md0
Comment 8 Ian Pilcher 2011-03-30 17:35:36 EDT
(In reply to comment #7)
> What information does udev have about md0?:
> udevadm info --query=all --name=/dev/md0

Interesting you should ask.  I just noticed that the /dev/disk/by-uuid and
/dev/disk/by-label symlinks were not getting created for md0.  I downgraded
my udev packages to the F15 Alpha version (udev-164-9.fc15.x86_64), and the
symlinks are back:

lrwxrwxrwx. 1 root root  9 Mar 30 16:19 /dev/disk/by-uuid/67a74931-48ac-4bdd-91d2-3e788398d472 -> ../../md0

 lrwxrwxrwx. 1 root root  9 Mar 30 16:19 /dev/disk/by-label/\x2fboot -> ../../md0

Still getting the same timeout, though.

Here is what udev says about md0:

P: /devices/virtual/block/md0
N: md0
S: block/9:0
E: UDEV_LOG=3
E: DEVPATH=/devices/virtual/block/md0
E: MAJOR=9
E: MINOR=0
E: DEVNAME=/dev/md0
E: DEVTYPE=disk
E: SUBSYSTEM=block
E: UDISKS_PRESENTATION_NOPOLICY=1
E: DEVLINKS=/dev/block/9:0
E: TAGS=:systemd:
Comment 9 Ian Pilcher 2011-03-30 17:37:29 EDT
Created attachment 488906 [details]
dmesg with systemd-21-2.fc15.x86_64 and udev-164-9.fc15.x86_64
Comment 10 Ian Pilcher 2011-03-30 17:53:56 EDT
Waited through the timeout and booted up.  I just tried:

sudo systemctl start dev-disk-by\x2duuid-67a74931\x2d48ac\x2d4bdd\x2d91d2\x2d3e788398d472.device

And sure enough it's just sitting there.  ps fax shows:

 2956 pts/2    S+     0:00      \_ sudo systemctl start dev-disk-byx2duuid-67a74931x2d48acx2d4bddx2d91d2x2d3e788398d472.device
 2957 pts/2    S+     0:00          \_ systemctl start dev-disk-byx2duuid-67a74931x2d48acx2d4bddx2d91d2x2d3e788398d472.device
 2958 pts/2    S+     0:00              \_ /bin/systemd-tty-ask-password-agent --watch

Obviously, I don't know if this is the same thing that blocks it during
boot.

There is not encryption on md0, so there's no reason for it to be asking
for a password.  Also, SELinux is in permissive mode and there's nothing
about systemd-tty-ask-password-agent in my audit.log.
Comment 11 Ian Pilcher 2011-03-30 17:59:06 EDT
(In reply to comment #10)
> There is not encryption on md0, so there's no reason for it to be asking
> for a password.  Also, SELinux is in permissive mode and there's nothing
> about systemd-tty-ask-password-agent in my audit.log.

Actually, I did get this denial logged:

type=AVC msg=audit(1301520887.101:145): avc:  denied  { read } for  pid=2732 comm="systemd-tmpfile" name="user" dev=tmpfs ino=17882 scontext=system_u:system_r:systemd_tmpfiles_t:s0 tcontext=system_u:object_r:default_t:s0 tclass=dir


type=SYSCALL msg=audit(1301520887.101:145): arch=x86_64 syscall=open success=yes exit=EINTR a0=2187560 a1=90800 a2=18c27 a3=0 items=0 ppid=1 pid=2732 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=systemd-tmpfile exe=/bin/systemd-tmpfiles subj=system_u:system_r:systemd_tmpfiles_t:s0 key=(null)

But again, SELinux is in permissive mode, so this should cause the timeout,
right?
Comment 12 Michal Schmidt 2011-03-30 18:02:25 EDT
Right, in permissive mode SELinux will not actually block anything. So it's not the cause of the timeout.

There's suspiciously few lines of information in the udev output in comment #8. There is no sign of knowledge about the filesystem on the device.
Let's blame udev.
Comment 13 Ian Pilcher 2011-03-30 18:34:29 EDT
I've just booted into emergency mode, attempted to "systemctl start" the
device, and verified that it's (apparently) stuck on

  systemd-tty-ask-password-agent --watch

gdb tells me that systemd-tty-ask-password-agent is waiting here:

#0  0x00000030cb0d72b8 in __poll (fds=0x7fff55f30380, nfds=2, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:83
        resultvar = 18446744073709551100
        oldtype = <optimized out>
        result = <optimized out>
#1  0x0000000000402d7c in watch_passwords () at src/tty-ask-password-agent.c:593
        signal_fd = 5
        tty_block_fd = <optimized out>
        mask = {__val = {16386, 0 <repeats 15 times>}}
        notify = 4
        pollfd = {{fd = 4, events = 1, revents = 0}, {fd = 5, events = 1, revents = 0}}
        r = <optimized out>
#2  main (argc=<optimized out>, argv=<optimized out>) at src/tty-ask-password-agent.c:732
        r = <optimized out>
        __func__ = "main"

(In reply to comment #12)
> Right, in permissive mode SELinux will not actually block anything. So it's not
> the cause of the timeout.

Thanks for confirming that.  I'm starting to doubt my own sanity.

> There's suspiciously few lines of information in the udev output in comment #8.
> There is no sign of knowledge about the filesystem on the device.
> Let's blame udev.

Umm ... let's not.  As I noted above, I've already downgraded my udev
packages to the versions that work on a fresh install from the Alpha
DVD.

Also, it seems pretty clear that it's systemd-tty-ask-password-agent
that's hanging up.  In the absence of a reason to think otherwise, that
seems to put the problem pretty squarely in systemd-land.
Comment 14 Ian Pilcher 2011-03-30 22:09:44 EDT
This problem appears to be caused/triggered by mdadm.

Downgrading from 3.1.5-1.fc15 to 3.1.3-0.git20100804.2.fc15 makes
this problem go away.
Comment 15 Michal Schmidt 2011-03-31 04:21:48 EDT
/lib/udev/rules.d/64-md-raid.rules seems to have disappeared from the mdadm package. Was it intentional?
Comment 16 Ian Pilcher 2011-03-31 10:27:06 EDT
(In reply to comment #15)
> /lib/udev/rules.d/64-md-raid.rules seems to have disappeared from the mdadm
> package. Was it intentional?

I just confirmed that mdadm-3.1.5-1.fc15.x86_64 with /lib/udev/rules.d/64-md-raid.rules from mdadm-3.1.3-0.git20100804.2.fc15 boots just fine.  So it looks like the absence of that file is indeed the problem.
Comment 17 Doug Ledford 2011-03-31 12:54:21 EDT
This doesn't make any sense though.  I just checked, and the 64-md-raid.rules file was missing from the previous mdadm too.  Somewhere, it got removed but I couldn't find where the change happened looking through git logs.  But, I also remember adding the file in rawhide before.  So, I readded it and am rebuilding now.
Comment 18 Fedora Update System 2011-03-31 13:09:13 EDT
mdadm-3.1.5-2.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/mdadm-3.1.5-2.fc15
Comment 19 Ian Pilcher 2011-03-31 13:42:19 EDT
Confirmed that mdadm-3.1.5-2.fc15 fixes the problem for me.
Comment 20 Fedora Update System 2011-03-31 15:44:54 EDT
Package mdadm-3.1.5-2.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mdadm-3.1.5-2.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/mdadm-3.1.5-2.fc15
then log in and leave karma (feedback).
Comment 21 Fedora Update System 2011-04-15 17:11:17 EDT
mdadm-3.1.5-2.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.