Bug 1225641

Summary: udevadm settle runs very slowly when there are 256 disks
Product: [Fedora] Fedora
Reporter: Richard W.M. Jones <rjones>
Component: systemd
Assignee: systemd-maint
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: johannbg, jsynacek, lnykryn, msekleta, s, systemd-maint, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: systemd-220-3.fc23
Doc Type: Bug Fix
Last Closed: 2015-05-28 09:56:27 UTC
Type: Bug
Bug Blocks: 910269    
Attachments:
virt-rescue log, including udevadm strace (flags: none)

Description Richard W.M. Jones 2015-05-27 20:56:39 UTC
Created attachment 1030811
virt-rescue log, including udevadm strace

Description of problem:

'udevadm settle' in systemd >= 220 runs very slowly when there are a
large number (256) of disks and you add a partition to at least one
of the disks.  This behaviour was not seen in earlier versions.

The libguestfs test which fails is:
https://github.com/libguestfs/libguestfs/blob/master/tests/disks/test-max-disks.pl

Version-Release number of selected component (if applicable):

systemd-220-2.fc23.x86_64
kernel-4.1.0-0.rc5.git0.1.fc23.x86_64

How reproducible:

100%

Steps to Reproduce:

TBH this is very tricky to reproduce with a general guest, but here is one way
to reproduce it quite easily with libguestfs:

1. dnf install libguestfs-tools-c libguestfs-rescue

2. Run the following command as non-root:

virt-rescue --scratch=255

This command creates a simple appliance with 255 scratch disks (/dev/sda, /dev/sdb, etc.).

3. In the rescue shell, run the following commands:

parted -s -- /dev/sda mklabel msdos
parted -s -- /dev/sda mkpart primary 64s 127s
udevadm --debug settle

Actual results:

The udevadm command above hangs for a long time (at least a few minutes).
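
When reproducing this, it can help to bound the wait rather than let the
shell hang.  The following is only a minimal sketch and not part of the
original test: timed_run is a hypothetical helper name, and it assumes the
coreutils timeout command is available inside the rescue appliance.

```shell
# timed_run LIMIT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under a time limit and prints its
# exit status together with the elapsed wall-clock time in seconds.
timed_run() {
    limit=$1
    shift
    start=$(date +%s)
    status=0
    # timeout(1) kills the command once the limit is reached.
    timeout "$limit" "$@" || status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
    return "$status"
}

# Example invocation in the rescue shell (left commented out here):
# timed_run 30 udevadm settle
```

With this helper, a settle that never finishes is cut off after the given
number of seconds; timeout reports exit status 124 when the limit is hit.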

Additional info:

I ran udevadm under strace.  The full strace is attached, but the bit where
it hangs just does this over and over:

access("/run/udev/queue", F_OK)         = 0
poll([{fd=3, events=POLLIN}], 1, 1000)  = 0 (Timeout)
access("/run/udev/queue", F_OK)         = 0
poll([{fd=3, events=POLLIN}], 1, 1000)  = 0 (Timeout)
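
The loop in the strace can be read as "while the queue file exists, wait up
to one second and check again".  As a rough illustration of that observed
behaviour only (this is not udevadm's actual source, and wait_queue_gone is
a hypothetical name), the same logic in shell looks something like:

```shell
# wait_queue_gone QUEUE_FILE MAX_SECONDS
# Hypothetical helper mirroring the strace output above.
# Returns 0 once QUEUE_FILE no longer exists, or 1 if it is still present
# after MAX_SECONDS one-second waits.
wait_queue_gone() {
    queue_file=$1
    max_seconds=$2
    waited=0
    # The existence test plays the role of access(path, F_OK) = 0.
    while [ -e "$queue_file" ]; do
        if [ "$waited" -ge "$max_seconds" ]; then
            return 1
        fi
        # Stands in for the poll() call with its 1000 ms timeout.
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example invocation (left commented out here):
# wait_queue_gone /run/udev/queue 10
```

If nothing ever removes the queue file, a loop of this shape can only end
when its own limit is reached.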

Comment 1 Richard W.M. Jones 2015-05-27 21:02:04 UTC
Well, maybe this *doesn't* have anything to do with having lots of
disks.  Adjusting the number of disks down to just one gives me the
same problem.

I suspect this is going to turn out to be either a libguestfs thing
or `udevd --daemon' being further broken somehow.

Comment 2 Richard W.M. Jones 2015-05-27 21:03:55 UTC
Running the second command:

parted -s -- /dev/sda mkpart primary 64s 127s

causes /run/udev/queue to be created, and that file never
gets deleted.  Can that be right?  It seems as if udevadm settle
is expecting the file to be deleted.
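
One way to confirm that the file lingers is to look at how long ago it was
last modified, once no further changes are being made to the disk.  A small
sketch, assuming GNU stat and date are present in the appliance;
file_age_seconds is a hypothetical helper, not a udev tool:

```shell
# file_age_seconds FILE
# Hypothetical helper: prints the number of seconds since FILE was last
# modified, or "absent" (with a non-zero status) if FILE does not exist.
file_age_seconds() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "absent"
        return 1
    fi
    now=$(date +%s)
    # GNU stat: %Y is the modification time in seconds since the epoch.
    mtime=$(stat -c %Y "$file") || return 2
    echo $((now - mtime))
}

# Example invocation (left commented out here):
# file_age_seconds /run/udev/queue
```

An age that keeps growing while the system is otherwise idle would be
consistent with the file never being cleaned up.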

Comment 3 Richard W.M. Jones 2015-05-27 21:14:01 UTC
OK, reading the mailing list now.  I'm going to try out:

commit 86c3bece38bcf55da6387d20c6f01da9ad0284dc (HEAD, origin/master, origin/HEAD, master)
Author: Tom Gundersen <teg@jklm.no>
Date:   Wed May 27 18:39:36 2015 +0200

    udevd: fix SIGCHLD handling in --daemon mode
    
    We were listening for SIGCHLD in the wrong process.

Comment 4 Richard W.M. Jones 2015-05-28 08:00:49 UTC
Yes, commit 86c3bece38bcf55da6387d20c6f01da9ad0284dc fixes
the problem.  I'm going to push an updated systemd package
which includes this extra patch.

Comment 5 Richard W.M. Jones 2015-05-28 08:03:34 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=9860572