Bug 1018171 - /dev/md/xxx symlink not removed
Summary: /dev/md/xxx symlink not removed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: udev
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Harald Hoyer
QA Contact: Leos Pol
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-10-11 11:12 UTC by Jan Safranek
Modified: 2015-07-22 07:19 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When udev processed the uevent queue for a device that had already been removed, the internal handling failed. Consequently, some symbolic links for these devices were not removed. Now, udev no longer relies on the existence of a device when dealing with the backlog of the uevent queue, and all symbolic links are removed as expected.
Clone Of:
Environment:
Last Closed: 2015-07-22 07:19:42 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
reproducer. Don't forget to change DISKS variable inside! (305 bytes, application/x-shellscript)
2013-10-11 11:12 UTC, Jan Safranek
no flags Details
reproducer. Don't forget to change DISKS variables inside! (526 bytes, application/x-shellscript)
2013-10-11 11:35 UTC, Jan Safranek
no flags Details
Updated reproducer (882 bytes, text/plain)
2013-10-31 16:15 UTC, Jan Safranek
no flags Details
script output (20.77 KB, text/plain)
2013-10-31 16:23 UTC, Jan Safranek
no flags Details
syslog messages (18.14 KB, text/plain)
2013-10-31 16:24 UTC, Jan Safranek
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:1382 0 normal SHIPPED_LIVE udev bug fix update 2015-07-20 17:58:19 UTC

Description Jan Safranek 2013-10-11 11:12:33 UTC
Created attachment 811002 [details]
reproducer. Don't forget to change DISKS variable inside!

'mdadm -S /dev/md/test' sometimes does not remove the /dev/md/test symlink, while the /dev/md127 device is destroyed correctly.

Version-Release number of selected component (if applicable):
udev-147-2.48.el6.x86_64
kernel-2.6.32-422.el6.x86_64

How reproducible:
sometimes

Steps to Reproduce:
1. run attached test script

Actual results (typically):
Failed after: 2

Expected results:
Endless loop of mdadm -C and mdadm -S

Additional info:
It is reproducible only sometimes. When I clear both RAID members with 'dd', I usually get correct behavior. I have a complex test suite for the OpenLMI project which creates/destroys MD RAIDs, partitions, LVs, various filesystems etc., and after the test suite finishes, I can reproduce the bug reliably. It seems some stray data on the RAID members or in memory makes the bug reproducible.
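
For orientation, a minimal sketch of what such a reproducer loop looks like (this is not the attached script; the DISKS value and mdadm options are placeholders to adapt to your setup):

#!/bin/bash
# Minimal sketch of the reproducer idea, not the attached script.
DISKS="/dev/sda /dev/sdb1"        # placeholder; change to your test devices

i=0
while true; do
    mdadm -C /dev/md/test --run --level=0 --raid-devices=2 $DISKS
    mdadm -S /dev/md/test
    sleep 2                       # give udev time to process the remove events
    if [ -L /dev/md/test ]; then  # -L: the stale symlink is still there
        echo "Failed after: $i"
        exit 1
    fi
    i=$((i+1))
done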

Comment 1 Jan Safranek 2013-10-11 11:13:40 UTC
dmesg output of 'mdadm -C' and 'mdadm -S' at the time of the buggy behavior; IMHO nothing interesting:

md: bind<sda>
md: bind<sdb1>
bio: create slab <bio-1> at 1
md/raid0:md127: md_size is 2287616 sectors.
md: RAID0 configuration for md127 - 2 zones
md: zone0=[sda/sdb1]
      zone-offset=         0KB, device-offset=         0KB, size=    191488KB
md: zone1=[sda]
      zone-offset=    191488KB, device-offset=     95744KB, size=    952320KB

md127: detected capacity change from 0 to 1171259392
 md127: unknown partition table
md127: detected capacity change from 1171259392 to 0
md: md127 stopped.
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sda>
md: export_rdev(sda)

Comment 3 Jan Safranek 2013-10-11 11:35:37 UTC
Created attachment 811006 [details]
reproducer. Don't forget to change DISKS variables inside!

Attached a new reproducer script; now I can reproduce it with a 100% success rate.

Comment 4 RHEL Program Management 2013-10-15 01:34:14 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 5 Harald Hoyer 2013-10-31 12:07:32 UTC
(In reply to Jan Safranek from comment #3)
> Created attachment 811006 [details]
> reproducerDon't forget to change DISKS variables inside!
> 
> Attached new reproducer script, now I can reproduce it with 100% success
> rate.

instead of "sleep 2", you probably should use "udevadm settle"

See if you can reproduce it then.
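
In terms of the reproducer loop, the suggestion amounts to something like this (a sketch, not the attached script):

mdadm -S /dev/md/test
udevadm settle                       # wait for the udev event queue to drain, instead of: sleep 2
[ -L /dev/md/test ] && echo "symlink still there"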

Comment 6 Jan Safranek 2013-10-31 12:48:14 UTC
Result is the same with udevadm settle, fails after several loops.

Comment 7 Harald Hoyer 2013-10-31 14:07:37 UTC
(In reply to Jan Safranek from comment #6)
> Result is the same with udevadm settle, fails after several loops.

And after the script ends/fails, the symlinks are still there and never removed, even after 3 minutes?

maybe add

# udevadm control --log-priority=info

or run in parallel:

# udevadm monitor

and attach the last relevant messages from /var/log/messages and the monitor output
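
For example, the debugging run could look roughly like this (a sketch; the script name and log file paths are placeholders):

# raise the udev log level
udevadm control --log-priority=info

# capture udev events in parallel while the reproducer runs
udevadm monitor > /tmp/udev-monitor.log 2>&1 &
MONITOR_PID=$!
./reproducer.sh
kill $MONITOR_PID

# grab the last relevant messages
tail -n 200 /var/log/messages > /tmp/messages.tail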

Comment 8 Jan Safranek 2013-10-31 16:13:10 UTC
I can't reproduce the bug with 'udevadm control --log-priority=info'. This probably indicates the bug depends on exact timing / a race condition.

Comment 9 Jan Safranek 2013-10-31 16:15:06 UTC
Created attachment 817971 [details]
Updated reproducer

Updated reproducer script with enhanced logging (basically, it runs udevadm monitor in the background and prints timestamps as the script progresses).
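
Conceptually, the enhanced logging amounts to something like the following (an illustrative sketch, not the actual attachment; DISKS is a placeholder as in the earlier sketch):

# run udevadm monitor in the background so its events interleave with the script output
udevadm monitor &
MONITOR_PID=$!
trap 'kill $MONITOR_PID' EXIT

log() {
    # timestamped marker so the monitor output can be correlated later
    echo "$(date +%s.%N) $*"
}

log "mdadm -C"
mdadm -C /dev/md/test --run --level=0 --raid-devices=2 $DISKS
log "mdadm -S"
mdadm -S /dev/md/test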

Comment 10 Jan Safranek 2013-10-31 16:23:59 UTC
Created attachment 817976 [details]
script output

Log with udevadm + script output mixed together. udevadm seems to write its log in batches; I'm not able to judge whether it's due to buffered output or some delay in the kernel/udev.

Anyway, just beware that the udevadm logs are written late; merge them into the appropriate place.

Comment 11 Jan Safranek 2013-10-31 16:24:52 UTC
Created attachment 817979 [details]
syslog messages

syslog from the time the reproducer was running

Comment 12 Jan Safranek 2013-10-31 16:30:56 UTC
umm, syslog shows this error at the time the bug occurs:

Oct 31 16:10:01 rhel6_test udevd-work[1613]: inotify_add_watch(6, /dev/md127, 10) failed: No such file or directory

(=1383232199.000000000 UNIX time)

Comment 13 Harald Hoyer 2013-11-04 16:24:54 UTC
And after the script ends/fails, the symlinks are still there and never removed, even after 3 minutes?

Comment 14 Jan Safranek 2013-11-04 16:43:59 UTC
Yes, after a few minutes of poking around the system, the symlink is still there.

Comment 15 Harald Hoyer 2013-11-04 17:07:33 UTC
I guess stracing udevd and its workers would change the timing as well.

Comment 16 Harald Hoyer 2013-11-04 17:21:59 UTC
(In reply to Jan Safranek from comment #10)
> Log with udevadm + script mixed together. udevadm seems to write log in
> batches, I'm not able to judge if it's due to buffered output or some delay
> in kernel/udev.

redirect stdout to stderr for udevadm monitor.

stdout is buffered, stderr is not
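
In the reproducer, that would be something like (a sketch):

# stderr is unbuffered, so the monitor output is not held back
udevadm monitor 1>&2 &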

Comment 20 Leos Pol 2015-04-15 08:31:10 UTC
# rpm -q udev
udev-147-2.57.el6.x86_64
....
lrwxrwxrwx. 1 root root 8 15. dub 04.21 /dev/md/test -> ../md127
Failed after: 2


# rpm -q udev
udev-147-2.61.el6.x86_64
....
runs forever

Honza, thanks for the reproducer.

Comment 21 errata-xmlrpc 2015-07-22 07:19:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1382.html

