Bug 1544409
| Summary: | left over devfs entries after force lvremove | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Roman Bednář <rbednar> |
| Component: | systemd | Assignee: | Michal Sekletar <msekleta> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | qe-baseos-daemons |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.5 | CC: | agk, cmarthal, heinzm, jbrassow, kwalker, loberman, msekleta, msnitzer, prajnoha, rbednar, systemd-maint-list, thornber, udev-maint-list, zkabelac |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-02-06 15:48:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | test.log (attachment 1398564) | | |
Description
Roman Bednář 2018-02-12 12:05:07 UTC
Created attachment 1398564: test.log
I hit this again with a different scenario today:
pvcreate --dataalignment 136192k /dev/sd{a..j}
vgcreate --physicalextentsize 34048k vg /dev/sd{a..j}
lvcreate --activate ey --type raid1 -m 1 -L 4G -n corigin vg /dev/sda /dev/sdb
lvcreate --activate ey --type raid1 -m 1 -L 2G -n 34048 vg /dev/sdc /dev/sdd
lvcreate --activate ey --type raid1 -m 1 -L 12M -n 34048_meta vg /dev/sde /dev/sdf
lvconvert --yes --type cache-pool --cachepolicy mq --cachemode writeback -c 64 --poolmetadata vg/34048_meta vg/34048
lvconvert --yes --type cache --cachemetadataformat 2 --cachepool vg/34048 vg/corigin
lvchange --syncaction repair vg/34048_cdata
lvchange --syncaction repair vg/34048_cmeta
lvconvert --splitcache vg/corigin
lvchange --syncaction check vg/corigin
lvremove -f /dev/vg/corigin
vgremove --yes vg
pvremove --yes /dev/sd{a..j}
pvcreate /dev/sd{a..j}
vgcreate vg /dev/sd{a..j} <<<< "already exists in filesystem" error should appear here
Leftover device on the node that this sequence ran from:
# ls -la /dev/cache_sanity/34048
lrwxrwxrwx. 1 root root 8 Feb 20 10:51 /dev/cache_sanity/34048 -> ../dm-11
Attaching the full log of the run as well. Adding the testblocker flag since this is preventing us from getting a reliable pass on the cache regression suite.
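
For illustration only (not part of the original report), a minimal check to confirm the leftover is a stale udev symlink rather than a live mapping, using the names from the listing above; the dm name cache_sanity-34048 is the conventional VG-LV mapping name and is assumed here:

readlink -f /dev/cache_sanity/34048 <<<< resolves to /dev/dm-11
dmsetup table | grep cache_sanity-34048 <<<< no output means device-mapper no longer knows the device
dmsetup info cache_sanity-34048 <<<< an error here confirms only the symlink is left behind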
So far I have not been able to reproduce the scenario in Comment 2 manually. I would expect this to be a bug in an 'older' systemd-udevd leaking the symlink. To confirm this, you can enable "verify_udev_operations" in lvm.conf; in that case lvm2 should 'spot' the missing symlink removal. The 'primary' way is to check what is present in the dm table (dmsetup table): if the device is NOT there while the symlink is still in the /dev directory, it is a udev bug. I also believe this is a case of the older udev version in RHEL, since recent upstream seems to be working fine here.

The dm table did not contain any entry related to the removed LV. Running ~30 iterations of the same scenario with verify_udev_operations enabled did not reproduce the bug. Reassigning to udev.

We don't have a separate udev in RHEL 7. Also, this does not look like an issue that should block the rhel-7.5 RC; moving to 7.6.

Isn't this another version of the raid scrub bug 1549272? Check the log for "Failed to lock logical volume" messages during the scrubbing actions. I don't believe this requires a cluster to reproduce.

Quite frankly I have no idea how to move this forward. I can't reproduce it locally, and without a reproducer it is close to impossible to say what exactly happened and why the symlink wasn't removed. At the very least I need a dump of the udev database (before and after lvremove), the part of the udev debug log from the time of removal of the VG, and the corresponding udevadm monitor output.

According to the last few comments the bug was not observed with the latest LVM builds, so I am closing this as INSUFFICIENT_DATA. In case someone is able to reproduce, please reopen and attach the relevant debug information.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
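
For anyone who can reproduce this, a rough sketch of collecting the debug data requested above (the file names are illustrative, not an agreed procedure; verify_udev_operations is the lvm.conf option mentioned earlier, set to 1 in the activation section of /etc/lvm/lvm.conf before the run):

udevadm info --export-db > udev-db-before.txt
udevadm control --log-priority=debug
udevadm monitor --udev --kernel --property > udevadm-monitor.log 2>&1 &
lvremove -f /dev/vg/corigin
vgremove --yes vg
udevadm info --export-db > udev-db-after.txt
journalctl -u systemd-udevd > systemd-udevd-debug.log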