Bug 1592960 - disappearing partitioned devices causing an array of test failures "Device excluded by a filter"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: David Teigland
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-06-19 16:44 UTC by Corey Marthaler
Modified: 2021-09-03 12:48 UTC
CC List: 8 users

Fixed In Version: lvm2-2.02.179-2.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-30 11:03:47 UTC
Target Upstream Version:
Embargoed:


Attachments
verbose pvcreate attempt (126.03 KB, text/plain), attached 2018-06-19 16:45 UTC by Corey Marthaler


Links
Red Hat Product Errata RHBA-2018:3193 (last updated 2018-10-30 11:04:28 UTC)

Description Corey Marthaler 2018-06-19 16:44:20 UTC
Description of problem:

Initial 7.6 LVM regression tests have been failing due to a variety of apparently random storage failures, yet the same tests run on 7.5 machines with the same backing physical storage pass fine. I've narrowed this down to running pvcreate and pvremove over and over and watching random devices "fail to be found" at random times. With lvmetad running, these random "failures" seem to be masked and the pvcreates appear to work fine on each iteration.
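
The runs below were done by hand; the same stress is easy to script (a sketch, not from the original report; adjust DEVS to the test devices on your machine, and note that in the runs below pvcreate sometimes prints errors yet still succeeds, so watch the output as well as the exit status):

#!/bin/bash
# Hammer pvcreate/pvremove and stop at the first iteration where a
# command fails outright.
DEVS="/dev/sdb1 /dev/sdb2 /dev/sdc1 /dev/sdc2"
for i in $(seq 1 200); do
    pvcreate $DEVS || { echo "pvcreate failed on iteration $i"; exit 1; }
    pvremove $DEVS || { echo "pvremove failed on iteration $i"; exit 1; }
done
echo "no hard failures in 200 iterations"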


[root@host-086 ~]# pvcreate /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdh2 /dev/sdh1 /dev/sda2 /dev/sda1
  Physical volume "/dev/sdb2" successfully created.
  Physical volume "/dev/sdb1" successfully created.
  Physical volume "/dev/sdf2" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdc2" successfully created.
  Physical volume "/dev/sdc1" successfully created.
  Physical volume "/dev/sdh2" successfully created.
  Physical volume "/dev/sdh1" successfully created.
  Physical volume "/dev/sda2" successfully created.
  Physical volume "/dev/sda1" successfully created.
[root@host-086 ~]# pvremove /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdh2 /dev/sdh1 /dev/sda2 /dev/sda1
  Labels on physical volume "/dev/sdb2" successfully wiped.
  Labels on physical volume "/dev/sdb1" successfully wiped.
  Labels on physical volume "/dev/sdf2" successfully wiped.
  Labels on physical volume "/dev/sdf1" successfully wiped.
  Labels on physical volume "/dev/sdc2" successfully wiped.
  Labels on physical volume "/dev/sdc1" successfully wiped.
  Labels on physical volume "/dev/sdh2" successfully wiped.
  Labels on physical volume "/dev/sdh1" successfully wiped.
  Labels on physical volume "/dev/sda2" successfully wiped.
  Labels on physical volume "/dev/sda1" successfully wiped.

[root@host-086 ~]# pvcreate /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdh2 /dev/sdh1 /dev/sda2 /dev/sda1
  Device open /dev/sdc1 8:33 failed errno 2
  Physical volume "/dev/sdb2" successfully created.
  Physical volume "/dev/sdb1" successfully created.
  Physical volume "/dev/sdf2" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdc2" successfully created.
  Physical volume "/dev/sdc1" successfully created.
  Physical volume "/dev/sdh2" successfully created.
  Physical volume "/dev/sdh1" successfully created.
  Physical volume "/dev/sda2" successfully created.
  Physical volume "/dev/sda1" successfully created.
[root@host-086 ~]# pvremove /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdh2 /dev/sdh1 /dev/sda2 /dev/sda1
  Labels on physical volume "/dev/sdb2" successfully wiped.
  Labels on physical volume "/dev/sdb1" successfully wiped.
  Labels on physical volume "/dev/sdf2" successfully wiped.
  Labels on physical volume "/dev/sdf1" successfully wiped.
  Labels on physical volume "/dev/sdc2" successfully wiped.
  Labels on physical volume "/dev/sdc1" successfully wiped.
  Labels on physical volume "/dev/sdh2" successfully wiped.
  Labels on physical volume "/dev/sdh1" successfully wiped.
  Labels on physical volume "/dev/sda2" successfully wiped.
  Labels on physical volume "/dev/sda1" successfully wiped.

[root@host-086 ~]# pvcreate /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdh2 /dev/sdh1 /dev/sda2 /dev/sda1
  Physical volume "/dev/sdb2" successfully created.
  Physical volume "/dev/sdb1" successfully created.
  Physical volume "/dev/sdf2" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdc2" successfully created.
  Physical volume "/dev/sdc1" successfully created.
  Physical volume "/dev/sdh2" successfully created.
  Physical volume "/dev/sdh1" successfully created.
  Physical volume "/dev/sda2" successfully created.
  Physical volume "/dev/sda1" successfully created.
[root@host-086 ~]# pvremove /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdh2 /dev/sdh1 /dev/sda2 /dev/sda1
  Labels on physical volume "/dev/sdb2" successfully wiped.
  Labels on physical volume "/dev/sdb1" successfully wiped.
  Labels on physical volume "/dev/sdf2" successfully wiped.
  Labels on physical volume "/dev/sdf1" successfully wiped.
  Labels on physical volume "/dev/sdc2" successfully wiped.
  Labels on physical volume "/dev/sdc1" successfully wiped.
  Labels on physical volume "/dev/sdh2" successfully wiped.
  Labels on physical volume "/dev/sdh1" successfully wiped.
  Labels on physical volume "/dev/sda2" successfully wiped.
  Labels on physical volume "/dev/sda1" successfully wiped.

[root@host-086 ~]# pvcreate /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdh2 /dev/sdh1 /dev/sda2 /dev/sda1
  Device /dev/sdb2 not found.
  Device open /dev/sdf2 8:82 failed errno 2
  Device open /dev/sdf2 8:82 failed errno 2
  Failed to create a new blkid probe for device /dev/sdf2.
  Device open /dev/sdf1 8:81 failed errno 2
  Physical volume "/dev/sdb1" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdc2" successfully created.
  Physical volume "/dev/sdc1" successfully created.
  Physical volume "/dev/sdh2" successfully created.
  Physical volume "/dev/sdh1" successfully created.
  Physical volume "/dev/sda2" successfully created.
  Physical volume "/dev/sda1" successfully created.


# Another machine, also not running lvmetad
[root@host-092 ~]# pvcreate /dev/sd[abcdefgh][12]
  Device /dev/sdc1 not found.
  Device open /dev/sdd1 8:49 failed errno 2
  Device open /dev/sdd1 8:49 failed errno 2
  Device open /dev/sdd2 8:50 failed errno 2
  Device open /dev/sdd2 8:50 failed errno 2
  WARNING: Scan ignoring device 8:49 with no paths.
  WARNING: Scan ignoring device 8:50 with no paths.
[root@host-092 ~]# pvremove /dev/sd[abcdefgh][12]
  No PV found on device /dev/sdb1.
  No PV found on device /dev/sdb2.
  No PV found on device /dev/sdc1.
  No PV found on device /dev/sdc2.
  No PV found on device /dev/sdd1.
  No PV found on device /dev/sdd2.
  No PV found on device /dev/sde1.
  No PV found on device /dev/sde2.
  No PV found on device /dev/sdf1.
  No PV found on device /dev/sdf2.
  No PV found on device /dev/sdg1.
  No PV found on device /dev/sdg2.
  No PV found on device /dev/sdh1.
  No PV found on device /dev/sdh2.
  Device /dev/sda1 not found.
  Device /dev/sda2 not found.
[root@host-092 ~]# pvcreate /dev/sd[abcdefgh][12]
  Device open /dev/sdc1 8:33 failed errno 2
  Device open /dev/sdc1 8:33 failed errno 2
  Device open /dev/sdc2 8:34 failed errno 2
  Device open /dev/sdc2 8:34 failed errno 2
  Physical volume "/dev/sda1" successfully created.
  Physical volume "/dev/sda2" successfully created.
  Physical volume "/dev/sdb1" successfully created.
  Physical volume "/dev/sdb2" successfully created.
  Physical volume "/dev/sdc1" successfully created.
  Physical volume "/dev/sdc2" successfully created.
  Physical volume "/dev/sdd1" successfully created.
  Physical volume "/dev/sdd2" successfully created.
  Physical volume "/dev/sde1" successfully created.
  Physical volume "/dev/sde2" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdf2" successfully created.
  Physical volume "/dev/sdg1" successfully created.
  Physical volume "/dev/sdg2" successfully created.
  Physical volume "/dev/sdh1" successfully created.
  Physical volume "/dev/sdh2" successfully created.
[root@host-092 ~]# pvremove /dev/sd[abcdefgh][12]
  Labels on physical volume "/dev/sda1" successfully wiped.
  Labels on physical volume "/dev/sda2" successfully wiped.
  Labels on physical volume "/dev/sdb1" successfully wiped.
  Labels on physical volume "/dev/sdb2" successfully wiped.
  Labels on physical volume "/dev/sdc1" successfully wiped.
  Labels on physical volume "/dev/sdc2" successfully wiped.
  Labels on physical volume "/dev/sdd1" successfully wiped.
  Labels on physical volume "/dev/sdd2" successfully wiped.
  Labels on physical volume "/dev/sde1" successfully wiped.
  Labels on physical volume "/dev/sde2" successfully wiped.
  Labels on physical volume "/dev/sdf1" successfully wiped.
  Labels on physical volume "/dev/sdf2" successfully wiped.
  Labels on physical volume "/dev/sdg1" successfully wiped.
  Labels on physical volume "/dev/sdg2" successfully wiped.
  Labels on physical volume "/dev/sdh1" successfully wiped.
  Labels on physical volume "/dev/sdh2" successfully wiped.

[root@host-092 ~]# pvcreate /dev/sd[abcdefgh][12]
  Device /dev/sdb1 excluded by a filter.
  Device /dev/sdc1 not found.
  Device open /dev/sde1 8:65 failed errno 2
  Device open /dev/sde1 8:65 failed errno 2
  Device open /dev/sde2 8:66 failed errno 2
  Device open /dev/sde2 8:66 failed errno 2
  Physical volume "/dev/sda1" successfully created.
  Physical volume "/dev/sda2" successfully created.
  Physical volume "/dev/sdb2" successfully created.
  Physical volume "/dev/sdc2" successfully created.
  Physical volume "/dev/sdd1" successfully created.
  Physical volume "/dev/sdd2" successfully created.
  Physical volume "/dev/sde1" successfully created.
  Physical volume "/dev/sde2" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdf2" successfully created.
  Physical volume "/dev/sdg1" successfully created.
  Physical volume "/dev/sdg2" successfully created.
  Physical volume "/dev/sdh1" successfully created.
  Physical volume "/dev/sdh2" successfully created.
[root@host-092 ~]# pvremove /dev/sd[abcdefgh][12]
  No PV found on device /dev/sdb1.
  No PV found on device /dev/sdc1.
  Device open /dev/sdc2 8:34 failed errno 2
  Device open /dev/sdc2 8:34 failed errno 2
  WARNING: Scan ignoring device 8:34 with no paths.


Version-Release number of selected component (if applicable):
3.10.0-906.el7.x86_64

lvm2-2.02.179-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
lvm2-libs-2.02.179-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
lvm2-cluster-2.02.179-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
lvm2-lockd-2.02.179-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
lvm2-python-boom-0.8.5-6.el7    BUILT: Mon Jun 18 01:16:13 CDT 2018
cmirror-2.02.179-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
device-mapper-1.02.148-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
device-mapper-libs-1.02.148-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
device-mapper-event-1.02.148-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
device-mapper-event-libs-1.02.148-1.el7    BUILT: Mon Jun 18 01:12:41 CDT 2018
device-mapper-persistent-data-0.7.3-3.el7    BUILT: Tue Nov 14 05:07:18 CST 2017


How reproducible:
Often

Comment 2 Corey Marthaler 2018-06-19 16:45:16 UTC
Created attachment 1453005 [details]
verbose pvcreate attempt

Comment 4 David Teigland 2018-06-19 19:38:15 UTC
I've reproduced this on my own test machine.  It only seems to happen when using partitions.

Comment 5 David Teigland 2018-06-20 14:34:30 UTC
The device nodes for the partitions in /dev actually disappear, so it's not just an lvm issue.

In one terminal I run:

# while true; do ls /dev/sdb1 > /dev/null; ls /dev/sdd1 > /dev/null; ls /dev/sde1 > /dev/null; ls /dev/sdf1 > /dev/null; ls /dev/sdg1 > /dev/null; done

In another terminal I run repeated:
# pvcreate /dev/sd[bdefg]1; pvremove /dev/sd[bdefg]1


The first terminal will report a stream of:
ls: cannot access /dev/sdd1: No such file or directory
ls: cannot access /dev/sdd1: No such file or directory
ls: cannot access /dev/sdd1: No such file or directory
ls: cannot access /dev/sde1: No such file or directory
ls: cannot access /dev/sde1: No such file or directory
ls: cannot access /dev/sdg1: No such file or directory
ls: cannot access /dev/sdg1: No such file or directory
ls: cannot access /dev/sdd1: No such file or directory
ls: cannot access /dev/sdf1: No such file or directory
ls: cannot access /dev/sdg1: No such file or directory
ls: cannot access /dev/sdd1: No such file or directory


udev is the only thing I know of that mucks with device nodes, so the suspicion is that udev is doing something wrong or unexpected.

Comment 6 Alasdair Kergon 2018-06-20 16:17:07 UTC
Perhaps run the command under:

  strace -e trace=open,close

and confirm it is not opening /dev/sdb (and others which it should only be reading) with O_RDWR.
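
For example (a sketch; depending on the strace and glibc versions the call may appear as open or openat, so matching both is safest):

# strace -f -e trace=open,openat pvs -a 2>&1 | grep '/dev/sd'

Any device that is only being scanned should show O_RDONLY; an O_RDWR open on such a device points at the problem.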

Comment 7 Alasdair Kergon 2018-06-20 16:25:59 UTC
(I'm guessing here, but if a device is partitioned, udev assumes nothing opens the underlying device for writing unless it is changing the partition table; everything else should write directly to the partitions themselves. If such an open+close does happen, udev may assume the partition table was changed and recreate the partitioned devices. It's not clever enough to look at the 'delta' between old and new and work out what actually changed.)
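
The mechanism behind that guess is udev's inotify "watch" on block devices: closing a device node that was opened with write access generates a synthetic "change" event even if nothing was actually written. A minimal way to see it (my sketch, not from this comment; /dev/sdb is an example):

# terminal 1: watch udev events on the block subsystem
udevadm monitor --udev --subsystem-match=block

# terminal 2: open the whole-disk node read-write, write nothing, close it
# (bash's 1<> redirection opens the target O_RDWR)
true 1<> /dev/sdb

# terminal 1 should now report a "change" event for sdb, after which
# udev reprocesses the disk and its partition nodes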

Comment 8 Alasdair Kergon 2018-06-20 16:43:17 UTC
Probably not running the exact same version here, but on the 2018-06-01-stable branch, with a simple 'pvs' command, I'm seeing lots of lines like:

open("/dev/sda1", O_RDWR|O_DIRECT|O_NOATIME) = 9

which is a regression - it needs to be O_RDONLY like in older versions:

open("/dev/sda1", O_RDONLY|O_DIRECT|O_NOATIME) = 6

Comment 9 Alasdair Kergon 2018-06-20 16:54:47 UTC
Easy test:

Run 'udevadm monitor' alongside.

If you use the 'old' lvm and run 'pvs -a' you see no udev activity.
If you do the same with the 'new' lvm you see lots of udev activity due to opening devices O_RDWR when that wasn't needed.

Because of this udev behaviour, it's important that the code doesn't open devices O_RDWR unless it actually needs to write to them.
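
Concretely, the A/B check looks like this (a sketch of the test described above):

# terminal 1
udevadm monitor --udev

# terminal 2: quiet in terminal 1 on a good build; on the regressed
# build, every device scanned produces a "change" event
pvs -a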

Comment 10 David Teigland 2018-06-20 17:11:27 UTC
Until we can stamp out this udev brokenness, this workaround works for me:

https://sourceware.org/git/?p=lvm2.git;a=commit;h=a30e6222799409ab6e6151683c95eb13f4abaefb

Comment 11 Alasdair Kergon 2018-06-20 17:14:08 UTC
(So this is unrelated to the partition guess in comment #7; rather it's a general problem: lvm is losing track of which devices need opening read-only and which read/write, and opening a device read/write triggers udev operations after the device is closed, which must either be waited for or suppressed before making further use of the same device.)
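
For the "waited for" option, the generic tool is udevadm settle (a sketch; this is the general technique, not necessarily the workaround lvm adopted in comment #10):

# block until udevd's event queue is empty, giving up after 10 seconds
udevadm settle --timeout=10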

Comment 12 David Teigland 2018-06-20 17:22:03 UTC
There is definitely a problem with device nodes for partitions going missing from /dev. This doesn't happen with non-partitioned devices.

Comment 13 Corey Marthaler 2018-06-21 15:56:13 UTC
Quick check seems to show lvm2-2.02.179-2 fixes this issue.

Comment 15 Corey Marthaler 2018-08-13 22:17:39 UTC
Our test scenarios are once again working fine on top of partitioned devices. Marking verified in the latest rpms.

3.10.0-931.el7.x86_64
lvm2-2.02.180-2.el7    BUILT: Wed Aug  1 11:22:48 CDT 2018
lvm2-libs-2.02.180-2.el7    BUILT: Wed Aug  1 11:22:48 CDT 2018
lvm2-cluster-2.02.180-2.el7    BUILT: Wed Aug  1 11:22:48 CDT 2018
device-mapper-1.02.149-2.el7    BUILT: Wed Aug  1 11:22:48 CDT 2018
device-mapper-libs-1.02.149-2.el7    BUILT: Wed Aug  1 11:22:48 CDT 2018
device-mapper-event-1.02.149-2.el7    BUILT: Wed Aug  1 11:22:48 CDT 2018
device-mapper-event-libs-1.02.149-2.el7    BUILT: Wed Aug  1 11:22:48 CDT 2018
device-mapper-persistent-data-0.7.3-3.el7    BUILT: Tue Nov 14 05:07:18 CST 2017




9 disk(s) to be used:
        host-093=/dev/sdg /dev/sdd /dev/sdh /dev/sda /dev/sdi /dev/sdc /dev/sdf /dev/sdb /dev/sde

on host-093...
dicing /dev/sdg into 2... 
dicing /dev/sdd into 2... 
dicing /dev/sdh into 2... 
dicing /dev/sda into 2... 
dicing /dev/sdi into 2... 
dicing /dev/sdc into 2... 
dicing /dev/sdf into 2... 
dicing /dev/sdb into 2... 
dicing /dev/sde into 2... 
re-reading disks on host-093...
Zeroing out the new partitions.../dev/sdg1.../dev/sdg2.../dev/sdd1.../dev/sdd2.../dev/sdh1.../dev/sdh2.../dev/sda1.../dev/sda2.../dev/sdi1.../dev/sdi2.../dev/sdc1.../dev/sdc2.../dev/sdf1.../dev/sdf2.../dev/sdb1.../dev/sdb2.../dev/sde1.../dev/sde2...

creating lvm devices...
host-093: pvcreate /dev/sde2 /dev/sde1 /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdi2 /dev/sdi1
host-093: vgcreate   raid_sanity /dev/sde2 /dev/sde1 /dev/sdb2 /dev/sdb1 /dev/sdf2 /dev/sdf1 /dev/sdc2 /dev/sdc1 /dev/sdi2 /dev/sdi1

============================================================
Iteration 1 of 1 started at Mon Aug 13 17:14:55 CDT 2018
============================================================
SCENARIO (raid1) - [display_raid]
Create a raid and then display it a couple ways
host-093: lvcreate  --nosync --type raid1 -m 1 -n display_raid -L 300M raid_sanity
[...]

Comment 17 errata-xmlrpc 2018-10-30 11:03:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3193

