Bug 2218239

Summary: lvchange --refresh does not always work
Product: Red Hat Enterprise Linux 8
Component: lvm2
lvm2 sub component: Activating existing Logical Volumes
Version: 8.8
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Reporter: David Teigland <teigland>
Assignee: LVM and device-mapper development team <lvm-team>
QA Contact: cluster-qe <cluster-qe>
CC: agk, heinzm, jbrassow, mgandhi, msnitzer, prajnoha, zkabelac
Last Closed: 2023-07-12 20:57:54 UTC
Type: Bug

Description David Teigland 2023-06-28 14:07:41 UTC
Description of problem:

When a device in a raid1 LV goes missing transiently and then returns, running lvchange --refresh should make the LV start using the device again, but the refresh sometimes fails to do so.


Comment 1 David Teigland 2023-07-11 18:31:10 UTC
lvchange --refresh performs identical steps in the good and bad cases (apart from timestamps, PIDs, and device minor numbers), so I don't see where we could debug lvchange further.


# grep -e suspend -e resume -e load refresh-good
13:10:55.537901 lvchange[408156] device_mapper/libdm-config.c:1086  devices/ignore_suspended_devices not found in config: defaulting to 0
13:10:55.557357 lvchange[408156] mm/memlock.c:629  Entering prioritized section (locking for suspend).
13:10:55.560867 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_1 (253:7) identical table reload.
13:10:55.560905 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_1 (253:5) identical table reload.
13:10:55.560943 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_0 (253:6) identical table reload.
13:10:55.561212 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_0 (253:4) identical table reload.
13:10:55.561364 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1 (253:8) identical table reload.
13:10:55.567059 lvchange[408156] mm/memlock.c:626  Entering critical section (suspending).
13:10:55.567064 lvchange[408156] mm/memlock.c:587  Lock:   Memlock counters: prioritized:1 locked:0 critical:1 daemon:0 suspended:0
13:10:55.574266 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:8) [ noopencount flush skiplockfs ]   [2048] (*1)
13:10:55.600527 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:7) [ noopencount flush skiplockfs ]   [2048] (*1)
13:10:55.604529 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:5) [ noopencount flush skiplockfs ]   [2048] (*1)
13:10:55.608614 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:6) [ noopencount flush skiplockfs ]   [2048] (*1)
13:10:55.612485 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:4) [ noopencount flush skiplockfs ]   [2048] (*1)
13:10:55.617640 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_1 (253:7) identical table reload.
13:10:55.617666 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_1 (253:5) identical table reload.
13:10:55.617687 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_0 (253:6) identical table reload.
13:10:55.617711 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_0 (253:4) identical table reload.
13:10:55.617757 lvchange[408156] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1 (253:8) identical table reload.
13:10:55.618323 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:7) [ noopencount flush ]   [2048] (*1)
13:10:55.618404 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:5) [ noopencount flush ]   [2048] (*1)
13:10:55.618444 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:6) [ noopencount flush ]   [2048] (*1)
13:10:55.618480 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:4) [ noopencount flush ]   [2048] (*1)
13:10:55.618516 lvchange[408156] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:8) [ noopencount flush ]   [2048] (*1)
13:10:55.620391 lvchange[408156] mm/memlock.c:639  Leaving critical section (resumed).
13:10:55.641411 lvchange[408156] mm/memlock.c:641  Leaving section (unlocking on resume).
13:10:55.641447 lvchange[408156] mm/memlock.c:598  Unlock: Memlock counters: prioritized:1 locked:1 critical:0 daemon:0 suspended:0



# grep -e suspend -e resume -e load refresh-bad
14:58:59.201757 lvchange[12976] device_mapper/libdm-config.c:1086  devices/ignore_suspended_devices not found in config: defaulting to 0
14:58:59.212677 lvchange[12976] mm/memlock.c:629  Entering prioritized section (locking for suspend).
14:58:59.213394 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_1 (253:7) identical table reload.
14:58:59.213428 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_1 (253:6) identical table reload.
14:58:59.213455 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_0 (253:5) identical table reload.
14:58:59.213488 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_0 (253:3) identical table reload.
14:58:59.213535 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1 (253:8) identical table reload.
14:58:59.214967 lvchange[12976] mm/memlock.c:626  Entering critical section (suspending).
14:58:59.214973 lvchange[12976] mm/memlock.c:587  Lock:   Memlock counters: prioritized:1 locked:0 critical:1 daemon:0 suspended:0
14:58:59.224115 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:8) [ noopencount flush skiplockfs ]   [2048] (*1)
14:58:59.257117 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:7) [ noopencount flush skiplockfs ]   [2048] (*1)
14:58:59.261187 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:6) [ noopencount flush skiplockfs ]   [2048] (*1)
14:58:59.265027 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:5) [ noopencount flush skiplockfs ]   [2048] (*1)
14:58:59.269075 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm suspend   (253:3) [ noopencount flush skiplockfs ]   [2048] (*1)
14:58:59.274102 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_1 (253:7) identical table reload.
14:58:59.274126 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_1 (253:6) identical table reload.
14:58:59.274151 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rimage_0 (253:5) identical table reload.
14:58:59.274179 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1_rmeta_0 (253:3) identical table reload.
14:58:59.274232 lvchange[12976] device_mapper/libdm-deptree.c:3131  Suppressed vgfo-lvfo1 (253:8) identical table reload.
14:58:59.274303 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:7) [ noopencount flush ]   [2048] (*1)
14:58:59.274373 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:6) [ noopencount flush ]   [2048] (*1)
14:58:59.274424 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:5) [ noopencount flush ]   [2048] (*1)
14:58:59.274477 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:3) [ noopencount flush ]   [2048] (*1)
14:58:59.274526 lvchange[12976] device_mapper/ioctl/libdm-iface.c:1942  dm resume   (253:8) [ noopencount flush ]   [2048] (*1)
14:58:59.277193 lvchange[12976] mm/memlock.c:639  Leaving critical section (resumed).
14:58:59.303646 lvchange[12976] mm/memlock.c:641  Leaving section (unlocking on resume).
14:58:59.303754 lvchange[12976] mm/memlock.c:598  Unlock: Memlock counters: prioritized:1 locked:1 critical:0 daemon:0 suspended:0
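
One way to check that the two excerpts really contain the same steps is to strip the run-specific details and compare what remains. The following is a minimal sketch, not part of lvm2: the function names are made up for illustration, and it assumes every line follows the "timestamp lvchange[pid] file:line  message" layout seen above. Device numbers are renamed in order of first appearance, because the minor numbers differ between the two runs.

```python
import re

# Timestamp, process tag, and source location that prefix each lvm2 debug line.
PREFIX = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+ lvchange\[\d+\] \S+:\d+\s+")
# A device-mapper major:minor pair such as (253:8).
DEVNO = re.compile(r"\(\d+:\d+\)")


def normalize(lines):
    """Return the log messages with run-specific details removed.

    The prefix (timestamp, PID, source location) is dropped, and each
    distinct major:minor pair is replaced by a placeholder numbered in
    order of first appearance.
    """
    placeholders = {}

    def rename(match):
        # Reuse the placeholder for a known device, otherwise assign the next one.
        return placeholders.setdefault(match.group(0), f"(dev{len(placeholders)})")

    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between excerpts
        result.append(DEVNO.sub(rename, PREFIX.sub("", line)))
    return result


def same_steps(good_lines, bad_lines):
    """True when both runs log the same messages in the same order."""
    return normalize(good_lines) == normalize(bad_lines)
```

Passing the lines of the refresh-good and refresh-bad grep output to same_steps() gives a quick yes/no answer. A False result would point to a difference in the order or kind of suspend, reload, and resume operations.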

Comment 2 David Teigland 2023-07-12 20:57:54 UTC
The problematic situation was:
raid1 LV with two disks
detach disk1
raid1 is using only disk2
detach disk2
attach disk1
attach disk2
lvchange --refresh

Detaching both (all) disks from under the raid device is what triggered the errors from the md runtime and xfs.  When both disks were reattached, lvchange --refresh did not reload the table and restore both disks to the raid device (likely because it suppresses identical table reloads).  The ability to force a refresh/reload even when the tables match does sound like a useful option at some point, but probably not for the specific scenario described above.
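
The suspected effect of suppressing identical reloads can be illustrated with a toy model. This is only a sketch of the decision described above, not lvm2 or libdevmapper code: ActiveDevice, refresh(), the force parameter, and the table strings are all hypothetical. It assumes the reload decision compares table contents only and never looks at the runtime state of the raid legs.

```python
from dataclasses import dataclass, field


@dataclass
class ActiveDevice:
    """Hypothetical stand-in for one active device-mapper device."""
    table: str                                      # mapping table currently loaded
    failed_legs: set = field(default_factory=set)   # runtime state, not part of the table
    reloads: int = 0                                # number of table reloads performed


def refresh(dev, wanted_table, force=False):
    """Reload the table only when it differs from the loaded one.

    Returns True when a reload happened. The force parameter models the
    option mentioned in the comment; it is not an existing lvchange flag.
    """
    if dev.table == wanted_table and not force:
        return False  # suppressed: identical table, runtime state untouched
    dev.table = wanted_table
    dev.reloads += 1
    return True


# Both legs failed at runtime, but the wanted table still matches the loaded one.
dev = ActiveDevice(table="raid1: disk1 disk2", failed_legs={"disk1", "disk2"})
reloaded = refresh(dev, "raid1: disk1 disk2")
```

In this model the refresh returns False and leaves failed_legs unchanged, which matches the observed behavior. Whether a forced reload would actually recover the raid device in this scenario is not something the model can answer.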

I don't think we can expect this test case to work seamlessly: it's not a case that has been anticipated or tested before, and the expected behavior is undefined.  A transient outage of all disks in a raid device could perhaps be better masked by configuring dm-multipath over the devices.