Bug 1065696
| Summary: | Clvmd reads metadata during suspend/commit/resume sequence | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Zdenek Kabelac <zkabelac> | ||||
| Component: | lvm2 | Assignee: | LVM and device-mapper development team <lvm-team> | ||||
| lvm2 sub component: | Default / Unclassified | QA Contact: | cluster-qe <cluster-qe> | ||||
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |||||
| Severity: | high | ||||||
| Priority: | high | CC: | agk, cmarthal, heinzm, jbrassow, msnitzer, nperic, prajnoha, prockai, zkabelac | ||||
| Version: | 7.0 | Keywords: | Regression, Triaged | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | lvm2-2.02.105-5.el7 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2014-06-13 12:28:15 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
|
Description
Zdenek Kabelac
2014-02-15 22:30:30 UTC
[Related report from Jon Brassow]
There is some (perhaps) strange suspend/resume actions being taken in lib/metadata/mirror.c:_remove_mirror_images(), but the problem lies in the label scanning code that is occurring in clvmd. The full backtrace is here:
#0 0x00007fa17b4a624d in read () from /lib64/libpthread.so.0
#1 0x00007fa17c17bcf0 in _io (where=0x7fa17c1459f0,
buffer=0x7fa174030000 "rRiM\002", should_write=0) at device/dev-io.c:96
#2 0x00007fa17c17c306 in _aligned_io (where=0x7fa17c145a60,
buffer=0x7fa17c145ac0 "", should_write=0) at device/dev-io.c:226
#3 0x00007fa17c17d881 in dev_read (dev=0x7fa1740197e0, offset=0, len=2048,
buffer=0x7fa17c145ac0) at device/dev-io.c:662
#4 0x00007fa17c197bed in _find_labeller (dev=0x7fa1740197e0,
buf=0x7fa17c146340 "", label_sector=0x7fa17c146338, scan_sector=0)
at label/label.c:113
#5 0x00007fa17c19842a in label_read (dev=0x7fa1740197e0,
result=0x7fa17c146598, scan_sector=0) at label/label.c:282
#6 0x00007fa17c16be21 in _rescan_entry (info=0x7fa1740265c0)
at cache/lvmcache.c:645
#7 0x00007fa17b6bf76e in dm_hash_iter (t=0x7fa17403fe50,
f=0x7fa17c16bdeb <_rescan_entry>) at datastruct/hash.c:224
#8 0x00007fa17c16be3d in _scan_invalid () at cache/lvmcache.c:650
#9 0x00007fa17c16bee1 in lvmcache_label_scan (cmd=0x7fa1740008e0, full_scan=0)
at cache/lvmcache.c:679
#10 0x00007fa17c191777 in _create_vg_text_instance (fid=0x7fa1740467c0,
fic=0x7fa17c146750) at format_text/format-text.c:1916
#11 0x00007fa17c19263b in _text_create_text_instance (fmt=0x7fa17403a760,
fic=0x7fa17c146750) at format_text/format-text.c:2300
#12 0x00007fa17c16c293 in lvmcache_get_vg (cmd=0x7fa1740008e0, vgname=0x0,
vgid=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B", precommitted=0) at cache/lvmcache.c:772
#13 0x00007fa17c1bbb7b in _vg_read (cmd=0x7fa1740008e0, vgname=0x0,
vgid=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B", warnings=1, consistent=0x7fa17c1468fc, precommitted=0)
at metadata/metadata.c:3042
#14 0x00007fa17c1bd110 in _vg_read_by_vgid (cmd=0x7fa1740008e0,
vgid=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B", precommitted=0) at metadata/metadata.c:3520
#15 0x00007fa17c1bd38a in lv_from_lvid (cmd=0x7fa1740008e0,
lvid_s=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B", precommitted=0) at metadata/metadata.c:3579
#16 0x00007fa17c16910b in _lv_suspend (cmd=0x7fa1740008e0,
lvid_s=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B", laopts=0x7fa17c146a80, error_if_not_suspended=0, ondisk_lv=0x0,
incore_lv=0x0) at activate/activate.c:1662
#17 0x00007fa17c169ac8 in lv_suspend_if_active (cmd=0x7fa1740008e0,
lvid_s=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B", origin_only=0, exclusive=0, ondisk_lv=0x0, incore_lv=0x0)
at activate/activate.c:1825
#18 0x00007fa17c1628d5 in do_suspend_lv (
resource=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B", command=156 '\234', lock_flags=69 'E') at lvm-functions.c:453
#19 0x00007fa17c162c02 in do_lock_lv (command=156 '\234', lock_flags=69 'E',
resource=0x7fa17d8648c3 "C5TzHycY0w3AdDaFE6HKDissJFkkCoi0EkLNwaeiTFhrjOllomGGI1w3nY4MNU6B") at lvm-functions.c:543
#20 0x00007fa17c15bec4 in do_command (client=0x7fa17d864a60,
msg=0x7fa17d8648b0, msglen=84, buf=0x7fa17c146d70, buflen=65516,
retlen=0x7fa17c146d6c) at clvmd-command.c:127
#21 0x00007fa17c16084d in process_local_command (msg=0x7fa17d8648b0,
msglen=84, client=0x7fa17d864a60, xid=156) at clvmd.c:1867
#22 0x00007fa17c160fe3 in process_work_item (cmd=0x7fa17d864b70)
at clvmd.c:2105
#23 0x00007fa17c16114b in lvm_thread_fn (arg=0x7fffa82f8910) at clvmd.c:2152
#24 0x00007fa17b49fde3 in start_thread () from /lib64/libpthread.so.0
#25 0x00007fa17b1c60dd in clone () from /lib64/libc.so.6
The problem stems from the fact that label reads are being attempted on device-mapper LVs that are suspended - in this particular case, the suspended mirror log. I only seem to come across this issue in the testsuite. I've tried to reproduce the issue on real devices and via DM PVs outside of the test suite, but have failed. It does, however, trigger everytime with the testsuite.
A 'git bisect' reveals the source of the problem to be the following commit:
1112 git bisect start
1113 git bisect good abc89422af75fa9e20d24285d1366e4631cb8748
1114 git bisect bad
1115 git bisect good
1118 git bisect bad
1120 git bisect bad
1123 git bisect good
1126 git bisect good
1127 git bisect bad
1130 git bisect good
1134 git bisect bad
1135 git bisect good
651d5093edde3e0ebee9d75be1c9834efc152d91 is the first bad commit
commit 651d5093edde3e0ebee9d75be1c9834efc152d91
Author: Petr Rockai <prockai>
Date: Tue Feb 19 02:32:25 2013 +0100
metadata: Avoid pv_read in find_pv_by_name.
last good: 2f5c12e3a88f6bc028e8355c58b6b2c89da301e7
I used gdb to set break points on clvmd to see if it was entering functions like 'lookup_p' and 'device_is_usable', but it never did. I haven't yet determined which path it is taking to read the labels and why it is skipping the important process of weeding out DM devices that are suspended.
One more thing I noticed is that 'ignore_lvm_mirrors' does not seem to be set in the configuration file for the testsuite, but this causes it to default to '1', which should avoid mirrors. This wouldn't be the source of this issue though. The timeout is happening because of the much more basic failure of attempting to read DM devices that are suspended.
Why this is only happening in RHEL7, I don't know.
I have checked in a fix, b391ae88e5ed3720a8b5a87312e0cd907a94fcc2 -- I'd like to point out it's not 651d5093edde3e0ebee9d75be1c9834efc152d9 that broke clvmd, as it doesn't change paths involved in the deadlock. The bisection might have gone wrong in some way (I didn't investigate why or how). However, a deadlock appears in the same testcase without clvmd, and it only appears on *some* systems, and only some of the time. So there might be a timing-dependent or otherwise system-dependent component in it (even though I currently don't see it). It might also be a separate bug. The main problem here is likely (as explained in the comment in the checkin) an interaction of missing devices, inconsistent metadata and cache validity tracking. In theory, the code path should never trip, but obviously something inside lvmcache is invalidated at some point, but it's hard to find that point. I can look in more detail later. The change should be otherwise safe. I think the bug is fixed at wrong level:
#11 0x00007fa17c19263b in _text_create_text_instance (fmt=0x7fa17403a760,
fic=0x7fa17c146750) at format_text/format-text.c:2300
#12 0x00007fa17c16c293 in lvmcache_get_vg (cmd=0x7fa1740008e0, vgname=0x0,
vgid=0x7fa17d8648c3
Here - the metadata must be already in cache and it should return them without any disk access.
If they are not - something is wrong - it could be the pv_read commit just revealed the bug in its full - but clvmd counts with committed & precommitted metadata to be available without any read in the suspend state.
The test which was failing was one of internal dev tests, which I will attach to this BZ as a reference. Created attachment 865907 [details]
Mirror failure tests
A simplified test as suggested by zkabelac: create VG lvcreate -an -Zn --type mirror -m 1 --ignoremonitoring -l1 -n mirror $vg "$dev1" "$dev2" "$dev3" disable dev "$dev1" lvchange --partial -aey $vg/mirror lvconvert -y --repair $vg/mirror vgreduce --removemissing $vg enable dev "$dev1" vgextend $vg "$dev1" disable dev "$dev2" lvconvert -y --repair $vg/mirror vgremove -f $vg Marking verified since the clustered mirror device failure tests passed as well as the test case described above. lvm_cluster_mirror_failure PASS 3.10.0-113.el7.x86_64 lvm2-2.02.105-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 lvm2-libs-2.02.105-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 lvm2-cluster-2.02.105-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 device-mapper-1.02.84-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 device-mapper-libs-1.02.84-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 device-mapper-event-1.02.84-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 device-mapper-event-libs-1.02.84-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 device-mapper-persistent-data-0.2.8-5.el7 BUILT: Fri Feb 28 19:15:56 CST 2014 cmirror-2.02.105-13.el7 BUILT: Wed Mar 19 05:38:19 CDT 2014 This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request. |