Bug 1966636 - Snapshot merge call stuck with LVM DBusD 2.03.12
Status: POST
Alias: None
Product: LVM and device-mapper
Classification: Community
Component: lvm2
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Tony Asleson
QA Contact: cluster-qe
Reported: 2021-06-01 14:58 UTC by Vojtech Trefny
Modified: 2023-08-10 15:41 UTC (History)

Description Vojtech Trefny 2021-06-01 14:58:11 UTC
With 2.03.12 the Merge action never finishes when using the DBus API:

# busctl call com.redhat.lvmdbus1 /com/redhat/lvmdbus1/Lv/5 com.redhat.lvmdbus1.Snapshot Merge "ia{sv}" 1 0
o "/com/redhat/lvmdbus1/Job/1"

The resulting job is stuck at 0% and never completes:

# busctl introspect com.redhat.lvmdbus1 /com/redhat/lvmdbus1/Job/1 com.redhat.lvmdbus1.Job
NAME                    TYPE      SIGNATURE RESULT/VALUE              FLAGS
.Remove                 method    -         -                         -
.Wait                   method    i         b                         -
.Complete               property  b         false                     emits-change writable
.GetError               property  (is)      -1 "Job is not complete!" emits-change
.Percent                property  d         0                         emits-change
.Result                 property  o         "/"                       emits-change
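A client that wants to avoid hanging on such a job could poll the Complete property with a timeout. The following is a minimal sketch, not part of lvmdbusd: the helper names `parse_bool_property` and `wait_for_job` are hypothetical, and it assumes `busctl get-property` prints a boolean as `b true` or `b false`.

```python
import subprocess
import time


def parse_bool_property(busctl_output):
    """Parse `busctl get-property` output for a boolean, e.g. 'b false'."""
    parts = busctl_output.split()
    if len(parts) != 2 or parts[0] != "b" or parts[1] not in ("true", "false"):
        raise ValueError("unexpected busctl output: %r" % busctl_output)
    return parts[1] == "true"


def wait_for_job(job_path, timeout_s=30.0, interval_s=1.0, run=subprocess.run):
    """Poll Job.Complete until it is true; return False if the timeout expires.

    `run` is injectable so the loop can be exercised without a live daemon.
    """
    cmd = ["busctl", "get-property", "com.redhat.lvmdbus1", job_path,
           "com.redhat.lvmdbus1.Job", "Complete"]
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        out = run(cmd, capture_output=True, text=True, check=True).stdout
        if parse_bool_property(out):
            return True
        time.sleep(interval_s)
    return False
```

With the job shown above, `wait_for_job("/com/redhat/lvmdbus1/Job/1")` would be expected to return False once the timeout expires, since Complete stays false.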

Backtrace for the lvconvert command spawned by the DBus daemon (/usr/sbin/lvm lvconvert --merge -i 1 testVG/testLV_bak --config global/notify_dbus=0):

(gdb) bt
#0  0x00007fada65cb4f7 in write () from /lib64/libc.so.6
#1  0x00007fada655b76d in _IO_file_write () from /lib64/libc.so.6
#2  0x00007fada655aae6 in new_do_write () from /lib64/libc.so.6
#3  0x00007fada655be3e in _IO_file_xsputn () from /lib64/libc.so.6
#4  0x00007fada654840f in buffered_vfprintf () from /lib64/libc.so.6
#5  0x000056024ebca6de in _vprint_log ()
#6  0x000056024ebca979 in print_log ()
#7  0x000056024eba454a in device_ids_match ()
#8  0x000056024eba5481 in setup_devices ()
#9  0x000056024ebcd954 in label_scan ()
#10 0x000056024eb8bd4b in lvmcache_label_scan ()
#11 0x000056024eb6e189 in process_each_lv ()
#12 0x000056024eb4584b in lvconvert_merge_cmd ()
#13 0x000056024eb4e767 in lvm_run_command ()
#14 0x000056024eb4ff96 in lvm2_main ()
#15 0x00007fada6501b75 in __libc_start_main () from /lib64/libc.so.6
#16 0x000056024eb2ae3e in _start ()

The same command works when run from a shell, so I assume the problem is somewhere in the daemon, in the code that polls the progress of the lvconvert.

Comment 1 David Teigland 2021-06-02 17:52:12 UTC
I can't get lvmdbusd to run to try this myself.  Could you set lvm.conf log/level=7 and log/file="/tmp/lvm.log" and rerun this so we can see what the commands run by the daemon are doing?

Comment 2 Tony Asleson 2021-06-02 18:24:07 UTC
The lvconvert is run in a separate process, and the daemon reads its stdout, parses it, and updates the state of the daemon in a loop.  While the merge is in progress, another process is collecting the entire state of lvm at the same time.  So we *may* have a locking issue: the process doing the merge holds a lock that the other process needs in order to collect the state of lvm, which would cause a deadlock.

Also, when this error occurs, you can issue a dbus command to lvmdbusd and it will dump its flight recorder to syslog, see:

com.redhat.lvmdbus1.Manager.FlightRecorderDump

https://sourceware.org/git/?p=lvm2.git;a=commit;h=470a1f1c5031590823f8a58e641873da76dc1f46

Comment 3 Zdenek Kabelac 2021-06-02 19:28:16 UTC
This is likely related to Mikulas's snapshot patch, which needs to be reverted in the kernel and fixed with a newer version.

Comment 4 Vojtech Trefny 2021-06-03 15:03:49 UTC
The problem is in lvmdbusd: the code that reads progress from stdout deadlocks when the stderr output exceeds the pipe buffer. I'll prepare a patch for this.
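This matches the backtrace in the description, where lvconvert is blocked in write() while logging. A general way to avoid this class of deadlock is to drain stderr concurrently while reading stdout. The sketch below illustrates that technique only; it is not necessarily how the lvmdbusd patch does it. The name `run_and_track` and the stand-in child command are hypothetical.

```python
import subprocess
import sys
import threading


def run_and_track(cmd, on_stdout_line):
    """Run cmd, pass each stdout line to on_stdout_line, return (exit code, stderr)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True)
    stderr_chunks = []

    def drain_stderr():
        # Keep emptying stderr so the child never blocks in write()
        # on a full pipe while we are busy reading stdout.
        for chunk in proc.stderr:
            stderr_chunks.append(chunk)

    drainer = threading.Thread(target=drain_stderr, daemon=True)
    drainer.start()
    for line in proc.stdout:
        on_stdout_line(line.rstrip("\n"))
    drainer.join()
    return proc.wait(), "".join(stderr_chunks)


# Hypothetical stand-in for a chatty lvm command: it writes 1 MiB to stderr
# (more than a typical pipe buffer) before its single progress line on stdout.
CHILD = [sys.executable, "-c",
         "import sys; sys.stderr.write('x' * (1 << 20)); "
         "sys.stderr.flush(); print('100.00%')"]

lines = []
rc, err = run_and_track(CHILD, lines.append)
```

A reader that only consumed stdout here would wait forever, because the child cannot reach its print() until someone empties the stderr pipe.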

Comment 5 Tony Asleson 2021-06-17 14:30:36 UTC
Correction committed upstream: https://sourceware.org/git/?p=lvm2.git;a=commit;h=c474f174cc8b0e855f984bf211f5416b42c644a1

