Description of problem:

#### RHEL 6.5 Snap Merge
[...]
Merge snapshot snapper/merge1 back into the origin
lvconvert --merge snapper/merge1
Activating origin/snap volume(s)
Waiting for the snap merge to complete...

[root@taft-01 ~]# lvs -a -o +devices
  LV       VG      Attr       LSize Origin Data%  Devices
  [merge1] snapper Swi-a-s--- 2.00g origin   1.51 /dev/sdb1(1024)
  merge2   snapper swi-a-s--- 2.00g origin   5.82 /dev/sdb1(1536)
  merge3   snapper swi-a-s--- 2.00g origin   5.82 /dev/sdb1(2048)
  merge4   snapper swi-a-s--- 2.00g origin   5.82 /dev/sdb1(2560)
  merge5   snapper swi-a-s--- 2.00g origin   5.82 /dev/sdb1(3072)
  origin   snapper Owi-a-s--- 4.00g               /dev/sdb1(0)

[root@taft-01 ~]# lvs -a -o +devices
  LV       VG      Attr       LSize Origin Data%  Devices
  [merge1] snapper Swi-a-s--- 2.00g origin   0.42 /dev/sdb1(1024)
  merge2   snapper swi-a-s--- 2.00g origin   6.92 /dev/sdb1(1536)
  merge3   snapper swi-a-s--- 2.00g origin   6.92 /dev/sdb1(2048)
  merge4   snapper swi-a-s--- 2.00g origin   6.92 /dev/sdb1(2560)
  merge5   snapper swi-a-s--- 2.00g origin   6.92 /dev/sdb1(3072)
  origin   snapper Owi-a-s--- 4.00g               /dev/sdb1(0)

[root@taft-01 ~]# lvs -a -o +devices
  LV       VG      Attr       LSize Origin Data%  Devices
  [merge1] snapper Swi-a-s--- 2.00g origin   0.00 /dev/sdb1(1024)
  merge2   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(1536)
  merge3   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(2048)
  merge4   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(2560)
  merge5   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(3072)
  origin   snapper Owi-a-s--- 4.00g               /dev/sdb1(0)

# Here the merged snap finally goes away
[root@taft-01 ~]# lvs -a -o +devices
  LV       VG      Attr       LSize Origin Data%  Devices
  merge2   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(1536)
  merge3   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(2048)
  merge4   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(2560)
  merge5   snapper swi-a-s--- 2.00g origin   7.31 /dev/sdb1(3072)
  origin   snapper owi-a-s--- 4.00g               /dev/sdb1(0)

# RHEL 7.0 Snap Merge
[...]
Deactivating origin/snap volume(s)
  LV       VG      Attr       LSize Origin Data%  Devices
  merge1   snapper swi---s--- 2.00g origin        /dev/sdc1(1024)
  merge2   snapper swi---s--- 2.00g origin        /dev/sdc1(1536)
  merge3   snapper swi---s--- 2.00g origin        /dev/sdc1(2048)
  merge4   snapper swi---s--- 2.00g origin        /dev/sdc1(2560)
  merge5   snapper swi---s--- 2.00g origin        /dev/sdc1(3072)
  origin   snapper owi---s--- 4.00g               /dev/sdc1(0)

Merge snapshot snapper/merge1 back into the origin
lvconvert --merge snapper/merge1
Activating origin/snap volume(s)
Waiting for the snap merge to complete...
P=7.93 P=7.28 P=6.48 P=5.82 P=5.06 P=4.30 P=3.64 P=3.02 P=2.30 P=1.59 P=0.92 P=0.37 P=0.00 P=0.00

*** After an hour [merge1] still exists ***
[root@harding-02 ~]# lvs -a -o +devices
  LV       VG      Attr       LSize Origin Data%  Devices
  [merge1] snapper Swi-a-s--- 2.00g origin   0.00 /dev/sdc1(1024)
  merge2   snapper swi-a-s--- 2.00g origin   8.67 /dev/sdc1(1536)
  merge3   snapper swi-a-s--- 2.00g origin   8.67 /dev/sdc1(2048)
  merge4   snapper swi-a-s--- 2.00g origin   8.67 /dev/sdc1(2560)
  merge5   snapper swi-a-s--- 2.00g origin   8.67 /dev/sdc1(3072)
  origin   snapper Owi-a-s--- 4.00g               /dev/sdc1(0)

Version-Release number of selected component (if applicable):
3.10.0-54.0.1.el7.x86_64
lvm2-2.02.103-6.el7                        BUILT: Wed Nov 27 02:28:25 CST 2013
lvm2-libs-2.02.103-6.el7                   BUILT: Wed Nov 27 02:28:25 CST 2013
lvm2-cluster-2.02.103-6.el7                BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-1.02.82-6.el7                BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-libs-1.02.82-6.el7           BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-event-1.02.82-6.el7          BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-event-libs-1.02.82-6.el7     BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-persistent-data-0.2.8-2.el7  BUILT: Wed Oct 30 10:20:48 CDT 2013
cmirror-2.02.103-6.el7                     BUILT: Wed Nov 27 02:28:25 CST 2013

How reproducible:
Every time
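The manual check in the description (does the hidden `[merge1]` volume still show up in `lvs -a` output?) can be scripted. The following is only a minimal sketch: the function name `merge_pending` is hypothetical, it is not part of lvm2 or of the test suite, and it assumes the `lvs -a` report is piped in on stdin with the LV name in the first column.

```shell
#!/bin/sh
# merge_pending NAME: read `lvs -a` style output on stdin and exit 0 if a
# hidden, still-merging LV "[NAME]" is listed in the first column, else exit 1.
# Hypothetical helper for illustration only.
merge_pending() {
    awk -v want="[$1]" '$1 == want { found = 1 } END { exit !found }'
}

# Example usage (needs LVM and root, so it is left commented out):
#   lvs -a -o +devices snapper | merge_pending merge1 && echo "merge1 still merging"
```

Matching the first field exactly, rather than grepping for the name, avoids treating a visible snapshot such as `merge1` as the hidden `[merge1]`.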
This happens both with and without lvmetad.
This may be caused by systemd killing the whole process group; the poll daemon okozina is working on is the solution.

1. Were you running the commands using qarsh (which is now socket activated systemd service)?
1.1 If so, when you try on command line, does the same happen?

Possible workaround is to include `KillMode=process` in qarshd's service file.
(In reply to Marian Csontos from comment #2)
> 1. Were you running the commands using qarsh (which is now socket activated
> systemd service)?

Yes, I was.

> 1.1 If so, when you try on command line, does the same happen?

No. It appears to pass each time when run w/o qarsh.

> Possible workaround is to include `KillMode=process` in qarshd's service
> file.

You mean in '/usr/lib/systemd/system/qarshd.socket'? If so, where in that file does it go?

[Unit]
Description=qarsh Socket for Per-Connection Servers

[Socket]
ListenStream=5008
Accept=yes

[Install]
WantedBy=sockets.target
Adding 'KillMode=process' to /usr/lib/systemd/system/qarshd@.service fixes the problem I was seeing. Thanks Marian.
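For reference, `KillMode=` is a `[Service]` option, which is why it goes in the per-connection service unit and not in the socket unit. The sketch below is one possible way to apply it without editing the packaged file: it writes a systemd drop-in instead of changing /usr/lib/systemd/system/qarshd@.service directly, which is an alternative to what was actually done here. The function name and the drop-in file name `killmode.conf` are arbitrary choices for illustration.

```shell
#!/bin/sh
# write_killmode_dropin DIR: create DIR/qarshd@.service.d/killmode.conf
# containing KillMode=process. On a real host DIR would be
# /etc/systemd/system; the file name "killmode.conf" is an arbitrary choice.
write_killmode_dropin() {
    dropin_dir="$1/qarshd@.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/killmode.conf" <<'EOF'
[Service]
KillMode=process
EOF
}

# Example usage (as root), followed by a reload so systemd picks it up:
#   write_killmode_dropin /etc/systemd/system && systemctl daemon-reload
```

As noted in the following comments, this only masks the problem for the test suite; it is not a fix.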
Corey, keep in mind it does not fix the problem; it only masks it so the rest of the test suite can be executed.
Let's change the KillMode for this case only, so that other failures unrelated to the polling daemon are still identified and triaged.
Moving to 7.1 as a complete solution to this requires a polldaemon to be run as a service (not forking from the lvm command directly). Also bug #814857 (rawhide). Changing the summary here to better reflect the problem.
As a workaround until we have the polldaemon as a proper service, we can use "systemd-run <any command needed>" to run the command as systemd transient service. For example: "systemd-run lvconvert --merge ..." "systemd-run pvmove ..." etc. The service (named "run-<PID>.service", the name is returned from the systemd-run command on output) will keep running even if logged out. The service is a usual system
(In reply to Peter Rajnoha from comment #10)
> systemd-run command on output) will keep running even if logged out. The
> service is a usual system

...is a usual systemd service so its output is logged and the service can be stopped if needed.
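The `systemd-run` workaround from comment #10 can be wrapped in a small helper. This is only a sketch: the function name `run_detached` and the `DRY_RUN` variable are hypothetical conveniences for illustration, and the only real command used is `systemd-run` itself, invoked with the command line passed through unchanged.

```shell
#!/bin/sh
# run_detached CMD [ARGS...]: start CMD as a transient systemd service so the
# polling it forks is not killed together with the login session.
# With DRY_RUN=1 the command line is only printed, not executed.
# Hypothetical wrapper for illustration only.
run_detached() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'systemd-run'
        printf ' %s' "$@"
        printf '\n'
    else
        systemd-run "$@"
    fi
}

# Example usage (as root):
#   run_detached lvconvert --merge snapper/merge1
#   run_detached pvmove ...
```

`systemd-run` prints the name of the transient unit it creates, which can then be used to inspect the log output or stop the service.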
Fixed upstream in release v2.02.120: https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=0300730cc9ba058df830d9cb0981183b90ad17db

The new daemon is named lvmpolld. It can run as a native systemd service that is started on demand, and it shuts down after a configurable timeout when idle.
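To see whether a given lvm.conf enables the new daemon, the setting can be read out of the file. This is a rough sketch under stated assumptions: it assumes the option is `use_lvmpolld` in the `global` section, as in upstream lvm2, and it uses a naive line-based parse that ignores section nesting. The function name `lvmpolld_setting` is hypothetical.

```shell
#!/bin/sh
# lvmpolld_setting: read lvm.conf text on stdin and print the value of the
# first uncommented "use_lvmpolld = VALUE" line (prints nothing if absent).
# Naive line-based parse for illustration; it does not track config sections.
lvmpolld_setting() {
    awk '
        /^[ \t]*#/ { next }                      # skip comment lines
        /^[ \t]*use_lvmpolld[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")             # drop "use_lvmpolld = "
            sub(/[ \t]+$/, "")                   # drop trailing whitespace
            print
            exit
        }
    '
}

# Example usage:
#   lvmpolld_setting < /etc/lvm/lvm.conf
```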
Marking verified with the latest rpms. The test cases mentioned in comment #0 now pass, along with other "background process" intensive test cases (pvmove, lvconvert, etc...).

3.10.0-313.el7.x86_64
lvm2-2.02.130-2.el7                        BUILT: Tue Sep 15 07:15:40 CDT 2015
lvm2-libs-2.02.130-2.el7                   BUILT: Tue Sep 15 07:15:40 CDT 2015
lvm2-cluster-2.02.130-2.el7                BUILT: Tue Sep 15 07:15:40 CDT 2015
device-mapper-1.02.107-2.el7               BUILT: Tue Sep 15 07:15:40 CDT 2015
device-mapper-libs-1.02.107-2.el7          BUILT: Tue Sep 15 07:15:40 CDT 2015
device-mapper-event-1.02.107-2.el7         BUILT: Tue Sep 15 07:15:40 CDT 2015
device-mapper-event-libs-1.02.107-2.el7    BUILT: Tue Sep 15 07:15:40 CDT 2015
device-mapper-persistent-data-0.5.5-1.el7  BUILT: Thu Aug 13 09:58:10 CDT 2015
cmirror-2.02.130-2.el7                     BUILT: Tue Sep 15 07:15:40 CDT 2015
sanlock-3.2.4-1.el7                        BUILT: Fri Jun 19 12:48:49 CDT 2015
sanlock-lib-3.2.4-1.el7                    BUILT: Fri Jun 19 12:48:49 CDT 2015
lvm2-lockd-2.02.130-2.el7                  BUILT: Tue Sep 15 07:15:40 CDT 2015
Okozina: I did a bit of a rewrite of the release note description for you to look over. I used the description in the lvm.conf file to summarize the command, then did a bit of an edit to what you wrote. Let me know if this is not correct. Steven
Looks neat and trim. Thank you!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2147.html