Bug 1220529 - Under some conditions dmeventd can hangs on exit and eat CPU
Summary: Under some conditions dmeventd can hangs on exit and eat CPU
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.6
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Alasdair Kergon
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2015-05-11 19:09 UTC by Artem Komisarenko
Modified: 2019-12-16 04:44 UTC (History)
16 users (show)

Fixed In Version: lvm2-2.02.140-1.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-11 01:17:05 UTC
Target Upstream Version:


Attachments (Terms of Use)
Log file from hanging dmevend (5.75 KB, text/x-csrc)
2015-05-11 19:09 UTC, Artem Komisarenko
no flags Details
It seems it is fix issue for me but I'm not sure is it OK (500 bytes, patch)
2015-05-11 19:11 UTC, Artem Komisarenko
no flags Details | Diff


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0964 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2016-05-10 22:57:40 UTC

Description Artem Komisarenko 2015-05-11 19:09:39 UTC
Created attachment 1024314 [details]
Log file from hanging dmevend

Description of problem:

We using some specific configuration on our virtual machines and devices where LVM volumes activated by custom scripts in initramfs (to check partitions availability / formatting new drives etc. - almost on hardware). We have many cloned machines but there is only one where bug reproduced, moreover it is reproducing on every boot (>90%) on it and on cloned from it machines. Maybe it has some FS/LVM metadata defect(?) because power failure(?).

On this machine dmeventd eat 100% CPU (or few tens of % after applying 7c5feaed3be5c26f5782e4e891ba99862f1e968f fix).

After research I found that dmeventd running from initramfs under some conditions can't correctly exit and fall to infinite loop.

I've used current dmeventd.c with enabled logs (introduced by 160777bb3eda1cae32511292070c92e076305ef6) to retrieve additional information (see attached log file).

After digging I found that dmeventd ignores some types of errors in _event_wait() function (see repeating 'dm_task_run failed, errno = 32, Broken pipe' messages in attached log).

It seems it is could be easily fixed (and attached patch fixed it for me) but I'm not sure is it correct or not.

Comment 1 Artem Komisarenko 2015-05-11 19:11:01 UTC
Created attachment 1024315 [details]
It seems it is fix issue for me but I'm not sure is it OK

Comment 3 Zdenek Kabelac 2015-05-27 13:10:33 UTC
We have possibly resolved this issue differenly.

Could you please try to use this patch:

https://www.redhat.com/archives/lvm-devel/2015-May/msg00217.html

(Just ignore the 'magic' around symbols files).

Basically the errno (32) you see is not errno from ioctl.

Comment 4 Alasdair Kergon 2015-05-27 13:27:32 UTC
Can you take a core dump when it is looping?  (Or install the debuginfo package and attach to it with gdb and get backtraces if you're familiar with that)

Comment 5 Artem Komisarenko 2015-06-02 12:15:27 UTC
@Zdenek Kabelac

Thanks. I'll try it in few days.

@Alasdair Kergon

Unfortunately gdb trace is not informative and seems like following because static functions(?) even if debug packages installed:

#0  0x00a8b424 in __kernel_vsyscall ()
#1  0x003a05d1 in ?? ()
#2  0x0804c4cf in ?? ()
#3  0x002d9d26 in ?? ()
#4  0x08049cc1 in ?? ()

So, for good traces package should be rebuild with something like '-O0 -ggdb3'. I've already remove it on this machine but I remember where is issues in code.

In first, please see my patch. It prevents infinitely running tasks if running failed. Without it dmeventd infinitely retrying restart failed task (that failed). Possible it fixed with patch from Zdenek Kabelac answer, I'll check it.

In second, let see following code:

	for (;;) {
		if (_exit_now) {
			_exit_now = 0;
			/*
			 * When '_exit_now' is set, signal has been received,
			 * but can not simply exit unless all
			 * threads are done processing.
			 */
			_lock_mutex();
			nothreads = (dm_list_empty(&_thread_registry) &&
				     dm_list_empty(&_thread_registry_unused));
			_unlock_mutex();
			if (nothreads)
				break;
			syslog(LOG_ERR, "There are still devices being monitored.");
			syslog(LOG_ERR, "Refusing to exit.");
		}
		_process_request(&fifos);
		_cleanup_unused_threads();
	}

It is the loop that eat CPU. If there are working threads while dmeventd receive signal then '_exit_now = 0;' happened and there is no way to exit. Loop take following form:

	for (;;) {
		_process_request(&fifos);
		_cleanup_unused_threads();
	}

Comment 6 Artem Komisarenko 2015-06-02 15:22:19 UTC
I've tried patch. It changes logs to following but not fixed issue:

1433257783 [Thr b76e2b70]: dm_task_run failed, errno = 0, Success

So, now errno is 0 instead of 32.

---

Also I've take gdb trace:

#0  0x00e41424 in __kernel_vsyscall ()
#1  0x005795d1 in __xstat64_conv () from /lib/libc.so.6
#2  0x008d6ba8 in _process_request (fifos=0x0) at dmeventd.c:1610
#3  0x008d75ec in main (argc=1, argv=0xbfe027e4) at dmeventd.c:2248

So, as you see it is 'for (;;)' loop in 'main()' about that I write in comment #5.

Comment 7 Alasdair Kergon 2015-08-01 10:51:55 UTC
So (1) we have a problem where dm_task_run() is returning failure but not logging any information to explain why it is failing.

(2) The code is not exiting.  What threads are still present in that loop?  ("info threads" in gdb)  Is the problem that all the threads exited without setting _exit_now, or is the problem that one (or more) of the threads have not exited?

Comment 8 Alasdair Kergon 2015-08-01 11:18:26 UTC
For (1), we need to audit the code paths within dm_task_run() and add log messages at any place that causes an error but does not currently log anything.  (Or try to intercept the error earlier in gdb and find out what's causing it.)

For (2), if there are no threads left, we need to add code to detect this and exit.  If there is a thread still there which has nothing to do, we need to work out why it hasn't exited and add some logic to ensure it always exits at the appropriate place.

(Before forming a view on the original 'catch all' patch proposed, that might catch other situations where it would be incorrect to stop monitoring, we need to understand the actual conditions involved in the case under examination.)

Comment 9 Artem Komisarenko 2015-08-06 15:28:18 UTC
(2) It is one thread that infinitely trying run one the same task that always immediately failed. Please see attached log. I don't know what is doing this task but his name is 'HDD-HDDU' where /dev/HDD/HDDU is LVM snapshot.

Comment 10 Zdenek Kabelac 2015-10-29 12:34:13 UTC
New dmeventd code has been deployed in lvm2  2.02.133.

I'm convinced it should have fix reported issue - if the issue would be still reproducible with the version 133 - please reopen.

Comment 13 Roman Bednář 2016-03-03 10:10:23 UTC
Verified as SanityOnly. No new issues observed during regression testing.

Comment 14 researchiteng 2016-03-14 22:20:14 UTC
Issue is still there as of:
[root@x ~]# uname -a
Linux x 4.4.4-301.fc23.x86_64 #1 SMP Fri Mar 4 17:42:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I have the system up to date, I've just restarted it.
846 root       2 -18  136284  42880   8708 S 100.0  0.5  41:32.21 dmeventd

When I try to run lvs or vgs commands, it hands for 30+ seconds, at this stage:

close(4)                                = 0
open("/run/lock/lvm/V_lvm2_sto_dok:aux", O_RDWR|O_CREAT|O_APPEND, 0777) = 4
rt_sigaction(SIGINT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGINT, {0x5649444f2560, [], SA_RESTORER, 0x7f17f4d11b20}, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[INT KILL STOP RTMIN RT_1], NULL, 8) = 0
flock(4, LOCK_EX)                       = 0
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f17f4d11b20}, NULL, 8) = 0
stat("/run/lock/lvm/V_lvm2_sto_dok:aux", {st_mode=S_IFREG|0700, st_size=0, ...}) = 0
fstat(4, {st_mode=S_IFREG|0700, st_size=0, ...}) = 0
flock(4, LOCK_EX|LOCK_NB)               = 0
stat("/run/lock/lvm/V_lvm2_sto_dok:aux", {st_mode=S_IFREG|0700, st_size=0, ...}) = 0
fstat(4, {st_mode=S_IFREG|0700, st_size=0, ...}) = 0
unlink("/run/lock/lvm/V_lvm2_sto_dok:aux") = 0
close(4)                                = 0
open("/run/lock/lvm/V_lvm2_sto_dok", O_RDWR|O_CREAT|O_APPEND, 0777) = 4
rt_sigaction(SIGINT, NULL, {SIG_DFL, [], SA_RESTORER, 0x7f17f4d11b20}, 8) = 0
rt_sigaction(SIGINT, {0x5649444f2560, [], SA_RESTORER, 0x7f17f4d11b20}, {SIG_DFL, [], SA_RESTORER, 0x7f17f4d11b20}, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[INT KILL STOP RTMIN RT_1], 8) = 0
flock(4, LOCK_SH

Note: V_lvm2_sto_dok is a volume for docker. I have stopped docker daemon, still 1 thread of CPU remains 100% used by dmeventd.

After the ~30 seconds, it continues with:


rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f17f4d11b20}, NULL, 8) = 0
stat("/run/lock/lvm/V_lvm2_sto_dok", {st_mode=S_IFREG|0700, st_size=0, ...}) = 0
fstat(4, {st_mode=S_IFREG|0700, st_size=0, ...}) = 0
open("/proc/thread-self/attr/fscreate", O_RDWR|O_CLOEXEC) = 5
write(5, NULL, 0)                       = 0
close(5)                                = 0
write(3, "request=\"vg_lookup\"\nuuid =\"5Ovmu"..., 94) = 94
write(3, "\n##\n", 4)                   = 4
read(3, "response=\"OK\"\nname=\"lvm2_sto_dok", 32) = 32
read(3, "\"\nmetadata {\n\tid=\"5OvmuL-SzwU-I2"..., 1024) = 1024
read(3, "\"M59Jia-P3k6-sq9F-J6g8-gD7e-nVtL"..., 1056) = 891
lstat("/sys//dev/block/8:4", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
gettid()                                = 3410
getrandom("Z\252\263\216l\267\367\320\306\360\220\277\5\302\"u", 16, GRND_NONBLOCK) = 16
access("/sys/subsystem", F_OK)          = -1 ENOENT (No such file or directory)
stat("/sys/bus", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
open("/sys/bus", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
fstat(5, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
getdents(5, /* 36 entries */, 32768)    = 1008
open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = 6
fstat(6, {st_mode=S_IFREG|0644, st_size=26254, ...}) = 0
mmap(NULL, 26254, PROT_READ, MAP_SHARED, 6, 0) = 0x7f17f5fa2000
close(6)                                = 0
futex(0x7f17f50998e8, FUTEX_WAKE_PRIVATE, 2147483647) = 0
getdents(5, /* 0 entries */, 32768)     = 0
close(5)                                = 0

Comment 15 Tim Dettrick 2016-03-22 01:43:45 UTC
(In reply to researchiteng from comment #14)
> Issue is still there as of:
> [root@x ~]# uname -a
> Linux x 4.4.4-301.fc23.x86_64 #1 SMP Fri Mar 4 17:42:42 UTC 2016 x86_64
> x86_64 x86_64 GNU/Linux

I am also seeing this behaviour with this exact kernel while using the Docker devicemapper storage driver on a desktop PC with LVM thin pools.

> When I try to run lvs or vgs commands, it hangs for 30+ seconds

I see similar behaviour:

---
$ time lvs
  LV          VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker      vg_main -wi-ao----  40.00g                                                    
  docker-pool vg_main twi-a-tz--  40.00g             97.72  84.94                           
  root        vg_main -wi-ao---- 800.00g                                                    
  swap        vg_main -wi-ao----  16.00g                                                    
lvs  0.01s user 0.01s system 0% cpu 21.267 total
$ time vgs
  VG      #PV #LV #SN Attr   VSize   VFree
  vg_main   1   4   0 wz--n- 899.97g 3.84g
vgs  0.00s user 0.01s system 0% cpu 24.264 total
---

I do occasionally see quicker responses.

I am using the same kernel with a very similar Docker+thinpool configuration on a number of Fedora Atomic VMs, but they do not demonstrate high dmeventd CPU usage or delayed lvs/vgs responses.

---
$ uname -a && time lvs
Linux reviewer-node-01.novalocal 4.4.4-301.fc23.x86_64 #1 SMP Fri Mar 4 17:42:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  LV          VG       Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool atomicos twi-aot--- 118.12g             35.28  34.19                           
  root        atomicos -wi-ao----  10.00g                                                    

real    0m0.048s
user    0m0.015s
sys     0m0.029s
---

Comment 17 errata-xmlrpc 2016-05-11 01:17:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0964.html

Comment 18 Kenjiro Nakayama 2016-10-04 04:41:05 UTC
Hi,

One of RHEL 7 users hit experienced that dmeventd eats CPU a lot (80-90%). The package version is below:

  # rpm -qa | grep lvm2
  lvm2-2.02.130-5.el7_2.5.x86_64
  lvm2-libs-2.02.130-5.el7_2.5.x86_64

Although the errata https://rhn.redhat.com/errata/RHBA-2016-0964.html has been released only for RHEL6, the issue reported on this ticket doesn't effect on RHEL7?
Thanks,

Comment 19 Peter Rajnoha 2016-10-04 08:44:38 UTC
(In reply to Kenjiro Nakayama from comment #18)
> Hi,
> 
> One of RHEL 7 users hit experienced that dmeventd eats CPU a lot (80-90%).
> The package version is below:
> 
>   # rpm -qa | grep lvm2
>   lvm2-2.02.130-5.el7_2.5.x86_64
>   lvm2-libs-2.02.130-5.el7_2.5.x86_64
> 
> Although the errata https://rhn.redhat.com/errata/RHBA-2016-0964.html has
> been released only for RHEL6, the issue reported on this ticket doesn't
> effect on RHEL7?
> Thanks,

We haven't backported any patches in 7.2 z-stream for dmeventd. There have been many fixes for dmeventd in later version (lvm2 v2.02.133/device-mapper v1.02.110 and higher). There's upcoming 7.3 RHEL release which will also include the new lvm2/device-mapper versions.

Comment 20 Kenjiro Nakayama 2016-10-04 08:58:12 UTC
My question might be bad. I wanted to confirm if this issue exists on RHEL 7.2 or not. So, since the fix is included in lvm2-2.02.143 or later, the package lvm2-2.02.130-5.el7_2.5.x86_64 has the issue and we have to wait for the fix in RHEL 7.3?

Comment 21 Peter Rajnoha 2016-10-04 09:17:23 UTC
(In reply to Kenjiro Nakayama from comment #20)
> My question might be bad. I wanted to confirm if this issue exists on RHEL
> 7.2 or not. So, since the fix is included in lvm2-2.02.143 or later, the
> package lvm2-2.02.130-5.el7_2.5.x86_64 has the issue and we have to wait for
> the fix in RHEL 7.3?

THe 7.2 has the older version of dmeventd where such problems might exist and because of high number of changes in this code, it's not feasible to backport these changes to 7.2.z as they might be problematic (...we would need to do full rebase in 7.2.z instead which is normally not done for z-stream versions). And because 7.3 is near and upcoming, we recommend to updating to the nevest lvm2 and device-mapper versions that is going to be there once 7.3 is released.


Note You need to log in before you can comment on or make changes to this bug.