Bug 1049928 - udisks-daemon takes 100% cpu
Summary: udisks-daemon takes 100% cpu
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: udisks2
Version: 20
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Michael Catanzaro
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-01-08 14:00 UTC by Mattias Bengtsson
Modified: 2015-06-29 14:15 UTC (History)
10 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-06-29 14:15:36 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
tail udisks-daemon.strace (2.28 KB, text/plain)
2014-01-08 14:00 UTC, Mattias Bengtsson
no flags Details
cat udisks-daemon.strace | grep -v "poll" (3.40 KB, text/plain)
2014-01-08 14:01 UTC, Mattias Bengtsson
no flags Details
backtrace that did not stop high CPU usage (13.40 KB, text/plain)
2015-04-11 03:23 UTC, Bill McGonigle
no flags Details

Description Mattias Bengtsson 2014-01-08 14:00:59 UTC
Created attachment 847156 [details]
tail udisks-daemon.strace

Description of problem:

udisks-daemon is currently taking up all CPU time on one of my cores. 

Version-Release number of selected component (if applicable):

udisks2-2.1.1-1.fc20.x86_64
libudisks2-2.1.1-1.fc20.x86_64

How reproducible:

I haven't been able to reproduce it. It happened after coming back from suspend, though.

Additional info:

I ran strace on the process (strace -p pid -o udisks-daemon.strace) and the output quickly grew to 200-300 MB before I stopped it.
Attached is the output from both:
1) tail udisks-daemon.strace
and
2) cat udisks-daemon.strace | grep -v "poll"
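The capture-and-filter workflow above can be sketched in shell. The strace invocation is the one from this report; the sample trace lines below are synthetic, made up only to illustrate the poll filter, not taken from the real attachment:

```shell
# Attach strace to the running daemon (as done in this report):
#   strace -p "$(pgrep -x udisks-daemon)" -o udisks-daemon.strace
#
# Reproduce the second attachment's filter on a tiny synthetic trace.
# These lines are illustrative, not from the actual strace output.
cat > /tmp/udisks-daemon.strace <<'EOF'
poll([{fd=7, events=POLLIN}], 1, 25000) = 1
read(21, "in_sync\n", 4096)             = 8
poll([{fd=7, events=POLLIN}], 1, 25000) = 1
close(21)                               = 0
EOF
# Drop the poll noise, keeping only the interesting syscalls:
grep -v "poll" /tmp/udisks-daemon.strace
```

With a real multi-hundred-megabyte trace, `strace -c -p <pid>` (a per-syscall summary) is often a more compact way to confirm that one syscall dominates.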

Comment 1 Mattias Bengtsson 2014-01-08 14:01:54 UTC
Created attachment 847157 [details]
cat udisks-daemon.strace | grep -v "poll"

Comment 2 Mattias Bengtsson 2014-01-08 14:05:10 UTC
So, except for the few lines of non-poll messages in the second attachment, the strace output was just 200-300 MB of polls, and that was after letting it run for maybe 15-30 seconds.
My guess is that udisks-daemon got stuck in a loop somewhere. :)

Comment 3 Tomáš Bžatek 2014-01-08 16:33:56 UTC
Thanks for your report. Unfortunately the strace excerpts don't show much; I can only see file descriptor IDs with no association to the files behind them.

Can you perhaps grab a stacktrace of the process? Have a look at http://fedoraproject.org/wiki/StackTraces for some gdb instructions. Please make sure to have debuginfo packages installed.
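The attach-and-backtrace step can also be done non-interactively; a minimal sketch (the process name comes from this report, and the daemon will usually not be running on an arbitrary machine, hence the guard):

```shell
# Grab a one-shot backtrace from the running daemon without an
# interactive gdb session. Readable frames require the matching
# debuginfo packages (e.g. debuginfo-install udisks2).
pid="$(pgrep -x udisks-daemon || true)"
if [ -n "$pid" ]; then
    gdb -p "$pid" --batch -ex "thread apply all bt"
else
    echo "udisks-daemon is not running"
fi
```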

Comment 4 Mattias Bengtsson 2014-01-08 16:46:44 UTC
I've rebooted since this happened unfortunately, and I don't know if it will ever happen again. 
But I read up on how to attach to a running process with gdb now and I'll try to see if I can do some debugging at that time.

Comment 5 Mark van Rossum 2014-02-10 23:14:14 UTC
I have seen this too occasionally.

I'm running xfce with KDE. Maybe that matters, as there seems to be a lot of reports somehow relating to KDE.

Comment 6 Tomáš Bžatek 2014-02-11 10:12:29 UTC
(In reply to Mark van Rossum from comment #5)
> I have seen this too occasionally.
> 
> I'm running xfce with KDE. Maybe that matters, as there seems to be a lot of
> reports somehow relating to KDE.

Okay, when that happens again, can you please attach gdb to it and grab a stacktrace? gdb is able to attach to already running processes.

Comment 7 Mark van Rossum 2014-05-09 11:32:26 UTC
Ok, it is occurring again.

How do I grab a stacktrace? 
Do I need the debug version of udisks-daemon?

Comment 8 Michael Vorburger 2014-05-26 18:15:38 UTC
https://bugs.freedesktop.org/show_bug.cgi?id=75906 ?

Comment 9 Mark van Rossum 2014-06-02 10:47:47 UTC
This is what I saw when I connected gdb to it.


Attaching to process 1758
Reading symbols from /usr/libexec/udisks-daemon...Reading symbols from /usr/lib/debug/usr/libexec/udisks-daemon.debug...done.
done.
Reading symbols from /lib64/libdbus-glib-1.so.2...Reading symbols from /usr/lib/debug/usr/lib64/libdbus-glib-1.so.2.2.2.debug...done.
done.
.
.
.
Loaded symbols for /lib64/libpcre.so.1
Reading symbols from /lib64/liblzma.so.5...Reading symbols from /usr/lib/debug/usr/lib64/liblzma.so.5.0.99.debug...done.
done.
Loaded symbols for /lib64/liblzma.so.5
0x0000003a70eea9dd in poll () at ../sysdeps/unix/syscall-template.S:81
81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib64/libgobject-2.0.so.0.3800.2-gdb.py", line 9, in <module>
    from gobject import register
  File "/usr/share/glib-2.0/gdb/gobject.py", line 3, in <module>
    import gdb.backtrace
ImportError: No module named backtrace


Interestingly, after that the CPU use went back to normal.

Comment 10 Vladimir Kangin 2014-09-26 07:03:29 UTC
Same problem on 

# uname -a
Linux localhost.localdomain 3.15.8-200.fc20.x86_64 #1 SMP Fri Aug 1 00:38:50 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ rpm -qa | grep udisks
libudisks2-2.1.2-2.fc20.x86_64
udisks-1.0.4-13.fc20.x86_64
udisks2-2.1.2-2.fc20.x86_64

$ locale
LANG=en_GB.utf8
LC_CTYPE="en_GB.utf8"
LC_NUMERIC="en_GB.utf8"
LC_TIME="en_GB.utf8"
LC_COLLATE="en_GB.utf8"
LC_MONETARY="en_GB.utf8"
LC_MESSAGES="en_GB.utf8"
LC_PAPER="en_GB.utf8"
LC_NAME="en_GB.utf8"
LC_ADDRESS="en_GB.utf8"
LC_TELEPHONE="en_GB.utf8"
LC_MEASUREMENT="en_GB.utf8"
LC_IDENTIFICATION="en_GB.utf8"
LC_ALL=

It possibly happens more often after waking from sleep mode.

Comment 11 Jonas Wielicki 2014-11-08 10:32:42 UTC
Can confirm this problem. 
$ uname -r 
3.16.6-203.fc20.x86_64

# rpm -qa | grep udisks
udisks2-2.1.2-2.fc20.x86_64
udisks-debuginfo-1.0.4-13.fc20.x86_64
libudisks2-2.1.2-2.fc20.x86_64
udisks2-debuginfo-2.1.2-2.fc20.x86_64
udisks-1.0.4-13.fc20.x86_64

(gdb) bt
#0  0x000000320dcea71d in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00000032108495b4 in g_main_context_poll (priority=2147483647, n_fds=7, fds=0x1e5b660, timeout=136859, context=0x1dfda20)
    at gmain.c:4007
#2  g_main_context_iterate (context=0x1dfda20, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3708
#3  0x0000003210849a3a in g_main_loop_run (loop=0x1e4aae0) at gmain.c:3907
#4  0x0000000000406fed in main (argc=1, argv=0x7ffffbfc7578) at main.c:254

I tried to mess with gdb a bit to see whether it enters some buggy idle routine of the main program, but I was unable to do it. I generated a coredump for memory investigation.

This did happen after normal system boot, no suspend/resume involved, no external drives attached. 

systemctl stop udisks2 and systemctl kill udisks2 don’t have any effect (I have not checked with gdb whether a signal arrives).

CPU consumption obviously resumes when continuing or detaching gdb.

Any more information I could provide?

Comment 12 Jonas Wielicki 2014-11-08 10:33:33 UTC
I stand corrected: after the third or fourth attach/detach cycle, the CPU consumption ceased. That may be unrelated, though.

Comment 13 Fedora Admin XMLRPC Client 2015-02-26 17:26:26 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 14 Fedora Admin XMLRPC Client 2015-02-26 22:39:19 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 15 Bill McGonigle 2015-04-07 01:50:38 UTC
udisks2-2.1.3-4.fc21.x86_64

Hrm, I was strace'ing udisks; it had been burning CPU since login (maybe 10 minutes). Then I killed the strace so I could attach gdb, and the high CPU usage stopped immediately, similar to what Jonas noticed.

FWIW, the stack trace is in md handling, and the strace looks like it was constantly re-evaluating all md devices. I recently added and removed disks from those md devices (mdstat is fine); I don't know whether udisks can't deal with that.

Looooong trace here:
  http://www.bfccomputing.com/downloads/fedora/udisks/udisks-trace.xz

Stack:

lookup_type_node_I (utype=139651031554256) at gtype.c:391
391       if (utype > G_TYPE_FUNDAMENTAL_MAX)
Missing separate debuginfos, use: debuginfo-install elfutils-libelf-0.161-2.fc21.x86_64 elfutils-libs-0.161-2.fc21.x86_64
(gdb) bt
#0  0x00007f03080c6e60 in g_type_check_instance_is_fundamentally_a (utype=139651031554256) at gtype.c:391
#1  0x00007f03080c6e60 in g_type_check_instance_is_fundamentally_a (type_instance=type_instance@entry=0x7f030a2692b0 [GParamString], fundamental_type=fundamental_type@entry=76) at gtype.c:3981
#2  0x00007f03080af556 in g_param_value_validate (pspec=pspec@entry=0x7f030a2692b0 [GParamString], value=value@entry=0x7ffe321b8c80) at gparam.c:659
#3  0x00007f03080ab5b6 in g_object_set_valist (nqueue=0x7f02f800e120, value=0x7ffe321b8c60, pspec=0x7f030a2692b0 [GParamString], object=0x7f030a278870 [UDisksLinuxMDRaid]) at gobject.c:1402
#4  0x00007f03080ab5b6 in g_object_set_valist (object=object@entry=0x7f030a278870 [UDisksLinuxMDRaid], first_property_name=first_property_name@entry=0x7f03086a6e8a "uuid", var_args=var_args@entry=0x7ffe321b8d30) at gobject.c:2159
#5  0x00007f03080abe6c in g_object_set (_object=0x7f030a278870, first_property_name=first_property_name@entry=0x7f03086a6e8a "uuid") at gobject.c:2269
#6  0x00007f03086999b5 in udisks_mdraid_set_uuid (object=object@entry=0x7f030a278870, value=value@entry=0x7f030a2a0b80 "4069a649:86bf49c4:bfe78010:bc810f04") at udisks-generated.c:26336
#7  0x00007f03091508e2 in udisks_linux_mdraid_update (mdraid=0x7f030a278870 [UDisksLinuxMDRaid], object=<optimized out>)
    at udiskslinuxmdraid.c:338
#8  0x00007f030914e908 in attr_changed (channel=0x7f030a27d4c0, cond=(G_IO_HUP | G_IO_NVAL | unknown: 840667200), user_data=0x7f02f8007750) at udiskslinuxmdraidobject.c:524
#9  0x00007f0307da47fb in g_main_context_dispatch (context=0x7f030a2109e0) at gmain.c:3111
#10 0x00007f0307da47fb in g_main_context_dispatch (context=context@entry=0x7f030a2109e0) at gmain.c:3710
#11 0x00007f0307da4b98 in g_main_context_iterate (context=0x7f030a2109e0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3781
#12 0x00007f0307da4ec2 in g_main_loop_run (loop=0x7f030a210990) at gmain.c:3975
#13 0x00007f0309132fb8 in main (argc=1, argv=0x7ffe321b92e8) at main.c:173

strace snip:

lstat("/sys", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
lstat("/sys/devices", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/virtual", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/virtual/block", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/virtual/block/md2", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/virtual/block/md2/md", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/virtual/block/md2/md/dev-sda4", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2/ata1", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda4", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
open("/sys/devices/virtual/block/md2/md/dev-sda4/state", O_RDONLY) = 21
fstat(21, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
read(21, "in_sync\n", 4096)             = 8
read(21, "", 4088)                      = 0
close(21)                               = 0
open("/sys/devices/virtual/block/md2/md/dev-sda4/slot", O_RDONLY) = 21
fstat(21, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
read(21, "0\n", 4096)                   = 2
read(21, "", 4094)                      = 0
close(21)                               = 0
open("/sys/devices/virtual/block/md2/md/dev-sda4/errors", O_RDONLY) = 21
fstat(21, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
read(21, "0\n", 4096)                   = 2
read(21, "", 4094)                      = 0
close(21)                               = 0
readlink("/sys/devices/virtual/block/md2/md/dev-sdb4/block", "../../../../../pci0000:00/0000:0"..., 4095) = 84
lstat("/sys", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
lstat("/sys/devices", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/virtual", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat("/sys/devices/virtual/block", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
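The per-member attribute files in that loop can be read directly; a sketch using the device names from this report (md2/dev-sda4 will not exist on most machines, hence the existence check):

```shell
# Read the same per-member md attributes udisks was polling in a loop
# in the strace above. The paths use the device names from this report
# and will usually not exist elsewhere.
base="/sys/devices/virtual/block/md2/md/dev-sda4"
if [ -d "$base" ]; then
    for attr in state slot errors; do
        printf '%s: %s\n' "$attr" "$(cat "$base/$attr")"
    done
else
    echo "no md member at $base on this machine"
fi
```

A healthy member reads back roughly as in the strace: `state: in_sync`, `slot: 0`, `errors: 0`.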

Comment 16 Bill McGonigle 2015-04-11 03:23:20 UTC
Created attachment 1013332 [details]
backtrace that did not stop high CPU usage

Comment 17 Fedora End Of Life 2015-05-29 10:23:18 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue, and we are sorry that we were not
able to fix it before Fedora 20 reached end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version prior to this bug being closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 18 Phil 2015-06-18 09:18:41 UTC
Same problem with Fedora 22.

4.0.4-303.fc22.x86_64
udisks2-2.1.5-1.fc22.x86_64

Comment 19 Fedora End Of Life 2015-06-29 14:15:36 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

