Bug 1310871

Summary: Very slow shutdown, it takes 1 minute 30 seconds
Product: [Fedora] Fedora Reporter: Jaroslav Škarvada <jskarvad>
Component: xfce4-power-managerAssignee: Christoph Wickert <christoph.wickert>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 23CC: christoph.wickert, extras-qa, johannbg, jsynacek, lnykryn, msekleta, muadda, nonamedotc, s, systemd-maint, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: xfce4-power-manager-1.5.2-6.fc23 xfce4-power-manager-1.5.2-6.fc24 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1310608 Environment:
Last Closed: 2016-03-17 20:51:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Proposed fix none

Description Jaroslav Škarvada 2016-02-22 20:55:14 UTC
+++ This bug was initially created as a clone of Bug #1310608 +++

Description of problem:
This is not 100% reproducible, but occurs in more than 60% of shutdowns. It seems it cannot shutdown device mapper because it is busy, so it timeouts after 1:30. I suspected akmods, because the CPU is quite busy for the time it waits, but from the log it doesn't seems it's causing the problem.

Version-Release number of selected component (if applicable):
systemd-222-14.fc23.x86_64

How reproducible:
Sometimes (in more than 60% of shutdowns)

Steps to Reproduce:
1. poweroff

Actual results:
Waits and timeouts after 1 minute 30 seconds, then it shutdowns.

Expected results:
Immediate shutdown.

Additional info:
It starts happening after upgrade to F23.

Is there any debug option howto find out what is blocking the shutdown?

Notice the time difference in journal:
úno 22 01:27:31 yarda kernel: thinkpad_acpi: EC reports that Thermal Table has changed
úno 22 01:28:56 yarda systemd[1]: dev-dm\x2d0.device: Job dev-dm\x2d0.device/stop timed out.

--- Additional comment from Jan Synacek on 2016-02-22 12:07:22 CET ---

Also, this seems to be the same problem as in https://github.com/systemd/systemd/issues/2691.

--- Additional comment from Jan Synacek on 2016-02-22 12:09:40 CET ---

(In reply to Jaroslav Škarvada from comment #0)
> Is there any debug option howto find out what is blocking the shutdown?

You can try adding "systemd.log_level=debug" to the kernel command line and attach the output of the boot where the hang happened (use "journalctl -b -1 (minus one) after reboot).

--- Additional comment from Jaroslav Škarvada on 2016-02-22 21:52:10 CET ---

The delay is caused by systemd-coredump process which needs a bit more time on my core i7 than 90 seconds to complete (it's dumping memory and xz compressing it which is really CPU demanding especially if there is a lot of RAM). The problem is that the process is killed after 90 seconds, so nothing is dumped, which is really confusing. I see some serious problems here:

- user should be alerted what's going there on the console, i.e. that's coredump process is running, just showing "waiting for device-mapper service" (or similar) and counting down is totally useless

- it should be logged to journal that coredump process started and it's waiting for its completion, but unfortunately nothing is currently logged

- and finally the coredump process shouldn't be killed after 90 seconds, or it shouldn't use xz compression when dumping, but compress the core later (e.g. during next boot) which may result in much faster completion, otherwise it's useless time waste

By observing the coredump, it seems it's caused by xfce4-power-manager crash (SIGSEGV), backtrace follows:

#0  0x00007f8890ff3b3a in strlen () at /lib64/libc.so.6
#1  0x000055b775fab868 in get_device_description (upower=0x7f8874009270, device=device@entry=0x7f88740394e0)
    at xfpm-power-common.c:267
#2  0x000055b775fa7526 in power_manager_button_update_device_icon_and_details (button=0x55b777b04230, device=0x7f88740394e0)
    at ../panel-plugins/power-manager-plugin/power-manager-button.c:454
#3  0x00007f88922ee7a5 in g_closure_invoke () at /lib64/libgobject-2.0.so.0
#4  0x00007f8892300851 in signal_emit_unlocked_R () at /lib64/libgobject-2.0.so.0
#5  0x00007f8892309530 in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#6  0x00007f88923098ff in g_signal_emit () at /lib64/libgobject-2.0.so.0
#7  0x00007f88922f2cd4 in g_object_dispatch_properties_changed () at /lib64/libgobject-2.0.so.0
#8  0x00007f88922f51c1 in g_object_notify () at /lib64/libgobject-2.0.so.0
#9  0x00007f88922ee7a5 in g_closure_invoke () at /lib64/libgobject-2.0.so.0
#10 0x00007f8892300851 in signal_emit_unlocked_R () at /lib64/libgobject-2.0.so.0
#11 0x00007f8892309530 in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#12 0x00007f88923098ff in g_signal_emit () at /lib64/libgobject-2.0.so.0
#13 0x00007f88922f2cd4 in g_object_dispatch_properties_changed () at /lib64/libgobject-2.0.so.0
#14 0x00007f88922f51c1 in g_object_notify () at /lib64/libgobject-2.0.so.0
#15 0x00007f889254692f in up_device_glue_proxy_g_properties_changed () at /lib64/libupower-glib.so.3
#16 0x00007f888c3dad30 in ffi_call_unix64 () at /lib64/libffi.so.6
#17 0x00007f888c3da79b in ffi_call () at /lib64/libffi.so.6
#18 0x00007f88922eefc9 in g_cclosure_marshal_generic () at /lib64/libgobject-2.0.so.0
#19 0x00007f88922ee7a5 in g_closure_invoke () at /lib64/libgobject-2.0.so.0
#20 0x00007f8892300dfe in signal_emit_unlocked_R () at /lib64/libgobject-2.0.so.0
#21 0x00007f8892309530 in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#22 0x00007f88923098ff in g_signal_emit () at /lib64/libgobject-2.0.so.0
#23 0x00007f889283283d in on_name_owner_changed () at /lib64/libgio-2.0.so.0
#24 0x00007f8892822554 in emit_signal_instance_in_idle_cb () at /lib64/libgio-2.0.so.0
#25 0x00007f8891fefe3a in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#26 0x00007f8891ff01d0 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#27 0x00007f8891ff04f2 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#28 0x00007f8894830255 in gtk_main () at /lib64/libgtk-3.so.0
#29 0x000055b775f934db in xfpm_start (bus=<optimized out>, client_id=0x0, dump=<optimized out>) at xfpm-main.c:240
#30 0x000055b775f93b4f in main (argc=1, argv=0x7ffe2e113e48) at xfpm-main.c:447

I am not sure why it's crashing, I guess there is probably null dereference when parsing some garbage/NULL got from the upower upon it's deinitialization. Well, it's very probably easily fixable xfce4-power-manager bug, so cloning there. But I think the above mentioned systemd issues should be also addressed.

Comment 1 Jaroslav Škarvada 2016-02-22 20:55:44 UTC
xfce4-power-manager-1.5.2-4.fc23.x86_64

Comment 2 Jaroslav Škarvada 2016-02-23 10:24:38 UTC
Created attachment 1129701 [details]
Proposed fix

I think it segfaults because either 'vendor' or 'model' is NULL. I am not sure whether it is allowed, maybe there is also upower bug, but I think the client (xfce4-power-manager) should handle even such strange cases. The attached patch is my naive approach to fix the segfault. Maybe it's better to exit or do some different thing than just injecting empty string ("") instead of NULLs. I am now testing it whether it resolved the slow shutdown problem for me.

Comment 3 Jaroslav Škarvada 2016-02-23 10:40:44 UTC
Upstream bugzilla:
https://bugzilla.xfce.org/show_bug.cgi?id=12465

Comment 4 Mukundan Ragavan 2016-02-25 01:22:33 UTC
Thanks for filing the bug upstream.


Is this happening on a laptop (I see thinkpad_acpi line) or on ac power?

Comment 5 Jaroslav Škarvada 2016-02-25 08:25:03 UTC
(In reply to Mukundan Ragavan from comment #4)
> Thanks for filing the bug upstream.
> 
> 
> Is this happening on a laptop (I see thinkpad_acpi line) or on ac power?

Both.

It seems the patch resolved the problem for me - no more crash in last two days.

Comment 6 Jaroslav Škarvada 2016-02-25 08:58:12 UTC
(In reply to Jaroslav Škarvada from comment #5)
> (In reply to Mukundan Ragavan from comment #4)
> > Thanks for filing the bug upstream.
> > 
> > 
> > Is this happening on a laptop (I see thinkpad_acpi line) or on ac power?
> 
> Both.
> 
IIRC.

I will try to locally modify the patch to find out what event is exactly sent there.

Comment 7 Mukundan Ragavan 2016-02-25 22:38:04 UTC
Jaroslav, your patch has pushed to master branch upstream. I will add that patch and build it.

Comment 8 Jaroslav Škarvada 2016-02-26 08:02:52 UTC
Great, thanks.

Comment 9 Fedora Update System 2016-03-09 04:23:58 UTC
xfce4-power-manager-1.5.2-6.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-712eb3050d

Comment 10 Fedora Update System 2016-03-09 04:24:13 UTC
xfce4-power-manager-1.5.2-6.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-88fbd1a26c

Comment 11 Fedora Update System 2016-03-09 22:55:17 UTC
xfce4-power-manager-1.5.2-6.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-88fbd1a26c

Comment 12 Fedora Update System 2016-03-10 01:56:29 UTC
xfce4-power-manager-1.5.2-6.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-712eb3050d

Comment 13 Fedora Update System 2016-03-17 20:51:18 UTC
xfce4-power-manager-1.5.2-6.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 14 Fedora Update System 2016-03-26 18:17:25 UTC
xfce4-power-manager-1.5.2-6.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.