Bug 743518 - 30-60 seconds pause during shutdown sequence
30-60 seconds pause during shutdown sequence
Status: CLOSED DUPLICATE of bug 739836
Product: Fedora
Classification: Fedora
Component: systemd (Show other bugs)
16
x86_64 Linux
unspecified Severity urgent
: ---
: ---
Assigned To: systemd-maint
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2011-10-05 04:41 EDT by Artem S. Tashkinov
Modified: 2011-11-08 13:04 EST (History)
9 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-11-07 07:17:18 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
List of installed RPM packages (38.79 KB, text/plain)
2011-10-05 05:18 EDT, Artem S. Tashkinov
no flags Details
shutdown-log.txt.bz2 (36.21 KB, application/octet-stream)
2011-11-06 03:15 EST, Artem S. Tashkinov
no flags Details

  None (edit)
Description Artem S. Tashkinov 2011-10-05 04:41:34 EDT
Description of problem: upon boot, after I have worked in a text console and tried to reboot or shut down my computer by pressing Ctrl+Alt+Del or invoking `poweroff`, the system starts a shutdown process but it doesn't actually reboot or power offs, instead I see an empty console, I can press characters but nothing happens.

Version-Release number of selected component (if applicable): systemd-36-3.fc16.x86_64


How reproducible: quite often but randomly


Steps to Reproduce:
1. Boot into the system
2. Log in into a text console
3. Try to Ctrl+Alt+Del or `poweroff`
  
Actual results: systemd starts doing something, then the system freezes - all I see is an empty console, I can press characters and they appear on the screen  - however the system stalls eternally in this state. All other consoles are dead at this moment.


Expected results: reboot/poweroff


Additional info:
Comment 1 Michal Schmidt 2011-10-05 05:07:50 EDT
Does "sync && poweroff -f" work reliably?
Comment 2 Artem S. Tashkinov 2011-10-05 05:18:04 EDT
Created attachment 526449 [details]
List of installed RPM packages

I'm attaching a list of installed RPM packages generated using 

rpm -qa --qf '%{name}-%{version}-%{release}.%{arch}.rpm\n' | sort
Comment 3 Artem S. Tashkinov 2011-10-05 05:19:11 EDT
(In reply to comment #1)

Yes.
Comment 4 Michal Schmidt 2011-10-05 05:24:35 EDT
Is it reproducible if you boot with the parameter "single" and try to poweroff from there?
And how about with "emergency"?
Comment 5 Artem S. Tashkinov 2011-10-07 15:39:22 EDT
(In reply to comment #4)

In single mode everything works correctly.
Comment 6 Lennart Poettering 2011-10-10 21:00:24 EDT
Please boot with "systemd.log_level=debug systemd.log_target=kmsg rd_NO_PLYMOUTH plymouth.enable=0" on the kernel cmdline. Then shutdown and paste here (or take a photo if necessary) the last output you see on screen when the machine is stuck.
Comment 7 Artem S. Tashkinov 2011-10-11 07:40:35 EDT
(In reply to comment #6)

Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.1.0-0.rc8.git0.1.fc16.x86_64 root=UUID=dc19289c-125b-42c1-bf58-6c86ffeb3b22 ro quiet rhgb SYSFONT=ter-u20b LANG=en_US.UTF-8 KEYTABLE=ru systemd.log_level=debug systemd.log_target=kmsg rd_NO_PLYMOUTH plymouth.enable=0

No messages on the screen (after Ctrl+Alt+Del from a text console), the screen is black (I see two unrelated ALSA warning messages).

xorg.log:
...
(II) UnloadModule: "evdev"
(II) Unloading evdev
(II) NOUVEAU(0): Closed GPU channel 2
Server terminated successfully (0). Closing log file.

lxdm.log:
...
** (process:1095): CRITICAL **: QUIT BY SIGNAL
** Message: quit code 0
** Message: exit cb
Server terminated successfully (0). Closing log file.
** Message: free session

messages:
Oct 11 17:29:27 localhost kernel: Kernel logging (proc) stopped.
Oct 11 17:29:27 localhost rsyslogd: [origin software="rsyslogd" swVersion="5.8.5" x-pid="807" x-info="http://www.rsyslog.com"] exiting on signal 15.

That's all. After hard reboot (via a button on the case) / and /home are both in the dirty state: "warning: mounting unchecked fs, running e2fsck is recommended"
Comment 8 Artem S. Tashkinov 2011-10-11 07:42:51 EDT
/ is indeed very badly "unmounted" :)

e2fsck -D -C 0 -v -t /dev/sda9
e2fsck 1.41.12 (17-May-2010)
ROOT64 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 655842 has zero dtime.  Fix<y>? yes
Deleted inode 655844 has zero dtime.  Fix<y>? yes
Deleted inode 655846 has zero dtime.  Fix<y>? yes
Deleted inode 655848 has zero dtime.  Fix<y>? yes
Deleted inode 655876 has zero dtime.  Fix<y>? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  -(2661804--2661807) -(2663568--2663572) -(2720703--2720781) -(2765774--2766380)
Fix<y>? yes

Free blocks count wrong for group #81 (3849, counted=3858).
Fix<y>? yes

Free blocks count wrong for group #83 (15515, counted=15594).
Fix<y>? yes

Free blocks count wrong for group #84 (15551, counted=16158).
Fix<y>? yes

Free blocks count wrong (2163597, counted=2164292).
Fix<y>? yes

Inode bitmap differences:  -655842 -655844 -655846 -655848 -655876
Fix<y>? yes

Free inodes count wrong for group #80 (615, counted=620).
Fix<y>? yes

Free inodes count wrong (680368, counted=680373).
Fix<y>? yes

ROOT64: ***** FILE SYSTEM WAS MODIFIED *****
Comment 9 Michal Schmidt 2011-10-12 10:04:55 EDT
(In reply to comment #7)
> No messages on the screen (after Ctrl+Alt+Del from a text console), the screen
> is black (I see two unrelated ALSA warning messages).

Any chance you could arrange for capturing the debug messages using a serial console or netconsole?
Comment 10 Artem S. Tashkinov 2011-10-12 10:21:32 EDT
(In reply to comment #9)

That's gonna be difficult if not impossible. My PC lacks COM ports and my PC is the only computer device in my apartment.

So, let's pursue another avenue - is there a way to make sure that upon hitting Ctrl+Alt+Del in a text console systemd does switch to a virtual console where it dumps all the messages?

I'm asking this question because if I reboot from a graphical session, then I always hit this virtual console where systemd shows its shutdown progress. Ostensibly initiating a shutdown sequence from a text console doesn't always lead to the same result.

Maybe there's a way to force systemd to dump its shutdown messages to the same text console from where I called a shutdown sequence?
Comment 11 Michal Schmidt 2011-10-12 10:35:55 EDT
Try systemd.log_target=console
Comment 12 Fedora Admin XMLRPC Client 2011-10-20 12:31:14 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 13 Artem S. Tashkinov 2011-10-29 12:29:18 EDT
It turns out systemd just hangs (or waits for something) for a whole 30 seconds at some stage of the shutdown process but since I see no meaningful messages on the screen (even in a verbose mode) I cannot say what's happening.

I can upload the entire image of my root filesystem if anyone's interested in debugging this problem in a virtual machine.
Comment 14 Artem S. Tashkinov 2011-11-05 05:01:49 EDT
During this pause I see nothing on the screen, and when systemd finally awakes it shows:

"Sending TERM signal to remaining processes ..."

then the system reboot/shutdowns almost instantly.

however this pause is killing me as it happens almost every time.
Comment 15 Michal Schmidt 2011-11-05 05:39:22 EDT
Please follow the method described in this message to capture logs from the shutdown:
http://lists.fedoraproject.org/pipermail/devel/2011-November/158955.html
Comment 16 Artem S. Tashkinov 2011-11-06 03:15:02 EST
Created attachment 531928 [details]
shutdown-log.txt.bz2

(In reply to comment #15)
Comment 17 Michal Schmidt 2011-11-07 07:17:18 EST
NetworkManager is having trouble quitting:

[   23.597625] systemd[1]: Trying to enqueue job reboot.target/start/replace
...
[   23.598044] systemd[1]: Installed new job NetworkManager.service/stop as 443
...
[   24.138436] systemd[1]: Job network.target/stop finished, result=done
[   24.138509] systemd[1]: NetworkManager.service changed running -> stop-sigterm
...
[   24.147265] NetworkManager[886]: NetworkManager[886]: <warn> disconnected by the system bus.
[   24.147268] NetworkManager[886]: NetworkManager[886]: <info> caught signal 15, shutting down normally.
...
[  114.051313] systemd[1]: NetworkManager.service stopping timed out. Killing.
[  114.051430] systemd[1]: NetworkManager.service changed stop-sigterm -> stop-sigkill
[  114.051444] systemd[1]: Running GC...
[  114.075205] systemd[1]: Received SIGCHLD from PID 886 (NetworkManager).

Apparently NM received the SIGTERM, but it kept running for some reason, until a timeout kicked in and sent a SIGKILL.

This looks like bug 739836.

*** This bug has been marked as a duplicate of bug 739836 ***
Comment 18 Jirka Klimes 2011-11-08 11:51:45 EST
Artem, are you able to reproduce the issue? Does it happen every time or just sometimes? Bug 739836 suggest that F16 GOLD is not affected any more.

Are you able to quit NetworkManager with sending NM TERM signal?
# pkill -TERM NetworkManager

glib2 could possible influence this bug, but I don't manage to provoke it on any system.
Comment 19 Artem S. Tashkinov 2011-11-08 13:04:48 EST
(In reply to comment #18)

I have reinstalled the system and disabled NetworkManager altogether, so I cannot reproduce this issue, because the faulty component is no longer active. (I have a static Internet connection so using NetworkManager makes no sense).

If I ever reproduce it on a different system, I'll reopen this bug report.

Note You need to log in before you can comment on or make changes to this bug.