Bug 635393 - Timeout too short to let kdump.service finish starting
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Lennart Poettering
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 641227
Depends On:
Blocks:
Reported: 2010-09-19 11:55 UTC by Tomasz Torcz
Modified: 2010-10-09 11:49 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2010-10-08 00:53:14 UTC
Type: ---
Embargoed:



Description Tomasz Torcz 2010-09-19 11:55:44 UTC
Description of problem:
kdump.service is generated from the /etc/init.d/kdump script. During its first start, an initrd image is generated. Unfortunately, this takes so long that systemd kills the job on timeout:

Sep 19 13:42:45 dhartha systemd[1]: kdump.service operation timed out. Terminating.

Because the job is generated from an LSB script, there is no way to change its timeout inside systemd.
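
The failure mode can be illustrated with a minimal sketch. This is not how systemd is implemented; it only mimics a supervisor that gives a start job a fixed time budget and kills it when the budget runs out. The function name `start_with_timeout` and the sleeping stand-in for the slow initrd generation are hypothetical.

```python
import subprocess
import sys


def start_with_timeout(command, timeout_s):
    """Run command; return True if it finished in time, False if it was killed."""
    try:
        subprocess.run(command, timeout=timeout_s, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising TimeoutExpired.
        return False


# Hypothetical stand-in for a slow start script: sleeps for 5 seconds.
slow_job = [sys.executable, "-c", "import time; time.sleep(5)"]

if __name__ == "__main__":
    # A budget shorter than the job's runtime ends in a kill, not a clean start.
    print(start_with_timeout(slow_job, timeout_s=1))
```

With a budget shorter than the job's runtime the job is killed regardless of whether it would have succeeded, which is what happens to mkdumprd here.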


Version-Release number of selected component (if applicable):
kexec-tools-2.0.0-38.fc15.x86_64
systemd-10-3.fc15.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. systemctl start kdump.service
2. use systemd-cgls to observe kdump.service working (mkdumprd running) 
3. notice kdump.service killed by timeout
  
Actual results:
kdump.service operation timed out. Terminating.

Expected results:
kdump.service finishes successfully.

Additional info:

Comment 1 Lennart Poettering 2010-09-20 19:47:04 UTC
Hmm, what would be a suitable timeout here?

Comment 2 Michal Schmidt 2010-09-21 15:14:37 UTC
[CC: Neil Horman]

(In reply to comment #1)
> Hmm, what would be a suitable timeout here?

Comment 3 Tomasz Torcz 2010-09-22 09:07:28 UTC
# SYSTEMCTL_SKIP_REDIRECT=1 time /etc/init.d/kdump start
No kdump initial ramdisk found.                            [OSTRZEŻENIE]
Rebuilding /boot/initrd-2.6.36-0.24.rc5.git0.fc15.x86_64kdump.img
Starting kdump:                                            [  OK  ]
23.23user 59.01system 1:19.45elapsed 103%CPU (0avgtext+0avgdata 106816maxresident)k

So in my case it takes 1 minute 20 seconds to generate the initrd image. This number is highly machine-dependent, and my computer is rather slow by today's standards (1.8 GHz Core 2).
I've looked into converting /etc/init.d/kdump into a systemd job, but the initscript is not simple at all :(
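
For comparing measurements from several machines, the elapsed field of the `time` output above can be converted to seconds. The following is a minimal sketch; the helper name `elapsed_to_seconds` is hypothetical, and it assumes the field has the `[hours:]minutes:seconds` layout seen in the output above.

```python
def elapsed_to_seconds(field):
    """Convert an elapsed field such as '1:19.45elapsed' to seconds."""
    suffix = "elapsed"
    if field.endswith(suffix):
        field = field[: -len(suffix)]
    seconds = 0.0
    # Fold the colon-separated parts from left to right: each step
    # promotes the running total by a factor of 60 (hours -> minutes -> seconds).
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


if __name__ == "__main__":
    # The value reported above; roughly 79.45 seconds.
    print(elapsed_to_seconds("1:19.45elapsed"))
```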

Comment 4 Bill Nottingham 2010-09-22 15:33:29 UTC
Is there a reason this initramfs generation couldn't be done as a %posttrans on packages with the kernel, much like the standard system initramfs is?

Comment 5 Lennart Poettering 2010-10-08 00:53:14 UTC
I have now changed the default timeout for sysv services to 3min (the default for native services stays at 1min, but since it is easy to change the timeout for them this should not be a problem). This is more than twice the amount of time the kdump script required on your machine, and hence I hope we are safe for a while.

Or in other words: should this still turn out to be too short a timeout, I'd be willing to bump it to 5mins or even higher. But let's wait until that happens...

Comment 6 Jon Masters 2010-10-08 09:20:55 UTC
I suggest a 5 minute timeout by default, since it can be configured down if needed. Otherwise we'll have similar problems - there never used to be any timeout for starting init scripts AFAIK.

Comment 7 Lennart Poettering 2010-10-08 11:39:49 UTC
Jon, how long did the script take on your machine?

Comment 8 Lennart Poettering 2010-10-08 14:08:38 UTC
*** Bug 641227 has been marked as a duplicate of this bug. ***

Comment 9 Jon Masters 2010-10-08 20:18:52 UTC
Well, my concern is not one script :) My concern is that the halting problem isn't solvable in general and UNIX systems don't historically put a cap on initscripts (that I'm generally aware of). Therefore, I see the point in having a timeout for sanity (kill a script that just never ends sounds nice), but I think it should default to being much larger than we're likely to see. Those who really care can reduce it down to whatever they like.

Comment 10 Jon Masters 2010-10-08 20:19:30 UTC
And hey, thanks for the fast turnaround and responsiveness. Appreciated. Have a good weekend.

Comment 11 Lennart Poettering 2010-10-09 11:49:37 UTC
I think discussing whether the timeout should be 3 or 5 mins is a bit of bikeshedding. I'll just follow this algorithm here: double the longest service startup time we are aware of and round it up to the next round value. Which is what I did to come to the 3mins. I think there's little point in further discussions on this.
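
The rule described above can be written down as a short sketch. The function name `suggested_timeout` is hypothetical, and "round value" is assumed here to mean a whole number of minutes; the comment does not define it more precisely.

```python
import math


def suggested_timeout(longest_startup_s, step_s=60):
    """Double the longest known startup time, then round up to a multiple of step_s."""
    doubled = 2 * longest_startup_s
    return math.ceil(doubled / step_s) * step_s


if __name__ == "__main__":
    # Longest startup reported in this bug: 1:19.45, i.e. about 79.45 seconds.
    # Doubling gives 158.9 s, which rounds up to 180 s (3 minutes).
    print(suggested_timeout(79.45))
```

Applied to the 1:19.45 measurement from comment 3, this yields 180 seconds, which matches the 3 minute default.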

