Bug 450488 - "/sbin/telinit u" causes init (upstart) to reexecute init, but 'respawn' jobs then fail to re-enter "start" state
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: upstart
Version: 9
Hardware: i386
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Casey Dahlin
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 446018 451504 472925 478427
Depends On:
Blocks:
Reported: 2008-06-09 05:38 UTC by Robert Middleton
Modified: 2014-06-18 08:46 UTC
CC List: 17 users

Fixed In Version: 0.3.9-22.fc10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-02-28 03:21:16 UTC
Type: ---
Embargoed:


Attachments
Patch to allow upstart to preserve job information when re-execing itself (13.13 KB, patch)
2009-01-23 18:48 UTC, Philip Spencer

Description Robert Middleton 2008-06-09 05:38:47 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008051206 Firefox/3.0

Description of problem:
After running "telinit u", UpStart jobs set to respawn that were in the "start" state change to "stop". For example, the jobs tty1 through tty6 will no longer respawn.

This problem is made more serious by the fact that /etc/cron.daily/prelink will sometimes run "telinit u".

As a result, init/UpStart no longer does its basic job of keeping certain processes running.

Version-Release number of selected component (if applicable):
upstart-0.3.9-19.fc9.i386

How reproducible:
Always


Steps to Reproduce:
As root.
1. "initctl list" (to see jobs tty1 -> tty6 in "start" state)
2. "telinit u"
3. "tail /var/log/messages" to see that init has relaunched.
4. "initctl list" (jobs tty1 -> tty6 now in "stop" state)
5. Log in and out of tty1 to end the tty1 process. No login prompt will reappear, as the tty1 job is not restarted.
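
Steps 1 and 4 can be compared mechanically. The following is a minimal Python sketch and is not part of the original report. It assumes each line of saved "initctl list" output has the shape "name (goal) state[, process PID]"; that line shape is an assumption about the 0.3.x format, so adjust the pattern if your output differs. The function names are hypothetical.

```python
import re

# Assumed line shape: "<job> (<goal>) <state>[, process <pid>]".
JOB_LINE = re.compile(r"^(?P<job>\S+) \((?P<goal>\w+)\) (?P<state>\w+)")


def parse_goals(listing):
    """Map each job name to its goal ("start" or "stop") from saved output."""
    goals = {}
    for line in listing.splitlines():
        match = JOB_LINE.match(line.strip())
        if match:
            goals[match.group("job")] = match.group("goal")
    return goals


def lost_jobs(before, after):
    """Return jobs whose goal was "start" before and is no longer "start"."""
    old, new = parse_goals(before), parse_goals(after)
    return sorted(job for job, goal in old.items()
                  if goal == "start" and new.get(job) != "start")
```

To use it, save the output of step 1 and step 4 to two files, read both files into strings, and pass them to lost_jobs(). On an affected system the result would be expected to include the tty jobs.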

Actual Results:
tty1 -> tty6 will not be restarted once they exit.

Expected Results:
UpStart jobs in the "start" state need to remain in the "start" state after "telinit u".
Alternatively tty1 -> tty6 need UpStart "start on" parameters that will see them restarted after init is relaunched.

Additional info:

Comment 1 Casey Dahlin 2008-06-09 06:23:42 UTC
This explains a LOT of other issues we've had. I'm going to go mark some
duplicates, then I'll get with Scott about a fix.

Comment 2 Casey Dahlin 2008-06-09 06:26:01 UTC
*** Bug 446018 has been marked as a duplicate of this bug. ***

Comment 3 Casey Dahlin 2008-08-04 20:37:19 UTC
*** Bug 451504 has been marked as a duplicate of this bug. ***

Comment 4 Alex R 2008-10-13 14:56:15 UTC
It's been a couple of months since anything appeared on this bug report, by the looks of it. How are things progressing on getting it resolved? I have this same problem on my Fedora 9 system.

Comment 5 Casey Dahlin 2008-10-19 18:10:35 UTC
The upstart codebase has been kind of a moving target, so we've not been able to pin down a solution. We'll look at solving this once we get 0.5.0 running smoothly.

Comment 6 Jayson King 2008-12-11 03:23:34 UTC
Same problem in Fedora 10.

Comment 7 Jeremy Faith 2008-12-31 10:43:46 UTC
I can confirm that this problem still exists in Fedora 10.
Also I think bug #472925 is, for the main part, a duplicate of this bug.

Comment 8 Bill Nottingham 2009-01-05 18:29:21 UTC
*** Bug 472925 has been marked as a duplicate of this bug. ***

Comment 9 Casey Dahlin 2009-01-07 10:18:15 UTC
*** Bug 478427 has been marked as a duplicate of this bug. ***

Comment 10 Ryan O'Hara 2009-01-09 03:55:21 UTC
This really needs to be fixed. If I log in at the console (tty), a subsequent logout causes the tty to be lost. This has been described many times elsewhere. That would seem to be a serious problem.

And yes, this problem seems to have first appeared in Fedora 9 and is still occurring in Fedora 10.

Comment 11 Casey Dahlin 2009-01-09 05:17:26 UTC
It's a serious bug while the symptoms persist, but as you might surmise, rebooting resolves the issue. It only gets tripped again during certain updates. Most people never observe it twice.

Comment 12 Alex R 2009-01-09 13:33:19 UTC
Rebooting is not a practical solution. The bug is annoying and I ran into it on a significant number of occasions whilst using FC9. I got tired of waiting for it to be fixed, so I blew FC9 away and installed FreeBSD instead.

Comment 13 Jeremy Faith 2009-01-09 15:08:06 UTC
Perhaps until this issue is fixed the 'telinit u' lines should be removed from /etc/cron.daily/prelink and anywhere else it is called.

Comment 14 Jayson King 2009-01-09 18:36:17 UTC
prelink calls 'telinit u' whenever the init binary or libc.so.6 has changed. What does it need to do that for?

Comment 15 Casey Dahlin 2009-01-09 18:47:51 UTC
To make sure that the latest init binary is always loaded and always linked against the most recent libc.

We've decided to simply disable telinit u until there's a good solution for this. Fix forthcoming.

Comment 16 Jayson King 2009-01-09 19:04:16 UTC
(In reply to comment #15)
> To make sure that the latest init binary is always loaded and always linked
> against the most recent libc.

post-install scripts don't do that?

Comment 17 Casey Dahlin 2009-01-09 19:16:02 UTC
I think they do (by the same mechanism, so it doesn't avoid the bug). I guess it's supposed to protect against home-built libc scenarios?

It wasn't my call at any rate :)

Comment 18 Ryan O'Hara 2009-01-09 20:01:18 UTC
(In reply to comment #15)
> To make sure that the latest init binary is always loaded and always linked
> against the most recent libc.
> 
> We've decided to simply disable telinit u until there's a good solution for
> this. Fix forthcoming.

Hmm. I assume that you're referring to the last line of the /etc/cron.daily/prelink. So without the 'telinit u', prelinking will still be done but init will not be restarted.

The way I understood this (perhaps incorrectly) was that an update could occur that changed libc. Eventually, the daily prelink cron job would detect this and run 'telinit u', which resulted in the ttys being in the stop state. That would explain why it is possible to do a yum update, reboot, and still hit this: the prelink had not yet been done. So the fallout is that whenever libc changes, the ttys are fine until the prelink cron job is run. After that, you should be able to reboot the machine and no longer hit this problem, at least until libc changes again. Please correct me if I am wrong.
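
The timeline above hinges on a periodic "did init or libc change recently?" check. Here is a minimal Python sketch of that kind of check. It is an illustration only: the real /etc/cron.daily/prelink is a shell script and may decide differently, and the function names, the use of the inode change time, and the one-day window are all assumptions.

```python
import os
import time


def changed_since(paths, cutoff):
    """Return the paths whose inode change time is newer than cutoff (epoch seconds)."""
    changed = []
    for path in paths:
        try:
            if os.stat(path).st_ctime > cutoff:
                changed.append(path)
        except OSError:
            pass  # a missing file cannot trigger a re-exec
    return changed


def should_reexec_init(paths, max_age_seconds=24 * 60 * 60, now=None):
    """True if any watched file changed within the last max_age_seconds."""
    now = time.time() if now is None else now
    return bool(changed_since(paths, now - max_age_seconds))
```

A daily job built this way would call something like should_reexec_init(["/sbin/init", "/lib/libc.so.6"]) (example paths) and run 'telinit u' only when it returns True, which matches the observation that the ttys stay fine after an update until the next cron run.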

I looked at the tty[1-6] files in /etc/event.d and nothing jumped out at me. I don't see why the ttys get stopped on 'telinit u'.

Comment 19 Philip Spencer 2009-01-23 18:46:18 UTC
The following patch fixes the problem for us on upstart-0.3.9 by having init pass state information (the goal, state, and associated process ids) to its successor when re-execing itself.

Testing and comments are welcome.
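
To illustrate the idea of handing job state to the successor, here is a minimal Python sketch. It is not the attached patch (which is written in C against upstart-0.3.9) and makes no claim about how the patch encodes or transfers the data. The Job fields mirror the three items named above; the line-based encoding and the function names are assumptions made for illustration, and job names are assumed to contain no spaces.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    goal: str            # "start" or "stop"
    state: str           # e.g. "running" or "waiting"
    pids: list = field(default_factory=list)


def dump_jobs(jobs):
    """Encode jobs as one line each: name, goal, state, comma-separated pids."""
    return "\n".join(
        " ".join([j.name, j.goal, j.state, ",".join(map(str, j.pids)) or "-"])
        for j in jobs
    )


def load_jobs(text):
    """Rebuild the job table in the successor from dump_jobs() output."""
    jobs = []
    for line in text.splitlines():
        name, goal, state, pids = line.split(" ")
        jobs.append(Job(name, goal, state,
                        [] if pids == "-" else [int(p) for p in pids.split(",")]))
    return jobs
```

In this sketch the old process would call dump_jobs() just before re-executing and hand the text over through some channel that survives the exec (an inherited file descriptor, an environment variable, or a temporary file are possibilities). The new process would then call load_jobs() rather than starting with every job stopped.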

Comment 20 Philip Spencer 2009-01-23 18:48:31 UTC
Created attachment 329868 [details]
Patch to allow upstart to preserve job information when re-execing itself

Comment 21 Casey Dahlin 2009-01-23 19:25:18 UTC
Just from a quick run-through, it's a bit stylistically inconsistent with the rest of the codebase.

But I'm happy to try the patch.

Comment 22 Casey Dahlin 2009-01-24 01:15:38 UTC
Just pushed the patched package to Bodhi. The formatting is a bit off, but we can worry about that if upstream takes it. The people following this have waited long enough ;)

Bug will close when the updates pass testing.

Comment 23 Philip Spencer 2009-01-26 16:45:41 UTC
Thanks!

I didn't worry too much about the formatting for this version of the patch because I figured it would need considerable reworking anyway to go upstream (since upstream is several versions ahead and seems to have significant differences in its event/job structures).

Instead I wanted to use a style that made it easy for me to see as much of the logic as possible per screen, since messing with init is something I wanted to be really careful with and I find I personally am more prone to make mistakes when writing in a less compact style.

If it tests well and is working and nobody finds any bugs, I figured that would at least allow Fedora 10 users like us to have a working init and then whoever does the work of adapting it to the upstream version could reformat it as appropriate.

Comment 24 Fedora Update System 2009-01-27 01:54:45 UTC
upstart-0.3.9-22.fc10 has been pushed to the Fedora 10 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with:
su -c 'yum --enablerepo=updates-testing update upstart'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-1035

Comment 25 Fedora Update System 2009-01-27 01:55:38 UTC
upstart-0.3.9-22.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with:
su -c 'yum --enablerepo=updates-testing-newkey update upstart'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2009-1047

Comment 26 Fedora Update System 2009-02-28 03:21:06 UTC
upstart-0.3.9-22.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 27 Fedora Update System 2009-02-28 03:29:04 UTC
upstart-0.3.9-22.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

