435368 – argument handling is broken for restart events

Bug 435368 - argument handling is broken for restart events

Summary: argument handling is broken for restart events

Keywords:
Status:	CLOSED DEFERRED
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	upstart
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	low
Severity:	low
Target Milestone:	---
Assignee:	Casey Dahlin
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	upstart 434764
TreeView+	depends on / blocked

Reported:	2008-02-28 21:02 UTC by Bill Nottingham
Modified:	2014-06-18 08:46 UTC (History)
CC List:	4 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2008-03-11 02:43:09 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)

Description Bill Nottingham 2008-02-28 21:02:22 UTC

Description of problem:

Take an example event:

...
start on foo

respawn
script
  echo $* > /tmp/foo.$$
  sleep 10
end script
...

Run 'initctl emit --no-wait foo bar baz'. The first run of this event will
correctly echo 'bar baz'. However, subsequent runs don't show the arguments.

Version-Release number of selected component (if applicable):

upstart-0.3.9-5.fc9.x86_64

Comment 1 Casey Dahlin 2008-02-29 16:58:26 UTC

0.5 changes the argument system substantially, in a way that fixes this issue
and others.

Unless we must must have this for F9, we can close it as WONTFIX

Comment 2 Bill Nottingham 2008-02-29 17:05:16 UTC

We need this for handling of serial console *getty entries, as far as I can
tell. The alternative would be writing one separate event for each potential
serial device.

Comment 3 Bill Nottingham 2008-02-29 17:05:33 UTC

(Which, considering the infinite number of serial minors, isn't really practical.)

Comment 4 Bill Nottingham 2008-03-01 02:30:50 UTC

Example, we'd define an event like:
-----
start on fedora.serial-console-available *
 
stop on runlevel [016]

respawn
exec /sbin/agetty /dev/$1 $2 vt100-nav

-----

We have a udev helper that emits (for example) 'fedora.serial-console-available
ttyS0 115200'. To not use arguments (or environment variables - they're
similarly broken), we'd have to write events for every combination of ports and
baud rate. (Or not use upstart. But it seems like the logical way to implement
this.)

Comment 5 Casey Dahlin 2008-03-01 05:02:47 UTC

We'll need Scott to look at this.

Like I said, there is a fix for this in trunk, but trunk is a very different
system right now. 0.5 will have the fix for Fedora 10, at which time this
becomes possible.

Comment 6 Scott James Remnant 2008-03-01 11:00:39 UTC

How did you do this in sysvinit?

Comment 7 Bill Nottingham 2008-03-03 16:51:05 UTC

We had an initscript that edited /etc/inittab. I'd rather not write an
initscripts that wrote upstart jobs on demand, for obvious reasons.

Comment 8 Casey Dahlin 2008-03-06 06:01:10 UTC

Could you make a script wrapper for agetty with something like:

while true; do /sbin/agetty /dev/$1 $2 vt100-nav; done

So that it would simply handle respawning by itself?

Comment 9 Bill Nottingham 2008-03-06 17:51:58 UTC

That's somewhat gross; also keeps a shell running when there's no need.

Comment 10 Casey Dahlin 2008-03-06 18:48:29 UTC

Its not great, but it could get us by in a pinch.

I'll have to see if patching in the argument fix is easy enough.

Comment 11 Scott James Remnant 2008-03-07 18:10:55 UTC

I'm not sure what are the obvious reasons you don't want to do the same with
Upstart?

What works for sysvinit should work for Upstart, no?

Comment 12 Bill Nottingham 2008-03-07 18:38:10 UTC

Editing inittab obviously doesn't work. Moreover, the way that it's done with
inittab editing is that it actively goes and *removes* entries that don't match
the current console configuration at the next boot; doing this with upstart in
this way is rather impracticaly as you're removing event files out from under.

The obvious *right* way to do this is by events triggered off of udev. If
upstart isn't actually offering us a useful way to do this, why are we bothering
with switching?

Comment 13 Casey Dahlin 2008-03-07 19:11:53 UTC

Not sure this will work anyway.

The first serial console will work, but then the next one will just say the job
is already running.

Instance jobs are another feature that's only in trunk atm. Right now its one
job per job file.

Comment 14 Casey Dahlin 2008-03-07 19:14:19 UTC

Scratch what I just said. Instance jobs are actually in 0.3.9

But the job definition is wrong for this reason. You need an "instance" stanza.

Working on a good solution to all of this with Scott right now.

Comment 15 Scott James Remnant 2008-03-07 19:18:52 UTC

In 0.3, respawn was intended to be a magic variant of "start on stopped
$UPSTART_JOB"; we didn't realise that this was the wrong behaviour until later
-- and that you actually want it to be just a reincarnation of the previous job,
started by the same request as before.

You can implement respawn yourself in the job, reemitting your own events to
restart the next instance:


start on fedora.serial-console-available
stop on runlevel [016]

instance
exec /sbin/agetty /dev/$1 $2 vt100-nav
pre-stop script
    touch /tmp/do-not-respawn-$1
end script
post-stop script
    if [ -f /tmp/do-not-respawn-$1 ]; then
        rm -f /tmp/do-not-respawn-$1
    fi

    initctl emit --no-wait fedora.serial-console-available $1 $2
end script


Since this is no longer a service, the event information will be available in
pre-stop and post-stop.  pre-stop only runs if the job stops naturally (ie. by
"stop on" or explicit "stop") so we use this to work out whether to respawn or not.

Because it's not a service, you'll need --no-wait on the emit calls; otherwise
they'll block until the getty actually dies.

Comment 16 Scott James Remnant 2008-03-07 19:19:37 UTC

Correction to the above

post-stop script
    if [ -f /tmp/do-not-respawn-$1 ]; then
        rm -f /tmp/do-not-respawn-$1
    else
        initctl emit --no-wait fedora.serial-console-available $1 $2
    fi
end script

Comment 17 Scott James Remnant 2008-03-07 19:27:23 UTC

(The above is obviously a hack for 0.3's brain-deadness; to prove the issue is
fixed, here would be the same job with trunk:

    start on fedora.serial-console-available
    stop on runlevel [016] or fedora.serial-console-gone $TTY
    env BAUD=38400
    instance $TTY
    respawn
    exec /sbin/agetty /dev/$TTY $BAUD vt100-nav

Obviously the first thing is that the environment actually carries on past a
respawn, that's useful.  It also lets you explicitly do things like "start
agetty TTY=ttyS0" (with $BAUD defaulting in the job).

Comment 18 Bill Nottingham 2008-03-07 22:10:14 UTC

So, not to ask a completely ridiculous question - should we use trunk? *ducks
and runs*

Can you set env variables in pre-stop to use in post-stop, or does all
communication have to be done via the fs?

Comment 19 Scott James Remnant 2008-03-08 01:50:39 UTC

Trunk has no initctl yet, so it would be kinda hard work :-)


You can't pass things from pre-stop to post-stop - can you think of a use case
that isn't caused by an Upstart bug? :)

Comment 20 Bill Nottingham 2008-03-10 17:08:14 UTC

Other than 'state on the filesystem is kind of a hack', no.

Comment 21 Scott James Remnant 2008-03-10 17:14:32 UTC

Sure, but in theory post-stop is only meant to clean up things created in
pre-start.  There are supposed to be better ways to react to a job stopping
(another job catching the stopped event, for example) for further work.

Comment 22 Bill Nottingham 2008-03-10 19:34:01 UTC

Actually, couldn't you just trap on 'UPSTART_EVENT' != 'fedora.serial...' in the
post-stop?

Comment 23 Casey Dahlin 2008-03-10 19:53:23 UTC

(In reply to comment #22)
> Actually, couldn't you just trap on 'UPSTART_EVENT' != 'fedora.serial...' in the
> post-stop?

How do we tell an explicit run of /sbin/stop from a crash? From the perspective
of post-stop they both have $UPSTART_EVENT = ""

Comment 24 Casey Dahlin 2008-03-10 20:12:07 UTC

We could use a state job. Have pre-stop start an empty job and post-stop kill
it. I don't know that that's prettier though.

Comment 25 Bill Nottingham 2008-03-10 20:15:22 UTC

A crash still has UPSTART_EVENT = the starting event, at least in brief testing.

Comment 26 Casey Dahlin 2008-03-10 20:28:11 UTC

My testing differs. I have it blank

Comment 27 Bill Nottingham 2008-03-10 20:39:00 UTC

Here's my event:


start on fedora.serial-console-available *
stop on runlevel [016]

instance
exec /sbin/agetty /dev/$1 $2 vt100-nav
post-stop script
	if [ "$UPSTART_EVENT" != "${UPSTART_EVENT##fedora.serial-console-available}" ];
then
		initctl emit --no-wait fedora.serial-console-available $1 $2
	fi
end script

On a crash (kill -SEGV) it respawns correctly, and on 'initctl stop <event>', it
stops correctly.

Comment 28 Casey Dahlin 2008-03-10 20:47:32 UTC

ahh. The segv is the difference. A regular -TERM doesn't do it.

Comment 29 Bill Nottingham 2008-03-11 02:43:09 UTC

As we do have something that works for what we need here, deferring.

Note You need to log in before you can comment on or make changes to this bug.