Description of problem:

Take an example event:

...
start on foo
respawn
script
    echo $* > /tmp/foo.$$
    sleep 10
end script
...

Run 'initctl emit --no-wait foo bar baz'. The first run of this event will correctly echo 'bar baz'. However, subsequent runs don't show the arguments.

Version-Release number of selected component (if applicable):
upstart-0.3.9-5.fc9.x86_64
0.5 changes the argument system substantially, in a way that fixes this issue and others. Unless we must have this for F9, we can close it as WONTFIX.
We need this for handling of serial console *getty entries, as far as I can tell. The alternative would be writing one separate event for each potential serial device.
(Which, considering the infinite number of serial minors, isn't really practical.)
Example, we'd define an event like:

-----
start on fedora.serial-console-available *
stop on runlevel [016]
respawn
exec /sbin/agetty /dev/$1 $2 vt100-nav
-----

We have a udev helper that emits (for example) 'fedora.serial-console-available ttyS0 115200'. To not use arguments (or environment variables - they're similarly broken), we'd have to write events for every combination of ports and baud rate. (Or not use upstart. But it seems like the logical way to implement this.)
We'll need Scott to look at this. Like I said, there is a fix for this in trunk, but trunk is a very different system right now. 0.5 will have the fix for Fedora 10, at which time this becomes possible.
How did you do this in sysvinit?
We had an initscript that edited /etc/inittab. I'd rather not write an initscript that wrote upstart jobs on demand, for obvious reasons.
Could you make a script wrapper for agetty with something like:

    while true; do /sbin/agetty /dev/$1 $2 vt100-nav; done

So that it would simply handle respawning by itself?
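For illustration, the wrapper idea above could be sketched as a small shell function. This is only a sketch: the name `respawn` and its run-limit argument are hypothetical and are there so the loop can be exercised without a real getty. A limit of 0 loops forever, like the `while true` form.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly in the foreground.
# $1 is a run limit (0 means loop forever); the rest is the command.
respawn() {
    limit=$1
    shift
    runs=0
    while [ "$limit" -eq 0 ] || [ "$runs" -lt "$limit" ]; do
        # Restart regardless of how the command exited.
        "$@" || true
        runs=$((runs + 1))
    done
}

# Intended use inside the wrapper script (not executed here):
#   respawn 0 /sbin/agetty "/dev/$1" "$2" vt100-nav
```

As noted in the next comment, this keeps a shell process alive for every getty, which is the main cost of the approach.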
That's somewhat gross; also keeps a shell running when there's no need.
It's not great, but it could get us by in a pinch. I'll have to see if patching in the argument fix is easy enough.
I'm not sure what the obvious reasons are for not wanting to do the same with Upstart. What works for sysvinit should work for Upstart, no?
Editing inittab obviously doesn't work. Moreover, the way that it's done with inittab editing is that it actively goes and *removes* entries that don't match the current console configuration at the next boot; doing this with upstart in this way is rather impractical, as you're removing event files out from under it. The obvious *right* way to do this is by events triggered off of udev. If upstart isn't actually offering us a useful way to do this, why are we bothering with switching?
Not sure this will work anyway. The first serial console will work, but then the next one will just say the job is already running. Instance jobs are another feature that's only in trunk atm. Right now it's one job per job file.
Scratch what I just said. Instance jobs are actually in 0.3.9. But the job definition is wrong for this reason: you need an "instance" stanza. Working on a good solution to all of this with Scott right now.
In 0.3, respawn was intended to be a magic variant of "start on stopped $UPSTART_JOB"; we didn't realise that this was the wrong behaviour until later -- and that you actually want it to be just a reincarnation of the previous job, started by the same request as before.

You can implement respawn yourself in the job, reemitting your own events to restart the next instance:

start on fedora.serial-console-available
stop on runlevel [016]

instance
exec /sbin/agetty /dev/$1 $2 vt100-nav

pre-stop script
    touch /tmp/do-not-respawn-$1
end script

post-stop script
    if [ -f /tmp/do-not-respawn-$1 ]; then
        rm -f /tmp/do-not-respawn-$1
    fi
    initctl emit --no-wait fedora.serial-console-available $1 $2
end script

Since this is no longer a service, the event information will be available in pre-stop and post-stop. pre-stop only runs if the job stops naturally (ie. by "stop on" or explicit "stop") so we use this to work out whether to respawn or not.

Because it's not a service, you'll need --no-wait on the emit calls; otherwise they'll block until the getty actually dies.
Correction to the above

post-stop script
    if [ -f /tmp/do-not-respawn-$1 ]; then
        rm -f /tmp/do-not-respawn-$1
    else
        initctl emit --no-wait fedora.serial-console-available $1 $2
    fi
end script
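The marker-file handshake between pre-stop and post-stop can be tried outside Upstart with a sketch like the one below. The names `MARKER_DIR`, `mark_no_respawn` and `should_respawn` are hypothetical and not part of Upstart; the job above uses /tmp directly.

```shell
#!/bin/sh
# Hypothetical stand-ins for the pre-stop / post-stop logic of the job.
# MARKER_DIR is an assumption so the directory can be changed for testing.
MARKER_DIR=${MARKER_DIR:-/tmp}

# pre-stop side: record that this tty was stopped deliberately.
mark_no_respawn() {
    touch "$MARKER_DIR/do-not-respawn-$1"
}

# post-stop side: returns 0 if the job should be respawned, 1 otherwise.
# A marker means a deliberate stop; it is consumed so that the next
# unexpected exit respawns again.
should_respawn() {
    marker="$MARKER_DIR/do-not-respawn-$1"
    if [ -f "$marker" ]; then
        rm -f "$marker"
        return 1
    fi
    return 0
}
```

In the job itself, the post-stop script would run the initctl emit line only when the equivalent of `should_respawn` succeeds, which is what the else branch in the correction does.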
(The above is obviously a hack for 0.3's brain-deadness; to prove the issue is fixed, here would be the same job with trunk:

start on fedora.serial-console-available
stop on runlevel [016] or fedora.serial-console-gone $TTY

env BAUD=38400
instance $TTY

respawn
exec /sbin/agetty /dev/$TTY $BAUD vt100-nav

Obviously the first thing is that the environment actually carries on past a respawn, that's useful. It also lets you explicitly do things like "start agetty TTY=ttyS0" (with $BAUD defaulting in the job).)
So, not to ask a completely ridiculous question - should we use trunk? *ducks and runs* Can you set env variables in pre-stop to use in post-stop, or does all communication have to be done via the fs?
Trunk has no initctl yet, so it would be kinda hard work :-) You can't pass things from pre-stop to post-stop - can you think of a use case that isn't caused by an Upstart bug? :)
Other than 'state on the filesystem is kind of a hack', no.
Sure, but in theory post-stop is only meant to clean up things created in pre-start. There are supposed to be better ways to react to a job stopping (another job catching the stopped event, for example) for further work.
Actually, couldn't you just trap on 'UPSTART_EVENT' != 'fedora.serial...' in the post-stop?
(In reply to comment #22)
> Actually, couldn't you just trap on 'UPSTART_EVENT' != 'fedora.serial...' in the
> post-stop?

How do we tell an explicit run of /sbin/stop from a crash? From the perspective of post-stop they both have $UPSTART_EVENT = ""
We could use a state job. Have pre-stop start an empty job and post-stop kill it. I don't know that that's prettier though.
A crash still has UPSTART_EVENT = the starting event, at least in brief testing.
My testing differs. I have it blank.
Here's my event:

start on fedora.serial-console-available *
stop on runlevel [016]

instance
exec /sbin/agetty /dev/$1 $2 vt100-nav

post-stop script
    if [ "$UPSTART_EVENT" != "${UPSTART_EVENT##fedora.serial-console-available}" ]; then
        initctl emit --no-wait fedora.serial-console-available $1 $2
    fi
end script

On a crash (kill -SEGV) it respawns correctly, and on 'initctl stop <event>', it stops correctly.
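The test in that post-stop script relies on shell prefix stripping: `${VAR##prefix}` removes the prefix if it is present, so the result differs from the original only when the value starts with that prefix. A minimal sketch of the same check, with a hypothetical helper name `started_by`:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the string in $1 begins with the
# literal prefix in $2. Quoting "$2" inside the expansion keeps it from
# being treated as a glob pattern.
started_by() {
    [ "$1" != "${1##"$2"}" ]
}

# In post-stop this would correspond to something like:
#   if started_by "$UPSTART_EVENT" fedora.serial-console-available; then
#       initctl emit --no-wait fedora.serial-console-available $1 $2
#   fi
```

An empty value never matches, so the job is not re-emitted in the cases where the comments above report $UPSTART_EVENT as blank.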
ahh. The segv is the difference. A regular -TERM doesn't do it.
As we do have something that works for what we need here, deferring.