Bug 752901 - RFE: clearer error messages
Summary: RFE: clearer error messages
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 16
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-11-10 18:41 UTC by Bill Nottingham
Modified: 2014-03-17 03:29 UTC (History)
15 users (show)

Fixed In Version: systemd-37-11.fc16
Doc Type: Bug Fix
Doc Text:
Clone Of: 734268
Environment:
Last Closed: 2012-01-30 20:59:15 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Bill Nottingham 2011-11-10 18:41:46 UTC
+++ This bug was initially created as a clone of Bug #734268 +++

<lots snipped>

--- Additional comment from mingo on 2011-11-10 13:32:17 EST ---

<more snipped>
3)

I missed the 203/EXEC bit because it wasn't a familiar pattern (i mistakenly thought that it simply means that things were attempted to be executed) - would it be possible to prefix that output with -ENO, resulting in -ENOEXEC or so, signalling that it's a *failure*?

Also, the output is pretty meaningless because it's not structured, unless you really are experienced in systemctl output (which most of the people are not):

   Active: failed since Thu, 10 Nov 2011 07:32:09 +0100; 3min 8s ago
  Process: 5078 ExecStart=/etc/rc.d/rc.local start (code=exited, status=203/EXEC)

Something like this:

   Active: failed since Thu, 10 Nov 2011 07:32:09 +0100; 3min 8s ago
   Process: 5078 ExecStart=/etc/rc.d/rc.local start exited abnormally with status 203/-ENOEXEC - file not executable via exec()?

would have led me to the problem straight away.

The fact remains, i spent more than half an hour on something that should have been far more obvious to analyse at first sight. I only solved it because i straced the systemctl command, strace -s 100000 it once again to see all the message, saw the writes() to the socket resulting in a failure being passed back, then identified where the pipe was openened, realized that it was used by PID 1, then straced PID 1 and also looked at the systemd source code and man page while doing this.

If i have trouble with that and have to go to such lengths to figure out a pretty common failure then what will the typical sysadmin experience?

...

Another option might be replacing EXEC/-ENOEXEC with the translated perror() string.

Comment 1 Michal Schmidt 2011-11-11 10:57:50 UTC
(In reply to comment #0)
> I missed the 203/EXEC bit because it wasn't a familiar pattern (i mistakenly
> thought that it simply means that things were attempted to be executed) - would
> it be possible to prefix that output with -ENO, resulting in -ENOEXEC or so,
> signalling that it's a *failure*?

If systemctl runs on a tty, then the word "failed" and the text "(code=exited,
status=203/EXEC)" are printed in red to signal a failure.

The "203" is the exit status of a process. It is not an errno number, so calling it "ENOEXEC" would be very confusing.
We do not store the errno of the execve() call anywhere. We just know that the call failed for whatever reason and this is what the systemd-reserved exit code "203/EXEC" means.

> Something like this:
> 
>    Active: failed since Thu, 10 Nov 2011 07:32:09 +0100; 3min 8s ago
>    Process: 5078 ExecStart=/etc/rc.d/rc.local start exited abnormally with
> status 203/-ENOEXEC - file not executable via exec()?

Yes, we can easily add human-readable descriptions of the systemd-reserved exit codes to systemctl.

Comment 2 Bill Nottingham 2011-11-11 16:45:32 UTC
Is there a reason we lose the errno that can't be fixed?

Comment 3 Michal Schmidt 2011-11-14 08:43:58 UTC
It is fixable. We could:
 - log the errors from the forked process "sd(EXEC)". We just need to reopen
   the log fd, because we close all fds before we execve() the real service.
   That's easy to do.
 - communicate the errors from "sd(EXEC)" to systemd using the notify socket.
   This way we would not even need to have the reserved range of exit codes.

Both methods may fail if the service is configured to run in a weird namespace, but such cases are rare.

Comment 4 Michal Schmidt 2011-11-17 00:45:02 UTC
This is now improved a little in upstream:
http://cgit.freedesktop.org/systemd/commit/?id=4c2630ebf23b6348174f0bdf1110e90efe45259c

systemctl status still shows the same information, but in kmsg/syslog there will be a message:

[   12.933713] systemd[779]: Failed at step EXEC spawning /etc/rc.d/rc.local: Exec format error

Comment 5 Fedora Update System 2012-01-11 15:02:18 UTC
systemd-37-6.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/systemd-37-6.fc16

Comment 6 Fedora Update System 2012-01-11 20:57:55 UTC
Package systemd-37-6.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-37-6.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-0409/systemd-37-6.fc16
then log in and leave karma (feedback).

Comment 7 Fedora Update System 2012-01-16 02:25:29 UTC
Package systemd-37-7.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-37-7.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-0409/systemd-37-7.fc16
then log in and leave karma (feedback).

Comment 8 Fedora Update System 2012-01-17 20:23:25 UTC
Package systemd-37-8.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-37-8.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-0409/systemd-37-8.fc16
then log in and leave karma (feedback).

Comment 9 Fedora Update System 2012-01-22 22:54:43 UTC
Package systemd-37-10.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-37-10.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-0409/systemd-37-10.fc16
then log in and leave karma (feedback).

Comment 10 Fedora Update System 2012-01-26 22:58:30 UTC
Package systemd-37-11.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-37-11.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-0409/systemd-37-11.fc16
then log in and leave karma (feedback).

Comment 11 Fedora Update System 2012-01-30 20:59:15 UTC
systemd-37-11.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.