Bug 444410
Summary: | bootsequence haldaemon failure
---|---
Product: | [Fedora] Fedora
Component: | chkconfig
Version: | 9
Hardware: | i686
OS: | Linux
Status: | CLOSED WORKSFORME
Severity: | high
Priority: | low
Reporter: | Roger Depreeuw <rogdepre>
Assignee: | Bill Nottingham <notting>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | gc, james, jthurtell, pertusus, redhat, rvokal, tgutwin, torel
Doc Type: | Bug Fix
Last Closed: | 2008-12-04 20:57:27 UTC
Description
Roger Depreeuw
2008-04-28 09:34:55 UTC
Are you running SELinux in enforcing mode? If so, does it work in permissive mode?

I see the same. SELinux disabled. Also seems to influence NetworkManager, which dies during boot. WAR: restart haldaemon and NetworkManager after login.

--verbose=yes|no    Print out debug (overrides HALD_VERBOSE)
--use-syslog        Print out debug messages to syslog instead of stderr. Use this option to get debug messages if hald runs as a daemon.

Can you get any output from hal using these options?

May 2 23:18:27 bgo-s-101 hald[2539]: 23:18:27.419 [I] hald.c:669: hal 0.5.11rc2
May 2 23:18:27 bgo-s-101 hald[2539]: 23:18:27.420 [I] hald.c:678: Will daemonize
May 2 23:18:27 bgo-s-101 hald[2539]: 23:18:27.420 [I] hald.c:679: Becoming a daemon
May 2 23:18:27 bgo-s-101 hald[2540]: 23:18:27.422 [I] hald_dbus.c:5381: local server is listening at unix:abstract=/var/run/hald/dbus-MGoEELB39j,guid=b352a454e6639b29ca70902c481b8523
May 2 23:18:27 bgo-s-101 hald[2540]: 23:18:27.423 [E] hald_dbus.c:5747: dbus_bus_get(): Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
May 2 23:19:17 bgo-s-101 pulseaudio[4168]: module-hal-detect.c: Couldn't connect to hald: (null): (null)
May 2 23:19:43 bgo-s-101 pulseaudio[4410]: module-hal-detect.c: Couldn't connect to hald: (null): (null)

WAR: 'cd /etc/rc5.d ; mv S26haldaemon S28haldaemon' and hald does not die.

(In reply to comment #2)
> I see the same. SELinux disabled. Also seems to influence NetworkManager which
> dies during boot. WAR: restart haldaemon and NetworkManger after login.

For what it's worth... I see all the same symptoms. Restarting hal and network after login also brings things to a working state. My SELinux is disabled. The delay at my boot is longer than 2 minutes. It's more like 5.

Looks like the real issue is with dbus?

Is dbus getting started before hal in your boot sequence? (It shows up as "messagebus" in boot messages.)

A simple suggestion.
Check whether the problem is the same as what happened for bug #444859: for some unknown reason a bunch of services changed their priority to S99. Among these is messagebus, which thus starts AFTER haldaemon. Why this happened is a mystery to me, as is the fact that trying to reset priorities with chkconfig resetpriotities has no effect.

So it is a chkconfig problem rather than a haldaemon problem.

I don't see any errors here, but I have moved haldaemon to S28. Do you?

[root@bgo-s-101 rc5.d]# ls -latr *mess* *hald* *udev* *acpi* *Consol* *ntpd* *NetworkM* *syslog* *yum* *blue*
lrwxrwxrwx 1 root root 20 2008-02-09 12:43 S90ConsoleKit -> ../init.d/ConsoleKit
lrwxrwxrwx 1 root root 22 2008-02-09 12:59 S97yum-updatesd -> ../init.d/yum-updatesd
lrwxrwxrwx 1 root root 20 2008-04-09 08:31 S27messagebus -> ../init.d/messagebus
lrwxrwxrwx 1 root root 17 2008-04-09 09:10 S12rsyslog -> ../init.d/rsyslog
lrwxrwxrwx 1 root root 24 2008-04-19 13:13 S99NetworkManager -> ../init.d/NetworkManager
lrwxrwxrwx 1 root root 17 2008-04-19 13:13 K75ntpdate -> ../init.d/ntpdate
lrwxrwxrwx 1 root root 15 2008-04-19 13:13 S26acpid -> ../init.d/acpid
lrwxrwxrwx 1 root root 19 2008-04-24 17:18 S26udev-post -> ../init.d/udev-post
lrwxrwxrwx 1 root root 19 2008-05-01 22:52 S28haldaemon -> ../init.d/haldaemon
lrwxrwxrwx 1 root root 14 2008-05-06 17:32 S58ntpd -> ../init.d/ntpd
lrwxrwxrwx 1 root root 19 2008-05-06 17:32 S50bluetooth -> ../init.d/bluetooth

I enabled haldaemon again to start during the boot sequence at run levels 2,3,4,5. I then disabled messagebus in level 2 and left it on at levels 3,4,5. Both cold boot and warm reboot resulted in a clean boot sequence, and haldaemon was started nicely. Then I switched messagebus back on at run level 2; the boot sequence ran without unexpected results and haldaemon was again running. Does this make sense to you?

Regards

(In reply to comment #8)
> A simple suggestion.
> Check whether the problem is the same as what happened for
> bug #444859: for some unknow reasons a bunch of services changed their priority
> to S99. Among this there is also messagebus, which thus starts AFTER haldaemon.
> The reason why this happened is a mystery for me, as well as the fact as trying
> to reset priorities by chkconfig resetpriotities has no effect.
>
> So it is a chkconfig problem rather than a haldaemon problem

Yes. That's it. messagebus is S99. I will reset them. Thanks for the help.

Because resetting priorities with chkconfig resetpriotities is not working, I had to manually mv the links in the rc.d/* subdirs to the correct priorities. This at least got me booting without the haldaemon delay.

BUT things are still not perfect. I had to do a

/etc/init.d/haldaemon restart
/etc/init.d/network restart

to get my network working (NetworkManager still won't hook up).

For example, in rc3.d:

#!/bin/bash
mv S99acpid S44acpid
mv S99haldaemon S26haldaemon
mv S99messagebus S22messagebus
mv S99cups S98cups
mv S99network S10network
mv S99NetworkManager S27NetworkManager
mv S99nscd S30nscd
mv S99ntpd S58ntpd
mv S99ntpdate S57ntpdate
mv S99sendmail S80sendmail
mv S99xinetd S56xinetd
mv S99yum-updatesd S97yum-updatesd

The rpm package is at: chkconfig-1.3.37-2

Please don't manually move the links. You need to reset the priorities in order, unfortunately.

What services do you have installed now? (names and versions)

Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

(In reply to comment #13)
> Please don't manually move the links. You need to reset the priorities in order,
> unfortunately.

Do you mean

chkconfig network resetpriotities
chkconfig messagebus resetpriotities
chkconfig haldaemon resetpriotities
chkconfig NetworkManager resetpriotities
....

in that order because that is the order they should start? --> 10, 22 26 27

chkconfig throws an error:

[root@warp2 init.d]# /sbin/chkconfig network resetpriotities
chkconfig version 1.3.37 - Copyright (C) 1997-2000 Red Hat, Inc.
This may be freely redistributed under the terms of the GNU Public License.

usage:   chkconfig [--list] [name]
         chkconfig --add <name>
         chkconfig --del <name>
         chkconfig --override <name>
         chkconfig [--level <levels>] <name> <on|off|reset|resetpriorities>

I confirmed ... No changes were made to the links.

rpm -q chkconfig -i renders

Name        : chkconfig                    Relocations: (not relocatable)
Version     : 1.3.37                       Vendor: Fedora Project
Release     : 2                            Build Date: Wed 20 Feb 2008 04:36:39 AM PST
Install Date: Sat 05 Apr 2008 10:51:35 PM PDT   Build Host: xenbuilder4.fedora.phx.redhat.com
Group       : System Environment/Base      Source RPM: chkconfig-1.3.37-2.src.rpm
Size        : 599326                       License: GPLv2
Signature   : DSA/SHA1, Tue 04 Mar 2008 07:38:40 AM PST, Key ID da84cbd430c9ecf8
Packager    : Fedora Project

snip...
>
> What services do you have installed now? (names and versions)

See attached files for chkconfig --list and a head * in the init.d dir.

Created attachment 305780 [details]
init.d Head *
A head list of the init.d files to show the priorities
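The priorities that the attached `head *` listing exposes come from the `# chkconfig:` comment line in each init script. As a minimal sketch of pulling out just those lines (the function name `list_chkconfig_headers` and the directory argument are illustrative assumptions, not part of chkconfig), something like the following could be used:

```shell
#!/bin/sh
# Hypothetical helper: for every regular file in directory $1 that carries a
# "# chkconfig: <runlevels> <start> <stop>" header, print
# "<script name> <runlevels> <start> <stop>".
list_chkconfig_headers() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # Only the first matching header line in each script is reported.
        awk -v name="${f##*/}" '
            /^#[ \t]*chkconfig:/ {
                sub(/^#[ \t]*chkconfig:[ \t]*/, "")
                print name, $1, $2, $3
                exit
            }
        ' "$f"
    done
}
```

On the systems in this report it would be called as `list_chkconfig_headers /etc/init.d`. Note that it reads only the `# chkconfig:` line and ignores any LSB INIT INFO block in the same script.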
Created attachment 305781 [details]
services status/priorities
chkconfig --list
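The failure discussed in this thread comes down to link ordering inside a runlevel directory: messagebus has to start before haldaemon. The sketch below checks that ordering for any two services. The helper names `start_link` and `starts_before` are hypothetical, and the comparison assumes that start links run in plain byte order of their full names (so S26haldaemon precedes S26messagebus), which may differ from the ordering the rc script uses under some locales.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the SNNservice start link for
# service $2 in runlevel directory $1; return 1 if there is none.
start_link() {
    for link in "$1"/S[0-9][0-9]"$2"; do
        [ -e "$link" ] || [ -L "$link" ] || continue
        printf '%s\n' "${link##*/}"
        return 0
    done
    return 1
}

# Return 0 if service $2 starts before service $3 in runlevel directory $1.
# Returns 1 if either link is missing or the order is the other way round.
starts_before() {
    a=$(start_link "$1" "$2") || return 1
    b=$(start_link "$1" "$3") || return 1
    [ "$a" != "$b" ] || return 1
    # Assumption: byte-order sort of the whole link name decides start order.
    first=$(printf '%s\n%s\n' "$a" "$b" | LC_ALL=C sort | head -n 1)
    [ "$first" = "$a" ]
}
```

For example, `starts_before /etc/rc5.d messagebus haldaemon || echo "messagebus starts too late"` would flag the situation described in the comments above.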
I went through each service and reset their priorities one at a time. I found some interesting things happening...

Some calls to chkconfig <servicename> resetpriotities worked, resetting the priority properly:

- NetworkManager
- haldaemon
- bluetooth
- messagebus
- rsyslog

All were at S99 before the reset, and then went to the correct priority.

However, when I did the same for others, it took about 1 1/2 seconds to execute, which made me curious. And then when I looked to see whether the priorities were reset... NO, everything went back to S99 (even the ones I had eyeballed as correct just before).

So, I went through one by one doing a chkconfig <servicename> resetpriotities and then doing an 'ls -latr /etc/rc.d/rc5.d' to see what changed. These are the services that 'broke' the reset:

ntpdate
acpid
cups
nscd

They were, and still are, at S99 even though they should not be. There might be others I did not try.

Is there something common among these that is messing with chkconfig?

Bill, I saw in https://bugzilla.redhat.com/show_bug.cgi?id=444859#c7 that you said NOT to have NetworkManager carry a Provides for $network in the init section. Mine does. What should it be? Here is the NetworkManager head...

#!/bin/sh
#
# NetworkManager: NetworkManager daemon
#
# chkconfig: - 27 73
# description: This is a daemon for automatically switching network \
#              connections to the best available connection.
#
# processname: NetworkManager
# pidfile: /var/run/NetworkManager/NetworkManager.pid
#
### BEGIN INIT INFO
# Provides: network_manager $network
# Required-Start: messagebus haldaemon
# Required-Stop: messagebus haldaemon
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: start and stop NetworkManager
# Description: NetworkManager is a tool for easily managing network connections
### END INIT INFO

For F8 and previous, it shouldn't. For F9, it should be OK.

Created attachment 306575 [details]
Difference in priorities after resetpriorities for all daemons
The submitted diff file shows the difference of priorities in /etc/rc.d/rc3.d after doing a chkconfig <service> resetpriorities for all services. This was triggered by the fact that my haldaemon didn't start during boot either.

Hate to be a "me too", but I'm seeing some weird stuff from chkconfig. Mysql has:

# chkconfig: 2345 64 36
# description: A very fast and reliable SQL database engine.

I run chkconfig mysql resetpriorities and it shows up as S99mysql ... ??????

It also seems to reset my boinc, which I didn't even tell it to, which has:

# chkconfig: 345 98 03
# description: This script starts the local BOINC client as a daemon
# For more information about BOINC (the Berkeley Open Infrastructure
# for Network Computing) see http://boinc.berkeley.edu
# processname: boinc
# config: /etc/sysconfig/boinc

but still shows as S99boinc.

Tuc - are those the *full* headers, or are there any LSB INIT INFO blocks? Those take precedence...

<BLUSH> Sorry. Did not know that. Yes, there are LSB INIT INFO blocks:

MYSQL:

### BEGIN INIT INFO
# Provides: mysql
# Required-Start: $local_fs $network $remote_fs
# Should-Start: ypbind nscd ldap ntpd xntpd
# Required-Stop: $local_fs $network $remote_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: start and stop MySQL
# Description: MySQL is a very fast and reliable SQL database engine.
### END INIT INFO

boinc:

### BEGIN INIT INFO
# Provides: boinc
# Required-Start: $network
# Required-Stop: $network
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Description: This script starts the local BOINC client as a daemon
# For more information about BOINC (the Berkeley Open Infrastructure
# for Network Computing) see http://boinc.berkeley.edu
### END INIT INFO

My original intent is to start mysql before radius. Radius only has the chkconfig line, so it was started as S88.
I'm not sure how chkconfig and LSB INIT INFO interact in deciding what to call the entry (and why it mucks with boinc when they are completely different). I could just rename the link to S64mysql, but if someone else comes along later it could get reset, or a package update could do the same. So with a quandary like this, what's the best way to go about it?

Presumably, you have something providing $network starting at S99... figure out why that is happening, and you'll fix the problem. chkconfig NetworkManager resetpriorities may help.

As requested, since this is not on a Fedora release, 449164 opened.

This probably doesn't add anything, but I've just had to diagnose the same problem on a laptop after upgrading from F7 to F9 (by booting the F9 DVD and choosing upgrade, and then later running "yum update" to update all packages to current). After the upgrade, in runlevel 5, S26haldaemon was starting before S26messagebus. On bootup, hal was pausing for several minutes and then saying "FAILED" with no further info. Resetting the priority of messagebus to 22 fixed this. Possibly unrelated: the network service seemed to be turned off.

Problem still apparent during upgrade to F10 (from F8 in my case).

Symptoms: keyboard and mouse not working after upgrade, so unable to do a graphical login (gdm). Remote login confirmed the system was working OK, with the exception of haldaemon. /var/log/Xorg.0.log reported errors on the default keyboard and pointer, possibly related to hald.

Resetting the priorities of haldaemon and messagebus using chkconfig cures the problem.

Looking over this, I'm not seeing an actual bug with chkconfig itself - it's honoring the priorities and dependencies correctly. If you're seeing this issue, you may need to file a bug against the individual packages, asking that they reset their priorities on upgrade.
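To follow the suggestion above of finding what provides $network, the LSB INIT INFO blocks can be read directly from the init scripts. This is only a sketch under stated assumptions: `lsb_field` and `providers_of` are hypothetical helper names, the block is assumed to follow the layout quoted in this report, and continuation lines of multi-line fields are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print the value of LSB header field $2
# (e.g. Provides, Required-Start) from the INIT INFO block of script $1.
lsb_field() {
    awk -v field="$2" '
        /^### BEGIN INIT INFO/ { inblock = 1; next }
        /^### END INIT INFO/   { exit }
        inblock && index($0, "# " field ":") == 1 {
            sub(/^# [^:]*:[ \t]*/, "")
            print
            exit
        }
    ' "$1"
}

# Hypothetical helper: print the names of scripts in directory $1 whose
# Provides line lists facility $2 (for example the literal string $network).
providers_of() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        for word in $(lsb_field "$f" Provides); do
            if [ "$word" = "$2" ]; then
                printf '%s\n' "${f##*/}"
            fi
        done
    done
}
```

With the headers quoted in this report, `providers_of /etc/init.d '$network'` would be expected to list NetworkManager, whose start link can then be checked in the runlevel directory. The single quotes keep the shell from expanding `$network`.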