Bug 157759
Summary: | named crashes when NetworkManager started using init.d script
---|---
Product: | [Fedora] Fedora
Reporter: | Rob Kooper <kooper>
Component: | NetworkManager
Assignee: | Dan Williams <dcbw>
Status: | CLOSED WORKSFORME
QA Contact: | Ben Levenson <benl>
Severity: | medium
Priority: | medium
Version: | 4
CC: | dcbw, jvdias
Hardware: | i686
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2005-06-04 17:09:55 UTC
Bug Blocks: | 136451

Description
Rob Kooper
2005-05-14 16:27:10 UTC
Over to named... However, it sounds like it could be an SELinux issue, since we've run into this before where NM dies when started from initscripts but is fine when run normally as root.

The first run of named shows that pthread_create() failed - this can be due to lack of memory. The second run shows there was an error in the configuration file - forwarding is disabled because of it, but named continues to run. Please attach the /var/named/data/NetworkManager-named.conf configuration file you are using - this appears to be the source of the problem.

Created attachment 114421 [details]
NetworkManager-named.conf

Attached is the NetworkManager-named.conf file. I am running on an IBM T30 with 1 GB of memory.

RE: ns_taskmgr_create() failed: no available threads

Have you changed the stacksize ulimit from the default (10240 KB)? I.e., do you issue a "ulimit -s" in your initscripts or /etc/profile? The only reason I know of why named's pthread_create() might fail is that named always asks for the pthread_attr_getstacksize(...) stack for each of its 4 threads. If the stacksize rlimit is unreasonably large, the thread create can fail. What is your threads-max limit (cat /proc/sys/kernel/threads-max -> 16364)?

RE: no forwarders seen; disabling forwarding

This means NetworkManager is starting named with an empty forwarders{ ... } clause (i.e. NO nameservers are configured). NM should not start named if there are no nameservers to forward to.

ulimit -a reports a stack size of 10240, and threads-max = 32750. I started NetworkManager and, before I connect, the forwarders clause is indeed empty.

How reproducible is the named crash problem for you? If the named crash only happened once, then it is likely to have been caused by transient resource exhaustion - this can happen to any process that relies on pthread_create() (NM included) and is not a bug. The fact that NetworkManager starts named at all with an empty forwarders clause, nothing else in the config file, and 127.0.0.1 in resolv.conf is a NetworkManager bug.

Still happens, even with today's update of NetworkManager. I have not been able to start it correctly. I might try to reinstall FC4T3 later this week and see if the problem persists after that.

Please do try to reproduce this bug with the latest glibc*-2.3.5-6 and bind-9.3.1-4 from FC4 / rawhide. I cannot reproduce it here. glibc-2.3.4-19 introduced new threads libraries, which BIND was compiled to use. If you can still reproduce this problem with glibc*-2.3.5-6 and bind-9.3.1-4:
- Do you have SELinux enabled? If so, ensure you are up-to-date with selinux-policy-targeted, libselinux, policycoreutils and libsepol.
- Does the problem still occur with SELinux disabled? Boot with the "selinux=no" grub boot argument.

If the problem is reproducible with the latest glibc, BIND and selinux RPMS, please download the attached "named" script and do the following:

    # pkill -TERM named
    # mv /usr/sbin/named /usr/sbin/named_exe
    # cp -fp named /usr/sbin
    # restorecon /usr/sbin/named
    # mkdir /tmp/named

Once you have reproduced the problem, then:

    # tar -cpvf - /tmp/named | gzip -9 > /tmp/named.tar.gz
    # mv /usr/sbin/named /usr/sbin/named_dbg
    # mv /usr/sbin/named_exe /usr/sbin/named

and append the named.tar.gz file to this bug.

Created attachment 114569 [details]
debugging named script
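
The two diagnostics requested earlier in this thread (the stack-size and threads-max limits, and whether the forwarders clause is empty) could be gathered with something like the sketch below. The function names report_limits and forwarders_empty are hypothetical, not part of BIND or NetworkManager, and the forwarders check is only a rough text match that assumes the clause contains no comments.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Print the soft stack-size limit (KB or "unlimited") and the
# kernel-wide thread limit, the two values asked about above.
report_limits() {
    echo "stack size (ulimit -s): $(ulimit -s)"
    if [ -r /proc/sys/kernel/threads-max ]; then
        echo "threads-max: $(cat /proc/sys/kernel/threads-max)"
    else
        echo "threads-max: unavailable"
    fi
}

# Return 0 if the given config file contains a forwarders clause with
# nothing inside the braces. Whitespace is stripped first so that a
# clause spread over several lines still matches.
# Assumption: no comments appear inside the clause.
forwarders_empty() {
    tr -d ' \t\r\n' < "$1" | grep -qF 'forwarders{}'
}
```

For example, `forwarders_empty /var/named/data/NetworkManager-named.conf && echo "no forwarders configured"` would flag the situation described above before named is started.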
I think I can reliably reproduce it now. The problem is indeed an SELinux problem. After I did a restorecon -R /etc the script works. Here is how I can break it:
1. Start named using the script: /etc/rc.d/init.d/named start
2. Start network configuration using Desktop->System Settings->Network
3. Set up a wireless connection and activate the interface
4. Surf the web
5. Stop named using /etc/rc.d/init.d/named stop
6. Start NetworkManager using /etc/rc.d/init.d/NetworkManager start

This makes SELinux stop named when it is started from NetworkManager, which in turn makes NetworkManager fail.

Aha! Many thanks for the information. So there are two problems here:
1. The NetworkManager init script does not get the right SELinux context during installation - or did you write to the script after you installed the RPM? If not, NM should be setting the context of its initscript correctly in the RPM.
2. NetworkManager currently cannot be run when an instance of named is run from /etc/rc.d/init.d/named. NetworkManager requires its own dedicated named at the moment, and named cannot be used for other purposes on the same machine.

There is work in the pipeline to rectify this - I've completed a version of named that provides dynamic management of forwarding zones over D-BUS. Then you could run named at boot with its standard named.conf file (or one you've customized) using /etc/init.d/named, having it serve authoritative zones over external interfaces, while still dynamically configuring the forwarding zones used for queries from the localhost interface, and NM would not have to start up / shut down named every time it brings an interface up / down. A version of NM that uses bind-dbus should be out shortly.

Actually I think NetworkManager starts named from inside the application.
It seems that if I use system-config-network, it changes a file somewhere such that SELinux will complain the next time I start /etc/rc.d/init.d/NetworkManager; the only way to fix this is to run restorecon.

Well, there isn't much that the BIND package can do about these issues. NetworkManager should:
- ideally not run named at all - it should use the existing named started from its initscript (and use D-BUS to manage forwarders).

Until it does so:
- NM should either use the /etc/init.d/named script to start named or check that no named instance is running before it runs named
- NM should make the SELinux policy of its initscript compatible with running named.

Over to NetworkManager.

Tried today to see if I could reproduce this bug, but I could not. It seems that something changed enough somewhere to no longer trigger it. Will close the bug. There is still the extra message in the logfile stating that named.conf has no forwarders in it.
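
The interim suggestion in this thread, that NM "check that no named instance is running before it runs named", might look roughly like the guard below. This is only a sketch: is_running and start_unless_running are hypothetical helper names, not anything NetworkManager ships, and the check relies on pgrep from the same procps package that provides the pkill used earlier.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Return 0 if a process whose name matches exactly is running.
# Note: if pgrep is not installed, this reports "not running".
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Run the given command only if no process called NAME exists.
# Usage: start_unless_running NAME COMMAND [ARGS...]
start_unless_running() {
    name=$1
    shift
    if is_running "$name"; then
        echo "$name is already running; not starting another instance" >&2
        return 1
    fi
    "$@"
}
```

A call such as `start_unless_running named /etc/init.d/named start` would then refuse to launch a second named while one started from the initscript is still up. The check is not atomic, so a named started between the test and the launch would still slip through.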