Description of problem:
With today's rawhide, ypbind does not connect to the domain. I see the following:

Apr 25 14:30:30 cynosure portmap[2157]: connect from 127.0.0.1 to set(ypbind): request from unprivileged port
Apr 25 14:30:30 cynosure ypbind[2153]: Unable to register (YPBINDPROG, YPBINDVERS, udp).

Version-Release number of selected component (if applicable):
ypbind-1.17.2-5.i386
portmap-4.0-65.i386

How reproducible:
Every time
Hmm... I wonder if we are running out of reserved ports, which would mean a non-reserved port is being used. Was there a lot of network activity when this happened? Is autofs trying to mount a bunch of filesystems?
This was just during boot. I guess I lied with the reproducibility as I have not seen this again with any later rawhide or test release.
Is autofs running?
We run autofs with 4 NIS automount maps (/home, /data, /opt, /fs).
Is this only happening with selinux enabled?
I haven't been able to reproduce with later rawhide or fc4test installs. I'll close.
I've just installed FC4 with selinux --enforced, and the symptoms of this bug appeared. After that I reinstalled with selinux --disabled and the problem disappeared. I think this bug must be reopened. Thanks.
The problem is that the portmapper fails port registrations when the port being registered is a reserved (or privileged) port but the socket on which the registration came in is not on a privileged port. I'm not sure why this is done, but it's been happening since RH 7.1. We can either turn off the port registration check in the portmapper or make ypbind create its own bound socket, which would allow it to detect this condition and rebind before doing the svc_register().
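For illustration, the check described in the comment above boils down to something like the following C sketch. The function name and the constant are made up for this note and are not taken from the portmap source, so the real check may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ports below this value are "reserved" (privileged) on Unix systems;
 * <netinet/in.h> defines the same value as IPPORT_RESERVED. */
#define RESERVED_PORT_LIMIT 1024

/* Hypothetical helper, not the real portmap function: returns true when a
 * registration would be accepted under the rule described above. */
static bool registration_allowed(uint16_t source_port, uint16_t registered_port)
{
    bool registering_reserved = registered_port < RESERVED_PORT_LIMIT;
    bool source_is_privileged = source_port < RESERVED_PORT_LIMIT;

    /* Reject only the combination "reserved port registered through a
     * socket that is bound to an unprivileged source port". */
    return !(registering_reserved && !source_is_privileged);
}
```

Under this rule, a daemon that wants to register port 700 has to send the registration from a socket that is itself bound below 1024, which is why a failed attempt to get a reserved source port shows up as "request from unprivileged port".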
I have the same problem with FC4 and the latest updates (ypbind-1.17.2-5, portmap-4.0-65). Selinux is enforcing and type targeted. When I try to start ypbind after the system has booted (using '/etc/init.d/ypbind start'), then sometimes ypbind connects successfully and no error message is generated. But as we use autofs maps distributed via yp, these maps are not mounted when ypbind is not running. So it is important for us that ypbind is running and bound before /etc/init.d/autofs is called. Thanks in advance!
The problem doesn't seem to be isolated to ypbind. I have the same problem with ypserv. When started during boot, ypserv will almost always fail with

portmap[####]: connect from 127.0.0.1 to set(ypserv): request from unprivileged port
ypserv[####]: Unable to register (YPPROG, YPOLDVERS, tcp).

and subsequently, ypbind cannot find the server. ypserv works when I restart it manually after boot is done, i.e., ypbind can find and bind the second time.
Created attachment 120917 [details]
Add 'sleep 5' after starting portmap.

The problem still exists (because no new version of ypbind or portmap exists), but I found a workaround.

I found that after boot ypbind was sometimes connected and sometimes not. When it was connected, no error message from portmap about an unprivileged port was seen in /var/log/messages; when it was not connected, that message was logged. When I start ypbind after boot (with '/etc/init.d/ypbind start'), it always gets a connection.

I read in the portmap man page that it forks to the background after starting, and the starting process ends. At that point the startup script (/etc/init.d/portmap, e.g. /etc/rc5.d/S13portmap) ends and the next startup scripts start. I guess that the time is too short for the portmap process to initialize fully, so when /etc/init.d/ypbind (/etc/rc5.d/S27ypbind) starts ypbind, portmap is sometimes not ready yet.

I inserted a sleep command in the startup script /etc/init.d/portmap and rebooted the computer. With a wait of 2 seconds (sleep 2) on a fast computer, ypbind was always connected after the reboot. On a slower computer (Pentium 500 MHz) I got the connection only after a wait of 5 seconds.

But I think this is too simple a solution, just an ad hoc workaround. We need to test whether portmap is ready to take the registration from ypbind from 127.0.0.1. How can we test this?

Another possibility is that the sleep helps because other processes (and the kernel?) are also initializing themselves and setting things up. It may be that some network initialization takes time, or setting up selinux, or reading /etc/hosts.allow and /etc/hosts.deny (both are empty in my test case, apart from comment lines).
I also want to note that the messages from portmap and ypbind in /var/log/messages are mixed in between many other kernel initialization messages, such as ACPI and SCSI kernel messages. I think it is worth noting that the SELinux messages are written to /var/log/messages _after_ the relevant portmap and ypbind error (or success) messages! But I don't know whether this is because these kernel messages are buffered and logged later, or because the logged action happens at (or directly after) the time of logging. If SELinux is initialized after portmap and ypbind are started, then some rights may not be set up properly, and this may be a reason for the problem?
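On the question above of how to test whether portmap is ready: one rough option, sketched below in C, is to poll the portmapper's TCP port (111) on 127.0.0.1 instead of sleeping for a fixed time. Both function names are hypothetical. Note that a successful connect only shows that something is listening; it does not prove that a later registration will be accepted.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: true if something accepts TCP connections on
 * 127.0.0.1:port at this moment. */
static bool tcp_port_is_listening(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return false;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    bool ok = connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0;
    close(fd);
    return ok;
}

/* Probe once per second until the port answers or max_wait_seconds have
 * passed. Returns true if the port answered in time. */
static bool wait_for_tcp_port(uint16_t port, unsigned max_wait_seconds)
{
    for (unsigned waited = 0; ; waited++) {
        if (tcp_port_is_listening(port))
            return true;
        if (waited >= max_wait_seconds)
            return false;
        sleep(1);
    }
}
```

A startup helper could call wait_for_tcp_port(111, 10) before launching ypbind, which bounds the delay at about ten seconds on a slow machine and adds almost none on a fast one.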
Any more work on this? This is really becoming a headache for me.
I've just seen this on FC5test3. On FC4 I have had this problem for the past few days. Did you push any portmap/ypbind update recently?
Created attachment 127442 [details]
Patch to add retries to ypbind during startup

While I do think the real problem here is portmap startup (it forks into the background, so startup continues even though it isn't ready to process svc_register calls from later programs), this may be the easiest way around. This adds retries to the svc_register() calls in ypbind. The patch is against the current devel branch of ypbind in fedora cvs, though it will probably apply to all branches. It works fine for me on a system that consistently fails at boot. Logs now show entries like (note udp first, then tcp):

Apr 6 20:22:38 hobbes portmap[2152]: connect from 127.0.0.1 to set(ypbind): request from unprivileged port
Apr 6 20:22:38 hobbes ypbind[2148]: Unable to register (YPBINDPROG, YPBINDVERS, udp). Will retry in 1 sec.
Apr 6 20:22:39 hobbes portmap[2157]: connect from 127.0.0.1 to set(ypbind): request from unprivileged port
Apr 6 20:22:39 hobbes ypbind[2148]: Unable to register (YPBINDPROG, YPBINDVERS, tcp). Will retry in 1 sec.
Apr 6 20:22:40 hobbes ypbind: bound to NIS server earth.cora.nwra.com

though it varies which svc_register calls it complains about. I don't think this is a bad compromise, as ypbind startup already goes into a sleeping loop waiting for ypbind to bind to a domain. Unfortunately, this means a similar fix is needed for ypserv and perhaps other RPC/portmapped services.

I tried a patch to portmap where I moved the call to daemon() to just before the svc_run() call in main(), but that didn't seem to help. So I'm not quite sure what all is involved in the race condition here, but that seems to be the limit of what is possible in portmap without very kludgey sleeps in the portmap startup script. After this it's all internal to the RPC library.
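The retry idea described above can be sketched in C roughly as follows. This is not the attached patch. The callback type and both function names are invented for this note, and in ypbind the callback would wrap the actual svc_register() call.

```c
#include <stdbool.h>
#include <unistd.h>

/* Callback that makes one registration attempt and reports success.
 * Placeholder type: a real caller would wrap svc_register() here. */
typedef bool (*register_fn)(void *ctx);

/* Call attempt() up to max_attempts times, sleeping delay_seconds after
 * each failure except the last. Returns the 1-based number of the attempt
 * that succeeded, or 0 if every attempt failed. */
static int register_with_retries(register_fn attempt, void *ctx,
                                 int max_attempts, unsigned delay_seconds)
{
    for (int i = 1; i <= max_attempts; i++) {
        if (attempt(ctx))
            return i;
        if (i < max_attempts)
            sleep(delay_seconds);
    }
    return 0;
}

/* Stand-in for a real registration, for demonstration only: fails until
 * the counter that ctx points to reaches zero. */
static bool fail_then_succeed(void *ctx)
{
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

With a counter of 2 and a limit of 5 attempts, register_with_retries(fail_then_succeed, &counter, 5, 1) succeeds on the third attempt, which matches the "Will retry in 1 sec." pattern in the logs above.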
portmap patch for reference:

--- portmap_4/portmap.c.orig 2006-04-06 16:16:51.000000000 -0600
+++ portmap_4/portmap.c 2006-04-06 16:18:34.000000000 -0600
@@ -165,11 +165,6 @@
         }
     }
 
-    if (!debugging && daemon(0, 0)) {
-        (void) fprintf(stderr, "portmap: fork: %s", strerror(errno));
-        exit(1);
-    }
-
 #ifdef LOG_MAIL
     openlog("portmap", debugging ? LOG_PID | LOG_PERROR : LOG_PID,
         FACILITY);
@@ -242,6 +237,11 @@
     /* Dying on SIGPIPE doesn't help anyone */
     (void)signal(SIGPIPE, SIG_IGN);
 
+    if (!debugging && daemon(0, 0)) {
+        (void) fprintf(stderr, "portmap: fork: %s", strerror(errno));
+        exit(1);
+    }
+
     svc_run();
     syslog(LOG_ERR, "run_svc returned unexpectedly");
     abort();
Well, maybe scratch all that. I forgot that the problem goes away with selinux in permissive mode. So, here's my theory: selinux blocks certain ports in the normal RPC usage range that map to other well-known services in that range (e.g. ldap). I think I just found a similar problem with automount/nfs on FC5 (bug 185636), where the mount fails when the nfs mount daemon stumbles onto a port number that is blocked by selinux. I think the same thing is happening here. So it's not a race; the retry just gets us a new port number that isn't blocked by selinux. The problem for debugging is that the selinux policy has dontaudit rules that prevent the denials from showing up. On FC5 you can use semodule -b enableaudit to disable the dontaudit rules. Not sure what can be done on FC4. God I hate the way RPC uses port numbers....
Please forgive my ignorance regarding selinux and NIS. I'm trying to run NIS under FC5 using the services and authorisation guis. Three of the four services start OK, but ypbind refuses to. It doesn't start on booting, despite being flagged to do so, and when I try to start it from the gui, 49 times out of 50 the gui freezes and ypbind doesn't start. I have to kill the gui and restart it. In about 1/4 of the failures, /var/log/messages gets a series of messages like:

Apr 7 22:54:44 sc1 setsebool: The allow_ypbind policy boolean was changed to 1 by root
Apr 7 22:54:44 sc1 portmap[21237]: connect from 127.0.0.1 to set(ypbind): request from unprivileged port
Apr 7 22:54:44 sc1 ypbind[21233]: Unable to register (YPBINDPROG, YPBINDVERS, udp).

(the other 3/4 of the time, the logs appear to be silent, but I might be missing something important amongst all the firewall-blocked samba broadcasts etc).

I've finally got it going after what seems like the 1000th trial, but is probably about the 50th. Now I'm reluctant to even reboot, given the risk I won't be able to start it again. Note that I _don't_ see the interspersed messages with the above that Orion sees, but possibly this is just because the system is quiescent when I'm doing it, so there's nothing else writing to the logs.

So my first question: does this seem likely to be the same bug? I've been tearing my hair out over it for the past week. If it is a bug, the plus is I'll feel a bit better about my ability to configure NIS 8^); the downside, I guess, is that it doesn't sound like it's going to be easy to fix. If so, is there any useful information I can provide?

Regarding your patches, Orion, these would be patches to the NIS source distribution, right? So they would be activated whenever I tried to restart the ypbind service, not just on boot? In particular, they would be effective even when ypbind is started from the gui? I guess it could take a while, though, with the hit rate I'm getting. It's pretty clear, by the way, that my hit rate is worse than you are reporting; I only finally got it started because I realised that just dumbly re-trying over and over actually could be useful in this case. If it is a selinux problem, any ideas why the hit rate could vary? In particular, could my use of firestarter rather than the default iptables be contributing?

TIA
Bob
Bob - Sounds like a much worse situation than normal. With pure selinux interaction, the hit rate depends on how often you stumble onto the selinux-blocked ports, and is generally quite rare. Firestarter could be an issue, though I'd be surprised if it is blocking localhost traffic. Can you disable it for a bit and test that way? It would be my first guess as to the problem. If you want to try my patched ypbind, you can get it from http://www.cora.nwra.com/~orion/fedora/ypbind-1.19-0.cora.1.i386.rpm (assuming an i386 machine). Also, the services panel may not be hanging. The ypbind service will wait up to 20 seconds for the service to start and for ypbind to bind to the domain. During a "hang" you might check what processes are running, or run "strace -fp <pid-of-system-config-services>" and watch that during a restart of ypbind.
Created attachment 127508 [details] Audit log of stopping and restarting ypbind Audit log of a cycle of stopping and restarting ypbind from the gui (comments are within 1-2 lines of the actual point of occurrence).
Hmmm, I'm not now sure this is the same problem. I attached an audit log of stopping and restarting ypbind, having enabled auditing per your instructions above. It may also explain some glitches with web browsers I have been experiencing: it appears that while the system is trying to restart ypbind, it may also be unable to properly handle the responses to http queries.

To summarise the strangenesses:
. on attempting to restart ypbind, the gui appears to freeze (actually just very slow, >20 seconds to respond to mouse clicks)
. it never reports successfully restarting ypbind (>> 5 minutes). On previous occasions I have gone away to have lunch, and it's still whirling away when I get back.
. in this state, web browser 'http get's appear to hang
. when the gui is stopped and restarted, it comes up showing ypbind as running.

At this point, we get one of two states:
1 the browser running in a separate window completes its get and the page displays (immediately on the service configuration window opening)
1 ypbind gui is in a state where a single press of 'stop' will stop it
or
2 the browser running in a separate window remains hung
2 ypbind is in a state where a single press of 'stop' results in
  2a ypbind still being reported as running
  2a the browser running in a parallel window now completes its get and the page displays (again, immediately on the 'stop' press)
2 after a second press of 'stop', ypbind is reported as stopped

Looking at the log, I guess that the denied searches/writes/reads for domainname and ypbind are likely to be the source of the problem, so this means it is a file authorisation problem rather than a port problem, right? Any ideas where I might look to figure out the source of these?

By the way, you are right, the gui isn't frozen after the call to restart ypbind, it's just stuck in treacle: responses to commands do occur, but they take around 20 - 30 seconds. I guess I wasn't patient enough... but restarting it removes the treacle.
Further tracking of my problem with turning on auditing and interpreting the audit logs suggests my problem is a separate issue related to selinux policy. I have opened a new report under Bug #188572 for this.
Is the allow_ypbind boolean turned on?

getsebool allow_ypbind

If not, you need to turn it on:

setsebool -P allow_ypbind=1
allow_ypbind is on.
Ok, I think this is the same problem that mount and automount have. Basically it is calling bindresvport, which will fail with EPERM if SELinux is blocking the port. I have updated rawhide policy and will backport to FC5 at the end of the week. Right now we are not intending to update FC4, so you could disable the transition or just try again.

Dan
I'm still seeing this with selinux-policy-targeted-2.3.2-1.fc5. Is this still going to be back ported to FC5?
Does it work in selinux-policy-targeted-2.3.3-8.fc5?
Nope. Still get this sometimes:

Jul 24 15:02:38 vault portmap[3114]: connect from 127.0.0.1 to set(ypbind): request from unprivileged port
Jul 24 15:02:38 vault ypbind[3111]: Unable to register (YPBINDPROG, YPBINDOLDVERS, tcp).
Ok, it is only handling the tcp case, not the udp case, so you will need to wait until the next update. In the meantime you can build a custom policy module for this:

audit2allow -M myypbind -i /var/log/messages
semodule -i myypbind.pp
Okay. The more I think about it, though, it seems like ypbind should be allowed to use these ports for security reasons, but that something like my earlier patch should be used to look for an available port. Perhaps the patch should instead be made to svc_register/bindresvport itself. It would also prevent RPC programs from "stealing" ports from others (like cups, rsync, ldap, etc).
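As a rough illustration of the "look for an available port" idea above, here is a C sketch that walks the reserved range and skips any port where the bind attempt is refused. All names are hypothetical. The 600 to 1023 range is my understanding of what glibc's bindresvport() uses and should be treated as an assumption. The stand-in callback refuses three well-known ports (631 IPP, 636 LDAP over SSL, 873 rsync) to imitate a policy denial; a real version would call bind() and treat an error as "try the next port".

```c
#include <stdbool.h>
#include <stddef.h>

#define RESV_START 600   /* assumed start of the bindresvport() range */
#define RESV_END   1023  /* last reserved (privileged) port */

/* Callback that tries to bind one port and reports success. */
typedef bool (*try_bind_fn)(int port, void *ctx);

/* Walk the whole range once, starting at first_port and wrapping around.
 * Returns the first port for which try_bind() succeeds, or -1 if every
 * port in the range was refused. */
static int find_bindable_port(int first_port, try_bind_fn try_bind, void *ctx)
{
    int count = RESV_END - RESV_START + 1;

    if (first_port < RESV_START || first_port > RESV_END)
        first_port = RESV_START;

    for (int i = 0; i < count; i++) {
        int port = RESV_START + (first_port - RESV_START + i) % count;
        if (try_bind(port, ctx))
            return port;
    }
    return -1;
}

/* Stand-in for bind(), for demonstration only: refuses a few ports that
 * belong to other well-known services. */
static bool refuse_well_known(int port, void *ctx)
{
    (void)ctx;
    return port != 631 && port != 636 && port != 873;
}
```

For example, find_bindable_port(631, refuse_well_known, NULL) skips the refused port and returns 632, so the caller never has to fail outright just because its first candidate happened to be a port reserved for another service.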
The rules I ended up finding, FYI:

allow ypbind_t dhcpd_port_t:udp_socket name_bind;
allow ypbind_t inetd_child_port_t:udp_socket name_bind;
allow ypbind_t ipp_port_t:udp_socket name_bind;
allow ypbind_t kerberos_port_t:udp_socket name_bind;
allow ypbind_t ldap_port_t:udp_socket name_bind;
allow ypbind_t rsync_port_t:udp_socket name_bind;
Fixed in selinux-policy-targeted-2.3.7-2.fc5.
confirmed. thanks!