Description of problem:

Version-Release number of selected component (if applicable):
RHAS 3 FCS is installed on an Intel platform, with the new xinetd patch from bug fix 123522 (xinetd-2.3.12-6.3E).

Linux test1 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux

Configured an RPC service in /etc/xinetd.d:

    # default: on
    service toto_server
    {
        disable     = no
        type        = RPC
        socket_type = stream
        wait        = no
        protocol    = tcp
        rpc_version = 1
        rpc_number  = 100145
        user        = root
        server      = /etc/init.d/in.toto_server
    }

Ran the client program; it timed out on the first run but was OK on the subsequent runs. I noticed that the server stays forever and does not go away, which is not the behaviour of inetd on Solaris. Another thing is that if I manually kill the server and run the client program again, xinetd cannot start the server. We need to either reboot the machine or restart xinetd, which also is NOT the same behaviour as inetd.

How reproducible:

Steps to Reproduce:
1. Add the toto_server config file in /etc/xinetd.d.
2. cp in.toto_server to /etc/init.d/.
3. Add a line "toto_server 100145" to /etc/rpc.
4. service xinetd restart
5. Run the client program (it times out).
6. Run the client program (OK this time).
7. Manually kill toto_server.
8. Run the client program; it fails because toto_server is not up.

Actual results:
Can't run the client program.

Expected results:
Could run the client program even after the server dies. Also, the first run of the client program should not time out.

Additional info:
Created attachment 100690 [details] server
Created attachment 100691 [details] client
Linda, could you upload the source files of the server and client, in place of the binaries you uploaded now?
Created attachment 100798 [details] Tar file contains sources for client/server. Do "tar xvf toto.tar" to retrieve files.
I have attached sources in the attachment. To begin with, I did "rpcgen -a toto.x".
What happens if you change "wait = no" to "wait = yes" in the toto server config? My test servers, which use "wait = yes", do not show this problem.
I changed the toto_server config file to "wait = yes" and the result is the same as with "wait = no". The first time I ran toto_client I got "call failed: RPC: Timed out". The toto_client call was successful in the subsequent runs. Then I did pkill toto_server and ran the client again. I got "RPC: Remote system error - Connection refused" and did not see toto_server come back up. BTW, did you get "RPC: Timed out" at the first run for your test server? You could also test my toto_server and toto_client, as they are in the attachment. I don't know why yours works with "wait = yes" but mine does not. Thanks!!
I tested "wait = yes" again for toto_server/toto_client. It worked this time. The scenarios that I tested are:
1. The first client request still TIMED OUT (this is still a bug).
2. The subsequent client requests are OK.
3. Then I did pkill toto_server, ran "ps", and found that toto_server had been started again automatically.
4. After that, both the first and the subsequent client requests are OK.

According to the xinetd.conf man page, "wait" should be set to no for a multithreaded TCP RPC service. Apparently, it does not behave that way. Note: this was also tested with the xinetd-2.3.12-6.3E patch.
I suspect this bug has the same cause as #125485. If the RPC server misbehaves and registers itself with portmap (overwriting xinetd's registration), then after the server exits, xinetd is unaware that its registration with portmap has been overwritten, and no client requests will ever be sent to it. Since no client request is sent to xinetd, it does not restart the server. (Actually, what I see is that the original client request, the one that timed out, is never read, because toto_server opened its own socket. So xinetd sees the unread request and immediately starts a new toto_server when the initial one is killed. The second toto_server also ignores the original request, and so on.)

Reopen this bug if you can replicate it with an RPC server that was correctly written to be called from xinetd.
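To make "correctly written to be called from xinetd" a little more concrete, here is a minimal sketch of the startup check such a server might make: see whether file descriptor 0 is already a socket handed over by the super-server, and if so use it rather than opening a new socket and re-registering with portmap. The names inherited_socket_type and started_by_superserver are hypothetical and are not taken from the attached toto sources. This is an illustration under those assumptions, not the fix for this bug.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the socket type of fd (SOCK_STREAM,
 * SOCK_DGRAM, ...) or -1 if fd is not a socket at all.
 * getsockopt() fails on a non-socket descriptor, which is what
 * lets us tell "started by hand" from "started by a super-server".
 */
int inherited_socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;

    assert(fd >= 0);
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;              /* not a socket (or not a valid fd) */
    return type;
}

/*
 * Hypothetical helper: nonzero if stdin is a socket, i.e. the process
 * was most likely launched by inetd/xinetd and should serve requests
 * on descriptor 0 instead of creating and registering its own socket.
 */
int started_by_superserver(void)
{
    return inherited_socket_type(STDIN_FILENO) != -1;
}
```

A server that finds a socket on descriptor 0 would then pass that descriptor to the RPC library (for example svctcp_create(0, 0, 0)) and call svc_register() with a protocol argument of 0, which skips the portmap registration and so leaves xinetd's own registration in place. When launched by hand with no socket on stdin, it can fall back to creating its own socket and registering with portmap as usual.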