781880 – ypserv on F16 doesn't work with ypbind client in broadcast mode

Status:	CLOSED DUPLICATE of bug 869365
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	libtirpc
Version:	16
Hardware:	x86_64
OS:	Mac OS
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Assignee:	Steve Dickson
QA Contact:	Fedora Extras Quality Assurance
Duplicates (1):	732327

Reported:	2012-01-15 22:41 UTC by Stefan Krüger
Modified:	2013-02-21 22:15 UTC
CC List:	13 users
Last Closed:	2012-11-05 09:05:33 UTC
Type:	---

Attachments
working workaround (950 bytes, patch) 2012-04-24 14:34 UTC, Honza Horak

Description Stefan Krüger 2012-01-15 22:41:40 UTC

Description of problem:
The ypserv YP server in F16 doesn't work with Mac OS X's ypbind client anymore. The same setup worked fine on F14.


Version-Release number of selected component (if applicable):
ypserv-2.26-9.fc16.x86_64
rpcbind-0.2.0-15.fc16.x86_64

How reproducible:
Worked fine in F14, doesn't work anymore in F16. The iptables firewall was disabled just in case.


Steps to Reproduce:
1. Install + configure ypserv on F16
2. Configure YP client on Mac OS X
3. Try ypwhich/ypcat on OS X, notice nothing works as expected
  
Actual results:
osx $ ypwhich
...


Expected results:
osx $ ypwhich
my.yp.lan


Additional info:
I started several daemons in debug mode:

osx $ rpcinfo -p f16srv
   program vers proto   port
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100004    2   udp    800  ypserv
    100004    1   udp    800  ypserv
    100004    2   tcp    803  ypserv
    100004    1   tcp    803  ypserv

Looks good so far...

f16srv # rpcbind -d
local: 0 lookup routines :
rpcbind : my address is (null)
FUNCTION rbllist_addAdd the prog 100000 vers 3 to the rpcbind list
FUNCTION rbllist_addAdd the prog 100000 vers 4 to the rpcbind list
check binding for local
udp: 0 lookup routines :
rpcbind : my address is 0.0.0.0.0.111
FUNCTION rbllist_addAdd the prog 100000 vers 2 to the rpcbind list
FUNCTION rbllist_addAdd the prog 100000 vers 3 to the rpcbind list
FUNCTION rbllist_addAdd the prog 100000 vers 4 to the rpcbind list
check binding for udp
rmtcall fd for udp is 7
tcp: 0 lookup routines :
rpcbind : my address is 0.0.0.0.0.111
FUNCTION rbllist_addAdd the prog 100000 vers 2 to the rpcbind list
FUNCTION rbllist_addAdd the prog 100000 vers 3 to the rpcbind list
FUNCTION rbllist_addAdd the prog 100000 vers 4 to the rpcbind list
check binding for tcp
udp6: 0 lookup routines :
rpcbind : my address is ::.0.111
FUNCTION rbllist_addAdd the prog 100000 vers 3 to the rpcbind list
FUNCTION rbllist_addAdd the prog 100000 vers 4 to the rpcbind list
check binding for udp6
rmtcall fd for udp6 is 10
tcp6: 0 lookup routines :
rpcbind : my address is ::.0.111
FUNCTION rbllist_addAdd the prog 100000 vers 3 to the rpcbind list
FUNCTION rbllist_addAdd the prog 100000 vers 4 to the rpcbind list
check binding for tcp6
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_UNSET request for (100004, 2) : Checking caller's adress (port = 804)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_UNSET request for (100004, 1) : Checking caller's adress (port = 805)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
pmap_rmtcall callit req for (100004, 2, 2, udp) from 192.168.0.112.221.46 : not found
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_UNSET request for (100004, 2) : Checking caller's adress (port = 798)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_UNSET request for (100004, 1) : Checking caller's adress (port = 799)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_SET request for (100004, 2) : Checking caller's adress (port = 801)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_SET request for (100004, 1) : Checking caller's adress (port = 802)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_SET request for (100004, 2) : Checking caller's adress (port = 804)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
PMAP_SET request for (100004, 1) : Checking caller's adress (port = 805)
succeeded
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 6 >
pmap_rmtcall callit req for (100004, 2, 2, udp) from 192.168.0.112.221.46 : found at uaddr 0.0.0.0.3.32
...

hm?
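
A note for reading the rpcbind output above: in an IPv4 universal address (uaddr) the last two dotted fields are the high and low byte of the port, so `0.0.0.0.3.32` means port 3*256+32 = 800, which is the ypserv UDP port from the rpcinfo listing. A minimal sketch of that decoding follows; the helper name `parse_uaddr` is made up for illustration and is not part of rpcbind.

```python
def parse_uaddr(uaddr):
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected an IPv4 uaddr with six dotted fields: %r" % uaddr)
    host = ".".join(parts[:4])
    hi, lo = int(parts[4]), int(parts[5])
    # The port is transmitted as two decimal bytes: high byte first, then low byte.
    return host, hi * 256 + lo


if __name__ == "__main__":
    # Addresses taken from the rpcbind -d output in this report.
    for u in ("0.0.0.0.3.32", "0.0.0.0.0.111", "192.168.0.112.221.46"):
        print(u, "->", parse_uaddr(u))
```

So the second callit request was resolved to the registered ypserv port, and the caller 192.168.0.112 was using source port 56622.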

f16srv # ypserv -d
Find securenet: 255.255.255.0 192.168.0.0
Find securenet: 255.255.255.255 127.0.0.1
ypserv.conf: files: 30
ypserv.conf: xfr_check_port: 1
ypserv.conf: 0.0.0.0/0.0.0.0:*:shadow.byname:2
ypserv.conf: 0.0.0.0/0.0.0.0:*:passwd.adjunct.byname:2
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:797]
connect from 127.0.0.1
	-> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:797]
connect from 127.0.0.1
	-> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:797]
connect from 127.0.0.1
	-> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:797]
connect from 127.0.0.1
	-> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:797]
connect from 127.0.0.1
	-> OK.
...

Why is ypserv seeing requests from localhost??! There's no ypbind running on f16srv.

osx # ypbind -d
ypbind: ypbindproc_domain_2 my.yp.lan
ypbind: dead domain my.yp.lan

Comment 1 Stefan Krüger 2012-01-16 21:05:54 UTC

FreeBSD and Fedora16 clients seem to be affected as well. Can somebody confirm that ypserv on F16 actually works?

Comment 2 Honza Horak 2012-01-17 08:07:20 UTC

(In reply to comment #1)
> FreeBSD and Fedora16 clients seem to be affected as well. Can somebody confirm
> that ypserv on F16 actually works?

I'm only able to test Fedora/RHEL packages, but I see no issues. If you can elaborate a bit more on what might be wrong, I'll take a closer look.

Do you see anything suspicious in syslog when using a Fedora client (server/client side)?

Comment 3 Stefan Krüger 2012-01-17 20:13:46 UTC

I'm sorry, you're right, F16 NIS clients work fine indeed. But I still have problems with OSX and *BSD clients and I don't know why. :(

As soon as I start ypbind on OSX or *BSD I see this when running ypserv -d on F16:

ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
...

With an F16 NIS client I see connections from the 192.168.0.0/24 network (i.e. the F16 NIS client IP).

I also tried downgrading ypserv and rpcbind using F14 packages, which didn't help.

So I'm kinda lost here, as ypbind on *BSD and OSX provides no debug mechanisms.

Will try installing Solaris tomorrow and report back.

Comment 4 Stefan Krüger 2012-01-17 22:51:51 UTC

So, the Solaris 10 NIS client of course works fine with the F16 NIS server, but! The Solaris 10 NIS server also works with *BSD clients...

To sum it up:

F14 YP Server works fine
F16 YP Server works with F16, Solaris, but not with *BSD and Mac OS X clients
Solaris YP Server works fine

Maybe someone could d/l any *BSD, install it in VirtualBox and confirm that F16 ypserv doesn't work with *BSD ypbind?

Comment 5 Honza Horak 2012-01-18 13:21:24 UTC

(In reply to comment #4)
> Maybe someone could d/l any *BSD, install it in VirtualBox and confirm that F16
> ypserv doesn't work with *BSD ypbind?

It works for me in VM with Fedora 16 server and FreeBSD 9.0 client.

(In reply to comment #3)
> ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
> connect from 127.0.0.1
>         -> OK.

Sounds to me like a kind of hostname misconfiguration. Don't your hostnames conflict somehow?

Comment 6 Stefan Krüger 2012-01-18 21:01:57 UTC

Thanks for checking it out, it's really weird that this is working for you out of the box. Anyway, I tried to nail it down again and found this:

freebsd9# ypbind
freebsd9# ypwhich
ypwhich: can't yp_bind: reason: Domain not bound
freebsd9# killall ypbind
freebsd9# ypbind -S my.yp.lan,f16srv -m
freebsd9# ypwhich
f16srv
freebsd9# killall ypbind
freebsd9# ypbind -ypset; sleep 1; ypset f16srv
freebsd9# ypwhich
f16srv

ypbind's -m is:

-m      Cause ypbind to use a 'many-cast' rather than a broadcast for
        choosing a server from the restricted mode server list.  In
        many-cast mode, ypbind will transmit directly to the
        YPPROC_DOMAIN_NONACK procedure of the servers specified in the
        restricted list and bind to the server that responds the fastest.
        This mode of operation is useful for NIS clients on remote
        subnets where no local NIS servers are available.  The -m flag
        can only be used in conjunction with the -S flag above (if used
        without the -S flag, it has no effect).

ypset does this:
The ypset utility tells the ypbind(8) process on the current machine which YP server process to communicate with.

I also logged the server side:

f16srv # ypserv -d

when running 'ypbind' on freebsd9:

ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.

when running 'ypbind -S my.yp.lan,f16srv -m' on freebsd9:

ypproc_domain_nonack("my.yp.lan") [From: 192.168.0.3:795]
connect from 192.168.0.3
        -> OK.

when running 'ypbind -ypset; sleep 1; ypset f16srv' on freebsd9:

ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:1010]
connect from 127.0.0.1
        -> OK.
ypproc_domain("my.yp.lan") [From: 192.168.0.3:905]
connect from 192.168.0.3
        -> Ok.


So the question is, why does it *not* work when FreeBSD's ypbind is in 'broadcast' mode?

Comment 7 Honza Horak 2012-01-24 12:31:14 UTC

(In reply to comment #0)
> ypproc_domain_nonack("my.yp.lan") [From: 127.0.0.1:797]
> connect from 127.0.0.1
>  -> OK.
> ...
> Why is ypserv seeing requests from localhost??! There's no ypbind running on
> f16srv.

It's because ypbind sends a broadcast message to all machines on the local network, where rpcbind checks on the client's behalf whether ypserv is running. That check is the call shown above, which is why it appears to come from localhost. That's fine.

(In reply to comment #6)
> So the question is, why does it *not* work when FreeBSD's ypbind is in
> 'broadcast' mode?

I've looked into it a bit and it turned out that even the current Fedora NIS client doesn't work in broadcast mode. So I tried to downgrade rpcbind together with libtirpc and this is the result:

ypbind in broadcast mode *works* with the following older builds on the NIS server:
$ rpm -q libtirpc rpcbind
libtirpc-0.2.2-0.fc16.x86_64
rpcbind-0.2.0-11.fc16.x86_64

But ypbind in broadcast mode *doesn't work* with current builds on the NIS server:
$ rpm -q libtirpc rpcbind
libtirpc-0.2.2-1.1.fc16.x86_64
rpcbind-0.2.0-15.fc16.x86_64

Since rpcbind was only converted from SysV init to systemd, I suspect libtirpc to be the problem. There is a large patch porting changes from libtirpc-0.2.3-rc1. 

Though, current rpcbind-0.2.0-15.fc16.x86_64 doesn't work with older libtirpc-0.2.2-0.fc16.x86_64 (rpcbind segfaults at least in debug mode).

How to reproduce:
1. install, configure and run ypserv on NIS server
2. install ypbind on NIS client
3. turn off firewall on client and server
4. set domainname on NIS client according to server (domainname "mydomain")
5. run ypbind -d -broadcast


Actual results:
# ypbind -d -broadcast
...
6718: add_server() domain: mydomain, broadcast
6718: do_broadcast() for domain 'mydomain' is called
6718: broadcast: RPC: Timed out.
6718: leave do_broadcast() for domain 'mydomain'
...

[nis-client] $ ypwhich
ypwhich: Can't communicate with ypbind


Expected results:
...
6723: add_server() domain: mydomain, broadcast
6723: do_broadcast() for domain 'mydomain' is called
6723: Answer for domain 'mydomain' from server 'f16-x64-nis-server'
6723: leave do_broadcast() for domain 'mydomain'
...

[nis-client] $ ypwhich
f16-x64-nis-server


Additional info:
According to tcpdump output there is no UDP response from server's rpcbind to client's broadcast request when using current rpcbind-0.2.0-15.fc16.x86_64 and libtirpc-0.2.2-1.1.fc16.x86_64.
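
To probe a server without a ypbind client, the broadcast request can be approximated by hand. The following is a rough sketch, not ypbind's code: it builds a portmap v2 CALLIT request for (100004, 2, 2), i.e. YPPROC_DOMAIN_NONACK, the same triple rpcbind logs as "callit req". Using AUTH_NONE credentials is an assumption (a real client may send something else), and the function names and the example broadcast address are made up for illustration.

```python
import socket
import struct

PMAP_PROG, PMAP_VERS, PMAPPROC_CALLIT = 100000, 2, 5
YPPROG, YPVERS, YPPROC_DOMAIN_NONACK = 100004, 2, 2


def xdr_opaque(data):
    """XDR variable-length opaque/string: 4-byte length, data, zero pad to 4 bytes."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def build_callit(domain, xid=0x12345678):
    """Build a portmap v2 CALLIT request wrapping YPPROC_DOMAIN_NONACK(domain)."""
    header = struct.pack(
        ">IIIIII",
        xid,
        0,  # msg_type: CALL
        2,  # ONC RPC protocol version
        PMAP_PROG, PMAP_VERS, PMAPPROC_CALLIT,
    )
    auth_none = struct.pack(">II", 0, 0)  # assumed: AUTH_NONE, zero-length body
    inner_args = xdr_opaque(domain.encode("ascii"))  # the domain name string
    call_args = struct.pack(">III", YPPROG, YPVERS, YPPROC_DOMAIN_NONACK)
    # credentials, verifier, then (prog, vers, proc, opaque args)
    return header + auth_none + auth_none + call_args + xdr_opaque(inner_args)


def broadcast_callit(domain, bcast_addr, timeout=3.0):
    """Send the request to UDP port 111 and return (peer, data) or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.settimeout(timeout)
        s.sendto(build_callit(domain), (bcast_addr, 111))
        try:
            data, peer = s.recvfrom(4096)
            return peer, data
        except socket.timeout:
            return None
```

Calling `broadcast_callit("mydomain", "192.168.122.255")` (the address is only an example) should return a reply from a server whose rpcbind answers broadcasts, and `None` when it stays silent as described above.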

Comment 8 Stefan Krüger 2012-03-03 10:06:06 UTC

Any updates on this one?

Comment 9 Joshua Weage 2012-03-05 19:22:15 UTC

I've just run into the same problem on CentOS 6.2 (RHEL 6.2).

Broadcast connections from a CentOS 5 client do not work, but when I specify the server name in /etc/yp.conf, ypbind binds to the NIS domain.

rpcbind-0.2.0-8.el6
libtirpc-0.2.1-5.el6
ypserv-2.19-22.el6

I don't see any bugs reported for this against RHEL 6 or CentOS 6.

Comment 10 dave thom 2012-04-11 09:07:20 UTC

Hi all,
we have just migrated our NIS server to CentOS 6.2 (from 5.7), and now none of our Mac OS X clients that did NIS logins can do so any more. I'm in the process of installing BSD on a real (not virtual) machine to see if there is the same behaviour.

Comment 11 dave thom 2012-04-12 14:20:02 UTC

Ok, it would seem FreeBSD is the same. We have two domains, one on CentOS 5.7 and one on 6.2.
The NIS config for domain1 works, the config for domain2 doesn't; all other clients apart from Mac OS X and FreeBSD work.

Comment 12 Honza Horak 2012-04-23 15:06:05 UTC

On 04/23/2012 01:28 PM, Steve Dickson wrote:
> I have a feeling broadcast mode has been broken a long time... I seem to
> remember that when I maintained the code, broadcasts didn't work...

Last working builds I found are:
libtirpc-0.2.2-1.1.fc16.x86_64
rpcbind-0.2.0-15.fc16.x86_64
...so it seems it last worked almost a year ago.

> Does turning on debug for rpcbind (-d) show the broadcast reaching rpcbind?
> I just took a quick look at the code and it's not clear if rpcbind is
> listening for broadcasts or not...

I think it reaches rpcbind, but rpcbind doesn't send a response. This is the debug output of rpcbind related to the ypbind broadcast request:

poll returned read fds < 6 >
pmap_rmtcall callit req for (100004, 2, 2, udp) from 192.168.122.42.133.220 : found at uaddr 0.0.0.0.3.121
addrmerge(caller, 0.0.0.0.3.121, NULL, udp
addrmerge: hint 127.0.0.1.0.111
addrmerge: returning 127.0.0.1.3.121
addrmerge(caller, 0.0.0.0.3.121, NULL, udp
addrmerge: hint 192.168.122.42.133.220
addrmerge: returning 192.168.122.223.3.121
merged uaddr 192.168.122.223.3.121
rpcbproc_callit_com:  original XID 62c39783, new XID e55bc6c0
svc_maxfd now 11
polling for read on fd < 5 6 7 8 9 10 11 >
poll returned read fds < 7 >
my_svc_run:  polled on forwarding fd 7, netid udp - calling handle_reply
handle_reply:  reply xid: -446970176 fi addr: 0x7f6977fdae00
polling for read on fd < 5 6 7 8 9 10 11 >
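
One detail that helps when reading this trace: "new XID" is printed in hex, while "reply xid" is printed as a signed decimal. Reinterpreting 0xe55bc6c0 as a signed 32-bit integer gives -446970176, so the reply seen on forwarding fd 7 does belong to the call rpcbind forwarded to ypserv. A small sketch of the conversion (the helper name is made up):

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


new_xid = 0xE55BC6C0  # "new XID e55bc6c0" from rpcbproc_callit_com
print(as_signed32(new_xid))  # -446970176, the value logged by handle_reply
```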

Comment 13 Steve Dickson 2012-04-23 17:43:51 UTC

(In reply to comment #12)
> On 04/23/2012 01:28 PM, Steve Dickson wrote:
> > I have a feeling broadcast mode has been broken a long time... I seem to
> > remember that when I maintained the code, broadcasts didn't work...
> 
> Last working builds I found are:
> libtirpc-0.2.2-1.1.fc16.x86_64
> rpcbind-0.2.0-15.fc16.x86_64
These are the builds that are currently in f16, at least on my updated box...
$ rpm -q libtirpc rpcbind
libtirpc-0.2.2-1.1.fc16.x86_64
rpcbind-0.2.0-15.fc16.x86_64

But there has been some churn in both packages
libtirpc-0.2.2-0.fc16 to libtirpc-0.2.2-1.1.fc16
(http://koji.fedoraproject.org/koji/buildinfo?buildID=254334)

rpcbind-0.2.0-11.fc16 - rpcbind-0.2.0-15.fc16
(http://koji.fedoraproject.org/koji/buildinfo?buildID=263222)

So something definitely could have broken... Would you mind looking
back to see which version things did work in?

> ...so it seems it last worked almost a year ago.
> 
> > Does turning on debug for rpcbind (-d) show the broadcast reaching rpcbind?
> > I just took a quick look at the code and it's not clear if rpcbind is
> > listening for broadcasts or not...
> 
> I think it reaches rpcbind, but rpcbind doesn't send a response. This is the debug
> output of rpcbind related to the ypbind broadcast request:
> 
> poll returned read fds < 6 >
> pmap_rmtcall callit req for (100004, 2, 2, udp) from 192.168.122.42.133.220 :
This means the call got there...

> found at uaddr 0.0.0.0.3.121
This means "something" was found.

> addrmerge(caller, 0.0.0.0.3.121, NULL, udp
addrmerge finds a server address that can be used by `caller' to contact
the local service specified by `serv_uaddr' (0.0.0.0.3.121)

> addrmerge: hint 127.0.0.1.0.111
> addrmerge: returning 127.0.0.1.3.121
> addrmerge(caller, 0.0.0.0.3.121, NULL, udp
> addrmerge: hint 192.168.122.42.133.220
> addrmerge: returning 192.168.122.223.3.121
> merged uaddr 192.168.122.223.3.121
This means something was found. Is the 192.168.122.223 IP meaningful?

> rpcbproc_callit_com:  original XID 62c39783, new XID e55bc6c0
This means a call to 192.168.122.223 is being set up
> svc_maxfd now 11
The lack of errors at this point means the call was probably successful 

> polling for read on fd < 5 6 7 8 9 10 11 >
> poll returned read fds < 7 >
This means another call came in...

> my_svc_run:  polled on forwarding fd 7, netid udp - calling handle_reply
This means it's a reply to a previous call...

> handle_reply:  reply xid: -446970176 fi addr: 0x7f6977fdae00
This means the reply was found and the tirpc routine svc_sendreply()
was called... unfortunately the return value of svc_sendreply() is
not checked...

> polling for read on fd < 5 6 7 8 9 10 11 >
This means rpcbind is waiting for another message...

At least from the rpcbind standpoint, the message was received and
sent....

Just curious whether a network trace of 'tshark host 192.168.122.223'
shows any traffic (assuming 192.168.122.223 is meaningful)...

Comment 14 Honza Horak 2012-04-24 08:12:27 UTC

(In reply to comment #13)
> (In reply to comment #12)
> > Last working builds I found are:
> > libtirpc-0.2.2-1.1.fc16.x86_64
> > rpcbind-0.2.0-15.fc16.x86_64
> These are the builds that are currently in f16, at least on my updated box...
> $ rpm -q libtirpc rpcbind
> libtirpc-0.2.2-1.1.fc16.x86_64
> rpcbind-0.2.0-15.fc16.x86_64
> 
> But there has been some churn in both packages
> libtirpc-0.2.2-0.fc16 to libtirpc-0.2.2-1.1.fc16
> (http://koji.fedoraproject.org/koji/buildinfo?buildID=254334)
> 
> rpcbind-0.2.0-11.fc16 - rpcbind-0.2.0-15.fc16
> (http://koji.fedoraproject.org/koji/buildinfo?buildID=263222)
> 
> So something definitely could have broken... Would you mind looking
> back to see which version things did work in?

Sorry, I made a mistake. The last working builds I found were:
$ rpm -q libtirpc rpcbind
libtirpc-0.2.2-0.fc16.x86_64
rpcbind-0.2.0-11.fc16.x86_64
...so generally the builds before the churn you mentioned.

> > addrmerge: hint 127.0.0.1.0.111
> > addrmerge: returning 127.0.0.1.3.121
> > addrmerge(caller, 0.0.0.0.3.121, NULL, udp
> > addrmerge: hint 192.168.122.42.133.220
> > addrmerge: returning 192.168.122.223.3.121
> > merged uaddr 192.168.122.223.3.121
> This means something was found. Is the 192.168.122.223 IP meaningful?

192.168.122.223 is the server where rpcbind + ypserv are running, 192.168.122.42 is a client, where ypbind -broadcast is running. So that seems correct to me.
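
The two addrmerge results in the log fit a simple rule: for a given caller ("hint") address, return the local interface address that sits on the same network as the caller. The sketch below only illustrates that selection rule under stated assumptions; it is not the rpcbind or libtirpc implementation, the interface list is hypothetical, and the /24 netmask is assumed from the 192.168.122.255 broadcast address.

```python
import ipaddress


def pick_local_address(caller_ip, local_interfaces):
    """Return the local address whose network contains caller_ip, or None."""
    caller = ipaddress.ip_address(caller_ip)
    for iface in local_interfaces:
        candidate = ipaddress.ip_interface(iface)
        if caller in candidate.network:
            return str(candidate.ip)
    return None


# Hypothetical interface list for the server; the /24 netmask is an assumption.
interfaces = ["127.0.0.1/8", "192.168.122.223/24"]
print(pick_local_address("192.168.122.42", interfaces))  # 192.168.122.223
print(pick_local_address("127.0.0.1", interfaces))       # 127.0.0.1
```

With the port bytes 3.121 appended, these give the two "returning" lines from the rpcbind debug output: 127.0.0.1.3.121 for the loopback hint and 192.168.122.223.3.121 for the client.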

> Just curious whether a network trace of 'tshark host 192.168.122.223'
> shows any traffic (assuming 192.168.122.223 is meaningful)...

Using current builds tshark shows only the following incoming traffic:
  0.000000 192.168.122.42 -> 192.168.122.255 Portmap 162 V2 CALLIT Call


After I downgraded to libtirpc-0.2.2-0.fc16.x86_64 and rpcbind-0.2.0-11.fc16.x86_64, tshark shows:

  6.297871 192.168.122.42 -> 192.168.122.255 Portmap 162 V2 CALLIT Call
  6.298537 192.168.122.223 -> 192.168.122.42 UDP 78 Source port: multiling-http  Destination port: 54231
  6.304615 192.168.122.42 -> 192.168.122.255 Portmap 162 V2 CALLIT Call
  6.305191 192.168.122.223 -> 192.168.122.42 UDP 78 Source port: multiling-http  Destination port: 49289

Comment 15 Honza Horak 2012-04-24 14:34:04 UTC

Created attachment 579877 [details]
working workaround

I've played a bit with rpcbind and libtirpc and found some interesting things:

First, the current rpcbind-0.2.0-15.fc16 from koji segfaults with the older libtirpc-0.2.2-0.fc16 or libtirpc-0.2.1-6.fc15. But it works fine if I rebuild the same rpcbind against the same older libtirpc on my own.

There is no change in soname, but I'd say there should be, since the segfault indicates some ABI incompatibility to me. Consider some minor soname bump, please.

Second, I tried to debug the communication and found the problem was really in svc_sendreply, which in turn called svc_dg_reply. Some changes related to authentication were made in svc_dg_reply, so I reverted some changes and the attached patch is a minimum change, which works fine.

I won't do more investigating, since I don't understand the authentication stuff at all. Please, take a look at the changes in svc_dg_reply made from libtirpc-0.2.1-6 until now.

Comment 16 Steve Dickson 2012-04-24 14:47:26 UTC

Thank you for your excellent debugging! I'm travelling today but I will get back to this asap...

Comment 17 Philippe Troin 2012-06-02 05:55:13 UTC

I confirm that the patch attachment 579877 [details] fixes all problems with (rpcbind-mediated) RPC broadcasts.

I have made some fixed packages available here:

http://rpm.fifi.org/f17-fifi/i386/libtirpc-0.2.2-2.1.0.0.1.fif17.i686.rpm
http://rpm.fifi.org/f17-fifi/i386/libtirpc-devel-0.2.2-2.1.0.0.1.fif17.i686.rpm
http://rpm.fifi.org/f17-fifi/x86_64/libtirpc-0.2.2-2.1.0.0.1.fif17.x86_64.rpm
http://rpm.fifi.org/f17-fifi/x86_64/libtirpc-devel-0.2.2-2.1.0.0.1.fif17.x86_64.rpm

Phil.

Comment 18 Honza Horak 2012-06-26 14:48:54 UTC

*** Bug 732327 has been marked as a duplicate of this bug. ***

Comment 19 Suzuki Takashi 2012-07-05 10:55:29 UTC

The same patch could be applied to libtirpc-0.2.1-5.el6 and it worked for me on RHEL 6.3 x86_64.

Comment 20 Chris Tracy 2012-09-09 22:02:52 UTC

Encountered this issue just now migrating our CentOS 5 NIS servers to CentOS 6.  All our OS X clients were unable to bind to NIS.  With the patch in comment 17 applied to libtirpc-0.2.1-5.el6, everything appears to be resolved.

Comment 21 Elliott Forney 2012-10-04 19:15:28 UTC

Any hope of getting this patch pushed to RHEL 6 and Fedora 16,17,18?  Our OS X clients are hopelessly broken and we are hoping it will fix rup and rusers as described in bug 732327.

Comment 22 Alan Johnson 2012-11-02 15:14:59 UTC

It has been a month since Elliott asked, so I'll nudge again: Any hope of getting this patch pushed to RHEL 6?

Comment 23 Honza Horak 2012-11-05 09:01:05 UTC

(In reply to comment #22)
> It has been a month since Elliott asked, so I'll nudge again: Any hope of
> getting this patch pushed to RHEL 6?

Fortunately yes, this issue should be fixed by bug #864056, which is going to be fixed in RHEL-6.4.

Comment 24 Honza Horak 2012-11-05 09:05:33 UTC

And there are also updates for Fedora:
https://admin.fedoraproject.org/updates/FEDORA-2012-16150/rpcbind-0.2.0-19.fc17

This should be fixed by bug #869365 for Fedora, so I'm closing this as a duplicate. Feel free to re-open if I'm mistaken.

*** This bug has been marked as a duplicate of bug 869365 ***
