Bug 132035

Summary: rpcgen -h hangs
Product: [Fedora] Fedora
Reporter: mike eisler <mike>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED UPSTREAM
Severity: medium
Priority: medium
Version: 2
CC: drepper
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-09-28 02:59:41 UTC

Description mike eisler 2004-09-07 23:51:29 UTC
Description of problem:

Running rpcgen -h on the NFSv4 .x file (available from
www.nfsv4.org) hangs.

Steps to Reproduce:
1. rpcgen -h nfs4_prot.x > n.h
  
Actual results:

rpcgen hangs, leaving only partial output in n.h:

[mre@iorich s]$ cat n.h
/*
 * Please do not edit this file.
 * It was generated using rpcgen.
 */

#ifndef _NFS4_PROT_H_RPCGEN
#define _NFS4_PROT_H_RPCGEN

#include <rpc/rpc.h>


#ifdef __cplusplus
extern "C" {
#endif



Expected results:

rpcgen should run to completion.

Additional info:

I suspect there might be an interaction between how rpcgen uses
pipes and Linux 2.6's implementation. Here is some raw
data followed by some analysis:

[mre@iorich s]$ ps -eaf | grep -w mre
root     24045 24011  0 16:23 ?        00:00:00 login -- mre
mre      24046 24045  0 16:23 pts/0    00:00:00 -csh
mre      24158 24046  0 16:24 pts/0    00:00:23 rpcgen -h nfs4_prot.x
mre      24159 24158  0 16:24 pts/0    00:00:00 /lib/cpp -C -DRPC_HDR nfs4_prot.x
mre      24160 24159  0 16:24 pts/0    00:00:00 /usr/lib/gcc-lib/x86_64-redhat-linux/3.3.3/cc1 -E -quiet -C -DRPC_HDR nfs4_prot.x
mre      24171 24046  0 16:24 pts/0    00:00:00 ps -eaf
mre      24172 24046  0 16:24 pts/0    00:00:00 grep -w mre
[mre@iorich s]$ strace -p 24160
Process 24160 attached - interrupt to quit
write(1, ";\nconst ACE4_WRITE_OWNER = 0x000"..., 4096 <unfinished ...>
Process 24160 detached
[mre@iorich s]$ strace -p 24158
Process 24158 attached - interrupt to quit
Process 24158 detached
[mre@iorich s]$ ls -lt /proc/24158/fd /proc/24160/fd
/proc/24160/fd:
total 5
l-wx------  1 mre gopher 64 Sep  7 16:23 5 -> pipe:[647701]
lrwx------  1 mre gopher 64 Sep  7 16:23 0 -> /dev/pts/0
l-wx------  1 mre gopher 64 Sep  7 16:23 1 -> pipe:[647701]
l-wx------  1 mre gopher 64 Sep  7 16:23 4 -> /u/mre/src/acleditor/s/n.h
lrwx------  1 mre gopher 64 Sep  7 16:22 2 -> /dev/pts/0

/proc/24158/fd:
total 5
lrwx------  1 mre gopher 64 Sep  7 16:23 0 -> /dev/pts/0
l-wx------  1 mre gopher 64 Sep  7 16:23 1 -> /u/mre/src/acleditor/s/n.h
lr-x------  1 mre gopher 64 Sep  7 16:23 3 -> pipe:[647701]
l-wx------  1 mre gopher 64 Sep  7 16:23 4 -> /u/mre/src/acleditor/s/n.h
lrwx------  1 mre gopher 64 Sep  7 16:22 2 -> /dev/pts/0

So, apparently, rpcgen is reading from a pipe
it created, and cc1 is writing to that same pipe.

If both a reader and a writer are active, why does this hang?
cc1 has two file descriptors open on the pipe.
When a process creates a pipe to communicate with a child,
the parent and the child should each close the end of the
pipe that they do not use. Clearly, this did not
happen for cc1. I suspect fixing rpcgen to do this would
fix the hang.
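
The pipe convention described above can be sketched in isolation. This is a minimal illustration, not rpcgen's code: read_from_child() is a hypothetical helper that forks a child, has the child write a short message into a pipe, and reads it back in the parent. Each side closes the pipe end it does not use; if the parent kept its write end open, its read() loop would never see end-of-file.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of rpcgen): fork a child that writes `msg`
 * into a pipe, then read it back in the parent.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_from_child(const char *msg, char *buf, size_t buflen)
{
    int fds[2];
    pid_t pid;
    size_t total = 0;
    ssize_t n;
    int status;

    assert(msg != NULL && buf != NULL && buflen > 0);

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: it only writes, so close the read end first. */
        size_t len = strlen(msg);
        ssize_t w;

        close(fds[0]);
        w = write(fds[1], msg, len);
        close(fds[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* Parent: it only reads, so close the write end. While any write
     * end stays open, read() below cannot return 0 (end-of-file). */
    close(fds[1]);

    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);

    /* Reap the child so it does not linger as a zombie. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (ssize_t)total;
}
```

Calling read_from_child("hello", buf, sizeof buf) with a 64-byte buffer should fill buf with "hello" and return 5.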

Comment 1 mike eisler 2004-09-08 22:21:01 UTC
The issue with pipes was a red herring. Fixing the pipe handling by 
closing the redundant descriptor didn't change anything.

Debugging with gdb, we see that the problem is an infinite loop in
isvectordef():


int
isvectordef (const char *type, relation rel)
{
  definition *def;

  for (;;)
    {
      switch (rel)
        {
        case REL_VECTOR:
          return !streq (type, "string");
        case REL_ARRAY:
          return 0;
        case REL_POINTER:
          return 0;
        case REL_ALIAS:
          def = findval (defined, type, typedefed);
          if (def == NULL)
            {
              return 0;
            }

          type = def->def.ty.old_type;
          rel = def->def.ty.rel;
        }
    }
}

This loop never returns because of these two typedefs in
the NFSv4 .x file:

typedef hyper int64_t;
typedef unsigned hyper uint64_t;

The problem is that, unlike the Solaris rpcgen,
the GNU rpcgen uses int64_t and uint64_t as the
native C types for hyper (Solaris uses
longlong_t and ulonglong_t). Thus findval (defined, "int64_t", typedefed)
always returns a definition whose def->def.ty.old_type is equal to
type.

One fix would be to change the native types to "long long",
but browsing the header files, apparently there is a concern
that some compilers don't have "long long" support.

Instead, I was able to cure the infinite loop by adding this after
the switch (){} block:

#if 1
          if (strcmp(type, def->def.ty.old_type) == 0) {
             return 0;
          }
#endif
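
The self-referential alias can be reproduced outside rpcgen with a small model. The sketch below is an assumption-laden stand-in, not glibc's code: alias_def, find_alias(), is_vector_def() and sample_defs are hypothetical names, and the table lookup replaces rpcgen's findval(). It shows the same loop with a guard that stops when a typedef's old type has the same name as the typedef itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for rpcgen's types (assumed layout, not the real one). */
typedef enum { REL_VECTOR, REL_ARRAY, REL_POINTER, REL_ALIAS } relation;

typedef struct {
    const char *name;      /* name introduced by the typedef */
    const char *old_type;  /* type it was defined from */
    relation rel;
} alias_def;

/* "typedef hyper int64_t;" after hyper has been mapped to int64_t:
 * the typedef ends up aliasing its own name. */
const alias_def sample_defs[] = {
    { "int64_t",    "int64_t",    REL_ALIAS  },
    { "opaque_vec", "opaque",     REL_VECTOR },
    { "my_vec",     "opaque_vec", REL_ALIAS  },
};
#define SAMPLE_COUNT (sizeof sample_defs / sizeof sample_defs[0])

/* Hypothetical replacement for findval(): linear search by name. */
static const alias_def *find_alias(const alias_def *tab, size_t n,
                                   const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Same shape as isvectordef(), plus a guard against self-aliases. */
int is_vector_def(const alias_def *tab, size_t n,
                  const char *type, relation rel)
{
    const alias_def *def;

    assert(tab != NULL || n == 0);
    for (;;) {
        switch (rel) {
        case REL_VECTOR:
            return strcmp(type, "string") != 0;
        case REL_ARRAY:
        case REL_POINTER:
            return 0;
        case REL_ALIAS:
            def = find_alias(tab, n, type);
            if (def == NULL)
                return 0;
            /* Guard: without this check, an alias that resolves to its
             * own name would keep the loop spinning forever. */
            if (strcmp(type, def->old_type) == 0)
                return 0;
            type = def->old_type;
            rel = def->rel;
            break;
        }
    }
}
```

With the sample table, is_vector_def(sample_defs, SAMPLE_COUNT, "int64_t", REL_ALIAS) returns 0 instead of looping, while "my_vec" still resolves through "opaque_vec" to a vector. The guard only catches a typedef that aliases itself directly; a longer cycle (a aliases b, b aliases a) would need a visited set or an iteration limit.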

I also changed print_datadef() to suppress the emission
of the function prototype for xdr_int64_t() since that is
already in /usr/include/rpc/xdr_int64_t:

  if (def->def_kind != DEF_PROGRAM && def->def_kind != DEF_CONST)
    {

#if 1
     if (def->def_kind == DEF_TYPEDEF && strcmp(def->def_name,
        def->def.ty.old_type) == 0) {
        return;
     }
#endif

      storexdrfuncdecl(def->def_name,
                       def->def_kind != DEF_TYPEDEF ||
                       !isvectordef(def->def.ty.old_type,
                                    def->def.ty.rel));
    }

I haven't put this fix through all its paces, such as trying
out the -a option. In any case, the analysis does
suggest a workaround: do not use .x files
that define a typedef that conflicts with a "core" rpcgen type.

Comment 2 Ulrich Drepper 2004-09-28 02:59:41 UTC
I've fixed this differently.  The changes are in upstream CVS and will
probably be in the next FC3 glibc binary.