Bug 156397

Summary: LTC13414-32-bit ping6 on 64-bit kernel not working
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Issue Tracker <tao>
Assignee: David Woodhouse <dwmw2>
QA Contact: Brian Brock <bbrock>
CC: davem, dhowells, jparadis, petrides, tao
Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Last Closed: 2006-03-15 15:57:43 UTC
Bug Blocks: 168424
Attachments: Patch. (flags: none)

Description Issue Tracker 2005-04-29 18:59:35 UTC
Escalated to Bugzilla from IssueTracker

Comment 9 David Howells 2005-10-13 14:40:20 UTC
Okay... I borrowed an x86_64 machine that had RHEL-3 installed. The ping6
installed by default is 64-bit and works. If I put a 32-bit i386 ping6 on
there, it doesn't work, just as with ppc32/ppc64.

The bug may still be in the arch 32->64 bit translation, since that code is
mostly the same for both archs.

Comment 10 David Howells 2005-10-13 16:29:41 UTC
I used gdb to examine the parameters supplied to recvmsg() in userspace 
[strace won't show them unless the syscall returns successfully]: 
 
Breakpoint 1, 0x0ff4b8ec in recvmsg () from /lib/tls/libc.so.6 
(gdb) i r $r3 $r4 $r5 
r3             0x6              6               [arg 0: sockfd] 
r4             0xffffd618       4294956568      [arg 1: msg] 
r5             0x0              0               [arg 2: flags] 
 
(gdb) x/7 $r4   [struct msghdr *msg] 
0xffffd618:     0xffffc598  [msg_name] 
                0x00000080  [msg_namelen == 128] 
                0xffffd640  [msg_iov] 
                0x00000001  [msg_iovlen] 
0xffffd628:     0xffffc618  [msg_control] 
                0x00001000  [msg_controllen == 4096] 
                0x00000000  [msg_flags] 
 
(gdb) x/2 0xffffd640   [struct iovec *msg->msg_iov]
0xffffd640:     0x1003a008  [iov_base] 
                0x00001070  [iov_len == 4208] 
 
(gdb) fini 
Run till exit from #0  0x0ff4b8ec in recvmsg () from /lib/tls/libc.so.6 
0x10003b14 in ?? () 
 
The parameters here look reasonable, and in any case, the syscall isn't 
returning EINVAL or EFAULT. 
 
I instrumented sys_recvmsg32() in the ppc64 kernel: 
 
+ 
+               printk("recvmsg(,{%p,%d,%p,%lu,%p,%lu,%x},,,)\n", 
+                      kern_msg.msg_name, kern_msg.msg_namelen, 
+                      kern_msg.msg_iov, (unsigned long) kern_msg.msg_iovlen, 
+                      kern_msg.msg_control,
+                      (unsigned long) kern_msg.msg_controllen,
+                      kern_msg.msg_flags); 
+               printk("iov[0] = {%p,%lu}\n", 
+                      kern_msg.msg_iov[0].iov_base, 
+                      kern_msg.msg_iov[0].iov_len); 
+               printk("recvmsg(,,%d,%x,)\n", total_len, user_flags); 
+ 
                err = sock->ops->recvmsg(sock, &kern_msg, total_len, 
                                         user_flags, &scm); 
+ 
+               printk("recvmsg() = %d\n", err); 
+ 
 
This gave results that look exactly like the userspace values, except where
the addressed objects have been copied into kernelspace:
 
recvmsg(,{c00000000e0afb40,128,c00000000e0afa80,1,00000000ffffc618,4096,0},,,) 
iov[0] = {000000001003a008,4208} 
recvmsg(,,4208,0,) 
recvmsg() = -11 [EAGAIN] 
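
For reference, the seven words gdb dumped and the pointers the kernel printed can be related with a small sketch. The struct below is a hypothetical mirror of the 32-bit msghdr layout (the names msghdr32_sketch, widen_user_ptr and dumped_msghdr are made up for illustration, not taken from the kernel source); it only assumes that every field is 4 bytes wide, as the x/7 dump suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the 32-bit userspace msghdr seen in the gdb dump:
 * seven 4-byte fields, with pointers held as plain 32-bit values. */
struct msghdr32_sketch {
    uint32_t msg_name;       /* 0xffffc598 in the dump */
    int32_t  msg_namelen;    /* 128 */
    uint32_t msg_iov;        /* 0xffffd640 */
    uint32_t msg_iovlen;     /* 1 */
    uint32_t msg_control;    /* 0xffffc618 */
    uint32_t msg_controllen; /* 4096 */
    uint32_t msg_flags;      /* 0 */
};

/* A 32-bit user pointer is zero-extended when widened for a 64-bit kernel. */
uint64_t widen_user_ptr(uint32_t p)
{
    return (uint64_t)p;
}

/* Fill the sketch with the values from the gdb session above. */
struct msghdr32_sketch dumped_msghdr(void)
{
    struct msghdr32_sketch m = {
        0xffffc598u, 128, 0xffffd640u, 1, 0xffffc618u, 0x1000u, 0
    };
    assert(sizeof m == 28); /* seven 32-bit words, matching x/7 */
    return m;
}
```

Widening msg_control this way gives 00000000ffffc618, which matches the fifth field of the kernel printk line. msg_name and msg_iov show kernel addresses there instead, because those objects were copied in.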
 

Comment 11 David Howells 2005-10-13 17:27:49 UTC
I've instrumented sys_recvmsg() too, to see how the parameters given by the
64-bit ping6 look when passed on to the protocol handler:
 
ping6: recvmsg64() 
recvmsg64(,{c00000000ea4baa8,128,c00000000ea4b9e0,1,000001ff7fffe2a0,4096,0},,,)
iov64[0] = {000000001003b010,4208} 
recvmsg64(,,4208,0,) 
recvmsg64() = 64 
64 bytes from fec0:ac10:1269:4242:20e:a6ff:fe20:4978: icmp_seq=0 ttl=64 time=0.566 ms
ping6: recvmsg64() 
recvmsg64(,{c00000000ea4baa8,128,c00000000ea4b9e0,1,000001ff7fffe2a0,4096,0},,,)
iov64[0] = {000000001003b010,4208} 
recvmsg64(,,4208,40,) 
recvmsg64() = -11 
 
Note that the first call to recvmsg() looks almost identical to the 32-bit
version, apart from the fact that it returns successfully. The second call has
an extra flag set (0x40, which is MSG_DONTWAIT), and fails with EAGAIN, but
this seems reasonable as I think it's just there to clean up extra copies of
the ping reply.
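
The two numbers in the second call can be decoded with a small sketch. The constants below are written out by hand with the values Linux uses (EAGAIN is 11, MSG_DONTWAIT is 0x40) rather than pulled from system headers, and the helper name is made up for illustration. The printk used %x for the flags, so the 40 in the log is hexadecimal.

```c
#include <assert.h>

/* Linux values, spelled out so the sketch does not depend on host headers. */
#define SKETCH_EAGAIN       11
#define SKETCH_MSG_DONTWAIT 0x40

/* Hypothetical helper: true when a recvmsg result is the ordinary
 * "nothing queued" outcome of a non-blocking receive. */
int is_empty_nonblocking_read(int ret, unsigned flags)
{
    return ret == -SKETCH_EAGAIN && (flags & SKETCH_MSG_DONTWAIT) != 0;
}
```

By this reading the second 64-bit call (flags 40, result -11) is benign, whereas the 32-bit call returned -11 with flags 0, which is the failure under investigation.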

Comment 12 Ernie Petrides 2005-10-13 21:13:25 UTC
Fixing "hardware" field.

Comment 13 David Howells 2005-10-14 09:24:50 UTC
Fixing "hardware" field back again. 

Comment 14 David Woodhouse 2005-10-19 11:29:29 UTC
The problem here seems to be that the 32-bit compatibility setsockopt() in ppc64
and x86_64 manually swaps each pair of 32-bit words in the 'struct icmp6_filter'
argument when setting a filter....

        for (i = 0; i < 8; i += 2) {
                u32 tmp = kfilter.data[i];

                kfilter.data[i] = kfilter.data[i + 1];
                kfilter.data[i + 1] = tmp;
        }

I don't quite understand why it's doing this, but just bypassing it and letting
setsockopt(SOL_ICMPV6, ICMPV6_FILTER) through unmangled appears to make ping6
work correctly. This also seems at first glance to match what the 2.6 kernel does.

It seems strange that someone added code for doing a conversion which is
entirely gratuitous though -- I need to double-check that removing it is really
the correct thing to do.
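
A minimal userspace sketch of why the swap would break ping6, under two assumptions that are not stated in this report: that ping6 installs a filter which blocks everything except echo replies (ICMPv6 type 129), and that the filter is a 256-bit map in which a set bit means "block", as on Linux. The struct and function names here are hypothetical stand-ins, not the kernel's own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct icmp6_filter: one bit per ICMPv6 type,
 * where a set bit means "block" (the Linux convention). */
struct filter_sketch {
    uint32_t data[8];
};

/* Block every ICMPv6 type, then let a single type through. */
void filter_pass_only(struct filter_sketch *f, unsigned type)
{
    assert(type < 256);
    memset(f, 0xff, sizeof(*f));
    f->data[type >> 5] &= ~(UINT32_C(1) << (type & 31));
}

/* Nonzero if a packet of this type would be delivered to the socket. */
int filter_will_pass(const struct filter_sketch *f, unsigned type)
{
    assert(type < 256);
    return (f->data[type >> 5] & (UINT32_C(1) << (type & 31))) == 0;
}

/* The pairwise word swap quoted above from the 32-bit compat path. */
void swap_word_pairs(struct filter_sketch *f)
{
    for (int i = 0; i < 8; i += 2) {
        uint32_t tmp = f->data[i];

        f->data[i] = f->data[i + 1];
        f->data[i + 1] = tmp;
    }
}
```

With type 129 passed, the cleared bit sits in word 4. After the swap it sits in word 5, so type 161 passes and type 129 is blocked, which would be consistent with recvmsg() returning EAGAIN for the 32-bit ping6.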

Comment 15 David Woodhouse 2005-10-19 11:38:54 UTC
Definitely looks like it can go. Here's the patch where the offending code was
removed from 2.6:

http://www.kernel.org/git/?p=linux/kernel/git/tglx/history.git;a=commit;h=531066f2b238b5aef235be9027fa3464f6b2d125

I'll generate a patch for 2.4 to remove the various instances of the same
conversion.


Comment 16 David Woodhouse 2005-10-19 11:48:34 UTC
Created attachment 120155
Patch.

Appears to affect only x86_64 and ppc64 (of the platforms we care about for
RHEL3).

Comment 17 Ernie Petrides 2005-10-27 02:21:09 UTC
A fix for this problem was committed to the RHEL3 U7 patch pool
this evening (in kernel version 2.4.21-37.7.EL).


Comment 23 Red Hat Bugzilla 2006-03-15 15:57:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html