From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Description of problem:
I have a cipe tunnel between two networks. Network A has a Redhat 7.3 router, Network B has a Redhat 8.0 router. Both machines are up to date on all errata as of 2002-10-21. Both machines are dual processor Pentium 2's running SMP kernels. When I scan Network B's router using nmap's UDP scan feature it causes a kernel oops, but does not entirely crash the machine. The nmap scan options are as follows:

nmap -sU -p 28007-28007 -P0 <hostname>

28007 is the port cipe is listening on. It may be of interest to note that the host I am scanning from is NAT'ed to the same IP that one end of the tunnel terminates. I will attach the oops output from the kernel log to this bug. If you need any further information, I am happy to provide it.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. (Router for Network B) ifup cipcb0
2. (Any host) nmap -sU -p 28007-28007 -P0 <routerbip>
3. Watch as the oops flies :)

Actual Results:
cipe produces an oops and all cipe tunnels become unusable.

Expected Results:
cipe should either produce a warning or silently discard any errant packets.

Additional info:
I have 'options cipcb cipe_debug=0' in /etc/modules.conf
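For anyone without nmap at hand, the probe in step 2 can probably be approximated with a single UDP datagram. The Python sketch below assumes the scan probe is an empty UDP datagram sent to the cipe port; that is an assumption about nmap's UDP scan, not something confirmed in this report. `send_udp_probe` is a hypothetical helper name.

```python
import socket


def send_udp_probe(host, port=28007, payload=b""):
    """Send one UDP datagram to host:port and return the payload bytes sent.

    Assumption: an empty payload approximates the nmap -sU probe from the
    report. 28007 is the cipe port quoted in the steps to reproduce.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))
```

Usage would be `send_udp_probe("<routerbip>")` from any host that can reach the router, followed by a look at the router's kernel log.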
Here is the oops info that was generated by Network B's router:

Oct 21 22:12:43 gw kernel: Unable to handle kernel paging request at virtual address c4000000
Oct 21 22:12:43 gw kernel: printing eip:
Oct 21 22:12:43 gw kernel: c49134d0
Oct 21 22:12:43 gw kernel: *pde = 00000000
Oct 21 22:12:43 gw kernel: Oops: 0000
Oct 21 22:12:43 gw kernel: cipcb natsemi pcnet32 mii ipt_REJECT ipt_state ipt_TOS ipt_LOG ipt_limit iptab
Oct 21 22:12:43 gw kernel: CPU: 1
Oct 21 22:12:43 gw kernel: EIP: 0010:[<c49134d0>] Not tainted
Oct 21 22:12:43 gw kernel: EFLAGS: 00010207
Oct 21 22:12:43 gw kernel:
Oct 21 22:12:43 gw kernel: EIP is at crc32 [cipcb] 0x20 (2.4.18-17.8.0smp)
Oct 21 22:12:43 gw kernel: eax: 000000c7 ebx: c4917260 ecx: 006a0538 edx: 05d1d13c
Oct 21 22:12:43 gw kernel: esi: fffffff4 edi: c395fac8 ebp: c22edde0 esp: c22edd8c
Oct 21 22:12:43 gw kernel: ds: 0018 es: 0018 ss: 0018
Oct 21 22:12:43 gw kernel: Process ciped-cb (pid: 7774, stackpage=c22ed000)
Oct 21 22:12:43 gw kernel: Stack: c395fac0 c395fac8 c1eac000 c49133ff c395fac8 fffffff4 c1eac01c 00000008
Oct 21 22:12:43 gw kernel: c1263430 000091ac c3bf9032 c1ddc960 c3b2e300 c491180e c1eac000 c395fac8
Oct 21 22:12:43 gw kernel: c22edde0 00000008 c2631580 c22ede14 054cbd41 fffffff8 c3b2e300 c1eac000
Oct 21 22:12:43 gw kernel: Call Trace: [<c49133ff>] cipe_decrypt [cipcb] 0x6f (0xc22edd98))
Oct 21 22:12:43 gw kernel: [<c491180e>] cipe_decrypt_skb [cipcb] 0x1ee (0xc22eddc0))
Oct 21 22:12:43 gw kernel: [<c4911cfc>] cipe_recvmsg [cipcb] 0xcc (0xc22eddf4))
Oct 21 22:12:43 gw kernel: [<c024764e>] inet_recvmsg [kernel] 0x4e (0xc22ede28))
Oct 21 22:12:43 gw kernel: [<c01fe528>] sock_recvmsg [kernel] 0x58 (0xc22ede4c))
Oct 21 22:12:44 gw kernel: [<c01fe25c>] sockfd_lookup [kernel] 0x1c (0xc22ede80))
Oct 21 22:12:44 gw kernel: [<c01ff772>] sys_recvfrom [kernel] 0xb2 (0xc22ede94))
Oct 21 22:12:44 gw kernel: [<c0208120>] dev_ifsioc [kernel] 0x3d0 (0xc22edf20))
Oct 21 22:12:44 gw kernel: [<c01ff817>] sys_recv [kernel] 0x37 (0xc22edf64))
Oct 21 22:12:44 gw kernel: [<c01fff98>] sys_socketcall [kernel] 0x168 (0xc22edf80))
Oct 21 22:12:44 gw kernel: [<c015bdd1>] sys_ioctl [kernel] 0xc1 (0xc22edf94))
Oct 21 22:12:44 gw kernel: [<c0109447>] system_call [kernel] 0x33 (0xc22edfc0))
Oct 21 22:12:44 gw kernel:
Oct 21 22:12:44 gw kernel:
Oct 21 22:12:44 gw kernel: Code: 0f b6 04 39 41 30 d0 c1 ea 08 0f b6 c0 33 14 83 39 f1 72 ec
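A note on reading the register dump: the numbers look consistent with crc32 being handed a negative length. The Code bytes appear to decode to a byte-at-a-time CRC loop whose load reads from edi + ecx, which matches the reported fault address, and esi (the loop bound) is -12 when read as a signed 32-bit value. The Python sketch below only redoes that arithmetic and shows the kind of length check that would reject such a packet. `checked_length` and the overhead value of 20 used in the example are hypothetical and chosen purely for illustration; they are not taken from the CIPE source.

```python
# Register values copied from the oops (32-bit x86).
EDI = 0xC395FAC8         # base pointer used by the faulting load
ECX = 0x006A0538         # running byte index inside the loop
ESI = 0xFFFFFFF4         # loop bound that ECX is compared against
FAULT_ADDR = 0xC4000000  # "virtual address" from the first oops line


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def fault_address(base, index):
    """Address touched by a load from base + index, wrapped to 32 bits."""
    return (base + index) & 0xFFFFFFFF


def checked_length(total_len, overhead):
    """Hypothetical guard: return the payload length, or None if the packet
    is shorter than the fixed overhead and the subtraction would go negative."""
    if total_len < overhead:
        return None
    return total_len - overhead


if __name__ == "__main__":
    print(hex(fault_address(EDI, ECX)))  # compare with FAULT_ADDR
    print(as_signed32(ESI))              # loop bound as a signed number
    print(checked_length(8, 20))         # too-short packet is rejected
```

If this reading is right, the loop walked about 6.6 MB past the start of the buffer before hitting an unmapped page, which would explain why the fault address is a round c4000000 rather than something near the packet data.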
This also works against the router running Redhat 7.3. I will attach the oops output from that machine as well.
Created attachment 81484: syslog output for both machines.
Created attachment 81584: Test patch to attempt to trap bug
Can you try to reproduce with the above kernel patch, and see if that prints 'CIPE BUG' (and other info) to the kernel log / dmesg?
Also, can you try CIPE 1.5.4? http://sites.inka.de/sites/bigred/sw/cipe-1.5.4.tar.gz It definitely seems to have useful changes in the area of this oops.
On my uniprocessor test system the patch doesn't produce either an oops or a CIPE_BUG message. I'm going to push a kernel with this patch out to one of my production systems (keep your fingers crossed) and see what happens...
OK, I tried the patch on a production system: the latest Redhat 8.0 kernel, compiled into an RPM with your patch added. Sending the same probe doesn't produce a kernel oops. However, it doesn't emit the CIPE_BUG message either. Strangely, the only diagnostic information I get is from dmesg, which emits '<3>' when the packet hits and then '<7>' when I shut down the cipe link. I will test cipe 1.5.4, but I am hesitant to use it permanently. (I would rather use your vendor-supplied binaries when possible: you have a quality assurance department, while I only have a few systems to test with. :) Any chance you could include cipe 1.5.4 in a future kernel errata?
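On the bare '<3>' and '<7>' in dmesg: these look like printk log-level prefixes with no message text after them. In the kernel's numbering, level 3 is KERN_ERR and level 7 is KERN_DEBUG, so one possible reading is that an error-level message fired when the packet arrived and a debug-level one at link shutdown, with the text itself empty or lost. The Python sketch below just splits such a prefix from a line; `split_printk` is a hypothetical helper, not part of any tool mentioned here.

```python
import re

# Kernel log levels as defined for printk (0 = most severe).
KERN_LEVELS = {
    0: "KERN_EMERG",
    1: "KERN_ALERT",
    2: "KERN_CRIT",
    3: "KERN_ERR",
    4: "KERN_WARNING",
    5: "KERN_NOTICE",
    6: "KERN_INFO",
    7: "KERN_DEBUG",
}

_PREFIX = re.compile(r"^<([0-7])>(.*)$", re.DOTALL)


def split_printk(line):
    """Return (level_name, message) for a raw printk line.

    Lines without a "<N>" prefix come back as (None, line) unchanged.
    """
    match = _PREFIX.match(line)
    if match is None:
        return None, line
    return KERN_LEVELS[int(match.group(1))], match.group(2)
```

Applied to the two strings from dmesg, this gives an error-level and a debug-level entry, each with an empty message.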
This bug has been inappropriately marked MODIFIED. Please review the bug life cycle information at http://bugzilla.redhat.com/bugzilla/bug_status.cgi Changing bug status to ASSIGNED.
Can you confirm this bug is fixed in the most recent Red Hat errata kernel, for 8.0? (or in the phoebe beta kernel, or in the Red Hat rawhide kernel...)
Tested on latest errata kernel 2.4.18-19.8.0 and it appears to be fixed. No kernel oops. Subsequent scans continue to show port in 'open' status. Packets continue to flow over the connection. Looks good! Thanks.