Bug 674622

Summary: Oops NULL pointer tcp_write_xmit
Product: [Fedora] Fedora Reporter: John Reiser <jreiser>
Component: kernelAssignee: Neil Horman <nhorman>
Status: CLOSED WORKSFORME QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, nhorman
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-04-29 14:24:08 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
kernel Oops traceback
none
patch to check pointers before assignment none

Description John Reiser 2011-02-02 16:46:23 UTC
Created attachment 476615 [details]
kernel Oops traceback

Description of problem: Kernel Oops on NULL pointer dereference from tcp_write_xmit.


Version-Release number of selected component (if applicable):
kernel-2.6.38-0.rc3.git0.1.fc15.x86_64

How reproducible:


Steps to Reproduce:
1. scp <local> <remote>
2.
3.
  
Actual results: Oops


Expected results: no Oops


Additional info:

Comment 1 Chuck Ebbert 2011-02-04 20:01:56 UTC
OOPS is at include/linux/skbuff.h:895:
static inline void __skb_insert(struct sk_buff *newsk,
                                struct sk_buff *prev, struct sk_buff *next,
                                struct sk_buff_head *list)
{
        newsk->next = next;
        newsk->prev = prev;
==>     next->prev  = prev->next = newsk;
        list->qlen++;
}

  next is NULL here


Called from include/linux/skbuff.h:991:
static inline void __skb_queue_after(struct sk_buff_head *list,
                                     struct sk_buff *prev,
                                     struct sk_buff *newsk)
{
        __skb_insert(newsk, prev, prev->next, list);
}

Called from include/net/tcp.h:1294:
static inline void tcp_insert_write_queue_after(struct sk_buff *skb,
                                                struct sk_buff *buff,
                                                struct sock *sk)
{
        __skb_queue_after(&sk->sk_write_queue, skb, buff);
}

Called from net/ipv4/tcp_output.c:tso_fragment:1515:
        /* Link BUFF into the send queue. */
        skb_header_release(buff);
==>     tcp_insert_write_queue_after(skb, buff, sk);


Called from net/ipv4/tcp_output.c:tcp_write_xmit:1784:
                if (skb->len > limit &&
==>                 unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
                        break;

Comment 2 Chuck Ebbert 2011-02-06 01:12:35 UTC
Reported upstream 02/04/2011:
http://marc.info/?l=linux-netdev&m=129685175319992&w=4

Comment 3 Chuck Ebbert 2011-02-10 19:17:43 UTC
Are you doing anything out of the ordinary with the network, like changing the qdisc, altering network adapter settings with ethtool, changing default routing options or using different / unusual iptables options?

Comment 4 John Reiser 2011-02-10 19:23:45 UTC
No, I did not touch anything that I know of, and in particular nothing mentioned in Comment #3.

Comment 5 Neil Horman 2011-04-28 19:34:16 UTC
If chucks decode of the location is right, It seems that __skb_insert is just a little bit to simple for its own good.  specifically it seems to not take into account the possibility that the skb you are inserting before or after might be at the head of tail of a list (implying that skb->next or skb->prev is null respectively).  I'll attach a patch for you to try shortly

Comment 6 Neil Horman 2011-04-28 19:35:41 UTC
Created attachment 495655 [details]
patch to check pointers before assignment

Here you go, could you please give this a try and confirm if it fixes the problem?  Thanks!

Comment 7 Neil Horman 2011-04-28 23:50:17 UTC
Actually, sorry, scrap what I just said, that list is supposed to be circular, so there shouldn't ever be NULL.  It is, however a bit odd that the skb that we insert after in tso_fragment, is never unlinked from the list its on in sk_send_head (retrieved from tcp_send_head in tcp_write_xmit).  I wonder if we're not getting an odd race there.  Feel free to try this patch, but it shouldn't be needed.  I'll have a better patch for you ASAP.

Comment 8 John Reiser 2011-04-29 02:08:45 UTC
Don't spend too much time on this one.  It only happened once, not quite three months ago.  "How reproducible" is empty (blank).  When it did happen, it was 'scp' doing a "push".  Oops is always worth of a report, but this one might well be stale.

Comment 9 Neil Horman 2011-04-29 14:24:08 UTC
Ok, copy that, I'll close it as WORKSFORME, but if you run into it again, please reopen.