Bug 2054037 - use-after-free in sctp_do_8_2_transport_strike [rhel-7.9.z]
Summary: use-after-free in sctp_do_8_2_transport_strike [rhel-7.9.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.9
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Xin Long
QA Contact: ying xu
URL:
Whiteboard:
Depends On:
Blocks: 1880027
Reported: 2022-02-14 04:03 UTC by Jonathan Maxwell
Modified: 2023-03-07 09:54 UTC (History)

Fixed In Version: kernel-3.10.0-1160.85.1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-03-07 09:54:02 UTC
Target Upstream Version:
Embargoed:




Links
System                             ID                                                Private  Priority  Status  Summary  Last Updated
Gitlab                             redhat/rhel/src/kernel rhel-7 merge_requests 595  0        None      None    None     2022-12-15 22:48:14 UTC
Red Hat Issue Tracker              RHELPLAN-112097                                   0        None      None    None     2022-02-14 04:05:20 UTC
Red Hat Knowledge Base (Solution)  6016061                                           0        None      None    None     2022-12-22 10:58:35 UTC

Description Jonathan Maxwell 2022-02-14 04:03:07 UTC
Description of problem:

A customer reported a crashed VM and uploaded a vmcore.

From sureshk's analysis:

crash> sys |grep -e RELEASE -e PANIC
     RELEASE: 3.10.0-1160.6.1.el7.x86_64
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000268"

The backtrace of the crash is:
+++
    [exception RIP: sctp_do_8_2_transport_strike+0x71]
    RIP: ffffffffc07fc991  RSP: ffff9383f9643b80  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff93837b8c2c00  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffff93837b8c2c00  RDI: ffff9383ed6e6000
    RBP: ffff9383f9643b98   R8: 0000000000000003   R9: ffff9383f9643c90
    R10: ffff938377345204  R11: 0000000000000005  R12: ffff9383ed6e6000
    R13: 0000000000000000  R14: 0000000000000003  R15: ffff9383f9643c90
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff9383f9643ba0] sctp_cmd_interpreter at ffffffffc07fe385 [sctp]
#11 [ffff9383f9643c38] sctp_do_sm at ffffffffc07fcc91 [sctp]
#12 [ffff9383f9643e10] sctp_generate_timeout_event at ffffffffc07fd305 [sctp]
#13 [ffff9383f9643e58] sctp_generate_t2_shutdown_event at ffffffffc07fd3e3 [sctp]
#14 [ffff9383f9643e68] call_timer_fn at ffffffffa7cabd58
#15 [ffff9383f9643ea0] run_timer_softirq at ffffffffa7cae1ed
#16 [ffff9383f9643f18] __do_softirq at ffffffffa7ca4b95
#17 [ffff9383f9643f88] call_softirq at ffffffffa83984ec
#18 [ffff9383f9643fa0] do_softirq at ffffffffa7c2f715
#19 [ffff9383f9643fc0] irq_exit at ffffffffa7ca4f15
#20 [ffff9383f9643fd8] smp_apic_timer_interrupt at ffffffffa8399a88
#21 [ffff9383f9643ff0] apic_timer_interrupt at ffffffffa8395fba
+++

The sctp module was trying to access the sctp association, but crashed because the association pointer was NULL.

+++
Crashed while trying to access "transport->asoc->rto_max",

and the association is NULL:

crash> struct sctp_transport.asoc ffff93837b8c2c00
  asoc = 0x0 <--- NULL
+++
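
The faulting address in the PANIC line fits this reading: a field read through a NULL struct pointer faults at the field's offset. Below is a minimal sketch in Python (ctypes), assuming rto_max sits at offset 0x268 in this build's struct sctp_association. That offset is inferred from the panic address, and DemoAsoc is a hypothetical stand-in, not the real kernel layout.

```python
import ctypes


# Hypothetical stand-in for struct sctp_association. The real layout is not
# shown in this bug; the 0x268 bytes of padding are an assumption inferred
# from the PANIC address.
class DemoAsoc(ctypes.Structure):
    _fields_ = [
        ("pad", ctypes.c_char * 0x268),  # everything that precedes rto_max
        ("rto_max", ctypes.c_ulong),
    ]


def fault_address(base_pointer, field_offset):
    """Address the CPU touches when reading base_pointer->field."""
    return base_pointer + field_offset


asoc = 0x0  # transport->asoc as read from the vmcore
print(f"{fault_address(asoc, DemoAsoc.rto_max.offset):016x}")
```

With asoc = 0 the computed address is just the field offset, which is why the oops reports a small non-zero address rather than 0.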

The transport itself is a freed slab object:

+++
crash> kmem ffff93837b8c2c00
CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
ffff9383f9007500     1024       3352      3456    108    32k  kmalloc-1024
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  fffffb6542ee3000  ffff93837b8c0000     0     32         24     8
  FREE / [ALLOCATED]
   ffff93837b8c2c00  (cpu 1 cache)
+++
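
In the kmem output the transport address is listed without brackets under "FREE / [ALLOCATED]", so the object is free and sitting in the cpu 1 cache. As a cross-check, the address can be mapped to a slot in that slab using only the numbers kmem printed. This is a sketch that assumes objects are packed back to back from the slab's MEMORY address with no per-object debug metadata; slab_slot is a hypothetical helper, not a crash command.

```python
SLAB_MEMORY = 0xFFFF93837B8C0000  # MEMORY column of the slab line
OBJ_SIZE = 1024                   # OBJSIZE column (kmalloc-1024)
OBJS_PER_SLAB = 32                # TOTAL column of the slab line
TRANSPORT = 0xFFFF93837B8C2C00    # address given to "struct sctp_transport.asoc"


def slab_slot(addr, slab_base, obj_size, objs_per_slab):
    """Return the object index of addr within the slab.

    Returns None if addr lies outside the slab or is not on an object
    boundary (assumes tightly packed objects, see the note above).
    """
    offset = addr - slab_base
    if not 0 <= offset < obj_size * objs_per_slab:
        return None
    index, remainder = divmod(offset, obj_size)
    return index if remainder == 0 else None


print(slab_slot(TRANSPORT, SLAB_MEMORY, OBJ_SIZE, OBJS_PER_SLAB))
```

Under that assumption the transport is object 11 of 32 in this slab, which matches a 1024-byte object in a 32k slab.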

Actual results:

Use-after-free crash.

Expected results:

No crash.

Additional info:

This looks very similar to:

http://lkml.iu.edu/hypermail/linux/kernel/2104.2/05811.html

But I can't see a fix for that in upstream.

Comment 3 Jonathan Maxwell 2022-02-14 04:12:23 UTC
> sctp_do_8_2_transport_strike.constprop.0+0xa27/0xab0 net/sctp/sm_sideeffect.c:531

531         if (transport->state != SCTP_INACTIVE)

So it looks like the KASAN report was a good match.

Comment 4 Jonathan Maxwell 2022-02-23 01:23:49 UTC
Hi Xin, do you have any update on this?

Regards

Jon

Comment 5 Xin Long 2022-02-23 03:51:14 UTC
(In reply to Jonathan Maxwell from comment #4)
> Hi Xin, do you have any update on this?
> 
Hi Jon, sorry for the late reply.

Yes, I think it's the same issue that KASAN reported.
The fix is already upstream:

  35b4f24415c8 sctp: do asoc update earlier in sctp_sf_do_dupcook_a

we may need this one too:

  51eac7f2f06b sctp: do asoc update earlier in sctp_sf_do_dupcook_b

Comment 6 Jonathan Maxwell 2022-02-23 03:53:51 UTC
Thanks Xin, awesome. I'll tell the customer we have a fix but it won't go into RHEL7 and report back here.

Comment 7 Sangam 2022-08-15 05:28:14 UTC
We have another customer facing the same issue. Are we going to backport the known fix to RHEL 7 z-stream?

Comment 8 Curtis Taylor 2022-08-29 17:14:06 UTC
  commit a50d19c2501493fa7d8de3385c83329f5f42f93f

    Merge: sctp: fix a use after free crash of sctp_transport structure
      
    Xin Long (3):
      sctp: do asoc update earlier in sctp_sf_do_dupcook_a
      sctp: do asoc update earlier in sctp_sf_do_dupcook_b
      Revert "sctp: Fix SHUTDOWN CTSN Ack in the peer restart case"  <------ not mentioned so far in this BZ.

[] Confirms commit fixed by these patches is in RHEL7.9
  $ git show 35b4f24415c8 | grep Fixes:  <--- linux tree
    Fixes: 145cb2f7177d ("sctp: Fix bundling of SHUTDOWN with COOKIE-ACK")
  $ git log --oneline --grep="sctp: Fix bundling of SHUTDOWN"   <---- rhel7 tree
    92504ce6d122 [net] sctp: Fix bundling of SHUTDOWN with COOKIE-ACK

[] Is the revert also needed if this is ported to rhel7.9.z?
  $ git log --oneline --grep="sctp: Fix SHUTDOWN CTSN"  <--- rhel7 tree
    9836dfeb3786 [net] sctp: Fix SHUTDOWN CTSN Ack in the peer restart case
  $ git tag --contains=9836dfeb3786 | head -2
    RHEL-7.9
    kernel-3.10.0-1144.el7
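
The two checks above can be scripted once the git output is captured as text. This is a minimal sketch, not an existing tool: fixes_tags and backport_present are hypothetical helper names, and the subject line used as the first line of the sample commit message is only illustrative. The Fixes: line and the rhel7 log line are the ones quoted above.

```python
import re

# Matches lines such as:  Fixes: 145cb2f7177d ("subject of the fixed commit")
FIXES_RE = re.compile(
    r'^\s*Fixes:\s+([0-9a-f]{7,40})\s+\("(.*)"\)\s*$', re.MULTILINE
)


def fixes_tags(commit_message):
    """Return (sha, subject) pairs for every Fixes: tag in a commit message."""
    return FIXES_RE.findall(commit_message)


def backport_present(subject, oneline_log):
    """True if any `git log --oneline` line ends with the given subject."""
    return any(
        line.strip().endswith(subject) for line in oneline_log.splitlines()
    )


upstream_message = (
    "sctp: do asoc update earlier in sctp_sf_do_dupcook_a\n\n"
    '    Fixes: 145cb2f7177d ("sctp: Fix bundling of SHUTDOWN with COOKIE-ACK")\n'
)
rhel7_log = "92504ce6d122 [net] sctp: Fix bundling of SHUTDOWN with COOKIE-ACK\n"

for sha, subject in fixes_tags(upstream_message):
    print(sha, backport_present(subject, rhel7_log))
```

Matching on the subject rather than the hash is deliberate, since a backported commit gets a new hash (and here a "[net]" prefix) in the rhel7 tree.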

Xin, would the revert be necessary for rhel7.9?

Comment 9 Xin Long 2022-08-29 18:19:42 UTC
(In reply to Curtis Taylor from comment #8)
>   commit a50d19c2501493fa7d8de3385c83329f5f42f93f
> 
>     Merge: sctp: fix a use after free crash of sctp_transport structure
>       
>     Xin Long (3):
>       sctp: do asoc update earlier in sctp_sf_do_dupcook_a
>       sctp: do asoc update earlier in sctp_sf_do_dupcook_b
>       Revert "sctp: Fix SHUTDOWN CTSN Ack in the peer restart case"  <------
> not mentioned so far in this BZ.
> 
> [] Confirms commit fixed by these patches is in RHEL7.9
>   $ git show 35b4f24415c8 | grep Fixes:  <--- linux tree
>     Fixes: 145cb2f7177d ("sctp: Fix bundling of SHUTDOWN with COOKIE-ACK")
>   $ git log --oneline --grep="sctp: Fix bundling of SHUTDOWN"   <---- rhel7
> tree
>     92504ce6d122 [net] sctp: Fix bundling of SHUTDOWN with COOKIE-ACK
> 
> [] Is the revert also needed if this is ported to rhel7.9.z?
>   $ git log --oneline --grep="sctp: Fix SHUTDOWN CTSN"  <--- rhel7 tree
>     9836dfeb3786 [net] sctp: Fix SHUTDOWN CTSN Ack in the peer restart case
>   $ git tag --contains=9836dfeb3786 | head -2
>     RHEL-7.9
>     kernel-3.10.0-1144.el7
> 
> Xin would the revert be necessary for rhel7.9?

Not really. The revert is just an improvement; there is no fix in it.

Thanks.

Comment 11 Jonathan Maxwell 2022-09-09 04:53:14 UTC
Hi Xin,

Can you please provide devel_ack so that Norm can proceed?

Regards

Jon

Comment 14 Abhishek Rawal 2022-10-12 02:25:27 UTC
We have another customer facing the same issue with kernel-3.10.0-1160.76.1.el7:

++
https://galvatron-x86.cee.redhat.com/manager/301222650
retrace-server-interact 301222650 shell
retrace-server-interact 301222650 crash
++

Would it be possible for the engineering team to share (tentatively) when the known fixes/commits will be merged into the 7.z stream?

Comment 15 Xin Long 2022-10-12 18:34:03 UTC
(In reply to Abhishek Rawal from comment #14)
> We have another customer facing the same issue with
> kernel-3.10.0-1160.76.1.el7 ;
> 
> ++
> https://galvatron-x86.cee.redhat.com/manager/301222650
> retrace-server-interact 301222650 shell
> retrace-server-interact 301222650 crash
> ++
> 
> Will it be possible, for engineering team to share us the information
> when(tentatively) the known fixes|commits will be merged in 7.z stream,
> please ?

It should be Dec 13th, the date of the GA release.

Thanks.

Comment 47 errata-xmlrpc 2023-03-07 09:54:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:1091
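
To check whether a given RHEL 7 kernel already carries the fix, its release can be compared against the "Fixed In Version" value of this bug. This is a rough sketch only: it compares the numeric fields before ".el" and is not a full RPM version comparison, and release_key and has_fix are hypothetical helper names.

```python
import re

FIXED_IN = "kernel-3.10.0-1160.85.1.el7"  # "Fixed In Version" of this bug


def release_key(nvr):
    """Turn 'kernel-3.10.0-1160.85.1.el7[.x86_64]' into a tuple of ints.

    Simplified on purpose: only the numeric fields before '.el' are used,
    so this is only meaningful within the same el7 stream.
    """
    if nvr.startswith("kernel-"):
        nvr = nvr[len("kernel-"):]
    numeric = nvr.split(".el", 1)[0]
    return tuple(int(part) for part in re.split(r"[.-]", numeric))


def has_fix(nvr):
    """True if nvr is at or after the Fixed In Version release."""
    return release_key(nvr) >= release_key(FIXED_IN)


for kernel in ("3.10.0-1160.6.1.el7.x86_64", "kernel-3.10.0-1160.76.1.el7"):
    print(kernel, has_fix(kernel))
```

Both kernels named in this bug (the original vmcore's 3.10.0-1160.6.1.el7 and the later report on kernel-3.10.0-1160.76.1.el7) compare as older than the fixed release.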

