Bug 2054037
| Summary: | use-after-free in sctp_do_8_2_transport_strike [rhel-7.9.z] | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jonathan Maxwell <jmaxwell> |
| Component: | kernel | Assignee: | Xin Long <lxin> |
| kernel sub component: | sctp | QA Contact: | ying xu <yinxu> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | ||
| Priority: | urgent | CC: | arawal, brstephe, cutaylor, dhoward, jaeshin, jiji, kjeon, kpfleming, linzhao, lxin, mleitner, mtesar, network-qe, nmurray, sababu, stanislav.moravec, sukulkar, yinxu |
| Version: | 7.9 | Keywords: | Triaged, ZStream |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | kernel-3.10.0-1160.85.1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-03-07 09:54:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1880027 | ||
> sctp_do_8_2_transport_strike.constprop.0+0xa27/0xab0 net/sctp/sm_sideeffect.c:531
531         if (transport->state != SCTP_INACTIVE)
So it looks like the KASAN report was a good match.
Hi Xin, do you have any update on this?

Regards

Jon

(In reply to Jonathan Maxwell from comment #4)
> Hi Xin, do you have any update on this?
>

Hi, Jon, sorry for the late reply.

Yes, I think it's the same one that KASAN reported. The fix is already upstream:

35b4f24415c8 sctp: do asoc update earlier in sctp_sf_do_dupcook_a

We may need this one too:

51eac7f2f06b sctp: do asoc update earlier in sctp_sf_do_dupcook_b

Thanks Xin, awesome. I'll tell the customer we have a fix but it won't go into RHEL7 and report back here.

We have another customer facing the same issue. Are we going to backport the known fix to RHEL 7 z-stream?

commit a50d19c2501493fa7d8de3385c83329f5f42f93f
Merge: sctp: fix a use after free crash of sctp_transport structure
Xin Long (3):
sctp: do asoc update earlier in sctp_sf_do_dupcook_a
sctp: do asoc update earlier in sctp_sf_do_dupcook_b
Revert "sctp: Fix SHUTDOWN CTSN Ack in the peer restart case" <------ not mentioned so far in this BZ.
[] Confirms commit fixed by these patches is in RHEL7.9
$ git show 35b4f24415c8 | grep Fixes: <--- linux tree
Fixes: 145cb2f7177d ("sctp: Fix bundling of SHUTDOWN with COOKIE-ACK")
$ git log --oneline --grep="sctp: Fix bundling of SHUTDOWN" <---- rhel7 tree
92504ce6d122 [net] sctp: Fix bundling of SHUTDOWN with COOKIE-ACK
[] Is the revert also needed if this is ported to rhel7.9.z?
$ git log --oneline --grep="sctp: Fix SHUTDOWN CTSN" <--- rhel7 tree
9836dfeb3786 [net] sctp: Fix SHUTDOWN CTSN Ack in the peer restart case
$ git tag --contains=9836dfeb3786 | head -2
RHEL-7.9
kernel-3.10.0-1144.el7
Xin would the revert be necessary for rhel7.9?
(In reply to Curtis Taylor from comment #8)
> commit a50d19c2501493fa7d8de3385c83329f5f42f93f
>
> Merge: sctp: fix a use after free crash of sctp_transport structure
>
> Xin Long (3):
> sctp: do asoc update earlier in sctp_sf_do_dupcook_a
> sctp: do asoc update earlier in sctp_sf_do_dupcook_b
> Revert "sctp: Fix SHUTDOWN CTSN Ack in the peer restart case" <------ not mentioned so far in this BZ.
>
> [] Confirms commit fixed by these patches is in RHEL7.9
> $ git show 35b4f24415c8 | grep Fixes: <--- linux tree
> Fixes: 145cb2f7177d ("sctp: Fix bundling of SHUTDOWN with COOKIE-ACK")
> $ git log --oneline --grep="sctp: Fix bundling of SHUTDOWN" <---- rhel7 tree
> 92504ce6d122 [net] sctp: Fix bundling of SHUTDOWN with COOKIE-ACK
>
> [] Is the revert also needed if this is ported to rhel7.9.z?
> $ git log --oneline --grep="sctp: Fix SHUTDOWN CTSN" <--- rhel7 tree
> 9836dfeb3786 [net] sctp: Fix SHUTDOWN CTSN Ack in the peer restart case
> $ git tag --contains=9836dfeb3786 | head -2
> RHEL-7.9
> kernel-3.10.0-1144.el7
>
> Xin would the revert be necessary for rhel7.9?

Not really, the revert is just an improvement; there is no fix in it.

Thanks.

Hi Xin,

Can you please provide devel_ack so that Norm can proceed?

Regards

Jon

We have another customer facing the same issue with kernel-3.10.0-1160.76.1.el7:

++
https://galvatron-x86.cee.redhat.com/manager/301222650
retrace-server-interact 301222650 shell
retrace-server-interact 301222650 crash
++

Would it be possible for the engineering team to share (tentatively) when the known fixes|commits will be merged into the 7.z stream, please?
(In reply to Abhishek Rawal from comment #14)
> We have another customer facing the same issue with
> kernel-3.10.0-1160.76.1.el7 ;
>
> ++
> https://galvatron-x86.cee.redhat.com/manager/301222650
> retrace-server-interact 301222650 shell
> retrace-server-interact 301222650 crash
> ++
>
> Will it be possible, for engineering team to share us the information
> when(tentatively) the known fixes|commits will be merged in 7.z stream,
> please ?

It should be Dec 13th, the date of GA release.

Thanks.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: kernel security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:1091
Description of problem:

A customer reported a crashed VM and uploaded a vmcore.

From sureshk's analysis:

crash> sys |grep -e RELEASE -e PANIC
     RELEASE: 3.10.0-1160.6.1.el7.x86_64
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000268"

The backtrace of the crash is:

+++
    [exception RIP: sctp_do_8_2_transport_strike+0x71]
    RIP: ffffffffc07fc991  RSP: ffff9383f9643b80  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff93837b8c2c00  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffff93837b8c2c00  RDI: ffff9383ed6e6000
    RBP: ffff9383f9643b98   R8: 0000000000000003   R9: ffff9383f9643c90
    R10: ffff938377345204  R11: 0000000000000005  R12: ffff9383ed6e6000
    R13: 0000000000000000  R14: 0000000000000003  R15: ffff9383f9643c90
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff9383f9643ba0] sctp_cmd_interpreter at ffffffffc07fe385 [sctp]
#11 [ffff9383f9643c38] sctp_do_sm at ffffffffc07fcc91 [sctp]
#12 [ffff9383f9643e10] sctp_generate_timeout_event at ffffffffc07fd305 [sctp]
#13 [ffff9383f9643e58] sctp_generate_t2_shutdown_event at ffffffffc07fd3e3 [sctp]
#14 [ffff9383f9643e68] call_timer_fn at ffffffffa7cabd58
#15 [ffff9383f9643ea0] run_timer_softirq at ffffffffa7cae1ed
#16 [ffff9383f9643f18] __do_softirq at ffffffffa7ca4b95
#17 [ffff9383f9643f88] call_softirq at ffffffffa83984ec
#18 [ffff9383f9643fa0] do_softirq at ffffffffa7c2f715
#19 [ffff9383f9643fc0] irq_exit at ffffffffa7ca4f15
#20 [ffff9383f9643fd8] smp_apic_timer_interrupt at ffffffffa8399a88
#21 [ffff9383f9643ff0] apic_timer_interrupt at ffffffffa8395fba
+++

The sctp module was trying to access the sctp association, but crashed because the association pointer was NULL:

+++
Crashed while trying to access "transport->asoc->rto_max" and association is NULL

crash> struct sctp_transport.asoc ffff93837b8c2c00
  asoc = 0x0     <--- NULL
+++

The transport itself is a freed slab object:

crash> kmem ffff93837b8c2c00
CACHE             OBJSIZE  ALLOCATED  TOTAL  SLABS  SSIZE  NAME
ffff9383f9007500     1024       3352   3456    108    32k  kmalloc-1024
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  fffffb6542ee3000  ffff93837b8c0000     0     32         24     8
  FREE / [ALLOCATED]
   ffff93837b8c2c00  (cpu 1 cache)
+++

Actual results:

Use-after-free crash.

Expected results:

No crash.

Additional info:

This looks very similar to:

http://lkml.iu.edu/hypermail/linux/kernel/2104.2/05811.html

But I can't see a fix for that upstream.
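
For readers less used to reading oopses, here is a minimal user-space sketch of why the reported fault address (0x268) is a small number rather than a random pointer: reading a member through a NULL base pointer faults at the member's offset. The struct names and the padding below are hypothetical. They are chosen only so that `rto_max` lands at the offset seen in this report and do not reproduce the real `struct sctp_association` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the association: the padding is an assumption
 * picked so that rto_max sits at 0x268, matching the fault address above. */
struct fake_asoc {
    unsigned char pad[0x268];
    unsigned long rto_max;
};

/* Hypothetical stand-in for the transport, keeping only the asoc pointer. */
struct fake_transport {
    struct fake_asoc *asoc;
};

/* Compute the address that a read of t->asoc->rto_max would touch,
 * without actually dereferencing t->asoc. */
uintptr_t fault_address(const struct fake_transport *t)
{
    assert(t != NULL);
    return (uintptr_t)t->asoc + offsetof(struct fake_asoc, rto_max);
}
```

With `asoc` set to NULL, as the vmcore shows for the freed transport, `fault_address()` returns the member offset itself, which is consistent with the PANIC line under the stated assumption about the offset.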