Bug 2159764 (CVE-2023-0160)

Summary: CVE-2023-0160 kernel: possibility of deadlock in libbpf function sock_hash_delete_elem
Product: [Other] Security Response
Reporter: Alex <allarkin>
Component: vulnerability
Assignee: Red Hat Product Security <security-response-team>
Status: CLOSED NOTABUG
Severity: low
Priority: low
Version: unspecified
CC: acaringi, allarkin, bhu, carnil, chwhite, crwood, dbohanno, ddepaula, debarbos, dfreiber, dhoward, dvlasenk, ezulian, fhrbata, hkrzesin, jarod, jburrell, jdenham, jfaracco, jferlan, jforbes, jlelli, joe.lawrence, jshortt, jstancek, jwyatt, kcarcia, kernel-mgr, ldoskova, lgoncalv, lleshchi, lzampier, nmurray, ptalbert, qzhao, rogbas, rrobaina, rvrbovsk, rysulliv, scweaver, security-response-team, steve.beattie, tglozar, tyberry, vkumar, walters, wcosta, williams, wmealing, ycote
Keywords: Security
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: kernel 6.4-rc1
Doc Text:
A deadlock flaw was found in the Linux kernel’s BPF subsystem. The failure occurs in the function sock_hash_delete_elem. This flaw allows a local user to potentially crash the system.
Last Closed: 2023-03-23 16:18:03 UTC
Bug Depends On: 2167769, 2223191    
Bug Blocks: 2151485    

Description Alex 2023-01-10 16:03:13 UTC
A deadlock flaw was found in the Linux kernel's BPF subsystem. An existing reproducer triggers a lockdep warning. Tested with kernel v5.15.25 and v5.19. The failure occurs in the function sock_hash_delete_elem:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&htab->buckets[i].lock);
                               local_irq_disable();
                               lock(&rq->__lock);
                               lock(&htab->buckets[i].lock);
  <Interrupt>
    lock(&rq->__lock);

 *** DEADLOCK ***
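
The report above boils down to a cycle in the lock-ordering graph: one context takes the bucket lock and then (via an interrupt) the runqueue lock, while another takes them in the opposite order. As a rough illustration only, the following userspace sketch models that check. It is not lockdep's implementation; the function name `find_lock_cycle` and the edge-list representation are assumptions made for this example.

```python
from collections import defaultdict

BUCKET = "&htab->buckets[i].lock"
RQ = "&rq->__lock"


def find_lock_cycle(acquisitions):
    """Return one lock-order cycle as a list of lock names, or None.

    `acquisitions` is a list of (held_lock, wanted_lock) pairs, i.e.
    "wanted_lock was requested while held_lock was already held".
    """
    graph = defaultdict(set)
    for held, wanted in acquisitions:
        graph[held].add(wanted)

    def visit(node, path, on_path, done):
        if node in on_path:
            # Reached a lock that is already on the current path: a cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt, path, on_path, done)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    done = set()
    for start in sorted(graph):
        cycle = visit(start, [], set(), done)
        if cycle:
            return cycle
    return None


# CPU0: holds the bucket lock, then an interrupt handler wants rq->__lock.
# CPU1: holds rq->__lock (IRQs off), then wants the bucket lock.
scenario = [(BUCKET, RQ), (RQ, BUCKET)]
print(find_lock_cycle(scenario))
```

If the interrupt on CPU0 could not fire while the bucket lock is held, the `(BUCKET, RQ)` edge would disappear and no cycle would remain.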

Comment 5 Product Security DevOps Team 2023-03-23 16:18:00 UTC
This bug is now closed. Further updates for individual products will be reflected on the CVE page(s):

https://access.redhat.com/security/cve/cve-2023-0160

Comment 6 Salvatore Bonaccorso 2023-03-24 05:04:59 UTC
Hi Alex,

(In reply to Alex from comment #0)
> A Linux Kernel flaw deadlock found in the BPF. Existing reproducer allows to
> trigger a lockdep warning. Tested with kernel v5.15.25 and v5.19. The fail
> happens in the function sock_hash_delete_elem:
> 
>  Possible interrupt unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(&htab->buckets[i].lock);
>                                local_irq_disable();
>                                lock(&rq->__lock);
>                                lock(&htab->buckets[i].lock);
>   <Interrupt>
>     lock(&rq->__lock);
> 
>  *** DEADLOCK ***

Are there any cross-references for this, i.e. where it originates
from and was it ever fixed upstream?

I'm asking to properly track the CVE in Debian.

Regards,
Salvatore

Comment 7 Salvatore Bonaccorso 2023-03-24 05:07:07 UTC
https://lore.kernel.org/lkml/CABcoxUayum5oOqFMMqAeWuS8+EzojquSOSyDA3J_2omY=2EeAg@mail.gmail.com/ might be the relevant report upstream.

Comment 8 Alex 2023-03-26 07:06:19 UTC
(In reply to Salvatore Bonaccorso from comment #6)
> Are there any cross-references for this, i.e. where it originates
> from and was it ever fixed upstream?
> 
> I'm asking to properly track the CVE in Debian.

Hi Salvatore,
Yes, currently the only discussion is happening here (and a patch is likely not ready yet):
https://lore.kernel.org/all/CABcoxUayum5oOqFMMqAeWuS8+EzojquSOSyDA3J_2omY=2EeAg@mail.gmail.com/

You may ask the reporter, Hsin-Wei Hung, to inform you when it is ready.
Or ask John Fastabend, because he said (in the same discussion): "Thanks, I'll take a look."

Alex

Comment 9 Alex 2023-07-16 11:24:39 UTC
Got a reply from the reporter regarding this one:

Hi Alex,

The bug was fixed in ed17aa92dc56 ("bpf, sockmap: fix deadlocks in the
sockhash and sockmap")

The fix basically disables hardirq instead of only softirq in the critical
section in delete() in sockmap/sockhash since an element can be deleted
from an eBPF program in the interrupt context (hardirq).

Thanks,
Hsin-Wei
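
To make the idea in the reply above concrete, here is a deliberately simplified single-CPU toy model. It is a userspace sketch under stated assumptions, not kernel code: the class `ToyCPU`, its fields, and the `delete_elem` parameters are hypothetical names invented for this illustration. It only contrasts a critical section that leaves hardirqs enabled (in the spirit of the kernel's `spin_lock_bh()` family) with one that disables them (in the spirit of `spin_lock_irqsave()`), where the interrupt handler wants the same lock.

```python
class ToyCPU:
    """One CPU, one spinlock, and a hardirq handler that takes that same lock."""

    def __init__(self):
        self.lock_held = False
        self.hardirqs_enabled = True
        self.irq_pending = False
        self.deadlocked = False
        self.handler_runs = 0

    def raise_hardirq(self):
        # Hardware raises an interrupt; it is delivered only if unmasked.
        self.irq_pending = True
        self._deliver()

    def _deliver(self):
        if not (self.irq_pending and self.hardirqs_enabled):
            return
        self.irq_pending = False
        if self.lock_held:
            # The handler would spin forever on a lock held by the very
            # context it interrupted: a self-deadlock on this CPU.
            self.deadlocked = True
            return
        self.handler_runs += 1  # handler takes the lock, deletes, unlocks

    def delete_elem(self, disable_hardirqs, irq_during_section):
        saved = self.hardirqs_enabled
        if disable_hardirqs:
            self.hardirqs_enabled = False  # irqsave-style protection
        # Otherwise only softirqs are considered masked; hardirqs stay on.
        self.lock_held = True
        if irq_during_section:
            self.raise_hardirq()
        self.lock_held = False
        self.hardirqs_enabled = saved  # irqrestore-style
        self._deliver()  # a deferred interrupt runs now, lock is free


bh_only = ToyCPU()
bh_only.delete_elem(disable_hardirqs=False, irq_during_section=True)

irqs_off = ToyCPU()
irqs_off.delete_elem(disable_hardirqs=True, irq_during_section=True)

print(bh_only.deadlocked, irqs_off.deadlocked, irqs_off.handler_runs)
```

In this model the interrupt is merely deferred when hardirqs are masked, so the handler runs once the lock has been released. The model says nothing about whether such a change is otherwise safe in the real code paths.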

Comment 10 Alex 2023-07-16 11:44:16 UTC
Created kernel tracking bugs for this issue:

Affects: fedora-all [bug 2223191]

Comment 11 Justin M. Forbes 2023-07-18 18:37:02 UTC
This was fixed for Fedora with the 6.2.15 stable kernel updates.

Comment 12 Steve Beattie 2023-07-18 19:43:20 UTC
Unfortunately, ed17aa92dc56 ("bpf, sockmap: fix deadlocks in the sockhash and sockmap") was reverted upstream in 8c5c2a4898e3 ("bpf, sockmap: Revert buggy deadlock fix in the sockhash and sockmap"), which was also backported as part of the 6.2.15 stable kernel update (commit 3cd6ab3b5451).