Bug 1809076
| Summary: | snmpd crashes due to double-free when freeing security context | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> | |
| Component: | net-snmp | Assignee: | Josef Ridky <jridky> | |
| Status: | CLOSED ERRATA | QA Contact: | Evgeny Fedin <efedin> | |
| Severity: | high | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 7.7 | CC: | dchong, djez, efedin, jridky, mbliss, ovasik, sbroz, tcrider | |
| Target Milestone: | rc | Keywords: | EasyFix, Patch, Regression, Reproducer, ZStream | |
| Target Release: | --- | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1809077 1812938 1812940 (view as bug list) | Environment: | ||
| Last Closed: | 2020-09-29 20:56:26 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1780582 | |||
| Bug Blocks: | 1809077, 1812938, 1812940 | |||
|
Description
Renaud Métrich
2020-03-02 11:42:03 UTC
The coredump analysis shows the following:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# gdb /usr/sbin/snmpd ./coredump
[...]
Program terminated with signal 6, Aborted.
#0 0x00007f67cc5d0377 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
55 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x00007f67cc5d0377 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1 0x00007f67cc5d1a68 in __GI_abort () at abort.c:90
#2 0x00007f67cc612ec7 in __libc_message (do_abort=do_abort@entry=2,
fmt=fmt@entry=0x7f67cc7253f8 "*** Error in `%s': %s: 0x%s ***\n")
at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3 0x00007f67cc61b6b9 in malloc_printerr (ar_ptr=0x7f67cc961760 <main_arena>, ptr=<optimized out>,
str=0x7f67cc7254b8 "double free or corruption (fasttop)", action=3) at malloc.c:4967
#4 _int_free (av=0x7f67cc961760 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3843
#5 0x00007f67ce05f9ec in free_securityStateRef (pdu=<optimized out>, pdu=<optimized out>)
at snmp_api.c:3851
#6 0x00007f67ce066b00 in snmp_free_securityStateRef (pdu=<optimized out>) at snmp_api.c:3873
#7 0x00007f67cea22ae2 in free_agent_snmp_session (asp=0x559321baf420) at snmp_agent.c:1436
#8 0x00007f67cea26084 in handle_snmp_packet (op=<optimized out>, session=<optimized out>,
reqid=<optimized out>, pdu=0x559321baf300, magic=<optimized out>) at snmp_agent.c:1968
#9 0x00007f67ce06e2fc in _sess_process_packet (sessp=sessp@entry=0x559321b84980,
sp=sp@entry=0x559321baf4b0, isp=isp@entry=0x559321baf1e0, transport=transport@entry=0x559321baf250,
opaque=<optimized out>, olength=<optimized out>,
packetptr=packetptr@entry=0x559321bb8f70 "0\201\227\002\001\003\060\021\002\004\062\376_K\002\003",
length=154) at snmp_api.c:5436
#10 0x00007f67ce06eb5e in _sess_read (sessp=sessp@entry=0x559321b84980,
fdset=fdset@entry=0x7fff05449a10) at snmp_api.c:5851
#11 0x00007f67ce06f9a9 in snmp_sess_read2 (sessp=sessp@entry=0x559321b84980,
fdset=fdset@entry=0x7fff05449a10) at snmp_api.c:5883
#12 0x00007f67ce06f9fb in snmp_read2 (fdset=fdset@entry=0x7fff05449a10) at snmp_api.c:5485
#13 0x00005593206518f8 in receive () at snmpd.c:1341
#14 main (argc=<optimized out>, argv=<optimized out>) at snmpd.c:1130
(gdb)
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
We had snmpd die while freeing some internal structure. It died because of a Double Free.
From the backtrace, we had the following code executed ("HERE" tag is where the crash happened):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
1419 void
1420 free_agent_snmp_session(netsnmp_agent_session *asp)
1421 {
:
1432 /* Clean up securityStateRef here to prevent a double free */
1433 if (asp->orig_pdu && asp->orig_pdu->securityStateRef)
1434 snmp_free_securityStateRef(asp->orig_pdu);
1435 if (asp->pdu && asp->pdu->securityStateRef)
1436 snmp_free_securityStateRef(asp->pdu); <--- HERE
:
1458 }
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
The code being called, snmp_free_securityStateRef(), resets the pointer pdu->securityStateRef to NULL after freeing:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
3840 static void
3841 free_securityStateRef(netsnmp_pdu* pdu)
3842 {
:
3848 sptr = find_sec_mod(pdu->securityModel);
3849 if (sptr) {
3850 if (sptr->pdu_free_state_ref) {
3851 (*sptr->pdu_free_state_ref) (pdu->securityStateRef); <--- HERE
3852 } else {
:
3862 pdu->securityStateRef = NULL;
3863 }
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
So in theory, this should work fine: this code can be called multiple times.
Unfortunately this assumes that the pointer pdu->securityStateRef is not referenced in other structures, which alas isn't the case:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
(gdb) p asp->orig_pdu
$20 = (netsnmp_pdu *) 0x559321be11c0
(gdb) p asp->pdu
$21 = (netsnmp_pdu *) 0x559321be1060
(gdb) p pdu
$26 = (netsnmp_pdu *) 0x559321baf300
(gdb) p *asp->orig_pdu
$24 = {version = 3, command = 161, reqid = 942092819, msgid = 855531339, transid = 20949, sessid = 0,
errstat = 0, errindex = 0, time = 0, flags = 4, securityModel = 3, securityLevel = 3,
msgParseModel = 0, transport_data = 0x559321bca040, transport_data_length = 60,
tDomain = 0x7f67ce2f3720 <netsnmpUDPDomain>, tDomainLen = 7, variables = 0x559321bcbb60,
community = 0x0, community_len = 0, enterprise = 0x0, enterprise_length = 0, trap_type = 0,
specific_type = 0, agent_addr = "\000\000\000", contextEngineID = 0x559321be12e0 "\200",
contextEngineIDLen = 17, contextName = 0x559321b838c0 "cts2", contextNameLen = 4,
securityEngineID = 0x559321b82b80 "\200", securityEngineIDLen = 17,
securityName = 0x559321be1300 "XXX", securityNameLen = 9, priority = 0, range_subid = 0,
securityStateRef = 0x0}
(gdb) p *asp->pdu
$25 = {version = 3, command = 161, reqid = 942092819, msgid = 855531339, transid = 20949, sessid = 0,
errstat = 0, errindex = 0, time = 0, flags = 4, securityModel = 3, securityLevel = 3,
msgParseModel = 0, transport_data = 0x559321ae4000, transport_data_length = 60,
tDomain = 0x7f67ce2f3720 <netsnmpUDPDomain>, tDomainLen = 7, variables = 0x559321bcb6e0,
community = 0x0, community_len = 0, enterprise = 0x0, enterprise_length = 0, trap_type = 0,
specific_type = 0, agent_addr = "\000\000\000", contextEngineID = 0x559321bafdc0 "\200",
contextEngineIDLen = 17, contextName = 0x559321be1180 "cts2", contextNameLen = 4,
securityEngineID = 0x5593219cdc90 "\200", securityEngineIDLen = 17,
securityName = 0x559321be11a0 "XXX", securityNameLen = 9, priority = 0, range_subid = 0,
securityStateRef = 0x559321bb8be0}
(gdb) p *pdu
$28 = {version = 3, command = 161, reqid = 942092819, msgid = 855531339, transid = 20949, sessid = 0,
errstat = 0, errindex = 0, time = 0, flags = 4, securityModel = 3, securityLevel = 3,
msgParseModel = 0, transport_data = 0x559321b8e3a0, transport_data_length = 60,
tDomain = 0x7f67ce2f3720 <netsnmpUDPDomain>, tDomainLen = 7, variables = 0x559321be0be0,
community = 0x0, community_len = 0, enterprise = 0x0, enterprise_length = 0, trap_type = 0,
specific_type = 0, agent_addr = "\000\000\000", contextEngineID = 0x559321bd52d0 "\200",
contextEngineIDLen = 17, contextName = 0x559321beabb0 "cts2g\177", contextNameLen = 4,
securityEngineID = 0x559321bafd30 "\200", securityEngineIDLen = 17,
securityName = 0x559321bb8ad0 "XXX", securityNameLen = 9, priority = 0, range_subid = 0,
securityStateRef = 0x559321bb8be0}
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
From the variables above, taken on frame 8, we can see that "pdu", "asp->orig_pdu" and "asp->pdu" are 3 different structures,
but all refer to the same memory at location 0x559321bb8be0 as "*->securityStateRef" pointer.
This explains the crash: free_agent_snmp_session() freed the memory on line 1434 for asp->orig_pdu, and again on line 1436:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
1419 void
1420 free_agent_snmp_session(netsnmp_agent_session *asp)
1421 {
:
1432 /* Clean up securityStateRef here to prevent a double free */
1433 if (asp->orig_pdu && asp->orig_pdu->securityStateRef)
1434 snmp_free_securityStateRef(asp->orig_pdu);
1435 if (asp->pdu && asp->pdu->securityStateRef)
1436 snmp_free_securityStateRef(asp->pdu); <--- HERE
:
1458 }
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Going deeper, we have this code which handles cloning of PDUs (which happens here, "asp->orig_pdu" is a clone of "pdu" and "asp->pdu" is a clone of "asp->orig_pdu"):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
1388 netsnmp_agent_session *
1389 init_agent_snmp_session(netsnmp_session * session, netsnmp_pdu *pdu)
1390 {
:
1400 asp->pdu = snmp_clone_pdu(pdu);
1401 asp->orig_pdu = snmp_clone_pdu(pdu);
:
1417 }
581 netsnmp_pdu *
582 snmp_clone_pdu(netsnmp_pdu *pdu)
583 {
584 return _clone_pdu(pdu, 0); /* copies all variables */
585 }
551 static
552 netsnmp_pdu *
553 _clone_pdu(netsnmp_pdu *pdu, int drop_err)
554 {
555 netsnmp_pdu *newpdu;
556 newpdu = _clone_pdu_header(pdu);
557 newpdu = _copy_pdu_vars(pdu, newpdu, drop_err, 0, 10000); /* skip none, copy all */
558
559 return newpdu;
560 }
347 static
348 netsnmp_pdu *
349 _clone_pdu_header(netsnmp_pdu *pdu)
350 {
:
394 if (pdu != NULL && pdu->securityStateRef &&
395 pdu->command == SNMP_MSG_TRAP2) {
396
397 ret = usm_clone_usmStateReference((struct usmStateReference *) pdu->securityStateRef,
398 (struct usmStateReference **) &newpdu->securityStateRef );
:
405 }
406
407 if ((sptr = find_sec_mod(newpdu->securityModel)) != NULL &&
408 sptr->pdu_clone != NULL) {
409 /*
410 * call security model if it needs to know about this
411 */
412 (*sptr->pdu_clone) (pdu, newpdu);
413 }
:
416 }
#define SNMP_MSG_GETNEXT (ASN_CONTEXT | ASN_CONSTRUCTOR | 0x1) /* a1=161 */
#define SNMP_MSG_TRAP2 (ASN_CONTEXT | ASN_CONSTRUCTOR | 0x7) /* a7=167 */
(gdb) p asp->pdu->command
$31 = 161
(gdb) p 0x80|0x20|0x7
$32 = 167
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
On line 395 of _clone_pdu_header(), we have a clone happening of the "securityStateRef" field, but here we don't fall into this since "pdu->command" is not 167, but 161.
Later on line 412 we have some clone code again, which is probably what makes "securityStateRef" be reference 3 times in various "pdu", "asp->pdu" and "asp->orig_pdu" structures.
Browsing Upstream's code, I can see a very similar issue being reported:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
commit 5f881d3bf24599b90d67a45cae7a3eb099cd71c9
Author: Bart Van Assche <bvanassche>
Date: Sat Jul 27 19:34:09 2019 -0700
libsnmp, USM: Introduce a reference count in struct usmStateReference
This patch fixes https://sourceforge.net/p/net-snmp/bugs/2956/.
---
snmplib/snmp_client.c | 22 +++++-----------------
1 file changed, 5 insertions(+), 17 deletions(-)
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
The issue https://sourceforge.net/p/net-snmp/bugs/2956 has a similar backtrace:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f3a8a220067 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007f3a8a220067 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f3a8a221448 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f3a8a25e1b4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007f3a8a26398e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f3a8a264696 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007f3a8a8ced96 in free_securityStateRef (pdu=<optimized out>, pdu=<optimized out>) at snmp_api.c:4018
#6 0x00007f3a8a8d2700 in snmp_free_securityStateRef (pdu=<optimized out>) at snmp_api.c:4040
#7 0x00007f3a8b0ed0ba in free_agent_snmp_session (asp=0x1fc7480) at snmp_agent.c:1612
#8 0x00007f3a8b0f1be4 in handle_snmp_packet (op=<optimized out>, session=<optimized out>, reqid=<optimized out>, pdu=0x1fc2680, magic=<optimized out>) at snmp_agent.c:2251
#9 0x00007f3a8a8deddf in _sess_process_packet_handle_pdu (transport=0x1fb4570, pdu=0x1fc2680, isp=0x1fb4850, sp=0x1fb48e0, sessp=0x1fb4820) at snmp_api.c:5827
#10 _sess_process_packet (sessp=0x1fb4820, sp=0x1fb48e0, isp=0x1fb4850, transport=0x1fb4570, opaque=<optimized out>, olength=<optimized out>, packetptr=0x1fc7b80 "0\201\225\002\001\003\060\021\002\004b\364\240\250\002\003", length=152) at snmp_api.c:5883
#11 0x00007f3a8a8e0342 in _sess_read (sessp=0x1fb4820, fdset=0x7ffd69c89d50) at snmp_api.c:6144
#12 0x00007f3a8a8e0919 in snmp_sess_read2 (sessp=sessp@entry=0x1fb4820, fdset=fdset@entry=0x7ffd69c89d50) at snmp_api.c:6417
#13 0x00007f3a8a8e096b in snmp_read2 (fdset=0x7ffd69c89d50) at snmp_api.c:5932
#14 0x0000000000404bd0 in ?? ()
#15 0x0000000000403e75 in main ()
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
*** Bug 1810175 has been marked as a duplicate of this bug. *** Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (net-snmp bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4037 |