Bug 1809076
Summary: | snmpd crashes due to double-free when freeing security context | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> | |
Component: | net-snmp | Assignee: | Josef Ridky <jridky> | |
Status: | CLOSED ERRATA | QA Contact: | Evgeny Fedin <efedin> | |
Severity: | high | Docs Contact: | ||
Priority: | urgent | |||
Version: | 7.7 | CC: | dchong, djez, efedin, jridky, mbliss, ovasik, sbroz, tcrider | |
Target Milestone: | rc | Keywords: | EasyFix, Patch, Regression, Reproducer, ZStream | |
Target Release: | --- | |||
Hardware: | All | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1809077 1812938 1812940 (view as bug list) | Environment: | ||
Last Closed: | 2020-09-29 20:56:26 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1780582 | |||
Bug Blocks: | 1809077, 1812938, 1812940 |
Description
Renaud Métrich
2020-03-02 11:42:03 UTC
The coredump analysis shows the following: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- # gdb /usr/sbin/snmpd ./coredump [...] Program terminated with signal 6, Aborted. #0 0x00007f67cc5d0377 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55 55 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) bt #0 0x00007f67cc5d0377 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55 #1 0x00007f67cc5d1a68 in __GI_abort () at abort.c:90 #2 0x00007f67cc612ec7 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f67cc7253f8 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196 #3 0x00007f67cc61b6b9 in malloc_printerr (ar_ptr=0x7f67cc961760 <main_arena>, ptr=<optimized out>, str=0x7f67cc7254b8 "double free or corruption (fasttop)", action=3) at malloc.c:4967 #4 _int_free (av=0x7f67cc961760 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3843 #5 0x00007f67ce05f9ec in free_securityStateRef (pdu=<optimized out>, pdu=<optimized out>) at snmp_api.c:3851 #6 0x00007f67ce066b00 in snmp_free_securityStateRef (pdu=<optimized out>) at snmp_api.c:3873 #7 0x00007f67cea22ae2 in free_agent_snmp_session (asp=0x559321baf420) at snmp_agent.c:1436 #8 0x00007f67cea26084 in handle_snmp_packet (op=<optimized out>, session=<optimized out>, reqid=<optimized out>, pdu=0x559321baf300, magic=<optimized out>) at snmp_agent.c:1968 #9 0x00007f67ce06e2fc in _sess_process_packet (sessp=sessp@entry=0x559321b84980, sp=sp@entry=0x559321baf4b0, isp=isp@entry=0x559321baf1e0, transport=transport@entry=0x559321baf250, opaque=<optimized out>, olength=<optimized out>, packetptr=packetptr@entry=0x559321bb8f70 "0\201\227\002\001\003\060\021\002\004\062\376_K\002\003", length=154) at snmp_api.c:5436 #10 0x00007f67ce06eb5e in _sess_read (sessp=sessp@entry=0x559321b84980, fdset=fdset@entry=0x7fff05449a10) at snmp_api.c:5851 #11 0x00007f67ce06f9a9 in snmp_sess_read2 (sessp=sessp@entry=0x559321b84980, fdset=fdset@entry=0x7fff05449a10) at snmp_api.c:5883 #12 0x00007f67ce06f9fb in snmp_read2 (fdset=fdset@entry=0x7fff05449a10) at snmp_api.c:5485 #13 0x00005593206518f8 in receive () at snmpd.c:1341 #14 main (argc=<optimized out>, argv=<optimized out>) at snmpd.c:1130 (gdb) -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- We had snmpd die while freeing some internal structure. It died because of a Double Free. From the backtrace, we had the following code executed ("HERE" tag is where the crash happened): -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- 1419 void 1420 free_agent_snmp_session(netsnmp_agent_session *asp) 1421 { : 1432 /* Clean up securityStateRef here to prevent a double free */ 1433 if (asp->orig_pdu && asp->orig_pdu->securityStateRef) 1434 snmp_free_securityStateRef(asp->orig_pdu); 1435 if (asp->pdu && asp->pdu->securityStateRef) 1436 snmp_free_securityStateRef(asp->pdu); <--- HERE : 1458 } -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- The code being called, snmp_free_securityStateRef(), resets the pointer pdu->securityStateRef to NULL after freeing: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- 3840 static void 3841 free_securityStateRef(netsnmp_pdu* pdu) 3842 { : 3848 sptr = find_sec_mod(pdu->securityModel); 3849 if (sptr) { 3850 if (sptr->pdu_free_state_ref) { 3851 (*sptr->pdu_free_state_ref) (pdu->securityStateRef); <--- HERE 3852 } else { : 3862 pdu->securityStateRef = NULL; 3863 } -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- So in theory, this should work fine: this code can be called multiple times. Unfortunately this assumes that the pointer pdu->securityStateRef is not referenced in other structures, which alas isn't the case: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- (gdb) p asp->orig_pdu $20 = (netsnmp_pdu *) 0x559321be11c0 (gdb) p asp->pdu $21 = (netsnmp_pdu *) 0x559321be1060 (gdb) p pdu $26 = (netsnmp_pdu *) 0x559321baf300 (gdb) p *asp->orig_pdu $24 = {version = 3, command = 161, reqid = 942092819, msgid = 855531339, transid = 20949, sessid = 0, errstat = 0, errindex = 0, time = 0, flags = 4, securityModel = 3, securityLevel = 3, msgParseModel = 0, transport_data = 0x559321bca040, transport_data_length = 60, tDomain = 0x7f67ce2f3720 <netsnmpUDPDomain>, tDomainLen = 7, variables = 0x559321bcbb60, community = 0x0, community_len = 0, enterprise = 0x0, enterprise_length = 0, trap_type = 0, specific_type = 0, agent_addr = "\000\000\000", contextEngineID = 0x559321be12e0 "\200", contextEngineIDLen = 17, contextName = 0x559321b838c0 "cts2", contextNameLen = 4, securityEngineID = 0x559321b82b80 "\200", securityEngineIDLen = 17, securityName = 0x559321be1300 "XXX", securityNameLen = 9, priority = 0, range_subid = 0, securityStateRef = 0x0} (gdb) p *asp->pdu $25 = {version = 3, command = 161, reqid = 942092819, msgid = 855531339, transid = 20949, sessid = 0, errstat = 0, errindex = 0, time = 0, flags = 4, securityModel = 3, securityLevel = 3, msgParseModel = 0, transport_data = 0x559321ae4000, transport_data_length = 60, tDomain = 0x7f67ce2f3720 <netsnmpUDPDomain>, tDomainLen = 7, variables = 0x559321bcb6e0, community = 0x0, community_len = 0, enterprise = 0x0, enterprise_length = 0, trap_type = 0, specific_type = 0, agent_addr = "\000\000\000", contextEngineID = 0x559321bafdc0 "\200", contextEngineIDLen = 17, contextName = 0x559321be1180 "cts2", contextNameLen = 4, securityEngineID = 0x5593219cdc90 "\200", securityEngineIDLen = 17, securityName = 0x559321be11a0 "XXX", securityNameLen = 9, priority = 0, range_subid = 0, securityStateRef = 0x559321bb8be0} (gdb) p *pdu $28 = {version = 3, command = 161, reqid = 942092819, msgid = 855531339, transid = 20949, sessid = 0, errstat = 0, errindex = 0, time = 0, flags = 4, securityModel = 3, securityLevel = 3, msgParseModel = 0, transport_data = 0x559321b8e3a0, transport_data_length = 60, tDomain = 0x7f67ce2f3720 <netsnmpUDPDomain>, tDomainLen = 7, variables = 0x559321be0be0, community = 0x0, community_len = 0, enterprise = 0x0, enterprise_length = 0, trap_type = 0, specific_type = 0, agent_addr = "\000\000\000", contextEngineID = 0x559321bd52d0 "\200", contextEngineIDLen = 17, contextName = 0x559321beabb0 "cts2g\177", contextNameLen = 4, securityEngineID = 0x559321bafd30 "\200", securityEngineIDLen = 17, securityName = 0x559321bb8ad0 "XXX", securityNameLen = 9, priority = 0, range_subid = 0, securityStateRef = 0x559321bb8be0} -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- From the variables above, taken on frame 8, we can see that "pdu", "asp->orig_pdu" and "asp->pdu" are 3 different structures, but all refer to the same memory at location 0x559321bb8be0 as "*->securityStateRef" pointer. This explains the crash: free_agent_snmp_session() freed the memory on line 1434 for asp->orig_pdu, and again on line 1436: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- 1419 void 1420 free_agent_snmp_session(netsnmp_agent_session *asp) 1421 { : 1432 /* Clean up securityStateRef here to prevent a double free */ 1433 if (asp->orig_pdu && asp->orig_pdu->securityStateRef) 1434 snmp_free_securityStateRef(asp->orig_pdu); 1435 if (asp->pdu && asp->pdu->securityStateRef) 1436 snmp_free_securityStateRef(asp->pdu); <--- HERE : 1458 } -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- Going deeper, we have this code which handles cloning of PDUs (which happens here, "asp->orig_pdu" is a clone of "pdu" and "asp->pdu" is a clone of "asp->orig_pdu"): -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- 1388 netsnmp_agent_session * 1389 init_agent_snmp_session(netsnmp_session * session, netsnmp_pdu *pdu) 1390 { : 1400 asp->pdu = snmp_clone_pdu(pdu); 1401 asp->orig_pdu = snmp_clone_pdu(pdu); : 1417 } 581 netsnmp_pdu * 582 snmp_clone_pdu(netsnmp_pdu *pdu) 583 { 584 return _clone_pdu(pdu, 0); /* copies all variables */ 585 } 551 static 552 netsnmp_pdu * 553 _clone_pdu(netsnmp_pdu *pdu, int drop_err) 554 { 555 netsnmp_pdu *newpdu; 556 newpdu = _clone_pdu_header(pdu); 557 newpdu = _copy_pdu_vars(pdu, newpdu, drop_err, 0, 10000); /* skip none, copy all */ 558 559 return newpdu; 560 } 347 static 348 netsnmp_pdu * 349 _clone_pdu_header(netsnmp_pdu *pdu) 350 { : 394 if (pdu != NULL && pdu->securityStateRef && 395 pdu->command == SNMP_MSG_TRAP2) { 396 397 ret = usm_clone_usmStateReference((struct usmStateReference *) pdu->securityStateRef, 398 (struct usmStateReference **) &newpdu->securityStateRef ); : 405 } 406 407 if ((sptr = find_sec_mod(newpdu->securityModel)) != NULL && 408 sptr->pdu_clone != NULL) { 409 /* 410 * call security model if it needs to know about this 411 */ 412 (*sptr->pdu_clone) (pdu, newpdu); 413 } : 416 } #define SNMP_MSG_GETNEXT (ASN_CONTEXT | ASN_CONSTRUCTOR | 0x1) /* a1=161 */ #define SNMP_MSG_TRAP2 (ASN_CONTEXT | ASN_CONSTRUCTOR | 0x7) /* a7=167 */ (gdb) p asp->pdu->command $31 = 161 (gdb) p 0x80|0x20|0x7 $32 = 167 -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- On line 395 of _clone_pdu_header(), we have a clone happening of the "securityStateRef" field, but here we don't fall into this since "pdu->command" is not 167, but 161. Later on line 412 we have some clone code again, which is probably what makes "securityStateRef" be reference 3 times in various "pdu", "asp->pdu" and "asp->orig_pdu" structures. Browsing Upstream's code, I can see a very similar issue being reported: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- commit 5f881d3bf24599b90d67a45cae7a3eb099cd71c9 Author: Bart Van Assche <bvanassche> Date: Sat Jul 27 19:34:09 2019 -0700 libsnmp, USM: Introduce a reference count in struct usmStateReference This patch fixes https://sourceforge.net/p/net-snmp/bugs/2956/. --- snmplib/snmp_client.c | 22 +++++----------------- 1 file changed, 5 insertions(+), 17 deletions(-) -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- The issue https://sourceforge.net/p/net-snmp/bugs/2956 has a similar backtrace: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- Program terminated with signal SIGABRT, Aborted. #0 0x00007f3a8a220067 in raise () from /lib/x86_64-linux-gnu/libc.so.6 (gdb) bt #0 0x00007f3a8a220067 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007f3a8a221448 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x00007f3a8a25e1b4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #3 0x00007f3a8a26398e in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #4 0x00007f3a8a264696 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #5 0x00007f3a8a8ced96 in free_securityStateRef (pdu=<optimized out>, pdu=<optimized out>) at snmp_api.c:4018 #6 0x00007f3a8a8d2700 in snmp_free_securityStateRef (pdu=<optimized out>) at snmp_api.c:4040 #7 0x00007f3a8b0ed0ba in free_agent_snmp_session (asp=0x1fc7480) at snmp_agent.c:1612 #8 0x00007f3a8b0f1be4 in handle_snmp_packet (op=<optimized out>, session=<optimized out>, reqid=<optimized out>, pdu=0x1fc2680, magic=<optimized out>) at snmp_agent.c:2251 #9 0x00007f3a8a8deddf in _sess_process_packet_handle_pdu (transport=0x1fb4570, pdu=0x1fc2680, isp=0x1fb4850, sp=0x1fb48e0, sessp=0x1fb4820) at snmp_api.c:5827 #10 _sess_process_packet (sessp=0x1fb4820, sp=0x1fb48e0, isp=0x1fb4850, transport=0x1fb4570, opaque=<optimized out>, olength=<optimized out>, packetptr=0x1fc7b80 "0\201\225\002\001\003\060\021\002\004b\364\240\250\002\003", length=152) at snmp_api.c:5883 #11 0x00007f3a8a8e0342 in _sess_read (sessp=0x1fb4820, fdset=0x7ffd69c89d50) at snmp_api.c:6144 #12 0x00007f3a8a8e0919 in snmp_sess_read2 (sessp=sessp@entry=0x1fb4820, fdset=fdset@entry=0x7ffd69c89d50) at snmp_api.c:6417 #13 0x00007f3a8a8e096b in snmp_read2 (fdset=0x7ffd69c89d50) at snmp_api.c:5932 #14 0x0000000000404bd0 in ?? () #15 0x0000000000403e75 in main () -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- *** Bug 1810175 has been marked as a duplicate of this bug. *** Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (net-snmp bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4037 |