Bug 145643 - regex_internal.c:re_node_set_alloc allocates 0 bytes that are never freed
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: glibc
Version: 3
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Jakub Jelinek
QA Contact: Brian Brock
Reported: 2005-01-20 07:26 EST by Arnold Robbins
Modified: 2007-11-30 17:10 EST (History)
Last Closed: 2005-01-24 11:35:14 EST
Description Arnold Robbins 2005-01-20 07:26:51 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040510

Description of problem:
Valgrind tells me that re_node_set_alloc allocates blocks of
size 0 that are then never freed.  This is mainly a code cleanliness
issue, although there's bound to be internal malloc overhead too.
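
For context, here is a minimal sketch of what a zero-byte request does at the C level. The C standard leaves the result of malloc(0) implementation-defined: it is either a null pointer or a distinct block that must not be dereferenced but still has to be passed to free. The function name malloc_zero_returns_block is made up for illustration and is not part of glibc or the regex code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if malloc(0) yields a real (non-NULL)
   block on this platform, 0 if it yields NULL.  Every pointer obtained
   here is handed back to free(), so nothing is left allocated. */
int malloc_zero_returns_block(void)
{
  void *a = malloc(0);
  void *b = malloc(0);
  int got_block = (a != NULL);

  /* When both calls return blocks, they are distinct allocations,
     each with its own allocator bookkeeping. */
  if (a != NULL && b != NULL)
    assert(a != b);

  free(a);   /* free(NULL) is a no-op, so this is safe in both cases */
  free(b);
  return got_block;
}
```

On an implementation that returns real blocks, each zero-byte allocation is tracked like any other, which is why a leak checker can list it even though it holds no data.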

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run valgrind --tool=memcheck --leak-check=yes on the gawk test suite.
2. Use en_US.utf8 as the locale.

Additional info:

Here is my patch.
--- ../libre.new/regex_internal.c       2005-01-10 11:26:46.000000000 +0200
+++ regex_internal.c    2005-01-19 18:29:09.000000000 +0200
@@ -883,6 +885,16 @@
      re_node_set *set;
      int size;
 {
+  /*
+   * ADR: valgrind says size can be 0, which then doesn't
+   * free the block of size 0.  Harumph. This seems
+   * to work ok, though.
+   */
+  if (size == 0)
+    {
+       memset(set, 0, sizeof(*set));
+       return REG_NOERROR;
+    }
   set->alloc = size;
   set->nelem = 0;
   set->elems = re_malloc (int, size);
Comment 1 Jakub Jelinek 2005-01-20 12:38:21 EST
Can you give a little more detail?
I took gawk 3.1.4 (the latest I found) and in its make check I see only
definitely lost: 0 bytes in 0 blocks.
possibly lost:   0 bytes in 0 blocks.
for all invocations of ../gawk in en_US.UTF-8.
There are many "still reachable" blocks, but I guess those come from gawk's own
allocations, not regex (as no pointer should leak after regfree).

Also, I don't see any place where a node set wouldn't be re_node_set_free'd,
and that macro doesn't care if node->alloc is 0 or not.
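
The point about the free macro can be sketched with a simplified stand-in. The names node_set, node_set_alloc, node_set_free and node_set_roundtrip below are hypothetical, chosen only to mirror the alloc/nelem/elems fields; this is not the glibc source. The sketch assumes the release step simply frees elems without looking at alloc, which is valid whether elems came from a zero-byte request or a larger one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for re_node_set, with the same three fields. */
typedef struct {
  int alloc;
  int nelem;
  int *elems;
} node_set;

/* Unconditional release in the spirit of re_node_set_free:
   it never inspects alloc. */
#define node_set_free(set) free((set)->elems)

/* Allocation without a size == 0 guard.  Returns 0 on success, -1 on
   failure.  Assumption of this sketch: a NULL result for size 0 is
   treated as success. */
int node_set_alloc(node_set *set, int size)
{
  assert(size >= 0);
  set->alloc = size;
  set->nelem = 0;
  set->elems = malloc(sizeof(int) * (size_t) size);
  if (set->elems == NULL && size > 0)
    return -1;
  return 0;
}

/* Allocate and release one set; returns 0 if both steps succeeded. */
int node_set_roundtrip(int size)
{
  node_set s;
  if (node_set_alloc(&s, size) != 0)
    return -1;
  node_set_free(&s);
  return 0;
}
```

Calling node_set_roundtrip(0) and node_set_roundtrip(17) exercises the same release path, so as long as every set is eventually released, a zero-sized set is freed like any other.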
Comment 2 Arnold Robbins 2005-01-23 06:52:18 EST
(In reply to comment #1)

My mistake. Start with my development version,
http://www.skeeve.com/gawk-3.1.4c.tar.gz. It already includes my patch, so remove
the patch and try again. The regex routines are from CVS as of about two weeks ago.

Thanks
Comment 3 Jakub Jelinek 2005-01-24 11:25:01 EST
This looks like a valgrind bug.
Breakpoint 8, re_node_set_alloc (set=0x8092448, size=17) at regex_internal.c:904
904     }
(gdb) p set->elems
$84 = (int *) 0x80924a8
(gdb) c
Continuing.

Breakpoint 8, re_node_set_alloc (set=0x80924d8, size=17) at regex_internal.c:904
904     }
(gdb) p set->elems
$85 = (int *) 0x8092538
(gdb) c
Continuing.

Breakpoint 8, re_node_set_alloc (set=0x8092bd0, size=17) at regex_internal.c:904
904     }
(gdb) p set->elems
$86 = (int *) 0x8092c30
(gdb) c
Continuing.

Breakpoint 8, re_node_set_alloc (set=0x8092c60, size=17) at regex_internal.c:904
904     }
(gdb) p set->elems
$87 = (int *) 0x8092cc0
(gdb) c
Continuing.
2

Breakpoint 6, 0x00696566 in exit () from /lib/tls/libc.so.6
(gdb) Quitt->elems
(gdb) p *((re_dfa_t *)(*(unsigned long *)(&RS_re_yes_case->pat)))->state_table[1]->array[0]
$88 = {hash = 5, nodes = {alloc = 2, nelem = 0, elems = 0x8092478},
  non_eps_nodes = {alloc = 0, nelem = 0, elems = 0x80924a8},
  inveclosure = {alloc = 0, nelem = 0, elems = 0x0}, entrance_nodes = 0x8092488,
  trtable = 0x0, word_trtable = 0x0,
  context = 0, halt = 0, accept_mb = 0, has_backref = 0, has_constraint = 1}
(gdb) p *((re_dfa_t *)(*(unsigned long *)(&RS_re_yes_case->pat)))->state_table[2]->array[0]
$89 = {hash = 6, nodes = {alloc = 2, nelem = 0, elems = 0x8092508},
  non_eps_nodes = {alloc = 0, nelem = 0, elems = 0x8092538},
  inveclosure = {alloc = 0, nelem = 0, elems = 0x0}, entrance_nodes = 0x8092518,
  trtable = 0x0, word_trtable = 0x0,
  context = 1, halt = 0, accept_mb = 0, has_backref = 0, has_constraint = 1}
(gdb) p *((re_dfa_t *)(*(unsigned long *)(&RS_re_no_case->pat)))->state_table[1]->array[0]
$90 = {hash = 5, nodes = {alloc = 2, nelem = 0, elems = 0x8092c00},
  non_eps_nodes = {alloc = 0, nelem = 0, elems = 0x8092c30},
  inveclosure = {alloc = 0, nelem = 0, elems = 0x0}, entrance_nodes = 0x8092c10,
  trtable = 0x0, word_trtable = 0x0,
  context = 0, halt = 0, accept_mb = 0, has_backref = 0, has_constraint = 1}
(gdb) p *((re_dfa_t *)(*(unsigned long *)(&RS_re_no_case->pat)))->state_table[2]->array[0]
$91 = {hash = 6, nodes = {alloc = 2, nelem = 0, elems = 0x8092c90},
  non_eps_nodes = {alloc = 0, nelem = 0, elems = 0x8092cc0},
  inveclosure = {alloc = 0, nelem = 0, elems = 0x0}, entrance_nodes = 0x8092ca0,
  trtable = 0x0, word_trtable = 0x0,
  context = 1, halt = 0, accept_mb = 0, has_backref = 0, has_constraint = 1}
(gdb) info brea
Num Type           Disp Enb Address    What
6   breakpoint     keep y   0x00696566 <exit+6>
        breakpoint already hit 1 time
8   breakpoint     keep y   0x0806ec05 in re_node_set_alloc at regex_internal.c:903
        stop only if set->alloc == 0
        breakpoint already hit 4 times

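As can be seen, the 4 blocks that valgrind reports as definitely lost are
actually still reachable. Each elems pointer printed at the breakpoints above
(0x80924a8, 0x8092538, 0x8092c30, 0x8092cc0) is still stored in the
non_eps_nodes set of a state in state_table at exit. Valgrind's report: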

==15214== 0 bytes in 4 blocks are definitely lost in loss record 1 of 35
==15214==    at 0x1B904A90: malloc (vg_replace_malloc.c:131)
==15214==    by 0x806EBFC: re_node_set_alloc (regex_internal.c:900)
==15214==    by 0x806F7CF: register_state (regex_internal.c:1511)
==15214==    by 0x806FB19: re_acquire_state_context (regex_internal.c:1662)
==15214==
==15214== LEAK SUMMARY:
==15214==    definitely lost: 0 bytes in 4 blocks.
==15214==    possibly lost:   0 bytes in 0 blocks.
==15214==    still reachable: 142813 bytes in 135 blocks.
==15214==         suppressed: 0 bytes in 0 blocks.
==15214== Reachable blocks (those to which a pointer was found) are not shown.
==15214== To see them, rerun with: --show-reachable=yes

(you are not calling regfree during exit if gawk is run by an allocation
checker).
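
The situation described above can be reduced to a small sketch: a zero-byte block whose only pointer lives in a global that is never released before exit. The names live_set, hold_zero_sized_block and held_block are hypothetical stand-ins for a compiled pattern that is never passed to regfree. The sketch does not claim how any particular valgrind version classifies the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical global owner, standing in for a compiled pattern that
   stays alive until the process exits. */
static struct {
  int alloc;
  int *elems;
} live_set;

/* Stores a zero-byte allocation in the global and deliberately does
   not free it.  Returns 1 if malloc(0) produced a real block. */
int hold_zero_sized_block(void)
{
  live_set.alloc = 0;
  live_set.elems = malloc(0);
  return live_set.elems != NULL;
}

/* Accessor showing that a pointer to the block still exists. */
int *held_block(void)
{
  return live_set.elems;
}
```

Calling hold_zero_sized_block() from main, exiting without freeing, and running the program under valgrind --leak-check=yes --show-reachable=yes is one way to check how a given checker reports the block. Because live_set.elems still points at it, a checker that follows the pointer should count it as still reachable rather than lost.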

At least on the glibc regex testsuite, re_node_set_alloc with last argument 0
happens in 0.9% of all calls, so it is definitely not something we should waste
any instructions on.
