518247 – Valgrind doesn't handle SSE optimized string/memory functions

Bug 518247 - Valgrind doesn't handle SSE optimized string/memory functions

Summary: Valgrind doesn't handle SSE optimized string/memory functions

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	valgrind
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Dodji Seketeli
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	526912 (view as bug list)
Depends On:
Blocks:	F12Target
TreeView+	depends on / blocked

Reported:	2009-08-19 16:33 UTC by H.J. Lu
Modified:	2009-11-04 15:28 UTC (History)
CC List:	10 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2009-11-04 15:28:14 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Test case for strlen (592 bytes, text/x-csrc) 2009-09-07 09:08 UTC, Richard W.M. Jones	no flags	Details
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
KDE Software Compilation	206013	0	NOR	RESOLVED	valgrind redirection doesn't work with STT_GNU_IFUNC symbols	2020-08-25 06:53:02 UTC

Description H.J. Lu 2009-08-19 16:33:37 UTC

SSE optimized string/memory functions may read beyond the memory
boundary when it is safe to do so. Valgrind doesn't handle them.
I got tons of messages like:

==11893== Invalid read of size 8
==11893==    at 0x52FCB4C: __GI_strncmp (in /lib64/libc-2.10.90.so)
==11893==    by 0x52A71B3: _nl_load_locale_from_archive (in /lib64/libc-2.10.90.
so)
==11893==    by 0x52A6371: _nl_find_locale (in /lib64/libc-2.10.90.so)
==11893==    by 0x52A5A0B: setlocale (in /lib64/libc-2.10.90.so)
==11893==    by 0xA4F682: gcc_init_libintl (intl.c:54)

Comment 1 Lennart Poettering 2009-08-24 16:03:21 UTC

Hmm, this is a real problem making valgrind's output pretty much useless on rawhide right now.

Comment 2 Lennart Poettering 2009-09-01 16:52:37 UTC

Is there any chance we can get this fixed quickly? A lot of people are depending that debugging helpers such as vg work to fix the remaining bugs in F12!

Comment 3 Richard W.M. Jones 2009-09-07 09:08:21 UTC

Created attachment 359971 [details]
Test case for strlen

I also hit this bug over the weekend.  My simple
test case is attached.

Comment 4 Jakub Jelinek 2009-10-02 13:41:09 UTC

*** Bug 526912 has been marked as a duplicate of this bug. ***

Comment 5 Stefan Becker 2009-10-03 09:07:55 UTC

If I'm not mistaken this is also the root cause why valgrind shows false-positives for memory leaks. One example from the pidgin plugin I'm trying to debug:

Linux l3f1199 2.6.31.1-56.fc12.i686 #1 SMP Tue Sep 29 16:32:02 EDT 2009 i686 i686 i386 GNU/Linux

==9609== 1 bytes in 1 blocks are possibly lost in loss record 30 of 8,851
==9609==    at 0x402794C: malloc (vg_replace_malloc.c:195)
==9609==    by 0x4E336B4: g_malloc (gmem.c:131)
==9609==    by 0x4E4C2A9: g_strdup (gstrfuncs.c:102)
==9609==    by 0x643AB8F: sipmsg_parse_msg (sipmsg.c:43)

 33: struct sipmsg *sipmsg_parse_msg(const gchar *msg) {
...
 43: 	smsg->body = g_strdup(tmp + 4);

gstrfuncs.c:
101:       length = strlen (str) + 1;
102:       new_str = g_new (char, length);
103:       memcpy (new_str, str, length);

gmem.c:
131:       mem = glib_mem_vtable.malloc (n_bytes);

The corresponding free function:

240: void sipmsg_free(struct sipmsg *msg) {
...
261: 	g_free(msg->body);

Personally I would move this error to high priority, because valgrind is completely useless :(

Comment 6 Matthias Clasen 2009-10-04 20:12:31 UTC

The patch in https://bugs.kde.org/show_bug.cgi?id=206013 seems to work ok for me.
I've put my local builds with that patch up at http://mclasen.fedorapeople.org/

Comment 7 Dodji Seketeli 2009-10-05 07:50:13 UTC

(In reply to comment #6)
> The patch in https://bugs.kde.org/show_bug.cgi?id=206013 seems to work ok for
> me.
> I've put my local builds with that patch up at http://mclasen.fedorapeople.org/  

Thanks Matthias for the package.

Guys, please update this bug with your feedback about the patched package Matthias has put on http://mclasen.fedorapeople.org/. If it works well enough for you we'll update the rawhide valgrind package with that patch.

Comment 8 Stefan Becker 2009-10-05 10:24:11 UTC

Tested on i686: the invalid reads are gone, so I only get heap errors in my testing.

False positives are still there though, but I guess that is a different problem.

Comment 9 Dodji Seketeli 2009-10-05 12:11:09 UTC

(In reply to comment #8)
> Tested on i686: the invalid reads are gone, so I only get heap errors in my
> testing.
> 
> False positives are still there though, but I guess that is a different
> problem.

Thanks for testing.

As for the possible false positive, I'd recommand to file a different bug with a bit more information, so that we can track it with more detail.

Did the package you tested make valgrind useful for you or is it still useless?

Most of the programs I have tested it on didn't spit any false positive for me so far, unless some python related program. But even there, I haven't really investigated so I am not sure those are false positive yet.

Thanks.

Comment 10 Stefan Becker 2009-10-05 12:22:14 UTC

(In reply to comment #9)
> 
> Did the package you tested make valgrind useful for you or is it still useless?

It is still useless, unfortunately. When I have time I'll try running valgrind and that SAME package on a F11 machine. AFAIR the false-positives started once I updated my development to F12-rawhide.

Comment 11 Stefan Becker 2009-10-05 13:49:07 UTC

(In reply to comment #9)
> 
> As for the possible false positive, I'd recommand to file a different bug with
> a bit more information, so that we can track it with more detail.

See bug #527230

Comment 12 Dodji Seketeli 2009-10-05 13:53:44 UTC

Thanks Stefan.

Comment 13 Jakub Jelinek 2009-10-12 17:43:03 UTC

I've built valgrind-3.5.0-2 with this patch into dist-12-updates-candidate, but I see a bunch of testsuite regressions compared to 3.5.0-1 (which admittedly was run against much older glibc).  Some regressions are caused by glibc change to print -nan instead of nan for NaN values with sign bit set, but others, e.g.
coolo_strlen are clearly ifunc related:

--- coolo_strlen.stderr.exp     2009-08-19 15:37:29.000000000 +0200
+++ coolo_strlen.stderr.out     2009-10-12 18:46:42.000000000 +0200
@@ -1,2 +1,32 @@


+valgrind: m_oset.c:535 (vgPlain_OSetGen_Lookup): Assertion 't' failed.
+   at 0x........: report_and_quit (m_libcassert.c:145)
+   by 0x........: vgPlain_assert_fail (m_libcassert.c:217)
+   by 0x........: vgPlain_OSetGen_Lookup (m_oset.c:535)
+   by 0x........: vgPlain_redir_change_ifunc_target (m_redir.c:509)
+   by 0x........: vgPlain_scheduler (scheduler.c:1404)
+   by 0x........: run_a_thread_NORETURN (syswrap-linux.c:91)
+
+sched status:
+  running_tid=1
+
+Thread 1: status = VgTs_Runnable
+   at 0x........: * (vg_preloaded.c:92)
+   by 0x........: _dl_fixup (in /lib64/ld-2.10.90.so)
+   by 0x........: _dl_runtime_resolve (in /lib64/ld-2.10.90.so)
+   by 0x........: main (coolo_strlen.c:7)
...

Comment 14 Dodji Seketeli 2009-10-12 21:24:46 UTC

I am looking at this.

Comment 15 Jim Meyering 2009-10-26 20:08:19 UTC

Jakub, thanks for all of the work.
On F12, I was seeing way too much noise Invalid read of size 8 in e.g., __GI_strlen and __GI_strcmp.

I installed the latest from koji, valgrind.x86_64 1:3.5.0-6, and it seems to have solved the problems.

Note You need to log in before you can comment on or make changes to this bug.