Bug 632560 - strncasecmp fails sometimes
Summary: strncasecmp fails sometimes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Andreas Schwab
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-09-10 11:37 UTC by David Tardon
Modified: 2016-11-24 15:55 UTC (History)

Fixed In Version: glibc-2.12.90-14
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-09-30 06:15:42 UTC
Type: ---
Embargoed:


Attachments
reproducer (1.59 MB, application/x-xz)
2010-09-10 11:37 UTC, David Tardon
precompiled binary (341.55 KB, application/octet-stream)
2010-09-10 11:38 UTC, David Tardon
XML source (3.96 KB, text/xml)
2010-09-10 11:38 UTC, David Tardon
snippet from gdb session (482 bytes, text/plain)
2010-09-10 11:39 UTC, David Tardon
Attempt to reproduce (664 bytes, text/plain)
2010-09-11 15:31 UTC, Ulrich Drepper
preprocessed source of the test case (26.89 KB, text/plain)
2010-09-14 08:11 UTC, David Tardon
assembler code of the test case (1.43 KB, text/plain)
2010-09-14 08:12 UTC, David Tardon
/proc/cpuinfo (1.32 KB, text/plain)
2010-09-15 09:48 UTC, David Tardon
gdb session log (25.38 KB, text/plain)
2010-09-20 17:34 UTC, David Tardon
cache.tar.gz (827 bytes, application/x-gzip)
2010-09-21 05:59 UTC, David Tardon

Description David Tardon 2010-09-10 11:37:44 UTC
Created attachment 446493 [details]
reproducer

Description of problem:
After updating to glibc-2.12.90-8.x86_64, one of OO.o's internal build tools started to fail. I tracked it down to a strncasecmp call (masked as strnicmp by a #define in the source code :). It fails consistently at the same place, but only there: it runs fine most of the time :( See the steps to reproduce for how to build the failing binary from sources; if you don't want or need to do this, you can use the attached one.

Version-Release number of selected component (if applicable):
glibc-2.12.90-8.x86_64

How reproducible:
always

Steps to Reproduce:
1. get the attached tarball and unpack it somewhere
2. go to the unpacked dir
3. run ./prepare.sh; this does some compiling and fails at the end
4. ./solver/300/unxlngx6/bin/xml2cmp -types stdout stoc/unxlngx6/misc/jen.xml

or

1. get the attached binary and XML source
2. ./xml2cmp -types stdout $(pwd)/jen.xml
  
Actual results:
strncasecmp fails, causing build error: "Syntax error missing or not matching end tag in file: ../../unxlngx6/misc/jen.xml in line 60"

Expected results:
build runs smoothly (or, as smoothly as it is possible with OO.o...)

Additional info:
It works if there are fewer than three components in the path to the XML source (i.e. xml2cmp a/b/jen.xml is fine, xml2cmp a/b/c/jen.xml is not).

It works with glibc-2.12.90-6.x86_64 .

Maybe related to bug 632555 ?

Comment 1 David Tardon 2010-09-10 11:38:28 UTC
Created attachment 446494 [details]
precompiled binary

Comment 2 David Tardon 2010-09-10 11:38:57 UTC
Created attachment 446495 [details]
XML source

Comment 3 David Tardon 2010-09-10 11:39:42 UTC
Created attachment 446496 [details]
snippet from gdb session

Comment 4 David Tardon 2010-09-10 13:14:51 UTC
update: works with glibc-2.12.90-7.x86_64

Comment 5 Ulrich Drepper 2010-09-11 15:31:48 UTC
Created attachment 446670 [details]
Attempt to reproduce

I cannot reproduce any problem.  When I use your test program (compiled from source or the binary) I get

 com.sun.star.lang.DisposedException com.sun.star.lang.IllegalArgumentException com.sun.star.java.InvalidJavaSettingsException com.sun.star.java.JavaDisabledException com.sun.star.java.JavaInitializationException com.sun.star.java.JavaNotFoundException com.sun.star.java.JavaVMCreationFailureException com.sun.star.beans.NamedValue com.sun.star.beans.PropertyValue com.sun.star.java.RestartRequiredException com.sun.star.uno.TypeClass com.sun.star.uri/ExternalUriReferenceTranslator com.sun.star.lang.WrappedTargetRuntimeException com.sun.star.uno.XAggregation com.sun.star.lang.XComponent com.sun.star.uno.XComponentContext com.sun.star.container.XContainer com.sun.star.container.XContainerListener com.sun.star.uno.XCurrentContext com.sun.star.lang.XInitialization com.sun.star.task.XInteractionAbort com.sun.star.task.XInteractionContinuation com.sun.star.task.XInteractionHandler com.sun.star.task.XInteractionRequest com.sun.star.task.XInteractionRetry com.sun.star.java.XJavaThreadRegister_11 com.sun.star.java.XJavaVM com.sun.star.util.XMacroExpander com.sun.star.lang.XMultiServiceFactory com.sun.star.container.XNameAccess com.sun.star.lang.XServiceInfo com.sun.star.registry.XSimpleRegistry com.sun.star.lang.XSingleComponentFactory com.sun.star.lang.XSingleServiceFactory com.sun.star.lang.XTypeProvider com.sun.star.uno.XWeak


This is not what you wrote would happen and likely a different problem.


When I try to recreate the problematic call I also don't see any problem (see the attachment).  It works for all kinds of locales.

Note, this is with the upstream glibc, not the Fedora binary.  But these two shouldn't differ.

If you still can reproduce the issue look at the attachment.  Tweak it, if necessary, to reduce the test case.  I tried all three x86-64 strncasecmp implementations and nothing fails.

Comment 6 David Tardon 2010-09-14 08:06:54 UTC
(In reply to comment #5)
> Created attachment 446670 [details]
> Attempt to reproduce
> 
> I cannot reproduce any problem.  When I use your test program (compiled from
> source or the binary) I get
> 
>  com.sun.star.lang.DisposedException com.sun.star.lang.IllegalArgumentException
> --- cut ---
> 
> This is not what you wrote would happen and likely a different problem.
> 

No, that's the intended output.

> 
> When I try to recreate the problematic call I also don't see any problem (see
> the attachment).  It works for all kinds of locales.
> 
> Note, this is with the upstream glibc, not the Fedora binary.  But these two
> shouldn't differ.
> 
> If you still can reproduce the issue look at the attachment.  Tweak it, if
> necessary, to reduce the test case.  I tried all three x86-64 strncasecmp
> implementations and nothing fails.

I wish I couldn't reproduce it :( The test program gives me

s = 0x600db0, t = 0x601dbb, i = 62, j = 0

, which is obviously wrong.

I tried it without selinux, with an older kernel (kernel-2.6.36-0.9.rc2.git3.fc15) and gcc (gcc-4.5.1-1.fc14), both from the 20100826 rawhide snapshot, because I know for sure I had no build problem at that time, but so far the only thing I found is that it works with glibc up to glibc-2.12.90-7.x86_64 and doesn't work with any newer version.

My current versions of (possibly) related packages are:

glibc-2.12.90-10.x86_64
gcc-4.5.1-3.fc14.x86_64
kernel-2.6.36-0.20.rc3.git4.fc15.x86_64

Comment 7 David Tardon 2010-09-14 08:11:23 UTC
Created attachment 447176 [details]
preprocessed source of the test case

I'm attaching preprocessed source and assembler code of the test case, just in case there is something wrong with my environment.

Comment 8 David Tardon 2010-09-14 08:12:11 UTC
Created attachment 447178 [details]
assembler code of the test case

Comment 9 Ulrich Drepper 2010-09-14 17:55:10 UTC
What processor do you have?  The version selected depends on this.  What's the output of /proc/cpuinfo?  Is it virtualized?

Comment 10 David Tardon 2010-09-15 09:48:20 UTC
Created attachment 447428 [details]
/proc/cpuinfo

No, it's not virtualized.

Comment 11 Ulrich Drepper 2010-09-15 13:55:52 UTC
The notorious Pentium D?  I think that processor was rife with bugs.

I have no way to reproduce this myself so you'll have to do it.  You probably will have to single step through the code.  Start my test program with gdb and place a breakpoint on __strncasecmp_l_sse2.  If this doesn't work, step through the function until you arrive at the beginning of the real code.

Then step through the code instruction by instruction using 'si' and after every instruction print the register which has been modified, if any register is modified.  Best to use the hex format.  At the beginning show all registers once:

               info all-registers


The code starts with

       mov    (%rcx),%rax
       testl  $0x0,0x278(%rax)
       jne    somewhere...
       test   %rdx,%rdx
       je     somewhere...
       cmp    $0x1,%rdx
       je     somewhere

You can skip all the above as long as you arrive at the following instruction.  From here on I intermix the print commands whose output you should provide.  I hope you get the idea.

       mov    %rdx,%r11
                                    p/x $r11
       mov    %esi,%ecx
                                    p/x $rcx
       mov    %edi,%eax
                                    p/x $rax
       and    $0x3f,%rcx
       and    $0x3f,%rax
       movdqa 0x0(%rip),%xmm5
                                    p/x $xmm5.v16_int8
       movdqa 0x0(%rip),%xmm6
                                    p/x $xmm6.v16_int8
       ...


Note the format used for printing the SSE registers.  If you don't know x86 assembler well enough to determine which registers are modified (if any, the test instructions above don't modify anything) then just type

   info all-registers

after every instruction executed using 'si'.

You might want to start gdb under script.  On a terminal command line, just type script, then start and run the debugger, terminate the debugger, and then press Control-D.  You'll end up with a file named 'typescript' in the current dir which has all the text output in it.  That's the info I need.

Comment 12 Ulrich Drepper 2010-09-15 14:07:36 UTC
Also, are you sure you have the latest microcode update for the Pentium D?  As I wrote before, the processor was buggy.

Comment 13 H.J. Lu 2010-09-15 14:13:42 UTC
If you think it is a strncasecmp problem, please
find a small, standalone testcase. I will look
into it.

Comment 14 Ulrich Drepper 2010-09-15 14:23:48 UTC
(In reply to comment #13)
> If you think it is a strncasecmp problem, please
> find a small, standalone testcase. I will look
> into it.

HJ, my attachment from comment #5 is a small test case.

Comment 15 H.J. Lu 2010-09-15 18:41:50 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > If you think it is a strncasecmp problem, please
> > find a small, standalone testcase. I will look
> > into it.
> 
> HJ, my attachment from comment #5 is a small test case.

On

cpu family	: 15
model		: 6
model name	: Intel(R) Pentium(R) D CPU 3.73GHz
stepping	: 4

with glibc-2.12.1-2.f13.x86_64, I got

[hjl@gnu-33 bz632560]$ ./a.out 
s = 0x600db0, t = 0x601dbb, i = 0, j = 0
[hjl@gnu-33 bz632560]$ 

The output is the same as on Core i7.

Comment 16 Ulrich Drepper 2010-09-15 19:26:13 UTC
(In reply to comment #15)
> [hjl@gnu-33 bz632560]$ ./a.out 
> s = 0x600db0, t = 0x601dbb, i = 0, j = 0
> [hjl@gnu-33 bz632560]$ 
> 
> The output is the same as on Core i7.

That's what I see, too.  But you have to make sure you're also testing it on a machine which will use the sse2 version, not the ssse3 nor sse4.2 version.

As you see in comment #6, this isn't the case on that Pentium D.  Whenever I see that processor mentioned I think about processor bugs.  That's why I cc:ed you.

My guess is that a BIOS update is needed since I cannot reproduce it.

Comment 17 H.J. Lu 2010-09-15 19:35:20 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > [hjl@gnu-33 bz632560]$ ./a.out 
> > s = 0x600db0, t = 0x601dbb, i = 0, j = 0
> > [hjl@gnu-33 bz632560]$ 
> > 
> > The output is the same as on Core i7.
> That's what I see, too.  But you have to make sure you're also testing it on a
> machine which will use the sse2 version, not the ssse3 nor sse4.2 version.

I tested it on

model name : Intel(R) Pentium(R) D CPU 3.73GHz

It doesn't have SSSE3.

> As you see in comment #6, this isn't the case on that Pentium D.  Whenever I
> see that processor mentioned I think about processor bugs.  That's why I cc:ed
> you.
> My guess is that a BIOS update is needed since I cannot reproduce it.

It is a good idea.

Comment 18 David Tardon 2010-09-20 17:34:36 UTC
Created attachment 448524 [details]
gdb session log

Comment 19 H.J. Lu 2010-09-20 17:43:35 UTC
(In reply to comment #18)
> Created attachment 448524 [details]
> gdb session log

Please answer the following questions:

1. Which testcase?
2. Does it fail every time on the same machine?
3. Does it fail on different machines without SSSE3?

Comment 20 David Tardon 2010-09-20 18:36:07 UTC
> 1. Which testcase?
u.c from comment 5

> 2. Does it fail every time on the same machine?
Yes. I thought I was pretty explicit about it in my previous comments...

> 3. Does it fail on different machines without SSSE3?
I suppose you do mean SSE2. It doesn't fail on my laptop with Intel Core i3, where SSE3 version is used, AFAICS.

Comment 21 H.J. Lu 2010-09-20 18:41:38 UTC
(In reply to comment #20)

> > 3. Does it fail on different machines without SSSE3?
> I suppose you do mean SSE2. It doesn't fail on my laptop with Intel Core i3,

I meant "without SSSE3". On machines without SSSE3, SSE2
version will be used.  Please try it on other machines without
SSSE3.

Comment 22 H.J. Lu 2010-09-20 18:52:31 UTC
Please upload /tmp/cache.info.tar.gz created by:

# cd /
# tar cfz /tmp/cache.info.tar.gz ./sys/devices/system/cpu/cpu0/cache

Comment 23 Ulrich Drepper 2010-09-20 21:05:21 UTC
I found and fixed one more limit check in strncasecmp.  This should fix the problem.  Why only you see it is a mystery, though.  Andreas will hopefully build a new glibc soon.

Comment 24 David Tardon 2010-09-21 05:59:19 UTC
Created attachment 448618 [details]
cache.tar.gz

Comment 25 Fedora Update System 2010-09-27 15:48:50 UTC
glibc-2.12.90-13 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/glibc-2.12.90-13

Comment 26 Fedora Update System 2010-09-27 20:07:18 UTC
glibc-2.12.90-13 has been pushed to the Fedora 14 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update glibc'.  You can provide feedback for this update here: https://admin.fedoraproject.org/updates/glibc-2.12.90-13

Comment 27 Fedora Update System 2010-09-28 17:32:01 UTC
glibc-2.12.90-14 has been pushed to the Fedora 14 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update glibc'.  You can provide feedback for this update here: https://admin.fedoraproject.org/updates/glibc-2.12.90-14

Comment 28 David Tardon 2010-09-29 06:36:43 UTC
Yup, that fixes it. Thanks, Ulrich!

Comment 29 Fedora Update System 2010-09-30 06:15:17 UTC
glibc-2.12.90-14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

