Bug 1208953 - Random "make check -j8" failures with kernel 3.19.2/3.19.3
Summary: Random "make check -j8" failures with kernel 3.19.2/3.19.3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 21
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-04-04 13:09 UTC by H.J. Lu
Modified: 2015-04-26 12:51 UTC

Fixed In Version: kernel-3.19.5-200.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-26 12:50:18 UTC
Type: Bug
Embargoed:


Attachments
Patches against kernel v3.19.4 (13.60 KB, application/octet-stream)
2015-04-15 17:37 UTC, H.J. Lu


Links
System        ID     Private  Priority  Status  Summary  Last Updated
Linux Kernel  96311  0        None      None    None     Never

Internal Links: 1233842

Description H.J. Lu 2015-04-04 13:09:07 UTC
I ran into strange pipe I/O issues with kernel 3.19.2/3.19.3 on both ia32 and
Intel64: "make check -j 8" in GCC produces random comparison failures.  We use
expect/tcl scripts to run GCC on testcases.  The scripts capture the GCC output
via a pipe and compare it against the expected output.  Under kernel 3.19.x,
part of the GCC output gets lost under heavy load and the comparison fails.
When I ran the scripts manually, everything looked fine.  Kernel 3.18.x is OK.
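
The failure mode described above can be probed with a small stress loop. The following is a minimal sketch, not the GCC testsuite harness: it assumes Python 3 on Linux, captures output through a pty (which is how expect runs its child processes) rather than a plain pipe, and the names capture_via_pty, count_mismatches, CHILD and EXPECTED are hypothetical. Each run has a child emit 16000 bytes, well over 4K, and checks that nothing was lost.

```python
import os
import pty
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical payload: 200 lines of 80 bytes each (16000 bytes in total).
EXPECTED = (b"x" * 79 + b"\n") * 200
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write(('x' * 79 + '\\n') * 200)"]


def capture_via_pty(argv):
    """Run argv with stdout/stderr on a pty and return all captured output."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(argv, stdin=subprocess.DEVNULL,
                                stdout=slave_fd, stderr=slave_fd)
    except BaseException:
        os.close(master_fd)
        raise
    finally:
        os.close(slave_fd)  # the parent keeps only the master side
    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                break  # Linux reports EIO once the slave side is closed
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
        proc.wait()
    # The pty line discipline turns "\n" into "\r\n" on output; undo that.
    return b"".join(chunks).replace(b"\r\n", b"\n")


def count_mismatches(runs, jobs=8):
    """Run the child `runs` times, `jobs` at a time; count bad captures."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        outputs = list(pool.map(lambda _: capture_via_pty(CHILD), range(runs)))
    return sum(out != EXPECTED for out in outputs)


if __name__ == "__main__":
    print("mismatches:", count_mismatches(200))
```

Because the loss is load-dependent, a count of zero from a short run does not show that a kernel is unaffected; raising the run count and running alongside other heavy jobs makes a hit more likely.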

Comment 1 H.J. Lu 2015-04-06 19:50:43 UTC
In each case, the size of GCC outputs is > 4K:

-rw------- 1 hjl hjl 9796 Apr  6 12:39 bad.1
-rw------- 1 hjl hjl 6817 Apr  6 12:43 bad.2
-rw------- 1 hjl hjl 7077 Apr  6 12:45 bad.3
-rw------- 1 hjl hjl 43890 Apr  6 12:41 bad.1
-rw------- 1 hjl hjl  7678 Apr  6 12:44 bad.2

It could be related to PIPE_BUF.
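
For reference, PIPE_BUF is the largest write that POSIX guarantees to be atomic on a pipe (4096 bytes on Linux). A minimal sketch, assuming Python 3 on a POSIX system, that queries the limit and checks the sizes from the listing above against it; the helper names pipe_buf_limit and exceeds_pipe_buf are hypothetical, and the check only confirms the ">4K" observation, not the cause.

```python
import os

# Sizes in bytes taken from the file listing above.
BAD_SIZES = [9796, 6817, 7077, 43890, 7678]


def pipe_buf_limit():
    """Return PIPE_BUF as reported by the system for a freshly created pipe."""
    read_fd, write_fd = os.pipe()
    try:
        return os.fpathconf(read_fd, "PC_PIPE_BUF")
    finally:
        os.close(read_fd)
        os.close(write_fd)


def exceeds_pipe_buf(sizes):
    """Return one boolean per size: True if it is larger than PIPE_BUF."""
    limit = pipe_buf_limit()
    return [size > limit for size in sizes]


if __name__ == "__main__":
    print(pipe_buf_limit(), exceeds_pipe_buf(BAD_SIZES))
```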

Comment 2 Josh Boyer 2015-04-15 17:19:51 UTC
Thanks for reporting this upstream.  I've added v3 of the patch to the Fedora kernel in git.  Will be in the next builds.

Comment 3 H.J. Lu 2015-04-15 17:37:52 UTC
Created attachment 1014908
Patches against kernel v3.19.4

FYI, these are my patches for kernel 3.19.4 cherry-picked
from kernel master branch + the v3 patch.

Comment 4 H.J. Lu 2015-04-15 17:44:56 UTC
(In reply to Josh Boyer from comment #2)
> Thanks for reporting this upstream.  I've added v3 of the patch to the
> Fedora kernel in git.  Will be in the next builds.

Do we need other pty fixes?  The v3 patch alone doesn't fix the problem
for me on kernel 3.19.4.

Comment 5 Josh Boyer 2015-04-15 17:52:27 UTC
Um... from https://bugzilla.kernel.org/show_bug.cgi?id=96311#c7:

 H.J. Lu 2015-04-10 17:28:55 UTC

The same patch also works on kernel 3.19.3-200 from Fedora 21.


So you said patch (singular) there.  Were you mistaken?  What exactly did you test?

Comment 6 H.J. Lu 2015-04-15 18:19:36 UTC
(In reply to Josh Boyer from comment #5)
> Um... from https://bugzilla.kernel.org/show_bug.cgi?id=96311#c7:
> 
>  H.J. Lu 2015-04-10 17:28:55 UTC
> 
> The same patch also works on kernel 3.19.3-200 from Fedora 21.

Yes, it worked for me on 3.19.3-200 too, but failed on 3.19.4-200.

> 
> So you said patch (singular) there.  Were you mistaken?  What exactly did
> you test?

I tested the v3 patch alone on both 3.19.3-200 and 3.19.4-200.
It passed on 3.19.3-200 and failed on 3.19.4-200.  The series of
patches I uploaded works on 3.19.4-200.  I didn't try them on
3.19.3-200.

Comment 7 Josh Boyer 2015-04-15 18:38:24 UTC
(In reply to H.J. Lu from comment #6)
> I tested the v3 patch alone on both 3.19.3-200 and 3.19.4-200.
> It passed on 3.19.3-200 and failed on 3.19.4-200.  The series of
> patches I uploaded works on 3.19.4-200.  I didn't try them on
> 3.19.3-200.

That doesn't make a whole lot of sense though.  There is literally nothing touching the tty/pty code going from 3.19.3 to 3.19.4.  Applying this specific single patch on top of either should result in the same thing.  The rest of the patches in 3.19.4 are very driver specific.

Did you perhaps get a false positive/negative in your testing somewhere?

Comment 8 H.J. Lu 2015-04-15 18:55:51 UTC
(In reply to Josh Boyer from comment #7)
> That doesn't make a whole lot of sense though.  There is literally nothing
> touching the tty/pty code going from 3.19.3 to 3.19.4.  Applying this
> specific single patch on top of either should result in the same thing.  The
> rest of the patches in 3.19.4 are very driver specific.
> 
> Did you perhaps get a false positive/negative in your testing somewhere?

It is a race condition.  My test on 3.19.3-200 may not trigger
it.  But the same test triggered it on 3.19.4-200.  Have you tried
my test on 3.19.4-200? Does it fail without the fix and pass with it?

Comment 9 H.J. Lu 2015-04-15 19:46:10 UTC
I tried your pty-Fix-input-race-when-closing.patch on 3.19.4.
It does fix the problem.  Thanks.

Comment 10 Josh Boyer 2015-04-15 19:53:08 UTC
Wonderful.  Thanks for testing and confirming.

Comment 11 H.J. Lu 2015-04-16 04:35:08 UTC
(In reply to H.J. Lu from comment #9)
> I tried your pty-Fix-input-race-when-closing.patch on 3.19.4.
> It does fix the problem.  Thanks.

Unfortunately, pty-Fix-input-race-when-closing.patch works
on 3.19.4, but not on 3.19.4-200.  My patches work on both.

Comment 12 Josh Boyer 2015-04-16 12:44:44 UTC
That is again very confusing.  We aren't carrying anything else that would modify that code in the Fedora packages.

Comment 13 H.J. Lu 2015-04-16 12:56:49 UTC
Backporting one pty fix won't fix all the pty issues in the 3.19 kernel.
A race condition may be triggered in many different ways.

Comment 14 Josh Boyer 2015-04-16 13:09:29 UTC
Sure, I understand that.  But saying it 'works' on 3.19.4 seems wrong then, because the race is likely still there and you just aren't hitting it.

Comment 15 H.J. Lu 2015-04-16 13:20:48 UTC
(In reply to Josh Boyer from comment #14)
> Sure, I understand that.  But saying it 'works' on 3.19.4 seems wrong then,
> because the race is likely still there and you just aren't hitting it.

That is correct.  This failure is random.

Comment 16 H.J. Lu 2015-04-20 23:18:37 UTC
3.19.5-200 is OK.  This

commit b1ff43a8757d7a57b49231db1713c5ec36477bdf
Author: Peter Hurley <peter>
Date:   Fri Jan 16 15:05:39 2015 -0500

    n_tty: Fix read buffer overwrite when no newline

is new in 3.19.5.

Comment 17 Fedora Update System 2015-04-21 13:22:08 UTC
kernel-3.19.5-200.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/kernel-3.19.5-200.fc21

Comment 18 Fedora Update System 2015-04-21 13:22:35 UTC
kernel-3.19.5-100.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.19.5-100.fc20

Comment 19 Fedora Update System 2015-04-22 22:41:37 UTC
Package kernel-3.19.5-100.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.19.5-100.fc20'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-6579/kernel-3.19.5-100.fc20
then log in and leave karma (feedback).

Comment 20 Fedora Update System 2015-04-26 12:50:18 UTC
kernel-3.19.5-100.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 21 Fedora Update System 2015-04-26 12:51:04 UTC
kernel-3.19.5-200.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

