Bug 1602275 - readChar on gzfile returns garbage on aarch64
Summary: readChar on gzfile returns garbage on aarch64
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: R
Version: 29
Hardware: aarch64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jeremy Linton
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F29FTBFS 1603313 1622474
 
Reported: 2018-07-18 06:29 UTC by Elliott Sales de Andrade
Modified: 2019-11-27 19:26 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1622474
Environment:
Last Closed: 2019-11-27 19:26:14 UTC
Type: Bug
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1582444 | 1 | None | None | None | 2021-01-20 06:05:38 UTC
Red Hat Bugzilla | 1582555 | 0 | urgent | CLOSED | regression in zlib-1.2.11-8: ARM optimizations broke git log on aarch64 | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 1583065 | 1 | None | None | None | 2021-01-20 06:05:38 UTC

Internal Links: 1582444 1582555 1583065

Description Elliott Sales de Andrade 2018-07-18 06:29:37 UTC
Description of problem:
The R-xml2 package is failing the mass rebuild [1]; this build [2] shows a more verbose log with the test error being:

-- 1. Failure: write_xml works with an implicit connections (@test-write_xml.R#4
readChar(file, 1000L) not identical to "<x/>\n".
1/1 mismatches
x[1]: "<x/>\nUUUUUUUUUUUUUUUU"
y[1]: "<x/>\n"
-- 2. Failure: write_xml works with nodeset input and connections (@test-write_x
readChar(file, 1000L) not identical to "<y/>".
1/1 mismatches
x[1]: "<y/>UUUUUUUUUUUUUUUU"
y[1]: "<y/>"
-- 3. Failure: write_xml works with node input and connections (@test-write_xml.
readChar(file, 1000L) not identical to "<y/>".
1/1 mismatches
x[1]: "<y/>UUUUUUUUUUUUUUUU"
y[1]: "<y/>"
-- 4. Failure: write_html work with html input (@test-write_xml.R#110)  --------
readChar(file, 1000L) not identical to "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><head>\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n<title>Foo</title>\n</head></html>\n".
1/1 mismatches
x[1]: "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://
x[1]: www.w3.org/TR/REC-html40/loose.dtd\">\n<html><head>\n<meta http-equiv=\"Co
x[1]: ntent-Type\" content=\"text/html; charset=UTF-8\">\n<title>Foo</title>\n</
x[1]: head></html>\nUUUUUUUUUUUUUUUU"
y[1]: "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://
y[1]: www.w3.org/TR/REC-html40/loose.dtd\">\n<html><head>\n<meta http-equiv=\"Co
y[1]: ntent-Type\" content=\"text/html; charset=UTF-8\">\n<title>Foo</title>\n</
y[1]: head></html>\n"

There are a bunch of U's at the end of the read data. In this build [3], I changed the test to only run:

echo -n "abc" | gzip > file
Rscript --vanilla -e 'quit(status = if(readChar(gzfile("file", "rb"), 1000L) == "abc") 0 else 1)'

On all other arches, this returns TRUE and passes, but on aarch64, it returns garbage and fails.
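
As a cross-check that does not involve R at all, the same round trip can be sketched in Python. This is only an illustration of the expected result: the report itself uses gzip and Rscript, and the choice of Python's standard gzip module is an assumption of this sketch, not something used in the builds above.

```python
import gzip

# Rough equivalent of: echo -n "abc" | gzip > file
# (kept in memory instead of writing a file)
compressed = gzip.compress(b"abc")

# Every gzip stream starts with the magic bytes 0x1f 0x8b.
assert compressed[:2] == b"\x1f\x8b"

# Decompressing must give back exactly the three input bytes,
# with nothing appended after them.
restored = gzip.decompress(compressed)
print(restored)  # b'abc'
```

If a check like this passes on the affected machine while the Rscript reproducer fails, that would suggest the compressed file is fine and the stray bytes come from the read path.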


Version-Release number of selected component (if applicable):
3.5.0-5.fc29


Steps to Reproduce:
echo -n "abc" | gzip > file
Rscript --vanilla -e 'quit(status = if(readChar(gzfile("file", "rb"), 1000L) == "abc") 0 else 1)'


Actual results:
Fails on aarch64, passes elsewhere.


Expected results:
All arches should pass.


Additional info:
[1] https://koji.fedoraproject.org/koji/taskinfo?taskID=28163654
[2] https://koji.fedoraproject.org/koji/taskinfo?taskID=28359389
[3] https://koji.fedoraproject.org/koji/taskinfo?taskID=28388308

Comment 1 Elliott Sales de Andrade 2018-07-18 07:12:42 UTC
Annoyingly, this doesn't seem reproducible using the mock --forcearch option.

Comment 2 Jan Kurik 2018-08-14 09:52:23 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle.
Changing version to '29'.

Comment 3 Elliott Sales de Andrade 2018-08-23 07:44:03 UTC
I'm re-assigning to zlib as R uses deflateInit2/deflate/deflateEnd from there to do its gzip handling. zlib has some patches to use NEON optimizations on aarch64 that may be the cause of this bug.

You can see the builds on
Rawhide: https://koji.fedoraproject.org/koji/taskinfo?taskID=29242464
F29: https://koji.fedoraproject.org/koji/taskinfo?taskID=29242915
F28: https://koji.fedoraproject.org/koji/taskinfo?taskID=29242922

all fail now that a zlib containing the patch is on F28.

Note: these builds do not build R-xml2, but run the following simplified %check:
echo -n "abc" | gzip > file
xxd file
Rscript --vanilla -e 'ret <- readChar(gzfile("file", "rb"), 1000L); print(ret); quit(status = if(ret == "abc") 0 else 1)'

Comment 4 Elliott Sales de Andrade 2018-08-23 08:22:55 UTC
Also CCing Peter Robinson, who appears to have added those patches but is neither a committer in Pagure nor automatically CC'd.

Comment 5 Peter Robinson 2018-08-23 09:30:19 UTC
Jeremy, can you take a look?

Comment 6 Zbigniew Jędrzejewski-Szmek 2018-08-27 09:24:33 UTC
There has been at least one successful build after the mass rebuild.

zlib-1.2.11-9.fc29: https://koji.fedoraproject.org/koji/buildinfo?buildID=1127542

Comment 7 Pavel Raiskup 2018-08-27 09:57:05 UTC
Reassigning back to R and cloning for zlib.

Comment 8 Jeremy Linton 2018-08-29 19:56:09 UTC
Annoyingly, I just saw this (out of the noise).

Yes, this sounds similar to the gzip case where the buffer isn't being terminated after it's passed to zlib.

Comment 9 Jeremy Linton 2018-09-04 15:07:26 UTC
I'm not dead, but the holiday and another bug took precedence. I'm looking at this again today and hopefully should have a patch RSN.

Comment 10 Jeremy Linton 2018-09-04 20:05:14 UTC
I continue to see the test failure with:

Last 13 lines of output:
  x[1]: head></html>\nUUUUUUUUUUUUUUUU"
  y[1]: "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://
  y[1]: www.w3.org/TR/REC-html40/loose.dtd\">\n<html><head>\n<meta http-equiv=\"Co
  y[1]: ntent-Type\" content=\"text/html; charset=UTF-8\">\n<title>Foo</title>\n</
  y[1]: head></html>\n"


but the direct Rscript call works. This could just be a case of how it's being compressed and the source length.

Comment 11 Jeremy Linton 2018-09-04 22:26:29 UTC
Putting a memset against the buffer in R_gzread() reproduces the readChar(gzfile("xxx", "rb")) output, so that it consistently has a garbage character following the decompressed data (then the null).
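
The buffer-poisoning technique described here can be illustrated with a toy model. The sketch below is in Python, not the actual C code in R; `fake_gzread`, `read_trusting_buffer` and `read_trusting_count` are hypothetical names invented for this example, and the 0x55 fill value is chosen only because it is ASCII 'U' and so mirrors the log output. It shows why pre-filling a buffer makes a "reads past the returned length" mistake deterministic. It does not establish that this is the actual cause of the bug.

```python
POISON = 0x55  # ASCII 'U'; an assumption picked to mirror the logs


def fake_gzread(buf: bytearray, payload: bytes) -> int:
    """Hypothetical stand-in for a gzread-style call.

    Copies payload into the start of buf, returns the number of bytes
    written, and leaves the rest of buf untouched.
    """
    n = min(len(payload), len(buf))
    buf[:n] = payload[:n]
    return n


def read_trusting_buffer(buf: bytearray) -> bytes:
    """Naive consumer: treats buf as a NUL-terminated C string."""
    end = buf.find(0)
    return bytes(buf if end < 0 else buf[:end])


def read_trusting_count(buf: bytearray, n: int) -> bytes:
    """Careful consumer: uses only the n bytes the reader reported."""
    return bytes(buf[:n])


# 16 poison bytes followed by a single terminating NUL,
# roughly what a memset before the read would leave behind.
buf = bytearray([POISON] * 16) + bytearray(1)
n = fake_gzread(buf, b"abc")

print(read_trusting_buffer(buf))    # "abc" followed by the leftover poison bytes
print(read_trusting_count(buf, n))  # b'abc'
```

In this model the payload overwrites the first 3 poison bytes, so the naive consumer sees the 13 that remain before the NUL.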

Comment 12 Jeremy Linton 2018-09-04 23:47:19 UTC
Well, at the moment it looks like the following code snippet at the top of R_gzread() works around the problem.

if (len)
    start[0] = '\0';

This force-terminates the _first_ byte of the buffer. This is because there are thousands and thousands of calls to R_gzread() in the simple example presented above, and the string is read in tiny chunks. It appears that R is calling it an extra time to be sure that the source buffer has ended, and it's this case (good output buf/len, but at Z_STREAM_END) which is returning an extra byte.

So at this point I'm not 100% sure who is at fault here, which is for tomorrow.
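
The scenario in this comment (tiny chunked reads, then one extra call at end of stream) can be modelled as below. This is a Python sketch under stated assumptions, not R's or zlib's actual code. `read_all` and its `clear_first_byte` flag are hypothetical. The consumer is deliberately naive and reads the reused buffer as a NUL-terminated string. Whether R's real consumer behaves this way is not confirmed anywhere in this thread. The sketch shows one way a stale byte could be picked up on the final zero-length read, and how clearing the first byte beforehand hides it.

```python
import gzip
import zlib


def read_all(compressed: bytes, clear_first_byte: bool) -> bytes:
    """Inflate a gzip stream one output byte per call into a reused buffer."""
    d = zlib.decompressobj(wbits=31)  # 31 = 16 + 15: expect a gzip wrapper
    pending = compressed
    buf = bytearray(2)  # one data byte plus a NUL, reused for every call
    result = b""
    while True:
        if clear_first_byte:
            buf[0] = 0  # analogue of: if (len) start[0] = '\0';
        out = d.decompress(pending, 1)  # at most one output byte per call
        pending = d.unconsumed_tail
        buf[:len(out)] = out  # a zero-byte read leaves buf untouched
        # Deliberately naive: read buf up to the first NUL instead of
        # trusting len(out).
        result += bytes(buf[:buf.find(0)])
        if not out:  # the extra call at end of stream returned nothing
            return result


data = gzip.compress(b"abc")
print(read_all(data, clear_first_byte=False))  # b'abcc' (stale byte repeated)
print(read_all(data, clear_first_byte=True))   # b'abc'
```

A consumer that appended only `out`, that is, exactly the bytes reported by each call, would return b"abc" in both cases without needing the workaround.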

Comment 13 Ben Cotton 2019-10-31 20:43:00 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 29 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 14 Ben Cotton 2019-11-27 19:26:14 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

