Bug 1562066 - some eccodes tests fail on i686 causing a build failure.
Summary: some eccodes tests fail on i686 causing a build failure.
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: eccodes
Version: rawhide
Hardware: i686
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jos de Kloe
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: FE-ExcludeArch-x86, F-ExcludeArch-x86
Reported: 2018-03-29 13:08 UTC by Jos de Kloe
Modified: 2022-10-04 09:26 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:



Description Jos de Kloe 2018-03-29 13:08:10 UTC
Description of problem:

The koji build of the eccodes package fails because of test failures in the %check section.

Version-Release number of selected component (if applicable):

eccodes-2.7.0-1

How reproducible: always

Failing test cases are:
The following tests FAILED:
	 14 - eccodes_t_bufr_dump_encode_filter (Failed)
	 16 - eccodes_t_bufrdc_ref (Failed)
	 21 - eccodes_t_bufr_filter (Failed)
	 39 - eccodes_t_gts_get (Failed)
	 40 - eccodes_t_gts_ls (Failed)
	 41 - eccodes_t_gts_compare (Failed)
	 42 - eccodes_t_metar_ls (Failed)
	 43 - eccodes_t_metar_get (Failed)
	 44 - eccodes_t_metar_dump (Failed)
	 45 - eccodes_t_metar_compare (Failed)
	 48 - eccodes_t_sh_ieee64 (Child aborted)
	 91 - eccodes_t_bufr_dump_encode_fortran (Failed)
	 94 - eccodes_t_bufr_dump_encode_C (Failed)
	 96 - eccodes_t_bufr_dump_encode_python (Failed)
	182 - eccodes_p_grib_get_message_offset_test (Failed)
Errors while running CTest

For details, see: https://koji.fedoraproject.org/koji/buildinfo?buildID=1061241

Comment 1 Jos de Kloe 2018-03-29 13:44:52 UTC
Note that the same tests fail on armv7hl, so the problem may be related.
see: https://bugzilla.redhat.com/show_bug.cgi?id=1562084

Comment 2 Jos de Kloe 2018-03-29 13:51:50 UTC
The gts and metar problems seem related.
They all end in the error "no message found in <file>" even though the file used for testing clearly contains data.
While trying to debug the eccodes_t_gts_get test case I could track the issue down to the file eccodes-2.7.0-Source/src/grib_io.c
function read_any_gts, line 893:

   r->read(r->read_data,buffer+already_read,message_size-already_read,&err);

This one returns with error code -1, but I have no idea why.

Comment 3 Jos de Kloe 2018-03-30 13:07:06 UTC
This issue has been reported upstream at:
https://software.ecmwf.int/issues/browse/SUP-2408

Comment 4 Jos de Kloe 2018-08-03 07:50:55 UTC
The problem remains after upgrading to eccodes 2.8.0.

Comment 5 Jan Kurik 2018-08-14 10:08:42 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle.
Changing version to '29'.

Comment 6 Mamoru TASAKA 2021-12-23 08:21:30 UTC
Well, I've written comments for this on 
https://jira.ecmwf.int/browse/SUP-2408?focusedCommentId=569724&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-569724

But I will also put some notes on this Bugzilla ticket.

Now with eccodes 2.4.1, the following 19 tests fail on 32-bit Linux (on Fedora: i686 and armv7hl):

```
         35 - eccodes_t_bufr_ecc-1290 (Failed)
         65 - eccodes_t_bufr_dump_encode_filter (Failed)
         67 - eccodes_t_bufrdc_ref (Failed)
         76 - eccodes_t_bufr_filter_unpack_pack (Failed)
        100 - eccodes_t_bufr_ecc-875 (Failed)
        118 - eccodes_t_gts_get (Failed)
        119 - eccodes_t_gts_ls (Failed)
        120 - eccodes_t_gts_count (Failed)
        121 - eccodes_t_gts_compare (Failed)
        122 - eccodes_t_metar_ls (Failed)
        123 - eccodes_t_metar_get (Failed)
        124 - eccodes_t_metar_dump (Failed)
        125 - eccodes_t_metar_compare (Failed)
        128 - eccodes_t_grib_sh_ieee64 (Failed)
        131 - eccodes_t_grib_lam_bf (Failed)
        153 - eccodes_t_grib_bitsPerValue (Failed)
        162 - eccodes_t_grib_copy (Failed)
        167 - eccodes_t_grib_second_order (Failed)
        193 - eccodes_t_grib_to_netcdf (Failed)
```

I've posted a proposed fix for the above issues at https://jira.ecmwf.int/browse/SUP-2408 .
A scratch build with my proposed fixes applied is at: https://koji.fedoraproject.org/koji/taskinfo?taskID=80359704

For the following comment:

(In reply to Jos de Kloe from comment #2)
> The gts and metar problems seem related.
> They all end in the error "no message found in <file>" even though the file
> used for testing clearly contains data.
> While trying to debug the eccodes_t_gts_get test case I could track the
> issue down to the file eccodes-2.7.0-Source/src/grib_io.c
> function read_any_gts, line 893:
> 
>    r->read(r->read_data,buffer+already_read,message_size-already_read,&err);
> 
> This one returns with error code -1, but I have no idea why.

This is because the following is executed before the above line:

    r->seek(r->read_data, already_read - message_size);

Here "already_read" and "message_size" are both size_t, and the second argument
of r->seek (actually stdio_seek) takes an off_t value. Note that "already_read"
is *smaller* than "message_size" here, so the mathematical result is negative.

On 64 bit, size_t is 64-bit unsigned and off_t is 64-bit signed. So
"already_read - message_size" is first computed as a "huge" 64-bit unsigned value,
and when it is passed as the second argument of "stdio_seek" it is converted
back to the negative signed value. This is what is intended (i.e. r->seek is
meant to seek backwards, toward the head of the file).

But on 32 bit, size_t is 32-bit unsigned while off_t is 64-bit signed, because
the compilation flags specify "-D_FILE_OFFSET_BITS=64". So
"already_read - message_size" is first computed as a "huge" 32-bit unsigned value
and then converted to a 64-bit signed value. In this case the "huge" 32-bit
unsigned value fits in the 64-bit signed type, so it is preserved as is instead
of becoming the intended negative value.
So r->seek (actually stdio_seek) receives an invalid huge positive offset, and the following
r->read fails.
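
The conversion described above can be reproduced in isolation. The following is only a minimal sketch, not code from eccodes: the helper names are hypothetical, and fixed-width types stand in for size_t and off_t so that both cases can be compared on any machine. The last helper shows one possible width-independent way to compute the offset; it is not necessarily the fix proposed upstream.

```c
#include <stdint.h>

/* Hypothetical helper modelling 32-bit Linux built with
 * -D_FILE_OFFSET_BITS=64: size_t is 32-bit unsigned, off_t is 64-bit signed. */
int64_t seek_offset_size32(uint32_t already_read, uint32_t message_size)
{
    /* The subtraction wraps modulo 2^32 in the unsigned type ... */
    uint32_t diff = already_read - message_size;
    /* ... and widening to 64-bit signed preserves that huge positive value. */
    return (int64_t)diff;
}

/* Hypothetical helper modelling 64-bit Linux: size_t is 64-bit unsigned,
 * off_t is 64-bit signed. */
int64_t seek_offset_size64(uint64_t already_read, uint64_t message_size)
{
    /* The subtraction wraps modulo 2^64 in the unsigned type. */
    uint64_t diff = already_read - message_size;
    /* Same width, so the conversion yields the negative value. (This is
     * implementation-defined before C23; GCC and Clang wrap.) */
    return (int64_t)diff;
}

/* One possible width-independent form (an assumption, not the upstream
 * patch): convert to the signed type first, then subtract. */
int64_t seek_offset_signed(uint64_t already_read, uint64_t message_size)
{
    return (int64_t)already_read - (int64_t)message_size;
}
```

With already_read = 4 and message_size = 100, seek_offset_size32 returns 4294967200, while seek_offset_size64 and seek_offset_signed both return -96.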

Comment 7 Mamoru TASAKA 2021-12-23 08:22:25 UTC
Well, actually 2.24.1, not 2.4.1...

Comment 8 Jos de Kloe 2022-10-04 09:26:31 UTC
Upstream has now decided that "No more work will be done from our side for 32-bit platforms."
See: https://jira.ecmwf.int/browse/SUP-2408

So I propose to close this issue.

