Bug 1733743

Summary: [armv7] Unix.LargeFile.ftruncate 2^32 is miscompiled
Product: [Fedora] Fedora
Reporter: Richard W.M. Jones <rjones>
Component: ocaml
Assignee: Richard W.M. Jones <rjones>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: alciregi, c.david86, esandeen, gemi, josef, kasal, kzak, lczerner, oliver, rjones
Target Milestone: ---   
Target Release: ---   
Hardware: armv7hl   
OS: Unspecified   
Last Closed: 2019-07-31 20:10:32 UTC
Type: Bug
Bug Blocks: 245418, 910269    

Description Richard W.M. Jones 2019-07-28 09:06:37 UTC
Description of problem:

$ rm -f root
$ truncate -s 4G root
$ /usr/sbin/mke2fs -t ext2 -Fq root
mke2fs: Device size reported to be zero.  Invalid partition specified, or
	partition table wasn't reread after running fdisk, due to
	a modified partition being busy and in use.  You may need to reboot
	to re-read your partition table.

This only happens on armv7, not on other architectures.

Version-Release number of selected component (if applicable):

e2fsprogs 1.45.2-1.fc31

How reproducible:

Happened twice.

Steps to Reproduce:
1. See above.

Comment 1 Alessio 2019-07-28 12:54:27 UTC
(In reply to Richard W.M. Jones from comment #0)
> Description of problem:
> 
> $ rm -f root
> $ truncate -s 4G root
> $ /usr/sbin/mke2fs -t ext2 -Fq root
> mke2fs: Device size reported to be zero.  Invalid partition specified, or
> 	partition table wasn't reread after running fdisk, due to
> 	a modified partition being busy and in use.  You may need to reboot
> 	to re-read your partition table.

I read your message on the devel mailing list.
Running the above commands on a Raspberry Pi 3 gives these results:

$ uname -a
Linux rpi3 5.3.0-0.rc1.git3.1.fc31.armv7hl #1 SMP Thu Jul 25 15:54:45 UTC 2019 armv7l armv7l armv7l GNU/Linux
$ cat /etc/redhat-release 
Fedora release 31 (Rawhide)
$ rpm -qv e2fsprogs
e2fsprogs-1.45.2-1.fc31.armv7hl

$ rm -f root
$ truncate -s 4G root
$ /usr/sbin/mke2fs -t ext2 -Fq root
warning: Unable to get device geometry for root

Comment 2 Richard W.M. Jones 2019-07-28 13:28:23 UTC
Interesting, thanks.  I wonder why this fails in Koji :-?  I guess I need to do a bit more testing ...

Comment 3 Alessio 2019-07-28 15:26:32 UTC
FWIW

$ rm -f root 
$ touch root
$ /usr/sbin/mke2fs -t ext4 -Fq root
mke2fs: Device size reported to be zero.  Invalid partition specified, or
	partition table wasn't reread after running fdisk, due to
	a modified partition being busy and in use.  You may need to reboot
	to re-read your partition table.

Comment 4 Richard W.M. Jones 2019-07-28 16:05:58 UTC
I modified a Koji build to run supermin and also print the kernel and e2fsprogs versions:

Linux buildvm-armv7-05.arm.fedoraproject.org 5.2.2-200.fc30.armv7hl+lpae #1 SMP Sun Jul 21 15:36:24 UTC 2019 armv7l armv7l armv7l GNU/Linux

e2fsprogs-1.45.2-1.fc31.armv7hl

supermin: ext2: creating empty ext2 filesystem 'd2.4r818cjh/root'
RPM build errors:
BUILDSTDERR: mke2fs 1.45.2 (27-May-2019)
BUILDSTDERR: mke2fs: Device size reported to be zero.  Invalid partition specified, or
BUILDSTDERR: 	partition table wasn't reread after running fdisk, due to
BUILDSTDERR: 	a modified partition being busy and in use.  You may need to reboot
BUILDSTDERR: 	to re-read your partition table.
BUILDSTDERR: supermin: /usr/sbin/mke2fs -t ext2 -F 'd2.4r818cjh/root': command failed, see earlier errors
BUILDSTDERR: error: Bad exit status from /var/tmp/rpm-tmp.hycBi4 (%check)
BUILDSTDERR:     Bad exit status from /var/tmp/rpm-tmp.hycBi4 (%check)

All supermin does here is essentially run the commands outlined above:

https://github.com/libguestfs/supermin/blob/c97b3917068597a0e68e88d9a905da766ade40da/src/format_ext2.ml#L40-L55

I wonder if there's some problem with the page cache: for example, mke2fs might be
using O_DIRECT while the previous write to the filesystem hasn't been committed yet.

Comment 5 Richard W.M. Jones 2019-07-28 16:06:40 UTC
(In reply to Alessio from comment #3)
> FWIW
> 
> $ rm -f root 
> $ touch root
> $ /usr/sbin/mke2fs -t ext4 -Fq root
> mke2fs: Device size reported to be zero.  Invalid partition specified, or
> 	partition table wasn't reread after running fdisk, due to
> 	a modified partition being busy and in use.  You may need to reboot
> 	to re-read your partition table.

This is expected because "touch" creates a zero-length file.  See the code
linked in the previous comment for what we actually do to create a 4G file.
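
To make the difference concrete, here is a minimal sketch comparing the size of a file after "touch" and after "truncate -s".  The helper name file_size is made up for this illustration, the sketch assumes GNU coreutils (stat -c), and it uses 1M instead of 4G only to keep the demo small.

```shell
#!/bin/sh
# Print the apparent size of a file in bytes (GNU coreutils stat).
# file_size is a hypothetical helper, not part of supermin or e2fsprogs.
file_size() {
    stat -c %s "$1"
}

tmp=$(mktemp)            # mktemp creates an empty file
touch "$tmp"             # touch does not change the length
echo "after touch:    $(file_size "$tmp") bytes"

truncate -s 1M "$tmp"    # extends the file to the requested size (sparse)
echo "after truncate: $(file_size "$tmp") bytes"
rm -f "$tmp"
```

Running the same check on the file just before mke2fs is called would show whether the file really has the intended size at that point.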

Comment 6 Richard W.M. Jones 2019-07-28 16:38:11 UTC
I tried to fsync the directory before calling mke2fs, but it made no difference.

Comment 7 Richard W.M. Jones 2019-07-29 08:27:53 UTC
Oh lovely, this is actually a compiler bug.

$ cat test.ml
let size = Int64.shift_left 1L 32 in
Unix.LargeFile.ftruncate Unix.stdout size
$ ocamlopt unix.cmxa test.ml -o test
$ strace ./test > /tmp/file
...
ftruncate64(1, 0)                       = 0
               ^
               this should be 2^32
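
For illustration only: the zero seen in strace is what you would get if the 64-bit length lost its upper 32 bits somewhere on the way to the system call.  That is an assumption about the symptom, not a diagnosis; this report does not say where the value is lost.  The helper name low32 is made up for this sketch.

```shell
#!/bin/sh
# Keep only the low 32 bits of a 64-bit value.
# low32 is a hypothetical helper used only to show the arithmetic.
low32() {
    echo $(( $1 & 0xFFFFFFFF ))
}

size=$(( 1 << 32 ))      # 2^32, the length requested in test.ml
echo "size:        $size"
echo "low 32 bits: $(low32 "$size")"
```

2^32 is the smallest length whose low 32 bits are all zero, which would explain why smaller sizes do not show the problem under this assumption.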

Comment 8 Richard W.M. Jones 2019-07-29 20:45:24 UTC
Proposed patch:
https://github.com/ocaml/ocaml/pull/8843

This will be included in OCaml 4.08.1, which we'll upgrade to when it is released.

Comment 9 Richard W.M. Jones 2019-07-31 12:07:45 UTC
Upstream fix:
https://github.com/ocaml/ocaml/commit/5e4b55d3bd3fdf3e7512c132ad36d103d7131e72