Bug 1419736 - 32-bit stat returns wrong st_mtime if file timestamp does not fit in 32 bits
Summary: 32-bit stat returns wrong st_mtime if file timestamp does not fit in 32 bits
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-02-06 22:41 UTC by Paul Eggert
Modified: 2020-01-17 22:31 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
Program illustrating stat bug when run in 32-bit mode (295 bytes, text/x-csrc)
2017-02-06 22:41 UTC, Paul Eggert

Description Paul Eggert 2017-02-06 22:41:04 UTC
Created attachment 1248203
Program illustrating stat bug when run in 32-bit mode

Description of problem:

stat, fstat, lstat, etc. all report incorrect file timestamps when operating in 32-bit mode and dealing with timestamps outside the 32-bit range

Version-Release number of selected component (if applicable):

Fedora 25

How reproducible:

Run a system call like 'stat', in a 32-bit executable, against a file whose timestamp is before 1901 or after 2038. The call will succeed and will report a bogus timestamp.


Steps to Reproduce:

Take the attached program mtime.c and run the following on an x86-64 host:

gcc -m32 mtime.c
touch -d '1900-01-01' in
ls -l in
./a.out in

Actual results:

-rw-r--r--. 1 eggert eggert 0 Jan  1  1900 in
in: mtime = 2086007296 = Thu Feb  7 06:28:16 2036

Expected results:

-rw-r--r--. 1 eggert eggert 0 Jan  1  1900 in
in: Value too large for defined data type
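
The attached mtime.c is not reproduced in this report. A minimal sketch of a program with the same observable behavior is given below, assuming it simply calls stat and prints st_mtime. The helper name report_mtime is hypothetical, not taken from the attachment.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: print PATH's modification time as seen through stat().
   Returns 0 on success, -1 (after reporting errno) if stat() fails. */
int report_mtime(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        /* With an overflow check in place, an out-of-range timestamp
           would be reported here as EOVERFLOW. */
        perror(path);
        return -1;
    }

    time_t t = st.st_mtime;
    const char *text = ctime(&t);   /* local time, ends with '\n' */
    printf("%s: mtime = %lld = %s", path, (long long)t,
           text ? text : "(unrepresentable)\n");
    return 0;
}
```

Calling report_mtime(argv[1]) from main and building with gcc -m32 should correspond to the steps above.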

Additional info:

This bug causes a gzip test case to fail on Ubuntu, and the same bug occurs on Fedora; please see:

https://bugs.gnu.org/25636#8

Comment 1 Carlos O'Donell 2017-02-09 02:18:04 UTC
Unfortunately the issue is not as simple as a black and white overflow check.

Firstly, the overflow check has to be applied to every 'struct timespec' used in all stat-like interfaces. That is quite a lot of new code in the hot path to catch only cases where the timestamp is <1901 or >2038.

Therefore glibc, and I believe the Linux kernel, are still debating whether to fix time overflow issues for 32-bit applications (see 'Debatable' in the wiki link below). Instead, the suggested solution will be to recompile the application using 64-bit time.

Please see "Y2038 Proofness Design" for a broad discussion of this problem and solution:
https://sourceware.org/glibc/wiki/Y2038ProofnessDesign

In this particular case there is nothing that glibc can do. We are calling the stat64 syscall, which is all the 32-bit process has available, and it returns already overflowed 32-bit struct timespec values. The 32-bit application doesn't have access to the 64-bit struct timespec from the struct kstat the 64-bit kernel uses internally. The stat64 API was designed to support larger file-related attributes, but not a larger 64-bit time.
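
The value in the report is consistent with plain truncation to 32 bits. As a worked check, assume the file's real mtime was -2208960000, that is, midnight on 1900-01-01 at UTC-8 (an assumption inferred from the reported output, not stated in the report). Keeping only the low 32 bits gives -2208960000 + 2^32 = 2086007296, the number the 32-bit program printed. The function name below is hypothetical.

```c
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit seconds count, then view them as a
   signed 32-bit time_t, as a 32-bit process would.
   Note: the uint32_t -> int32_t conversion is implementation-defined for
   values above INT32_MAX; GCC and Clang wrap modulo 2^32. */
static int32_t truncate_to_time32(int64_t sec)
{
    return (int32_t)(uint32_t)sec;
}
```

For example, truncate_to_time32(-2208960000LL) yields 2086007296, which falls in February 2036.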

In summary:
- As of today we cannot represent time outside of 1901<->2038 for 32-bit applications using non-Y2038-proof APIs that have non-Y2038-proof types.
- There is no forthcoming fix other than to use 64-bit applications.
- The future fix is to recompile your 32-bit application with 64-bit time support.
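
Whether a given build is affected depends on the width of its time_t. A minimal sketch of a runtime check follows, assuming time_t is a signed integer type (as it is on glibc). The helper name time_t_can_hold is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* True if SEC survives a round trip through this build's time_t.
   Assumes time_t is a signed integer type; with a 32-bit time_t the
   conversion drops the high bits (implementation-defined, wraps on GCC). */
static bool time_t_can_hold(int64_t sec)
{
    time_t t = (time_t)sec;
    return (int64_t)t == sec;
}
```

In a build with a 64-bit time_t this returns true for a timestamp in 1900. In a build with a 32-bit time_t it returns false for the same value.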

Does that answer your question Paul?

Comment 2 Paul Eggert 2017-02-09 04:55:00 UTC
(In reply to Carlos O'Donell from comment #1)
> In this particular case there is nothing that glibc can do.

Thanks for the explanation. This bug should not be filed under 'glibc', then. I'll try to change its component to 'kernel'.

> Does that answer your question Paul?

Yes, thanks, though perhaps this bug's priority needs tweaking. True, we have until January 2038 before calls like 'time ()' stop working, and it'll be unlikely for 32-bit apps to run across naturally-occurring out-of-range file timestamps before then. However, we may need to worry about malicious attacks on 32-bit applications based on their mishandling of out-of-range file timestamps.

Comment 3 Carlos O'Donell 2017-02-09 13:03:55 UTC
(In reply to Paul Eggert from comment #2)
> (In reply to Carlos O'Donell from comment #1)
> > In this particular case there is nothing that glibc can do.
> 
> Thanks for the explanation. This bug should not be filed under 'glibc',
> then. I'll try to change its component to 'kernel'.

While at the highest level of abstraction it is debatable if the 32-bit APIs should be fixed, this particular case is solvable.

On a case-by-case basis I think we can discuss EOVERFLOW checking and returns. In the particular case of stat64, which is implemented by x86-specific code, this certainly needs fixing in the kernel to return EOVERFLOW.

e.g.

arch/x86/ia32/sys_ia32.c

/*
 * Another set for IA32/LFS -- x86_64 struct stat is different due to
 * support for 64bit inode numbers.
 */
static int cp_stat64(struct stat64 __user *ubuf, struct kstat *stat)
{
        typeof(ubuf->st_uid) uid = 0;
        typeof(ubuf->st_gid) gid = 0;
        SET_UID(uid, from_kuid_munged(current_user_ns(), stat->uid));
        SET_GID(gid, from_kgid_munged(current_user_ns(), stat->gid));
        if (!access_ok(VERIFY_WRITE, ubuf, sizeof(struct stat64)) ||
            __put_user(huge_encode_dev(stat->dev), &ubuf->st_dev) ||
            __put_user(stat->ino, &ubuf->__st_ino) ||
            __put_user(stat->ino, &ubuf->st_ino) ||
            __put_user(stat->mode, &ubuf->st_mode) ||
            __put_user(stat->nlink, &ubuf->st_nlink) ||
            __put_user(uid, &ubuf->st_uid) ||
            __put_user(gid, &ubuf->st_gid) ||
            __put_user(huge_encode_dev(stat->rdev), &ubuf->st_rdev) ||
            __put_user(stat->size, &ubuf->st_size) ||

~~~
            __put_user(stat->atime.tv_sec, &ubuf->st_atime) ||
            __put_user(stat->atime.tv_nsec, &ubuf->st_atime_nsec) ||
            __put_user(stat->mtime.tv_sec, &ubuf->st_mtime) ||
            __put_user(stat->mtime.tv_nsec, &ubuf->st_mtime_nsec) ||
            __put_user(stat->ctime.tv_sec, &ubuf->st_ctime) ||
            __put_user(stat->ctime.tv_nsec, &ubuf->st_ctime_nsec) ||
~~~

All three of these struct timespecs would need independent conversion with overflow checks.

            __put_user(stat->blksize, &ubuf->st_blksize) ||
            __put_user(stat->blocks, &ubuf->st_blocks))
                return -EFAULT;
        return 0;
}
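
The kind of per-timestamp check described above can be sketched in user-space C. This is an illustration under stated assumptions, not kernel code and not a proposed patch. The types ts64 and times32 and the function names are hypothetical stand-ins for struct kstat's timestamps and the 32-bit stat64 fields.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins: a 64-bit timestamp as the kernel holds it, and the
   32-bit fields a 32-bit process receives. */
struct ts64 { int64_t tv_sec; long tv_nsec; };
struct times32 {
    int32_t  atime_sec, mtime_sec, ctime_sec;
    uint32_t atime_nsec, mtime_nsec, ctime_nsec;
};

/* Nonzero if SEC is representable as a signed 32-bit seconds count. */
static int sec_fits_32(int64_t sec)
{
    return sec >= INT32_MIN && sec <= INT32_MAX;
}

/* Copy the three timestamps, refusing with -EOVERFLOW instead of truncating
   if any of them is out of range. OUT is left untouched on failure. */
static int copy_times_checked(struct times32 *out, const struct ts64 *a,
                              const struct ts64 *m, const struct ts64 *c)
{
    if (!sec_fits_32(a->tv_sec) || !sec_fits_32(m->tv_sec) ||
        !sec_fits_32(c->tv_sec))
        return -EOVERFLOW;

    out->atime_sec = (int32_t)a->tv_sec;
    out->atime_nsec = (uint32_t)a->tv_nsec;
    out->mtime_sec = (int32_t)m->tv_sec;
    out->mtime_nsec = (uint32_t)m->tv_nsec;
    out->ctime_sec = (int32_t)c->tv_sec;
    out->ctime_nsec = (uint32_t)c->tv_nsec;
    return 0;
}
```

With a check of this shape, a file dated 1900 would make the call fail with EOVERFLOW, matching the "Value too large for defined data type" result the report expects.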

> > Does that answer your question Paul?
> 
> Yes, thanks, though perhaps this bug's priority needs tweaking. True, we
> have until January 2038 before calls like 'time ()' stop working, and it'll
> be unlikely for 32-bit apps to run across naturally-occurring out-of-range
> file timestamps before then. However, we may need to worry about malicious
> attacks on 32-bit applications based on their mishandling of out-of-range
> file timestamps.

We _do not_ have until 2038, in fact we probably have only a 5 year window remaining to fix this issue. Given the 10-year timelines for enterprise adoption and support, and a desire to see this kind of fix land earlier than that window, we probably need a good 15-year lead on the problem. Worse is that IoT deployments on cheap 32-bit hardware are going to need 64-bit time also.

Comment 4 Florian Weimer 2017-02-09 13:19:58 UTC
(In reply to Carlos O'Donell from comment #3)
> We _do not_ have until 2038, in fact we probably have only a 5 year window
> remaining to fix this issue. Given the 10-year timelines for enterprise
> adoption and support, and a desire to see this kind of fix land earlier than
> that window, we probably need a good 15-year lead on the problem. Worse is
> that IoT deployments on cheap 32-bit hardware are going to need 64-bit time
> also.

To be absolutely clear, nothing in this paragraph should be read as a commitment for any future enterprise distribution to continue to support running legacy 32-bit binaries, with a 32-bit time_t type or a 64-bit time_t type.

Comment 5 Carlos O'Donell 2017-02-09 15:14:48 UTC
(In reply to Florian Weimer from comment #4)
> (In reply to Carlos O'Donell from comment #3)
> > We _do not_ have until 2038, in fact we probably have only a 5 year window
> > remaining to fix this issue. Given the 10-year timelines for enterprise
> > adoption and support, and a desire to see this kind of fix land earlier than
> > that window, we probably need a good 15-year lead on the problem. Worse is
> > that IoT deployments on cheap 32-bit hardware are going to need 64-bit time
> > also.
> 
> To be absolutely clear, nothing in this paragraph should be read as a
> commitment for any future enterprise distribution to continue to support
> running legacy 32-bit binaries, with a 32-bit time_t type or a 64-bit time_t
> type.

Absolutely. My intent was to clarify that upstream has a shorter timeframe than 20 years, particularly if upstream wishes to support downstreams that decide to continue to support running 32-bit binaries.

Comment 6 Laura Abbott 2017-02-09 15:51:24 UTC
This is better tracked on rawhide than F25. There's been ongoing work in the kernel for Y2038. What kernel versions have you verified this bug on?

Comment 7 Carlos O'Donell 2017-02-09 16:19:58 UTC
(In reply to Laura Abbott from comment #6)
> This is better tracked on rawhide than F25. There's been ongoing work in the
> kernel for Y2038. What kernel versions have you verified this bug on?

I've verified on 4.8.15-300.fc25.x86_64+debug. But upstream 4.10.0-rc7 still has the same code in cp_stat64() which doesn't check for overflows.

Comment 8 Paul Eggert 2017-02-20 21:19:21 UTC
> What kernel versions have you verified this bug on?

I verified it just now on 4.9.9-200.fc25.x86_64, the current Fedora 25 kernel.

Comment 9 Paul Eggert 2017-11-05 00:26:33 UTC
(In reply to Carlos O'Donell from comment #7)
> I've verified on 4.8.15-300.fc25.x86_64+debug. But upstream 4.10.0-rc7 still
> has the same code in cp_stat64() which doesn't check for overflows.

If I'm reading the kernel right, the problem is also in cp_new_stat, cp_new_stat64, cp_statx, and cp_compat_stat. (Not that I'm much of a kernel hacker.)

Is there some way to boost the priority of this bug? As you say, it needs to be fixed reasonably soon if we're going to fix it at all. I'm writing now because a new gzip bug report about the issue came in here:

https://bugs.gnu.org/29033#8

While checking out the new bug report, I verified that the kernel bug is still in 4.13.10-200.fc26.x86_64, the current Fedora 26 kernel.

Comment 10 Florian Weimer 2017-11-05 10:03:01 UTC
(In reply to Paul Eggert from comment #9)
> Is there some way to boost the priority of this bug?

You need to report it upstream.  Probably send mail to Alexander Viro <viro.org.uk>, David Howells <dhowells> (who added statx) and Cc: linux-fsdevel.org.  Perhaps Cc: Eric Biggers <ebiggers> as well.

Comment 11 Paul Eggert 2017-11-05 21:41:56 UTC
(In reply to Florian Weimer from comment #10)

> You need to report it upstream.

Thanks, I sent it as you suggested, archived here:

https://marc.info/?l=linux-fsdevel&m=150991765312229

