Bug 2212765
| Summary: | 6.4.0-0.rc4.20230601git929ed21dfdb6.38.fc39.ppc64le: wait4(): ru_maxrss is rounded down to multiplies of 2 MB | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Petr Pisar <ppisar> | ||||
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
| Status: | NEW --- | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | rawhide | CC: | acaringi, adscvr, airlied, alciregi, bskeggs, dhorak, ellerman, hdegoede, hpa, jarodwilson, josef, kasal, kernel-maint, lgoncalv, linville, masami256, mchehab, opohorel, ppisar, ptalbert, scweaver, steved | ||||
| Target Milestone: | --- | Keywords: | Regression | ||||
| Target Release: | --- | ||||||
| Hardware: | ppc64le | ||||||
| OS: | Linux | ||||||
| URL: | https://koschei.fedoraproject.org/package/time | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | Type: | --- | |||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1071880, 2168842, 2231791 | ||||||
| Attachments: |
|
||||||
|
Description
Petr Pisar
2023-06-06 09:48:06 UTC
This looks like a bug in kernel. Compare ru_maxrss on x86_64:
$ strace -v -e wait4 -- time -f %M /usr/bin/true
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=522}, ru_stime={tv_sec=0, tv_usec=0}, ru_maxrss=1152, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=66, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=0}) = 12931
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12931, si_uid=500, si_status=0, si_utime=0, si_stime=0} ---
1152
+++ exited with 0 +++
While on ppc64le:
# strace -v -e wait4 -- ./time -f %M /usr/bin/true
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=596}, ru_stime={tv_sec=0, tv_usec=0}, ru_maxrss=0, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=47, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=1}) = 23827
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=23827, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
0
+++ exited with 0 +++
# uname -r
6.4.0-0.rc4.20230601git929ed21dfdb6.38.fc39.ppc64le
It looks like ru_maxrss value is rounded down to nearest 2048 KB:
[root@ibm-p9z-27-lp10 time-1.9]# strace -v -e wait4 -- ./time -f %M ./tests/time-aux -m 1M
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=745}, ru_stime={tv_sec=0, tv_usec=0}, ru_maxrss=0, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=70, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=0}) = 24895
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24895, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
0
+++ exited with 0 +++
[root@ibm-p9z-27-lp10 time-1.9]# strace -v -e wait4 -- ./time -f %M ./tests/time-aux -m 2M
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=0}, ru_stime={tv_sec=0, tv_usec=2891}, ru_maxrss=2048, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=86, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=1}) = 24900
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24900, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
2048
+++ exited with 0 +++
[root@ibm-p9z-27-lp10 time-1.9]# strace -v -e wait4 -- ./time -f %M ./tests/time-aux -m 3M
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=965}, ru_stime={tv_sec=0, tv_usec=0}, ru_maxrss=2048, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=102, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=1}) = 24907
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24907, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
2048
+++ exited with 0 +++
[root@ibm-p9z-27-lp10 time-1.9]# strace -v -e wait4 -- ./time -f %M ./tests/time-aux -m 4M
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=0}, ru_stime={tv_sec=0, tv_usec=1995}, ru_maxrss=4096, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=118, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=1}) = 24914
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24914, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
4096
+++ exited with 0 +++
[root@ibm-p9z-27-lp10 time-1.9]# strace -v -e wait4 -- ./time -f %M ./tests/time-aux -m 5M
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=1199}, ru_stime={tv_sec=0, tv_usec=0}, ru_maxrss=4096, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=135, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=1}) = 24919
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24919, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
4096
+++ exited with 0 +++
[root@ibm-p9z-27-lp10 time-1.9]# strace -v -e wait4 -- ./time -f %M ./tests/time-aux -m 6M
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=0}, ru_stime={tv_sec=0, tv_usec=3769}, ru_maxrss=6144, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=150, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=1, ru_nivcsw=1}) = 24926
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24926, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
6144
+++ exited with 0 +++
Hi Dan, Could you take a look at this for us (kernel-ark team) when you get a chance? Thank you. Scott If it's a kernel issue, then it's more likely a runtime, not buildtime, kernel version that changed. The time test-suite broke on 03/31 if I see right, so we need to check what kernel was used on the builders at that time. It could be a 6.1 -> 6.2 migration ... FWIW the time-max-rss.sh test is skipped on my F-36 with kernel-6.4.0-0.rc3.28.fc39.ppc64le from the time-1.9/tests/time-max-rss.log time-max-rss.sh: skipped test: getrusage(2) returns zero in ru_maxrss SKIP tests/time-max-rss.sh (exit status: 77) So it might be more complicated ... Created attachment 1972244 [details]
Reproducer
Instead of using time's package test suite, this short C program should demonstrate the problem. Though, I haven't run it on PowerPC yet.
[dan@talos tmp]$ gcc -O2 -Wall -o test test.c [dan@talos tmp]$ ./test Child mallocated 3145728 B. wait4() reported 0 B. Which is fewer than the allocated memory. This is on F-36 with kernel-6.4.0-0.rc3.28.fc39.ppc64le, I guess this means the reproducer works. with kernel-6.0.7-301.fc37.ppc64le (P8 KVM guest) I am getting [root@ibm-p8-storage-02-guest-10 ~]# ./test Child mallocated 3145728 B. wait4() reported 2293760 B. Which is fewer than the allocated memory. If I build it without optimizations, it prints nonzero, yet too small number: # gcc -O0 -Wall -o test test.c # ./test Child mallocated 3145728 B. wait4() reported 2097152 B. Which is fewer than the allocated memory. # uname -r 6.4.0-0.rc7.20230621gite660abd551f1.55.fc39.ppc64le While with -O2 it reports 0 on the same kernel. Do I have a bug in my reproducer? Or is the compiler too smart and optimizes out a for loop which forces mapping pages into resident set size? Or is there a bug in glibc wrapper for the syscall? wait4() is known to always return 0 on some platforms where process accounting is not fully implemented. That's why the test skips in case of 0. I don't know what whas the situation with Linux on PowerPC. and with kernel-6.3.8-100.fc37.ppc64le on the same guest Child mallocated 3145728 B. wait4() reported 0 B. Which is fewer than the allocated memory. Hi Michael, seems we have some problems with the maxrss value returned by wait4() on recent ppc64le kernels. Or perhaps on the glibc side of the syscall ... Could you or someone else on the IBM side take a look? Thanks, Dan Thanks Michael for the link - https://lore.kernel.org/all/20230616180718.17725-1-mkoutny@suse.com/ should be the fix. If I see right, then 6.2 is the first kernel where things broke (started to behave differently) compared to previous kernels. (In reply to Dan Horák from comment #11) > Thanks Michael for the link - > https://lore.kernel.org/all/20230616180718.17725-1-mkoutny@suse.com/ should > be the fix. If I see right, then 6.2 is the first kernel where things broke > (started to behave differently) compared to previous kernels. Hi Dan, Correct me if I'm missing something but it doesn't look like this patch was accepted. Was there a v2 somewhere that I'm not finding? Hi Scott, (In reply to Scott Weaver from comment #12) > (In reply to Dan Horák from comment #11) > > Thanks Michael for the link - > > https://lore.kernel.org/all/20230616180718.17725-1-mkoutny@suse.com/ should > > be the fix. If I see right, then 6.2 is the first kernel where things broke > > (started to behave differently) compared to previous kernels. > > Hi Dan, > > Correct me if I'm missing something but it doesn't look like this patch was > accepted. Was there a v2 somewhere that I'm not finding? I suspect you are right that there hasn't been any new version posted yet ... |