
Bug 1952888

Summary: ps command only outputting ~130 procs/min
Product: Red Hat Enterprise Linux 8
Reporter: John Siddle <jsiddle>
Component: crash
Assignee: ltao
Status: CLOSED ERRATA
QA Contact: xiaoying yan <yiyan>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.3
CC: aubaker, cye, lijiang, ltao, ruyang, stalexan, yiyan
Target Milestone: beta
Keywords: Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: crash-7.3.1-5.el8
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-05-10 13:55:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description John Siddle 2021-04-23 12:52:29 UTC
Description of problem:

For galvatron task ID 886910425, the 'ps' command was running very slowly. With this core's ~65k processes it would take ~7 hours for the command to finish.


Version-Release number of selected component (if applicable):

The issue exists both with the version of crash installed on galvatron (7.2.7-3.el8) and with upstream version 7.2.9++ (commit dccdf95c).


How reproducible:

Always


Steps to Reproduce:
1. retrace-server-interact 886910425 crash 
2. ps
3.

Actual results:

Running 'ps > ps.out' in crash:

 crash 7.2.7-3.el8:

     [jsiddle@galvatron crash]$ wc -l ps.out ; sleep 60 ; wc -l ps.out
     273 ps.out
     411 ps.out

     138/min

 crash 7.2.9++:

     [jsiddle@galvatron crash]$ wc -l ps.out ; sleep 60 ; wc -l ps.out
     209 ps.out
     341 ps.out

     132/min


Expected results:


Additional info:

Comment 1 ltao 2021-06-10 03:37:35 UTC
(In reply to John Siddle from comment #0)
> Description of problem:
> 
> For galvatron task ID 886910425, the 'ps' command was running very slowly.
> With this core's ~65k processes it would take ~7 hours for the command to finish.
> 

Hello John,

How can I access the vmcore that has the ~65k processes and causes the performance degradation?
I'd like to debug with it.

> Steps to Reproduce:
> 1. retrace-server-interact 886910425 crash 

Which server should I log in to in order to invoke the command?

Thanks!
Tao Liu


Comment 2 John Siddle 2021-06-10 12:11:57 UTC
Hi Tao, thanks for looking into this. You can access the core with:

  # ssh <kerberos>@galvatron.cee.redhat.com
  # crash /cores/exceptions/siddle/vmcore_bz1952888 /cores/retrace/repos/kernel/x86_64/usr/lib/debug/lib/modules/3.10.0-1160.el7.x86_64/vmlinux

Comment 3 ltao 2021-07-28 02:05:16 UTC
Hello John,

The vmcore in /cores/exceptions/siddle/vmcore_bz1952888 is no longer accessible. I was able to access it previously, but now I cannot. Perhaps the galvatron.cee.redhat.com server has a time limit on how long vmcores are stored? Anyway, is it possible for me to access the vmcore again? Thanks!

Tao Liu

Comment 4 John Siddle 2021-07-28 12:21:52 UTC
Hi Tao,

Sorry about that. I've re-queued the core so you should be able to access it on galvatron with:

    # retrace-server-interact 339468762 crash

Comment 5 ltao 2021-07-29 08:23:35 UTC
Hi John,

Thanks! Now I can access the vmcore.

Thanks,
Tao Liu

Comment 6 ltao 2021-08-03 10:06:13 UTC
Hello John,

The reason for "ps output slowly" is because the processes have so many threads(nearly 20000).
When "ps" command is invoked, crash will travel through all threads of each process(from 'first' to 'last'), and accumulate
each thread's rss to get the final rss of the process:

    while (first <= last)
    {
        /* read the first per-thread rss counter and add it in */
        if (!readmem(first->task +
           ....
        rss += sync_rss;

        /* read the second per-thread rss counter and add it in */
        if (!readmem(first->task +
           ....
        rss += sync_rss;

        /* stop once the last thread has been processed */
        if (first == last)
            break;
        first++;
    }
 
I modified the code a little to print the number of threads for each process. As shown below, each process holds nearly 20000 threads,
which means the above while loop executes ~20000 times and readmem ~40000 times per process. Merging the two readmem calls into one does not improve performance much, and readmem results cannot be cached because the address being read differs on every iteration.

----21870 threads----
    300  10765   4  ffff8816603d2100  IN   4.6 50726924 12236960  java
----21941 threads----
    301  11080  10  ffff882936c4c200  IN   5.0 51334600 13329952  java
----21870 threads----
    302  10765   4  ffff881615fce300  IN   4.6 50726924 12236960  java
----19918 threads----
    303  10688  13  ffff8826c1dd6300  IN   7.7 64202900 20482612  rocksdb:high3
----19918 threads----
    304  10688  29  ffff8826c1dd4200  IN   7.7 64202900 20482612  rocksdb:high4
----19918 threads----
    305  10688  12  ffff8816867f2100  IN   7.7 64202900 20482612  rocksdb:high5
----19918 threads----
    306  10688  13  ffff8816ef3a9080  IN   7.7 64202900 20482612  rocksdb:high6
----19918 threads----
    307  10688  18  ffff8817413e0000  IN   7.7 64202900 20482612  rocksdb:high7
----21941 threads----
    308  11080   3  ffff8828a2e72100  IN   5.0 51334600 13329952  java
----19918 threads----
----21941 threads----
    309  10688  26  ffff8817413e4200  IN   7.7 64202900 20482612  java
----21870 threads----
    310  11080  29  ffff882818394200  IN   5.0 51334600 13329952  java
....
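
Putting these numbers together shows why the full run takes hours. A self-contained back-of-the-envelope model in C (the figures are the ones reported in this bug; treating every process as having ~20000 threads is an approximation):

    /* cost_model.c -- rough cost model for the slow 'ps' run (sketch only) */
    #include <stdio.h>

    int main(void)
    {
        long threads_per_proc = 20000;  /* observed above for the java/rocksdb processes */
        long procs = 65000;             /* ~65k processes in this core */
        double procs_per_min = 130.0;   /* observed ps throughput */

        /* two readmem() calls per loop iteration, one iteration per thread */
        printf("readmem calls per process: %ld\n", 2 * threads_per_proc);
        printf("estimated runtime: %.1f hours\n", procs / procs_per_min / 60.0);
        return 0;
    }

This prints 40000 readmem calls per process and an estimated runtime of ~8.3 hours, on the order of the ~7 hours reported in the description.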

For comparison, here is another vmcore where the ps command exits quickly:

----1 threads----
    832      1   0  ffff91f8b713dc40  IN   0.1   81592   5536  dbus-daemon
----0 threads----
    834      1   0  ffff91f8b9ee8000  IN   0.1   29432   3108  chronyd
----2 threads----
    836      1   0  ffff91f8b6adbd80  IN   0.3  390660  17212  NetworkManager
----8 threads----
    837      1   0  ffff91f8b4f03d80  IN   0.5 1770768  25032  polkitd
----1 threads----
    838      1   0  ffff91f8b6addc40  IN   0.1  160152   6968  rngd
----5 threads----
    839      1   0  ffff91f8b68c5c40  IN   0.1  101724   3996  gssproxy
----5 threads----
    840      1   0  ffff91f8bad48000  IN   0.1  101724   3996  gssproxy
----5 threads----
    841      1   0  ffff91f8bad49ec0  IN   0.1  101724   3996  gssproxy
----5 threads----
    842      1   0  ffff91f8bad4bd80  IN   0.1  101724   3996  gssproxy

From my point of view, it is difficult to optimize the performance when there are so many threads to iterate over in a vmcore, so some performance degradation is expected in this case.

Any thoughts?

Thanks,
Tao Liu

Comment 7 ltao 2021-08-15 14:22:04 UTC
I did some analysis; the performance bottleneck is task.c:show_ps_data:get_task_mem_usage(tc->task, tm).
I modified the code a little to show where the time is spent; the result is as follows:

    300  10765   4  ffff8816603d2100  IN   4.6 50726924 12236960  --IO:71.488000ms UC:665.982000ms TO:820.022000ms--java
    301  11080  10  ffff882936c4c200  IN   5.0 51334600 13329952  --IO:72.305000ms UC:666.907000ms TO:818.050000ms--java
    302  10765   4  ffff881615fce300  IN   4.6 50726924 12236960  --IO:71.338000ms UC:666.138000ms TO:819.704000ms--java
    303  10688  13  ffff8826c1dd6300  IN   7.7 64202900 20482612  --IO:65.100000ms UC:605.082000ms TO:741.355000ms--rocksdb:high3
    304  10688  29  ffff8826c1dd4200  IN   7.7 64202900 20482612  --IO:65.154000ms UC:605.144000ms TO:741.154000ms--rocksdb:high4
    305  10688  12  ffff8816867f2100  IN   7.7 64202900 20482612  --IO:65.081000ms UC:604.962000ms TO:740.791000ms--rocksdb:high5
    306  10688  13  ffff8816ef3a9080  IN   7.7 64202900 20482612  --IO:64.864000ms UC:604.935000ms TO:740.851000ms--rocksdb:high6
    307  10688  18  ffff8817413e0000  IN   7.7 64202900 20482612  --IO:64.972000ms UC:604.859000ms TO:740.734000ms--rocksdb:high7
    308  11080   3  ffff8828a2e72100  IN   5.0 51334600 13329952  --IO:71.905000ms UC:666.892000ms TO:817.826000ms--java
    309  10688  26  ffff8817413e4200  IN   7.7 64202900 20482612  --IO:65.026000ms UC:604.982000ms TO:741.286000ms--java

where TO is the total time spent in get_task_mem_usage, IO is the time spent on I/O operations within get_task_mem_usage, and UC is the time spent decompressing dump pages within get_task_mem_usage. The provided vmcore is compressed with zlib.

We can see from the figures that zlib decompression takes the largest share of the time (~80%), so the real performance bottleneck is the zlib decompression.
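
For reference, per-call timings like the IO/UC/TO numbers above can be collected with gettimeofday() deltas around the calls of interest; a minimal sketch of that kind of instrumentation (hypothetical helper names, not the actual debug diff):

    #include <sys/time.h>

    /* Wall-clock milliseconds between two gettimeofday() samples. */
    static double delta_ms(struct timeval *start, struct timeval *end)
    {
        return (end->tv_sec - start->tv_sec) * 1000.0 +
               (end->tv_usec - start->tv_usec) / 1000.0;
    }

    /* Usage pattern inside get_task_mem_usage (sketch):
     *
     *     struct timeval t0, t1;
     *     gettimeofday(&t0, NULL);
     *     ... the readmem()/decompression call being measured ...
     *     gettimeofday(&t1, NULL);
     *     io_ms += delta_ms(&t0, &t1);   // accumulate into the IO/UC/TO buckets
     */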

There is currently an effort to introduce zstd into makedumpfile and crash-utility; zstd has better decompression performance. So let's hold this bz for a while: once bz1988894 and bz1988895 have introduced zstd, we can see how the performance goes.

Comment 8 ltao 2021-08-22 10:38:47 UTC
Although the zstd patch is not ready yet, I made a preview zstd build of crash and makedumpfile; the results are as follows:

zstd:
    300  10765   4  ffff8816603d2100  IN   4.6 50726924 12236960  --IO:65.617000ms UC:134.430000ms TO:335.789000ms--java
    301  11080  10  ffff882936c4c200  IN   5.0 51334600 13329952  --IO:65.305000ms UC:135.441000ms TO:329.671000ms--java
    302  10765   4  ffff881615fce300  IN   4.6 50726924 12236960  --IO:65.902000ms UC:136.298000ms TO:338.498000ms--java
    303  10688  13  ffff8826c1dd6300  IN   7.7 64202900 20482612  --IO:58.565000ms UC:119.940000ms TO:293.834000ms--rocksdb:high3
    304  10688  29  ffff8826c1dd4200  IN   7.7 64202900 20482612  --IO:60.134000ms UC:123.286000ms TO:301.443000ms--rocksdb:high4
    305  10688  12  ffff8816867f2100  IN   7.7 64202900 20482612  --IO:57.937000ms UC:120.782000ms TO:293.018000ms--rocksdb:high5
    306  10688  13  ffff8816ef3a9080  IN   7.7 64202900 20482612  --IO:59.598000ms UC:122.807000ms TO:298.603000ms--rocksdb:high6
    307  10688  18  ffff8817413e0000  IN   7.7 64202900 20482612  --IO:60.480000ms UC:125.481000ms TO:303.518000ms--rocksdb:high7
    308  11080   3  ffff8828a2e72100  IN   5.0 51334600 13329952  --IO:73.477000ms UC:156.957000ms TO:379.909000ms--java
    309  10688  26  ffff8817413e4200  IN   7.7 64202900 20482612  --IO:59.791000ms UC:121.758000ms TO:298.450000ms--java
    310  11080  29  ffff882818394200  IN   5.0 51334600 13329952  --IO:66.661000ms UC:137.595000ms TO:337.225000ms--java

zlib:
    300  10765   4  ffff8816603d2100  IN   4.6 50726924 12236960  --IO:69.370000ms UC:368.684000ms TO:582.293000ms--java
    301  11080  10  ffff882936c4c200  IN   5.0 51334600 13329952  --IO:66.690000ms UC:354.471000ms TO:551.126000ms--java
    302  10765   4  ffff881615fce300  IN   4.6 50726924 12236960  --IO:68.033000ms UC:358.157000ms TO:565.365000ms--java
    303  10688  13  ffff8826c1dd6300  IN   7.7 64202900 20482612  --IO:60.599000ms UC:322.041000ms TO:500.626000ms--rocksdb:high3
    304  10688  29  ffff8826c1dd4200  IN   7.7 64202900 20482612  --IO:61.274000ms UC:323.296000ms TO:501.506000ms--rocksdb:high4
    305  10688  12  ffff8816867f2100  IN   7.7 64202900 20482612  --IO:61.183000ms UC:320.451000ms TO:497.747000ms--rocksdb:high5
    306  10688  13  ffff8816ef3a9080  IN   7.7 64202900 20482612  --IO:60.073000ms UC:321.479000ms TO:496.462000ms--rocksdb:high6
    307  10688  18  ffff8817413e0000  IN   7.7 64202900 20482612  --IO:60.322000ms UC:322.834000ms TO:498.485000ms--rocksdb:high7
    308  11080   3  ffff8828a2e72100  IN   5.0 51334600 13329952  --IO:69.348000ms UC:357.250000ms TO:557.994000ms--java
    309  10688  26  ffff8817413e4200  IN   7.7 64202900 20482612  --IO:63.237000ms UC:326.888000ms TO:510.211000ms--java
    310  11080  29  ffff882818394200  IN   5.0 51334600 13329952  --IO:68.812000ms UC:358.102000ms TO:558.209000ms--java

zstd does have better decompression performance.
As for the difference in UC times between comment 7 and comment 8, I guess it is related to different system load. I will do more testing on the
time measurements, but the figures above already give us a glimpse of zstd's better performance...
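
The same comparison can also be reproduced outside of crash with a standalone micro-benchmark (a sketch, not part of this bug's builds; assumes the zlib and libzstd development headers are installed, build with "gcc bench.c -lz -lzstd"):

    /* bench.c -- decompress the same page-sized buffer many times with
     * zlib and zstd and compare wall-clock time. */
    #include <stdio.h>
    #include <sys/time.h>
    #include <zlib.h>
    #include <zstd.h>

    static double delta_ms(struct timeval *a, struct timeval *b)
    {
        return (b->tv_sec - a->tv_sec) * 1000.0 +
               (b->tv_usec - a->tv_usec) / 1000.0;
    }

    int main(void)
    {
        enum { SRC_LEN = 4096, ITERS = 100000 };  /* one dump page, many rounds */
        static unsigned char src[SRC_LEN], zbuf[2 * SRC_LEN], sbuf[2 * SRC_LEN];
        static unsigned char out[SRC_LEN];
        struct timeval t0, t1;
        int i;

        for (i = 0; i < SRC_LEN; i++)             /* mildly compressible input */
            src[i] = (unsigned char)(i / 64);

        uLongf zlen = sizeof(zbuf);
        compress(zbuf, &zlen, src, SRC_LEN);
        size_t slen = ZSTD_compress(sbuf, sizeof(sbuf), src, SRC_LEN, 3);

        gettimeofday(&t0, NULL);
        for (i = 0; i < ITERS; i++) {
            uLongf outlen = SRC_LEN;
            uncompress(out, &outlen, zbuf, zlen);
        }
        gettimeofday(&t1, NULL);
        printf("zlib: %.1f ms\n", delta_ms(&t0, &t1));

        gettimeofday(&t0, NULL);
        for (i = 0; i < ITERS; i++)
            ZSTD_decompress(out, SRC_LEN, sbuf, slen);
        gettimeofday(&t1, NULL);
        printf("zstd: %.1f ms\n", delta_ms(&t0, &t1));
        return 0;
    }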

Comment 28 lijiang 2022-01-30 05:32:35 UTC
Upstream patch: d52cccfaa96e ("Improve the ps performance for vmcores with large number of threads")
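
As the subject line suggests, the point of the fix is to stop paying the full thread-list walk once per thread: all threads of a thread group share the same mm and therefore the same accumulated rss (visible in the outputs above, where e.g. the rocksdb:high* tasks all report identical VSZ/RSS), so the sum can be computed once per group and reused. A minimal sketch of that idea with hypothetical names, not the actual patch:

    /* Sketch only: remember the accumulated rss of the most recently
     * computed thread group, so tasks of the same group (which 'ps'
     * tends to print together) reuse it instead of re-walking ~20000
     * threads.  walk_all_threads_and_sum_rss() stands in for the
     * while loop quoted in comment 6. */
    extern unsigned long walk_all_threads_and_sum_rss(unsigned long tgid);

    static struct {
        unsigned long tgid;   /* thread group id of the cached entry */
        unsigned long rss;    /* accumulated rss for that whole group */
        int valid;
    } rss_cache;

    unsigned long group_rss(unsigned long tgid)
    {
        if (rss_cache.valid && rss_cache.tgid == tgid)
            return rss_cache.rss;     /* hit: skip the O(threads) walk */
        rss_cache.tgid  = tgid;
        rss_cache.rss   = walk_all_threads_and_sum_rss(tgid);
        rss_cache.valid = 1;
        return rss_cache.rss;
    }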

Comment 36 errata-xmlrpc 2022-05-10 13:55:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (crash bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1864