Bug 1566983
| Field | Value |
|---|---|
| Summary | Polkitd: The utils_spawn_data_free reap timeout subprocess did not work resulting in a large number of zombie processes |
| Product | [Fedora] Fedora |
| Component | polkit |
| Version | 28 |
| Status | CLOSED EOL |
| Severity | medium |
| Priority | unspecified |
| Hardware | All |
| OS | Linux |
| Type | Bug |
| Reporter | Li Ning <lining916740672> |
| Assignee | Jan Rybar <jrybar> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | lining916740672 |
| Last Closed | 2019-05-28 19:23:55 UTC |
I made a patch to fix this issue. I have done some testing, and it fixes the problem.
Created attachment 1433062 [details]
0001-polkitd-make-sure-child-process-exits-will-be-proces.patch
I made a better and simpler patch to make sure child process exits are processed.
This patch creates 3 timeout sources.
The 1st one sends SIGTERM at 10s,
the 2nd one sends SIGKILL at 15s,
and the last one quits the main loop.
Once the child process has exited and the child watch source has been processed, the main loop quits. Otherwise the main loop quits at 20s.
Timer1: 10s, send SIGTERM
Timer2: 15s, send SIGKILL
Timer3: 20s, quit the main loop
0 ~ 10s: child exits normally
10 ~ 15s: child exits by SIGTERM
15 ~ 20s: child exits by SIGKILL
20s ~ : child seems to be abnormal; we quit the main loop.
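The escalation schedule above can be sketched in plain POSIX C, outside of GLib. This is only an illustration and not the attached patch: `reap_with_escalation`, the `reap_result` enum and the helper `spawn_sleeper` are hypothetical names, the three timeout sources are replaced by a simple polling loop, and the thresholds are passed in milliseconds so the sketch can be tried with short timeouts instead of 10s/15s/20s.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch; these are not polkit identifiers.
 * They name the phase in which the child was reaped. */
enum reap_result { REAP_EXITED, REAP_SIGTERM, REAP_SIGKILL, REAP_GAVE_UP };

static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000L +
           (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/* Poll for the child's exit; send SIGTERM at term_ms, SIGKILL at kill_ms,
 * and stop waiting at quit_ms (the child is left unreaped in that case). */
enum reap_result reap_with_escalation(pid_t pid, long term_ms, long kill_ms,
                                      long quit_ms)
{
    struct timespec start;
    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int sent_term = 0, sent_kill = 0;

    assert(term_ms <= kill_ms && kill_ms <= quit_ms);
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {                       /* child reaped: no zombie left */
            if (sent_kill) return REAP_SIGKILL;
            if (sent_term) return REAP_SIGTERM;
            return REAP_EXITED;
        }
        if (r < 0 && errno != EINTR)
            return REAP_GAVE_UP;              /* e.g. ECHILD: nothing to wait for */

        long t = elapsed_ms(&start);
        if (t >= quit_ms)
            return REAP_GAVE_UP;
        if (t >= kill_ms && !sent_kill) {
            kill(pid, SIGKILL);
            sent_kill = 1;
        } else if (t >= term_ms && !sent_term) {
            kill(pid, SIGTERM);
            sent_term = 1;
        }
        nanosleep(&tick, NULL);
    }
}

/* Test helper (hypothetical): fork a child that sleeps, optionally ignoring SIGTERM. */
pid_t spawn_sleeper(unsigned seconds, int ignore_sigterm)
{
    pid_t pid = fork();
    if (pid == 0) {
        if (ignore_sigterm)
            signal(SIGTERM, SIG_IGN);
        sleep(seconds);
        _exit(0);
    }
    return pid;
}
```

For example, `reap_with_escalation(spawn_sleeper(5, 1), 100, 300, 800)` exercises the SIGKILL phase, because the child ignores SIGTERM. The point of the design is that the parent keeps waiting for the child after signalling it, so the exit is always collected unless the final deadline passes.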
Created attachment 1434285 [details]
polkitd-fix-zombie-not-reaped-when-js-spawned-proces.patch
This patch seems to be much better and simpler.
This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
Created attachment 1421280 [details]
0001-add-child-reaper-thread-to-fix-zombies

Description of problem:
When a subprocess spawned from a rule times out, the subprocess becomes a zombie.
Reaping the timed-out subprocess in utils_spawn_data_free does not work, which results in a large number of zombie processes.
utils_spawn_data_free sends SIGTERM to the timed-out subprocess and sets a child watch source to reap the child, but the child watch source cannot work because its main loop and context are released by the caller.
I paste the key code here.

    static void js_polkit_spawn()
    {
      ...
    out:
      g_strfreev (argv);
      g_free (standard_output);
      g_free (standard_error);
      g_clear_object (&data.res);           // triggers utils_spawn_data_free, which sets the child watch source
      if (loop != NULL)
        g_main_loop_unref (loop);           // destroy loop
      if (context != NULL)
        g_main_context_unref (context);     // destroy context
      return ret;
    }

Once the loop and context are destroyed, the child watch source is never dispatched. The subprocess exits and becomes a zombie.

Version-Release number of selected component (if applicable):
polkitd, all versions

How reproducible:
100%

Steps to Reproduce:
1. Add a debug rule. This rule spawns a process that runs for more than 10s, which results in a timeout.

    [root@localhost ~]# cat /etc/polkit-1/rules.d/01-test.rules
    polkit.addRule(function(action, subject) {
        polkit.log("debug start")
        try {
            polkit.spawn(["/usr/bin/sleep", "15"]);
        } catch (error) {
            // polkit.log(error)
        }
    });

2. Make the rules work.
3.

Actual results:
The subprocess becomes a zombie.

Expected results:
No zombies.

Additional info:

    [root@localhost ~]# ps -ef |grep polkit |grep -v polkit
    polkitd   1501     1  0 Mar31 ?  00:02:51 /usr/lib/polkit-1/polkitd --no-debug
    polkitd   5060  1501  0 12:37 ?  00:00:00 [sleep] <defunct>
    polkitd   5367  1501  0 12:38 ?  00:00:00 [sleep] <defunct>
    polkitd   5631  1501  0 12:38 ?  00:00:00 [sleep] <defunct>
    polkitd   5915  1501  0 12:38 ?  00:00:00 [sleep] <defunct>
    polkitd  14052  1501  0 12:42 ?  00:00:00 sleep 15

    [root@localhost ~]# journalctl -fu polkit
    -- Logs begin at Sat 2018-03-31 14:36:03 CST. --
    Apr 03 12:39:11 2-3 polkitd[1501]: /etc/polkit-1/rules.d/01-test.rules:5: Error: Error spawning helper: Timed out after 10 seconds (g-io-error-quark, 24)
    Apr 03 12:39:21 2-3 polkitd[1501]: /etc/polkit-1/rules.d/01-test.rules:5: Error: Error spawning helper: Timed out after 10 seconds (g-io-error-quark, 24)
    Apr 03 12:40:11 2-3 polkitd[1501]: /etc/polkit-1/rules.d/01-test.rules:5: Error: Error spawning helper: Timed out after 10 seconds (g-io-error-quark, 24)
    Apr 03 12:40:21 2-3 polkitd[1501]: /etc/polkit-1/rules.d/01-test.rules:5: Error: Error spawning helper: Timed out after 10 seconds (g-io-error-quark, 24)
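The mechanism behind the `<defunct>` entries can be reproduced without polkit. The following is a minimal POSIX C sketch, not polkitd code, and all function names in it are hypothetical. It shows that a child which has exited stays in the process table until its parent collects the exit status with `waitpid()`. A GLib child watch source performs that wait only when its main context is iterated, which is why releasing the loop and context leaves the child unreaped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns 1 if pid has exited but has not been reaped yet (a zombie),
 * 0 if it is still running, -1 on error (for example, no such child).
 * WNOWAIT leaves the child in its waitable state, so nothing is reaped here. */
int child_is_zombie(pid_t pid)
{
    siginfo_t info;
    memset(&info, 0, sizeof info);
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOHANG | WNOWAIT) != 0)
        return -1;
    return info.si_pid == pid;
}

/* Fork a child that exits immediately; the parent does not wait for it. */
pid_t spawn_short_lived_child(void)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);
    return pid;
}

/* Poll every 10 ms until the child is a zombie, an error occurs,
 * or roughly timeout_ms has passed. Returns the last child_is_zombie() result. */
int wait_until_zombie(pid_t pid, long timeout_ms)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    assert(timeout_ms >= 0);
    for (long waited = 0; waited <= timeout_ms; waited += 10) {
        int z = child_is_zombie(pid);
        if (z != 0)
            return z;
        nanosleep(&tick, NULL);
    }
    return 0;
}

/* Collect the exit status; this is the step that removes the zombie.
 * Returns 1 on success, 0 otherwise. */
int reap_child(pid_t pid)
{
    int status;
    return waitpid(pid, &status, 0) == pid;
}
```

A caller would spawn the child, observe that `wait_until_zombie()` reports it as a zombie for as long as nobody waits on it, and then see it disappear from the process table after `reap_child()`.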