Bug 1938323
| Summary: | Wifi settings panel causes gnome-control-panel to hang | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Andrew Mike <amike> | |
| Component: | gnome-control-center | Assignee: | Benjamin Berg <bberg> | |
| gnome-control-center sub component: | Network and Wi-Fi panels | QA Contact: | Filip Pokryvka <fpokryvk> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | high | |||
| Priority: | high | CC: | alanm, bberg, brclark, casantos, cgarnach, feborges, fpokryvk, it, jesse.thilo, jkoten, jwright, mkielian, mkolbas, redhat-bugzilla, sbarcomb, srinidhi.s, tpopela | |
| Version: | 8.3 | Keywords: | Triaged, ZStream | |
| Target Milestone: | rc | Flags: | pm-rhel:
mirror+
|
|
| Target Release: | --- | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | gnome-control-center-3.28.2-29.el8 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2024646 2024649 2024650 2024706 2024707 (view as bug list) | Environment: | ||
| Last Closed: | 2022-05-10 14:03:16 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2024649, 2024650, 2024706, 2024707 | |||
|
Description
Andrew Mike
2021-03-12 19:42:15 UTC
This issue has been fixed upstream. Unfortunately, the fix is really involved and backporting it to RHEL 8 would effectively require updating the network configuration code in its entirety. So, my take is that this is a "wontfix". Unless someone makes a good case for updating the whole network configuration part of gnome-control-center (and it turns out that this is indeed feasible to do). Uhh, wait, I just read the description again. So, my comment was in relation to the fact that updating the list of wifi connections is really slow (CPU time wise). However: dbus-daemon[1546]: [system] The maximum number of pending replies for ":1.33663" (uid=1000 pid=41349 comm="gnome-control-center wifi " label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023") has been reached (max_replies_per_connection=1024) means there is something else going on. So the system bus is getting flooded. But, I am not sure what is happening here … Hey, @bberg, We're encountering the same issue with RHEL 8.3. This issue has become critical for us. 1. Is there any configurational Workaround to overcome it? 2. In your last comment "2021-03-16 10:48:35 UTC", you mentioned you might found something? What is the status of that? When should we expect a fix for RHEL 8.3 for this issue? In this current state, RHEL 8.3 has become a blocker for our product, and moving to a different RHEL version (8.4) is not an option due to this specific issue. We need at least a Workaround (better a fix) to continue with RHEL 8.3. Please advise. The proper fix for this is likely to update gnome-control-center so that it processes requests at a reasonable rate. This will happen in RHEL 9 only, with a low chance of a fix going into RHEL 8.5. You might be able to work around the issue by modifying /etc/dbus-1/system.conf, and adding (in between the <busconfig></busconfig> tags): <limit name="max_replies_per_connection">4096</limit> (or adding a file to /etc/dbus1-/system.d/ with the appropriate line wrapped by the "busconfig" tag). Reboot after changing the configuration. However, it is not clear that this will fix the issue, as there might be a priority inversion issue where processing the replies has a lower priority than updating the UI. Please try to change the configuration and see if the issue still occurs afterwards. Thank you @bberg for the fast reply. I'll try the workaround and will update you accordingly. @bberg, I checked the configuration you gave me. Unfortunately, it didn't work. What I found, after playing with the Wifi window, is that it hangs in different scenarios, for example, - Passing between Wifi Networks, where the to destination network "requires authentication" - it takes ~20 seconds to display the password windows. - Forgetting a WiFi network while connecting to it. Can you help me understand how can I debug this issue and to collect more valuable logs? Are you in a really busy environment with a lot of APs? The old g-c-c code was inefficient at handling the AP list, we are talking of quadratic algorithms here doing UI updates. As such, if you are in a busy environment, then I totally expect the gnome-control-center window to freeze for periods of times and become almost unusable. If this is not a busy environment, then there might be another issue (in particular, things continuing after 25s sounds like a timeout on DBus). If this is a DBus timeout, then you should see error messages in the output of g-c-c (or the logs). You can also try to run "G_MESSAGES_DEBUG=all gnome-control-center" in the terminal which could give you more information about what is going on. @bberg, we're detecting ~25 AP. Maybe more. Just to clarify, RHEL 8.2 is working as expected and is able to provide a good user experience without any major hangs. The issue is only in RHEL 8.3 (looks like a degradation). I'll try to enable the debug you mentioned (thanks for your reply) and to see if I can find something. Any help from your side to assist with this issue will be appreciated. As I mentioned, it is a critical issue as the UX is horrible. Hey @bberg, I need your help understanding my findings. - After enabling the debug on gnome_control_center I noticed that it's in-kind of a loop adding the same AP (~33 in my environment) to its list. - Then I moved to gdb and saw it is signaled to update the list every second or so. The "populate_ap_list" is called over and over again. #0 0x00007f61d936ae83 in g_hash_table_lookup () from /lib64/libglib-2.0.so.0 #1 0x00007f61d9673ce6 in g_signal_emit_valist () from /lib64/libgobject-2.0.so.0 #2 0x00007f61d9675043 in g_signal_emit () from /lib64/libgobject-2.0.so.0 #3 0x00007f61d965cdf4 in g_object_dispatch_properties_changed () from /lib64/libgobject-2.0.so.0 #4 0x00007f61d965f3f1 in g_object_notify_by_pspec () from /lib64/libgobject-2.0.so.0 #5 0x00007f61d8d9c810 in gtk_widget_init () from /lib64/libgtk-3.so.0 #6 0x00007f61d967b8ad in g_type_create_instance () from /lib64/libgobject-2.0.so.0 #7 0x00007f61d965dfc8 in g_object_new_internal () from /lib64/libgobject-2.0.so.0 #8 0x00007f61d965ff4e in g_object_new_valist () from /lib64/libgobject-2.0.so.0 #9 0x00007f61d96602ad in g_object_new () from /lib64/libgobject-2.0.so.0 #10 0x0000560e781742d3 in make_row.constprop () #11 0x0000560e781753e3 in populate_ap_list () #12 0x00007f61d96583bd in g_closure_invoke () from /lib64/libgobject-2.0.so.0 #13 0x00007f61d966b945 in signal_emit_unlocked_R () from /lib64/libgobject-2.0.so.0 #14 0x00007f61d9674a06 in g_signal_emit_valist () from /lib64/libgobject-2.0.so.0 #15 0x00007f61d9675043 in g_signal_emit () from /lib64/libgobject-2.0.so.0 #16 0x00007f61d965cdf4 in g_object_dispatch_properties_changed () from /lib64/libgobject-2.0.so.0 #17 0x00007f61d965c72e in g_object_notify_queue_thaw () from /lib64/libgobject-2.0.so.0 #18 0x00007f61d965f47b in g_object_thaw_notify () from /lib64/libgobject-2.0.so.0 #19 0x00007f61d456deb3 in _dbus_handle_changes_commit () from /lib64/libnm.so.0 #20 0x00007f61d45708f9 in _dbus_managed_objects_changed_cb () from /lib64/libnm.so.0 #21 0x00007f61d995d8d8 in emit_signal_instance_in_idle_cb () from /lib64/libgio-2.0.so.0 #22 0x00007f61d9378fbb in g_idle_dispatch () from /lib64/libglib-2.0.so.0 #23 0x00007f61d937c67d in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #24 0x00007f61d4602416 in _ctx_integ_source_dispatch () from /lib64/libnm.so.0 #25 0x00007f61d937c67d in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #26 0x00007f61d937ca48 in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0 #27 0x00007f61d937cae0 in g_main_context_iteration () from /lib64/libglib-2.0.so.0 #28 0x00007f61d99450ed in g_application_run () from /lib64/libgio-2.0.so.0 - Finally, I added a DEBUG log level to the Network manager and found it is explicitly requested to perform an AP Scan [user@lp0032 ~]$ journalctl -u NetworkManager -b -f | grep -v "wifi-ap" | grep wifi-scan | grep -i start Jul 28 14:21:10 lp0032 NetworkManager[1650]: <debug> [1627471270.8247] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:11 lp0032 NetworkManager[1650]: <debug> [1627471271.0614] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:12 lp0032 NetworkManager[1650]: <debug> [1627471272.5694] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:14 lp0032 NetworkManager[1650]: <debug> [1627471274.0783] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:15 lp0032 NetworkManager[1650]: <debug> [1627471275.5862] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:17 lp0032 NetworkManager[1650]: <debug> [1627471277.0964] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:18 lp0032 NetworkManager[1650]: <debug> [1627471278.6063] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:20 lp0032 NetworkManager[1650]: <debug> [1627471280.1134] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:21 lp0032 NetworkManager[1650]: <debug> [1627471281.6188] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:23 lp0032 NetworkManager[1650]: <debug> [1627471283.1263] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:24 lp0032 NetworkManager[1650]: <debug> [1627471284.6284] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:26 lp0032 NetworkManager[1650]: <debug> [1627471286.1344] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:27 lp0032 NetworkManager[1650]: <debug> [1627471287.6474] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:29 lp0032 NetworkManager[1650]: <debug> [1627471289.1594] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:30 lp0032 NetworkManager[1650]: <debug> [1627471290.6673] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:32 lp0032 NetworkManager[1650]: <debug> [1627471292.1833] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:33 lp0032 NetworkManager[1650]: <debug> [1627471293.6914] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:35 lp0032 NetworkManager[1650]: <debug> [1627471295.1915] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:36 lp0032 NetworkManager[1650]: <debug> [1627471296.6921] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:38 lp0032 NetworkManager[1650]: <debug> [1627471298.1927] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:39 lp0032 NetworkManager[1650]: <debug> [1627471299.6934] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:41 lp0032 NetworkManager[1650]: <debug> [1627471301.1938] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:42 lp0032 NetworkManager[1650]: <debug> [1627471302.6936] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:44 lp0032 NetworkManager[1650]: <debug> [1627471304.1938] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:45 lp0032 NetworkManager[1650]: <debug> [1627471305.6941] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:47 lp0032 NetworkManager[1650]: <debug> [1627471307.1953] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:48 lp0032 NetworkManager[1650]: <debug> [1627471308.6964] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:50 lp0032 NetworkManager[1650]: <debug> [1627471310.1975] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:51 lp0032 NetworkManager[1650]: <debug> [1627471311.6980] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:53 lp0032 NetworkManager[1650]: <debug> [1627471313.1989] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:54 lp0032 NetworkManager[1650]: <debug> [1627471314.6995] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:56 lp0032 NetworkManager[1650]: <debug> [1627471316.1996] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:57 lp0032 NetworkManager[1650]: <debug> [1627471317.6993] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:21:59 lp0032 NetworkManager[1650]: <debug> [1627471319.1995] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:22:00 lp0032 NetworkManager[1650]: <debug> [1627471320.6995] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:22:02 lp0032 NetworkManager[1650]: <debug> [1627471322.2003] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:22:03 lp0032 NetworkManager[1650]: <debug> [1627471323.7005] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) Jul 28 14:22:05 lp0032 NetworkManager[1650]: <debug> [1627471325.2011] device[a40af747817df8c8] (wlp0s20f3): wifi-scan: start explicit scan (0 SSIDs to probe scan) > populate_ap_list
Yeah, that is the n^4 inefficiency that I was talking about. And it seems like we request more and more scans even though the earlier ones are not finished yet.
There might be workarounds, but it is going to be ugly to come up with one in this case. And pulling the proper fix into a z-stream is definitely not possible.
OK, you can try the following build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=38477871 This pushes the AP list refresh into an idle handler. It'll still burn CPU, but hopefully it is smooth and able to actually process all information. @bberg, thank you for the reply. - It seems the link is broken. I don't have access to this link. - Please note that I'm using gnome-control-center-3.28.2-22.el8.x86_64, can it be that it missing a small fix that you made in g-c-c - https://gitlab.gnome.org/GNOME/gnome-control-center/uploads/0e7490f6dcd2273e7f7f38c14366f9c2/0001-wifi-Only-cancel-pending-scan-if-wireless-is-disable.patch Right, of course, the URL is only available from within Red Hat, sorry about that. https://fedorapeople.org/~benzeagnome-control-center-3.28.2-22.idle.el8_3.x86_64.rpm Hmm, I don't remember all the details of the bug. At the end, it led to rewriting the whole UI code. Seems like https://gitlab.gnome.org/GNOME/gnome-control-center/-/merge_requests/287 is basically what I now did here (though, re-done independently …). Sorry, the correct URL is https://fedorapeople.org/~benzea/gnome-control-center-3.28.2-22.idle.el8_3.x86_64.rpm (forgot the /). Thanks a lot @bberg . The fix is working great. The Wifi windows UX is back to normal. The issue I have is only during the RPM installation. Due to the "idle" version suffix, when I install it, it requires all its dependencies with the "idle" suffix. In my environment I used "rpm --nodepts" to remove the old package and to install the new one. We wish to install it as part of our current ansible-playbook. Would it be possible to provide an RPM without that prefix to avoid the dependencies prefix mismatch? We wish to avoid patchy workarounds same as I did to install it on my local PC. Hey @bberg, would you be able to provide the aforementioned RPM build? Sure, https://fedorapeople.org/~benzea/gnome-control-center-3.28.2-22.el8_3.x86_64.rpm Setting devel_ack, as I think the patch is good to be pushed into z-stream. Thank a lot bberg This needs to be fixed in RHEL 8.6. Is a backport to z-stream of earlier RHEL versions desired here? Hey @bberg, 2 days ago I downloaded RHEL 8.5 to see if the fix there and saw that the issues reproduce, and now I see your update. So first of all, thanks for the update and clarification. As for your question, the answer is that we are eager to get a fix for RHEL 8.3. The private RPM fix that you kindly provided still depends on other g-c-c dependencies RPM from the current private build. In order to install our private RPM, we had to perform some "hack" by removing the old g-c-c RPM while keeping its dependencies and installing your private RPM while avoiding its dependencies installation. In that way, we replace one RPM and another RPM while avoiding their dependencies. It is not something that we wish to perform to our product outside dev/testing environment. Would it be possible to backport it to RHEL 8.3 z-stream? I think we'll need a bug for each distro. I can do appropriate clones. Right, my bad. Someone else needs to decide on whether this is backported or not. Hey @bberg and @jkoten Thanks for pushing it. This becomes a blocking issue for us. I noticed a backport requires only RHEL 8.4 & RHEL 8.5. We need it for RHEL 8.3. In order for us to upgrade to RHEL 8.[4,5] is a huge effort, hence we will highly appreciate it if you backport it also for RHEL 8.3. Thanks in advance for all your efforts here. Good morning, @bberg and @jkoten -
First of all, thank you very much for your responsivness.
When we try to install the supplied RPM (gnome-control-center-3.28.2-22.el8_3.x86_64.rpm) on a 8.3 system we run into a slight dependncy issue:
Error:
Problem: conflicting requests
- nothing provides gnome-control-center-filesystem = 3.28.2-22.el8_3 needed by gnome-control-center-3.28.2-22.el8_3.x86_64
while the available version for the gnome-control-center-filesystem package is close (but no cigar):
$ rpm --query gnome-control-center-filesystem
gnome-control-center-filesystem-3.28.2-22.el8.noarch
a) Is the difference of el8_3 != el8, a real difference or is it just a build issue that can be rectified with a re-building of the RPM?
b) What are the ramifications of this conflict? If we "force" the dependency are we likely to run into any compatability issues?
c) Are we correct in assuming that the fix is in the scanning algorithm and not in the interface between the components?
Thank you in advance
Yaacov
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (gnome-control-center bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1870 Cu is still seeing slowdown in wi-fi scanning |