Bug 2178299 - openbox crashing randomly after the upgrade to F38
Summary: openbox crashing randomly after the upgrade to F38
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: openbox
Version: 38
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Miroslav Lichvar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 2190035 2190055 2192188
Depends On:
Blocks:
Reported: 2023-03-14 18:42 UTC by Petr Stodulka
Modified: 2023-05-03 02:29 UTC (History)

Fixed In Version: openbox-3.6.1-22.fc38
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-03 02:29:03 UTC
Type: Bug
Embargoed:


Attachments
journalctl-snippet.log (15.51 KB, text/plain)
2023-03-14 18:43 UTC, Petr Stodulka
xsession-errors after the crash (912 bytes, text/plain)
2023-03-14 18:47 UTC, Petr Stodulka
core_backtrace (4.36 KB, text/plain)
2023-03-14 18:52 UTC, Petr Stodulka
coredump.tgz (1.10 MB, application/gzip)
2023-03-14 18:56 UTC, Petr Stodulka
list of downgraded rpms (4.67 KB, text/plain)
2023-03-22 10:30 UTC, Petr Stodulka

Description Petr Stodulka 2023-03-14 18:42:34 UTC
Description of problem:
After upgrading F36 -> F37 -> F38, openbox started to crash randomly, closing the session and dropping back to the display manager. Unfortunately I do not have a clear reproducer, as the trigger changes all the time. The following actions trigger the crash at random:
  * switch between windows
  * open new window (e.g. start firefox)
  * highlight text
  * move a window between displays
  * switch between workspaces

I've tried to reduce the number of running applications to a minimum to cut down the noise for the investigation (e.g. tint2 often showed up in crash logs, so I stopped using it, but I'm still seeing the crashes). I also put SELinux into permissive mode, which made no difference to the original issue.


Version-Release number of selected component (if applicable):
  * openbox-3.6.1-21.fc38.x86_64

Additional info:
  * Up-to-date F38
  * context menu does not appear in terminal (xterm, terminator)
    after clicking right mouse button
  * Display manager: lightdm
  * 3 active displays
  * The following error is often visible in the logs; it is sometimes separated
    in time from the crashes, so I guess it's not relevant:
        pam_systemd(login:session): Failed to release session: Access denied
  * Despite the changes to the system, abrt always produces the same crash dump
    and removes the other crashes as duplicates. Today I hit 19 crashes in ~5h.
    Openbox usually crashes within ~2min, but I had one lucky ~2h session without issues.
  * Most of the memory is free (64 GB RAM, ~4 GB used)
  * systemd-oomd is disabled & masked

Additional data is in the attachments.

Comment 1 Petr Stodulka 2023-03-14 18:43:47 UTC
Created attachment 1950731
journalctl-snippet.log

Comment 2 Petr Stodulka 2023-03-14 18:47:31 UTC
Created attachment 1950743
xsession-errors after the crash

Comment 3 Petr Stodulka 2023-03-14 18:52:43 UTC
Created attachment 1950744
core_backtrace

Comment 4 Petr Stodulka 2023-03-14 18:56:18 UTC
Created attachment 1950745
coredump.tgz

Comment 5 Petr Stodulka 2023-03-22 10:30:52 UTC
Created attachment 1952708
list of downgraded rpms

I've tried downgrading a number of packages, without success in fixing the bug; the list of downgraded packages, taken from the dnf transaction history, is in the attachment. I tried to downgrade more packages, but I started hitting various additional problems and did not have enough time to continue the experiments in more detail, so I'm sharing at least this. (E.g. I haven't tried to downgrade lightdm, which I only realized just now.)


I currently do not have a system to reproduce the issue on, as it seems I successfully downgraded back to F37. I am actually surprised the system survived that and seems to be working as before, without any unusual errors.

When I have time, I will try a clean installation to see whether the same happens on a clean F38 system.

Comment 6 Miroslav Lichvar 2023-04-27 08:35:28 UTC
*** Bug 2190035 has been marked as a duplicate of this bug. ***

Comment 7 Jonathan Buch 2023-04-27 14:55:53 UTC
The core backtrace

{   "address": 94693188868051
,   "build_id": "07cc0b5ced9a6f556aace94d1c24ce80901e98c7"
,   "build_id_offset": 124883
,   "function_name": "client_calc_layer"
,   "file_name": "/usr/bin/openbox"
}

is exactly the one from bug #2190055 and https://retrace.fedoraproject.org/faf/reports/648708/

I've left a patch which works for me in the other bug.
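
For readers wondering why the two reports can be called the same crash even though the absolute address differs per process: a minimal sketch, assuming frames are matched on `build_id` plus `build_id_offset` rather than on `address` (which shifts with address-space randomization). The `Frame` struct and helper names are hypothetical, and the second frame's address is made up for illustration; only the first frame's values come from the backtrace above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of one core_backtrace frame. */
typedef struct {
    uint64_t address;         /* absolute address, differs per process */
    const char *build_id;     /* identifies the exact binary build */
    uint64_t build_id_offset; /* offset of the frame inside that binary */
    const char *function_name;
} Frame;

/* Two frames name the same crash site if binary and offset agree. */
bool same_crash_site(const Frame *a, const Frame *b)
{
    return strcmp(a->build_id, b->build_id) == 0
        && a->build_id_offset == b->build_id_offset;
}

/* Load base implied by a frame: absolute address minus offset. */
uint64_t load_base(const Frame *f)
{
    return f->address - f->build_id_offset;
}

bool demo_frames_match(void)
{
    /* Values taken from the backtrace in this comment. */
    Frame reported = { 94693188868051ULL,
                       "07cc0b5ced9a6f556aace94d1c24ce80901e98c7",
                       124883, "client_calc_layer" };
    /* Made-up frame: same build and offset, different load base. */
    Frame other = { 0x555555554000ULL + 124883,
                    "07cc0b5ced9a6f556aace94d1c24ce80901e98c7",
                    124883, "client_calc_layer" };
    return same_crash_site(&reported, &other)
        && load_base(&reported) != load_base(&other);
}
```

`demo_frames_match()` returns true here: the two frames have different absolute addresses but identical build ID and offset.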

Comment 8 Miroslav Lichvar 2023-04-27 14:58:45 UTC
*** Bug 2190055 has been marked as a duplicate of this bug. ***

Comment 9 Miroslav Lichvar 2023-04-27 15:02:32 UTC
There is an upstream bug report: https://bugzilla.icculus.org/show_bug.cgi?id=6669

It also has a proposed patch, slightly smaller than the one in our bug #2190055. It seems it wasn't accepted yet (or I don't know which is the right upstream repository). If you can verify that it fixes the issue for you, I can add it to our package.

Comment 10 Jonathan Buch 2023-04-27 15:59:41 UTC
Had a look at the one from
https://bugs.archlinux.org/task/77853
and https://bugzilla-attachments.icculus.org/attachment.cgi?id=3646
and it copies the full list before iterating.

The one with the loop in
https://bbs.archlinux.org/viewtopic.php?id=284299
is too opaque for me (I can't prove offhand that it will terminate).

My version only copies part of the list, not the full list, so it might be ever so slightly faster and need less memory.  However, I'm not sure it is NULL-safe around freeing the list.  The documentation for g_list_free does not seem to specify.
However, I've been using my code for a day and it seems OK.

The one in
https://bugzilla-attachments.icculus.org/attachment.cgi?id=3647
"repairs" the list after corrupting it.  I can't prove offhand that the code would be fine in all cases.

Taking the patch from the arch tracker should be good I guess.
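
To illustrate the "copy the full list before iterating" approach in general terms, here is a minimal sketch. It is not the actual patch and does not use openbox's or GLib's types: `Cell`, `list_copy`, `list_remove` and the other names are hypothetical stand-ins (roughly analogous to `GList` and `g_list_copy`). The point is only that the walk runs over a private snapshot, so cells unlinked from the live list during the walk cannot invalidate the iterator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list cell; the payload is owned elsewhere. */
typedef struct Cell {
    int *data;
    struct Cell *next;
} Cell;

static Cell *cell_prepend(Cell *head, int *data)
{
    Cell *c = malloc(sizeof *c);
    assert(c != NULL);
    c->data = data;
    c->next = head;
    return c;
}

/* Shallow copy: new cells, same payload pointers, same order. */
static Cell *list_copy(const Cell *head)
{
    Cell *copy = NULL, **tail = &copy;
    for (; head != NULL; head = head->next) {
        *tail = cell_prepend(NULL, head->data);
        tail = &(*tail)->next;
    }
    return copy;
}

/* Frees the cells only; a NULL (empty) list is a no-op. */
static void list_free(Cell *head)
{
    while (head != NULL) {
        Cell *next = head->next;
        free(head);
        head = next;
    }
}

/* Unlink and free the live cell that holds `data`, if any. */
static Cell *list_remove(Cell *head, const int *data)
{
    for (Cell **pp = &head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->data == data) {
            Cell *dead = *pp;
            *pp = dead->next;
            free(dead);
            break;
        }
    }
    return head;
}

/* Visit every payload once; odd values are removed from the live
 * list mid-walk, which is safe because we iterate the snapshot. */
int visit_with_snapshot(Cell **live)
{
    Cell *snapshot = list_copy(*live);
    int visited = 0;
    for (Cell *it = snapshot; it != NULL; it = it->next) {
        visited++;
        if (*it->data % 2 != 0)
            *live = list_remove(*live, it->data);
    }
    list_free(snapshot);
    return visited;
}

/* Builds the list 1..5, runs the walk, returns how many cells remain. */
int demo_remaining(void)
{
    static int values[] = { 1, 2, 3, 4, 5 };
    Cell *live = NULL;
    for (int i = 4; i >= 0; i--)
        live = cell_prepend(live, &values[i]);
    int visited = visit_with_snapshot(&live);
    assert(visited == 5);
    int remaining = 0;
    for (Cell *it = live; it != NULL; it = it->next)
        remaining++;
    list_free(live);
    return remaining;
}
```

The trade-off is the one mentioned above: a full copy costs one allocation per element on every walk, in exchange for a loop whose termination and memory safety are easy to reason about.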

Comment 11 Fedora Update System 2023-04-27 16:50:47 UTC
FEDORA-2023-6bf51a3399 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-6bf51a3399

Comment 12 Fedora Update System 2023-04-28 04:37:23 UTC
FEDORA-2023-6bf51a3399 has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-6bf51a3399`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-6bf51a3399

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 13 Miroslav Lichvar 2023-05-02 07:27:28 UTC
*** Bug 2192188 has been marked as a duplicate of this bug. ***

Comment 14 Fedora Update System 2023-05-03 02:29:03 UTC
FEDORA-2023-6bf51a3399 has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make note of it in this bug report.

