Bug 1436605

Summary: Anaconda stuck in a logging lock + import lock deadlock.
Product: Red Hat Enterprise Linux 7 Reporter: Radek Vykydal <rvykydal>
Component: anaconda Assignee: Anaconda Maintenance Team <anaconda-maint-list>
Status: CLOSED DUPLICATE QA Contact: Release Test Team <release-test-team-automation>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.3 CC: cstratak, pholica
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-05-11 13:41:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1392968    
Attachments:
Description Flags
backtrace of hanging anaconda process none

Description Radek Vykydal 2017-03-28 09:45:49 UTC
Created attachment 1266893 [details]
backtrace of hanging anaconda process

The issue started to appear as a race condition hit when running our local kickstart tests [2] after the Gtk rebase during 7.4 development [1], when PyGIWarnings for importing without a version specified started to appear. The warnings should be handled (disappear) in the scope of bug 1433943, so I think there is a high chance the freezing issue will disappear with the warnings, but I am filing this bug for the record, as we might try to fix the root cause or refer to this bug if we hit the issue again later. As can be seen from the attached backtrace, Thread 1 and Thread 4 are in a deadlock.

Thread 1 (main) is holding the import lock while importing the datetime spoke, which triggers a PyGIWarning on import from the gi repository. The warning is passed to the anaconda logging handler, which is waiting for the logging lock held by Thread 4.

Thread 4 (storage initialization thread) is logging from blivet, with a Size object as part of the message. Formatting the Size object's value involves translation, which triggers the import of locale.normalize inside a function of the gettext Python module; that import is waiting to acquire the import lock held by Thread 1.

(Thread 3 is waiting for the logging lock; Thread 2 is waiting to join Thread 4.)
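
For illustration, the lock-order inversion between Thread 1 and Thread 4 can be reproduced with two ordinary locks. This is only a minimal sketch, not anaconda code: `import_lock` and `logging_lock` are hypothetical stand-ins for the interpreter's import lock and the logging handler lock, the thread names merely mirror the backtrace, and it assumes Python 3 (for `Lock.acquire(timeout=...)` and `threading.Barrier`) so that the demo terminates instead of hanging.

```python
import threading


def run_demo(timeout=0.5):
    """Reproduce the lock-order inversion with two stand-in locks."""
    # Hypothetical stand-ins, not the interpreter's real import/logging locks.
    import_lock = threading.Lock()
    logging_lock = threading.Lock()
    # Used twice: first so both threads hold their first lock before asking
    # for the second, then so neither releases before both attempts are over.
    barrier = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            barrier.wait()
            # In the real hang this acquire blocks forever.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            barrier.wait()
            if acquired:
                second.release()

    threads = [
        # Like Thread 1: holds the import lock, then wants the logging lock.
        threading.Thread(target=worker,
                         args=("thread1", import_lock, logging_lock)),
        # Like Thread 4: holds the logging lock, then wants the import lock.
        threading.Thread(target=worker,
                         args=("thread4", logging_lock, import_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    print(run_demo())
```

Neither stand-in thread can obtain its second lock while the other still holds it, so both attempts time out; without the timeout the two threads would wait on each other indefinitely, which matches the hang in the backtrace.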

We are using Python 2.7. For newer versions of Python the issue would be mitigated or fixed by two changes in Python:

- removing the lazy/local import "from locale import normalize" from gettext.py (it seems to have been there only because of a locale name conflict) - https://github.com/python/cpython/commit/31e87203248047aa99ddbb4165b3b20c758196d8
- per-module import lock in Python 3.3 - https://github.com/python/cpython/blob/d5adb7f65d30afd00921e6c22e9e2b8c323c058d/Doc/whatsnew/3.3.rst#a-finer-grained-import-lock
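
The difference that the first change makes can be sketched as follows. The two functions are hypothetical, simplified illustrations and not the actual gettext code; the helper uses `dis.get_instructions`, so the snippet assumes Python 3. It shows only that a function-local import executes an import statement on every call, whereas a module-level import leaves a plain name lookup in the function body.

```python
import dis
from locale import normalize  # resolved once, when this module is loaded


def expand_lang_module_level(loc):
    # Only a global name lookup happens here; no import statement runs.
    return normalize(loc)


def expand_lang_local_import(loc):
    # The import statement is executed on every call.
    from locale import normalize as local_normalize
    return local_normalize(loc)


def runs_import_when_called(func):
    """Return True if the function body contains an import instruction."""
    return any(ins.opname == "IMPORT_NAME"
               for ins in dis.get_instructions(func))


if __name__ == "__main__":
    for func in (expand_lang_module_level, expand_lang_local_import):
        print(func.__name__, runs_import_when_called(func))
```

In the deadlock described in this bug, it is the call-time import in the logging thread that has to wait for the import lock, so moving the import to module level takes that wait out of the logging path. It does not address other imports that might happen while the logging lock is held.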


[1] e.g. http://download.eng.brq.redhat.com/pub/rhel/nightly/RHEL-7.4-20170323.n.0/compose/Server/x86_64/os/
[2] see https://bugzilla.redhat.com/show_bug.cgi?id=1431618#c0 for details on the environment of kickstart-tests where I am seeing the deadlock

Comment 2 Radek Vykydal 2017-03-28 09:57:25 UTC
Any advice / recommendation from Python side please?

Comment 3 Charalampos Stratakis 2017-03-28 11:04:20 UTC
I don't have free cycles this week to check that, but some days ago I stumbled upon this upstream issue which might be relevant [0]. Could you take a look?

[0] http://bugs.python.org/issue6721

Comment 4 Radek Vykydal 2017-03-29 11:25:13 UTC
(In reply to Charalampos Stratakis from comment #3)

> [0] http://bugs.python.org/issue6721

I think our case is not related to the issues caused by combining threading and forking. Both deadlocked threads are running in a single process.
For the record, removing the local import from _expand_lang as in https://github.com/python/cpython/commit/31e87203248047aa99ddbb4165b3b20c758196d8 does fix the issue for me, though it is not a complete solution to the problem.

Comment 5 Radek Vykydal 2017-04-04 13:28:07 UTC
(In reply to Radek Vykydal from comment #0)
> Created attachment 1266893 [details]
> backtrace of hanging anaconda process
> 
> The issue started to appear as a race condition hit when running our local
> kickstart tests [2] after Gtk rebase during 7.4 development [1], when
> PyGIWarnings for importing without version specified started to appear. The
> warnings should be handled (disappear) in scope of bug 1433943 so I think
> there is high chance the freezing issue will disappear with the warnings,

The warnings should disappear in anaconda-21.48.22.107-1
(https://bugzilla.redhat.com/show_bug.cgi?id=1433943#c3)

Comment 6 Samantha N. Bueno 2017-05-10 08:52:29 UTC
*** Bug 1437836 has been marked as a duplicate of this bug. ***

Comment 7 Radek Vykydal 2017-05-11 13:41:03 UTC
(In reply to Radek Vykydal from comment #5)
> (In reply to Radek Vykydal from comment #0)
> > Created attachment 1266893 [details]
> > backtrace of hanging anaconda process
> > 
> > The issue started to appear as a race condition hit when running our local
> > kickstart tests [2] after Gtk rebase during 7.4 development [1], when
> > PyGIWarnings for importing without version specified started to appear. The
> > warnings should be handled (disappear) in scope of bug 1433943 so I think
> > there is high chance the freezing issue will disappear with the warnings,
> 
> The warnings should disappear in anaconda-21.48.22.107-1
> (https://bugzilla.redhat.com/show_bug.cgi?id=1433943#c3)

Marking as a duplicate of bug 1433943.

*** This bug has been marked as a duplicate of bug 1433943 ***