Bug 1716691

Summary: [abrt] gnome-initial-setup: __freelocale(): gnome-initial-setup killed by SIGSEGV (when a locale is unexpectedly unavailable)
Product: [Fedora] Fedora Reporter: Adam Williamson <awilliam>
Component: gnome-initial-setup Assignee: Rui Matos <tiagomatos>
Status: CLOSED UPSTREAM QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: fweimer, gnome-sig, jstpierr, mcatanzaro+wrong-account-do-not-cc, petersen, tiagomatos
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: abrt_hash:996776c267be14dad92a9ca4e0fd46b4dc36dfe1;VARIANT_ID=workstation; openqa
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 21:24:28 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1716710    
Bug Blocks:    
Attachments:
Description             Flags
File: backtrace         none
File: cgroup            none
File: core_backtrace    none
File: cpuinfo           none
File: dso_list          none
File: environ           none
File: exploitable       none
File: limits            none
File: maps              none
File: mountinfo         none
File: open_fds          none
File: proc_pid_status   none

Description Adam Williamson 2019-06-03 23:05:48 UTC
Description of problem:
Happens during openQA upgrade tests; these involve upgrading a system from F29 or F30 to F31 *before* g-i-s has run, so g-i-s runs for the first time *after* the upgrade has completed. When the upgrade has completed and we boot the upgraded system, it seems g-i-s crashes in this way.

Version-Release number of selected component:
gnome-initial-setup-3.33.2-1.fc31

Additional info:
reporter:       libreport-2.10.0
backtrace_rating: 4
cmdline:        /usr/libexec/gnome-initial-setup --existing-user
crash_function: __freelocale
executable:     /usr/libexec/gnome-initial-setup
journald_cursor: s=75140769fa4c44a8a68910d0e147b588;i=3fec;b=188b0b3d393e42d182357eb76fb05abf;m=4e38c1d;t=58a6ca463a107;x=e7ec9e87372d6c06
kernel:         5.2.0-0.rc2.git1.2.fc31.x86_64
rootdir:        /
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (10 frames)
 #0 __freelocale at freelocale.c:44
 #1 welcome at ../gnome-initial-setup/pages/language/gis-welcome-widget.c:128
 #2 fill_stack at ../gnome-initial-setup/pages/language/gis-welcome-widget.c:171
 #3 gis_welcome_widget_constructed at ../gnome-initial-setup/pages/language/gis-welcome-widget.c:189
 #4 g_object_new_internal at ../gobject/gobject.c:1867
 #6 _gtk_builder_construct at gtkbuilder.c:718
 #7 builder_construct at gtkbuilderparser.c:139
 #9 end_element at gtkbuilderparser.c:1075
 #10 emit_end_element at ../glib/gmarkup.c:1093
 #11 g_markup_parse_context_parse at ../glib/gmarkup.c:1643

Comment 1 Adam Williamson 2019-06-03 23:05:50 UTC
Created attachment 1576777
File: backtrace

Comment 2 Adam Williamson 2019-06-03 23:05:51 UTC
Created attachment 1576778
File: cgroup

Comment 3 Adam Williamson 2019-06-03 23:05:52 UTC
Created attachment 1576779
File: core_backtrace

Comment 4 Adam Williamson 2019-06-03 23:05:53 UTC
Created attachment 1576780
File: cpuinfo

Comment 5 Adam Williamson 2019-06-03 23:05:54 UTC
Created attachment 1576781
File: dso_list

Comment 6 Adam Williamson 2019-06-03 23:05:55 UTC
Created attachment 1576782
File: environ

Comment 7 Adam Williamson 2019-06-03 23:05:56 UTC
Created attachment 1576783
File: exploitable

Comment 8 Adam Williamson 2019-06-03 23:05:57 UTC
Created attachment 1576784
File: limits

Comment 9 Adam Williamson 2019-06-03 23:05:59 UTC
Created attachment 1576785
File: maps

Comment 10 Adam Williamson 2019-06-03 23:05:59 UTC
Created attachment 1576786
File: mountinfo

Comment 11 Adam Williamson 2019-06-03 23:06:00 UTC
Created attachment 1576787
File: open_fds

Comment 12 Adam Williamson 2019-06-03 23:06:01 UTC
Created attachment 1576788
File: proc_pid_status

Comment 13 Adam Williamson 2019-06-03 23:07:16 UTC
Problem seems to be that locale is null, I think?

Comment 14 Adam Williamson 2019-06-03 23:13:43 UTC
This looks a lot like it's probably caused by this commit by mcatanzaro:

https://gitlab.gnome.org/GNOME/gnome-initial-setup/commit/83696b544e241293233fa7eb09a0113d722a89e9

so, CCing him.

Comment 15 Adam Williamson 2019-06-04 00:22:12 UTC
I think https://bugzilla.redhat.com/show_bug.cgi?id=1715891 may have caused this. This bug seems to have first shown up in the Fedora-Rawhide-20190603.n.0 tests, and that was the compose in which Rawhide went from glibc-2.29.9000-21.fc31 to glibc-2.29.9000-23.fc31.

Comment 16 Michael Catanzaro 2019-06-04 00:29:06 UTC
Two problems:

 (a) My code is broken because it assumes newlocale() succeeds and never checks the return value. I will change the code to check the return value and print an appropriate error message when the call fails.
 (b) GNOME expects all available locales to be installed. It appears the crash is caused by the Chinese locale disappearing between June 2 rawhide (last good) and June 3 rawhide (first bad). We currently have no plans to change gnome-initial-setup, gnome-control-center, and other applications that offer language selection widgets (e.g. Epiphany) for them to function properly in the absence of particular locales. If locales are missing I think they would generally just disappear from the list of locales that are available for selection, but certain locales are hardcoded to be presented first, including Chinese, and we have no plans to change that at this time.
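
For illustration only, point (a) could look roughly like the sketch below. It is not the actual gnome-initial-setup patch: the helper name try_locale and the error message are made up for this example, and only the newlocale(), uselocale() and freelocale() calls are the real POSIX interfaces. The idea is simply to bail out before a null locale handle ever reaches uselocale() or freelocale().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the actual gnome-initial-setup code.
 * Temporarily switches the calling thread to locale_id.
 * Returns 1 if the locale could be loaded, 0 otherwise. */
static int
try_locale (const char *locale_id)
{
  locale_t locale = newlocale (LC_MESSAGES_MASK, locale_id, (locale_t) 0);

  if (locale == (locale_t) 0)
    {
      /* newlocale() failed: report it and never pass the null handle
       * on to uselocale() or freelocale(). */
      fprintf (stderr, "Failed to create locale %s: %s\n",
               locale_id, strerror (errno));
      return 0;
    }

  locale_t old_locale = uselocale (locale);
  /* ... look up the translated strings for this locale here ... */
  uselocale (old_locale);
  freelocale (locale);
  return 1;
}
```

A caller could skip any language for which try_locale() returns 0, which would match the expectation that missing locales just disappear from the list.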

Comment 17 Michael Catanzaro 2019-06-04 00:32:30 UTC
BTW to be clear: I will fix (a), but someone who knows how locale installation is supposed to work will have to comment on (b). We don't have any code to install locales in gnome-initial-setup, or (to the best of my knowledge) gnome-control-center, etc.

Comment 18 Adam Williamson 2019-06-04 00:42:42 UTC
Note, this is only happening on the *upgrade* tests. g-i-s is running OK in fresh install tests, e.g. https://openqa.fedoraproject.org/tests/408461 . I'll try and figure out what it is about upgrading specifically that causes this problem.

Comment 19 Adam Williamson 2019-06-04 01:16:22 UTC
Huh, OK, so the cause seems relatively simple: somehow, on upgrade, /usr/lib/locale/locale-archive just goes completely missing.

I just manually reproduced what the openQA test in question does: installed Fedora 30 Workstation (from the Everything netinst), then upgraded to Rawhide using `dnf system-upgrade`. The upgraded system simply has no /usr/lib/locale/locale-archive file at all. `rpm -V glibc-all-langpacks` shows:

missing     /usr/lib/locale/locale-archive

I don't know *why* this happens, yet.
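
As a rough sketch, the same presence check could be made from C before relying on the archive. The helper name regular_file_exists is hypothetical; only stat() and S_ISREG are real POSIX interfaces, and the path is the one reported missing above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if path exists and is a regular file,
 * 0 otherwise (missing, not accessible, or not a regular file). */
static int
regular_file_exists (const char *path)
{
  struct stat st;

  return stat (path, &st) == 0 && S_ISREG (st.st_mode);
}
```

On the upgraded system described above, regular_file_exists ("/usr/lib/locale/locale-archive") would be expected to return 0.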

Comment 20 Carlos O'Donell 2019-06-04 01:22:43 UTC
(In reply to Adam Williamson from comment #18)
> Note, this is only happening on the *upgrade* tests. g-i-s is running OK in
> fresh install tests, e.g. https://openqa.fedoraproject.org/tests/408461 .
> I'll try and figure out what it is about upgrading specifically that causes
> this problem.

Do you have glibc-all-langpacks installed?

Only if you have glibc-all-langpacks do you also have /usr/lib/locale/locale-archive which is the mmap'able copy of all the installed locales ready for use by the process.

The change we made was to copy in a complete version of this file without processing or filtering it in any way.

You shouldn't be able to tell the difference.

Are you running processes during %post?

Has rpm somehow not upgraded /usr/lib/locale/locale-archive?

Can you get us a copy of that file for your system?

Comment 21 Adam Williamson 2019-06-04 01:23:08 UTC
Filed https://bugzilla.redhat.com/show_bug.cgi?id=1716710 .

Comment 22 Carlos O'Donell 2019-06-04 01:29:45 UTC
(In reply to Adam Williamson from comment #19)
> Huh, OK, so the cause seems relatively simple: somehow, on upgrade,
> /usr/lib/locale/locale-archive just goes completely missing.
> 
> I just recreated what the openQA test in question does manually: installed
> Fedora 30 Workstation (from the Everything netinst), then upgraded to
> Rawhide using dnf system-upgrade . I checked the upgraded system, and it
> simply has no /usr/lib/locale/locale-archive file at all. `rpm -V
> glibc-all-langpacks` shows:
> 
> missing     /usr/lib/locale/locale-archive
> 
> I don't know *why* this happens, yet.

This may be a transitional issue we didn't consider.

So the old glibc will have a %postun to remove /usr/lib/locale/locale-archive.

The old glibc used to recreate locale-archive via %posttrans.

The new glibc will install locale-archive normally as a normal installed file.

It may be the case that the old glibc's %postun is deleting the new glibc's locale-archive.

Then the new glibc doesn't do anything to reinstall it.
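
The hypothesis above can be written down as a toy model. This is only a sketch of the suspected ordering, assuming that during an upgrade the new package's files are installed before the old package's %postun runs; it does not call rpm in any way. The function name and the new_posttrans_regenerates parameter are invented for illustration, and that parameter does not describe any actual glibc change.

```c
#include <stdbool.h>

/* Toy model of the suspected upgrade sequence, not real RPM behaviour.
 * Returns whether locale-archive exists once the transaction is done.
 * new_posttrans_regenerates is a hypothetical switch: would the new
 * package rebuild the archive at the end of the transaction? */
static bool
archive_present_after_upgrade (bool new_posttrans_regenerates)
{
  bool archive_present;

  /* 1. The new glibc installs locale-archive as a normal packaged file. */
  archive_present = true;

  /* 2. The old glibc's %postun runs afterwards and removes that path. */
  archive_present = false;

  /* 3. Only a scriptlet of the new package could restore it now. */
  if (new_posttrans_regenerates)
    archive_present = true;

  return archive_present;
}
```

With new_posttrans_regenerates set to false, which is the situation described in this comment, the model ends with the archive missing, matching the `rpm -V` output reported earlier.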

Comment 23 Adam Williamson 2019-06-04 15:11:48 UTC
I wasn't setting a 'depends on' because the crash in g-i-s indicates a bug regardless of the *reason* why the locale couldn't be loaded, just for the record...

Comment 24 Michael Catanzaro 2019-06-04 21:24:28 UTC
I don't think we need to track this separately since the crash will go away as soon as bug #1716710 is fixed, but for the record, I've submitted a fix in https://gitlab.gnome.org/GNOME/gnome-initial-setup/merge_requests/39.