Bug 1631759 - gcr-prompter segfaults on ppc64le
Summary: gcr-prompter segfaults on ppc64le
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: gcr
Version: 30
Hardware: ppc64le
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Matthias Clasen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: PPCTracker
Reported: 2018-09-21 13:46 UTC by Dan Horák
Modified: 2019-06-17 18:27 UTC
CC List: 13 users

Fixed In Version: gcr-3.28.1-4.fc30
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-17 18:27:38 UTC
Type: Bug
Embargoed:


Attachments
preprocessed source file (616.20 KB, text/plain)
2018-10-15 12:35 UTC, Dan Horák

Description Dan Horák 2018-09-21 13:46:20 UTC
Description of problem:
gcr-prompter segfaults on F-28 ppc64le making it difficult (impossible) to enter the password for ssh keys or similar.

[dan@talos ~]$ coredumpctl info 11654
           PID: 11654 (gcr-prompter)
           UID: 1000 (dan)
           GID: 1000 (dan)
        Signal: 11 (SEGV)
     Timestamp: Fri 2018-09-21 12:27:27 CEST (3h 12min ago)
  Command Line: /usr/libexec/gcr-prompter
    Executable: /usr/libexec/gcr-prompter
 Control Group: /user.slice/user-1000.slice/user/dbus.service
          Unit: user
     User Unit: dbus.service
         Slice: user-1000.slice
     Owner UID: 1000 (dan)
       Boot ID: db6e984a640c44a5ac3baa811d2a5cdb
    Machine ID: d94ac98ea91043d3892dab218d99209d
      Hostname: talos.danny.cz
       Storage: /var/lib/systemd/coredump/core.gcr-prompter.1000.db6e984a640c44a5ac3baa811d2a5cdb.11654.1537525647000000.lz4
       Message: Process 11654 (gcr-prompter) of user 1000 dumped core.
                
                Stack trace of thread 11654:
                #0  0x00007fffb5519990 g_value_object_peek_pointer (libgobject-2.0.so.0)
                #1  0x00007fffb5519990 n/a (libgobject-2.0.so.0)
                #2  0x00007fffb5519990 n/a (libgobject-2.0.so.0)


The traceback has very little info; I'll try to investigate it further.


Version-Release number of selected component (if applicable):
gcr-3.28.0-1.fc28.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. start seahorse
2. create new keyring
3. crash before being able to enter the keyring password

Actual results:
segfault

Expected results:
no segfault

Additional info:
https://retrace.fedoraproject.org/faf/reports/bthash/6544e2b1dd64d49db76118d6872698ee3e589a50

Comment 1 Dan Horák 2018-09-21 14:17:37 UTC
journal has these entries when the crash happens

zář 21 15:57:11 talos.danny.cz sudo[15467]: pam_unix(sudo:session): session closed for user root
zář 21 15:57:11 talos.danny.cz audit[15467]: USER_END pid=15467 uid=0 auid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=pam_keyinit,pam_limits,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/4 res=success'
zář 21 15:57:11 talos.danny.cz audit[15467]: CRED_DISP pid=15467 uid=0 auid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_localuser,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/4 res=success'
zář 21 15:57:13 talos.danny.cz audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
zář 21 15:57:13 talos.danny.cz systemd[1]: Started man-db-cache-update.service.
zář 21 15:57:13 talos.danny.cz audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=man-db-cache-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
zář 21 15:57:13 talos.danny.cz audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=man-db-cache-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
zář 21 15:57:13 talos.danny.cz audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=run-r8aa5ebd358204dc5af87014605d7dfb4 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
zář 21 15:57:19 talos.danny.cz systemd[1]: Started PC/SC Smart Card Daemon.
zář 21 15:57:19 talos.danny.cz audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pcscd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
zář 21 15:57:32 talos.danny.cz dbus-daemon[3167]: [session uid=1000 pid=3167] Activating service name='org.gnome.keyring.SystemPrompter' requested by ':1.23' (uid=1000 pid=3149 comm="/usr/bin/gnome-keyring-daemon --daemonize --login " label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
zář 21 15:57:32 talos.danny.cz kernel: gcr-prompter[15876]: segfault (11) at 9 nip 7fffaae79990 lr 7fffaae79990 code 1 in libgobject-2.0.so.0.5600.1[7fffaae60000+70000]
zář 21 15:57:32 talos.danny.cz kernel: gcr-prompter[15876]: code: 4e800020 00000000 00000000 00000000 39200000 f9230008 4e800020 00000000 
zář 21 15:57:32 talos.danny.cz kernel: gcr-prompter[15876]: code: 00000000 00000000 60000000 60420000 <e8630008> 4e800020 00000000 00000000 
zář 21 15:57:32 talos.danny.cz audit[15876]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=3 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=15876 comm="gcr-prompter" exe="/usr/libexec/gcr-prompter" sig=11 res=1
zář 21 15:57:32 talos.danny.cz audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@9-15881-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
zář 21 15:57:32 talos.danny.cz systemd[1]: Started Process Core Dump (PID 15881/UID 0).
zář 21 15:57:32 talos.danny.cz gcr-prompter[15876]: bus acquired: org.gnome.keyring.SystemPrompter
zář 21 15:57:32 talos.danny.cz gcr-prompter[15876]: Gcr: registering prompter
zář 21 15:57:32 talos.danny.cz gcr-prompter[15876]: bus acquired: org.gnome.keyring.PrivatePrompter
zář 21 15:57:32 talos.danny.cz dbus-daemon[3167]: [session uid=1000 pid=3167] Successfully activated service 'org.gnome.keyring.SystemPrompter'
zář 21 15:57:32 talos.danny.cz gcr-prompter[15876]: Gcr: received BeginPrompting call from callback /org/gnome/keyring/Prompt/p18@:1.23
zář 21 15:57:32 talos.danny.cz systemd-coredump[15882]: Process 15876 (gcr-prompter) of user 1000 dumped core.
                                                          
                                                          Stack trace of thread 15876:
                                                          #0  0x00007fffaae79990 g_value_object_peek_pointer (libgobject-2.0.so.0)
                                                          #1  0x00007fffaae79990 n/a (libgobject-2.0.so.0)
                                                          #2  0x00007fffaae79990 n/a (libgobject-2.0.so.0)
zář 21 15:57:32 talos.danny.cz audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@9-15881-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
zář 21 15:57:34 talos.danny.cz abrt-server[15891]: Deleting problem directory ccpp-2018-09-21-15:57:33.375018-15876 (dup of ccpp-2018-09-20-22:19:26.574975-27022)
zář 21 15:57:34 talos.danny.cz abrt-notification[15944]: Process 27022 (gcr-prompter) crashed in g_value_object_peek_pointer()

Comment 2 Dan Horák 2018-10-12 13:34:57 UTC
I've spent some time on it and got a bit closer to where it actually crashes.

- start with GCR_PERSIST=yes in env and set some breakpoints

(gdb) where
#0  0x00007ffff68ce044 in g_object_notify (object=0x10058a20 [GcrSystemPrompter], property_name=0x7ffff7e5c3e8 "prompting") at gobject.c:1212
#1  0x00007ffff7e06328 in gcr_system_prompter_dispose (obj=0x10058a20 [GcrSystemPrompter]) at gcr/gcr-system-prompter.c:323
#2  0x00007ffff68cbb44 in g_object_unref (_object=0x10058a20) at gobject.c:3303
#3  0x0000000010001cf8 in main (argc=1, argv=0x7fffffffee18) at ui/gcr-prompter-tool.c:256
(gdb) n
1217      if (!pspec)
(gdb)
1223        g_object_notify_by_spec_internal (object, pspec);
(gdb)

Thread 1 "lt-gcr-prompter" received signal SIGSEGV, Segmentation fault.
g_value_object_peek_pointer (value=0x0) at gobject.c:3808
3808      return value->data[0].v_pointer;

I was also able to reproduce the crash on F-29. When gcr-prompter is run from the command line, it attaches to dbus, waits for 10 seconds, and then exits. It crashes at that point.

More interestingly, there is no crash when it runs against glib2 rebuilt with -O0, so I suspect wrong code is generated somewhere in glib2 ...

Comment 3 Dan Horák 2018-10-12 13:39:55 UTC
And there are no more crashes when adding a keyring in seahorse either. Reassigning to gcc.

Comment 4 Jakub Jelinek 2018-10-12 13:45:59 UTC
Can you please bisect glib2 to find at least which object file in glib2 is problematic (mix -O2 and -O0 glib2 objects at each step)?  Is that gobject.c that shows up above in the backtrace?
Does -fno-strict-aliasing -O2 help?
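
The object-file bisection suggested above can be sketched as a binary search. This is only an illustration under stated assumptions: bisect_objects, crashes_fn and culprit_is are hypothetical names, a single object file is assumed to be responsible, and the "does it crash" check (in reality a rebuild and rerun) is replaced by a stand-in predicate.

```c
#include <stddef.h>

/* Hypothetical predicate: nonzero if the crash still occurs when the first
 * n_optimized objects are built with -O2 and all remaining ones with -O0. */
typedef int (*crashes_fn)(size_t n_optimized, void *ctx);

/* Returns the index of the object whose -O2 build triggers the crash.
 * Assumptions: n_objects >= 1, an all -O0 build works, an all -O2 build
 * crashes, and exactly one object is the culprit. */
size_t bisect_objects(size_t n_objects, crashes_fn crashes, void *ctx)
{
    size_t lo = 0;          /* optimizing the first lo objects is known good */
    size_t hi = n_objects;  /* optimizing the first hi objects is known bad  */

    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (crashes(mid, ctx))
            hi = mid;
        else
            lo = mid;
    }
    return hi - 1;  /* objects [0, hi) crash, [0, hi - 1) do not */
}

/* Stand-in for a real rebuild-and-run step: pretends that the object at
 * index *ctx is the one that breaks when optimized. */
int culprit_is(size_t n_optimized, void *ctx)
{
    return n_optimized > *(size_t *)ctx;
}
```

Each step halves the candidate range, so about log2(n) rebuilds are needed for n object files.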

Comment 5 Dan Horák 2018-10-12 13:57:36 UTC
yes, that's my plan

Comment 6 Dan Horák 2018-10-12 14:15:18 UTC
for the record - upstream adds -fno-strict-aliasing when using gcc - see https://gitlab.gnome.org/GNOME/glib/blob/master/configure.ac#L2652 and https://bugzilla.gnome.org/show_bug.cgi?id=791622

Comment 7 Dan Horák 2018-10-15 12:34:11 UTC
I have bisected the glib code to the g_cclosure_marshal_VOID__VOID() function [1]. When this one is compiled with __attribute__((optimize(("O0")))), then I get no crash from the gcr-prompter tool. It must be -O0, it still segfaults with -O1.

This is with gcc-8.1.1-5.fc28.ppc64le, but I'll give it a try with 8.2 gcc too.

[1] https://gitlab.gnome.org/GNOME/glib/blob/master/gobject/gmarshal.c#L848
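
For reference, a per-function override like the one described above could look roughly as follows. This is a minimal sketch, not the GLib code: marshal_void_void_sketch, callback_fn, count_call and the NO_OPTIMIZE macro are hypothetical names. The optimize attribute is a GCC extension that the GCC documentation recommends for debugging only.

```c
#include <stddef.h>

/* Hypothetical callback type: instance pointer plus user data. */
typedef void (*callback_fn)(void *instance, void *user_data);

/* GCC-specific: build only the annotated function at -O0, whatever -O level
 * the rest of the file uses. Other compilers get an empty macro. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_OPTIMIZE __attribute__((optimize("O0")))
#else
#define NO_OPTIMIZE
#endif

/* Stand-in for a VOID__VOID style marshaller: it just forwards the call. */
NO_OPTIMIZE
void marshal_void_void_sketch(callback_fn cb, void *instance, void *user_data)
{
    if (cb != NULL)
        cb(instance, user_data);
}

/* Example callback: counts invocations through user_data. */
void count_call(void *instance, void *user_data)
{
    (void)instance;
    ++*(int *)user_data;
}
```

If the crash disappears only with such an override, that narrows down where the symptom shows up, but it does not by itself prove a compiler bug.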

Comment 8 Dan Horák 2018-10-15 12:35:25 UTC
Created attachment 1494031
preprocessed source file

it's compiled with

gcc -DHAVE_CONFIG_H -I. -I.. -DG_LOG_DOMAIN=\"GLib-GObject\" -I.. -I../glib -I../glib -I.. -DG_ENABLE_DEBUG -DGOBJECT_COMPILATION -pthread -Wall -Wstrict-prototypes -Wduplicated-branches -Wmisleading-indentation -Wno-bad-function-cast -Werror=declaration-after-statement -Werror=missing-prototypes -Werror=implicit-function-declaration -Werror=pointer-arith -Werror=init-self -Werror=format=2 -Werror=missing-include-dirs -fvisibility=hidden -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mcpu=power8 -mtune=power8 -funwind-tables -fstack-clash-protection -fno-strict-aliasing -MT libgobject_2_0_la-gmarshal.lo -MD -MP -MF .deps/libgobject_2_0_la-gmarshal.Tpo -c gmarshal.c  -fPIC -DPIC -o .libs/libgobject_2_0_la-gmarshal.o

Comment 9 Florian Weimer 2018-10-15 13:11:54 UTC
(In reply to Dan Horák from comment #7)
> I have bisected the glib code to the g_cclosure_marshal_VOID__VOID()
> function [1]. When this one is compiled with
> __attribute__((optimize(("O0")))), then I get no crash from the gcr-prompter
> tool. It must be -O0, it still segfaults with -O1.
> 
> This is with gcc-8.1.1-5.fc28.ppc64le, but I'll give it a try with 8.2 gcc
> too.
> 
> [1] https://gitlab.gnome.org/GNOME/glib/blob/master/gobject/gmarshal.c#L848

Do you know which functions are called prior to the crash?  I wonder if there is simply a type mismatch between the callback and the signal definition.  If the callback function is defined with additional function arguments, this could lead to problems even if the additional function arguments are not used.
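
The kind of mismatch meant here can be illustrated with a small sketch. All names below (generic_fn, void_void_fn, marshal_sketch and the on_changed_* handlers) are hypothetical and do not come from GLib or gcr. The marshaller assumes one prototype for every handler; a handler defined with a different parameter list must not be called through that type (undefined behavior in C), so the sketch routes it through an adapter instead.

```c
#include <stddef.h>

/* Type-erased callback slot, similar in spirit to a stored signal handler. */
typedef void (*generic_fn)(void);

/* The prototype a VOID__VOID style marshaller assumes for every handler. */
typedef void (*void_void_fn)(void *instance, void *user_data);

/* Hypothetical marshaller: casts the stored pointer back and calls it. */
void marshal_sketch(generic_fn cb, void *instance, void *user_data)
{
    void_void_fn fn = (void_void_fn)cb;
    fn(instance, user_data);
}

/* Matches the assumed prototype, so it is safe to register directly. */
void on_changed_ok(void *instance, void *user_data)
{
    (void)instance;
    *(int *)user_data += 1;
}

/* Does NOT match: an extra parameter precedes user_data. Registering this
 * directly would make marshal_sketch call it with the wrong argument list;
 * what happens then depends on the ABI and the optimization level. */
void on_changed_mismatch(void *instance, void *pspec, void *user_data)
{
    (void)instance;
    (void)pspec;
    *(int *)user_data += 2;
}

/* Adapter with the expected prototype that forwards to the handler above. */
void on_changed_adapter(void *instance, void *user_data)
{
    on_changed_mismatch(instance, NULL, user_data);
}
```

Such a mismatch can appear to work on one architecture or at -O0 and still fail elsewhere, which would fit the symptoms reported so far without requiring a compiler bug.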

Comment 10 Jakub Jelinek 2018-10-15 17:08:37 UTC
That seems like a rather simple function that, unless LTO is used, isn't really inlined either, and I don't really see what gcc could miscompile in it.  In the -O0 version that works, can you check in the debugger which callbacks are called through it and, as Florian noted, what their prototypes are?

Comment 11 Ben Cotton 2019-05-02 20:38:55 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Dan Horák 2019-05-08 17:07:21 UTC
Still there in F-30, I should get back to it, it's annoying :-)

Comment 13 Dan Horák 2019-05-31 17:58:29 UTC
Spent some time today debugging and I'm a bit closer, so writing some notes here. Right now I'm looking at the crash that happens when gcr-prompter exits after the 10 sec inactivity timeout, since this is the easiest one to reproduce (gdb /usr/libexec/gcr-prompter, run, wait 10 sec, segfault).

After forcing a couple of private gobject functions to be "noinline", I get a fairly good backtrace:

(gdb) where
#0  0x00007ffff68e50cc in closure_invoke_notifiers (closure=closure@entry=0x0, notify_type=notify_type@entry=3) at ../gobject/gclosure.c:295
#1  0x00007ffff68e7688 in g_closure_invoke (closure=0x0, return_value=0x7fffffffe350, n_param_values=<optimized out>, param_values=0x7fffffffe350, invocation_hint=0x7fffffffe1b0)
    at ../gobject/gclosure.c:817
#2  0x00007ffff69054c8 in signal_emit_unlocked_R (node=node@entry=0x100536b0, detail=detail@entry=650, instance=instance@entry=0x1005c220, emission_return=emission_return@entry=0x0, 
    instance_and_params=instance_and_params@entry=0x7fffffffe350) at ../gobject/gsignal.c:3635
#3  0x00007ffff690fe44 in g_signal_emit_valist (instance=0x1005c220, signal_id=<optimized out>, detail=<optimized out>, var_args=0x7fffffffe530 " 5\017\020") at ../gobject/gsignal.c:3391
#4  0x00007ffff6910290 in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>) at ../gobject/gsignal.c:3447
#5  0x00007ffff68ee1e8 in g_object_dispatch_properties_changed (object=0x1005c220, n_pspecs=<optimized out>, pspecs=<optimized out>) at ../gobject/gobject.c:1088
#6  0x00007ffff68ee910 in g_object_notify_by_spec_internal (object=object@entry=0x1005c220, pspec=<optimized out>) at ../gobject/gobject.c:1182
#7  0x00007ffff68f1a90 in g_object_notify (object=0x1005c220, property_name=0x7ffff7e78e48 "prompting") at ../gobject/gobject.c:1230
#8  0x00007ffff7e31bd4 in gcr_system_prompter_dispose (obj=0x1005c220) at gcr/gcr-system-prompter.c:322
#9  0x00007ffff68ef5e8 in g_object_unref (_object=<optimized out>) at ../gobject/gobject.c:3309
#10 g_object_unref (_object=0x1005c220) at ../gobject/gobject.c:3239
#11 0x00000000100013f4 in main (argc=<optimized out>, argv=<optimized out>) at ui/gcr-prompter-tool.c:256


from https://gitlab.gnome.org/GNOME/glib/blob/master/gobject/gclosure.c#L776

closure_invoke_notifiers (closure, PRE_NOTIFY) is OK, then marshal(...) is called and corrupts something, after which closure_invoke_notifiers (closure, POST_NOTIFY) is called with NULL as the "closure" parameter. The next step is to figure out what marshal() calls.
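
The call order described above can be modelled with a simplified sketch. It is not the GLib implementation: closure_sketch, invoke_notifiers_sketch, closure_invoke_sketch and marshal_noop are hypothetical names, and the real g_closure_invoke() takes more parameters and does more bookkeeping.

```c
#include <stddef.h>
#include <string.h>

typedef struct closure_sketch closure_sketch;

struct closure_sketch {
    void (*marshal)(closure_sketch *c);  /* stands in for the marshaller  */
    char trace[32];                      /* records the order of the calls */
};

enum notify_kind { PRE_NOTIFY, POST_NOTIFY };

/* In the backtrace above, the equivalent of this function received
 * closure == NULL on the POST_NOTIFY call and dereferenced it. */
static void invoke_notifiers_sketch(closure_sketch *c, enum notify_kind kind)
{
    strcat(c->trace, kind == PRE_NOTIFY ? "pre," : "post");
}

/* Same three steps as described: pre-notify, marshal, post-notify. The
 * closure pointer has to survive the marshal call unchanged. */
void closure_invoke_sketch(closure_sketch *c)
{
    invoke_notifiers_sketch(c, PRE_NOTIFY);
    c->marshal(c);
    invoke_notifiers_sketch(c, POST_NOTIFY);
}

/* Well-behaved marshaller: only records that it ran. */
void marshal_noop(closure_sketch *c)
{
    strcat(c->trace, "marshal,");
}
```

In this model the pointer passed to the post-notify step can only differ from the one passed to the pre-notify step if something inside the marshal call overwrote the caller's saved copy, which is why the callbacks reached through marshal() are the next thing to look at.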

Comment 14 Dan Horák 2019-05-31 22:04:24 UTC
I think I got it, details later.

Comment 15 Dan Horák 2019-06-01 08:13:39 UTC
fixed in https://gitlab.gnome.org/GNOME/gcr/merge_requests/16

Comment 16 Fedora Update System 2019-06-02 16:14:35 UTC
FEDORA-2019-bace17ae69 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-bace17ae69

Comment 17 Fedora Update System 2019-06-03 01:20:55 UTC
gcr-3.28.1-4.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-bace17ae69

Comment 18 Fedora Update System 2019-06-17 18:27:38 UTC
gcr-3.28.1-4.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

