Red Hat Bugzilla – Bug 1260577
problem with signals with enum typed arguments
Last modified: 2015-11-19 03:19:31 EST
Created attachment 1070906 [details]
test program to reproduce issue
On s390x and ppc64 architectures, we have an issue with NetworkManager that I think is a bug in glib ( https://bugzilla.redhat.com/show_bug.cgi?id=1062301#c38 )
It seems that when you have a glib signal
- with several arguments
- with a GEnum typed argument
- with several subscribers,
then the passed enum value becomes zero.
The attached test program fails on s390x machine:
gcc -Og -g `pkg-config --cflags --libs glib-2.0 gobject-2.0` ./test-rh1062301.c
Using host libthread_db library "/lib64/libthread_db.so.1".
** Message: (_signal_cb1) signal accepted
ERROR:./test-rh1062301.c:109:_signal_cb1: assertion failed (test_flags == *expected): (0 == 2)
Program received signal SIGABRT, Aborted.
0x000003fffdcbf8e0 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install libffi-3.0.13-16.el7.s390x
#0 0x000003fffdcbf8e0 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x000003fffdcc1170 in __GI_abort () at abort.c:90
#2 0x000003fffdea115e in g_assertion_message (domain=0x0, file=0x80001378 "./test-rh1062301.c", line=<optimized out>, func=<optimized out>, message=<optimized out>) at gtestutils.c:2292
#3 0x000003fffdea154a in g_assertion_message_cmpnum (domain=0x0, file=0x80001378 "./test-rh1062301.c", line=<optimized out>, func=0x80001428 <__FUNCTION__.11051> "_signal_cb1", expr=<optimized out>, arg1=<optimized out>, cmp=0x800013a4 "==", arg2=<optimized out>, numtype=105 'i') at gtestutils.c:2340
#4 0x0000000080000f4a in _signal_cb1 (obj=<optimized out>, dummy1=<optimized out>, test_flags=<optimized out>, dummy2=<optimized out>, expected=<optimized out>) at ./test-rh1062301.c:109
#5 0x000003fffdc618b4 in ffi_call_SYSV () at /lib64/libffi.so.6
#6 0x000003fffdc6217e in ffi_call () at /lib64/libffi.so.6
#11 0x000003fffdfa3a62 in <emit signal ??? on instance 0x8000bc00 [MamanBar]> (instance=instance@entry=0x8000bc00, signal_id=<optimized out>, detail=detail@entry=0) at gsignal.c:3365
#7 0x000003fffdf882e4 in g_cclosure_marshal_generic (closure=0x8000bb40, return_gvalue=0x0, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=<optimized out>, marshal_data=0x0) at gclosure.c:1448
#8 0x000003fffdf87984 in g_closure_invoke (closure=0x8000bb40, return_value=0x0, n_param_values=<optimized out>, param_values=0x3ffffffebb8, invocation_hint=0x3ffffffead8) at gclosure.c:768
#9 0x000003fffdf9b62e in signal_emit_unlocked_R (node=node@entry=0x8000f310, detail=detail@entry=0, instance=instance@entry=0x8000bc00, emission_return=emission_return@entry=0x0, instance_and_params=instance_and_params@entry=0x3ffffffebb8) at gsignal.c:3553
#10 0x000003fffdfa37bc in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=<optimized out>) at gsignal.c:3309
#12 0x0000000080000fa2 in _signal_emit (obj=obj@entry=0x8000bc00 [MamanBar], flags=flags@entry=TEST_FLAG_2) at ./test-rh1062301.c:123
#13 0x000000008000124a in main () at ./test-rh1062301.c:147
(In reply to Thomas Haller from comment #0)
> - with a GEnum typed argument
GFlags actually... didn't test with GEnum.
This indeed looks like a glib or ffi bug. Could we do with something less invasive to work around it? Something along the lines of
diff --git a/src/nm-config.c b/src/nm-config.c
index d805948..c782616 100644
@@ -1890,7 +1890,13 @@ nm_config_class_init (NMConfigClass *config_class)
G_STRUCT_OFFSET (NMConfigClass, config_changed),
NULL, NULL, NULL,
- G_TYPE_NONE, 3, NM_TYPE_CONFIG_DATA, NM_TYPE_CONFIG_CHANGE_FLAGS, NM_TYPE_CONFIG_DATA);
+ G_TYPE_NONE, 3, NM_TYPE_CONFIG_DATA,
+ /* https://bugzilla.redhat.com/show_bug.cgi?id=1260577 */
+ /* NM_TYPE_CONFIG_CHANGE_FLAGS, */
+ G_STATIC_ASSERT (sizeof (guint) == sizeof (NMConfigChangeFlags));
or maybe even ifdef'd with defined(__s390x__) || defined (__PPC64__) or whichever platforms are known broken; and maybe a test to make sure we didn't leave out any platform?
(In reply to Colin Walters from comment #4)
> See https://bugzilla.gnome.org/show_bug.cgi?id=686662
I don't think this is related.
Note that for both NetworkManager and the attached reproducer, the enum is of 4 bytes (same as gint/guint).
Also note, that the test doesn't fail, if only one handler is connected to the signal.
You could determine if it's FFI by using a generated marshaller instead of g_cclosure_marshal_generic.
Hardware watchpoints might help figure out when the variable is being changed? Or alternatively, just add some debugging printfs into libgobject?
(In reply to Colin Walters from comment #6)
> You could determine if it's FFI by using a generated marshaller instead of
> Hardware watchpoints might help figure out when the variable is being
> changed? Or alternatively, just add some debugging printfs into libgobject?
The test-program also fails with glib from upstream master.
Also, the failure is for GEnum too (not only GFlags).
While gclosure.c has some special handling for GEnum types, GFlags are treated just like a guint. As the error doesn't happen with G_TYPE_UINT/G_TYPE_INT as signal type, that hints that the error is not in libffi.
But since gclosure.c doesn't handly GFlags differently from guint, it seems to hint that the error isn't in gclosure either.
Anyway, I don't know...
(In reply to Lubomir Rintel from comment #3)
> This indeed looks like a glib or ffi bug. Could we do with something less
> invasive to work around it? Something along the lines of
added this workaround to NM: http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=e7d66f1df61ebdc7652ba34a4e1baddcc86b9e26
Fixed upstream now: https://bugzilla.gnome.org/show_bug.cgi?id=754882
I can do the backport as soon as the flags come in.
This is a low risk fix. Can someone please set the blocker flag?
I've verified this bug according the reproducer in comment 0 on all archs: x86_64, s390x, ppc64 and ppc64le. On all these archs the reproducer passed.
The reproducer has been added into our standard glib2 automation testset.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.