abrt version: 2.0.3
architecture: x86_64
cmdline: ro root=/dev/mapper/vg_vorpalblade-lv_root rd_LUKS_UUID=luks-29b27f25-aec9-429d-82d1-ed512bb04bdb rd_LVM_LV=vg_vorpalblade/lv_root rd_LVM_LV=vg_vorpalblade/lv_swap rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us quiet initcall_debug printk.time=y init=/sbin/bootchartd
component: kernel
kernel: 2.6.40.4-5.fc15.x86_64
kernel_tainted: 512
kernel_tainted_long: Taint on warning.
os_release: Fedora release 15 (Lovelock)
package: kernel
reason: WARNING: at kernel/signal.c:2013 do_signal_stop+0x246/0x2c3()
time: Tue Sep 20 12:11:30 2011

backtrace:
:WARNING: at kernel/signal.c:2013 do_signal_stop+0x246/0x2c3()
:Hardware name: 4291CL9
:Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack tcp_lp fuse ebtable_nat ebtables xt_CHECKSUM bridge 8021q garp stp llc sunrpc cpufreq_ondemand acpi_cpufreq mperf rfcomm bnep ip6t_REJECT btusb bluetooth snd_hda_codec_hdmi snd_hda_codec_conexant arc4 snd_hda_intel snd_hda_codec snd_hwdep iwlagn snd_seq snd_seq_device mac80211 snd_pcm thinkpad_acpi uvcvideo cfg80211 iTCO_wdt videodev snd_timer e1000e xhci_hcd i2c_i801 media snd_page_alloc snd iTCO_vendor_support rfkill v4l2_compat_ioctl32 soundcore joydev microcode virtio_net kvm_intel kvm ipv6 xts gf128mul dm_crypt sdhci_pci sdhci mmc_core wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: nf_conntrack]
:Pid: 22112, comm: plugin-containe Tainted: G W 2.6.40.4-5.fc15.x86_64 #1
:Call Trace:
: [<ffffffff81054c8e>] warn_slowpath_common+0x83/0x9b
: [<ffffffff81054cc0>] warn_slowpath_null+0x1a/0x1c
: [<ffffffff81064f4a>] do_signal_stop+0x246/0x2c3
: [<ffffffff8148690b>] ? schedule+0x690/0x6be
: [<ffffffff81065c18>] get_signal_to_deliver+0x153/0x3f6
: [<ffffffff81008f56>] do_signal+0x69/0x65e
: [<ffffffff814880b4>] ? _raw_spin_unlock_irqrestore+0x17/0x19
: [<ffffffff811584de>] ? ep_scan_ready_list+0x145/0x165
: [<ffffffff81158f3e>] ? sys_epoll_wait+0x2c7/0x341
: [<ffffffff8100958c>] do_notify_resume+0x28/0x83
: [<ffffffff8148eb10>] int_signal+0x12/0x17

comment:
:I was able to trigger this while playing around sending SIGSTOP/SIGCONT to various process groups, while using the tool tamefox http://github.com/lmacken/tamefox
:I have not been able to easily reproduce it, but it felt like a race condition involving the stopped app grabbing the x server display and locking the interface up.
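
For anyone reading the report above: kernel_tainted: 512 is a bitmask, and 512 is 1 << 9, which corresponds to the "W" (warning) flag shown in the "Tainted: G W" line. The sketch below decodes such a mask in userspace. The letter table reflects my understanding of the taint bits in kernels of this era (bits 0 through 11), and decode_taint() is a hypothetical helper written for illustration, not kernel code.

```c
#include <stddef.h>
#include <stdio.h>

/* Taint flag letters indexed by bit number, as assumed for 2.6.40/3.0-era
 * kernels. Bit 9 (value 512) is the "warning" taint, printed as 'W'. */
static const char taint_letters[] = "PFSRMBUDAWCI";

/* Write one letter per set bit of `mask` into `out` (always NUL-terminated
 * when outlen > 0). Returns the number of letters written.
 * Hypothetical helper, for illustration only. */
static size_t decode_taint(unsigned long mask, char *out, size_t outlen)
{
    size_t n = 0;

    for (size_t bit = 0; bit < sizeof(taint_letters) - 1; bit++) {
        if ((mask & (1UL << bit)) && n + 1 < outlen)
            out[n++] = taint_letters[bit];
    }
    if (outlen > 0)
        out[n] = '\0';
    return n;
}
```

Calling decode_taint(512, buf, sizeof buf) with a small char buffer should leave the single letter for the warning taint in buf, matching kernel_tainted_long above.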
	if (current->group_stop & GROUP_STOP_PENDING) {
==>		WARN_ON_ONCE(!(current->group_stop & GROUP_STOP_SIGMASK));
		goto retry;
	}
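
To spell out what the marked line checks: while a group stop is pending, the low bits of group_stop are expected to still hold the number of the stop signal, so the warning fires when PENDING is set but the signal mask bits are all zero. Below is a small userspace model of that condition. The constant values are my recollection of include/linux/sched.h in 3.0 and should be treated as assumptions, and group_stop_state_is_bad() is a hypothetical name, not a kernel function.

```c
#include <stdbool.h>

/* Assumed values, modelled on the 3.0-era definitions (not verified here). */
#define GROUP_STOP_SIGMASK 0xffffu      /* signr of the last group stop */
#define GROUP_STOP_PENDING (1u << 16)   /* task should stop for group stop */

/* Returns true for the state that would trip the WARN_ON_ONCE() above:
 * a stop is pending, but the recorded stop signal number is zero.
 * Hypothetical helper, for illustration only. */
static bool group_stop_state_is_bad(unsigned int group_stop)
{
    if (!(group_stop & GROUP_STOP_PENDING))
        return false;               /* no stop pending, nothing to check */
    return !(group_stop & GROUP_STOP_SIGMASK);
}
```

With these assumed values, a state of GROUP_STOP_PENDING | 19 (19 being SIGSTOP on x86) is fine, while GROUP_STOP_PENDING alone is the inconsistent state that the warning reports.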
(In reply to comment #0)
>
> :WARNING: at kernel/signal.c:2013 do_signal_stop+0x246/0x2c3()

Hmm. Thanks.

So far I don't understand how this is possible. And this code was significantly changed in 3.1, I already forgot the details. I'll try to reread it again.

Perhaps utrace patches broke get_signal_to_deliver...

> :I was able to trigger this while playing around sending SIGSTOP/SIGCONT to
> various process groups, while using the tool tamefox
> http://github.com/lmacken/tamefox

unlikely I can reproduce ;)

> :I have not been able to easily reproduce it,

Maybe because of WARN_ON_ONCE, you need to reboot to see the warning again.

I'll try to think more... Can you try the debugging patch if I make it?
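
As a side note on why a reboot is needed to see the warning a second time: WARN_ON_ONCE keeps a per-call-site flag that is set on the first hit and never cleared while the kernel is running. The following is a rough userspace model of that behaviour, not the kernel macro itself. It relies on the GNU C statement-expression extension (gcc and clang accept it), and WARN_ON_ONCE_MODEL, warnings_printed and check_group_stop() are names made up for this sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how many times the model actually printed a warning. */
static int warnings_printed;

/* Userspace model of a once-only warning: each expansion site gets its own
 * static flag, so the message is printed only on the first true condition.
 * The macro still evaluates to the condition every time. */
#define WARN_ON_ONCE_MODEL(cond) ({                                   \
    static bool warned_;                                              \
    bool cond_ = !!(cond);                                            \
    if (cond_ && !warned_) {                                          \
        warned_ = true;                                               \
        warnings_printed++;                                           \
        fprintf(stderr, "WARNING: at %s:%d\n", __FILE__, __LINE__);   \
    }                                                                 \
    cond_;                                                            \
})

/* Hypothetical caller standing in for the check in do_signal_stop(). */
static bool check_group_stop(bool state_is_bad)
{
    return WARN_ON_ONCE_MODEL(state_is_bad);
}
```

Calling check_group_stop(true) repeatedly returns true each time but prints only once, which is consistent with the bug being hit again without any new message in the log.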
(In reply to comment #2)
> (In reply to comment #0)
> May be because of WARN_ON_ONCE, you need to reboot to see the warning
> again.
>
> I'll try to think more... Can you try the debugging patch if I make it?

Sure. I'll reboot and try and reproduce the issue at some point today as well.
I am able to reliably reproduce this issue by doing the following:

firefox &
pkill -STOP firefox
strace -f -p $(pidof firefox)
(In reply to comment #4)
> I am able to reliably reproduce this issue by doing the following:

By 'reliably', I mean 'occasionally'. It feels a bit racy: sometimes the strace seems to CONT the process, and sometimes it doesn't. When it doesn't work, I am able to reproduce this by opening up 'htop' and attempting to strace one of firefox's threads.
(In reply to comment #4)
> I am able to reliably reproduce this issue by doing the following:
>
> firefox &
> pkill -STOP firefox
> strace -f -p $(pidof firefox)

OK, thanks a lot...

But this is quite different. I didn't try to inspect the ptrace paths, because I assume your previous test-case doesn't use ptrace but still triggers the warning?

Anyway, thanks for the info, I'll continue tomorrow.
Now that I think about it, the original oops *might* have been triggered using ptrace. I frequently use htop and hit 's' to strace the process, I may have done that out of muscle memory to begin with.
Created attachment 524655 [details]
[PATCH stable-3.0] do_signal_stop: don't clear GROUP_STOP_SIGMASK if task_is_stopped()

Could you please test this patch? It should fix the problem. But I wasn't able to reproduce it until I understood what happens (damn! this took me 2 days of grepping ;) Maybe your testing has found something else...

This is an upstream bug, I'll send the patch to -stable.

Ironically, 3.1 has a similar problem, although the code and the reason are quite different.

And. Both are buggy wrt jctl stop && ptrace/clone, this needs another fix.
(In reply to comment #8)
> Created attachment 524655 [details]
> [PATCH stable-3.0] do_signal_stop: don't clear GROUP_STOP_SIGMASK if
> task_is_stopped()
>
> Could you please test this patch? It should fix the problem.
> But, I wasn't able to reproduce it until I understood what
> happens (damn! this took me 2 days of grepping ;) May be
> your testing has found something else...
>
> This is upstream bug, I'll send the patch to -stable.
>
> Ironically, 3.1 has the similar problem although the code
> and the reason are quite different.
>
> And. Both are buggy wrt jctl stop && ptrace/clone, this needs
> another fix.

I am unable to reproduce my problem with your patch applied. Thanks!