Description of problem:
abrt is too strict when it judges if kernel is tainted.

Version-Release number of selected component (if applicable):
abrt-2.0.8-5.el6.x86_64
libreport-2.0.9-3.el6.x86_64

How reproducible:
always on my hardware (corporate T510)

Steps to Reproduce:
1. install RHEL6 x86_64 on said hardware
2. create a kvm guest there
3. suspend & resume the system while the guest is running

Actual results:
abrt refuses to analyze the crash because of tainted kernel

Expected results:
abrt continues

Additional info:
module license list:
# for mod in $(lsmod | grep -v Module | cut -d ' ' -f 1) \
  ; do modinfo $mod ; done | grep license | less | sort | uniq
license: Dual BSD/GPL
license: GPL
license: GPL and additional rights
license: GPL v2

dmesg report about the crash:
------------[ cut here ]------------
WARNING: at arch/x86/kvm/x86.c:1838 kvm_arch_vcpu_load+0x103/0x150 [kvm]() (Not tainted)
Hardware name: 4384AT6
Modules linked in: hidp fuse ebtable_nat ebtables rfcomm sco bnep l2cap autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 ipt_REJECT xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables fpu aesni_intel cryptd aes_x86_64 aes_generic xts gf128mul dm_crypt vhost_net macvtap macvlan tun kvm_intel kvm uinput btusb bluetooth thinkpad_acpi arc4 iwlwifi mac80211 cfg80211 rfkill sg uvcvideo videodev v4l2_compat_ioctl32 microcode intel_ips i2c_i801 iTCO_wdt iTCO_vendor_support shpchp snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000e ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t sdhci_pci sdhci mmc_core ahci wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 3271, comm: qemu-kvm Not tainted 2.6.32-262.el6.x86_64 #1
Call Trace:
(trace cut in order not to confuse bz search)
---[ end trace 2e360406a0e72392 ]---
done.
video LNXVIDEO:00: Restoring backlight state
------------[ cut here ]------------

--> note the *Not Tainted* there

not-reportable field of abrt says:
A kernel problem occurred, but your kernel has been tainted (flags:G W ). Kernel maintainers are unable to diagnose tainted reports.
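For readers unfamiliar with the flag string in the abrt message, here is a minimal sketch of how the letters in "flags:G W " can be read. The helper names and the table are illustrative only and cover just a few of the kernel's taint letters; this is not abrt's code. The point is that "W" records that the kernel already issued a warning earlier, while "G" only says that no proprietary module is loaded.

```python
# Hypothetical lookup table covering only a subset of kernel taint letters.
# The full, authoritative list is in the kernel documentation.
TAINT_MEANINGS = {
    "P": "a proprietary module was loaded",
    "G": "all loaded modules are GPL-compatible (not a taint by itself)",
    "F": "a module was force-loaded",
    "D": "the kernel has oopsed before",
    "W": "the kernel has issued a warning before",
}


def describe_taint(flags):
    """Return one description per known flag letter; blanks are skipped."""
    return ["%s: %s" % (c, TAINT_MEANINGS[c]) for c in flags if c in TAINT_MEANINGS]


def has_taint(flags):
    """Assumption for this sketch: anything other than blanks and 'G' is a taint."""
    return any(c not in " G" for c in flags)


if __name__ == "__main__":
    # The flag string quoted in the abrt message above.
    for line in describe_taint("G W "):
        print(line)
```

Read this way, the second occurrence of the same warning would carry "W" simply because the first one had already been printed.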
Created attachment 578911 [details]
dmesg

Looking into dmesg once more and more carefully, it seems that:
1. kernel throws the initial warning
2. kernel throws the same warning again and declares itself tainted
3. abrt merges the two warnings
4. abrt chooses to ignore the warnings because one of them is tainted

Given this, I'd opt for a "not tainted takes priority among duplicate reports" approach and offer the user the option to submit/debug that.
This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.
I think this one is fixed by this commit in rhel6 git branch:

commit 8b0956286967e92bd68c04eb4e051225f00fce44
Author: Jiri Moskovcak <jmoskovc>
Date:   Thu Jul 12 11:24:39 2012 +0200

    oops: don't create oops dir in reverse

    rhbz#814594

    when kernel starts printing oops, only very first one is not tainted.
    If we start in reverse, it will firstly create direcotry with tainted
    oops and others will be deleted as dup of first one (if there are same),
    thus we loose NOT tainted oops.

    Signed-off-by: Nikola Pajkovsky <npajkovs>
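The effect described in the commit message can be illustrated with a toy model. This is only a sketch under the assumption that the first oops processed for a given signature is kept and later identical ones are dropped as duplicates; the function name and the dictionary fields are hypothetical and do not come from the ABRT sources.

```python
def save_oopses(oopses, reverse):
    """Toy model: keep the first processed oops per signature, drop later dups."""
    order = list(reversed(oopses)) if reverse else list(oopses)
    saved = {}
    for oops in order:
        # setdefault stores the oops only if this signature was not seen yet.
        saved.setdefault(oops["signature"], oops)
    return list(saved.values())


# Two identical warnings in the order the kernel printed them:
# the first is not tainted, the second one is.
OOPSES = [
    {"signature": "kvm_arch_vcpu_load+0x103/0x150", "tainted": False},
    {"signature": "kvm_arch_vcpu_load+0x103/0x150", "tainted": True},
]

if __name__ == "__main__":
    print("reverse order keeps tainted:", save_oopses(OOPSES, reverse=True)[0]["tainted"])
    print("forward order keeps tainted:", save_oopses(OOPSES, reverse=False)[0]["tainted"])
```

In this model, walking the list in reverse keeps the tainted copy and discards the untainted original, which matches the behaviour the commit sets out to avoid.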
While I was searching for a way to reproduce the problem, the fix itself was called into question. Here is the IRC log of the conversation with Denys:

lzachar: I've managed to reproduce it once - but I can't do it again. When I run abrt-dump-oops -D dmesg both old and new make three oops-dirs
lzachar: I remember that on that one "successful" reproduction it was just one oops directory
lzachar: old version is abrt-2.0.8-5.el6.x86_64 (same as in that bug report)
dvlasenk: abrt-dump-oops does not refuse to create problem dirs if it sees a tainted oops
dvlasenk: It just adds kernel_tainted_short and kernel_tainted_long elements to the problem dir
lzachar: ah. any tips how I can reproduce the problem then?
dvlasenk: You can check whether these elements (files) are properly created in the problem dirs.
lzachar: well, there are - 2 tainted and 1 not tainted
lzachar: but the same holds true for the old version as well
dvlasenk: And the problem is ... ?
lzachar: according to the bug report, there was a case when the tainted one made the not tainted one get lost
lzachar: so the user was not allowed to report
lzachar: I would like to be able to reproduce that, so I can see that it was fixed
lzachar: there is a difference in the order of writing the oopses
dvlasenk: Understood
dvlasenk: abrt-dump-oops doesn't do dup elimination
lzachar: (which was your patch I believe)
lzachar: any service I could run?
lzachar: so it will do that
dvlasenk: It is done on the post-create step IIRC (that is, by abrtd)
dvlasenk: This part of the config triggers it in koops_event.conf:
dvlasenk: EVENT=post-create analyzer=Kerneloops
dvlasenk:         # >> instead of > is due to bugzilla.redhat.com/show_bug.cgi?id=854266
dvlasenk:         abrt-action-analyze-oops &&
dvlasenk:         dmesg >>dmesg &&
dvlasenk:         abrt-action-generate-core-backtrace
dvlasenk:         abrt-action-save-kernel-data
dvlasenk: hmm
dvlasenk: I'm not sure we in fact check for that case, and it still can happen
dvlasenk: (It wasn't happening because in the past post-creates were never running concurrently)
dvlasenk: Perhaps you need to reopen that bug :(
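The check suggested in the log (looking for the kernel_tainted_short element in each problem dir) could be scripted roughly as follows. This is a sketch only: the function name is made up, the directory to scan is whatever was passed to abrt-dump-oops -D, and treating a missing element as "not tainted" is an assumption drawn from the conversation above.

```python
import os


def taint_status(dump_location):
    """Map each problem dir name to its kernel_tainted_short content, or None."""
    status = {}
    for name in sorted(os.listdir(dump_location)):
        problem_dir = os.path.join(dump_location, name)
        if not os.path.isdir(problem_dir):
            continue
        element = os.path.join(problem_dir, "kernel_tainted_short")
        if os.path.isfile(element):
            with open(element) as f:
                status[name] = f.read().strip()
        else:
            # Assumption: no element means the oops was not tainted.
            status[name] = None
    return status


if __name__ == "__main__":
    import sys
    # Pass the directory that holds the problem dirs as the only argument.
    for name, flags in taint_status(sys.argv[1]).items():
        print(name, "tainted: %s" % flags if flags else "not tainted")
```

Running it before and after the duplicate elimination step would show whether the surviving directory is the untainted one.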
My mistake, it can be (easily) reproduced. Moving back to ON_QA.
Cause
When multiple kernel oopses happen in a short period of time, only the first one is relevant, because the later oopses might be just consequences of the first problem. ABRT was sorting the processed oopses incorrectly, so it saved the last oops instead of the first one.

Consequence
The most important oops, the first one, never got into the problem report.

Fix
The processing order was fixed.

Result
When several oopses are processed at the same time, the first one is correctly saved.

NOTE: this is the same CCFR as for 808721, because these bzs actually have the same fix.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0290.html