Bug 286641

Summary: intermittent problem makes machine freeze
Product: [Fedora] Fedora
Reporter: Bruno <dblongo>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: medium
Version: 7
CC: chris.brown, dblongo
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-01-13 23:31:27 UTC

Description Bruno 2007-09-11 18:41:06 UTC
Description of problem:

There is an intermittent problem: the machine freezes without writing anything
to /var/log/messages. This has been happening for a long time.

The last time the machine froze was at 04:00 AM.
I found the error below in /var/log/messages at 11:38 PM:

Sep 10 23:46:01 myfw kernel: BUG: unable to handle kernel paging request at 
virtual address 58dda05f
Sep 10 23:46:01 myfw kernel:  printing eip:
Sep 10 23:46:01 myfw kernel: c0479e4f
Sep 10 23:46:01 myfw kernel: *pde = 00000000
Sep 10 23:46:01 myfw kernel: last sysfs file: /class/net/lo/type
Sep 10 23:46:01 myfw kernel: Modules linked in: vfat fat usb_storage arc4 
ppp_mppe ppp_async crc_ccitt ppp_generic slhc xfrm4_tunnel af_key sit 
nf_nat_ftp nf_conntrack_ftp xt_tcpudp ipt_LOG iptable_nat nf_nat 
nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables 
x_tables xfrm4_mode_tunnel deflate zlib_deflate twofish twofish_common camellia 
serpent blowfish cbc ecb blkcipher xcbc crypto_null tunnel4 ipcomp esp4 ah4 aes 
des sha256 ipv6 video sbs button dock battery ac 3c59x mii e1000 i2c_i801 
parport_pc iTCO_wdt iTCO_vendor_support i2c_core parport i6300esb floppy 
rtc_cmos sr_mod cdrom sg dm_snapshot dm_zero dm_mirror dm_mod ata_generic 
ata_piix libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
Sep 10 23:46:01 myfw kernel: Fixing recursive fault but reboot is needed!
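
Because the freezes are intermittent, it may help to search older logs for the same kind of report. The following is only a minimal sketch in Python: the marker strings are copied from the excerpt above, the classic syslog line layout is assumed, and `find_oops_lines` is a hypothetical helper name, not part of any existing tool.

```python
import re

# Markers copied from the log excerpt above; other kernel oops reports
# may be worded differently, so treat this list as an assumption.
OOPS_MARKERS = (
    "BUG: unable to handle kernel paging request",
    "Fixing recursive fault but reboot is needed!",
)

# Assumed classic syslog layout: "Sep 10 23:46:01 myfw kernel: message"
SYSLOG_LINE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) kernel: (.*)$")


def find_oops_lines(lines):
    """Yield (timestamp, host, message) for kernel lines containing a marker."""
    for line in lines:
        match = SYSLOG_LINE.match(line.rstrip("\n"))
        if match and any(marker in match.group(3) for marker in OOPS_MARKERS):
            yield match.groups()
```

It could be run over the log with something like `for hit in find_oops_lines(open("/var/log/messages")): print(hit)`, which normally needs root permissions to read the file.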

Version-Release number of selected component (if applicable):
The problem happened on kernel-2.6.22.1-41.fc7.
Today I upgraded my kernel to kernel-2.6.22.4-65.fc7.

How reproducible:

This machine is running the packages below:
    iptables-1.3.7-2
    net-snmp-5.4-14.fc7
    samba-3.0.25c-0.fc7
    squid-2.6.STABLE13-1.fc7
    openswan-2.4.7-3.fc7
    ppp-2.4.4-2
    pptp-1.7.1-2.fc6

I don't know how to reproduce this problem, because I have the same packages and
the same configuration on another network and it works fine there.

Expected results:

The machine does not freeze.

Additional info:

I have changed the machine 3 times.
The machine I am using now is an IBM XSeries PIV Xeon 3.2 GHz with 2 GB of
RAM and a 160 GB HD.

Comment 1 Chuck Ebbert 2007-09-12 22:07:07 UTC
Try booting with some different kernel options:

  nolapic
  pci=nomsi,nommconf


Comment 2 Bruno 2007-09-13 00:39:09 UTC
This computer is in a production network.
I will add these 2 parameters to grub.conf and reboot the computer tomorrow.

I don't know these parameters yet. I tried to read about them on the internet,
but I still don't understand what they do.

Please, could you explain to me what nolapic and pci=nomsi,nommconf will do?

Thanks!

Comment 3 Christopher Brown 2007-10-03 13:58:39 UTC
Hello Bruno,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

nolapic disables the local APIC (Advanced Programmable Interrupt Controller)
on your system.

nommconf disables mmconfig, which is "Low-level direct PCI config space access".

nomsi disables the use of Message Signaled Interrupts, which may be poorly
implemented on your hardware. MSI is a feature of PCI 2.2 and later which
allows cards to issue interrupts as a write message, so you can do a lot more
than just issue an interrupt.

These boot flags will help the system work around poor hardware implementations -
some new hardware will not yet be blacklisted in the code, so you need to add
these flags to work around the issues manually.
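
After rebooting, it is worth confirming that the flags actually reached the kernel. This is a rough sketch in Python, assuming the running kernel's command line can be read from /proc/cmdline; `parse_cmdline`, `missing_options` and `running_cmdline` are hypothetical helper names used only for illustration.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {option: value}; bare flags map to None.

    Simplification: if an option appears twice, the last occurrence wins.
    """
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def missing_options(cmdline, wanted=("nolapic", "pci=nomsi,nommconf")):
    """Return the wanted options that are not present on the command line."""
    present = parse_cmdline(cmdline)
    missing = []
    for item in wanted:
        key, sep, value = item.partition("=")
        if key not in present:
            missing.append(item)
        elif sep:
            # Comma-separated sub-options may appear in any order.
            have = set((present[key] or "").split(","))
            if not set(value.split(",")) <= have:
                missing.append(item)
    return missing


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Calling `missing_options(running_cmdline())` on the rebooted machine should return an empty list if both flags were picked up from grub.conf.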

Anyway, could you tell me if you are still having problems with the latest
kernel? If the problem no longer exists then please close this bug, or I'll do so
in a few days if no additional information is lodged.

Cheers
Chris

Comment 4 Christopher Brown 2008-01-13 23:31:27 UTC
As indicated previously, there has been no update on the progress of this bug,
so I am closing it as INSUFFICIENT_DATA. Please re-open if the issue
still occurs for you and I will try to assist in its resolution. Thank you for
taking the time to report the initial bug.