Bug 133905
Summary: | kernel crash, fatal exception, accessing /proc, EXT3-fs error | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Tapio Vaattanen <tapio.vaattanen> |
Component: | kernel | Assignee: | Ernie Petrides <petrides> |
Status: | CLOSED ERRATA | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | anderson, aviro, dhoward, jhedstro, lwoodman, nixuser, peterm, petrides, riel, sct, tao |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-05-18 13:28:11 UTC | Type: | --- |
Description
Tapio Vaattanen
2004-09-28 13:34:57 UTC
Steps to Reproduce step one should be:

1. On one virtual console something like "while true ; do tar cvf /tmp/proc.tar /proc; done"

The /proc was forgotten from the while loop.

What hardware was this problem seen on? (lspci and lsmod would be helpful.)

HP Proliant ML350, VMware 3.11 running RHES 3.0 on a virtual machine, HP Deskpro. All the hardware I tested the loop on produced similar behaviour, no exceptions. This really isn't hardware related, since the loop example above crashes every system I've tested it on, including VMware virtual machines.

Output of lspci on ML350:

[root@linux root]# lspci
00:00.0 Host bridge: ServerWorks CMIC-LE Host Bridge (GC-LE chipset) (rev 33)
00:00.1 Host bridge: ServerWorks CMIC-LE Host Bridge (GC-LE chipset)
00:00.2 Host bridge: ServerWorks CMIC-LE Host Bridge (GC-LE chipset)
00:02.0 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
00:02.1 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
00:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5702X Gigabit Ethernet (rev 02)
00:05.0 System peripheral: Compaq Computer Corporation Advanced System Management Controller
00:0f.0 ISA bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 05)
00:0f.3 Host bridge: ServerWorks CSB5 LPC bridge
00:11.0 Host bridge: ServerWorks CIOB-X2 PCI-X I/O Bridge (rev 05)
00:11.2 Host bridge: ServerWorks CIOB-X2 PCI-X I/O Bridge (rev 05)
02:02.0 RAID bus controller: Compaq Computer Corporation Smart Array 64xx (rev 01)

And lsmod on ML350:

[root@linux root]# lsmod
Module                  Size  Used by    Not tainted
parport_pc             18852   1  (autoclean)
lp                      9124   0  (autoclean)
parport                38816   1  (autoclean) [parport_pc lp]
autofs                 13620   0  (autoclean) (unused)
8021q                  17320   0  (autoclean) (unused)
tg3                    58312   1
floppy                 57488   0  (autoclean)
sg                     37228   0  (autoclean)
microcode               6848   0  (autoclean)
st                     31428   0
keybdev                 2976   0  (unused)
mousedev                5624   0  (unused)
hid                    22276   0  (unused)
input                   6144   0  [keybdev mousedev hid]
usb-ohci               23176   0  (unused)
usbcore                80928   1  [hid usb-ohci]
ext3                   89960   3
jbd                    55060   3  [ext3]
cciss                  64032   8
aic7xxx               162064   0
sd_mod                 13360   0  (unused)
scsi_mod              112680   4  [sg st cciss aic7xxx sd_mod]

On my machine (Shuttle; SIS 651 with IDE disks) the problem was in the DMA code. The DMA interface has memory-mapped, read-volatile registers: reading the memory location causes the register to shift to the next batch of data (see ide_end_drive_cmd()). The tar of /proc/kcore was stealing data from the IDE driver.

This is a specific example of a class of problem whereby reading /proc files can have unwelcome side-effects. With some hardware, the /proc/bus files could have similar problems.

It can be legitimately challenged that this is not a bug. Only superuser can read the relevant files, and the files do reside in a file-system which should be treated with caution. However, these files are not "special files" to utilities like 'find'. Except for their location under /proc there is no reason to think that reading these files could cause side-effects. And to the average system administrator from a UNIX background, the characteristics of the /proc file-system may not immediately spring to mind when doing, for example, a spontaneous backup or a search.

There are a number of different remedies for this specific situation: kcore can be made modular with only a little tweaking, or it could skip uncacheable MTRRs by default. But these do not address the larger issue, and since they change the functionality of a long-established file, they could cause problems elsewhere. At the very least, I think a warning in the proc(5) man page is in order.

It turns out that there was a kernel bug in the handling for /proc/kcore that under certain conditions was causing random memory corruption.
A fix for this problem was committed to the RHEL3 U5 patch pool on 28-Jan-2005 (in kernel version 2.4.21-27.10.EL).

Hi, Debby. In response to comment #4, the /proc/kcore driver already has logic to avoid access to mapped regions with the VM_IOREMAP flag set. Do you know of problematic regions that don't use VM_IOREMAP but should?

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html

*** Bug 110890 has been marked as a duplicate of this bug. ***
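
For readers worried about the "spontaneous backup or search" case raised in the comments above, here is a minimal sketch of how a file walk can be kept out of /proc in the first place. It is not part of the fix in the advisory. The function name list_regular_files is hypothetical, and the choice to also skip /sys and to stay on one filesystem with -xdev is an assumption about what the administrator wants (with -xdev, separately mounted filesystems such as /home would be skipped too).

```shell
#!/bin/sh
# Sketch only: list regular files under a root directory while staying
# out of pseudo-filesystems such as /proc. The function name is
# hypothetical and not taken from the bug report.
list_regular_files() {
    # Strip one trailing slash so that "/" becomes "" and the
    # prune patterns below read "/proc" and "/sys".
    root=${1%/}

    # -xdev keeps find on the starting filesystem; the explicit
    # -prune also covers trees where proc/ is a plain directory.
    find "${root:-/}" -xdev \
        \( -path "$root/proc" -o -path "$root/sys" \) -prune \
        -o -type f -print
}
```

The resulting list can then be handed to whatever archiver or search tool is in use. GNU tar users may get a similar effect from its --one-file-system and --exclude options, assuming /proc is a separate mount as it normally is.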