Bug 80483 - DVDrip "use PSU core" seems to corrupt memory with some setups
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 8.0
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-12-27 02:23 UTC by d. johnson
Modified: 2007-04-18 16:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:40:20 UTC
Embargoed:



Description d. johnson 2002-12-27 02:23:48 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021224

Description of problem:
Processes that make extensive use of pipes become unkillable.  You cannot attach
to them with 'strace -p' or any other tracer.  They continue to consume CPU time
and refuse to exit.

After a 'kill -9' and a wait of around 6 hours, the process usually (but not
always) exits.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
The easiest way I've found is to use dvdrip.

1. Run 'dvdrip' and tell it to encode a DVD.
2. This runs multiple transcode pipes.
3. The job never finishes.  The transcode processes on the pipes never exit.
    

Actual Results:  The transcode processes hang; a reboot is usually needed.

Expected Results:  Transcode processes should exit after the EOF is read from
the pipes.
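
For comparison, a minimal sketch of the expected pipe behaviour with transcode
taken out of the picture. This assumes a current system with coreutils
(timeout, seq); pipe_eof_check is a hypothetical helper name, and the line
count and time limit are arbitrary. It is a sanity check, not a reproducer.

```shell
#!/bin/sh
# Hypothetical helper: push N lines through a two-stage pipe and count them.
# If every stage exits once it reads EOF, the pipeline finishes and prints N.
# timeout(1) aborts the run (exit status 124) if a stage never sees EOF.
pipe_eof_check() {
    lines=${1:-1000}
    limit=${2:-10}
    timeout "$limit" sh -c 'seq 1 "$1" | cat | wc -l' sh "$lines"
}

pipe_eof_check 1000 10
```

On a healthy kernel this returns almost immediately with the line count.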

Additional info:

Reproducible 100% using any of the following kernels: kernel-2.4.18-17.8.0,
kernel-2.4.18-18.8.0, kernel-2.4.20-2.2.

Bug has been confirmed by several others, and is not restricted to my specific
hardware or to transcode.  (It just happens that transcode is the easiest way
for me to replicate this bug).

Comment 1 Anthony Rumble 2002-12-28 03:11:37 UTC
I am seeing similar behaviour, but I am not convinced it is pipe related; it
looks mmap related to me. It happens when running transcode by itself, rpm, pine
and gimp, and the one thing they all have in common is mmap.

It always happens while doing heavy file IO.

When the process is "unkillable", it's in state "schedule" in the kernel.

Sometimes, after extended periods of time, the process will actually die, but
this doesn't always happen. Mostly I have to reboot to clear it.
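
A rough sketch for spotting such processes, assuming they show up in
uninterruptible sleep (state D) and that the kernel exposes /proc/PID/wchan
(current kernels do; I have not checked the kernels discussed here).
list_dstate is a hypothetical helper name, and the proc root is a parameter
only so the function can be pointed at a saved copy.

```shell
#!/bin/sh
# Hypothetical helper: list processes in uninterruptible sleep (state D)
# together with the kernel function they are waiting in (wchan).
list_dstate() {
    root=${1:-/proc}
    for dir in "$root"/[0-9]*; do
        [ -r "$dir/status" ] || continue
        # Second field of the "State:" line is the one-letter state code.
        state=$(awk '/^State:/ { print $2; exit }' "$dir/status" 2>/dev/null) || continue
        [ "$state" = "D" ] || continue
        # wchan may be missing or unreadable depending on kernel and privileges.
        wchan=$(cat "$dir/wchan" 2>/dev/null) || wchan=""
        printf '%s %s %s\n' "${dir##*/}" "$state" "${wchan:--}"
    done
}

list_dstate
```

Each output line is "PID state wchan"; a "-" means the wait channel could not
be read.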

Unfortunately, I cannot reproduce this in a testable fashion yet, but I am trying.


Comment 2 Anthony Rumble 2002-12-28 03:56:14 UTC
Ok, I am now able to reliably hang gimp under 2.4.18-18.8.0.
I can open a large photo in gimp and do several mirror flips in succession;
usually after the 4th or 5th, it hangs (unkillable).

But I just went to 2.5.18-19.8.0, and I cannot reproduce this hang any more.
It may not be fixed; the race condition may simply have moved elsewhere.

I will keep thrashing and see what I can find.

Running Athlon-600, UDMA66 drives, using Athlon optimised kernel.

Could this perhaps be related to the old athlon memcpy optimisation issue with
Via boards?

Comment 3 d. johnson 2002-12-28 04:31:27 UTC
For reference, my system is Athlon-1200 using rhat's athlon kernel, nVidia
Corporation NV25 [GeForce4 Ti4200] (rev a3) with NVIDIA driver -4191, and ATA100
drive. [VIA Technologies, Inc. VT82C586B PIPC Bus Master IDE (rev 06)]

I have tried other NVIDIA revs as well: 2880, 2960.

Comment 4 d. johnson 2002-12-30 02:39:37 UTC
Retried with 2.4.20-virgin, no pipe/mmap problems at all.  I still loaded the
NVIDIA_kernel module (4191) under this kernel, so that rules out a simple
"Nvidia caused it" issue, as well as any hardware specific bugs with my setup.

Actual dvdrip (transcode) log (for reference) was:
Sun Dec 29 08:50:35 2002 Starting job (1): Transcoding video - title #1, pass 1
Sun Dec 29 08:50:35 2002 Executing command: mkdir -m 0775 -p
'/home/dj/tmp/Matrix/tmp' && cd /home/dj/tmp/Matrix/tmp && transcode -a 0 -x
vob,null -i /home/dj/tmp/Matrix/vob/001 -w 1951,250,100 -b 192,0,0 -s 1.412 -V 
-C 1 -I 1 -f 24,1 -g 720x480 -M 2 -j
62,8,62,6 -Z 752x320 -R 1 -y divx4,null --psu_mode --nav_seek
/home/dj/tmp/Matrix/tmp/Matrix-001-nav.log --no_split  -o /dev/null && echo
DVDRIP_SUCCESS  (PID=1528)
Sun Dec 29 13:46:13 2002 Successfully finished job (1): Transcoding video -
title #1, pass 1
Sun Dec 29 13:46:13 2002 Starting job (2): Transcoding video - title #1, pass 2
Sun Dec 29 13:46:13 2002 Executing command: mkdir -m 0775 -p
'/home/dj/tmp/Matrix/tmp' && cd /home/dj/tmp/Matrix/tmp && transcode -a 0 -x vob
-i /home/dj/tmp/Matrix/vob/001 -w 1951,250,100 -b 192,0,0 -s 1.412 -V  -C 1 -I 1
-f 24,1 -g 720x480 -M 2 -j 62,8,62,6 -Z 752x320 -R 2 -y divx4 -E 48000
--psu_mode --nav_seek /home/dj/tmp/Matrix/tmp/Matrix-001-nav.log --no_split  -o
/home/dj/tmp/Matrix/avi/001/Matrix-001.avi && echo
DVDRIP_SUCCESS  (PID=5516)
Sun Dec 29 18:56:52 2002 Successfully finished job (2): Transcoding video -
title #1, pass 2

Please let me know if you'd like any more details.

Comment 5 Alan Cox 2002-12-30 14:15:16 UTC
It doesn't rule out an nvidia module interaction with mmap. It could be a kernel
bug, but until it's reproduced with the RH kernel without the nvidia module it's
not that interesting.

Alan


Comment 6 Nat Friedman 2003-01-01 21:06:37 UTC
I've reproduced this as well, doing a 180deg rotation on a very large image in
gimp.  I'm running a virgin 2.4.18-3 rhat kernel with no binary modules loaded
(lsmod |head -1 reports "Not tainted").  The machine is an IBM T23 laptop with a
pIII.
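
As a cross-check alongside the lsmod header, here is a small sketch that reads
the taint bitmask directly, assuming the kernel exposes
/proc/sys/kernel/tainted. kernel_taint is a hypothetical helper name, and the
path is a parameter only so a saved copy can be checked.

```shell
#!/bin/sh
# Hypothetical helper: report whether the running kernel is tainted.
# /proc/sys/kernel/tainted holds a bitmask; 0 means no taint flags are set.
kernel_taint() {
    file=${1:-/proc/sys/kernel/tainted}
    if ! value=$(cat "$file" 2>/dev/null); then
        echo "unknown"
    elif [ "$value" = "0" ]; then
        echo "not tainted"
    else
        echo "tainted (flags=$value)"
    fi
}

kernel_taint
```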

Comment 7 d. johnson 2003-01-17 12:49:31 UTC
After many, many iterations and crashes, I've narrowed transcode's issue down to
the "Use PSU core" option, which is ON by default in dvdrip.

If you disable this, the kernel has no issue.  Left enabled, the process becomes
unkillable.

This testing has been done without the NVIDIA kernel modules loaded.  It has
also been tested on a non-redhat-kernel, where it runs fine.  (virgin 2.4.20 was
used in testing as well.)

For the past week, my kernel has been untainted so that I could properly run
these tests.

Comment 8 Alan Cox 2003-06-05 13:31:57 UTC
(the other mmap related reports sound like stuff fixed in errata already)


Comment 9 Bugzilla owner 2004-09-30 15:40:20 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


