Bug 123434 - hang during ifdown on reboot
Summary: hang during ifdown on reboot
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Stephen Tweedie
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2004-05-18 15:34 UTC by Alex Lyashkov
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-10-15 00:23:18 UTC
Target Upstream Version:
Embargoed:


Attachments
log from serial console. (6.18 KB, application/x-tar)
2004-05-18 16:48 UTC, Alex Lyashkov

Description Alex Lyashkov 2004-05-18 15:34:49 UTC
Description of problem:
I booted the 2.4.21-4smp kernel on a UP machine (my mistake) and the
test box deadlocked at reboot.

Version-Release number of selected component (if applicable):
2.4.21-4

Additional info:
Part of the task trace:

kjournald     S 00000000  3776    14      1            58    10 (L-TLB)
Call Trace:   [<c0123464>] schedule [kernel] 0x2f4 (0xf669df3c)
[<c0123b62>] interruptible_sleep_on [kernel] 0x52 (0xf669df80)
[<f88134bc>] kjournald [jbd] 0x14c (0xf669dfb0)
[<f8813350>] commit_timeout [jbd] 0x0 (0xf669dfd4)
[<f8813370>] kjournald [jbd] 0x0 (0xf669dfe4)
[<c010958d>] kernel_thread_helper [kernel] 0x5 (0xf669dff0)
kjournald     S 00000000  3944  2441      1          2445    58 (L-TLB)
Call Trace:   [<c0123464>] schedule [kernel] 0x2f4 (0xf6611f3c)
[<c0123b62>] interruptible_sleep_on [kernel] 0x52 (0xf6611f80)
[<f88134bc>] kjournald [jbd] 0x14c (0xf6611fb0)
[<f8813350>] commit_timeout [jbd] 0x0 (0xf6611fd4)
[<f8813370>] kjournald [jbd] 0x0 (0xf6611fe4)
[<c010958d>] kernel_thread_helper [kernel] 0x5 (0xf6611ff0)

kjournald     S 00000000  3912  2445      1          3220  2441 (L-TLB)
Call Trace:   [<c0123464>] schedule [kernel] 0x2f4 (0xc1fb3f3c)
[<c0123b62>] interruptible_sleep_on [kernel] 0x52 (0xc1fb3f80)
[<f88134bc>] kjournald [jbd] 0x14c (0xc1fb3fb0)
[<f88175c0>] .rodata.str1.32 [jbd] 0x13e0 (0xc1fb3fb4)
[<f8813370>] kjournald [jbd] 0x0 (0xc1fb3fc4)
[<f8813350>] commit_timeout [jbd] 0x0 (0xc1fb3fd4)
[<f8813370>] kjournald [jbd] 0x0 (0xc1fb3fe4)
[<c010958d>] kernel_thread_helper [kernel] 0x5 (0xc1fb3ff0)

Comment 1 Rik van Riel 2004-05-18 16:09:33 UTC
Alex, can you reproduce this bug with the latest RHEL3 kernel
(2.4.21-14EL)?

Comment 2 Alex Lyashkov 2004-05-18 16:17:25 UTC
I don't know. -4smp is my "safe mode" kernel; the base kernel for this
box is a freevps kernel. If needed, I can recompile -15smp and try
starting and stopping the box.



Comment 3 Stephen Tweedie 2004-05-18 16:37:00 UTC
"S" state is the normal process state for a journal thread that has no
work to do and is sitting idle.  The processes would have to be in "D"
state for there to be any sign of a deadlock here.
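
As a rough illustration of the check described above, the following Python sketch scans an alt-sysrq-t dump for task header lines and picks out any task in "D" state. The column layout (command, state, saved PC, free stack, PID) is an assumption inferred from the trace quoted in this report, and the names task_states and blocked_tasks are hypothetical helpers, not part of any kernel tool.

```python
import re

# Assumed layout of a task header line in a 2.4-style alt-sysrq-t dump:
#   <comm> <state> <saved pc or "current"> <free stack> <pid> ...
# This layout is inferred from the trace in this report, not from a spec.
TASK_HEADER = re.compile(
    r"^(?P<comm>\S+)\s+(?P<state>[RSDTZW])\s+\S+\s+\d+\s+(?P<pid>\d+)\s"
)


def task_states(dump):
    """Return (comm, pid, state) for every task header line in the dump."""
    tasks = []
    for line in dump.splitlines():
        match = TASK_HEADER.match(line)
        if match:
            tasks.append(
                (match.group("comm"), int(match.group("pid")), match.group("state"))
            )
    return tasks


def blocked_tasks(dump):
    """Return only the tasks in uninterruptible sleep ("D" state)."""
    return [task for task in task_states(dump) if task[2] == "D"]


if __name__ == "__main__":
    sample = (
        "kjournald     S 00000000  3776    14      1            58    10 (L-TLB)\n"
        "Call Trace:   [<c0123464>] schedule [kernel] 0x2f4 (0xf669df3c)\n"
    )
    for comm, pid, state in task_states(sample):
        print(comm, pid, state)
    print("tasks in D state:", blocked_tasks(sample))
```

Run against the three kjournald entries quoted in the description, blocked_tasks returns an empty list, which matches the point that nothing in that trace suggests a deadlock.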

We need better information to even begin to diagnose this.  Just what
exactly does "deadlocked at reboot" mean?  How far does the reboot
process get before stalling?  I assume it's not a complete lockup at
that point, because you can extract process trace info?

Comment 4 Alex Lyashkov 2004-05-18 16:48:00 UTC
Created attachment 100307 [details]
log from serial console.

This is the full session log recorded from the serial console.
The system locks up after the "reboot" command.

Comment 5 Stephen Tweedie 2004-05-18 18:00:29 UTC
Is the hang always at the "eth0: " line?

Comment 6 Alex Lyashkov 2004-05-18 18:32:39 UTC
With 2.4.21-4, always. I have not tried -15.

Comment 7 Stephen Tweedie 2004-05-19 10:53:12 UTC
OK, so "ifdown" is hanging, and this has nothing to do with ext3. 
I'll update the subject line.

Also, we can't support home-built kernels, simply because we have no
idea of the configuration or patches that have been applied there. 
Please try to reproduce this on 2.4.21-15EL.

Comment 8 Alex Lyashkov 2004-05-19 11:54:24 UTC
Did you look at the serial console log or not? As far as I can see, no.
It is a Red Hat-built 2.4.21-4 kernel, and it hung.
Well, I will find time to rebuild 2.4.21-15 and test.




Comment 9 Stephen Tweedie 2004-05-19 13:46:17 UTC
Yes I have, but I was looking at your own comment about recompiling
-15smp.  I'm not sure why you'd have to recompile in order to test
2.4.21-15.EL unless you're using your own home-built configuration.

Anyway, if the ifdown is continually in "R" state in this case, the
important thing is to find out what it's running at that point. 
alt-sysrq-t is not reliable for the current running process (because
it hunts through the stack based on the last recorded stack pointer,
not the _current_ stack pointer), so an alt-sysrq-p (or, if you're
using .15.EL, alt-sysrq-w to grab all CPUs on an SMP box) will help to
find out what is sticking.
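
If the keyboard combination is awkward to send, for example over a serial line, one possible alternative is to write the SysRq command key to the kernel's trigger file from a root shell that is still responsive. The Python sketch below assumes the running kernel exposes /proc/sysrq-trigger and has SysRq enabled. Neither assumption is confirmed in this report, and request_sysrq is a hypothetical helper name.

```python
# Assumption: the running kernel provides this file and SysRq is enabled.
SYSRQ_TRIGGER = "/proc/sysrq-trigger"

# Command keys mentioned in this report. The meaning of "w" is taken from
# the comment above (all CPUs on .15.EL); other kernels may differ.
SYSRQ_KEYS = {
    "p": "registers of the currently running process",
    "t": "task list with saved stack traces",
    "w": "all CPUs on an SMP box (per the comment above)",
}


def request_sysrq(key, trigger_path=SYSRQ_TRIGGER):
    """Write a single SysRq command key to the trigger file (needs root)."""
    if key not in SYSRQ_KEYS:
        raise ValueError("unsupported SysRq key: %r" % (key,))
    with open(trigger_path, "w") as trigger:
        trigger.write(key)


if __name__ == "__main__":
    # Ask for a register dump of whatever is running right now; the output
    # goes to the kernel log and console, not to this script.
    request_sysrq("p")
```

The key set is restricted to "p", "t" and "w" on purpose, since other SysRq keys can reboot or power off the machine.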

Comment 10 Stephen Tweedie 2004-10-15 00:23:18 UTC
Please reopen this bug report if you are able to capture the requested
alt-sysrq-w/p.  Thanks.

