164391 – 8250 probe kills pmac

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	4
Hardware:	powerpc
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	David Woodhouse
QA Contact:	Brian Brock

Reported:	2005-07-27 16:19 UTC by miguel
Modified:	2007-11-30 22:11 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2005-09-07 10:27:50 UTC
Type:	---
Embargoed:
Dependent Products:


Description miguel 2005-07-27 16:19:37 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4

Description of problem:
yaboot hangs at the line

Serial: 8250/16550 driver $Revision 1.90 $ 76 ports, IRQ sharing enabled

which is immediately after the agpgart test/program/probe

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. start computer with fc4 cd in drive
2. type 'linux' and press enter when prompted
3. wait until the "Serial: 8250/16550 driver $Revision 1.90 $ 76 ports, IRQ sharing enabled" line comes and wait for the rest of eternity :P
  

Actual Results:  computer hung

Expected Results:  continued installation properly

Additional info:

i'm doing this test without any PCI cards or hard drives plugged in. i doubt this has anything to do with it, but it might be valuable information

Comment 1 Paul Nasrat 2005-07-27 17:03:11 UTC

This is the kernel hanging.

Can you try booting with 

linux debug 

Note down any additional messages:

Then try:

linux video=ofonly debug

Comment 2 miguel 2005-07-28 14:24:17 UTC

linux debug shows an extra line after the first one like so:

Serial: 8250/16550 driver $Revision 1.90 $ 76 ports, IRQ sharing enabled
IN from bad port 3f9 at c01dd558

and "linux video=ofonly debug" shows the new line twice

Serial: 8250/16550 driver $Revision 1.90 $ 76 ports, IRQ sharing enabled
IN from bad port 3f9 at c01dd558
IN from bad port 3f9 at c01dd558
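
The port number in that message can be read as a UART base address plus a register offset. The following is only a minimal sketch, assuming the probe targets the standard PC legacy COM addresses (the thread does not list them) and the usual 8250/16550 register layout for reads with DLAB clear. The names decode_port and decode_message are illustrative and do not come from the kernel.

```python
# Standard PC legacy COM port bases and their conventional device names.
# Assumption: these are the addresses the hard-coded probe list uses.
LEGACY_COM_BASES = {0x3F8: "ttyS0", 0x2F8: "ttyS1", 0x3E8: "ttyS2", 0x2E8: "ttyS3"}

# 8250/16550 register names by offset from the base, for reads with DLAB=0.
READ_REGISTERS = ["RX", "IER", "IIR", "LCR", "MCR", "LSR", "MSR", "SCR"]


def decode_port(port):
    """Return (device, base, register) for a legacy UART I/O port, or None."""
    base = port & ~0x7  # each UART occupies 8 consecutive I/O ports
    device = LEGACY_COM_BASES.get(base)
    if device is None:
        return None
    return device, base, READ_REGISTERS[port - base]


def decode_message(line):
    """Parse a line of the form 'IN from bad port <hex> at <hex>'."""
    words = line.split()
    return decode_port(int(words[4], 16))


print(decode_message("IN from bad port 3f9 at c01dd558"))
# prints ('ttyS0', 1016, 'IER')
```

Under those assumptions, 3f9 is base 0x3f8 plus offset 1, which would be a read of the interrupt enable register of a first legacy serial port that this machine does not have.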

Comment 3 David Woodhouse 2005-08-13 11:29:12 UTC

I can reproduce this on my dual G4. Is yours a dual-cpu machine? It doesn't
happen every time -- sometimes it boots up OK. Some kernel builds seem to suffer
less than others; I don't think I've seen it at all with the current
(2.6.12-1.1398) FC4 kernel, but today's FC4 head (2.6.12-1.1432) does suffer on
about one in five boots.

I suspect there's something wrong with the way we just let drivers do I/O
accesses to non-existent ports and catch the resulting machine checks. I don't
think we're ending up in the machine check handler at all on the one that kills
the system. Annoyingly, adding any kind of printk debugging seems to make it
more reliable. I've left it in an endlessly rebooting loop...

Comment 4 Benjamin Herrenschmidt 2005-08-16 00:48:46 UTC

I think we need to use the new platform device stuff for setting up the 8250
ports with ppc32 CONFIG_MULTIPLATFORM instead of the current hard coded crap in
serial.h.

Then, it's just a matter of {chrp,prep,pmac}_setup.c to create appropriate
platform devices (none for pmac, device-tree based for chrp, hard coded legacy
list for prep). Most of the code can probably be re-used from ppc64 (yet another
good candidate for arch-powerpc :)
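
The per-platform split described above can be modelled roughly as follows. This is a plain Python illustration of the idea, not kernel code: serial_ports_for is a hypothetical name, and the two entries in LEGACY_PC_PORTS are the standard PC COM1/COM2 addresses and IRQs, used here as a stand-in because the actual prep list is not given in this thread.

```python
# Assumption: standard PC COM1/COM2 (I/O base, IRQ) pairs as a stand-in for
# the hard-coded legacy list; the real prep list is not shown in the thread.
LEGACY_PC_PORTS = [(0x3F8, 4), (0x2F8, 3)]


def serial_ports_for(platform, device_tree_ports=()):
    """Return the (io_base, irq) pairs to register as 8250 ports.

    pmac: none, so the driver never touches non-existent ports.
    chrp: whatever the firmware device tree describes.
    prep: a fixed legacy list.
    """
    if platform == "pmac":
        return []
    if platform == "chrp":
        return list(device_tree_ports)
    if platform == "prep":
        return list(LEGACY_PC_PORTS)
    raise ValueError("unknown platform: " + platform)


print(serial_ports_for("pmac"))
# prints []
```

The point of the design is that a PowerMac registers no 8250 ports at all, so the probe that produces the "bad port" messages would never run there.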

Comment 5 miguel 2005-08-16 02:11:07 UTC

yes, it is a dual-processor computer, i'm sorry i didn't state that before. from
the message i was getting, it seemed like it was a problem with the video card
drivers.

i have a dual 450 MHz machine, and i found a link to all the hardware specs of
the machine, just in case anyone needs anything else

http://www.apple-history.com/?page=gallery&model=g4giga&performa=off&sort=date&order=ASC

if anyone could keep me updated on the progress of the bug it would be wonderful :P

Comment 6 David Woodhouse 2005-08-16 11:28:47 UTC

Patch at http://david.woodhou.se/linux-2.6.12-serial-of.patch

Comment 7 David Woodhouse 2005-08-16 15:55:44 UTC

Miguel, I can build an updated kernel RPM -- would you be able to test it?
You'll need to get Fedora installed in the first place before you can do so.

If you try a few times, I think you should be able to boot from the CD and get
it installed. The problem doesn't occur 100% of the time for me.

Also try booting with 'maxcpus=1' on the kernel command line.

Comment 8 miguel 2005-08-18 15:03:21 UTC

well, on the 9th try of typing "linux debug maxcpus=1" i finally got into the
blue fedora core test-media screen. or at least that's what i think it is.

but then, at "loading ohci1394" there was an error that resulted in white text
taking over the screen.

Loading ohci1394 driver...Oops: machine check, sig: 7 [#1]
NIP: F21F55EC LR: F21F6500 SP: C0FD5DD0 REGS: c0fd5d20 TRAP: 0200    Not tainted
MSR: 00041030 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR:11
TASK eff64090[402] 'loader' THREAD: c0fd400
Last syscall: 120
GPR00: 00000000 C0FD5DD0 EFF69090 F20F6000 00000004 000000C0 00000000 F2340000
GPR08: E000A002 00008400 F20F6000 F21FAAD0 82000448 10144F78 00000000 00000000
GPR16: 00000000 F2200000 EFE1EAFC EFE1EA14 EFE1EB70 EFE1EA88 00000000 00000000
GPR24: F2200000 00009032 C03C0000 C0FD4000 0121EAC0 00000000 F2200000 EFE1E9F0
NIP [f21f55ec] get_phy_reg+0x20/0x270 [ohci1394]
LR [f21f6500] set_phy_reg_mask+0x20/0x50 [ohci1394]
Call trace:
 [f21f6500] set_phy_reg_mask+0x20/0x50 [ohci1394]
 [f21f6acc] ohci1394_pci_probe+0x59c/0xbc0 [ohci1394]
 [c015edec] pci_device_probe+0x9c/0x360
 [c01e3764] driver_probe_device+0x54/0xc0
 [c01e2864] driver_attach+0x94/0xd0
 [c01e3964] bus_add_driver+0xc4/0x1c0
 [c01e3f00] driver_register+0x60/0x70
 [c015e618] pci_register_driver+0xa8/0xf0
 [f1035018] ohci1394_init+0x18/0x50 [ohci1394]
 [c00508fc] sys_init_module+0x17c/0x350
 [c0004820] ret_from_syscall+0x0/0x44
ieee1394: Host added: ID:BUS:[0-00:1023] GUID[1394040922000f09]

and i dont know if thats the whole error, but thats what my screen can show :P

i'm guessing it's my firewire pci card that's causing this problem though,
considering 1394 and pci are shown a lot... i'm going to remove the card and see
if there's a nofirewire option for boot, and then come back to report.

one last thing though, i noticed that sometimes the "IN from bad port 3f9 at
c01dd558" error is shown once, and sometimes twice. the time i was able to get
to the blue screen, i saw the error scroll up about 8 times. i dont know if this
is anything significant or not, but i just wanted to let you know.
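
For anyone reading the register dump above, the MSR value can be decoded by hand. A minimal sketch, assuming the 32-bit PowerPC MSR bit masks listed in the table below; decode_msr is an illustrative helper, not a kernel function.

```python
# Selected 32-bit PowerPC MSR bit masks (name, mask).
MSR_BITS = [
    ("POW", 0x40000),  # power management enable
    ("EE", 0x8000),    # external interrupts enabled
    ("PR", 0x4000),    # problem (user) state
    ("FP", 0x2000),    # floating point available
    ("ME", 0x1000),    # machine check enable
    ("IR", 0x20),      # instruction address translation
    ("DR", 0x10),      # data address translation
]


def decode_msr(value):
    """Return the names of the MSR bits from MSR_BITS that are set in value."""
    return [name for name, mask in MSR_BITS if value & mask]


print(decode_msr(0x00041030))
# prints ['POW', 'ME', 'IR', 'DR']
```

That agrees with the "EE: 0 PR: 0 FP: 0 ME: 1 IR/DR:11" fields on the same line of the oops: the fault happened in kernel mode with interrupts disabled and machine checks enabled.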

Comment 9 David Woodhouse 2005-08-18 17:59:42 UTC

Please update your install tree from
rsync://zeniv.uk.linux.org/ftp/pub/people/dwmw2/fc4-pegasos/

That install tree has the 8250 probe bug fixed, and may well have the firewire
thing fixed too. I'm not sure about a 'nofirewire' option but try 'noprobe'.
