Bug 223750

Summary: Installing os by local cdrom method failed
Product: [Fedora] Fedora
Reporter: Zhang Yanmin <yanmin.zhang>
Component: anaconda
Assignee: Prarit Bhargava <prarit>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: rawhide
CC: davej, dchapman, prarit
Hardware: All
OS: Linux
Fixed In Version: 2.6.20-1.3017
Doc Type: Bug Fix
Last Closed: 2007-03-26 13:51:14 UTC
Attachments:
Description                                                        Flags
Anaconda log                                                       none
syslog                                                             none
lspci output                                                       none
New syslog including both ata_pixx output and ide_cd/piix output   none
The syslog from Hitachi                                            none
Patch to fix ATA driver                                            none
New patch to fix irq conversion issue                              none
Patch to add a quirk for ide controller on tiger-4                 none
New patch of ide controller quirk on ia64                          none

Description Zhang Yanmin 2007-01-22 05:28:35 UTC
Description of problem:
I built a new DVD image from the latest development tree and tried to install
the OS on my ia64 box using the local cdrom method. The installation failed.

Version-Release number of selected component (if applicable):
anaconda-11.2.0.9-1

How reproducible:
Reproducible every time I install.

Steps to Reproduce:
1. Rebuild the DVD image from the latest development tree;
2. Burn a DVD from the image;
3. Install the OS on the machine by choosing the local cdrom method;
 
  
Actual results:
A window said:
Unable to find any devices of the type needed for this installation type. Would
you like to manually select your driver or use a driver disk?


Expected results:
The question above does not appear and the installation proceeds.

Additional info:
Interestingly, I could boot from the DVD up to the point of choosing the
installation method, but after that it couldn't find the cd drive.

Comment 1 David Cantrell 2007-01-22 15:05:51 UTC
Do you get the same error if you try burning the boot.iso image provided in the
development tree?

Comment 2 Zhang Yanmin 2007-01-23 06:04:01 UTC
I burned the latest boot.iso and booted it. It has exactly the same problem.

I tested it on an ia64 machine.

Comment 3 Prarit Bhargava 2007-01-29 15:29:49 UTC
dchapman, didn't we see something similar to this a while ago?

P.

Comment 4 Doug Chapman 2007-01-29 15:49:43 UTC
This sounds a lot like the problem we saw a while back where the kernel could
not find any scsi disks.  I am not sure what the real issue was there, it
apparently got fixed upstream.

Yanmin, what is the kernel version you are booting (likely the issue here is
with the kernel and not with anaconda).  Also, is this a serial console install?
 If so can you grab the boot messages from the kernel?  If so we can hopefully
tell if it is seeing the SCSI devices or not.



Comment 5 Zhang Yanmin 2007-01-30 05:27:13 UTC
The latest kernel, kernel-2.6.19-1.2914.fc7. The issue has been there at least
since 2912.

1) The serial console doesn't work;
2) I captured the log using a workaround. See the attachment.

The cd (dvd) drive is IDE, so the driver should be ide-cd.ko. According to the
anaconda log, inserting ide-cd failed. I checked the kernel config and found
that CONFIG_BLK_DEV_IDECD is not set. That might be the root cause.
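
A minimal sketch of the config check described above, in Python. The sample text is a hypothetical excerpt, not the actual Fedora ia64 config, and `kconfig_state` is an illustrative helper name. It only distinguishes the usual forms an option takes in a kernel config file: `OPTION=y`, `OPTION=m`, or the comment `# OPTION is not set`.

```python
import re


def kconfig_state(config_text, option):
    """Return the option's value ('y', 'm', ...), 'not set', or 'absent'."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_line = "# %s is not set" % option
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "not set"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return "absent"


# Hypothetical excerpt for illustration only.
SAMPLE = """\
CONFIG_BLK_DEV_IDE=m
# CONFIG_BLK_DEV_IDECD is not set
CONFIG_ATA_PIIX=m
"""

for name in ("CONFIG_BLK_DEV_IDECD", "CONFIG_ATA_PIIX"):
    print(name, "->", kconfig_state(SAMPLE, name))
```

On an installed system the same function could be pointed at the running kernel's config file (commonly /boot/config-<version>, though that location is an assumption here).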
 

Comment 6 Zhang Yanmin 2007-01-30 05:28:33 UTC
Created attachment 146886 [details]
Anaconda log

Comment 7 Zhang Yanmin 2007-01-30 05:29:28 UTC
Created attachment 146887 [details]
syslog

Comment 8 Zhang Yanmin 2007-01-30 05:48:25 UTC
Prarit,

Could you work out a new config?

Thanks,
Yanmin


Comment 9 Zhang Yanmin 2007-01-30 09:26:45 UTC
I used a workaround to rebuild the kernel rpm with the ide-cd module and
retested. Local cd installation still failed although ide-cd.ko was inserted
correctly. I'm not sure if other modules are missing.

Comment 10 Prarit Bhargava 2007-01-30 14:20:03 UTC
From private email from Yanmin:

>The cd drive is ide on my machine. The driver should be ide-cd.ko.
>The anaconda log showed ide-cd insertion failed.

>I checked kernel config file and found CONFIG_BLK_DEV_IDECD is not set.

>That might be the root cause.


Comment 11 Prarit Bhargava 2007-01-30 16:07:53 UTC
Yanmin,

One thing before we go down this road.  Could you boot a 2.6.18 kernel and do a
lspci -xxx -vv?

Thanks,

P.

Comment 12 Prarit Bhargava 2007-01-31 20:23:23 UTC
Yanmin, what type of system are you running on?

<7>ata_piix 0000:00:1f.1: version 2.00ac7
<3>PCI: Device 0000:00:1f.1 not available because of resource collisions
<4>ata_piix: probe of 0000:00:1f.1 failed with error -22

Looks like a BIOS or FW problem.

(pointed out by jgarzik)

P.

Comment 13 Zhang Yanmin 2007-02-01 05:55:13 UTC
Prarit,

Sorry for replying late. I was on a business trip yesterday.

My system is Tiger 4 with Madison cpu.

I installed the OS on it over NFS and rebooted. After that, I couldn't mount
the cdrom.

I recompiled kernel-2.6.19-1.2914.fc7 with my own config file, which builds all
the needed modules into the kernel, and rebooted. Under the new kernel, the
cdrom mount succeeded.

So I think it's a kernel config issue.

Yanmin


Comment 14 Prarit Bhargava 2007-02-01 11:47:25 UTC
<7>ata_piix 0000:00:1f.1: version 2.00ac7
<3>PCI: Device 0000:00:1f.1 not available because of resource collisions
<4>ata_piix: probe of 0000:00:1f.1 failed with error -22

Yanmin,

The above is why your device is not working.  AFAIK, I do not have a Tiger 4
with Madison's here in Westford -- you'll have to debug the ata_piix driver
and/or ACPI tables.

The fact that you're getting resource collisions means either something is wrong
with your ACPI tables, or something is wrong with the PCI enumeration (which is
unlikely, but still possible).

Resource collisions -> ata_piix driver is not loaded -> No CDROM.

P.
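
The chain above (resource collision, failed probe, no CDROM) can be spotted mechanically in a syslog. The sketch below is a hypothetical helper, not part of anaconda or the kernel. It matches only the two message formats quoted in this report, and it hard-codes the Linux errno name it needs (22 is EINVAL) rather than relying on the platform's own table.

```python
import re

# Linux errno values used in this report, hard-coded for portability.
LINUX_ERRNO = {22: "EINVAL"}

COLLISION_RE = re.compile(
    r"^(?:<\d>)?PCI: Device (?P<dev>\S+) not available "
    r"because of resource collisions$"
)
PROBE_RE = re.compile(
    r"^(?:<\d>)?(?P<driver>\w+): probe of (?P<dev>\S+) "
    r"failed with error -(?P<err>\d+)$"
)


def probe_failures(log_text):
    """List failed driver probes and whether a collision was logged first."""
    collided = set()
    failures = []
    for line in log_text.splitlines():
        line = line.strip()
        match = COLLISION_RE.match(line)
        if match:
            collided.add(match.group("dev"))
            continue
        match = PROBE_RE.match(line)
        if match:
            err = int(match.group("err"))
            failures.append({
                "driver": match.group("driver"),
                "device": match.group("dev"),
                "errno": LINUX_ERRNO.get(err, str(err)),
                "resource_collision": match.group("dev") in collided,
            })
    return failures


# The three lines quoted in this comment.
LOG = """\
<7>ata_piix 0000:00:1f.1: version 2.00ac7
<3>PCI: Device 0000:00:1f.1 not available because of resource collisions
<4>ata_piix: probe of 0000:00:1f.1 failed with error -22
"""

for failure in probe_failures(LOG):
    print(failure)
```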

Comment 15 Zhang Yanmin 2007-02-02 00:56:05 UTC
Created attachment 147171 [details]
lspci output

Prarit,

The attachment is the lspci output on my tiger 4 box. I recompiled
kernel-2.6.19-1.2914.fc7.

Thanks,
Yanmin

Comment 16 Zhang Yanmin 2007-02-02 01:00:31 UTC
With the newly compiled kernel, I could boot on my tiger 4. The log showed the
same ata_piix error, but the cdrom worked after boot. If I set
CONFIG_BLK_DEV_IDECD=n, the cdrom doesn't work.

That's why I think it's a configuration issue. I am still debugging it. Perhaps
anaconda also needs to be changed to insert the appropriate modules.

Yanmin


Comment 17 Zhang Yanmin 2007-02-02 01:12:40 UTC
The modules to support cdrom on my ia64 box are cdrom.ko, idecore.ko, ide_cd.ko
and piix.ko. Not ata_piix although ata_piix was loaded by anaconda.

Yanmin


Comment 18 Zhang Yanmin 2007-02-02 02:09:37 UTC
Prarit,

What's the difference between piix and ata_piix? ATA is the official name for
IDE, but the kernel has both piix and ata_piix. piix works well on my box while
ata_piix doesn't.

Yanmin


Comment 19 Prarit Bhargava 2007-02-02 02:49:10 UTC
>The modules to support cdrom on my ia64 box are cdrom.ko, idecore.ko, ide_cd.ko
>and piix.ko. Not ata_piix although ata_piix was loaded by anaconda.

These are the old drivers.  The new driver uses different modules.

Please attach lspci -xxx -vv output ...

P.

Comment 20 Zhang Yanmin 2007-02-02 02:53:00 UTC
Comment #15 includes lspci -xxx -vv output.

Yanmin


Comment 21 Zhang Yanmin 2007-02-02 03:25:22 UTC
Created attachment 147179 [details]
New syslog including both ata_pixx output and ide_cd/piix output

The new syslog includes the ata_piix error messages and the ide_cd/piix success
messages. Both refer to the same device, 0000:00:1f.1.

Yanmin

Comment 22 Prarit Bhargava 2007-02-02 12:26:05 UTC
Yanmin, please stop compiling kernels.  Right now it's a waste of time.

P.

Comment 23 Zhang Yanmin 2007-02-05 02:53:15 UTC
Prarit,

Thanks. I want to know why the old ide_cd works well but ata_piix doesn't.

I checked the resources of the ide controller. From the log below, we can see
that resource 5 (counting from 0, i.e. the sixth line) has a start of 0 and a
non-zero end, so pci_enable_device will fail. If pci_enable_device fails,
ata_piix doesn't recover the device, but ide-cd calls pci_enable_device_bars
again with a smaller mask 4.

The comments on function ide_pci_enable have the details. That's why ide-cd
works.

The fix could be:
1) Fix the BIOS: Is that possible? There might be many machines with the same
issue, which is why function ide_pci_enable carries those comments.
2) Fix it in the ata modules, like ide-cd does.

Yanmin

*****log*****************
[root@tigerF 0000:00:1f.1]# pwd
/sys/devices/pci0000:00/0000:00:1f.1
[root@tigerF 0000:00:1f.1]# cat resource
0x00000000000001f0 0x00000000000001f7 0x0000000000000110
0x00000000000003f6 0x00000000000003f6 0x0000000000000110
0x0000000000000170 0x0000000000000177 0x0000000000000110
0x0000000000000376 0x0000000000000376 0x0000000000000110
0x0000000000001000 0x000000000000100f 0x0000000000000101
0x0000000000000000 0x00000000000003ff 0x0000000000000200
0x0000000000000000 0x0000000000000000 0x0000000000000000
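
The check described in this comment can be sketched in Python against the dump above. Each line of the sysfs `resource` file is `start end flags` in hex. The helper names are hypothetical, the flag constants are the IORESOURCE_IO and IORESOURCE_MEM bits from the kernel's include/linux/ioport.h, and reading "mask 4" as a bitmask that selects only BAR 4 (1 << 4) is an interpretation, not something the thread states.

```python
# Resource type bits from the kernel's include/linux/ioport.h.
IORESOURCE_IO = 0x100
IORESOURCE_MEM = 0x200

# The Tiger 4 dump for device 0000:00:1f.1, copied from this comment.
TIGER4_RESOURCE = """\
0x00000000000001f0 0x00000000000001f7 0x0000000000000110
0x00000000000003f6 0x00000000000003f6 0x0000000000000110
0x0000000000000170 0x0000000000000177 0x0000000000000110
0x0000000000000376 0x0000000000000376 0x0000000000000110
0x0000000000001000 0x000000000000100f 0x0000000000000101
0x0000000000000000 0x00000000000003ff 0x0000000000000200
0x0000000000000000 0x0000000000000000 0x0000000000000000
"""


def parse_resources(text):
    """Return (index, start, end, flags) tuples, one per non-empty line."""
    lines = [line for line in text.splitlines() if line.strip()]
    return [
        (index,) + tuple(int(field, 16) for field in line.split())
        for index, line in enumerate(lines)
    ]


def unassigned(resources):
    """Resources that have a size (end != 0) but no address (start == 0)."""
    return [r for r in resources if r[1] == 0 and r[2] != 0]


def selected_by_mask(resources, mask):
    """Indices of the resources whose bit is set in a BAR bitmask."""
    return [r[0] for r in resources if mask & (1 << r[0])]


resources = parse_resources(TIGER4_RESOURCE)
for index, start, end, flags in unassigned(resources):
    kind = "mem" if flags & IORESOURCE_MEM else "io" if flags & IORESOURCE_IO else "?"
    print("resource %d (%s) is unassigned: 0x%x-0x%x" % (index, kind, start, end))

# Under the assumed reading of "mask 4", only BAR 4 is enabled on the retry,
# so the unassigned resource 5 is never touched.
print("selected by 1 << 4:", selected_by_mask(resources, 1 << 4))
```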


Comment 24 Prarit Bhargava 2007-02-05 14:25:05 UTC
(In reply to comment #23)
> Prarit,
> 
> Thanks. I want to know why old ide_cd could works well but ata_piix doesn't work.
> 
> I checked the resources of ide controller. From below log, we could see that
> the 5th resource's start is 0 and end is not 0, so pci_enable_device will fail.
> If pci_enable_device fails, ata_piix wouldn't recover the device, but ide-cd
> would call pci_enable_device_bars again with a smaller mask 4.
> 
> The comments of function ide_pci_enable has the details. That's why ide-cd
> could work.
> 
> The fix could be:
> 1) Fix BIOS: Is it possible? I mean there might be many machines having the same
> issues, so there are the comments of function ide_pci_enable.

Yanmin, herein lies the problem.  The new driver (ata_piix) relies on the
BIOS/ACPI being correct.  On this system they are not.

IMHO #1 is the correct way to go.

> 2) Fix it in ata modules: Like ide-cd.

This would allow vendors to keep using broken BIOS/ACPI tables.  That type of
patch won't be treated very nicely upstream ;)

P.


Comment 25 Zhang Yanmin 2007-02-06 01:58:19 UTC
Prarit,

With the FC7 kernel, the cdrom drives on all 3 of my ia64 boxes (2 are tiger 4
and 1 is hitachi) don't work.

Tiger4 assigns bad resources. Hitachi reported:
ata_piix 0000:00:1f.1: irq 14 request failed: -38

I'm afraid other people will run into this soon.

Yanmin


Comment 26 Zhang Yanmin 2007-02-06 05:33:31 UTC
Prarit,

I contacted Tony Luck and confirmed that the BIOS on my tiger machine is the
latest Montecito BIOS.

Yanmin

Comment 27 Prarit Bhargava 2007-02-06 12:29:02 UTC
Yanmin, could you post the /sys/devices/pci0000:00/0000:00:1f.1 for the other
Tiger and the Hitachi?

Thanks,

P.

Comment 28 Prarit Bhargava 2007-02-06 12:32:58 UTC
Yanmin, could you also try booting with "acpi=off" and "acpi=noirq" on those boxes?

P.

Comment 29 Zhang Yanmin 2007-02-07 00:28:28 UTC
Regarding comment #27:

My other tiger halted while I was out of the office today. Below is the output
from the Hitachi machine.

From the log, I deduce that the ata device on the Hitachi uses legacy mode and
the ata driver chooses the fixed irqs 14 and 15. On ia64, such fixed ISA irqs
should be converted by function isa_irq_to_vector, because 14/15 are GSIs on
ia64.

So ata/ata_piix has poor compatibility and hasn't been tested thoroughly.

Yanmin


--------------/sys on Hitachi-------------
[root@Hitachi 0000:00:1f.1]# pwd
/sys/devices/pci0000:00/0000:00:1f.1
[root@Hitachi 0000:00:1f.1]# cat resource
0x00000000000001f0 0x00000000000001f7 0x0000000000000110
0x00000000000003f6 0x00000000000003f6 0x0000000000000110
0x0000000000000170 0x0000000000000177 0x0000000000000110
0x0000000000000376 0x0000000000000376 0x0000000000000110
0x0000000000002080 0x000000000000208f 0x0000000000000101
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x0000000000000000


----------syslog of hitachi------------------
libata version 2.00 loaded.
ata_piix 0000:00:1f.1: version 2.00ac7
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 21 (level, low) -> IRQ 60
ata1: PATA max UDMA/100 cmd 0x1F0 ctl 0x3F6 bmdma 0x2080 irq 14
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x2088 irq 15
ata_piix 0000:00:1f.1: irq 14 request failed: -38
ACPI: PCI interrupt for device 0000:00:1f.1 disabled
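
As a rough illustration of the deduction above, the Python sketch below pulls the per-port irq out of the Hitachi syslog and marks ports that use the raw legacy ISA numbers 14 and 15, together with any failed irq request. The function name and the regular expressions are hypothetical and match only the message formats shown here. On Linux, error 38 is ENOSYS.

```python
import re

# Fixed legacy ISA irq numbers for the primary and secondary channels.
LEGACY_ISA_IRQS = {14, 15}

PORT_RE = re.compile(r"^(?P<port>ata\d+): .* irq (?P<irq>\d+)$")
FAIL_RE = re.compile(
    r"^(?P<driver>\w+) (?P<dev>\S+): irq (?P<irq>\d+) "
    r"request failed: -(?P<err>\d+)$"
)

# The syslog excerpt from the Hitachi machine, copied from this comment.
HITACHI_LOG = """\
libata version 2.00 loaded.
ata_piix 0000:00:1f.1: version 2.00ac7
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 21 (level, low) -> IRQ 60
ata1: PATA max UDMA/100 cmd 0x1F0 ctl 0x3F6 bmdma 0x2080 irq 14
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x2088 irq 15
ata_piix 0000:00:1f.1: irq 14 request failed: -38
ACPI: PCI interrupt for device 0000:00:1f.1 disabled
"""


def legacy_irq_report(log_text):
    """Report each ata port's irq, whether it is a legacy number, and errors."""
    ports = {}
    failed = {}
    for line in log_text.splitlines():
        line = line.strip()
        match = PORT_RE.match(line)
        if match:
            ports[match.group("port")] = int(match.group("irq"))
            continue
        match = FAIL_RE.match(line)
        if match:
            failed[int(match.group("irq"))] = int(match.group("err"))
    return [
        {
            "port": port,
            "irq": irq,
            "legacy": irq in LEGACY_ISA_IRQS,
            "request_error": failed.get(irq),
        }
        for port, irq in sorted(ports.items())
    ]


for entry in legacy_irq_report(HITACHI_LOG):
    print(entry)
```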


Comment 30 Zhang Yanmin 2007-02-07 00:34:59 UTC
Regarding comment #28:

I will try tomorrow. ia64 needs ACPI support. Would it still work with
acpi=off?

BTW, I suspect the other tiger machine has the same resource collision issue
as the first tiger. So we have 2 issues with the ide/ata driver. 1) bad BIOS set
collided resources. 2) legacy mode, ata driver doesn't convert GSI to irq on
ia64.

Yanmin


Comment 31 Prarit Bhargava 2007-02-07 12:24:17 UTC
> 1) bad BIOS set collided resources. 

That will have to be fixed by the vendor (in this case Intel).  Upstream is very
reluctant to put fixes in for broken BIOSes.

> 2) legacy mode, ata driver doesn't convert GSI to irq on ia64.

You're right on that -- that seems like a coding issue.  I'll get to that shortly.

Could you send me the make/model of the Hitachi system?  Maybe there is one in
Westford I could use ...

Comment 32 Prarit Bhargava 2007-02-07 19:09:28 UTC
Yanmin, could you attach the syslog from the Hitachi box?

Thanks,

P.

Comment 33 Prarit Bhargava 2007-02-07 19:42:06 UTC
> 
> 2) legacy mode, ata driver doesn't convert GSI to irq on ia64.
> 
> You're right on that -- that seems like a coding issue.  I'll get to that shortly.
> 

I take this back.  After looking at the code it looks like the driver is calling
pci_enable, etc., which does map GSIs to irqs.

So it's something else.  I'm tempted to say "broken BIOS" again, but would need
to see the complete syslog and do some debugging to verify that.

P.


Comment 34 Zhang Yanmin 2007-02-08 00:53:47 UTC
Created attachment 147621 [details]
The syslog from Hitachi

I recompiled kernel-2.6.19-1.2919.fc7.src.rpm with ide-cd/piix enabled while
keeping ata/ata_piix.

Comment 35 Zhang Yanmin 2007-02-08 01:55:30 UTC
The model of my hitachi machine:

It's one of S6E4500 series, maybe S6E4521, S6E4530, or S6E4531.


Comment 36 Zhang Yanmin 2007-02-08 02:09:42 UTC
Regarding comment #33:

What is pci_enable? I assume it is pci_enable_device. pci_enable_device does
call ACPI functions to retrieve the irq number from the ACPI tables. But if the
ide/ata device is in legacy mode, the ata driver uses ATA_PRIMARY_IRQ (14) and
ATA_SECONDARY_IRQ (15) as the irq numbers without conversion from GSI to irq.

Please see the call sequence:
ata_pci_init_one
      => ata_pci_init_legacy_port (set 14/15 as irq number if legacy mode)
      => ata_device_add (just call request_irq)

Yanmin


Comment 37 Zhang Yanmin 2007-02-08 07:51:47 UTC
Created attachment 147636 [details]
Patch to fix ATA driver

The patch fixes the cd driver issue on my Hitachi.

I sent another, similar patch to LKML.

Yanmin

Comment 38 Prarit Bhargava 2007-02-08 12:30:59 UTC
+#if defined(__ia64__)
+#define ATA_PRIMARY_IRQ(dev)	isa_irq_to_vector(14)
+#else
 #define ATA_PRIMARY_IRQ(dev)	14
+#endif

Yanmin,

Hmmm ... I could have sworn I once read that as ATA_PRIMARY_IRQ(dev) was
ide_default_irq(0x1F0) ... Maybe I have it wrong in my head and am confusing it
with another piece of code.

In any case, I'm assuming you're right :) -- but I would suggest that the above
should be

#define ATA_PRIMARY_IRQ(dev)	ide_default_irq(0x1F0)
#define ATA_SECONDARY_IRQ(dev)	ide_default_irq(0x170)

(see include/asm-ia64/ide.h:ide_default_irq)

That way you can drop the #ifdef's.

What do you think?

P.

Comment 39 Prarit Bhargava 2007-02-08 12:35:33 UTC
And :) good catch. :)

P.

Comment 40 Zhang Yanmin 2007-02-09 01:11:39 UTC
Regarding comment #38:

It looks like the ata source code doesn't include the old ide code, so I didn't
use ide_default_irq. There is more discussion on LKML. Andrew thinks it's a good
fix, at least for the 2.6.20 stable kernel.

Yanmin


Comment 41 Zhang Yanmin 2007-02-09 01:18:35 UTC
Regarding comment #24:

I discussed the tiger BIOS issue with Tony again. He agreed to help push the
BIOS team to fix it, but the chance of success is very low because tiger4 might
be replaced soon. Unfortunately, most of my ia64 boxes are tiger4.

Yanmin


Comment 42 Zhang Yanmin 2007-02-09 08:35:49 UTC
Created attachment 147744 [details]
New patch to fix irq conversion issue

The new patch is based on community comments.

I added it to the latest FC7 kernel and recompiled the kernel. Then I rebuilt
a DVD from the latest development tree and replaced the kernel. Finally, I
succeeded in installing the OS on my hitachi machine using the local cd
installation method.

Comment 43 Prarit Bhargava 2007-02-12 18:19:23 UTC
Yanmin,

I spoke with Dave Jones about the BIOS issue you're having with the Tiger4
system.  He agrees that the best thing to do is to put a fix into the Fedora
tree for it.

As Alan Cox suggested, the best bet is to create a quirks file similar to
arch/i386/kernel/quirks.c ...

P.

Comment 44 Zhang Yanmin 2007-02-13 01:00:19 UTC
Prarit,

I will work out a patch based on Alan's idea.

Yanmin


Comment 45 Zhang Yanmin 2007-02-13 09:22:45 UTC
Created attachment 147966 [details]
Patch to add a quirk for ide controller on tiger-4

Regarding comment #43:
Here is the quirk patch. I sent it to the linux-ia64 mailing list.

Yanmin

Comment 46 Zhang Yanmin 2007-02-14 06:23:17 UTC
Created attachment 148038 [details]
New patch of ide controller quirk on ia64

Comment 48 Prarit Bhargava 2007-03-26 13:51:14 UTC
This patch is in the kernel source for F7 (and beyond).

P.