Bug 487712 - Fedora11 Alpha give a kernel Oops while installing on JS12 blade (.ehea_probe_adapter)
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: ppc64 All
Priority: low
Severity: high
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-02-27 10:51 EST by IBM Bug Proxy
Modified: 2009-11-24 02:00 EST (History)
Doc Type: Bug Fix
Last Closed: 2009-03-25 00:30:34 EDT

Attachments
reboot.log (3.91 KB, text/x-log)
2009-02-27 10:51 EST, IBM Bug Proxy

Description IBM Bug Proxy 2009-02-27 10:51:10 EST
=Comment: #0=================================================
Pavan Naregundi <pavan.naregundi@in.ibm.com> - 

reboot.log

The kernel produced an Oops message during the installation of Fedora 11 Alpha on a JS12 blade. The
log generated while booting is pasted below.

Installation type: DVD based installation
Machine model type: 7998-60X
CPU type: Power 6

===================
boot: linux64
Please wait, loading kernel...
   Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 04f00000, size: 19866 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: ro 
memory layout at init:
  alloc_bottom : 0000000006267000
  alloc_top    : 0000000008000000
  alloc_top_hi : 0000000008000000
  rmo_top      : 0000000008000000
  ram_top      : 0000000008000000
Looking for displays
instantiating rtas at 0x00000000074f2000 ... done
boot cpu hw idx 0000000000000000
starting cpu hw idx 0000000000000002... done
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000006368000 -> 0x0000000006369796
Device tree struct  0x000000000636a000 -> 0x0000000006384000
Calling quiesce ...
returning from prom_init
Phyp-dump disabled at boot time
Using pSeries machine description
Using 1TB segments
Found initrd at 0xc000000004f00000:0xc000000006266ae7
console [udbg0] enabled
Partition configured for 4 cpus.
CPU maps initialized for 2 threads per core
Starting Linux PPC64 #1 SMP Thu Jan 29 14:47:36 EST 2009
-----------------------------------------------------
ppc64_pft_size                = 0x1a
physicalMemorySize            = 0xa0000000
htab_hash_mask                = 0x7ffff
-----------------------------------------------------
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.29-0.66.rc3.fc11.ppc64 (mockbuild@ppc10.fedora.phx.redhat.com) (gcc version 4.3.2
20081105 (Red Hat 4.3.2-7) (GCC) ) #1 SMP Thu Jan 29 14:47:36 EST 2009
[boot]0012 Setup Arch
PCI host bridge /pci@800000020000201  ranges:
  IO 0x00003dffe02f0000..0x00003dffe02fffff -> 0x00000000000f0000
 MEM 0x00003c0080000000..0x00003c00ffffffff -> 0x0000000080000000 
EEH: PCI Enhanced I/O Error Handling Enabled
PPC64 nvram contains 15360 bytes
Zone PFN ranges:
  DMA      0x00000000 -> 0x000a0000
  Normal   0x000a0000 -> 0x000a0000
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x000a0000
[boot]0015 Setup Done
Built 1 zonelists in Node order, mobility grouping on.  Total pages: 638720
Policy zone: DMA
Kernel command line: ro 
[boot]0020 XICS Init
[boot]0021 XICS Done
PID hash table entries: 4096 (order: 12, 32768 bytes)
clocksource: timebase mult[7d0000] shift[22] registered
Console: colour dummy device 80x25
console handover: boot [udbg0] -> real [hvc0]
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:  8
... MAX_LOCK_DEPTH:          48
... MAX_LOCKDEP_KEYS:        8191
... CLASSHASH_SIZE:          4096
... MAX_LOCKDEP_ENTRIES:     8192
... MAX_LOCKDEP_CHAINS:      16384
... CHAINHASH_SIZE:          8192
 memory used by lock dependency info: 4351 kB
 per task-struct memory footprint: 2688 bytes
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
allocated 26214400 bytes of page_cgroup
please try cgroup_disable=memory option if you don't want
freeing bootmem node 0
Memory: 2443428k/2621440k available (14596k kernel code, 178012k reserved, 1080k data, 8874k bss,
6480k init)
SLUB: Genslabs=13, HWalign=128, Order=0-3, MinObjects=0, CPUs=4, Nodes=16
Calibrating delay loop... 1019.90 BogoMIPS (lpj=509952)
Security Framework initialized
SELinux:  Initializing.
Mount-cache hash table entries: 256
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys net_cls
Processor 1 found.
Processor 2 found.
Processor 3 found.
Brought up 4 CPUs
khelper used greatest stack depth: 10944 bytes left
net_namespace: 2232 bytes
regulator: core version 0.5
NET: Registered protocol family 16
IBM eBus Device Driver
khelper used greatest stack depth: 10896 bytes left
PCI: Probing PCI hardware
pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot
pci 0000:00:01.0: PME# disabled
pci 0000:00:01.1: PME# supported from D0 D1 D2 D3hot
pci 0000:00:01.1: PME# disabled
pci 0000:00:01.2: PME# supported from D0 D1 D2 D3hot
pci 0000:00:01.2: PME# disabled
IOMMU table initialized, virtual merging enabled
khelper used greatest stack depth: 10560 bytes left
bio: create slab <bio-0> at 0
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
khelper used greatest stack depth: 10352 bytes left
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
TCP bind hash table entries: 65536 (order: 10, 4718592 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
TCP reno registered
NET: Registered protocol family 1
checking if image is initramfs... it is
Freeing initrd memory: 19866k freed
audit: initializing netlink socket (disabled)
type=2000 audit(1233902614.614:1): initialized
HugeTLB registered 16 MB page size, pre-allocated 0 pages
HugeTLB registered 16 GB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 4811
alg: No test for stdrng (krng)
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
io scheduler noop registered
io scheduler cfq registered (default)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
pciehp: PCI Express Hot Plug Controller Driver version: 0.4
Linux agpgart interface v0.103
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
TX39/49 Serial driver version 1.11
brd: module loaded
loop: module loaded
Fixed MDIO Bus: probed
input: Macintosh mouse button emulation as /devices/virtual/input/input0
Uniform Multi-Platform E-IDE driver
Driver 'sd' needs updating - please use bus_type methods
Driver 'sr' needs updating - please use bus_type methods
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci_hcd 0000:00:01.2: enabling device (0140 -> 0142)
ehci_hcd 0000:00:01.2: EHCI Host Controller
ehci_hcd 0000:00:01.2: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:01.2: irq 307, io mem 0x3c00ffffd000
ehci_hcd 0000:00:01.2: USB 2.0 started, EHCI 1.00
usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: EHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.29-0.66.rc3.fc11.ppc64 ehci_hcd
usb usb1: SerialNumber: 0000:00:01.2
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 5 ports detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci_hcd 0000:00:01.0: OHCI Host Controller
khelper used greatest stack depth: 10224 bytes left
ohci_hcd 0000:00:01.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:01.0: irq 305, io mem 0x3c00ffffe000
usb usb2: New USB device found, idVendor=1d6b, idProduct=0001
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: OHCI Host Controller
usb usb2: Manufacturer: Linux 2.6.29-0.66.rc3.fc11.ppc64 ohci_hcd
usb usb2: SerialNumber: 0000:00:01.0
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
ohci_hcd 0000:00:01.1: OHCI Host Controller
ohci_hcd 0000:00:01.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:01.1: irq 306, io mem 0x3c00fffff000
usb usb3: New USB device found, idVendor=1d6b, idProduct=0001
usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb3: Product: OHCI Host Controller
usb usb3: Manufacturer: Linux 2.6.29-0.66.rc3.fc11.ppc64 ohci_hcd
usb usb3: SerialNumber: 0000:00:01.1
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
uhci_hcd: USB Universal Host Controller Interface driver
mice: PS/2 mouse device common for all mice
platform ppc-rtc.0: rtc core: registered ppc_md as rtc0
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.14.0-ioctl (2008-04-23) initialised: dm-devel@redhat.com
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
nf_conntrack.acct=1 kernel paramater, acct=1 nf_conntrack module option or
sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
ip_tables: (C) 2000-2006 Netfilter Core Team
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 17
registered taskstats version 1
Freeing unused kernel memory: 6480k freed
[9;0][8]
Greetings.
anaconda installer init version 11.5.0.12 starting
mounting /proc filesystem... done
creating /dev filesystem... done
starting udev...done
mounting /dev/pts (unix98 pty) filesystem... done
mounting /sys filesystem... done
anaconda installer init version 11.5.0.12 using /dev/hvc0 as console
trying to remount root filesystem read write... done
mounting /tmp as ramfs... done
running install...
running /sbin/loader
detecting hardware...
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA pSeries
Modules linked in: ehea(+) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext2 ext4 jbd2 crc16
squashfs nfs lockd nfs_acl auth_rpcgss sunrpc vfat fat cramfs
NIP: d0000000004d0640 LR: d0000000004da2dc CTR: 0000000000000000
REGS: c0000000999870c0 TRAP: 0300   Not tainted  (2.6.29-0.66.rc3.fc11.ppc64)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 24422484  XER: 00000001
DAR: 6b6b6b6b6b6bcb33, DSISR: 0000000040000000
TASK = c000000099990000[1010] 'modprobe' THREAD: c000000099984000 CPU: 3
GPR00: c0000000994cdea0 c000000099987340 d0000000004edd40 0000000000000001 
GPR04: d0000000004e4350 0000000000000000 0000000000000000 6b6b6b6b6b6b6b6b 
GPR08: c0000000994cdda8 6b6b6b6b6b6b6b6b 0000000000000008 0000000000000000 
GPR12: 0000000044422444 c000000000f97a00 0000000000000000 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000010005890 0000000000000003 
GPR20: 000000001003ba58 0000000000000000 0000000000000000 0000000000000000 
GPR24: 00000000100292c0 000000001002944c c000000000eae7d0 d0000000004e41e0 
GPR28: fffffffffffffffb d0000000004e41e0 d0000000004ecde0 c0000000994cdda8 
NIP [d0000000004d0640] .ehea_update_firmware_handles+0x88/0x3f0 [ehea]
LR [d0000000004da2dc] .ehea_probe_adapter+0x3a0/0x3e8 [ehea]
Call Trace:
[c000000099987340] [c000000000f8a7e8] kmalloc_caches+0x32e8/0x4a00 (unreliable)
[c000000099987420] [d0000000004da2dc] .ehea_probe_adapter+0x3a0/0x3e8 [ehea]
[c0000000999874e0] [c0000000004a4124] .of_platform_device_probe+0x80/0xb8
[c000000099987570] [c000000000380d24] .driver_probe_device+0x114/0x1fc
[c000000099987610] [c000000000380ea0] .__driver_attach+0x94/0xd8
[c0000000999876a0] [c0000000003802e8] .bus_for_each_dev+0x7c/0xdc
[c000000099987750] [c000000000380ab0] .driver_attach+0x28/0x40
[c0000000999877d0] [c00000000037f9ac] .bus_add_driver+0xcc/0x280
[c000000099987870] [c0000000003811a4] .driver_register+0xe4/0x1bc
[c000000099987920] [c0000000004a3fdc] .of_register_driver+0x44/0x58
[c000000099987990] [c000000000024c70] .ibmebus_register_driver+0x30/0x4c
[c000000099987a20] [d0000000004da4f8] .ehea_module_init+0x1d4/0x2374 [ehea]
[c000000099987ab0] [c000000000009434] .do_one_initcall+0x9c/0x1dc
[c000000099987d90] [c0000000000ce684] .SyS_init_module+0xd8/0x234
[c000000099987e30] [c0000000000085f0] syscall_exit+0x0/0x40
Instruction dump:
fbe1fff8 f821ff21 e93d0170 3909ff08 48000058 39400000 7d285214 394a0008 
2f2a0080 e9290010 2fa90000 419e002c <80095fc8> 2f800001 409e0020 80095f98 
---[ end trace d5bc3c95b8684359 ]---
modprobe used greatest stack depth: 7728 bytes left
waiting for hardware to initialize...
drivers/net/ibmveth.c: ibmveth: IBM i/pSeries Virtual Ethernet Driver 1.03
vio_register_driver: driver ibmveth registering
eth0 (ibmveth): not using net_device_ops yet
eth1 (ibmveth): not using net_device_ops yet
modprobe used greatest stack depth: 6864 bytes left
vio_register_driver: driver ibmvscsi registering
ibmvscsi 30000002: SRP_VERSION: 16.a
scsi0 : IBM POWER Virtual SCSI Adapter 1.5.8
ibmvscsi 30000002: partner initialization complete
ibmvscsi 30000002: sent SRP login
ibmvscsi 30000002: SRP_LOGIN succeeded
ibmvscsi 30000002: host srp version: 16.a, host partition 06-1C11A (1), OS 3, max io 1048576
scsi 0:0:1:0: Direct-Access     AIX      VDASD            0001 PQ: 0 ANSI: 3
scsi 0:0:2:0: CD-ROM            AIX      VOPTA                 PQ: 0 ANSI: 4
sd 0:0:1:0: [sda] 130023424 512-byte hardware sectors: (66.5 GB/62.0 GiB)
sd 0:0:1:0: [sda] Write Protect is off
sd 0:0:1:0: [sda] Mode Sense: 17 00 00 08
sd 0:0:1:0: [sda] Cache data unavailable
sd 0:0:1:0: [sda] Assuming drive cache: write through
sd 0:0:1:0: [sda] 130023424 512-byte hardware sectors: (66.5 GB/62.0 GiB)
sd 0:0:1:0: [sda] Write Protect is off
sd 0:0:1:0: [sda] Mode Sense: 17 00 00 08
sd 0:0:1:0: [sda] Cache data unavailable
sd 0:0:1:0: [sda] Assuming drive cache: write through
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 >
sd 0:0:1:0: [sda] Attached SCSI disk
sd 0:0:1:0: Attached scsi generic sg0 type 0
sr0: scsi-1 drive
Uniform CD-ROM driver Revision: 3.20
sr 0:0:2:0: Attached scsi CD-ROM sr0
sr 0:0:2:0: Attached scsi generic sg1 type 5

Welcome to Fedora for ppc
===========================


After the installation completed, call traces were produced again during the reboot (see the
reboot.log attachment).

When the machine booted the installed Fedora, it could not recognize eth2 and eth3 (which are
Host Ethernet Adapters).
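
The register dump in the Oops can be cross-checked by hand. The sketch below is an annotation, not
part of the original report: it decodes the instruction word marked <...> in the instruction dump
and compares the computed address with DAR. It assumes the kernel was built with slab poisoning
enabled, which the repeated 0x6b byte (the kernel's POISON_FREE value) suggests; the helper name
decode_dform is made up for illustration.

```python
# Values copied from the Oops in the description.
DAR = 0x6B6B6B6B6B6BCB33      # faulting data address
GPR09 = 0x6B6B6B6B6B6B6B6B    # contents of r9 at the time of the fault
INSN = 0x80095FC8             # word marked <...> in the instruction dump

POISON_FREE = 0x6B            # byte written over freed slab objects when poisoning is on


def decode_dform(insn):
    """Split a PowerPC D-form instruction into (opcode, rt, ra, displacement)."""
    opcode = insn >> 26
    rt = (insn >> 21) & 0x1F
    ra = (insn >> 16) & 0x1F
    disp = insn & 0xFFFF
    if disp & 0x8000:         # the displacement is a signed 16-bit field
        disp -= 0x10000
    return opcode, rt, ra, disp


opcode, rt, ra, disp = decode_dform(INSN)
is_poison = all(b == POISON_FREE for b in GPR09.to_bytes(8, "big"))

print(f"opcode={opcode} rt=r{rt} ra=r{ra} disp={disp:#x}")   # opcode 32 is lwz
print(f"base register is all poison bytes: {is_poison}")
print(f"base + disp == DAR: {GPR09 + disp == DAR}")
```

If the checks hold, the faulting load used a pointer read out of already-freed memory as its base
address, which points to a use-after-free rather than a plain NULL dereference.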
=Comment: #5=================================================
Jan-Bernd Themann <THEMANN@de.ibm.com> - 
Hi Edjunior,

Hannes, a member of our eHEA team, will have a look at that machine once he gets access. 

Regards,
Jan-Bernd
=Comment: #6=================================================
Hannes Hering <HERING2@de.ibm.com> - 
Hello,

I had a look at the machine. We found that we missed some cleanup in the error path. The problem
is that the adapter data structure is added to a list, and in the error case this data structure is
freed but not removed from the list. A patch is needed to remove the list entry correctly
before freeing.
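
The ordering problem described above can be illustrated with a toy model. This is an annotation in
Python, not the driver code and not the actual patch: Adapter, adapter_list, probe and
stale_entries are hypothetical names, and "freeing" is modelled by overwriting a field with a
poison value.

```python
POISON = 0x6B6B6B6B6B6B6B6B   # stand-in for the slab poison pattern


class Adapter:
    """Hypothetical stand-in for the driver's per-adapter structure."""

    def __init__(self, name):
        self.name = name
        self.handle = 1       # any valid value

    def free(self):
        # Model a poisoning free: the contents become garbage.
        self.handle = POISON


adapter_list = []             # stand-in for the global adapter list


def probe(name, fail, unlink_on_error):
    """Add an adapter to the list; on failure, clean up in the chosen way."""
    adapter = Adapter(name)
    adapter_list.append(adapter)
    if fail:
        if unlink_on_error:
            adapter_list.remove(adapter)   # fixed path: unlink first
        adapter.free()                     # then free
        return None
    return adapter


def stale_entries():
    """Names of list entries whose contents were already freed."""
    return [a.name for a in adapter_list if a.handle == POISON]


probe("hea0", fail=True, unlink_on_error=False)
print("buggy error path:", stale_entries())

adapter_list.clear()
probe("hea0", fail=True, unlink_on_error=True)
print("fixed error path:", stale_entries())
```

In the buggy variant a later walk over the list still reaches the freed entry, which is the
situation the Oops in .ehea_update_firmware_handles is consistent with.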

This showed up because the EHEA_0096 driver version contained in the current Fedora 11 Alpha
kernel uses kzalloc to allocate page-aligned memory blocks, while the kernel uses a configuration
which allows kzalloc to return addresses that are not page aligned. However, this bug has already
been fixed; the patch was sent with other patches to the kernel mailing list and applied by the
maintainer.

http://lkml.org/lkml/2009/1/21/191
        applied: http://lkml.org/lkml/2009/1/21/319
http://lkml.org/lkml/2009/1/21/192
        applied: http://lkml.org/lkml/2009/1/21/320
http://lkml.org/lkml/2009/1/21/190
        applied: http://lkml.org/lkml/2009/1/21/321

We currently expect these patches to make their way into the final 2.6.29 kernel. However, the
patch that fixes the error path might not be included in the final 2.6.29 kernel.

Regards

Hannes
=Comment: #9=================================================
Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> - 
I have talked to Hannes and he mentioned another patch that is also needed to fix this issue
(it has already been accepted):

http://lkml.org/lkml/2009/2/11/131
applied: http://lkml.org/lkml/2009/2/11/379

He also mentioned that he tested these patches using vanilla kernels 2.6.28.4 and 2.6.29-rc2, and
both worked fine.

=Comment: #11=================================================
Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> - 
Thanks Pavan for pointing this out.

I was able to rebuild the Fedora 11 kernel from the src rpm (using rpmbuild) with the 4
aforementioned patches (btw, all of them applied cleanly without modifications, in the order
they are mentioned here).

This new kernel no longer shows the backtrace on boot, but now it returns these error messages when
configuring the net interface:

[root@mp6lp1 ~]# ifconfig eth2 9.126.89.219 netmask 255.255.255.0

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.29-0.66.rc3.bz51514.fc11.ppc64 #1
-------------------------------------------------------
ifconfig/2421 is trying to acquire lock:
 (&ehea_fw_handles.lock){--..}, at: [<d00000000051da18>] .ehea_up+0x6c/0x700 [ehea]

but task is already holding lock:
 (&port->port_lock){--..}, at: [<d00000000051e0e8>] .ehea_open+0x3c/0x118 [ehea]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&port->port_lock){--..}:
       [<c0000000000c292c>] .lock_acquire+0x54/0x80
       [<c000000000575ff8>] .mutex_lock_nested+0x1ac/0x49c
       [<d00000000051e0e8>] .ehea_open+0x3c/0x118 [ehea]
       [<c0000000004bcf5c>] .dev_open+0xe8/0x168
       [<c0000000004bc64c>] .dev_change_flags+0x10c/0x214
       [<c00000000052a530>] .devinet_ioctl+0x2c8/0x794
       [<c00000000052c184>] .inet_ioctl+0xd4/0x128
       [<c0000000004a6034>] .sock_ioctl+0x310/0x368
       [<c00000000016a2a8>] .vfs_ioctl+0x54/0xec
       [<c00000000016a9ec>] .do_vfs_ioctl+0x6ac/0x718
       [<c00000000016aad0>] .SyS_ioctl+0x78/0xbc
       [<c0000000001a534c>] .dev_ifsioc+0x1c8/0x424
       [<c0000000001a670c>] .compat_sys_ioctl+0x3ec/0x470
       [<c0000000000085f0>] syscall_exit+0x0/0x40

-> #1 (rtnl_mutex){--..}:
       [<c0000000000c292c>] .lock_acquire+0x54/0x80
       [<c000000000575ff8>] .mutex_lock_nested+0x1ac/0x49c
       [<c0000000004c8ad8>] .rtnl_lock+0x20/0x38
       [<c0000000004bdb28>] .register_netdev+0x1c/0x80
       [<d00000000051cc9c>] .ehea_setup_single_port+0x270/0x394 [ehea]
       [<d00000000052313c>] .ehea_probe_adapter+0x308/0x400 [ehea]
       [<c0000000004a4124>] .of_platform_device_probe+0x80/0xb8
       [<c000000000380d24>] .driver_probe_device+0x114/0x1fc
       [<c000000000380ea0>] .__driver_attach+0x94/0xd8
       [<c0000000003802e8>] .bus_for_each_dev+0x7c/0xdc
       [<c000000000380ab0>] .driver_attach+0x28/0x40
       [<c00000000037f9ac>] .bus_add_driver+0xcc/0x280
       [<c0000000003811a4>] .driver_register+0xe4/0x1bc
       [<c0000000004a3fdc>] .of_register_driver+0x44/0x58
       [<c000000000024c70>] .ibmebus_register_driver+0x30/0x4c
       [<d000000000523408>] .ehea_module_init+0x1d4/0x23a4 [ehea]
       [<c000000000009434>] .do_one_initcall+0x9c/0x1dc
       [<c0000000000ce684>] .SyS_init_module+0xd8/0x234
       [<c0000000000085f0>] syscall_exit+0x0/0x40

-> #0 (&ehea_fw_handles.lock){--..}:
       [<c0000000000c292c>] .lock_acquire+0x54/0x80
       [<c000000000575ff8>] .mutex_lock_nested+0x1ac/0x49c
       [<d00000000051da18>] .ehea_up+0x6c/0x700 [ehea]
       [<d00000000051e110>] .ehea_open+0x64/0x118 [ehea]
       [<c0000000004bcf5c>] .dev_open+0xe8/0x168
       [<c0000000004bc64c>] .dev_change_flags+0x10c/0x214
       [<c00000000052a530>] .devinet_ioctl+0x2c8/0x794
       [<c00000000052c184>] .inet_ioctl+0xd4/0x128
       [<c0000000004a6034>] .sock_ioctl+0x310/0x368
       [<c00000000016a2a8>] .vfs_ioctl+0x54/0xec
       [<c00000000016a9ec>] .do_vfs_ioctl+0x6ac/0x718
       [<c00000000016aad0>] .SyS_ioctl+0x78/0xbc
       [<c0000000001a534c>] .dev_ifsioc+0x1c8/0x424
       [<c0000000001a670c>] .compat_sys_ioctl+0x3ec/0x470
       [<c0000000000085f0>] syscall_exit+0x0/0x40

other info that might help us debug this:

2 locks held by ifconfig/2421:
 #0:  (rtnl_mutex){--..}, at: [<c0000000004c8ad8>] .rtnl_lock+0x20/0x38
 #1:  (&port->port_lock){--..}, at: [<d00000000051e0e8>] .ehea_open+0x3c/0x118 [ehea]

stack backtrace:
Call Trace:
[c000000093493110] [c0000000000117d8] .show_stack+0x6c/0x16c (unreliable)
[c0000000934931c0] [c0000000000c0cb0] .print_circular_bug_tail+0xd8/0xfc
[c000000093493290] [c0000000000c2178] .__lock_acquire+0x1080/0x17e0
[c000000093493390] [c0000000000c292c] .lock_acquire+0x54/0x80
[c000000093493420] [c000000000575ff8] .mutex_lock_nested+0x1ac/0x49c
[c000000093493530] [d00000000051da18] .ehea_up+0x6c/0x700 [ehea]
[c000000093493640] [d00000000051e110] .ehea_open+0x64/0x118 [ehea]
[c0000000934936e0] [c0000000004bcf5c] .dev_open+0xe8/0x168
[c000000093493770] [c0000000004bc64c] .dev_change_flags+0x10c/0x214
[c000000093493810] [c00000000052a530] .devinet_ioctl+0x2c8/0x794
[c000000093493920] [c00000000052c184] .inet_ioctl+0xd4/0x128
[c000000093493990] [c0000000004a6034] .sock_ioctl+0x310/0x368
[c000000093493a30] [c00000000016a2a8] .vfs_ioctl+0x54/0xec
[c000000093493ac0] [c00000000016a9ec] .do_vfs_ioctl+0x6ac/0x718
[c000000093493ba0] [c00000000016aad0] .SyS_ioctl+0x78/0xbc
[c000000093493c50] [c0000000001a534c] .dev_ifsioc+0x1c8/0x424
[c000000093493d50] [c0000000001a670c] .compat_sys_ioctl+0x3ec/0x470
[c000000093493e30] [c0000000000085f0] syscall_exit+0x0/0x40
ehea: eth2: Physical port up
ehea: External switch port is backup port
eth2: no IPv6 routers present

=Comment: #12=================================================
Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> - 
Also, it might be worth mentioning that even with these error messages from the previous comment,
the net interface seems to be working OK with the patched kernel.

Hannes,
could you please take a look into this again?
Thank you.
=Comment: #13=================================================
Hannes Hering <HERING2@de.ibm.com> - 
(In reply to comment #11)
Hello,

Be aware that this "error message" is just an info message, as shown by the line at the top of the message:
> [ INFO: possible circular locking dependency detected ]
This means that a circular locking dependency is possible, not that one actually occurred. Apart
from this message appearing in the log, everything works fine. However, we need to investigate
what triggers the message.
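
Why the validator reports a possible cycle without any deadlock having happened can be sketched
with a small lock-order graph. This is an annotation, not lockdep itself: the function names are
made up, and the three edges are one reading of the dependency chain printed in comment #11.

```python
# An edge "A -> B" means "B was acquired while A was held".
order = {}


def reaches(src, dst, seen=None):
    """True if dst can be reached from src along the recorded edges."""
    if seen is None:
        seen = set()
    if src == dst:
        return True
    seen.add(src)
    return any(reaches(n, dst, seen) for n in order.get(src, ()) if n not in seen)


def acquire(held, new):
    """Record held -> new and report whether that edge closes a cycle."""
    closes_cycle = reaches(new, held)
    order.setdefault(held, set()).add(new)
    return closes_cycle


# Module load: rtnl_mutex is taken via register_netdev in the probe path.
print(acquire("ehea_fw_handles.lock", "rtnl_mutex"))
# ifconfig: dev_open runs under rtnl_mutex, and ehea_open takes port_lock.
print(acquire("rtnl_mutex", "port_lock"))
# ehea_open then calls ehea_up, which takes ehea_fw_handles.lock.
print(acquire("port_lock", "ehea_fw_handles.lock"))
```

Only the last call reports a cycle. The check looks at orderings seen so far on any code path, so
it can fire even though the two paths never ran concurrently.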

We're already working on this and the issue is tracked in another bug.

Regards

Hannes

=================================================
Hello Red Hat,

We face this issue with eHEA when installing Fedora 11 Alpha on a JS12 blade. According to Hannes
Hering, this is a problem that occurs with a specific kernel configuration that F11 uses by default.
Hannes was able to create 4 patches that fix this issue.
After applying the patches, ifconfig works fine but generates that info message. The eHEA developers
are already looking into this and are tracking it in a separate internal bug report.

Thanks.
Comment 1 IBM Bug Proxy 2009-02-27 10:51:26 EST
Created attachment 333498 [details]
reboot.log
Comment 2 IBM Bug Proxy 2009-03-11 12:21:25 EDT
This issue has meanwhile been fixed in mainline kernel and a backport patch for RH5.4 - which has been accepted by RH - has been provided as well. Thus this bug should be closed.

Regards
Thomas
Comment 3 IBM Bug Proxy 2009-03-17 07:50:40 EDT
(In reply to comment #20)
> This issue has meanwhile been fixed in mainline kernel and a backport patch for
> RH5.4 - which has been accepted by RH - has been provided as well. Thus this
> bug should be closed.
>
> Regards
> Thomas
>

Can you please share the commit id of the patch in the mainline?

Thanks
Pavan
Comment 4 IBM Bug Proxy 2009-03-18 03:10:44 EDT
> Can you please share the commit id of the patch in the mainline?

http://ozlabs.org/pipermail/linuxppc-dev/2009-March/069298.html
Comment 5 IBM Bug Proxy 2009-04-02 09:20:50 EDT
------- Comment From emachado@linux.vnet.ibm.com 2009-04-02 09:11 EDT-------
Hello Red Hat,

We have hit a similar Oops message when starting the Fedora 11 Beta installation from CD-ROM on a JS12:

=======================
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA pSeries
Modules linked in: ehea(+) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
ext2 ext4 jbd2 crc16 squashfs nfs lockd nfs_acl auth_rpcgss sunrpc vfat fat
cramfs
NIP: d000000000522874 LR: d00000000052d060 CTR: 0000000000000000
REGS: c000000099c57030 TRAP: 0300   Not tainted
(2.6.29-0.258.2.3.rc8.git2.fc11.ppc64)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 24422448  XER: 00000001
DAR: 6b6b6b6b6b6bcb33, DSISR: 0000000040000000
TASK = c0000000999f2670[1018] 'modprobe' THREAD: c000000099c54000 CPU: 1
GPR00: 0000000000000000 c000000099c572b0 d000000000540da8 0000000000000001
GPR04: c0000000999f3178 d000000000537630 6b6b6b6b6b6b6b6b 0000000000000000
GPR08: 0000000000000000 c000000099a5c490 6b6b6b6b6b6b6b6b 0000000000000008
GPR12: 0000000044422444 c000000001064600 0000000010029590 0000000000000000
GPR16: 0000000000000000 0000000000000003 000000001002a3ec 0000000000000000
GPR20: 0000000010005e20 0000000000000000 0000000000000000 000000001002a050
GPR24: 000000001003cb28 0000000000000000 0000000000000000 fffffffffffffffb
GPR28: c00000009e090cd8 d0000000005374c0 d00000000053fe50 c000000099c572b0
NIP [d000000000522874] .ehea_update_firmware_handles+0x88/0x354 [ehea]
LR [d00000000052d060] .ehea_probe_adapter+0x3a4/0x3f0 [ehea]
Call Trace:
[c000000099c572b0] [c000000099c57380] 0xc000000099c57380 (unreliable)
[c000000099c57380] [d00000000052d060] .ehea_probe_adapter+0x3a4/0x3f0 [ehea]
[c000000099c57440] [c00000000051d024] .of_platform_device_probe+0x8c/0xc8
[c000000099c574e0] [c0000000003dc148] .driver_probe_device+0x120/0x1e0
[c000000099c57580] [c0000000003dc2b4] .__driver_attach+0xac/0xf4
[c000000099c57620] [c0000000003db550] .bus_for_each_dev+0x9c/0x104
[c000000099c576e0] [c0000000003dbe98] .driver_attach+0x40/0x60
[c000000099c57770] [c0000000003dab38] .bus_add_driver+0xdc/0x28c
[c000000099c57820] [c0000000003dc6e8] .driver_register+0xd4/0x1b0
[c000000099c578d0] [c00000000051ce6c] .of_register_driver+0x60/0x80
[c000000099c57960] [c000000000029634] .ibmebus_register_driver+0x40/0x60
[c000000099c579f0] [d00000000052d288] .ehea_module_init+0x1dc/0x2474 [ehea]
[c000000099c57a80] [c000000000009478] .do_one_initcall+0xac/0x1f0
[c000000099c57d70] [c0000000000ebeec] .SyS_init_module+0xf0/0x258
[c000000099c57e30] [c0000000000085f0] syscall_exit+0x0/0x40
Instruction dump:
38e00000 38bd0170 e9290170 3929ff08 48000058 39600000 7d495a14 396b0008
2fab0080 e94a0010 2f2a0000 419a002c <808a5fc8> 2f040001 409a0020 808a5f98
---[ end trace 555014e3ac5b23ec ]---
====================

Please confirm whether the aforementioned patches were applied in F11 Beta or not.

Thanks.
Comment 6 IBM Bug Proxy 2009-04-14 14:20:42 EDT
------- Comment From emachado@linux.vnet.ibm.com 2009-04-14 14:16 EDT-------
*** Bug 52532 has been marked as a duplicate of this bug. ***
Comment 7 IBM Bug Proxy 2009-11-24 02:00:47 EST
------- Comment From pavan.naregundi@in.ibm.com 2009-11-24 01:57 EDT-------
Tested with Fedora 12 GA; I am no longer able to see this issue.
