Bug 450155 - /sbin/dump causes a kernel panic in map_bio()
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.1
Hardware: i686 Linux
Priority: low Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Anton Arapov
QA Contact: Martin Jenner
Depends On:
Blocks:
Reported: 2008-06-05 12:02 EDT by Alf Clement
Modified: 2014-06-18 04:01 EDT
CC List: 2 users

Doc Type: Bug Fix
Last Closed: 2008-12-18 06:25:09 EST

Attachments: None

Description Alf Clement 2008-06-05 12:02:37 EDT
Description of problem:

This applies to kernels 2.6.18-92.el5 and 2.6.18-53.
Please assign to the kernel MD component.

I installed a Linux server and ran a backup using dump.
The kernel panicked in map_bio() with 2.6.18-53.1.4.el5.
After I upgraded to 2.6.18-92, the system seems to hang instead.
This always happens after some time, e.g. when 17% of the dump is done, sometimes earlier.

I do a level 0 dump (-b64) from a RAID 1 array mounted on /data to a
file on the root filesystem in /backup. The array device is /dev/mapper/nvidia_bcacfdbgp1.
The problem occurs in normal multiuser mode and also in single-user mode.

Steps to reproduce:
/sbin/dump -0u -b64 -f /backup/data.0 /data

in either single-user or multiuser mode.

Maybe it has something to do with the -b option? I created a few backups
before I introduced the -b64 option to improve the speed.
-b64 seems to panic every time. Right now I am starting over to test with -b16, but this
is slow...
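
The block-size experiments described above could be scripted roughly as follows. This is only a sketch: the function name try_blocksizes, the DUMP_BIN variable, and the log file name are made up for illustration, and -u is deliberately left out so that test runs do not update /etc/dumpdates. Each marker line is synced to disk before dump starts, so after a panic or hang the last "starting" line shows which -b value was running.

```shell
#!/bin/sh
# Sketch: repeat the level-0 dump with several -b values and log the outcome.
# DUMP_BIN and try_blocksizes are illustrative names, not part of dump itself.
DUMP_BIN="${DUMP_BIN:-/sbin/dump}"

# usage: try_blocksizes SOURCE_FS OUTPUT_DIR BLOCKSIZE...
try_blocksizes() {
    src="$1"
    out_dir="$2"
    shift 2
    for bs in "$@"; do
        # Write and sync the marker first so it survives a panic or hang.
        printf '%s\n' "$(date) starting -b$bs" >> "$out_dir/sweep.log"
        sync
        # -u is left out so test runs do not update /etc/dumpdates.
        if "$DUMP_BIN" -0 -b"$bs" -f "$out_dir/data.b$bs" "$src"; then
            rc=0
        else
            rc=$?
        fi
        if [ "$rc" -eq 0 ]; then
            printf '%s\n' "-b$bs: ok" >> "$out_dir/sweep.log"
        else
            printf '%s\n' "-b$bs: failed (exit $rc)" >> "$out_dir/sweep.log"
        fi
    done
}
```

With the setup from this report, it would be called as: try_blocksizes /data /backup 16 32 64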

EIP was pointing to map_bio().
Call stack, as far as I could read it off the console:
add_to_page_cache
__do_page_cache_readahead
dm_any_congested
blockable_page_cache_readahead
make_ahead_window
page_cache_readahead
do_generic_mapping_read
__generic_file_aio_read
file_read_actor
generic_file_read
auto_remove_wake_function
mutex_lock
block_llseek
vfs_read
sys_read
syscall_call
Comment 1 Anton Arapov 2008-06-10 05:26:59 EDT
Please try to reproduce this problem and attach the complete debug message you
get. That will give us the detailed information we need to work on it.
Comment 2 Alf Clement 2008-06-10 05:39:41 EDT
I have reconfigured the machine to get it running stably, so I cannot
reproduce it anymore. The output on the console was the same as I wrote.

It is an HP ProLiant ML115. I had / on a 160GB disk and two 500GB disks for the
RAID array mounted on /data. All filesystems are ext3.
So dump was reading from /data and storing data in /backup.
The used space in /data was about 40GB. / had enough space to store the
backup. I played with the block sizes, but also got problems at 32. 16 or lower
takes too much time to do the backup.

I hope you can reproduce it.
Comment 3 Anton Arapov 2008-06-10 06:22:44 EDT
It would be _very_ helpful to have the backtrace you mentioned, with the
addresses/offsets...

I'm trying to reproduce it but have had no luck so far...
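
One way such a backtrace might be captured in full is to send kernel messages to a second machine with the netconsole module instead of reading them off the screen. The sketch below only builds the module parameter string. The helper name netconsole_param and all ports, IP addresses, the interface name and the MAC address are placeholders, and whether netconsole is usable on the affected machine is an assumption.

```shell
#!/bin/sh
# Sketch: build the netconsole module parameter so oops output is sent by UDP
# to a second machine. All addresses used below are placeholders.

# usage: netconsole_param SRC_PORT SRC_IP DEV TGT_PORT TGT_IP TGT_MAC
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# On the machine that panics (as root), for example:
#   dmesg -n 8    # let all kernel messages reach the console
#   modprobe netconsole "$(netconsole_param 6665 192.168.0.10 eth0 6666 192.168.0.20 00:11:22:33:44:55)"
# On the receiving machine, run a UDP listener such as netcat on port 6666
# (the exact netcat flags differ between variants).
```

The text collected by the listener could then be attached to this bug as is.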
Comment 4 Anton Arapov 2008-06-10 08:06:42 EDT
I tried to reproduce it in many ways, even on the same configuration (RAID 1
using dm), and varied the block size. Every backup was successful.

No similar complaints were found on LKML or elsewhere on the internet. I need a
detailed backtrace for further investigation.

Putting the bug into the NEEDINFO state.
