Bug 135266 - Panics while backing up LVM snapshots
Summary: Panics while backing up LVM snapshots
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Heinz Mauelshagen
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 132991
Reported: 2004-10-11 15:16 UTC by Wendy Cheng
Modified: 2010-10-22 02:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-05-18 13:28:14 UTC
Target Upstream Version:
Embargoed:


Attachments
workaround patch (460 bytes, patch)
2004-12-09 15:12 UTC, Wendy Cheng


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2005:294 0 normal SHIPPED_LIVE Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 5 2005-05-18 04:00:00 UTC

Description Wendy Cheng 2004-10-11 15:16:02 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1)
Gecko/20020830

Description of problem:
From IT#49956:

The customer is seeing intermittent kernel panics on their production
web course teaching system. Every panic has occurred while backing up
an LVM snapshot volume to an LTO tape drive, although not every backup
produces a panic. Four file systems are dumped, all of them on LVM
volumes, but the panic has only occurred while dumping the one file
system that is a snapshot volume.

The panic trace:

EIP is at lvm_find_exception_table [lvm-mod] 0x7b (2.4.21-15.0.3.ELsmp/i686)
eax: f8a52fe8   ebx: f909c220   ecx: f8a7ac60   edx: 00000000
esi: 02dd2280   edi: 00000009   ebp: 00000801   esp: d0581d60
ds: 0068   es: 0068   ss: 0068
Process dump (pid: 16383, stackpage=d0581000)
Stack: 0001ffff f6b85e18 f52eca00 00000008 00000000 d0581de6 f89441fe 00000801
       02dd2280 f52eca00 02dd2280 00000000 02dd0008 02dd2280 f52eca00 00000001
       f894065d d0581de6 d0581de0 02dd2280 f52eca00 c0435280 00000000 c0426200
Call Trace:   [<f89441fe>] lvm_snapshot_remap_block [lvm-mod] 0x7e (0xd0581d78)
[<f894065d>] lvm_map [lvm-mod] 0x20d (0xd0581da0)
[<f8940a57>] lvm_make_request_fn [lvm-mod] 0x17 (0xd0581df8)
[<c01cd29a>] generic_make_request [kernel] 0xea (0xd0581e04)
[<c01cd339>] submit_bh_rsector [kernel] 0x49 (0xd0581e2c)
[<c01642d6>] block_read_full_page [kernel] 0x266 (0xd0581e48)
[<c01459da>] add_to_page_cache_unique [kernel] 0x5a (0xd0581e98)
[<c0145c31>] page_cache_read [kernel] 0xe1 (0xd0581eac)
[<c0168c40>] blkdev_get_block [kernel] 0x0 (0xd0581eb4)
[<c01465f7>] generic_file_readahead [kernel] 0xd7 (0xd0581ed4)
[<c0146b89>] do_generic_file_read [kernel] 0x489 (0xd0581ef0)
[<c0147435>] generic_file_new_read [kernel] 0xc5 (0xd0581f30)
[<c0147270>] file_read_actor [kernel] 0x0 (0xd0581f40)
[<c021e926>] sock_read [kernel] 0x96 (0xd0581f50)
[<c014755f>] generic_file_read [kernel] 0x2f (0xd0581f7c)
[<c01607b7>] sys_read [kernel] 0x97 (0xd0581f94)

Version-Release number of selected component (if applicable):
2.4.21-15.0.3.ELsmp

How reproducible:
Didn't try


Additional info:

Comment 1 Wendy Cheng 2004-10-11 15:19:06 UTC
The system died while running the "dump" command, in
lvm_find_exception_table, while removing a list entry: the "next"
field of one hash_table entry was 0.

lvm_find_exception_table(kdev_t org_dev, unsigned long org_start, lv_t * lv)
{
        struct list_head * hash_table = lv->lv_snapshot_hash_table, * next;
        unsigned long mask = lv->lv_snapshot_hash_mask;
        int chunk_size = lv->lv_chunk_size;
        lv_block_exception_t * ret;
        int i = 0;

        if (!hash_table)
                BUG();
        hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
        ret = NULL;
        for (next = hash_table->next; next != hash_table; next = next->next)
        {
                lv_block_exception_t * exception;

                exception = list_entry(next, lv_block_exception_t, hash);
                if (exception->rsector_org == org_start &&
                    exception->rdev_org == org_dev)
                {
                        if (i)
                        {
                                /* fun, isn't it? :) */
#ifdef  list_move
                                list_move(next, hash_table);
#else
                                list_del(next);
                                list_add(next, hash_table);


static inline void __list_del(struct list_head *prev, struct list_head *next)
{
   109e:       8b 41 04                mov    0x4(%ecx),%eax
   10a1:       8b 11                   mov    (%ecx),%edx
       next->prev = prev;
       prev->next = next;
   10a3:       89 10                   mov    %edx,(%eax)
}

/**
* list_del - deletes entry from list.
* @entry: the element to delete from the list.
* Note: list_empty on entry does not return true after this, the entry is in an undefined state.
*/
static inline void list_del(struct list_head *entry)
{
       __list_del(entry->prev, entry->next);
       entry->next = (void *) 0;
   10a5:       c7 01 00 00 00 00       movl   $0x0,(%ecx)
   10ab:       89 42 04                mov    %eax,0x4(%edx)    <----- edx is 0
   10ae:       8b 03                   mov    (%ebx),%eax
   10b0:       89 48 04                mov    %ecx,0x4(%eax)
   10b3:       89 01                   mov    %eax,(%ecx)
   10b5:       89 59 04                mov    %ebx,0x4(%ecx)
   10b8:       89 0b                   mov    %ecx,(%ebx)
   10ba:       89 c8                   mov    %ecx,%eax
   10bc:       eb ce                   jmp    108c <lvm_find_exception_table+0x5c>
   10be:       0f 0b                   ud2a
   10c0:       75 00                   jne    10c2 <lvm_find_exception_table+0x92>
   10c2:       0d 00 00 00 eb          or     $0xeb000000,%eax
   10c7:       97                      xchg   %eax,%edi
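
To make the failure in the listing above easier to follow, here is a minimal user-space sketch in C. It is not the kernel code. It reuses the store order visible in the disassembly (prev->next, then entry->next = 0, then next->prev) and replays, on a single thread, one interleaving that would leave edx at 0: a second context still holds a pointer to an entry that a first context has unlinked but not yet re-added. That two lookups ran concurrently on the same hash chain is an assumption; this report does not confirm the root cause. list_del_would_fault() and replay_race() are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head (front of the chain). */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical helper: true when list_del(entry) would write through NULL. */
static int list_del_would_fault(const struct list_head *entry)
{
    return entry->next == NULL;
}

/* Same store order as the disassembly: prev->next, entry->next = 0, next->prev. */
static void list_del(struct list_head *entry)
{
    struct list_head *prev = entry->prev;
    struct list_head *next = entry->next;

    assert(next != NULL);   /* the kernel has no such check and faults instead */
    prev->next = next;
    entry->next = NULL;
    next->prev = prev;
}

/*
 * Replays one assumed interleaving on a single thread.
 * Returns 1 if the second context's list_del() would fault.
 */
static int replay_race(void)
{
    struct list_head bucket, a, b;
    struct list_head *seen_by_second;
    int fault;

    list_init(&bucket);
    list_add(&a, &bucket);
    list_add(&b, &bucket);          /* chain: bucket -> b -> a */

    seen_by_second = &a;            /* second context walked to 'a' */

    list_del(&a);                   /* first context unlinks: a.next is now NULL */
    fault = list_del_would_fault(seen_by_second);
    list_add(&a, &bucket);          /* first context finishes its move to front */
    return fault;
}
```

replay_race() returns 1 in this sketch: the second list_del() would load a NULL next pointer and fault on the next->prev store, which corresponds to the marked instruction at 10ab.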

Comment 2 Wendy Cheng 2004-10-11 15:25:22 UTC
1. The vmcore is located at (log in with the anonymous ftp account):
   ftp://enterprise.redhat.com/incoming/vmcore-361320.gz
2. Found an identical issue in:
   http://www.spinics.net/lists/lvm/msg11750.html
3. Just built a test kernel with option #1, as discussed in the
   link above, for the customer to test as a workaround.





Comment 10 Norm Murray 2004-12-22 05:58:18 UTC
Adding alias to IT ticket of another customer having the same problem. 

Comment 51 Ernie Petrides 2005-01-25 23:40:22 UTC
A fix for this problem has just been committed to the RHEL3 U5
patch pool this afternoon (in kernel version 2.4.21-27.9.EL).


Comment 53 Heinz Mauelshagen 2005-04-04 12:53:45 UTC
*** Bug 152959 has been marked as a duplicate of this bug. ***

Comment 54 Tim Powers 2005-05-18 13:28:14 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html


