Bug 135266
| Summary: | Panics while backing up LVM snapshots | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 3 | Reporter: | Wendy Cheng <nobody+wcheng> | ||||
| Component: | kernel | Assignee: | Heinz Mauelshagen <heinzm> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 3.0 | CC: | 157070.alewis, kanderso, kmori, peterm, petrides, riel, sammy, tao | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2005-05-18 13:28:14 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 132991 | ||||||
| Attachments: |
The system panicked while running the "dump" command. The fault is in
lvm_find_exception_table: during a list entry removal, the "next" field of
one hash_table entry was found to be 0.
lvm_find_exception_table(kdev_t org_dev, unsigned long org_start, lv_t * lv)
{
	struct list_head * hash_table = lv->lv_snapshot_hash_table, * next;
	unsigned long mask = lv->lv_snapshot_hash_mask;
	int chunk_size = lv->lv_chunk_size;
	lv_block_exception_t * ret;
	int i = 0;

	if (!hash_table)
		BUG();
	hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
	ret = NULL;
	for (next = hash_table->next; next != hash_table; next = next->next)
	{
		lv_block_exception_t * exception;

		exception = list_entry(next, lv_block_exception_t, hash);
		if (exception->rsector_org == org_start &&
		    exception->rdev_org == org_dev)
		{
			if (i)
			{
				/* fun, isn't it? :) */
#ifdef list_move
				list_move(next, hash_table);
#else
				list_del(next);
				list_add(next, hash_table);
static inline void __list_del(struct list_head *prev, struct list_head *next)
{
    109e:  8b 41 04              mov    0x4(%ecx),%eax
    10a1:  8b 11                 mov    (%ecx),%edx
	next->prev = prev;
	prev->next = next;
    10a3:  89 10                 mov    %edx,(%eax)
}
/**
 * list_del - deletes entry from list.
 * @entry: the element to delete from the list.
 * Note: list_empty on entry does not return true after this, the entry is in an undefined state.
 */
static inline void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = (void *) 0;
    10a5:  c7 01 00 00 00 00     movl   $0x0,(%ecx)
    10ab:  89 42 04              mov    %eax,0x4(%edx)      <----- edx is 0
    10ae:  8b 03                 mov    (%ebx),%eax
    10b0:  89 48 04              mov    %ecx,0x4(%eax)
    10b3:  89 01                 mov    %eax,(%ecx)
    10b5:  89 59 04              mov    %ebx,0x4(%ecx)
    10b8:  89 0b                 mov    %ecx,(%ebx)
    10ba:  89 c8                 mov    %ecx,%eax
    10bc:  eb ce                 jmp    108c <lvm_find_exception_table+0x5c>
    10be:  0f 0b                 ud2a
    10c0:  75 00                 jne    10c2 <lvm_find_exception_table+0x92>
    10c2:  0d 00 00 00 eb        or     $0xeb000000,%eax
    10c7:  97                    xchg   %eax,%edi
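
The listing above shows list_del clearing entry->next after unlinking, and the faulting store writes through a next pointer that is 0. The following is a minimal user-space sketch of that behaviour, not the kernel code itself. It assumes, without confirmation from this report, that the entry had already been unlinked once before the failing list_del ran. The names init_list_head, list_move_to_front_checked and demo_second_unlink_is_refused are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty circular list points at itself. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head (front of the chain). */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Same shape as the list_del quoted in the report: unlink, then clear next. */
static void list_del(struct list_head *entry)
{
    struct list_head *prev = entry->prev;
    struct list_head *next = entry->next;

    next->prev = prev;      /* writes through NULL if entry was already unlinked */
    prev->next = next;
    entry->next = NULL;
}

/*
 * Hypothetical guarded move-to-front (an assumption for illustration, not
 * the fix that was shipped): refuse to touch an entry whose next pointer
 * has already been cleared by an earlier list_del.
 */
static int list_move_to_front_checked(struct list_head *entry,
                                      struct list_head *head)
{
    if (entry->next == NULL)
        return -1;
    list_del(entry);
    list_add(entry, head);
    return 0;
}

/* Unlink an entry once, then attempt the move-to-front a second time. */
static int demo_second_unlink_is_refused(void)
{
    struct list_head head, a, b;

    init_list_head(&head);
    list_add(&a, &head);
    list_add(&b, &head);            /* chain: head -> b -> a */
    list_del(&a);                   /* first unlink clears a.next */
    assert(a.next == NULL);
    return list_move_to_front_checked(&a, &head);
}
```

Calling list_del directly on the already-unlinked entry instead of the checked helper would store through a null next pointer, which is the same store that faults at 10ab in the disassembly.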
1. The vmcore is located at (logged in as the anonymous FTP account): ftp://enterprise.redhat.com/incoming/vmcore-361320.gz
2. Found an identical issue in: http://www.spinics.net/lists/lvm/msg11750.html
3. Just freshly built a test kernel with option #1, as discussed in the previous link, for the customer to test (as a workaround).

Adding alias to IT ticket of another customer having the same problem.

A fix for this problem has just been committed to the RHEL3 U5 patch pool this afternoon (in kernel version 2.4.21-27.9.EL).

*** Bug 152959 has been marked as a duplicate of this bug. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020830

Description of problem:
From IT#49956: The customer is seeing an intermittent kernel panic on their production web course teaching system. Each panic occurs while they are backing up an LVM snapshot volume to an LTO tape drive. Not all backups produce a panic. Four file systems are dumped, but the panic has only occurred while dumping the one file system that is a snapshot volume. All 4 volumes being backed up are LVM volumes.

The panic route:
EIP is at lvm_find_exception_table [lvm-mod] 0x7b (2.4.21-15.0.3.ELsmp/i686)
eax: f8a52fe8   ebx: f909c220   ecx: f8a7ac60   edx: 00000000
esi: 02dd2280   edi: 00000009   ebp: 00000801   esp: d0581d60
ds: 0068   es: 0068   ss: 0068
Process dump (pid: 16383, stackpage=d0581000)
Stack: 0001ffff f6b85e18 f52eca00 00000008 00000000 d0581de6 f89441fe 00000801
       02dd2280 f52eca00 02dd2280 00000000 02dd0008 02dd2280 f52eca00 00000001
       f894065d d0581de6 d0581de0 02dd2280 f52eca00 c0435280 00000000 c0426200
Call Trace:
[<f89441fe>] lvm_snapshot_remap_block [lvm-mod] 0x7e (0xd0581d78)
[<f894065d>] lvm_map [lvm-mod] 0x20d (0xd0581da0)
[<f8940a57>] lvm_make_request_fn [lvm-mod] 0x17 (0xd0581df8)
[<c01cd29a>] generic_make_request [kernel] 0xea (0xd0581e04)
[<c01cd339>] submit_bh_rsector [kernel] 0x49 (0xd0581e2c)
[<c01642d6>] block_read_full_page [kernel] 0x266 (0xd0581e48)
[<c01459da>] add_to_page_cache_unique [kernel] 0x5a (0xd0581e98)
[<c0145c31>] page_cache_read [kernel] 0xe1 (0xd0581eac)
[<c0168c40>] blkdev_get_block [kernel] 0x0 (0xd0581eb4)
[<c01465f7>] generic_file_readahead [kernel] 0xd7 (0xd0581ed4)
[<c0146b89>] do_generic_file_read [kernel] 0x489 (0xd0581ef0)
[<c0147435>] generic_file_new_read [kernel] 0xc5 (0xd0581f30)
[<c0147270>] file_read_actor [kernel] 0x0 (0xd0581f40)
[<c021e926>] sock_read [kernel] 0x96 (0xd0581f50)
[<c014755f>] generic_file_read [kernel] 0x2f (0xd0581f7c)
[<c01607b7>] sys_read [kernel] 0x97 (0xd0581f94)

Version-Release number of selected component (if applicable):
2.4.21-15.0.3.ELsmp

How reproducible:
Didn't try

Additional info:
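
The EIP offset in the oops can be tied back to the annotated disassembly earlier in this report with a little arithmetic. The sketch below uses only numbers that appear in the report: the two symbol-relative labels in the listing and the 0x7b offset from the EIP line. It assumes the disassembled object and the running lvm-mod come from the same build, so that offsets match. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Labels taken from the disassembly: "jmp 108c <lvm_find_exception_table+0x5c>"
 * and "jne 10c2 <lvm_find_exception_table+0x92>". */
#define LABEL_A_ADDR    0x108cu
#define LABEL_A_OFFSET  0x5cu
#define LABEL_B_ADDR    0x10c2u
#define LABEL_B_OFFSET  0x92u

/* From the oops: "EIP is at lvm_find_exception_table [lvm-mod] 0x7b". */
#define OOPS_EIP_OFFSET 0x7bu

/* Start of lvm_find_exception_table within the disassembled object. */
static uint32_t function_start(void)
{
    return LABEL_A_ADDR - LABEL_A_OFFSET;
}

/* Address in the listing that corresponds to the faulting EIP. */
static uint32_t faulting_address(void)
{
    return function_start() + OOPS_EIP_OFFSET;
}

/* Sanity check: both labels in the listing imply the same function start. */
static void check_labels_agree(void)
{
    assert(LABEL_B_ADDR - LABEL_B_OFFSET == function_start());
}
```

Under that assumption, function_start() evaluates to 0x1030 and faulting_address() to 0x10ab, which is the `mov %eax,0x4(%edx)` instruction annotated "edx is 0" in the listing and agrees with `edx: 00000000` in the register dump.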