Bug 247112 - expand logical volume fails: lvresize failed: failed to suspend {lvname}
Summary: expand logical volume fails: lvresize failed: failed to suspend {lvname}
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: system-config-lvm
Version: 7
Hardware: athlon
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jim Parsons
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2007-07-05 14:12 UTC by David Timms
Modified: 2008-06-17 01:47 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-06-17 01:47:57 UTC
Type: ---
Embargoed:


Attachments
dialog settings for lv resize. (28.69 KB, image/png) - 2007-07-05 14:12 UTC, David Timms
visual map of the vg/lv usage and space. (24.20 KB, image/png) - 2007-07-05 14:26 UTC, David Timms
var/log/messages events and fdisk layout. (4.05 KB, text/plain) - 2007-07-05 14:28 UTC, David Timms
gkrellm visual disk usage over 4x minutes during scan (77.02 KB, image/png) - 2007-07-12 22:50 UTC, David Timms
strace 30MB log bzipped to 851KB! (831.46 KB, application/x-bzip2) - 2007-07-12 22:55 UTC, David Timms
lvmdump -a -m (53.30 KB, application/octet-stream) - 2007-07-14 02:53 UTC, David Timms

Description David Timms 2007-07-05 14:12:09 UTC
Description of problem:
There are free extents available in my {vgstorage} volume group. Trying to
expand the one and only logical volume fails. No data loss occurs.

Version-Release number of selected component (if applicable):
kernel-2.6.21-1.3228.fc7
lvm2-2.02.24-1.fc7
system-config-lvm-1.1.1-1.0.fc7

How reproducible:
Keeps happening on this one machine; however, if I make only a small increase in
the number of extents for this LV, it may succeed.

Steps to Reproduce:
1. 4 physical disks.
2. Disk 1 has /|boot|swap.
3. Disks 2/3/4 each have 10 partitions {boot|extended|8 LVM partitions splitting
the disk into 8 almost equal parts}.
4. LVM-initialize a group of those partitions.
5. Add the partitions to a single VG.
6. Create an LV in the VG.
7. Add more partitions to the VG.
8. Expand the LV - OK.
9. Add more partitions to the VG.
10. Expand the LV {edit logical volume - use remaining, slider, or hand-typed value};
a rough CLI equivalent is sketched below.
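
For reference, roughly the equivalent CLI sequence for the steps above (a sketch
only: the partition names are illustrative placeholders rather than this machine's
actual layout, the initial extent count is made up, and only the final lvextend
line is taken from the error dialog below):

  pvcreate /dev/sdb5 /dev/sdb6               # step 4: initialize partitions as LVM physical volumes
  vgcreate vgstorage /dev/sdb5 /dev/sdb6     # step 5: one volume group from those PVs
  lvcreate -n lvhome -l 2500 vgstorage       # step 6: create the LV (extent count is illustrative)
  vgextend vgstorage /dev/sdc5 /dev/sdc6     # steps 7 and 9: add more partitions to the VG
  lvextend -l 27003 /dev/vgstorage/lvhome    # steps 8 and 10: grow the LV; this is the call that fails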
  
Actual results:
Waits about 8 seconds after clicking OK, then the resizing LV dialog with its
knight-rider {uninformative} progress bar is shown.
Error dialog:
lvresize command failed. Command attempted:
"/usr/sbin/lvextend -l 27003 /dev/vgstorage/lvhome" - System Error
Message: device-mapper: reload ioctl failed: Invalid argument
  Failed to suspend lvhome.
{OK}

Clicking OK closes the error and edit logical volume dialogs, and a
"Reloading LVM" dialog appears {this takes over 3 minutes to complete}.

Expected results:
Succeeds - lvhome now fills the whole vgstorage.

Additional info:
As I am learning about LVM capabilities through use of the system-config-lvm
utility, I am doing something which is probably not a normal use case {i.e.
splitting the physical disks into multiple partitions before adding them to a VG}.

During the process I also removed some PEs and added different PEs, expanded
the VG and LV size, and so forth, often just one 4GB extent at a time {total
size is a 99GB LV within a 106GB VG}.

New sizes I have tried {since last successful resize}:
current size is 25550, total extents in vg: 27004
27004
27003
26277
26000
25990
25800
25650
25600
Each test takes several minutes to reach the error dialog, and 3+ minutes more to
reload the LVM data.
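
To confirm the extent totals the GUI is working from, the standard lvm2 tools can
be used (a sketch, assuming the vgstorage/lvhome names and the numbers above):

  vgdisplay vgstorage               # "Total PE" should read 27004, with 1454 extents free
  lvdisplay /dev/vgstorage/lvhome   # "Current LE" should read 25550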

Comment 1 David Timms 2007-07-05 14:12:09 UTC
Created attachment 158591 [details]
dialog settings for lv resize.

Comment 2 David Timms 2007-07-05 14:26:51 UTC
Created attachment 158593 [details]
visual map of the vg/lv usage and space.

Comment 3 David Timms 2007-07-05 14:28:38 UTC
Created attachment 158594 [details]
var/log/messages events and fdisk layout.

Comment 4 David Timms 2007-07-12 22:50:43 UTC
Created attachment 159108 [details]
gkrellm visual disk usage over 4x minutes during scan

Disk usage during the scan seems to average around 300KB/sec for each disk.
=====
when the resize fails, the following is logged:
Jul 13 08:04:33 poweredge yum : Installed: strace.i386 4.5.15-1.fc7
Jul 13 08:33:08 poweredge kernel: device-mapper: table: device 8:5 too small for target
Jul 13 08:33:08 poweredge kernel: device-mapper: table: 253:1: linear: dm-linear: Device lookup failed
Jul 13 08:33:08 poweredge kernel: device-mapper: ioctl: error adding target to table
=====
dmesg:
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
nf_conntrack version 0.5.0 (8192 buckets, 65536 max)
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
audit(1184191640.355:4): audit_pid=2167 old=0 by auid=4294967295 subj=system_u:system_r:auditd_t:s0
SELinux: initialized (dev rpc_pipefs, type rpc_pipefs), uses genfs_contexts
BUG: warning at kernel/softirq.c:138/local_bh_enable() (Not tainted)
 [<c042b0cf>] local_bh_enable+0x45/0x92
 [<c06002bd>] cond_resched_softirq+0x2c/0x42
 [<c059adf3>] release_sock+0x4f/0x9d
 [<c05cec7d>] tcp_send_ack+0xeb/0xef
 [<c05c7755>] tcp_recvmsg+0x8d2/0x9de
 [<c059a7a9>] sock_common_recvmsg+0x3e/0x54
 [<c0598839>] sock_aio_read+0xfc/0x108
 [<c04755f8>] do_sync_read+0xc7/0x10a
 [<c0436e71>] autoremove_wake_function+0x0/0x35
 [<c0475e99>] vfs_read+0xba/0x152
 [<c04762db>] sys_read+0x41/0x67
 [<c0404f70>] syscall_call+0x7/0xb
 =======================
Bluetooth: Core ver 2.11
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.8
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM ver 1.8
Bluetooth: HIDP (Human Interface Emulation) ver 1.2
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
eth0: no IPv6 routers present
device-mapper: table: device 8:5 too small for target
device-mapper: table: 253:1: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: table: device 8:5 too small for target
device-mapper: table: 253:1: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table

I hadn't noticed the BUG warning before - is it related? I think it appeared during
the last reboot, but dmesg doesn't give timestamps.
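
The "device 8:5 too small for target" messages suggest comparing the size
device-mapper is being asked to map against the real size of the underlying
partition. A hedged diagnostic sketch (8:5 is the kernel's major:minor for
/dev/sda5; the dm name for an LV is normally <vg>-<lv>):

  blockdev --getsz /dev/sda5        # size of the complained-about partition, in 512-byte sectors
  dmsetup table vgstorage-lvhome    # currently loaded mapping: each line is
                                    # <start> <length> linear <major:minor> <offset>; any segment
                                    # on 8:5 must have offset + length within the sector count above
  pvdisplay /dev/sda5               # LVM's own view of how much of that PV is allocated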

Comment 5 David Timms 2007-07-12 22:55:11 UTC
Created attachment 159109 [details]
strace 30MB log bzipped to 851KB!

I see lots of:
(4, "5\20\4\0\33\nf\2\354\0`\2\226\0\24\0\226\4\5\0\34\nf\2"..., 5272) = 5272
read(4, 0xbf85298c, 32) 		= -1 EAGAIN (Resource temporarily
unavailable)
poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, -1) = 1
read(4, "\1\20\345\367(\1\0\0\0\0\0\0\260\361\271\t(W\216\277|\225"..., 32) =
32
readv(4, [{"\26\306\26\316\26\316\26\316\26\316\26\316\26\316\26\316"...,
1184}, {"", 0}], 2) = 1184
write(4, "I\2\5\0$\nf\2\0\0\0\0\224\0\22\0\377\377\377\377", 20) = 20
read(4, 0xbf8524cc, 32) 		= -1 EAGAIN (Resource temporarily
unavailable)
poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, -1) = 1
read(4, "\1\10\346\367\232\2\0\0\0\0\0\0T\325C\10(W\216\277|\225"..., 32) = 32
readv(4, [{"\205\371\377\377\377\377\377\377\377\377\377\377\377\377"...,
2664}, {"", 0}], 2) = 2664
write(4, "7\2\5\0/\nf\2\357\1`\2\0\0\1\0\0\0\0\0H\2.\1\357\1`\2/"..., 1372) =
1372
read(4, 0xbf85298c, 32) 		= -1 EAGAIN (Resource temporarily
unavailable)
poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, -1) = 1
read(4, "\1\20\360\367$\0\0\0\0\0\0\0\300\35\304\t(W\216\277|\225"..., 32) = 32

readv(4, [{"\325\275\224\275\224\275\224\275\325\305\325\305\325\305"..., 144},
{"", 0}], 2) = 144
write(4, "I\2\5\0$\nf\2\0\0\0\0\224\0\22\0\377\377\377\377", 20) = 20
read(4, 0xbf8524cc, 32) 		= -1 EAGAIN (Resource temporarily
unavailable)

Comment 6 Jim Parsons 2007-07-13 19:50:01 UTC
Ok - first, thank you for the prompt log output. Sometimes these types of
device-mapper errors indicate multipath or md problems, where more than one
device *should* be merged together with LVM2 layered on top, but the merging
hasn't happened, so LVM2 is left using the underlying devices, which are not the
expected size.

We need to get medieval with this problem - we need to try 'lvmdump'... that's a
script I ask people to run to gather diagnostics. We need to look for things
like md devices in there that aren't being assembled, or that are wrongly
filtered out in lvm.conf.

lvmdump is in the latest releases, and is also in the sources cvsweb repo. Here
is a link to the man page:  http://linux.die.net/man/8/lvmdump
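
A sketch of the kind of checks being asked for here (paths are the stock Fedora
locations; the -a -m invocation is the same one attached in comment 7 below):

  lvmdump -a -m                  # gather LVM diagnostics into a tarball
  cat /proc/mdstat               # any md arrays that should (or should not) sit underneath the PVs
  grep filter /etc/lvm/lvm.conf  # device filters that might be hiding or exposing the wrong devices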

I kinda think this is not related to the UI, but I'll hang in here with you :)

BTW, thx for using the GUI

Comment 7 David Timms 2007-07-14 02:53:46 UTC
Created attachment 159263 [details]
lvmdump -a -m

I think I missed mentioning that running the same command the GUI reported trouble
with fails as well {same error message}; the difference being that once the error
message appears, it doesn't try to reload the LVM config, so it only takes a few
minutes rather than around 6 minutes.

I guess the GUI is out of the picture as long as the attempted command is legit.
===
more info on the machine - in case it is useful:
dell poweredge 1600SC
5x36G disks on SCSI
adaptec controller:
  2x in hw RAID 1 {/dev/sda}
  1x {/dev/sdb}
  1x {/dev/sdc}
lsi controller:
  1x {/dev/sdd}
===
also ran an fsck.ext3 -f on the two lvm volumes; it says structure is ok.
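
For the record, roughly what that check looks like (a sketch, assuming lvhome is
one of the volumes being checked and that it can be unmounted first):

  umount /dev/vgstorage/lvhome          # ext3 must be offline for a full forced check
  fsck.ext3 -f /dev/vgstorage/lvhome    # -f forces a check even if the filesystem is marked clean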

Comment 8 David Timms 2007-07-16 08:04:15 UTC
Does the lack of disk errors in /var/log/messages guarantee the disks are good?
Is it worth destructively scanning the disk(s) for errors with badblocks -w?
Are there SMART capabilities on SCSI disks?
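
Hedged examples of those checks (smartmontools does handle SCSI disks; badblocks -w
is destructive, so only run it on partitions whose contents can be lost; device
names here are examples only):

  smartctl -H -a /dev/sdd     # overall health verdict plus the full SMART/error-log pages
  badblocks -svw /dev/sdd1    # destructive write-mode surface test of one partition (-s progress, -v verbose)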

Comment 9 Bug Zapper 2008-05-14 13:25:06 UTC
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 David Timms 2008-05-15 13:12:21 UTC
Probably shortly after the F8 release, the machine was upgraded, and hence it no
longer runs F7. This problem wasn't seen with F8 {instead, just slow access when
scanning the LVM volumes, due to the over-the-top number of LVM partitions on
each physical disk making up the LVs}.

Additionally, /dev/sdd (disk 4) has begun having read errors, so I took the
opportunity to recreate the LVM config with each of the 3 remaining disks having
only two LVM partitions {instead of 10}. The LVs are resizing without error, so
I am happy for the issue to be closed if the maintainer doesn't want to actively
track the issue. Thanks.

Comment 11 Bug Zapper 2008-06-17 01:47:54 UTC
Fedora 7 changed to end-of-life (EOL) status on June 13, 2008. 
Fedora 7 is no longer maintained, which means that it will not 
receive any further security or bug fix updates. As a result we 
are closing this bug. 

If you can reproduce this bug against a currently maintained version 
of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

