Red Hat Bugzilla – Bug 254185
lvcreate causes "kernel: general protection fault" then future lvm processes hang
Last modified: 2007-11-30 17:12:14 EST
Not sure if this belongs under kernel or lvm2 -- I'm putting it under kernel
since I *think* from looking at the logs that the problem is occurring in
kernelspace (and it happened shortly after updating the kernel), but it's
triggered by lvcreate so please move it if needed.
Description of problem:
When the nightly backup script tried to create a snapshot of /var with
"lvcreate -L 1G -s -n snapvar /dev/vg1/var", the lvcreate command hung in
an unkillable wait state (or perhaps the first lvcreate segfaulted and exited,
the script tried again, and the second one hung in the unkillable wait
state) and the kernel logged
"kernel: general protection fault: 0000  SMP"
followed by more details and a stack trace of lvcreate (attached). A reboot was
needed to get the system back into a state where lvm/dm commands would work
again. There was heavy disk usage on /var at the time due to a mail loop. The
file system type is reiserfs.
The backups on the previous night and the subsequent nights proceeded without
errors so this is not yet a reproducible problem. Prior to this, with earlier
kernels, backups proceeded nightly for years with no issues like this.
Version-Release number of selected component (if applicable):
How reproducible:
So far, it has happened only once. However, we've only been running this kernel
version for three days, so if this is a new problem introduced with the 2.6.22
kernel we may see it again. If it does happen again, I'll update this bug report.
Steps to Reproduce:
It may be possible to reproduce by creating a reiserfs filesystem on a 2-CPU
system, putting it under heavy disk load, then using lvcreate to make a
snapshot, remove it again, and repeat until the problem occurs. So far, though,
I have not been able to reproduce it, so I am reporting this bug just in case
someone with more kernel or lvm knowledge has some ideas based on the log file.
See attached log file.
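The snapshot/remove cycle described in the steps above could be scripted roughly as follows. This is only a sketch, not the reporter's backup script: the `lvcreate` invocation is the one quoted in the description, while the `lvremove -f` call, the `DRY_RUN` and `ITERATIONS` variables, and the helper names `run`, `snapshot_cycle`, and `gpf_logged` are assumptions made for illustration. It does not generate the disk load itself, and it defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical repro loop: create a snapshot, remove it, repeat.
# All variable and function names here are illustrative assumptions.
VG_LV="${VG_LV:-/dev/vg1/var}"      # origin volume, from the report
SNAP_NAME="${SNAP_NAME:-snapvar}"   # snapshot name, from the report
ITERATIONS="${ITERATIONS:-3}"       # number of create/remove cycles
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# One cycle: snapshot the origin volume, then remove the snapshot.
snapshot_cycle() {
    run lvcreate -L 1G -s -n "$SNAP_NAME" "$VG_LV" || return 1
    run lvremove -f "${VG_LV%/*}/$SNAP_NAME" || return 1
}

# Succeeds if kernel log text on stdin mentions the fault.
gpf_logged() {
    grep -q "general protection fault"
}

i=1
while [ "$i" -le "$ITERATIONS" ]; do
    snapshot_cycle || { echo "cycle $i failed"; break; }
    if [ "$DRY_RUN" != "1" ] && dmesg | gpf_logged; then
        echo "general protection fault logged after cycle $i"
        break
    fi
    i=$((i + 1))
done
```

Run it as root with DRY_RUN=0 while the filesystem is under load. If lvcreate hangs in the unkillable state described above, the loop will hang with it, so the kernel log has to be checked from another terminal.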
Note: I set this as severity:high since it does render many lvm commands
unusable until a reboot, but priority:low since I realize more information or
reproducibility may be needed first.
Created attachment 172435
Log file of kernel gpf and lvcreate process stack trace
It's been a week (running the same set of lvcreate commands every night) and the
problem has not recurred, so perhaps this isn't a new problem in the 2.6.22
kernel (the -42.fc6 build) after all, but something obscure and nonreproducible.
I'll wait a week longer and see if it recurs or if anyone else reports it
happening to them as well.
There are some very obscure bugs in the sysfs_hash_and_remove/kref_put code that
won't be fully fixed until 2.6.23. I am closing this bug, as the fixes are already
upstream (and will be in Fedora 8). The code was completely rewritten to solve