Not sure if this belongs under kernel or lvm2. I'm putting it under kernel since I *think* from looking at the logs that the problem is occurring in kernelspace (and it happened shortly after updating the kernel), but it's triggered by lvcreate, so please move it if needed.

Description of problem:
When the nightly backup script tried to create a snapshot of /var with "lvcreate -L 1G -s -n snapvar /dev/vg1/var", the lvcreate command hung in an unkillable wait state (or perhaps the first lvcreate segfaulted and exited, the script tried again, and the second one hung in the unkillable wait state). The kernel logged "kernel: general protection fault: 0000 [1] SMP" followed by more details and a stack trace of lvcreate (attached). A reboot was needed to get the system back into a state where lvm/dm commands would work again.

There was heavy disk usage on /var at the time due to a mail loop. The file system type is reiserfs.

The backups on the previous night and the subsequent nights proceeded without errors, so this is not yet a reproducible problem. Prior to this, with earlier kernels, backups proceeded nightly for years with no issues like this.

Version-Release number of selected component (if applicable):
kernel: kernel-2.6.22.2-42.fc6
lvm2: lvm2-2.02.17-1.fc6

How reproducible:
So far, it has happened only once. However, we've only been running this kernel version for three days, so if this is a new problem introduced with the 2.6.22 kernel we may see it again. If it does happen again, I'll update this bug report.

Steps to Reproduce:
It may be possible to reproduce by creating a reiserfs filesystem on a 2-CPU system, putting it under heavy disk load, then using lvcreate to make a snapshot, removing it again, and repeating until the problem occurs. So far, though, I have not been able to reproduce it, so I am reporting this bug just in case someone with more kernel or lvm knowledge has some ideas based on the attached log file.

Additional info:
See attached log file.
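For anyone who wants to try the reproduction idea under "Steps to Reproduce", the create/remove cycle could be sketched roughly as below. This is only an illustration and has not been shown to trigger the fault. The volume names and snapshot size come from the lvcreate command quoted above. The ITERATIONS and DRY_RUN variables and the run and snapshot_cycle helpers are names made up for this sketch. The heavy disk load on the origin volume is not generated here and would have to be produced separately.

```shell
#!/bin/sh
# Hypothetical reproduction loop; not confirmed to trigger the GPF.
# VG, LV, SNAP and the 1G size are taken from the lvcreate command in this report.
# ITERATIONS and DRY_RUN are illustrative settings introduced for this sketch.
VG=vg1
LV=var
SNAP=snapvar
ITERATIONS=${ITERATIONS:-3}
DRY_RUN=${DRY_RUN:-1}   # default 1: only print the commands, do not run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# One cycle: create the snapshot, then remove it again.
snapshot_cycle() {
    run lvcreate -L 1G -s -n "$SNAP" "/dev/$VG/$LV" || return 1
    run lvremove -f "/dev/$VG/$SNAP" || return 1
}

i=1
while [ "$i" -le "$ITERATIONS" ]; do
    echo "iteration $i"
    snapshot_cycle || { echo "cycle $i failed; check the kernel log" >&2; break; }
    i=$((i + 1))
done
```

With the default DRY_RUN=1 the script only prints what it would do. Setting DRY_RUN=0 (as root, on a test machine, since a hang may require a reboot) runs the real commands.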
Note: I set this as severity:high since it does render many lvm commands unusable until a reboot, but priority:low since I realize more information or reproducibility may be needed first.
Created attachment 172435
Log file of kernel GPF and lvcreate process stack trace
It's been a week (running the same set of lvcreate commands every night) and the problem has not recurred, so perhaps this isn't a new problem in kernel-2.6.22.2-42.fc6 after all, but something obscure and nonreproducible. I'll wait another week and see if it recurs or if anyone else reports it happening to them as well.
There are some very obscure bugs in the sysfs_hash_and_remove/kref_put code that won't be fully fixed until 2.6.23. Will close this bug, as the fixes are already upstream (and will be in Fedora 8). The code was completely rewritten to solve this problem.