Bug 285021
Summary: | Kernel panic when sending SIGINT to GNU software rrdtool locks:1798 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Bryan Heitman <bryanh> |
Component: | kernel | Assignee: | Ric Wheeler <rwheeler> |
Status: | CLOSED WONTFIX | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 4.5 | CC: | dkwon, gvg, ricardo.labiaga |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
URL: | http://www.sqlpaste.com/?entry_id=105195 | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2012-06-20 16:13:50 UTC | Type: | --- |
Description
Bryan Heitman
2007-09-10 19:45:38 UTC
I am not testing with rrdtool, but the Connectathon test suite does something which sounds like this. It has a test which acquires a lock and is then signalled. This test passes and the system does not fail. How do I reproduce this situation, please?

Hi Peter, could you compile and run rrdtool v3 beta to reproduce? Or I can give you temporary remote access to the machine. Up to you.

Where do I find rrdtool v3 beta, please? Have you tried this on different file systems?

Only ext3. You can download rrdtool here and compile it; if you need help getting it running, let me know: http://oss.oetiker.ch/rrdtool/pub/beta/rrdtool-1.3beta1.tar.gz

Wow. That has quite the complicated build process. I will work on it, but it will take a while. A simpler testcase might accelerate the work.

Sorry Peter, if you have any trouble re-creating this, let me know. I will be happy to provide the commands I am using.

We hit this same panic when running sio (http://www.netapp.com/go/techontap/tot-march2006/0306tot_monthlytoolSIO.html) against a set of NetApp filers. The filers were configured in a cluster to takeover/giveback for the test. We have not tried to reproduce this problem with a single filer and plain reboot.

The specific steps we used to reproduce the problem:

- Takeover/giveback the filer cluster every 3 minutes (we used FAS3050).
- Set up and enable clustering.
- Say A and B are the filers. All client traffic is targeted to B.
- Let A take over and give back every 3 minutes, so that clients will be handled by both A and B.
- Run the following SIO sequence on the clients repeatedly.

============================================
#!/usr/local/bin/bash
FILE=$DIR/$(hostname)
ITER=1
while [ true ]
do
    echo $(date) : Iteration $ITER start
    touch $FILE
    /usr/local/test/bin/sio 0 0 8k 0 1m 0 4 $FILE -fillonce
    if [ $? -ne 0 ]; then
        echo $(date) : Fillonce failed
    fi
    /usr/local/test/bin/sio 66 100 8k 0 1m 300 4 $FILE
    if [ $? -ne 0 ]; then
        echo $(date) : sio-test failed
    fi
    rm -rf $FILE
    echo $(date) : Iteration $ITER end
    ITER=$(expr $ITER + 1)
done
============================================

- We used 14 RHEL4.4 clients and around 20 RHEL3.8 clients running the above sequence for 6 hrs.
- We were able to get 2-4 RHEL4.4 clients to panic each time. Note that not all of the RHEL4.4 clients hit the panic.

Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.
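
The thread asks for a simpler test case for the "acquire a lock, then get signalled" scenario. A minimal sketch of that pattern might look like the following. It is not the reporter's reproducer and is not known to trigger the panic. It assumes `flock(1)` from util-linux and `timeout(1)` from GNU coreutils are installed; the variable names and the scratch file are hypothetical. Note that `flock(1)` takes a `flock(2)`-style lock, and the report does not say which locking call rrdtool or the Connectathon test uses, so this may exercise a different kernel path.

```shell
#!/bin/bash
# Hypothetical lock-then-signal sketch; not the reporter's reproducer.
LOCKFILE=$(mktemp)    # assumption: scratch file; create it on the file system under test

# Hold an exclusive flock(2)-style lock while "sleep" runs, and let
# timeout(1) deliver SIGINT to the lock holder after one second.
timeout -s INT 1 flock -x "$LOCKFILE" sleep 5
sig_status=$?         # timeout(1) exits with 124 when the command timed out

# Once the holder has exited, a non-blocking attempt should get the lock.
flock -n "$LOCKFILE" true
relock_status=$?

echo "signal phase exit status: $sig_status"
echo "re-lock exit status: $relock_status"
rm -f "$LOCKFILE"
```

On a healthy kernel the second `flock` call should succeed, showing that the lock was dropped when the holder was interrupted. To test a specific file system such as ext3, point `LOCKFILE` at a path on that mount instead of using `mktemp`.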