Bug 126538
Summary: | filesystem deadlock when recovery happens | ||
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
Component: | gfs | Assignee: | David Teigland <teigland> |
Status: | CLOSED WORKSFORME | QA Contact: | Derek Anderson <danderso> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4 | CC: | djansa |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-01-05 22:25:04 UTC | Type: | --- |
Description
Corey Marthaler
2004-06-22 22:11:27 UTC
I fixed a bug with similar symptoms in Changeset 1.1667, although that bug took quite some effort to trigger in my setup. So, there's a good chance this is resolved.

As of July 13, the simplest case (start IO on all nodes, shoot one) still causes this hang. I only see:

Jul 13 14:36:14 tank-02 kernel: dlm: gfs0: recover event 23
Jul 13 14:36:14 tank-02 kernel: dlm: gfs0: remove node 6
Jul 13 14:36:16 tank-03 kernel: dlm: gfs1: recover event 20
Jul 13 14:36:16 tank-03 kernel: dlm: gfs1: remove node 6
Jul 13 14:36:09 tank-04 kernel: dlm: gfs0: recover event 18
Jul 13 14:36:09 tank-04 kernel: dlm: gfs0: remove node 6
Jul 13 14:36:16 tank-05 kernel: dlm: gfs0: recover event 13
Jul 13 14:36:16 tank-05 kernel: dlm: gfs0: remove node 6
Jul 13 14:36:00 tank-06 kernel: CMAN: node tank-01.lab.msp.redhat.com is not responding - removing from the cluster
Jul 13 14:36:03 tank-06 kernel: dlm: gfs0: recover event 11
Jul 13 14:36:03 tank-06 kernel: dlm: gfs0: remove node 6

I cannot get this to happen using my four nodes. Could you get the nodes into this state, leave them, and then let me log in to inspect?

Updating version to the right level in the defects. Sorry for the storm.

No one has seen this in about 6 months.