Bug 157472

Summary: assertion in util.c during recovery: "!ret"

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
| Component: | gfs | Assignee: | Kiersten (Kerri) Anderson <kanderso> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | GFS Bugs <gfs-bugs> |
| Severity: | medium | Priority: | medium |
| Version: | 4 | CC: | kpreslan |
| Hardware: | All | OS: | Linux |
| Fixed In Version: | gfs 6.1 | Doc Type: | Bug Fix |
| Last Closed: | 2006-11-06 19:43:15 UTC | | |
Description
Corey Marthaler 2005-05-11 21:28:44 UTC
This happened on tank-01, which was either a Slave or the Master at the time when tank-04 and tank-05 went down.

> so, are you going to describe the cluster? or just leave me guessing? only four nodes? embedded or separate lock servers? which nodes were shot? clients? slaves? How many lock servers?

Tilstra, you know I'd never knowingly leave you hangin' man. :) Here's a copy of the config file. These were embedded servers; again, tank-04 and tank-05 were shot. tank-04 was a client and tank-05 was either a Slave or the Master, I really don't know. It was kind of a fluke that I hit this (in that I was trying to set up a scenario for a different bug, so I wasn't paying attention as much as I normally would).

```xml
<?xml version="1.0"?>
<cluster config_version="8" name="tank-cluster">
  <gulm>
    <lockserver name="tank-01.lab.msp.redhat.com"/>
    <lockserver name="tank-03.lab.msp.redhat.com"/>
    <lockserver name="tank-05.lab.msp.redhat.com"/>
  </gulm>
  <clusternodes>
    <clusternode name="tank-01.lab.msp.redhat.com" votes="1">
      <fence>
        <method name="single">
          <device name="apc" port="1" switch="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="tank-03.lab.msp.redhat.com" votes="1">
      <fence>
        <method name="single">
          <device name="apc" port="3" switch="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="tank-04.lab.msp.redhat.com" votes="1">
      <fence>
        <method name="single">
          <device name="apc" port="4" switch="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="tank-05.lab.msp.redhat.com" votes="1">
      <fence>
        <method name="single">
          <device name="apc" port="5" switch="1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="tank-apc" login="apc" name="apc" passwd="apc"/>
  </fencedevices>
  <rm>
    <resources>
      <ip address="192.168.45.91" monitor_link="1"/>
      <ip address="192.168.45.92" monitor_link="1"/>
      <ip address="192.168.45.93" monitor_link="1"/>
      <ip address="192.168.45.94" monitor_link="1"/>
      <ip address="192.168.45.95" monitor_link="1"/>
    </resources>
    <service name="test1">
      <ip ref="192.168.45.91"/>
    </service>
    <failoverdomains/>
    <service exclusive="1" name="coreyservice">
      <clusterfs device="111" fstype="gfs" mountpoint="111" name="111" options="111">
        <clusterfs device="222" fstype="gfs" mountpoint="222" name="222" options="222">
          <clusterfs device="333" fstype="gfs" mountpoint="333" name="333" options="333">
            <clusterfs device="444" fstype="gfs" mountpoint="444" name="444" options="444"/>
          </clusterfs>
        </clusterfs>
        <clusterfs device="222b" fstype="gfs" mountpoint="222b" name="222b" options="222b"/>
      </clusterfs>
    </service>
  </rm>
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
</cluster>
```

GFS 6.1 doesn't like getting error codes from the lock modules. Prior versions handled this by retrying the lock request, so the fix is to requeue lock requests instead of telling GFS there was an error.

This was fixed a while ago.
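For context on what the fix changes, here is a minimal sketch in C of the requeue-on-error idea. It is an illustration only: all names here (`lock_req`, `handle_lock_reply`, `requeue_lock_request`, `pending_head`) are hypothetical stand-ins, not the actual lock_gulm or GFS source.

```c
/*
 * Sketch of the requeue-on-error pattern described above.
 *
 * Old behavior: an error from the lock server was passed straight
 * back to GFS, which asserted on the non-zero status ("!ret").
 * New behavior: the request is put back on a pending queue and
 * retried, so GFS never sees a transient error code.
 */

struct lock_req {
        struct lock_req *next;
        int lock_state;                                  /* state being requested */
        void (*complete_cb)(struct lock_req *req, int status);
};

static struct lock_req *pending_head;                    /* hypothetical retry queue */

static void requeue_lock_request(struct lock_req *req)
{
        /* Put the request back on the pending queue; it is
         * resubmitted once recovery settles down. */
        req->next = pending_head;
        pending_head = req;
}

static void handle_lock_reply(struct lock_req *req, int error)
{
        if (error) {
                /* Don't report the error to GFS -- prior versions
                 * retried, GFS 6.1 asserts.  Retry here instead. */
                requeue_lock_request(req);
                return;
        }

        /* Success: complete the request so GFS sees status == 0. */
        req->complete_cb(req, 0);
}
```

The design point is that transient lock-server errors during recovery are absorbed by the lock module's own retry queue, so the `!ret` assertion in util.c never fires on an error code GFS 6.1 was not written to handle.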