Bug 133215
Summary: clients nfs mount goes stale after nfs service restart

Product: [Retired] Red Hat Cluster Suite
Component: clumanager
Version: 3
Hardware: i686
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Gregor Pardella <pardella>
Assignee: Lon Hohberger <lhh>
CC: cluster-maint, vanhoof
Doc Type: Bug Fix
Last Closed: 2004-11-09 10:40:54 UTC
Description
Gregor Pardella
2004-09-22 14:34:37 UTC
1.2.16-1 is in the Cluster Suite channel on RHN (!)

Steps 3 and 4: Do you mean "service nfs stop / service nfs start", or using the cluster tools to restart it?

---

> 1.2.16-1 is in the Cluster Suite channel on RHN (!)

I know, but I have a working system, and because of this error I cannot restart the cluster! :-(

> Steps 3 and 4: Do you mean "service nfs stop / service nfs start", or using the cluster tools to restart it?

I'm using the cluster tool to manage the cluster services. The cluster tool gives the possibility to enable/disable and restart a service, so steps 3 and 4 should be:

3. disable the nfs service via the cluster tool
4. enable the nfs service via the cluster tool, or restart the nfs service via the cluster tool

I'm sorry for the misunderstanding.

---

Not really a misunderstanding; I just want to have everything clear.

WRT upgrading, you can do it in a 'rolling' fashion; details are in the errata advisories. This should minimize downtime (because you don't have to take the whole cluster offline to do it; just one node at a time).

A few more questions:

(1) Is autofs (automount) used in conjunction with the clients? If so, what is the mount timeout?
(2) How would you characterize the clients receiving ESTALE (e.g. all netgroup members / some netgroup members / all clients [inside and outside of the netgroup] / some clients [random, not specific to the netgroup])?
(3) What are the entries in /var/lib/nfs/rmtab and <service-mountpoint>/.clumanager/rmtab?
(4) When the service is running, there should be a copy of "clurmtabd" running for each export path (but not per client). Is this the case?

---

> WRT upgrading, you can do it in a 'rolling' fashion; details are
> in the errata advisories. This should minimize downtime (because you
> don't have to take the whole cluster offline to do it; just one node
> at a time).
sure, but I want to solve the problem with the stale nfs handles
before I have downtime ;-)
Answers:
1) Nope, there is no use of autofs.
2) All netgroup members - except those which are separately noted with
special mount/export options.
3) The clients are listed separately; /var/lib/nfs/rmtab contains the same
entries as <service-mountpoint>/.clumanager/rmtab.
Is that what you wanted to know?
4) That's the case!
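For reference, the comparison behind questions (3) and (4) can be scripted. A hedged sketch - the rmtab paths and the clurmtabd daemon name come from the thread, but the sample files, entries, and hostnames below are stand-ins for illustration, not taken from this cluster:

```shell
# Stand-ins for the two rmtab copies named in the thread:
#   /var/lib/nfs/rmtab  and  <service-mountpoint>/.clumanager/rmtab
SYS_RMTAB=/tmp/rmtab.sys
SVC_RMTAB=/tmp/rmtab.svc

# Sample rmtab entries (format: client:export-path:refcount):
printf 'client1.example.com:/export/data:0x00000001\n' > "$SYS_RMTAB"
printf 'client1.example.com:/export/data:0x00000001\n' > "$SVC_RMTAB"

# One clurmtabd per export path is expected while the service runs
# (|| true so a test box without the daemon does not abort the script):
pgrep -c clurmtabd || true

# The two copies should agree; diff exits 0 when they match:
if diff -q "$SYS_RMTAB" "$SVC_RMTAB" >/dev/null; then
    echo "rmtab copies in sync"
fi
```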
I have just one more question:
Why isn't it possible to make changes to the exports and reload the
nfs service?
The only way to do it is to take the service down, make your changes,
and then bring it online again. For a few additional entries in the
exports this is excessive.
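For comparison: on a standalone NFS server outside clumanager's control, export changes can normally be picked up without stopping nfsd at all, via exportfs. A hedged sketch - the export path and client hostname are hypothetical, and a temp file stands in for /etc/exports since re-exporting needs root and a running NFS server:

```shell
# Stand-in for /etc/exports in this sketch:
EXPORTS=/tmp/exports.test

# Stage a hypothetical additional export entry:
echo '/srv/extra  client2.example.com(ro,sync)' >> "$EXPORTS"

# On a real server the export table would then be reloaded in place:
#   exportfs -r    # re-reads /etc/exports without restarting nfsd
#   exportfs -v    # shows the currently active exports

grep -q '/srv/extra' "$EXPORTS" && echo "export entry staged"
```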
Ok, so it's just the netgroup clients. That should make it easier.

The answer to your question lies in the way services are defined. They're more or less monolithic, with lots of properties, as opposed to being modeled as a tree of separate entities combined in a group. This is a known architectural limitation. It should go away in the future (next major release of RHCS).

Hmmmmm... Did you remove any NFS clients from the service?

---

> Hmmmmm... Did you remove any NFS clients from the service?

What do you mean? No, I didn't remove any clients, so it's not that they
couldn't mount the exports ;-)
There's no need to make any changes to the service; if it goes down
and comes back up, the netgroup clients get stale nfs handles.
I'm not sure how it behaves when relocating the
service - I remember that worked well.
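Since only the netgroup members see ESTALE, one sanity check is whether the netgroup resolves to the same hosts everywhere. A hedged sketch parsing a sample map in netgroup(5) triple format - the 'nfsclients' name and hosts are hypothetical; a live system would query the map with `ypcat -k netgroup` or `getent netgroup nfsclients` instead:

```shell
# Sample netgroup map entry: name (host,user,domain) (host,user,domain)
printf 'nfsclients (web1.example.com,,) (web2.example.com,,)\n' \
    > /tmp/netgroup.sample

# Extract the member hostnames from the triples:
grep '^nfsclients' /tmp/netgroup.sample | grep -o '([^,]*' | tr -d '('
# prints:
#   web1.example.com
#   web2.example.com
```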
Thanks for the information. A similar problem occurs (apparently) with wildcards, and yet another with many individual exports. I have thus far been unable to reproduce any of the above.

Could you attach your cluster.xml (you can change your IPs/hostnames if you're paranoid about it, but please don't change anything else)?

---

Created attachment 104357 [details]
cluster.xml
Are you using the YP server to serve netgroups to the cluster? More specifically, are netgroups from your clustered YP service being used by your clustered NFS services?

---

Not really - it's just a YP slave for our network. But sure, the YP server for the cluster is the slave. Is there anything wrong?

---

It should be fine; we are just collecting as much data as we can so we can try to figure out what's wrong. Thanks for your patience.

---

Created attachment 104412 [details]
Should fix problem
Thanks for the patch. I will include it in the new clumanager before updating and let you know the results (this will take some time).

---

Thanks again for the patch - it seems to be working correctly now.