There are two interrelated problems here:
1. We have a different readdir path for NFS compared with "normal" readdir due
to different locking requirements. The NFS path is slower. This might need to be
resolved at the NFS end rather than the GFS end, but I'm making this a GFS2 bug
for now until that can be looked at in more detail.
Ken Preslan described the reasons for having the different code for NFS in the
document he wrote just before he left.
2. We use the name of the current process as a way to switch between the NFS and
"normal" readdir behaviour. This is certainly wrong. At the very least we might
be able to use __builtin_return_address(0) for this since we only have a single
code path which needs the NFS behaviour.
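For anyone unfamiliar with the builtin, here is a minimal userspace sketch of what __builtin_return_address(0) gives us. It is not GFS2 code: shared_readdir(), call_from_nfs() and call_from_vfs() are hypothetical names made up for illustration. It only shows that the return address distinguishes one call site from another, which is the property we would rely on. It assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; none of these names exist in GFS2. */
static volatile uintptr_t last_return_addr;
static volatile int nfs_calls;
static volatile int vfs_calls;

/* Stand-in for the shared readdir routine: it records the address
 * at which its immediate caller will resume. */
__attribute__((noinline)) static void shared_readdir(void)
{
    void *ra = __builtin_return_address(0);

    assert(ra != NULL);
    last_return_addr = (uintptr_t)ra;
}

/* Stand-in for the single code path that needs the NFS behaviour. */
__attribute__((noinline)) static uintptr_t call_from_nfs(void)
{
    nfs_calls++;
    shared_readdir();
    /* Reading the volatile after the call keeps this from being a tail call. */
    return last_return_addr;
}

/* Stand-in for the "normal" readdir path. */
__attribute__((noinline)) static uintptr_t call_from_vfs(void)
{
    vfs_calls++;
    shared_readdir();
    return last_return_addr;
}
```

The two callers return different addresses, while repeated calls through the same caller return the same one. Turning that into a real test would still mean comparing the address against a known caller, which is fragile if inlining or tail calls change the call chain.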
Created attachment 135534
Message from Jan Engelhardt
The message I've just attached contains a suggestion for making a change to the
selection of the NFS path vs the "normal" path. It is a much better test than
the current one of guessing according to the name of the process.
Long term, of course, we still want to eliminate the second code path
completely, but that's a rather longer and more involved job.
I've now removed the "NFS only" path from readdir. Having looked at this in a
bit more detail, I'm not so sure that the deadlock can actually occur anyway. I
have a feeling that the situation mentioned in the notes which Ken Preslan left
us is not relevant any more since we lock directories before their content
anyway now, and use the old "must lock in order" rule to break deadlocks at the
directory level (in rename) first and again when locking the inodes at the next
level down. I don't think we ever lock more than two levels at a time. Rename
seems the only likely thing that we might deadlock against and I think we are
I think we can close this one now.
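As a rough illustration of the "must lock in order" rule referred to above, here is a minimal userspace sketch using pthread mutexes. It is not the GFS2 locking code: struct node, its ino field, and the helper names are assumptions made up for this example, and ordering by a number is just one possible choice of key.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a lockable directory or inode; not a GFS2 type. */
struct node {
    uint64_t ino;           /* unique number used as the ordering key */
    pthread_mutex_t lock;
};

/* Return the node that must be locked first: the one with the lower number. */
static struct node *first_in_order(struct node *a, struct node *b)
{
    return (a->ino <= b->ino) ? a : b;
}

/* Lock two nodes in one globally consistent order, so that two threads
 * locking the same pair can never each hold one lock and wait for the other. */
static void lock_pair(struct node *a, struct node *b)
{
    struct node *first, *second;

    assert(a != NULL && b != NULL);
    if (a == b) {           /* same object: take the lock only once */
        pthread_mutex_lock(&a->lock);
        return;
    }
    first = first_in_order(a, b);
    second = (first == a) ? b : a;
    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
}

/* Release both locks; the order of release does not matter for deadlock. */
static void unlock_pair(struct node *a, struct node *b)
{
    pthread_mutex_unlock(&a->lock);
    if (a != b)
        pthread_mutex_unlock(&b->lock);
}
```

Whichever way round the caller passes the two nodes, the lower-numbered one is locked first. That is what makes the ordering rule break the deadlock in a rename-style operation that needs two objects at once.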