Red Hat Bugzilla – Bug 579598
Slight improvements to glock documentation
Last modified: 2010-11-11 10:29:45 EST
This feature request did not get resolved in time for Feature Freeze
for the current Red Hat Enterprise Linux release and has now been
denied. You may re-open your request by requesting your support
representative to propose it for the next release.
I've been reading the Linux Symposium paper referenced in this bug (Testing and verification of cluster filesystems) and I think I need a little more guidance regarding what I need to add to the GFS2 manual for RHEL 6. In some ways this material is more internal than what I provide in the GFS2 manual, but if administrators need this information about glocks and trace points to debug their file system configurations, then I should provide what I can.
As a first step: Do you have any general thoughts about what from this paper needs to be documented? I realize the answer could be "all of it", but I'm not sure.
I think most of it at least, and quite possibly all of it. The point is that without a bit of context, the tracepoints will not make sense. Also, it's another chance to explain, in a different way, how glocks work. That is useful in itself, since it will hopefully further reduce the number of bugs reported due to misunderstandings of how the caching works.
Do we already have a generic tracepoints document of which this might form a part?
Can we ensure that we also document at least the blkdev tracepoints/blktrace itself in the same document?
If there is no existing tracepoints document, should there be one, or should we just document this as part of GFS2?
Originally I'd intended just to use the technical/release notes to document that this feature was available and then to point people at the paper, bearing in mind that, as you say, this is aimed more at our support people than the average user. On the other hand, there seems no harm in documenting it in such a way as to be available to users, and the more debugging users can do, and the greater understanding they might get of the fs, the lower the burden on support. So I've certainly no objection to having more in-depth documentation in this area.
First a pretty high-level question -- this wasn't clear to me from the paper. When you say that "this feature" is available, what specific feature do you mean, and do you mean that this is new for RHEL 6? For example, is the glock debugfs interface new? Is support for trace points as a whole new?
As to how to incorporate this into the manual: One way I've handled this sort of thing before -- in the LVM manual, for example -- is to move material of this level to an appendix. I've even seen documents where a technical paper such as this is included as an appendix -- still formatted as a paper, just as it was presented at a conference -- although I'm not suggesting that here. A big concern is that we do not provide information that might confuse people who are just looking for how GFS works or differs from other file systems they have used. Slapping the word "appendix" on something conveys that this is extra material -- particularly if the introductory section of the book makes clear who the audience for this section is.
I'll see what I can find, but I'm pretty certain we do not have a generic tracepoints document.
So I'll look at taking this paper and reformatting it as an appendix, to see if that works as a standalone document. One thing I'll look at is whether section 1 -- the introductory material about the issues surrounding testing in a cluster and Open Source -- belongs here at all. My first thought is that the appendix would include the material in section 2 (Glocks) and section 3 (Trace points). The trace points section could well include material that is not specific to GFS2.
But this is all first-pass lookover.
I'll see what I can find and try to put together the outline of the appendix (which will pretty much be your paper) and we'll take it from there. But if I were clear on what here is new for RHEL 6 that could help me present this.
Tracepoints are in RHEL 5, but they are very limited: you need SystemTap to make use of them. New in RHEL 6 are the tracepoints userland interface and the GFS2 tracepoints specifically.
Yes, I'd agree that the intro material about testing can be disposed of. We need to restrict ourselves to just explaining the gfs2 tracepoints and what they can be used for when taking material from that particular document. So a bit of gfs2 background is required in order to understand what the gfs2 tracepoints can actually tell us.
RHEL5 has blktrace, but via its own interface only. RHEL6 has blktrace via either that older interface or via the generic tracepoints interface.
Other subsystems have tracepoints too - it is a generic feature. We need to get across the generic information on how to make use of them as well as the gfs2-specific information. The paper mostly covers the gfs2-specific information and skips over the generic details of the interface. It does include some example output though.
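As a rough illustration of the generic part: with the userland interface, a subsystem's tracepoints are switched on by writing to an "enable" file under the tracing directory in debugfs. The sketch below assumes debugfs is mounted at /sys/kernel/debug and that the script runs as root. The helper names (event_enable_path, set_events) are made up for this example and are not part of any tool.

```python
from pathlib import Path

# Assumption: debugfs is mounted at /sys/kernel/debug (the usual location).
TRACING_ROOT = Path("/sys/kernel/debug/tracing")


def event_enable_path(subsystem, event=None, root=TRACING_ROOT):
    """Return the 'enable' control file for a subsystem or a single event."""
    base = Path(root) / "events" / subsystem
    if event is not None:
        base = base / event
    return base / "enable"


def set_events(subsystem, enabled, event=None, root=TRACING_ROOT):
    """Write '1' or '0' to the control file; needs root on a real system."""
    path = event_enable_path(subsystem, event, root)
    path.write_text("1" if enabled else "0")
    return path


if __name__ == "__main__":
    # Enable every gfs2 tracepoint; the output can then be read from
    # the 'trace_pipe' file in the same tracing directory.
    print(set_events("gfs2", True))
```

The same call with a different subsystem name (for example "block") would cover the blkdev tracepoints, which is the sense in which the interface is generic. A single event such as gfs2_glock_state_change can be selected by passing it as the event argument.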
As I started to pore through your technical paper again to see about how to approach this material for the GFS2 manual, I started with the section on glocks. But as you know, you have already provided me with information about node locking and lock dumps -- that information is already in the GFS2 working draft here:
And then it continues here...
I think it might be a useful thing to add the three tables from the technical paper to that troubleshooting subsection: Glock types, Glock flags, and Glock holder flags. Does that seem right to you, to add those tables? It also might be useful to add the example glock dump from debugfs to the section on troubleshooting GFS2 performance, maybe incorporating an explication of that sample into the text.
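To show how the Glock types table would be used when reading a dump, here is a minimal sketch that decodes the "G:" header line of a debugfs glock entry. The type numbers and names are as I understand them from the GFS2 material and should be checked against the tables in the paper. The sample line and the function name parse_glock_header are illustrative only, and only the state and the n:type/number fields are parsed.

```python
import re

# Glock type numbers as I understand them; verify against the paper's table.
GLOCK_TYPES = {
    1: "trans",    # transaction lock
    2: "inode",    # inode metadata and data
    3: "rgrp",     # resource group metadata
    4: "meta",     # the superblock
    5: "iopen",    # inode last-closer detection
    6: "flock",    # flock(2) syscall
    8: "quota",    # quota operations
    9: "journal",  # journal mutex
}

# Matches the start of a glock header line: state, then type/number (hex).
_G_LINE = re.compile(
    r"^G:\s+s:(?P<state>\S+)\s+n:(?P<type>\d+)/(?P<number>[0-9a-fA-F]+)"
)


def parse_glock_header(line):
    """Return a dict for a 'G:' line, or None for any other line."""
    match = _G_LINE.match(line)
    if match is None:
        return None
    type_id = int(match.group("type"))
    return {
        "state": match.group("state"),
        "type": type_id,
        "type_name": GLOCK_TYPES.get(type_id, "unknown"),
        "number": int(match.group("number"), 16),
    }


if __name__ == "__main__":
    # Illustrative line in the style of a glocks dump, not real output.
    print(parse_glock_header("G:  s:SH n:5/75320 f:I t:SH d:EX/0 a:0 r:3"))
```

Holder ("H:") lines and the flag fields are left alone here; decoding those would need the Glock flags and Glock holder flags tables.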
But other than that I'm not sure if there's anything in section 2 (Glocks) of the technical paper that I should include in the GFS2 manual other than what you've already written up for the manual.
So that leaves two things:
Based on the warning you give in the technical paper about not encouraging use of callback injection, I'm thinking maybe we don't need to include that material.
Which leaves us with adding the information on trace points, which is where this started and what you've been talking about in this bug all along. I'll start to dig in to that now, but I just wanted to verify at this point my sense that glocks are already covered in the draft and that we don't need to cover callback injection.
Does that seem like a way to proceed?
Yes, adding the tables seems like a good idea. Might also be worth adding a statement to the effect that while we don't expect the tracepoints to change very much, they are not a guaranteed stable API since their primary purpose is debugging.
I agree that leaving out the callback injection interface from the docs seems like a good plan, at least at the moment. We can always add it later if required.
Otherwise, it all sounds good to me.
The summary of the mailing list discussion so far is that we should turn this into a kbase and reinstate the original technical notes.
StevenL, this just made the pgampe priority list. Can you tell me if we will have this ready for review by Beta 2? - Mike
As per Comment 8, the information about trace points will be turned into a kbase article and will not be in the GFS2 manual.
I'll look further into what to do with this bug in particular -- there are now various things here beyond the trace points documentation -- but my understanding is that Steve Whitehouse will be putting this documentation into a kbase article.
Yes, that's the plan. Also the release notes for the original bug will be reinstated. Do we need a bug to update a kbase? I don't remember what the procedure is since I do it so rarely.
I'm not sure why I've been flagged as NEEDINFO here. The bug is against event tracing documentation, which we will address in your kbase article. As a side point this bug pointed me to some information about glocks that I can use to improve the section you wrote on glocks for the GFS2 manual (even though that, too, is somewhat internal). I am currently working on the RHEL 6 documents, and that's on my list.
What information do I need to provide? To close this bug?
The question was whether we need a bug in order to create a kbase. I assume that the answer is no, in which case I think we can just close this bug now.
Steve: I agree. I'll follow up just to be sure that all I need to do is close this.
This bug was predominantly about tracepoints, which we will be documenting in a kbase article. But there is also some info here about improving the glock documentation with some info from Steve Whitehouse's technical paper that will be very easy to add to the documentation.
So I've changed the title of this bug to reflect the part of the bug that is still valid. I possibly should have cloned this into a new bug to reflect only the glock part of this bug, but since the glock documentation is an unresolved part of the original bug it seemed that I could leave it here.
But changing the subject clarifies what the bug is about.
I have added the tables explaining the glock flags and types in the debugfs file to the 6.0 draft of the GFS2 document:
I am moving this bug to MODIFIED since the information is now in the GFS2 document that has been posted to the review server and checked into the SVN repository.
Verified the addition is in Red_Hat_Enterprise_Linux-Global_File_System_2-6-en-US-7-13.
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.