Need to integrate the kbase article "How can I use the GFS2 tracepoints and debugfs glocks file in RHEL6?" into official product docs. Link to kbase article: https://access.redhat.com/kb/docs/DOC-41624
That sounds ok to me.
Assigning to me, since it's my area, but I thought we'd already had an extensive discussion that concluded that this did not belong in the user administration guide, that the kBase article was the place for information of this sort. I'll dig out that exchange and revisit this issue.
We're in the process of trying to clear out the kbase of everything that is not a KCS article. KCS articles are generally specific bullet-point lists that address customer issues; the new tech briefs provide more detailed information that drills down into specific features or applications; and product docs seem to be the place for general how-to kinds of things. I'm the new tech writer for the portal, and I had a long meeting with Sam, Perry, and Sayandeb where we went through all the cluster docs in the kbase to identify what needs to go where. I have filed bugs for everything identified to be integrated into the docs. You can find my tracking list here: https://docspace.corp.redhat.com/docs/DOC-67420
But this is not a how-to sort of thing. This information is for file system developers, not for system administrators.
For related information, there is BZ#579598. But that split off into private email.

-----------------
Message-ID: <4BF6DDA6.5030304>
Date: Fri, 21 May 2010 14:23:18 -0500
From: Steven Levine <slevine>
To: Steven Whitehouse <swhiteho>, Perry Myers <pmyers>, Ric Wheeler <rwheeler>, Steven Levine <slevine>, Nathan Straz <nstraz>, David Teigland <teigland>, Bob Peterson <rpeterso>, Abhijith Das <adas>, Bob Peterson <rpeterso>
CC: Subhendu Ghosh <sghosh>, "Michael H. Smith" <mhideo>, filesystem-list <filesystem-dept-list>
Subject: GFS2 glocks and tracepoints: Looking for Feedback on Where to Document
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11

Summary: I'm looking for some feedback on where/how to document GFS2 troubleshooting -- particularly some clarification on who the audience is for this information

For the RHEL 6 release, I have been working with Steve Whitehouse on documenting glocks and the new trace points in the GFS2 document. Much of our discussion can be found here: https://bugzilla.redhat.com/show_bug.cgi?id=579598

In that discussion, I suggest that we include information about trace points in an appendix -- on the theory that we have the information, it's GFS2-specific, people might want it, and there's no general "tracepoints" documentation otherwise.

As I delve deeper into the actual administrative operation of trace points, however, I'm starting to question who the audience is for this information. It doesn't seem to be GFS administrators -- which is the audience for the book itself. Do Red Hat customers use this information?

It doesn't seem as if it will really hurt anything if I provide the trace points information in the GFS2 manual -- that makes them available and easy to find for anybody doing development, even if that isn't really the defined audience for this document -- but I wanted to run this issue by this list to get some feedback on whether I'm confusing things for our customers if I do.

On a related note, several weeks ago Steve provided me with some nice information about GFS node locking, and how that relates to performance tuning and troubleshooting. That information can be found in the RHEL 6 GFS2 draft here: http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/s1-ov-lockbounce.html

And then it continues here, with a section on troubleshooting: http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/gfs2_performance_troubleshoot.html

But as I look at the trace points info, I have the same questions about this troubleshooting section. Who is the audience that will be troubleshooting GFS2? I'm not sure it is the same audience that is using the rest of the document.

I'd appreciate any thoughts you have about what I should do with this information -- or if the direction I'm heading (leave the GFS troubleshooting section as is, put the trace points in an appendix) is ok.

Thanks,
-Steven

-------------------
From Dave Chinner in response:

Not so much a GFS2 point of view, but I would not expect anyone who is not familiar with the code to understand what the output of tracepoints really mean. They might be useful to support engineers that have been trained to understand them, but I doubt that customers want to dig that deeply into the inner workings of the system - that's why they pay for support....

Cheers, Dave.
--
Dave Chinner dchinner

-----------------------
From Steve Whitehouse:

Yes, I'd tend to at least partially agree here. On the other hand, given suitable basic information about the internals, I hope that customers would gain a better understanding of the principles behind the filesystem's operation and that should reduce the number of support calls we get relating to poor performance due to cache bouncing between nodes. We've had a lot of those reports from customers in the past, and anything we can do to enhance understanding and reduce the number of calls that support have to deal with, the better.

Also, this has come about because I wrote a brief comment aimed at the release notes, to indicate that we had a new feature - gfs2 tracepoints. That addition to the release notes was rejected on the basis that the feature should have more extensive documentation, and that resulted in the bz now opened to cover this documentation issue.

It might be worthwhile though to ask support about the best way to tackle this particular subject in case there are any items they'd like to add in/leave out specifically,

Steve.

-------------
From Perry Myers:

If we're not sure about this material being pertinent for the formal docs, why not just put the info in a public kbase? That would make it available for advanced users, w/o exploding the formal docs w/ potentially too much information. A kbase would also make it easy for support folks to find it.

Perry

----------------
From Steve Whitehouse:

That seems reasonable to me. The only question then is, should we release note the new feature? My feeling is yes, but I'll need to convince Ryan of the merits,

Steve.

----------------
From Perry:

I think a comment in the 'Technical Notes' section and in the errata should be sufficient. If someone creates the kbase prior to GA, we could even reference the kbase itself in the Technical Notes.

Perry

---------------
From Steve:

There are no technical notes for .0 releases, but the release notes should be ok I think. It doesn't need much of a mention - just something to say that it exists. Otherwise I think that sounds good,

Steve

---------------------
Other than that, it's documented in BZ#579598.
Allison: Based on the definition in Comment 4, why isn't this suitable for a tech brief? It's a very specific feature of the application, and not at all an end-user how-to. In fact, the glock information which I wound up putting in the GFS2 manual should probably also be a tech brief, by this definition. -Steven
Steven, Per our discussion about these articles, it was decided that all of the debug information should be included in product docs. I think a description of the glocks and tracepoints should definitely be in the product docs. Perry Myers and/or Sam Folk-Williams could probably give you more information. -Allison
Perry: I don't understand this. Why are we putting file system internal debugging information for file system/kernel developers into an end-user administration document rather than a technical brief? I can see why this is not a good fit for kbase, from this description, but for similar reasons it is not a good fit for the administration manual. In Comment 6, above, I reproduce something Steve W. wrote last year about customers needing to understand some issues that might cause poor file system performance, but that's addressed in the existing section that he provided on node locking (which already seems pretty advanced, but at least it gives specific user-level advice on what to look for and how you might address it): http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/s1-ov-lockbounce.html But, as Dave Chinner noted at the time, tracepoints won't mean much unless you are familiar with the actual file system code. This redefines the audience for this document. If we have defined a place for information of this sort -- in this case technical briefs -- why put it in administration documentation? Back when we first had the discussion there was a bit of logic of "Well, it has to go somewhere and there's no good place for it". In fact, I may have said that myself. But is that what technical briefs are for? We are looking at creating a new tuning document as part of the documentation overhaul, but even a cluster tuning document doesn't seem to encompass file system development and debugging. What is it that tracepoints are used for, for somebody setting up a system? If the logic here is that "debugging" goes into the product document, what is it that's being debugged? (That is: How the system is configured, or the file system code itself?) This seems to be kernel stuff. As always, I could be misinterpreting what this is for and who will use it. 
Also I'm thinking that GFS2 may be its own species, regarding how to organize the documentation. Still, this seems kernel-level to me. Am I wrong about that? -Steven
slevine: To be clear... I don't care where this information goes, so long as it is easily accessible and searchable by the customers. If you think a Tech Brief is a better place for this particular documentation, please coordinate with Allison and work out where this ultimately should live.
All, I think it would be helpful here to identify the audience and the purpose of these documents, as Steve is suggesting. I'm totally happy to have them be tech briefs. Tech Briefs cover a variety of areas, including specific use cases, setting up whole solutions, performance tuning, best practices, etc. Personally, I think having a comprehensive debugging guide in the product docs could be most appropriate. I definitely see Steve's point about confusing the audience of the administration guide. The trade off for tech briefs is that they are not systematically maintained for the life of the product, unlike the official documentation. On the other hand, we can produce them more rapidly and in an agile fashion. Perry/Sayan - what is the demand for these documents? Do we have a lot of customers asking for this information? Is this something that needs to be updated with every release, or is it OK to have it updated on demand? Sam
@Sam: Ric and SteveW would be better to ask, since this is specific to GFS which falls outside of my product (although it's very intertwined). I've set needinfo on them.
Let me see if I can answer some of the comment #11 questions... The demand is basically so that our more technical customers can solve some of their own performance (and other) issues, which will hopefully reduce the load on support. The customers who ask for the information tend to do so indirectly - by asking for debugging info. At least to the best of my knowledge. It will have to be updated from time to time since the tracepoints are not an API and we don't guarantee to keep them stable. Of course we won't change them if we don't have to, and we will try to keep them stable whenever possible. There will need to be an update for 6.2, for example, but there was no change from 6.0 to 6.1.
OK, I discussed this with Steven L and a few other folks. Based on customer demand and the direction from Sayan and Perry, we agree this should be included in product documentation. However, it will be difficult to integrate it into the existing admin guide. Therefore we think including it as an appendix targeted at developers is probably the best way to go. Steven L will review the guide and the references it points to and evaluate further. Other options would be to create a new guide for developers, or to do significant rewrites/expansion of the material to integrate it with the existing guide. The problem here is resources, so an appendix seems like the right compromise. -Sam
That seems like a reasonable plan to me. Is any more info required? I'm assuming that all is now settled at this point.
Steve: When I add this to the document I will need to be in touch with you about any edits or modifications I make, but my first-pass plan is to leave it in its current technical paper form, with a brief introduction that gives context and indicates who this is for. I'll be asking for your advice/review when I put that together. I may have to do some reworking, but what I see here is that this will remain as is, self-contained, almost as if we are publishing a standalone tech brief as an appendix to the document. To expand on Sam's summary, and to keep my reasoning on record in this bug: I still am not happy putting file system internals in an administration manual, but there is pretty much zero chance there will be an actual internals manual (that would require your/Bob's full-time work for a while I think). Here I quote one of the system administrators who responded to my informal attempt last year to get a sense of this issue -- mind you this wasn't about GFS in particular and this was not a Red Hat customer and it was a completely informal comment but it's still interesting: "If the system administrator needs to be explicitly concerned with file locks you have problems beyond what the documentation can solve." Meaning that this is below-the-covers stuff. But if I understand your Comment 13 there are GFS customers who are not your more standard system administrators and who are working at the level of tracepoints and examining file system locks to debug issues with file systems. Debugging file system performance? Or other sorts of problems? (That's the sort of question I will be asking you when I write up the introduction to this planned appendix.)
In any case, as noted, the information has to go somewhere and while I do think making it a tech brief would cleanly solve the differing-audience issue (which I think is a major issue), I wasn't clear on the maintenance issue you note in comment 13, which is that the information is subject to change on a point-release basis. I think maintaining a tech brief on the customer portal site at each point release is something that could easily get lost, but I update the GFS2 manual on that basis as a matter of course and it's easy for you to do exactly as you have always done here: file a bug or let me know when there's something that's changing for a release and it gets monitored from there as part of our standard bugzilla procedure. It's the maintenance issue that convinces me that of our options, putting it as an appendix in the admin manual is probably the easiest course. In sum: Nothing more required from you now but I will need your approval and review and perhaps some more contextual information when I move the information to the GFS2 manual.
Plan sounds ok to me, let me know if you need anything from engineering at this stage.
This comment is a status update to this BZ. I have added the tracepoints article as an appendix to my current working copy of the 6.2 gfs2 manual, reformatting accordingly. I sent Steve Whitehouse a list of small questions, to clarify and discuss a few things, and he has responded and provided information about a new supported tracepoint to add. I will incorporate his comments and then send the reformatted appendix for review in the RHEL 6.2 timeframe.
Status update: I have edited and formatted Steve Whitehouse's article on tracepoints into an appendix, which is in the current review draft of the GFS2 manual. I have sent Steve a note asking him to look this over. Once he gives his approval I will move this BZ to MODIFIED. I think we're pretty safe for a 6.2 release of the material in the document.