Description of problem:
Assertion failure:

ngms0 kernel: Assertion failure in journal_forget_R575811ba() at transaction.c:1247: "!jh->b_committed_data"

Version-Release number of selected component (if applicable):

How reproducible:
Unknown

Steps to Reproduce:
1. unknown
2.
3.

Actual results:
Filesystem was severely corrupted

Expected results:

Additional info:
Apologies for the limited info, but the customer has returned his systems to RH9, and has neither details on reproduction steps nor any more detail on the problem than this. Will update this ticket with more info if/when it arrives.
This is an assert failure scenario that I discovered very recently. It does not cause filesystem corruption: rather, it is a _symptom_ of filesystem corruption. If disk corruption causes an indirect block to be corrupted, it is possible for a bitmap block to become listed in the indirect block. If the user tries to delete that file, this panic can result: bitmaps have extra metadata associated with them and an attempt to throw that metadata away would violate ext3's internal assumptions. I am already looking at ways of improving ext3's behaviour when such corruption is seen. But such a fix won't cure any on-disk corruption; it will just make ext3 react to it more gracefully. To diagnose the initial data corruption we would need more information.
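To illustrate the kind of sanity check described above, here is a minimal user-space sketch in C. It is not the ext3 fix and does not use kernel data structures: the struct toy_group layout and the toy_* function names are hypothetical, invented only for this example. The idea is to scan the entries of an indirect block and flag any entry that points at a block bitmap, inode bitmap, or inode table block before anything tries to free or forget it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified description of one block group's metadata.
 * This is NOT the kernel's struct ext3_group_desc; it only models the
 * block numbers needed for the check.
 */
struct toy_group {
    uint32_t block_bitmap;     /* block number of the block bitmap */
    uint32_t inode_bitmap;     /* block number of the inode bitmap */
    uint32_t inode_table;      /* first block of the inode table */
    uint32_t inode_table_len;  /* number of blocks in the inode table */
};

/* Return true if blk lands on any group's bitmaps or inode table. */
static bool toy_block_is_metadata(const struct toy_group *groups,
                                  size_t ngroups, uint32_t blk)
{
    for (size_t i = 0; i < ngroups; i++) {
        const struct toy_group *g = &groups[i];

        if (blk == g->block_bitmap || blk == g->inode_bitmap)
            return true;
        if (blk >= g->inode_table &&
            blk - g->inode_table < g->inode_table_len)
            return true;
    }
    return false;
}

/*
 * Scan the entries of an indirect block. Return the index of the first
 * entry that points at filesystem metadata, or -1 if every entry looks
 * like a plausible data block. Zero entries are holes and are skipped.
 */
static long toy_first_bad_entry(const struct toy_group *groups,
                                size_t ngroups,
                                const uint32_t *entries, size_t nentries)
{
    assert(groups != NULL && entries != NULL);

    for (size_t i = 0; i < nentries; i++) {
        if (entries[i] == 0)
            continue;
        if (toy_block_is_metadata(groups, ngroups, entries[i]))
            return (long)i;
    }
    return -1;
}
```

A caller that gets a non-negative index back could refuse to free that block and report a filesystem error instead of asserting. As noted above, a check of this kind only changes how the corruption is handled; it does not repair what is already wrong on disk.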
Current errata deal with this failure gracefully, but it won't be possible to diagnose the underlying problem any further with the information available.