Bug 498489 - blktrace stops working after a trace-file-directory replacement
Summary: blktrace stops working after a trace-file-directory replacement
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Eric Sandeen
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 533192
Reported: 2009-04-30 17:42 UTC by Milos Malik
Modified: 2018-05-08 14:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:31:52 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2010:0178
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update
Last Updated: 2010-03-29 12:18:21 UTC

Description Milos Malik 2009-04-30 17:42:18 UTC
Description of problem:
I wanted to know how blktrace copes with a situation where one of the trace files (which are usually overwritten in the next run) is in fact a directory. The utility terminated with an error message, which is correct behavior. But after the directory was removed, the utility stopped working, and it now always terminates with a different error message. Removing all trace files doesn't help. Remounting /sys/kernel/debug doesn't help either.

Version-Release number of selected component (if applicable):
blktrace-1.0.0-3.el5

How reproducible:
always (i386, ppc, ia64, x86_64, s390x)

Steps to Reproduce:
# df -aT | grep -e debug -e md0
/dev/md0      ext3    17834416  14265104   2648736  85% /
debugfs    debugfs           0         0         0   -  /sys/kernel/debug
# blktrace -d /dev/md0 -w 5
Device: /dev/md0
  CPU  0:                    0 events,       19 KiB data
  CPU  1:                    0 events,        7 KiB data
  Total:                     0 events (dropped 0),       26 KiB data
# ls -l *.blktrace.*
-rw-r--r-- 1 root devqa7 19288 Apr 30 13:03 md0.blktrace.0
-rw-r--r-- 1 root devqa7  6648 Apr 30 13:04 md0.blktrace.1
# rm md0.blktrace.1
rm: remove regular file `md0.blktrace.1'? y
# mkdir md0.blktrace.1
# blktrace -d /dev/md0 -w 5
./md0.blktrace.1: Is a directory
Failed to start worker threads
# rmdir md0.blktrace.1
# ls -l *.blktrace.*
-rw-r--r-- 1 root devqa7 0 Apr 30  2009 md0.blktrace.0
# blktrace -d /dev/md0 -w 5
BLKTRACESETUP: No such file or directory
Failed to start trace on /dev/md0
# rm -f *.blktrace.*
# blktrace -d /dev/md0 -w 5
BLKTRACESETUP: No such file or directory
Failed to start trace on /dev/md0
# blktrace -d /dev/md0 -w 5
BLKTRACESETUP: No such file or directory
Failed to start trace on /dev/md0
  
Actual results:
blktrace is not working

Expected results:
blktrace is working again

Additional information:
On all machines where I reproduced this bug, the /sys/kernel/debug/ filesystem contains a directory called /sys/kernel/debug/block/<device-name> (e.g. /sys/kernel/debug/block/md0). This directory is not present on machines where the trace file was never replaced by a directory.
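
The observation above can be turned into a quick check. This is only a minimal sketch: the function name blktrace_dir_present is made up for illustration, and it assumes debugfs is mounted at /sys/kernel/debug unless another mount point is passed as the second argument. Note that the directory also exists while a trace is legitimately running, so its presence only points at this bug when no blktrace process is active.

```shell
# Hypothetical helper (not part of blktrace): succeeds if the per-device
# blktrace directory exists under debugfs.
#   $1 = device name, e.g. md0
#   $2 = debugfs mount point (assumed default: /sys/kernel/debug)
blktrace_dir_present() {
    dev=$1
    debugfs=${2:-/sys/kernel/debug}
    [ -d "$debugfs/block/$dev" ]
}
```

For example, `blktrace_dir_present md0 && echo "block/md0 exists"` reports the directory described above on an affected machine.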

Comment 1 Eric Sandeen 2009-04-30 18:09:07 UTC
Hm, seems to work for me on /dev/sda.  Did you always test with /dev/md0?

[root@bear-05 tmp]# blktrace -d /dev/sda -w 5
Device: /dev/sda
  CPU  0:                    0 events,        4 KiB data
  CPU  1:                    0 events,        1 KiB data
  CPU  2:                    0 events,        0 KiB data
  CPU  3:                    0 events,        0 KiB data
  Total:                     0 events (dropped 0),        4 KiB data
[root@bear-05 tmp]# ls -l *.blktrace.*
-rw-r--r-- 1 root root 3248 Apr 30 13:19 sda.blktrace.0
-rw-r--r-- 1 root root  464 Apr 30 13:19 sda.blktrace.1
-rw-r--r-- 1 root root    0 Apr 30 13:19 sda.blktrace.2
-rw-r--r-- 1 root root    0 Apr 30 13:19 sda.blktrace.3
[root@bear-05 tmp]# rm sda.blktrace.1
rm: remove regular file `sda.blktrace.1'? y
[root@bear-05 tmp]# mkdir sda.blktrace.1
[root@bear-05 tmp]# blktrace -d /dev/sda -w 5
./sda.blktrace.1: Is a directory
Failed to start worker threads
[root@bear-05 tmp]# rmdir sda.blktrace.1
[root@bear-05 tmp]# blktrace -d /dev/sda -w 5
Device: /dev/sda
  CPU  0:                    0 events,        0 KiB data
  CPU  1:                    0 events,        9 KiB data
  CPU  2:                    0 events,      162 KiB data
  CPU  3:                    0 events,        0 KiB data
  Total:                     0 events (dropped 0),      171 KiB data
[root@bear-05 tmp]# rm -f *blktrace*
[root@bear-05 tmp]# blktrace -d /dev/sda -w 5
Device: /dev/sda
  CPU  0:                    0 events,        0 KiB data
  CPU  1:                    0 events,        1 KiB data
  CPU  2:                    0 events,        4 KiB data
  CPU  3:                    0 events,        0 KiB data
  Total:                     0 events (dropped 0),        5 KiB data

Comment 3 Eric Sandeen 2009-05-05 21:03:56 UTC
Maybe I should ask which kernel you're testing?  I still can't reproduce this, although I think I see a couple of upstream patches that help with proper teardown in some circumstances.

Thanks,
-Eric

Comment 5 Eric Sandeen 2009-05-06 19:34:27 UTC
http://git.engineering.redhat.com/?p=linux-2.6.git;a=commitdiff_plain;h=35fc51e7a5056889421270c1fb63d8ec45fbccf4

seems to fix this; it's a kernel change.

From: Aneesh Kumar K.V <aneesh.kumar.ibm.com>
Date: Wed, 21 Nov 2007 11:25:41 +0000 (+0100)
Subject: blktrace: Make sure BLKTRACETEARDOWN does the full cleanup.
X-Git-Tag: v2.6.24-rc4~87^2~5
X-Git-Url: http://git.engineering.redhat.com/?p=linux-2.6.git;a=commitdiff_plain;h=35fc51e7a5056889421270c1fb63d8ec45fbccf4

blktrace: Make sure BLKTRACETEARDOWN does the full cleanup.

if blktrace program segfault it will not be able
to call BLKTRACETEARDOWN. Now if we run the blktrace
again that would result in a failure to create the
block/<device> debugfs directory.This will result
in blk_remove_root() to be called which will set
blk_tree_root to NULL. But the  debugfs block dir
still exist because it contain subdirectory.

Now if we try to fix it using BLKTRACETEARDOWN
it won't work because blk_tree_root is NULL.

Fix the same.

--------

I guess we'll need an exception to get this in ...

-Eric
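
The sequence in the quoted commit message can be modelled in user space. The script below is a toy model under stated assumptions, not the kernel code: a temporary directory stands in for debugfs, the shell variable blk_tree_root stands in for the kernel pointer of the same name, and trace_setup / trace_teardown are hypothetical stand-ins for the BLKTRACESETUP and BLKTRACETEARDOWN ioctls. The commit message talks about a segfault; an early exit such as the "Is a directory" failure in the description would presumably leave the same state, but that is an assumption and is not confirmed in this report.

```shell
#!/bin/sh
# Toy user-space model of the failure sequence in the commit message.
# NOT kernel code: names and logic are illustrative assumptions.

DEBUGFS=$(mktemp -d)              # stands in for /sys/kernel/debug
trap 'rm -rf "$DEBUGFS"' EXIT
blk_tree_root=""                  # stands in for the kernel pointer

trace_setup() {   # $1 = device name
    if [ -z "$blk_tree_root" ]; then
        # Creating "block" fails if a stale directory is already there.
        mkdir "$DEBUGFS/block" 2>/dev/null || return 1
        blk_tree_root="$DEBUGFS/block"
    fi
    if ! mkdir "$blk_tree_root/$1" 2>/dev/null; then
        # Models blk_remove_root(): the directory is not empty, so it
        # survives, but the pointer is cleared anyway.
        rmdir "$blk_tree_root" 2>/dev/null
        blk_tree_root=""
        return 1
    fi
}

trace_teardown() {   # $1 = device name
    # Pre-fix behaviour as described: nothing can be cleaned up once
    # the root pointer has been lost.
    [ -n "$blk_tree_root" ] || return 1
    rmdir "$blk_tree_root/$1" 2>/dev/null
}

trace_setup md0    && echo "run 1: setup ok (no teardown follows)"
trace_setup md0    || echo "run 2: setup failed, root pointer cleared"
trace_teardown md0 || echo "teardown: cannot clean up, root pointer is empty"
trace_setup md0    || echo "run 3: setup still fails"
[ -d "$DEBUGFS/block/md0" ] && echo "stale block/md0 is still present"
```

In this model every later setup attempt fails because the stale block/md0 entry keeps the block directory alive while nothing points to it any more, which matches the repeated BLKTRACESETUP errors in the description.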

Comment 6 Eric Sandeen 2009-05-06 19:36:49 UTC
Requesting an exception for this one; without it, blktrace gets into a state where tracing no longer works.  The change is upstream and is confined to block/blktrace.c, which can't regress, since we've never supported it before ...

Comment 8 RHEL Program Management 2009-05-12 17:39:21 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 11 RHEL Program Management 2009-09-25 17:39:04 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 12 Don Zickus 2009-12-11 19:26:54 UTC
in kernel-2.6.18-179.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please update the appropriate value in the Verified field
(cf_verified) to indicate this fix has been successfully
verified. Include a comment with verification details.
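
Before re-running the reproducer for verification, it may help to confirm that the running kernel is at least the build named above. The following is a rough sketch, not an official check: the function name rhel5_kernel_has_fix is hypothetical, it assumes RHEL 5 style release strings of the form 2.6.18-<build>.el5, and it assumes every build numbered 179 or higher carries the patch.

```shell
# Hypothetical helper: compare a RHEL 5 kernel release against build 179.
#   $1 = kernel release string (default: output of `uname -r`)
# Returns 0 if build >= 179, 1 if older, 2 if the string is not in the
# assumed 2.6.18-<build>... form.
rhel5_kernel_has_fix() {
    release=${1:-$(uname -r)}
    case $release in
        2.6.18-*) ;;
        *) return 2 ;;
    esac
    build=${release#2.6.18-}   # e.g. "179.el5" or "164.11.1.el5"
    build=${build%%.*}         # leading build number, e.g. "179"
    case $build in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$build" -ge 179 ]
}
```

Called without arguments, it inspects the running kernel; a release string such as 2.6.18-179.el5 can also be passed explicitly.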

Comment 16 errata-xmlrpc 2010-03-30 07:31:52 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

