Bug 1263189
| Summary: | blktrace gets stuck after a trace-file is replaced by a directory of the same name | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Milos Malik <mmalik> |
| Component: | blktrace | Assignee: | Eric Sandeen <esandeen> |
| Status: | CLOSED ERRATA | QA Contact: | Karel Srot <ksrot> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.2 | CC: | esandeen |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | blktrace-1.0.5-8.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-04 07:41:11 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Milos Malik
2015-09-15 09:44:37 UTC
Fixed up by upstream commit

commit 838361c6cfb1319eadd59daaf9074dcdb92746e6
Author: Robert Schiele <rschiele>
Date: Mon Sep 8 09:38:52 2014 +0200

    signal condition variable at end of stop_tracers

    stop_tracers modifies tp->is_done and thus must signal the condition
    variable tracer_wait_unblock is waiting on to monitor tp->is_done.

    Not doing so might cause the tool to deadlock if stop_tracers is called
    while a tracer thread is in tracer_wait_unblock.

    Signed-off-by: Robert Schiele <rschiele>
    Signed-off-by: Jens Axboe <axboe>

It's a one-liner :)

Hi Eric,

I have encountered the following behaviour during testing on non-Intel architectures:

1. Have the bogus blktrace version installed.
2. Run the reproducer for this bug (blktrace gets stuck and has to be killed, as reported in #c0).
3. Run the blktrace command again; it fails with:

   BLKTRACESETUP(2) /dev/vda2 failed: 2/No such file or directory
   Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file or directory
   Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
   FAILED to start thread on CPU 0: 1/Operation not permitted
   FAILED to start thread on CPU 1: 1/Operation not permitted

4. Another blktrace command then succeeds.

It seems that after killing blktrace some "locks" remained in place that prevented the next blktrace from running properly, but that same unsuccessful run removed these "locks". I am just curious whether this is expected.

Karel, that's only with the old version, right?

Yes, I think that when it fails, it leaves the kernel in an inconsistent state; then the next run fails due to the inconsistency but cleans up correctly, which makes it all consistent again. I think I have seen this as well; I would need to dig into it a bit more to get the exact sequence of what happens.

(In reply to Eric Sandeen from comment #5)
> Karel, that's only with the old version, right?

Well, yes and no. As the new version has fixed the bug and doesn't get stuck, I don't need to kill it. OTOH, I could reproduce it artificially even with the new version using

# blktrace -d /dev/vda2 -w 30 &
[1] 1280
# kill -9 1280
#
[1]+ Killed blktrace -d /dev/vda2 -w 30
#
# blktrace -d /dev/vda2 -w 5
BLKTRACESETUP(2) /dev/vda2 failed: 2/No such file or directory
Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file or directory
FAILED to start thread on CPU 0: 1/Operation not permitted
#
# blktrace -d /dev/vda2 -w 5
=== vda2 ===
CPU 0: 32 events, 2 KiB data
Total: 32 events (dropped 0), 2 KiB data

Just to clarify, I consider the bug to be successfully fixed. I was just curious what is going on here. Maybe blktrace could be updated to do the "cleanup" on start; that would clearly be a new bug/RFE report and maybe not trivial to implement.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2499.html
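
For readers unfamiliar with the pattern named in the upstream commit message, here is a minimal sketch of why a thread that changes a condition-variable predicate must also signal the condition. This is not the blktrace source: `struct tracer`, `wait_unblock`, `stop_tracer` and `run_demo` are hypothetical names chosen for illustration, and only the `is_done` flag mirrors a name from the commit message. The per-struct mutex and condition variable are likewise an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracer's state (not blktrace's real struct). */
struct tracer {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool is_done;
};

/* Waiter side: sleep until is_done becomes true. */
static void *wait_unblock(void *arg)
{
    struct tracer *tp = arg;

    pthread_mutex_lock(&tp->mutex);
    while (!tp->is_done)                      /* re-check after every wakeup */
        pthread_cond_wait(&tp->cond, &tp->mutex);
    pthread_mutex_unlock(&tp->mutex);
    return NULL;
}

/* Stopper side: change the predicate, then wake any waiter. */
static void stop_tracer(struct tracer *tp)
{
    pthread_mutex_lock(&tp->mutex);
    tp->is_done = true;
    /* Without this wakeup, a thread already blocked in pthread_cond_wait()
     * would never re-check is_done and would sleep forever. */
    pthread_cond_broadcast(&tp->cond);
    pthread_mutex_unlock(&tp->mutex);
}

/* Starts one waiter, stops it, and returns 1 once the waiter has exited. */
static int run_demo(void)
{
    struct tracer tp = { .is_done = false };
    pthread_t waiter;

    if (pthread_mutex_init(&tp.mutex, NULL) != 0)
        return 0;
    if (pthread_cond_init(&tp.cond, NULL) != 0)
        return 0;
    if (pthread_create(&waiter, NULL, wait_unblock, &tp) != 0)
        return 0;

    stop_tracer(&tp);

    if (pthread_join(waiter, NULL) != 0)
        return 0;
    assert(tp.is_done);                       /* sanity check on the predicate */

    pthread_cond_destroy(&tp.cond);
    pthread_mutex_destroy(&tp.mutex);
    return 1;
}
```

Calling `run_demo()` from a `main` function and building with `cc -pthread` should return promptly. Removing the `pthread_cond_broadcast` call makes the outcome depend on timing: if the waiter is already blocked when `stop_tracer` runs, the join never completes, which is the kind of hang the commit message describes.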