Bug 1274834 - check_cache 'slowness'
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: device-mapper-persistent-data
Hardware: x86_64 Linux
Priority: unspecified  Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Joe Thornber
QA Contact: Jakub Krysl
Depends On:
Blocks: 1469559
Reported: 2015-10-23 11:25 EDT by Corey Marthaler
Modified: 2017-08-16 04:29 EDT (History)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description Corey Marthaler 2015-10-23 11:25:54 EDT
Description of problem:

Copying comment #15 from bug 1189051

dmsetup create --table '0 2000 error'  errdev

cache_check  /dev/mapper/errdev

examining superblock
  superblock is corrupt
    incomplete io for block 0, e.res = 18446744073709551611, e.res2 = 0, offset = 0, nbytes = 4096
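The e.res value in the message above looks like a negative number printed as an unsigned 64-bit integer. The following is a minimal sketch of how to decode it, assuming e.res carries the Linux AIO completion result (a negated errno on failure); the helper name decode_aio_res is hypothetical and not part of cache_check.

```python
import errno
import os


def decode_aio_res(res_unsigned: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    if res_unsigned >= 1 << 63:
        return res_unsigned - (1 << 64)
    return res_unsigned


# Value copied from the cache_check output above.
res = decode_aio_res(18446744073709551611)
print(res)  # -5

if res < 0:
    # Assumption: a negative result is a negated errno value.
    print(errno.errorcode.get(-res), os.strerror(-res))
```

On Linux, errno 5 is EIO, which is what a device-mapper 'error' target would be expected to return for every read.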

--- from here on it proceeds very slowly ---

(gdb) bt
#0  0x00007f6cc456c644 in __io_getevents_0_4 (ctx=0x7f6cc4c9d000, min_nr=1, nr=3949, events=0x555ce9562820, timeout=0x0)
    at io_getevents.c:25
#1  0x00007f6cc456c67d in io_getevents_0_4 (ctx=<optimized out>, min_nr=min_nr@entry=1, nr=<optimized out>, 
    events=<optimized out>, timeout=timeout@entry=0x0) at io_getevents.c:54
#2  0x0000555ce7e3a7f0 in bcache::block_cache::wait_io (this=this@entry=0x555ce955e6c8) at block-cache/block_cache.cc:202
#3  0x0000555ce7e3add0 in bcache::block_cache::wait_all (this=<optimized out>) at block-cache/block_cache.cc:261
#4  bcache::block_cache::flush (this=this@entry=0x555ce955e6c8) at block-cache/block_cache.cc:674
#5  0x0000555ce7e3aecc in bcache::block_cache::~block_cache (this=0x555ce955e6c8, __in_chrg=<optimized out>)
    at block-cache/block_cache.cc:491
#6  0x0000555ce7eb1ef3 in persistent_data::block_manager<4096u>::~block_manager (this=0x555ce955e6c0, 
    __in_chrg=<optimized out>) at persistent-data/block.h:42
#7  boost::checked_delete<persistent_data::block_manager<4096u> > (x=0x555ce955e6c0)
    at /usr/include/boost/core/checked_delete.hpp:34
#8  boost::detail::sp_counted_impl_p<persistent_data::block_manager<4096u> >::dispose (this=<optimized out>)
    at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78
#9  0x0000555ce7e3eaf8 in boost::detail::sp_counted_base::release (this=0x555ce95815d0)
    at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
#10 boost::detail::shared_count::~shared_count (this=0x7ffc84793ad8, __in_chrg=<optimized out>)
    at /usr/include/boost/smart_ptr/detail/shared_count.hpp:467
#11 boost::shared_ptr<persistent_data::block_manager<4096u> >::~shared_ptr (this=0x7ffc84793ad0, __in_chrg=<optimized out>)
    at /usr/include/boost/smart_ptr/shared_ptr.hpp:330
#12 (anonymous namespace)::metadata_check (fs=<synthetic pointer>, path="/dev/mapper/errdev") at caching/cache_check.cc:224
#13 (anonymous namespace)::check (fs=<synthetic pointer>, path="/dev/mapper/errdev") at caching/cache_check.cc:300
#14 (anonymous namespace)::check_with_exception_handling (fs=<synthetic pointer>, path="/dev/mapper/errdev")
    at caching/cache_check.cc:318
#15 cache_check_main (argc=<optimized out>, argv=<optimized out>) at caching/cache_check.cc:410
#16 0x0000555ce7e364d4 in base::command::run (this=0x555ce815d6c0 <caching::cache_check_cmd>, argv=0x7ffc84794798, argc=2)
    at base/application.h:26
#17 base::application::run (this=0x7ffc84794680, argc=2, argv=0x7ffc84794798) at base/application.cc:32
#18 0x0000555ce7e35431 in main (argc=2, argv=0x7ffc84794798) at main.cc:39

---  at some point it will finish ---

but the time taken clearly depends on the size of the errored device

For the 'reproducer' with 2000 sectors it takes about 40 seconds.
With 20000 sectors it had NOT finished after 10 minutes, so the time grows faster than linearly with the device size.
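The worse-than-linear claim can be checked with a little arithmetic. This is only a rough sketch: it treats the 40 second and 10 minute figures from this comment as exact, and the function name linear_estimate is made up for illustration.

```python
def linear_estimate(base_sectors: int, base_seconds: float, target_sectors: int) -> float:
    """Runtime expected at target_sectors if time scaled linearly with size."""
    return base_seconds * target_sectors / base_sectors


# Figures reported above: 2000 sectors took about 40 seconds.
estimate = linear_estimate(2000, 40.0, 20000)

# The 20000-sector run was still going after 10 minutes.
observed_lower_bound = 10 * 60

print(f"linear estimate: {estimate:.0f} s")
print(f"observed: more than {observed_lower_bound} s")
print("worse than linear:", observed_lower_bound > estimate)
```

A linear increase would predict roughly 400 seconds for the larger device, so a run that is still going after 600 seconds is already past that estimate.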

So a new bug should be created for the cache_check tool to address this 'slowness'.

As for actually testing this NEEDSCHECKFLAG, I'd suggest restoring the 'previous' table content before 'vgchange -an', since cache_check is executed in the 'deactivation' phase as well as in the 'activation' phase.

So if you want to check that 'needs' has been cleared in this test case, you would need to give it back the original device after the cache target itself has switched to 'Fail' mode.

Version-Release number of selected component (if applicable):

lvm2-2.02.130-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
lvm2-libs-2.02.130-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
lvm2-cluster-2.02.130-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
device-mapper-1.02.107-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
device-mapper-libs-1.02.107-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
device-mapper-event-1.02.107-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
device-mapper-event-libs-1.02.107-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
device-mapper-persistent-data-0.5.5-1.el7    BUILT: Thu Aug 13 09:58:10 CDT 2015
cmirror-2.02.130-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
sanlock-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
sanlock-lib-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
lvm2-lockd-2.02.130-5.el7    BUILT: Wed Oct 14 08:27:29 CDT 2015
