Bug 1233103

Summary: thin_dump[5302]: segfault at 0 ip 0000000000498f84 sp 00007fff6d905900 error 4 in pdata_tools[400000+11f000]
Product: Red Hat Enterprise Linux 7
Reporter: Marian Csontos <mcsontos>
Component: device-mapper-persistent-data
Assignee: Zdenek Kabelac <zkabelac>
Status: CLOSED ERRATA
QA Contact: Bruno Goncalves <bgoncalv>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.2
CC: agk, bgoncalv, heinzm, knappch, lvm-team, mcsontos, msnitzer, prajnoha, thornber, zkabelac
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: device-mapper-persistent-data-0.5.2-1.el7
Doc Type: Bug Fix
Doc Text:
Intra-release bug. No documentation needed.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-19 09:39:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  gdb: bt full (flags: none)
  core dump (flags: none)
  Compressed failing metadata content (dd) (flags: none)

Description Marian Csontos 2015-06-18 08:49:31 UTC
Created attachment 1040369 [details]
gdb: bt full

Description of problem:
segfault in thin_dump

Version-Release number of selected component (if applicable):

device-mapper-persistent-data-0.4.2-1.el7.x86_64
Linux bot-rhel7-x86-64.lab.eng.brq.redhat.com 3.10.0-229.4.2.el7.x86_64 #1 SMP Fri Apr 24 15:26:38 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

(7.2 nightly build: rhel/nightly/RHEL-7.2-20150616.n.0/)

How reproducible:
100%

Steps to Reproduce:
1. run lvm2-testsuite T=shell/lvconvert-repair-thin.sh

Additional info:
see attachments

Comment 1 Marian Csontos 2015-06-18 08:55:00 UTC
Created attachment 1040373 [details]
core dump

Comment 3 Marian Csontos 2015-06-18 08:59:54 UTC
Some context on what the test was doing:

227 [ 0:02] # Make some 'repairable' damage??
228 [ 0:02] dd if=/dev/zero of="$DM_DEV_DIR/$vg/repair" bs=1 seek=40960 count=1
229 [ 0:02] #lvconvert-repair-thin.sh:61+ dd if=/dev/zero of=/dev/@PREFIX@vg/repair bs=1 seek=40960 count=1
230 [ 0:02] 1+0 records in
231 [ 0:02] 1+0 records out
232 [ 0:02] 1 byte (1 B) copied, 0.000223267 s, 4.5 kB/s
233 [ 0:02] 
234 [ 0:02] not "$LVM_TEST_THIN_CHECK_CMD" "$DM_DEV_DIR/$vg/repair"
235 [ 0:02] #lvconvert-repair-thin.sh:63+ not /usr/sbin/thin_check /dev/@PREFIX@vg/repair
236 [ 0:02] examining superblock
237 [ 0:02] examining devices tree
238 [ 0:02] examining mapping tree
239 [ 0:02] mapping_tree_damage_visitor: path too long
240 [ 0:02] 
241 [ 0:02] not "$LVM_TEST_THIN_DUMP_CMD" "$DM_DEV_DIR/$vg/repair" | tee dump
242 [ 0:02] #lvconvert-repair-thin.sh:65+ not /usr/sbin/thin_dump /dev/@PREFIX@vg/repair
243 [ 0:02] #lvconvert-repair-thin.sh:65+ tee dump
244 [ 0:02] <superblock uuid="" time="0" transaction="2" data_block_size="256" nr_data_blocks="160">
245 [ 0:02]   <device dev_id="1" mapped_blocks="6" transaction="0" creation_time="0" snap_time="0">
246 [ 0:02]     <range_mapping origin_begin="0" data_begin="0" length="2" time="0"/>
247 [ 0:02]     <range_mapping origin_begin="64" data_begin="2" length="3" time="0"/>
248 [ 0:02]     <single_mapping origin_block="79" data_block="5" time="0"/>
249 [ 0:02]   </device>
250 [ 0:02]   <device dev_id="2" mapped_blocks="6" transaction="1" creation_time="0" snap_time="0">
251 [ 0:02] Process 5338 died of signal 11.

The rest of the script is here: https://git.fedorahosted.org/cgit/lvm2.git/tree/test/shell/lvconvert-repair-thin.sh?h=dev-dct-lvmlockd-test&id=ec567103a59cfe7a38088a22df3cc6b72c244ede#n65
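
To locate the damage the test injects: `dd ... bs=1 seek=40960 count=1` zeroes the single byte at offset 40960. The sketch below maps that offset to a metadata block, assuming 4096-byte metadata blocks (an assumption of this example, as are the names `kMetadataBlockSize` and `locate`). Under that assumption the byte is the first byte of block 10, which matches `subtree_root=10` / `tree_root=10` in the backtrace in comment 7. If the node checksum is stored at the start of the block (also an assumption here), this would be consistent with the "bad checksum in btree node" damage description.

```cpp
#include <cstdint>

// Assumption: thin-pool metadata is organised in 4 KiB blocks.
constexpr std::uint64_t kMetadataBlockSize = 4096;

struct BlockLocation {
    std::uint64_t block;   // index of the metadata block containing the byte
    std::uint64_t offset;  // position of the byte inside that block
};

// Map a byte offset on the metadata device to (block, offset-in-block).
constexpr BlockLocation locate(std::uint64_t byte_offset) {
    return {byte_offset / kMetadataBlockSize, byte_offset % kMetadataBlockSize};
}
```

For the test's offset, `locate(40960)` gives block 10 and offset 0.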

Comment 6 Marian Csontos 2015-06-26 17:12:51 UTC
Just tested with device-mapper-persistent-data-0.5.1-1.el7.x86_64, with the same result.

Comment 7 Zdenek Kabelac 2015-06-29 10:14:55 UTC
Here is the backtrace from my rawhide:
(Compiled without -O8 for better readability)

Note: when configured with --enable-debug, installed binaries should not be stripped.

I assume it tries to do something with an already bad mapping?


Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000049a387 in (anonymous namespace)::single_mapping_tree_damage_visitor::visit (this=0x7ffed05c6b90, path=..., d=...) at thin-provisioning/mapping_tree.cc:199
199					v_.visit(missing_mappings(d.desc_, path[0], d.lost_keys_));
(gdb) bt
#0  0x000000000049a387 in (anonymous namespace)::single_mapping_tree_damage_visitor::visit (this=0x7ffed05c6b90, path=std::vector of length 0, capacity 0, d=...)
    at thin-provisioning/mapping_tree.cc:199
#1  0x00000000004a2041 in persistent_data::btree_detail::btree_damage_visitor<thin_provisioning::mapping_tree_detail::mapping_visitor, (anonymous namespace)::single_mapping_tree_damage_visitor, 1u, thin_provisioning::mapping_tree_detail::block_traits>::issue_damage (this=0x7ffed05c6ab0, path=std::vector of length 0, capacity 0, r=...)
    at ./persistent-data/data-structures/btree_damage_visitor.h:434
#2  0x000000000049cdc1 in persistent_data::btree_detail::btree_damage_visitor<thin_provisioning::mapping_tree_detail::mapping_visitor, (anonymous namespace)::single_mapping_tree_damage_visitor, 1u, thin_provisioning::mapping_tree_detail::block_traits>::maybe_issue_damage (this=0x7ffed05c6ab0, path=std::vector of length 0, capacity 0) at ./persistent-data/data-structures/btree_damage_visitor.h:454
#3  0x000000000049bd3a in persistent_data::btree_detail::btree_damage_visitor<thin_provisioning::mapping_tree_detail::mapping_visitor, (anonymous namespace)::single_mapping_tree_damage_visitor, 1u, thin_provisioning::mapping_tree_detail::block_traits>::end_walk (this=0x7ffed05c6ab0) at ./persistent-data/data-structures/btree_damage_visitor.h:428
#4  0x000000000049b2d6 in persistent_data::btree_detail::btree_damage_visitor<thin_provisioning::mapping_tree_detail::mapping_visitor, (anonymous namespace)::single_mapping_tree_damage_visitor, 1u, thin_provisioning::mapping_tree_detail::block_traits>::visit_complete (this=0x7ffed05c6ab0) at ./persistent-data/data-structures/btree_damage_visitor.h:193
#5  0x00000000004a2a12 in persistent_data::btree<1u, thin_provisioning::mapping_tree_detail::block_traits>::visit_depth_first (this=0x7ffed05c6bc0, v=...)
    at ./persistent-data/data-structures/btree.tcc:811
#6  0x000000000049ac6c in persistent_data::btree_visit_values<1u, thin_provisioning::mapping_tree_detail::block_traits, thin_provisioning::mapping_tree_detail::mapping_visitor, (anonymous namespace)::single_mapping_tree_damage_visitor> (tree=..., value_visitor=warning: can't find linker symbol for virtual table for `thin_provisioning::mapping_tree_detail::mapping_visitor' value
..., damage_visitor=...) at ./persistent-data/data-structures/btree_damage_visitor.h:488
#7  0x000000000049a66e in thin_provisioning::walk_mapping_tree (tree=..., mv=warning: can't find linker symbol for virtual table for `thin_provisioning::mapping_tree_detail::mapping_visitor' value
..., dv=warning: can't find linker symbol for virtual table for `thin_provisioning::mapping_tree_detail::damage_visitor' value
...) at thin-provisioning/mapping_tree.cc:253
#8  0x00000000004aacba in (anonymous namespace)::mapping_tree_emitter::emit_mappings (this=0x7ffed05c7290, subtree_root=10) at thin-provisioning/metadata_dumper.cc:205
#9  0x00000000004aaa81 in (anonymous namespace)::mapping_tree_emitter::visit (this=0x7ffed05c7290, path=std::vector of length 1, capacity 1 = {...}, tree_root=10)
    at thin-provisioning/metadata_dumper.cc:188
#10 0x000000000049cc9c in persistent_data::btree_detail::btree_damage_visitor<thin_provisioning::mapping_tree_detail::device_visitor, (anonymous namespace)::dev_tree_damage_visitor, 1u, thin_provisioning::mapping_tree_detail::mtree_traits>::visit_values (this=0x7ffed05c7170, path=std::vector of length 0, capacity 0, n=...) at ./persistent-data/data-structures/btree_damage_visitor.h:211
#11 0x000000000049b536 in persistent_data::btree_detail::btree_damage_visitor<thin_provisioning::mapping_tree_detail::device_visitor, (anonymous namespace)::dev_tree_damage_visitor, 1u, thin_provisioning::mapping_tree_detail::mtree_traits>::visit_leaf (this=0x7ffed05c7170, loc=..., n=...) at ./persistent-data/data-structures/btree_damage_visitor.h:187
#12 0x00000000004a30ee in persistent_data::btree<1u, thin_provisioning::mapping_tree_detail::mtree_traits>::walk_tree_internal (this=0x1ca7440, v=..., loc=..., b=9)
    at ./persistent-data/data-structures/btree.tcc:874
#13 0x00000000004a2a73 in persistent_data::btree<1u, thin_provisioning::mapping_tree_detail::mtree_traits>::walk_tree (this=0x1ca7440, v=..., loc=..., b=9)
    at ./persistent-data/data-structures/btree.tcc:821
#14 0x00000000004a281f in persistent_data::btree<1u, thin_provisioning::mapping_tree_detail::mtree_traits>::visit_depth_first (this=0x1ca7440, v=...)
    at ./persistent-data/data-structures/btree.tcc:810
#15 0x000000000049a88c in persistent_data::btree_visit_values<1u, thin_provisioning::mapping_tree_detail::mtree_traits, thin_provisioning::mapping_tree_detail::device_visitor, (anonymous namespace)::dev_tree_damage_visitor> (tree=..., value_visitor=..., damage_visitor=...) at ./persistent-data/data-structures/btree_damage_visitor.h:488
#16 0x000000000049a4d7 in thin_provisioning::walk_mapping_tree (tree=..., dev_v=..., dv=warning: can't find linker symbol for virtual table for `thin_provisioning::mapping_tree_detail::damage_visitor' value
...) at thin-provisioning/mapping_tree.cc:219
#17 0x00000000004ab087 in thin_provisioning::metadata_dump (md=..., e=..., repair=false) at thin-provisioning/metadata_dumper.cc:234
#18 0x00000000004c77ec in (anonymous namespace)::dump_ (path="/dev/shm/LVMTEST22139.P9TbVSIoOt/dev/LVMTEST22139vg/repair", out=..., format="xml", flags=..., metadata_snap=0)
    at thin-provisioning/thin_dump.cc:68
#19 0x00000000004c7a3d in (anonymous namespace)::dump (path="/dev/shm/LVMTEST22139.P9TbVSIoOt/dev/LVMTEST22139vg/repair", output=0x0, format="xml", flags=..., metadata_snap=0)
    at thin-provisioning/thin_dump.cc:84
#20 0x00000000004c7fab in thin_dump_main (argc=2, argv=0x7ffed05c7be8) at thin-provisioning/thin_dump.cc:167
#21 0x0000000000404963 in base::command::run (this=0x755120 <thin_provisioning::thin_dump_cmd>, argc=2, argv=0x7ffed05c7be8) at ./base/application.h:26
#22 0x00000000004045e6 in base::application::run (this=0x7ffed05c7ad0, argc=2, argv=0x7ffed05c7be8) at base/application.cc:32
#23 0x000000000046a89a in main (argc=2, argv=0x7ffed05c7be8) at main.cc:39
(gdb) print d
$1 = (const persistent_data::btree_detail::damage &) @0x7ffed05c68f0: {lost_keys_ = {
    begin_ = {<boost::optional_detail::optional_base<unsigned long>> = {<boost::optional_detail::optional_tag> = {<No data fields>}, m_initialized = true, m_storage = {dummy_ = {
            data = "\000\000\000\000\000\000\000", aligner_ = {<No data fields>}}}}, <No data fields>}, 
    end_ = {<boost::optional_detail::optional_base<unsigned long>> = {<boost::optional_detail::optional_tag> = {<No data fields>}, m_initialized = false, m_storage = {dummy_ = {
            data = "\243\314B\000\000\000\000", aligner_ = {<No data fields>}}}}, <No data fields>}}, desc_ = "bad checksum in btree node"}
(gdb) print path
$2 = std::vector of length 0, capacity 0
(gdb) print d.desc_
$3 = "bad checksum in btree node"
(gdb)
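
The session above shows frame #0 evaluating `path[0]` while `path` is a `std::vector` of length 0, which is undefined behaviour and a plausible cause of the fault at address 0. The following is a minimal sketch of a guard for that situation, not the fix that went into the tools; `Damage` and `first_path_key` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the damage record printed in the gdb session.
struct Damage {
    std::string desc;
};

// Hypothetical helper: return the first key of the path a damage report
// refers to. Indexing path[0] on an empty vector is undefined behaviour,
// so the empty case is rejected explicitly with a descriptive error.
std::uint64_t first_path_key(const std::vector<std::uint64_t>& path,
                             const Damage& d) {
    if (path.empty())
        throw std::runtime_error("damage with empty path: " + d.desc);
    return path[0];
}
```

With a guard of this kind, a damage record with an empty path turns into a reportable error instead of a crash.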

Comment 8 Joe Thornber 2015-07-02 13:50:11 UTC
Could I have a copy of the problematic metadata please?

Comment 9 Zdenek Kabelac 2015-07-02 14:28:34 UTC
Created attachment 1045527 [details]
Compressed failing metadata content (dd)

Comment 10 Joe Thornber 2015-07-03 10:45:23 UTC
Using version v0.5.1 of the tools:

root@debian:~/thin-provisioning-tools# valgrind bin/thin_check ~/metadata 
==20703== Memcheck, a memory error detector
==20703== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==20703== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==20703== Command: bin/thin_check /root/metadata
==20703== 
examining superblock
examining devices tree
examining mapping tree
mapping_tree_damage_visitor: path too long
==20703== 
==20703== HEAP SUMMARY:
==20703==     in use at exit: 0 bytes in 0 blocks
==20703==   total heap usage: 81 allocs, 81 frees, 16,924,881 bytes allocated
==20703== 
==20703== All heap blocks were freed -- no leaks are possible
==20703== 
==20703== For counts of detected and suppressed errors, rerun with: -v
==20703== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
root@debian:~/thin-provisioning-tools# echo $?
1


Admittedly the error message is not helpful, but I'm not seeing a seg fault.  Could you confirm that this particular metadata causes the crash please?  We may have a build or library version issue.

Comment 11 Joe Thornber 2015-07-03 12:12:12 UTC
Ignore previous comment, wrong tool.

Comment 12 Joe Thornber 2015-07-03 12:14:24 UTC
v0.5.2 of the tools released.  Fixes segfault in thin_restore, and improves error message from thin_check.

Comment 14 Bruno Goncalves 2015-07-13 12:47:23 UTC
This problem seems to have been introduced in device-mapper-persistent-data-0.4.2-1.el7 and fixed in device-mapper-persistent-data-0.5.3-1.el7

# thin_dump -V
0.4.2-1.el7

# thin_dump ~/metadata 
<superblock uuid="" time="0" transaction="2" data_block_size="256" nr_data_blocks="160">
  <device dev_id="1" mapped_blocks="6" transaction="0" creation_time="0" snap_time="0">
    <range_mapping origin_begin="0" data_begin="0" length="2" time="0"/>
    <range_mapping origin_begin="64" data_begin="2" length="3" time="0"/>
    <single_mapping origin_block="79" data_block="5" time="0"/>
  </device>
  <device dev_id="2" mapped_blocks="6" transaction="1" creation_time="0" snap_time="0">
Segmentation fault (core dumped)

############



# thin_dump -V
0.5.3-1.el7

# thin_dump ~/metadata 
<superblock uuid="" time="0" transaction="2" data_block_size="256" nr_data_blocks="160">
  <device dev_id="1" mapped_blocks="6" transaction="0" creation_time="0" snap_time="0">
    <range_mapping origin_begin="0" data_begin="0" length="2" time="0"/>
    <range_mapping origin_begin="64" data_begin="2" length="3" time="0"/>
    <single_mapping origin_block="79" data_block="5" time="0"/>
  </device>
  <device dev_id="2" mapped_blocks="6" transaction="1" creation_time="0" snap_time="0">
metadata contains errors (run thin_check for details).
perhaps you wanted to run with --repair

Comment 15 Marian Csontos 2015-07-13 13:01:09 UTC
...which makes the bug intra-release only.

Version 0.4.2 was not released in any RHEL or Fedora product.

Comment 17 errata-xmlrpc 2015-11-19 09:39:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2170.html