Bug 886985

Summary: "ioctl/libdm-iface.c", line=1840, dm_errno=-1, format=0x7ff708c05688 "Internal error: Performing unsafe table load while %d device(s) are known to be suspended: ...
Product: Fedora
Component: lvm2
Version: 19
Status: CLOSED EOL
Reporter: Marian Csontos <mcsontos>
Assignee: Zdenek Kabelac <zkabelac>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: agk, bmarzins, bmr, dwysocha, heinzm, jonathan, lvm-team, mcsontos, msnitzer, prajnoha, prockai, zkabelac
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-02-17 14:37:24 UTC
Attachments:
pvmove -vvvv outputs (flags: none)

Description Marian Csontos 2012-12-13 18:30:47 UTC
Description of problem:
After enabling abort_on_internal_error, an internal error was hit in clvmd:

"ioctl/libdm-iface.c", line=1840, dm_errno=-1, format="Internal error: Performing unsafe table load while %d device(s) are known to be suspended: ...

Version-Release number of selected component (if applicable):
- upstream lvm commit c089029b7 running on el6.4
- not yet verified against el6.4 - creating upstream bug

How reproducible:
TBD

Steps to Reproduce:
1. TBD
  
Actual results:
Internal Error in clvmd

Expected results:
No internal error; the pvmove operations should pass.

Additional info:

Core was generated by `clvmd -T30'.
Program terminated with signal 6, Aborted.
#0  0x00007ff7080358a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
#0  0x00007ff7080358a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
        resultvar = 0
        pid = <value optimized out>
        selftid = 30484
#1  0x00007ff708037085 in abort () at abort.c:92
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x7ff709633cde, sa_sigaction = 0x7ff709633cde}, sa_mask = {__val = {140698991144160, 140698991148254, 10537517192083781977, 0, 0, 140698967907160, 140698991173632, 157494624, 4294967295, 4222418944, 5, 7102128, 0, 140698991151712, 3, 0}}, sa_flags = 155406288, sa_restorer = 0x5}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x0000000000442280 in print_log (level=2, file=0x7ff708c04a25 "ioctl/libdm-iface.c", line=1840, dm_errno=-1, format=0x7ff708c05688 "Internal error: Performing unsafe table load while %d device(s) are known to be suspended: %s%s%s %s%.0d%s%.0d%s%s") at log/log.c:270
        ap = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7ff709634208, reg_save_area = 0x7ff7096340f0}}
        buf = "\022j\006\000\367\177\000\000\037j\006\000\367\177\000\000\237y\006\000\367\177\000\000\022j\006\000\367\177\000\000\237y\006\000\367\177", '\000' <repeats 42 times>, "0?c\t\000\000\000\000\200=c\t\367\177\000\000\000\000\006\000\367\177\000\000\000\000\000\000\000\000\000\000c(\300\b\367\177\000\000\000?c\t\367\177\000\000\233\017", '\000' <repeats 14 times>, "\r\000\000\000\000\000\000\000Z\026\275\b\367\177\000\000\377\377\377\377\000\000\000\000PKc\t0\177\000\000\240>c\t\367\177\000\000Z\026\275\b\367\177\000\000\000\000\000\000\000\000\000\000\200>c\ts\177\000\000\300>c\t\367\177\000\000`\034\275\b\367\177\000\000P?c\t\367\177\000\000\024\000\000\000\000\000\000\000\251S\300\b\001\000\000\000\201\000\000\000\000\000\000\000`?c\t\367\177\000\000D\000\000\000\000\000\000\000\256"...
        locn = "Internal error: Performing unsafe table load while 3 device(s) are known to be suspended:  (253:6) \000\367\177", '\000' <repeats 18 times>"\252, \000\000\000\016\310<\222Adding target to (253:6): 0 2449408 mirror userspace 3 LVM-KemgUNW20ryVJrnDyOFd9veUf"...
        bufused = 0
        n = 99
        message = 0x7ff709632ce0 "Internal error: Performing unsafe table load while 3 device(s) are known to be suspended:  (253:6) "
        trformat = 0x7ff708c05688 "Internal error: Performing unsafe table load while %d device(s) are known to be suspended: %s%s%s %s%.0d%s%.0d%s%s"
        newbuf = 0x7ff70003dd90 "Internal error: Performing unsafe table load while 3 device(s) are known to be suspended:  (253:6) "
        use_stderr = 0
        log_once = 0
        fatal_internal_error = 1
        msglen = 99
#3  0x00007ff708bff88c in dm_task_run (dmt=0x7ff70003dcc0) at ioctl/libdm-iface.c:1828
        dmi = 0x3
        command = 3241737481
        check_udev = 253328
        rely_on_udev = 0
        suspended_counter = 3
        ioctl_retry = 1
        retryable = 0
        dev_name = 0x0
        dev_uuid = 0x0
#4  0x00007ff708bfdcad in _reload_with_suppression_v4 (dmt=0x7ff70003dcc0) at ioctl/libdm-iface.c:1456
        task = 0x7ff70003dd90
        t1 = 0x7ff700017450
        t2 = 0x0
        len = 140698833900816
        r = 1
#5  0x00007ff708bff52e in dm_task_run (dmt=0x7ff70003dcc0) at ioctl/libdm-iface.c:1816
        dmi = 0x544a56697979456f
        command = 3241737481
        check_udev = 253120
        rely_on_udev = 32759
        suspended_counter = 420256
        ioctl_retry = 1
        retryable = 0
        dev_name = 0x0
        dev_uuid = 0x0
#6  0x00007ff708be57e8 in _load_node (dnode=0x7ff70003f2c0) at libdm-deptree.c:2412
        r = 0
        dmt = 0x7ff70003dcc0
        seg = 0x7ff70003f350
        seg_start = 2449408
        existing_table_size = 140698980325421
#7  0x00007ff708be5c28 in dm_tree_preload_children (dnode=0x7ff70003ed48, uuid_prefix=0x7ff70005edb0 "LVM-KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh", uuid_prefix_len=36) at libdm-deptree.c:2478
        r = 1
        handle = 0x7ff70003da78
        child = 0x7ff70003f2c0
        newinfo = {exists = 157500528, suspended = 32759, live_table = 257096, inactive_table = 32759, open_count = 257032, event_nr = 32759, major = 257416, minor = 32759, read_only = 257416, target_count = 32759}
        update_devs_flag = 0
#8  0x00007ff708be5b08 in dm_tree_preload_children (dnode=0x7ff70003ec08, uuid_prefix=0x7ff70005edb0 "LVM-KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh", uuid_prefix_len=36) at libdm-deptree.c:2469
        r = 1
        handle = 0x7ff70003ee20
        child = 0x7ff70003ed48
        newinfo = {exists = 257352, suspended = 32759, live_table = 388528, inactive_table = 32759, open_count = 388696, event_nr = 32759, major = 374136, minor = 32759, read_only = 157500736, target_count = 1}
        update_devs_flag = 0
#9  0x0000000000489af3 in _tree_action (dm=0x7ff70005e9b0, lv=0x7ff70005b4c8, laopts=0x7ff709634750, action=PRELOAD) at activate/dev_manager.c:2394
        DLID_SIZE = 36
        dtree = 0x7ff70003ebf0
        root = 0x7ff70003ec08
        dlid = 0x7ff70005edb0 "LVM-KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh"
        r = 0
#10 0x0000000000489d38 in dev_manager_preload (dm=0x7ff70005e9b0, lv=0x7ff70005b4c8, laopts=0x7ff709634750, flush_required=0x7ff709634704) at activate/dev_manager.c:2441
No locals.
#11 0x0000000000414850 in _lv_preload (lv=0x7ff70005b4c8, laopts=0x7ff709634750, flush_required=0x7ff709634704) at activate/activate.c:913
        r = 0
        dm = 0x7ff70005e9b0
        old_readonly = 0
#12 0x0000000000416338 in _lv_suspend (cmd=0x7ff7000008c0, lvid_s=0x2247223 "KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh", laopts=0x7ff709634750, error_if_not_suspended=0) at activate/activate.c:1501
        lv = 0x7ff7000109d8
        lv_pre = 0x7ff70005b4c8
        pvmove_lv = 0x0
        lvl_pre = 0x7ff700056990
        sl = 0x46d5c8
        snap_seg = 0x7ff709634700
        info = {exists = 1, suspended = 0, open_count = 0, major = 253, minor = 4, read_only = 0, live_table = 1, inactive_table = 0, read_ahead = 0}
        r = 0
        lockfs = 0
        flush_required = 0
        detached = {lv_pre = 0x3, laopts = 0x7ff700013ea0, flush_required = 0x0}
#13 0x000000000041680d in lv_suspend_if_active (cmd=0x7ff7000008c0, lvid_s=0x2247223 "KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh", origin_only=0, exclusive=0) at activate/activate.c:1598
        laopts = {exclusive = 0, origin_only = 0, no_merging = 0, real_pool = 0, is_activate = 0, skip_in_use = 0, revert = 0, read_only = 0}
#14 0x000000000040f98e in do_suspend_lv (resource=0x2247223 "KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh", command=156 '\234', lock_flags=4 '\004') at lvm-functions.c:458
        oldmode = 1
        lvi = {exists = 1, suspended = 0, open_count = 0, major = 253, minor = 4, read_only = 0, live_table = 1, inactive_table = 0, read_ahead = 0}
        origin_only = 0
        exclusive = 0
#15 0x000000000040fcc8 in do_lock_lv (command=156 '\234', lock_flags=4 '\004', resource=0x2247223 "KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh") at lvm-functions.c:548
        status = 0
#16 0x0000000000408fc1 in do_command (client=0x0, msg=0x2247210, msglen=84, buf=0x7ff709634be8, buflen=1481, retlen=0x7ff709634bfc) at clvmd-command.c:127
        args = 0x2247221 "\234\004KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh"
        arglen = 66
        status = 0
        lockname = 0x2247223 "KemgUNW20ryVJrnDyOFd9veUfZquoEyy9ZvrKRZeovdHpXcGwlmntqeRAXAKaumh"
        locktype = 0x0
        nodeinfo = {sysname = "\377\377\377\377", '\000' <repeats 20 times>, "@\364\070\b\367\177\000\000\000\000\000\000\000\000\000\000\200;\016\b\367\177\000\000\000\001\000\000\367\177\000\000\002\000\000\000\000\000\000\000 ", nodename = "Ic\t\367\177\000\000.\000\000\000\000\000\000\000\377\377\377\377\375\000\000\000MU\234\b\367\177\000\000PIc\t\367\177\000\000\220\367\003\000\367\177\000\000\000\000\000\f\000\000\000\000\060Jc\t\367\177\000\000]\002", release = "\000\000\000\000\000\000PKc\t\367\177\000\000`Jc\t\367\177\000\000\243$\005\b\367\177\000\000\060Jc\t\367\177\000\000 \000\000\000\367\177\000\000\312\005\000\000\000\000\000\000 \000\000\000\367\177\000\000PKc", version = "\t\367\177\000\000`Jc\t\367\177\000\000\340\005\000\000\000\000\000\000\205\303\a\b\367\177\000\000\062\000\000\000\000\000\000\000\312\005\000\000\000\000\000\000\210\000\000\000\367\177\000\000]\002\000\000\000\000\000\000G\000\000", machine = "\367\177\000\000\027\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\000\006", '\000' <repeats 30 times>, "^\000\000\000G\000\000\000]\000\000\000|", domainname = "\000\000\000w\000\000\000n\000\000\000 \000\000\000\022\000\000\000\000\000\000\f\000\000\000\000\312\005\000\000\000\000\000\000 \000\000\000\367\177\000\000PKc\t\367\177\000\000`Jc\t\367\177\000\000\003\000\000\000\000"}
        lock_cmd = 156 '\234'
        lock_flags = 4 '\004'
#17 0x000000000040cf2b in process_remote_command (msg=0x2247210, msglen=84, fd=7, csid=0x22472b0 "\001") at clvmd.c:1531
        replyargs = 0x7ff70003f790 "x"
        nodename = 0x7ff709634a60 "node01"
        replylen = 0
        buflen = 1481
        status = 0
#18 0x000000000040df28 in process_work_item (cmd=0x2247290) at clvmd.c:1992
No locals.
#19 0x000000000040e059 in lvm_thread_fn (arg=0x7fff6dd98250) at clvmd.c:2035
        ss = {__val = {2560, 0 <repeats 15 times>}}
        lvm_params = 0x7fff6dd98250
        cmd = 0x2247290
#20 0x00007ff70839d851 in start_thread (arg=0x7ff709635700) at pthread_create.c:301
        __res = <value optimized out>
        pd = 0x7ff709635700
        now = <value optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140698991154944, -5164366343405636900, 7121184, 140698991155648, 0, 3, 5168565476599722716, 5168562679843058396}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <value optimized out>
        pagesize_m1 = <value optimized out>
        sp = <value optimized out>
        freesize = <value optimized out>
#21 0x00007ff7080eb90d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
No locals.
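
Frame #3 above shows dm_task_run() emitting the internal error because its count of suspended devices is non-zero (suspended_counter = 3) while a table reload is in flight. A minimal, standalone model of that check, with illustrative names only (not the actual libdm code):

    /* model_unsafe_load.c: simplified model of the check visible in frame #3.
     * A table load is flagged as an internal error while any device is still
     * counted as suspended; names and structure here are illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>

    static int suspended_counter = 0;        /* ++ on suspend, -- on resume */
    static int abort_on_internal_errors = 1; /* mirrors the lvm.conf switch */

    static void suspend_device(const char *name)
    {
        suspended_counter++;
        printf("suspended %s (%d now suspended)\n", name, suspended_counter);
    }

    static int load_table(const char *name)
    {
        if (suspended_counter > 0) {
            fprintf(stderr,
                    "Internal error: Performing unsafe table load while "
                    "%d device(s) are known to be suspended: %s\n",
                    suspended_counter, name);
            if (abort_on_internal_errors)
                abort();   /* the SIGABRT captured in the core dump above */
            return 0;
        }
        printf("loaded table for %s\n", name);
        return 1;
    }

    int main(void)
    {
        suspend_device("253:4");
        suspend_device("253:5");
        suspend_device("253:6");
        return load_table("(253:6)") ? EXIT_SUCCESS : EXIT_FAILURE;
    }

In the real library the counter lives inside libdm and is presumably updated by the suspend/resume ioctl paths; the abort itself happens in print_log() (frame #2) once the internal-error message is formatted.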

Comment 1 Zdenek Kabelac 2013-01-15 09:55:05 UTC
More info is needed here.

What are the steps, and what do the tables look like?

Is there any small, simple reproducer?

Comment 2 Marian Csontos 2013-01-15 15:47:11 UTC
It seems the following is enough to raise the error in clvmd:

    vgchange -an pv_shuffle_A
    vgchange -an pv_shuffle_B
    pvmove /dev/vdb1 /dev/vdh1 &
    pvmove /dev/vdd2 /dev/vdf1

This is a race and happens only when the pvmoves are executed concurrently without a delay. When they are executed manually with a small delay between them, it has never happened.

I will run with -vvvv and attach outputs.

Comment 3 Marian Csontos 2013-01-15 18:49:57 UTC
Created attachment 678985 [details]
pvmove -vvvv outputs

Comment 4 Marian Csontos 2013-01-16 06:17:51 UTC
Interestingly, this does not happen on a single node even though the same scenario runs there. Is there a check missing?

Comment 5 Fedora End Of Life 2013-04-03 16:01:06 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could also affect pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during the Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Comment 6 Peter Rajnoha 2013-12-09 13:23:46 UTC
Still seen with latest builds?

Comment 7 Fedora End Of Life 2015-01-09 17:32:05 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 reached end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 8 Fedora End Of Life 2015-02-17 14:37:24 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 9 Red Hat Bugzilla 2023-09-14 01:39:27 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.