Description of problem:

https://build.gluster.org/job/centos6-regression/8321/console

05:36:46 ./tests/basic/ec/heal-info.t: line 34: 6762 Killed dd status=none if=/dev/zero of=$M0/a bs=1M count=2048
05:37:35 ./tests/basic/ec/heal-info.t ..
05:37:35 1..22
05:37:35 ok 1, LINENUM:18
05:37:35 ok 2, LINENUM:19
05:37:35 ok 3, LINENUM:20
05:37:35 ok 4, LINENUM:21
05:37:35 ok 5, LINENUM:22
05:37:35 ok 6, LINENUM:23
05:37:35 ok 7, LINENUM:24
05:37:35 ok 8, LINENUM:26
05:37:35 ok 9, LINENUM:32
05:37:35 not ok 10 Got "113" instead of "^0$", LINENUM:43
05:37:35 FAILED COMMAND: ^0$ echo 113
05:37:35 ok 11, LINENUM:47
05:37:35 ok 12, LINENUM:48
05:37:35 ok 13, LINENUM:49
05:37:35 ok 14, LINENUM:50
05:37:35 ok 15, LINENUM:51
05:37:35 ok 16, LINENUM:52
05:37:35 ok 17, LINENUM:55
05:37:35 not ok 18 Got "7" instead of "^1$", LINENUM:56
05:37:35 FAILED COMMAND: ^1$ get_pending_heal_count patchy
05:37:35 ok 19, LINENUM:57
05:37:35 ok 20, LINENUM:60
05:37:35 ok 21, LINENUM:61
05:37:35 not ok 22 Got "110" instead of "^105$", LINENUM:71
05:37:35 FAILED COMMAND: ^105$ get_pending_heal_count patchy
05:37:35 Failed 3/22 subtests
I tried again to reproduce this issue, and it fails once in a while. It looks like when heal info inspects the files, some of the bricks do not respond. Heal info treats this as "ENOTCONN", in which case it reports the file as needing heal. This test runs heal info in a loop while I/O is going on for 1000 files of 1MB each. Whenever the test fails, it also takes longer to run.

My other observation, that heal was not happening, was wrong. This test disables heal for the volume before it starts.

-- Ashish
From the tests on my system:

PASS:
=================
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=0
+ total_heal_count=0
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=0
+ total_heal_count=0
check heal 1
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=0
+ total_heal_count=0
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
++ gluster volume heal patchy info
+ rm -f /mnt/glusterfs/0/lock
+ set +x
Done
+ heal_count=0
+ total_heal_count=0
check heal 1
+ echo 'check heal 1'
+ set +x
ok 10, LINENUM:51
ok 11, LINENUM:55
===================================================

FAIL:
===================================================
check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
+ heal_count=0
+ total_heal_count=0
+ echo 'check heal 1'
check heal 1
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=6 <---------------Already non-zero
+ total_heal_count=6
check heal 1
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=12
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
check heal 1
+ heal_count=6
+ total_heal_count=18
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
++ gluster volume heal patchy info
+ heal_count=6
+ total_heal_count=24
+ echo 'check heal 1'
+ set +x
check heal 1
+ echo 'check heal'
check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=6
+ total_heal_count=30
check heal 1
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
++ gluster volume heal patchy info
check heal 1
+ heal_count=6
+ total_heal_count=36
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=6
check heal 1
+ total_heal_count=42
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=48
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=54
+ echo 'check heal 1'
+ set +x
+ echo 'check heal'
check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=60
+ echo 'check heal 1'
+ set +x
+ echo 'check heal'
check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
++ gluster volume heal patchy info
check heal 1
+ heal_count=6
+ total_heal_count=66
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
++ gluster volume heal patchy info
+ rm -f /mnt/glusterfs/0/lock
Done
+ set +x
+ heal_count=6
+ total_heal_count=72
check heal 1
+ echo 'check heal 1'
+ set +x
not ok 10 Got "72" instead of "^0$", LINENUM:51
FAILED COMMAND: ^0$ echo 72
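To make the arithmetic in the failing trace easier to follow, here is a minimal sketch of how the count appears to be built, judging only from the trace: each iteration sums the "Number of entries" fields of the heal info output, and the result is added to a running total. This is not the actual test script. `fake_heal_info` is a hypothetical stand-in for `gluster volume heal patchy info`, and its brick names and the split of 6 entries across 3 bricks are made up (the trace only shows the per-iteration total of 6).

```shell
#!/bin/bash
# Hypothetical stand-in for `gluster volume heal patchy info`.
# Brick names and per-brick counts are invented for illustration.
fake_heal_info() {
    local per_brick=$1
    local i
    for i in 0 1 2; do
        echo "Brick host:/bricks/patchy$i"
        echo "Number of entries: $per_brick"
        echo
    done
}

# Same grep/awk pipeline as in the trace, reading from stdin
# instead of calling gluster directly.
sum_entries() {
    grep 'Number of entries' | awk '{ sum+=$4} END {print sum}'
}

total_heal_count=0
# One clean iteration, then 12 iterations that each report 6 entries,
# matching the sequence of heal_count values in the failing trace.
for per_brick in 0 2 2 2 2 2 2 2 2 2 2 2 2; do
    heal_count=$(fake_heal_info "$per_brick" | sum_entries)
    total_heal_count=$((total_heal_count + heal_count))
done

echo "$total_heal_count"
```

With these inputs the script prints 72, the value reported by the failing subtest 10. Because the total accumulates, a single transient non-zero result from heal info is enough to fail the `^0$` check, and a persistent one grows with the number of loop iterations.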
There has been no activity on this bug for a long time. Either we have already fixed it, or it is most likely no longer critical. Please reopen the bug if the issue is still a pressing problem for you, or if you want to take the bug to closure with fixes.