Bug 794904 - Redundant log leg failure fails to repair and causes kernel hang
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-02-17 21:33 UTC by Corey Marthaler
Modified: 2012-06-20 15:01 UTC (History)
CC List: 9 users

Fixed In Version: lvm2-2.02.95-4.el6
Doc Type: Bug Fix
Doc Text:
This bug is a regression from the previous release. No release notes are necessary.
Clone Of:
Environment:
Last Closed: 2012-06-20 15:01:31 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2012:0962 | 0 | normal | SHIPPED_LIVE | lvm2 bug fix and enhancement update | 2012-06-19 21:12:11 UTC

Description Corey Marthaler 2012-02-17 21:33:13 UTC
Description of problem:
Scenario kill_secondary_log_2_legs_2_logs: Kill secondary log of synced 2 leg redundant log mirror(s)

********* Mirror hash info for this scenario *********
* names:              syncd_secondary_log_2legs_2logs_1
* sync:               1
* striped:            0
* leg devices:        /dev/sdb1 /dev/sdg1
* log devices:        /dev/sdd1 /dev/sdh1
* no MDA devices:     
* failpv(s):          /dev/sdh1
* failnode(s):        taft-01
* leg fault policy:   allocate
* log fault policy:   allocate
******************************************************

Creating mirror(s) on taft-01...
taft-01: lvcreate --mirrorlog mirrored -m 1 -n syncd_secondary_log_2legs_2logs_1 -L 500M helter_skelter /dev/sdb1:0-1000 /dev/sdg1:0-1000 /dev/sdd1:0-150 /dev/sdh1:0-150

Mirror Structure(s):
  LV                                                Attr     LSize   Copy%  Devices
  syncd_secondary_log_2legs_2logs_1                 mwi-a-m- 500.00m   4.00 syncd_secondary_log_2legs_2logs_1_mimage_0(0),syncd_secondary_log_2legs_2logs_1_mimage_1(0)
  [syncd_secondary_log_2legs_2logs_1_mimage_0]      Iwi-aom- 500.00m        /dev/sdb1(0)
  [syncd_secondary_log_2legs_2logs_1_mimage_1]      Iwi-aom- 500.00m        /dev/sdg1(0)
  [syncd_secondary_log_2legs_2logs_1_mlog]          mwi-aom-   4.00m 100.00 syncd_secondary_log_2legs_2logs_1_mlog_mimage_0(0),syncd_secondary_log_2legs_2logs_1_mlog_mimage_1(0)
  [syncd_secondary_log_2legs_2logs_1_mlog_mimage_0] iwi-aom-   4.00m        /dev/sdd1(0)
  [syncd_secondary_log_2legs_2logs_1_mlog_mimage_1] iwi-aom-   4.00m        /dev/sdh1(0)

PV=/dev/sdh1
        syncd_secondary_log_2legs_2logs_1_mlog_mimage_1: 1.3
PV=/dev/sdh1
        syncd_secondary_log_2legs_2logs_1_mlog_mimage_1: 1.3

Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on taft-01...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on taft-01...

Writing verification files (checkit) to mirror(s) on...
        ---- taft-01 ----

<start name="taft-01_syncd_secondary_log_2legs_2logs_1" pid="23835" time="Fri Feb 17 14:38:35 2012" type="cmd" />
Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- taft-01 ----

Disabling device sdh on taft-01
[DEADLOCK]


taft-01 qarshd[15258]: Running cmdline: echo offline > /sys/block/sdh/device/state &
taft-01 kernel: sd 3:0:0:7: rejecting I/O to offline device
taft-01 lvm[2997]: Secondary mirror device 253:4 has failed (D).
taft-01 lvm[2997]: Device failure in helter_skelter-syncd_secondary_log_2legs_2logs_1_mlog.
taft-01 lvm[2997]: Names including "_mlog" are reserved. Please choose a different LV name.
taft-01 lvm[2997]: Run `lvconvert --help' for more information.
taft-01 lvm[2997]: Repair of mirrored device helter_skelter-syncd_secondary_log_2legs_2logs_1_mlog failed.
taft-01 lvm[2997]: Failed to remove faulty devices in helter_skelter-syncd_secondary_log_2legs_2logs_1_mlog.
taft-01 qarshd[15261]: Running cmdline: pvs -a
taft-01 kernel: INFO: task kmirrord:15165 blocked for more than 120 seconds.
taft-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
taft-01 kernel: kmirrord      D 0000000000000002     0 15165      2 0x00000080
taft-01 kernel: ffff8801fac51bc0 0000000000000046 0000000000000000 ffff880216a5b280
taft-01 kernel: ffff8801fac51be0 ffffffffa0009e03 00000000fac51b40 ffffffffa0009a30
taft-01 kernel: ffff880215d665f8 ffff8801fac51fd8 000000000000f4e8 ffff880215d665f8
taft-01 kernel: Call Trace:
taft-01 kernel: [<ffffffffa0009e03>] ? dispatch_io+0x233/0x260 [dm_mod]
taft-01 kernel: [<ffffffffa0009a30>] ? vm_get_page+0x0/0x70 [dm_mod]
taft-01 kernel: [<ffffffff8109b809>] ? ktime_get_ts+0xa9/0xe0
taft-01 kernel: [<ffffffff814ed1e3>] io_schedule+0x73/0xc0
taft-01 kernel: [<ffffffffa0009ec5>] sync_io+0x95/0x110 [dm_mod]
taft-01 kernel: [<ffffffffa0002770>] ? dm_unplug_all+0x50/0x70 [dm_mod]
taft-01 kernel: [<ffffffff811136c5>] ? mempool_kmalloc+0x15/0x20
taft-01 kernel: [<ffffffff81113273>] ? mempool_alloc+0x63/0x140
taft-01 kernel: [<ffffffffa000a167>] dm_io+0x1b7/0x1c0 [dm_mod]
taft-01 kernel: [<ffffffffa0009a30>] ? vm_get_page+0x0/0x70 [dm_mod]
taft-01 kernel: [<ffffffffa00099a0>] ? vm_next_page+0x0/0x30 [dm_mod]
taft-01 kernel: [<ffffffffa00207e1>] disk_flush+0x91/0x170 [dm_log]
taft-01 kernel: [<ffffffffa0029722>] ? dm_rh_inc+0x42/0xd0 [dm_region_hash]
taft-01 kernel: [<ffffffffa00290d3>] dm_rh_flush+0x13/0x20 [dm_region_hash]
taft-01 kernel: [<ffffffffa0033b4f>] do_mirror+0x27f/0x6e0 [dm_mirror]
taft-01 kernel: [<ffffffffa00338d0>] ? do_mirror+0x0/0x6e0 [dm_mirror]
taft-01 kernel: [<ffffffff8108b2b0>] worker_thread+0x170/0x2a0
taft-01 kernel: [<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
taft-01 kernel: [<ffffffff8108b140>] ? worker_thread+0x0/0x2a0
taft-01 kernel: [<ffffffff81090886>] kthread+0x96/0xa0
taft-01 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20
taft-01 kernel: [<ffffffff810907f0>] ? kthread+0x0/0xa0
taft-01 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
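
In the trace above, the frames without a leading "?" show kmirrord waiting in io_schedule underneath disk_flush, i.e. inside a synchronous flush of the mirror log. As a reading aid only, here is a minimal Python sketch that pulls the hung-task header and those reliable frames out of syslog-style lines. The function name `parse_hung_tasks` and the regular expressions are illustrative assumptions written against the line format shown in this report; they are not part of any LVM or kernel tooling.

```python
import re

# Hung-task warning as it appears in this report, e.g.
# "taft-01 kernel: INFO: task kmirrord:15165 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"^(?P<host>\S+) kernel: INFO: task (?P<task>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

# Stack frame such as "[<ffffffffa00207e1>] disk_flush+0x91/0x170 [dm_log]".
# A leading "?" marks a frame the kernel flags as unreliable.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<symbol>\w+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def parse_hung_tasks(lines):
    """Return one dict per hung-task report, keeping only reliable frames."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_TASK.match(line)
        if header:
            current = {
                "host": header.group("host"),
                "task": header.group("task"),
                "pid": int(header.group("pid")),
                "seconds": int(header.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME.search(line)
        if current is not None and frame and not frame.group("unreliable"):
            current["frames"].append((frame.group("symbol"), frame.group("module")))
    return reports


# A few lines copied from the log above.
sample = [
    "taft-01 kernel: INFO: task kmirrord:15165 blocked for more than 120 seconds.",
    "taft-01 kernel: Call Trace:",
    "taft-01 kernel: [<ffffffffa0009e03>] ? dispatch_io+0x233/0x260 [dm_mod]",
    "taft-01 kernel: [<ffffffff814ed1e3>] io_schedule+0x73/0xc0",
    "taft-01 kernel: [<ffffffffa0009ec5>] sync_io+0x95/0x110 [dm_mod]",
    "taft-01 kernel: [<ffffffffa00207e1>] disk_flush+0x91/0x170 [dm_log]",
]

for report in parse_hung_tasks(sample):
    print(report["task"], report["pid"], report["frames"])
```

Frames from the core kernel carry no module name, so the second element of each tuple is None for them.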



Version-Release number of selected component (if applicable):
2.6.32-220.el6.x86_64

lvm2-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
lvm2-libs-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
lvm2-cluster-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
device-mapper-libs-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
device-mapper-event-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
device-mapper-event-libs-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
cmirror-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012

Comment 1 Corey Marthaler 2012-02-17 22:01:36 UTC
This is reproducible and also occurs when it's the primary redundant log being
failed.

Comment 2 Jonathan Earl Brassow 2012-03-12 19:02:11 UTC
You get redundant logs through the "raid1" segment type, so there is no need for the extra layering required with the "mirror" segment type. I will conditionally NACK this bug, because better solutions are available for what is desired.
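
To make the two layouts in this comment concrete, the following Python sketch only builds the lvcreate argument lists and does not run anything. The helper name `lvcreate_argv` and the LV name `lv_raid1` are hypothetical. The `--mirrorlog mirrored` form is copied from the reproducer in the description, and `--type raid1` is the segment type referred to here; whether raid1 suits a given setup is not something this sketch can decide.

```python
def lvcreate_argv(vg, name, size, layout="raid1", mirrors=1, pvs=()):
    """Build (but do not execute) an lvcreate command line as an argv list."""
    argv = ["lvcreate"]
    if layout == "raid1":
        # Alternative suggested in this comment: the raid1 segment type.
        argv += ["--type", "raid1"]
    elif layout == "mirrored-log":
        # Layout used by the reproducer: a mirror whose log is itself mirrored.
        argv += ["--mirrorlog", "mirrored"]
    else:
        raise ValueError("unknown layout: %r" % (layout,))
    argv += ["-m", str(mirrors), "-n", name, "-L", size, vg]
    argv += list(pvs)
    return argv


# Roughly the command from the description (PV ranges as in the report).
print(" ".join(lvcreate_argv(
    "helter_skelter", "syncd_secondary_log_2legs_2logs_1", "500M",
    layout="mirrored-log",
    pvs=["/dev/sdb1:0-1000", "/dev/sdg1:0-1000", "/dev/sdd1:0-150", "/dev/sdh1:0-150"],
)))

# The raid1 variant, with a hypothetical LV name and no explicit PVs.
print(" ".join(lvcreate_argv("helter_skelter", "lv_raid1", "500M")))
```

Returning a list rather than a string keeps the arguments safe to hand to a process launcher later without shell quoting.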

Comment 5 Jonathan Earl Brassow 2012-04-10 23:46:17 UTC
Regression was introduced in 2.02.89 from changes to the dmeventd code.
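
Combined with the "Fixed In Version" field (lvm2-2.02.95-4.el6), this gives a rough window of builds to check. The Python sketch below is an illustration under two assumptions: package strings look like the el6 ones listed in this report, and plain numeric tuple comparison stands in for RPM's real version-comparison rules. The report does not state that every build inside the window shows the hang, so treat a True result as "worth checking", not as a diagnosis. The names `parse_nvr` and `in_suspect_window` are hypothetical.

```python
import re

# From this comment: the regression arrived with upstream 2.02.89.
REGRESSION_INTRODUCED = (2, 2, 89)
# From the "Fixed In Version" field: lvm2-2.02.95-4.el6.
FIXED_IN = ((2, 2, 95), (4,))

NVR = re.compile(r"^lvm2-(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el6$")


def parse_nvr(nvr):
    """Split e.g. 'lvm2-2.02.92-0.40.el6' into ((2, 2, 92), (0, 40))."""
    match = NVR.match(nvr)
    if not match:
        raise ValueError("unrecognised package string: %r" % (nvr,))
    version = tuple(int(part) for part in match.group(1).split("."))
    release = tuple(int(part) for part in match.group(2).split("."))
    return version, release


def in_suspect_window(nvr):
    """True if the build is at or after 2.02.89 and before the fixed build."""
    version, release = parse_nvr(nvr)
    return version >= REGRESSION_INTRODUCED and (version, release) < FIXED_IN


print(in_suspect_window("lvm2-2.02.92-0.40.el6"))  # build from the description
print(in_suspect_window("lvm2-2.02.95-4.el6"))     # build from "Fixed In Version"
```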

Comment 8 Jonathan Earl Brassow 2012-04-11 01:18:09 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
This bug is a regression from the previous release. No release notes are necessary.

Comment 11 Corey Marthaler 2012-04-11 15:56:29 UTC
Fix verified in the latest rpms.

2.6.32-251.el6.x86_64
lvm2-2.02.95-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012
lvm2-libs-2.02.95-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012
lvm2-cluster-2.02.95-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012
device-mapper-libs-1.02.74-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012
device-mapper-event-1.02.74-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012
device-mapper-event-libs-1.02.74-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012
cmirror-2.02.95-4.el6    BUILT: Wed Apr 11 09:03:19 CDT 2012


The following test case now passes:
./helter_skelter -o taft-01 -l /home/msp/cmarthal/work/sts/sts-root -r /usr/tests/sts-rhel6.3 -e kill_secondary_log_2_legs_2_logs -i

Comment 13 errata-xmlrpc 2012-06-20 15:01:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0962.html

