RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 863077 - Soft lockup on reboot with an active VG
Summary: Soft lockup on reboot with an active VG
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.4
Hardware: x86_64
OS: Unspecified
unspecified
high
Target Milestone: rc
: ---
Assignee: Chris Leech
QA Contact: Bruno Goncalves
URL:
Whiteboard:
: 857360 862024 862930 863523 865349 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-10-04 11:58 UTC by Nenad Peric
Modified: 2013-02-21 06:49 UTC (History)
14 users (show)

Fixed In Version: kernel-2.6.32-329.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-21 06:49:27 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:0496 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 6 kernel update 2013-02-20 21:40:54 UTC

Description Nenad Peric 2012-10-04 11:58:40 UTC
Description of problem:

The system locks up during reboot (stopping of iscsi) if a VG is still active. 
No FS needs to be mounted. 

Version-Release number of selected component (if applicable):

Packages installed:
iscsi-initiator-utils-6.2.0.872-41.el6.x86_64
kernel-2.6.32-315.el6.x86_64

lvm2-libs-2.02.97-3.el6.x86_64
lvm2-2.02.97-3.el6.x86_64
lvm2-cluster-2.02.97-3.el6.x86_64
device-mapper-event-libs-1.02.76-3.el6.x86_64
device-mapper-1.02.76-3.el6.x86_64
device-mapper-libs-1.02.76-3.el6.x86_64
device-mapper-event-1.02.76-3.el6.x86_64


How reproducible:
Everytime
 

Steps to Reproduce:

1. Create VG - vgcreate test_vg /dev/sda1 /dev/sda2
2. Create LV - lvcreate test_vg -L10M
3. Reboot - reboot
  
Actual results:

The reboot locks up with the following error:

Stopping iscsi:  connection10:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294823269, last ping 4294828269, now 4294833269
 connection10:0: detected conn error (1011)
 connection8:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826294, last ping 4294831294, now 4294836294
 connection8:0: detected conn error (1011)
 connection9:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826312, last ping 4294831312, now 4294836312
 connection9:0: detected conn error (1011)
 connection4:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826319, last ping 4294831319, now 4294836319
 connection4:0: detected conn error (1011)
 connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826326, last ping 4294831326, now 4294836326
 connection3:0: detected conn error (1011)
 connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826330, last ping 4294831330, now 4294836330
 connection1:0: detected conn error (1011)
 connection5:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826350, last ping 4294831350, now 4294836350
 connection5:0: detected conn error (1011)
 connection7:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826352, last ping 4294831352, now 4294836352
 connection7:0: detected conn error (1011)
 connection6:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826352, last ping 4294831352, now 4294836353
 connection6:0: detected conn error (1011)
 connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294826353, last ping 4294831353, now 4294836354
 connection2:0: detected conn error (1011)
BUG: soft lockup - CPU#0 stuck for 67s! [scsi_wq_11:1363]
Modules linked in: sctp libcrc32c autofs4 sg sd_mod crc_t10dif be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi microcode virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
CPU 0
Modules linked in: sctp libcrc32c autofs4 sg sd_mod crc_t10dif be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi microcode virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
 
Pid: 1363, comm: scsi_wq_11 Not tainted 2.6.32-315.el6.x86_64 #1 Red Hat KVM
RIP: 0010:[<ffffffff8150d7c7>]  [<ffffffff8150d7c7>] _spin_unlock_irqrestore+0x17/0x20
RSP: 0018:ffff88003b135d80  EFLAGS: 00000286
RAX: 0000000000000286 RBX: ffff88003b135d80 RCX: 0000000000000000
RDX: ffff88003b308050 RSI: 0000000000000286 RDI: 0000000000000286
RBP: ffffffff8100bc0e R08: dead000000200200 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b135d80
R13: ffffffff8100bc0e R14: ffff88003b135d00 R15: ffff88003b135d10
FS:  0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000017eaba0 CR3: 0000000037ed3000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process scsi_wq_11 (pid: 1363, threadinfo ffff88003b134000, task ffff88003b12c040)
Stack:
 ffff88003b135db0 ffffffff813769c7 ffffffffffffff10 ffff880037541c00
<d> ffff88003b308000 ffff88003b30f930 ffff88003b135e00 ffffffff8137a435
<d> 000000000000fb88 ffff88003dd21000 ffffffff81a8d020 ffff88003b30f800
Call Trace:
 [<ffffffff813769c7>] ? scsi_target_reap+0x77/0xe0
 [<ffffffff8137a435>] ? scsi_remove_target+0x175/0x180
 [<ffffffffa01698e1>] ? __iscsi_unbind_session+0xa1/0x150 [scsi_transport_iscsi]
 [<ffffffffa0169840>] ? __iscsi_unbind_session+0x0/0x150 [scsi_transport_iscsi]
 [<ffffffff8108e050>] ? worker_thread+0x170/0x2a0
 [<ffffffff81093e40>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108dee0>] ? worker_thread+0x0/0x2a0
 [<ffffffff81093ad6>] ? kthread+0x96/0xa0
 [<ffffffff8108dee0>] ? worker_thread+0x0/0x2a0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff81093a40>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
Code: 00 00 00 01 74 05 e8 b9 7a d7 ff c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 89 fa c7 02 00 00 00 00 48 89 f7 57 9d <0f> 1f 44 00 00 c9 c3 66 90 55 48 89 e5 0f 1f 44 00 00 3e ff 07
Call Trace:
 [<ffffffff813769c7>] ? scsi_target_reap+0x77/0xe0
 [<ffffffff8137a435>] ? scsi_remove_target+0x175/0x180
 [<ffffffffa01698e1>] ? __iscsi_unbind_session+0xa1/0x150 [scsi_transport_iscsi]
 [<ffffffffa0169840>] ? __iscsi_unbind_session+0x0/0x150 [scsi_transport_iscsi]
 [<ffffffff8108e050>] ? worker_thread+0x170/0x2a0
 [<ffffffff81093e40>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108dee0>] ? worker_thread+0x0/0x2a0
 [<ffffffff81093ad6>] ? kthread+0x96/0xa0
 [<ffffffff8108dee0>] ? worker_thread+0x0/0x2a0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff81093a40>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20

--------------------------------------------------------------------------------

If VG is deactivated prior to reboot (vgchange -an test_vg) the system reboots printing out only the following warnings during stopping of the iscsi:

Stopping iscsi:  connection2:0: detected conn error (1020)
 connection10:0: detected conn error (1020)
 connection3:0: detected conn error (1020)
 connection4:0: detected conn error (1020)
 connection1:0: detected conn error (1020)
 connection5:0: detected conn error (1020)
 connection6:0: detected conn error (1020)
 connection7:0: detected conn error (1020)
 connection8:0: detected conn error (1020)
 connection9:0: detected conn error (1020)
[  OK  ]


Expected results:

reboot as exptected. 

Additional info:

Comment 2 Bruno Goncalves 2012-10-04 13:08:18 UTC
It is a regression as it is not reproducible on RHEL6.3

2.6.32-279.el6.x86_64
iscsi-initiator-utils-6.2.0.872-41.el6.x86_64

Comment 4 Chris Leech 2012-10-05 00:02:24 UTC
This isn't an iscsi-initiator-utils issue, it's a regression in the kernel.  I reproduced this using only the newer kernel builds on top of a 6.3 install, and bisected it down to a change in kernel-2.6.32-300.el6 (and still reproducible through the latest build 2.6.32-320.el6).

There are no iSCSI initiator changes there either, but 300 did have a large set of device mapper and SCSI changes to look at.

Comment 5 Chris Leech 2012-10-05 06:15:15 UTC
Assigning back to myself, I took a quick look upstream and found a likely fix.  Testing now and will submit the patch for kernel inclusion.

Comment 6 RHEL Program Management 2012-10-05 16:20:44 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 7 David Milburn 2012-10-05 16:53:48 UTC
*** Bug 862024 has been marked as a duplicate of this bug. ***

Comment 8 David Milburn 2012-10-05 16:57:34 UTC
*** Bug 862930 has been marked as a duplicate of this bug. ***

Comment 9 Tomas Henzl 2012-10-06 19:37:00 UTC
*** Bug 863523 has been marked as a duplicate of this bug. ***

Comment 10 Jarod Wilson 2012-10-12 05:00:15 UTC
Patch(es) available on kernel-2.6.32-329.el6

Comment 13 Sibiao Luo 2012-10-15 03:16:57 UTC
*** Bug 865349 has been marked as a duplicate of this bug. ***

Comment 14 Bruno Goncalves 2012-10-15 07:54:50 UTC
Problem fixed on 2.6.32-329.el6.x86_64.


vgcreate test_vg /dev/sdb
  No physical volume label read from /dev/sdb
  Physical volume "/dev/sdb" successfully created
  Volume group "test_vg" successfully created

lvcreate test_vg -L10M
  Found duplicate PV Qj58vQT59Yb1KlgV5qioVCk0ERYfAB0d: using /dev/sdc not /dev/sdb
  Rounding up size to full physical extent 12.00 MiB
  Logical volume "lvol0" created

reboot

system rebooted without problem.

Comment 15 Tony Camuso 2012-10-24 14:38:04 UTC
*** Bug 857360 has been marked as a duplicate of this bug. ***

Comment 17 errata-xmlrpc 2013-02-21 06:49:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html


Note You need to log in before you can comment on or make changes to this bug.