Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1324916

Summary:

Crash from Double-use of a SCSI command after command timeout before scsi dispatch completes

Product:

Red Hat Enterprise Linux 6

Reporter:

David Jeffery <djeffery>

Component:

kernel

Assignee:

Dick Kennedy <dick.kennedy>

kernel sub component:

Storage

QA Contact:

guazhang <guazhang>

Status:

CLOSED WONTFIX

Docs Contact:

Severity:

high

Priority:

high

CC:

brubisch, bubrown, cww, dick.kennedy, djeffery, dkennedy, dlawrenc, emilne, jmagrini, jpittman, laurie.barry, loberman, mlombard, rahhorizon, revers, stalexan, torel, vbendel

Version:

6.6

Target Milestone:

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2017-11-15 21:08:31 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

1324930, 1374441, 1461138

Attachments:

Description	Flags
struct request ffff88a070c137c0 and foreach bt output.	none
Commands requested through 01557267	none

Description David Jeffery 2016-04-07 15:00:51 UTC

The pulling of a SCSI command from the request queue and submitting to the HBA has a rare race condition. The timeout timer is started first, then the command is sent to the HBA or requeued. If the timer times out before the command makes it to the HBA driver or before the requeue stops the timer, the command can end up being actively used by the SCSI error handler to handle the timeout while the task trying to submit or requeue the request also thinks it owns the SCSI command.

This results in the command being active and submitted twice, which can corrupt request queue lists or softirq done lists and lead to accessing freed request structs.

A customer has had several systems which appear to trigger this condition do to setting a very low timeout of 4 seconds. There were large CPU systems with several lpfc HBAs generating high amounts of I/O and network traffic. The vmcore showed a system which crashed from a corrupt block softirq complete list, with an already freed request and loops in the list.

Additionally, the vmcore showed a task interrupted in scsi_request_fn. It was interrupted in the "not_ready" path upon dropping the host_lock and before it could call blk_requeue_request. A series of interrupts and time in interrupt context was close to causing the issue again for the system.

Version-Release number of selected component (if applicable):
2.6.32-504.el6

How reproducible:
Occurs randomly on the customer's system. It's not the exact same trigger, but the same basic issue can be caused by turning up scsi logging so that scsi_log_send prints messages for every command to a slow serial console

Steps to Reproduce:
1. On a multi CPU, multi hba system, configure and have the kernel log messages to a serial console.
2. Set the timeout for the scsi disks very low, like a value of 2.
2. Set scsi_logging_level to a value like "0x3d00" to make scsi_log_send print and make the scsi layer generate lots of messages.
3. The system will crash when a delay in scsi_log_send() to log messages holds up the submission of a command long enough for the race to occur.

Actual results:
System crashes when using a very short SCSI timeout

Expected results:
System should continue to run even if an overly short SCSI timeout is selected.

Additional info:
Secure information concerns mean we only had limited viewing of the vmcore, We do not have free access to it or most of its output.

Comment 2 Tom Coughlan 2016-04-07 15:23:54 UTC

This is not a regression in 6.8, so considering the schedule, and the fact that this problem is rare in systems running with the default config. parameters, it should not block 6.8. Moving to 6.9. Nonetheless, this should be considered a high priority.

Comment 6 Roger Heflin 2016-04-08 17:32:54 UTC

I have access to the vmcore referenced above and can as needed assist with any debugger commands someone may want.

And am I reading this right, the scsi commands submitted have not yet made it to the HBA in this case (ie after around 4 seconds?) and the issue is what happens when those commands get aborted while the HBA is attempting to process them?   Is there anything else that could be adjusted to speed up the IO getting submitted to the HBA, or is this just a basic issue when high IO is being done?   This delay may explain io delays we get when the system is under stress (SAN is not responding correct-bad cables and lost packets).

And this may explain some others crashes I have looked at were we did not get vmcores, the IO subsystem was under stress (SAN issues again) and the node paniced during the stress, this is also pretty rare, and does not involve the extra driver being loaded in the kernel.

Good work finding it and good luck fixing.

Comment 7 Ewan D. Milne 2016-04-11 17:34:56 UTC

If you have access to the dump, could you examine the current "jiffies" value
as well as the "->start_time" of the request that caused the machine to crash.
I am not sure how to locate the request given the bug description above, but
another request might be locatable in the task that is in the "not_ready" path
in scsi_request_fn() as mentioned.  The "->start_time" of that request would be
interesting also.

I am looking through the code in RHEL6 to see how we can protect against this race.

Comment 8 Roger Heflin 2016-04-11 18:41:17 UTC

I added all of the previous debugging into and the jiffies and the struct request for the bad structure, not sure if it is what you want, but this may be enough for you to tell me the command to get what you want.

Oops trace:


BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
IP: [<ffffffff8127834f>] blk_done_softirq+0x7f/0xa0
PGD dfd338b067 PUD cd80ed6067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:0b:00.1/host6/rport-6:0-4/target6:0:2/6:0:2:83/state
CPU 50
Modules linked in: hangcheck_timer oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) krg_11_0_0_1130_impRHEL6K1smp-x86_64(P)(U) mptctl mptbase nfsd exportfs oracleasm(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp stp llc ipv6 ext3 jbd dm_round_robin iTCO_wdt iTCO_vendor_support be2net ixgbe dca mdio e1000e ptp pps_core microcode ipmi_devintf serio_raw lpc_ich mfd_core hpilo hpwdt i7core_edac edac_core sg power_meter acpi_ipmi ipmi_si ipmi_msghandler bnx2 shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod lpfc scsi_transport_fc scsi_tgt crc_t10dif pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

Pid: 205, comm: ksoftirqd/50 Tainted: P           ---------------    2.6.32-504.1.3.el6.x86_64 #1 HP ProLiant DL980 G7
RIP: 0010:[<ffffffff8127834f>]  [<ffffffff8127834f>] blk_done_softirq+0x7f/0xa0
RSP: 0018:ffff88a070c03f18  EFLAGS: 00010216
RAX: 0000000000000000 RBX: ffff88a070c03f18 RCX: ffff88adba447490
RDX: ffff88a070c03f18 RSI: ffff881fd299bbc0 RDI: ffff88a070c137c0
RBP: ffff88a070c03f38 R08: 0000000000000000 R09: 0000000000000000
R10: ffff881fd1a74400 R11: 00000000055b9b1f R12: ffffffff81a830a0
R13: 0000000000000020 R14: 0000000000000100 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff88a070c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000000000b0 CR3: 000000cd920a0000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ksoftirqd/50 (pid: 205, threadinfo ffff881fd3a42000, task ffff881fd3a3eaa0)
Stack:
 ffff88adba447490 ffff88ae8b7d14d0 0000000000000100 0000000000000029
<d> ffff88a070c03fa8 ffffffff8107d8b1 0000000000000032 ffff88a070c03f68
<d> 0000003200000005 ffff881fd3a43fd8 ffff881fd3a43fd8 0000000000011440
Call Trace:
 <IRQ>
 [<ffffffff8107d8b1>] __do_softirq+0xc1/0x1e0
 [<ffffffff8100c30c>] call_softirq+0x1c/0x30
 <EOI>
 [<ffffffff8100fc15>] ? do_softirq+0x65/0xa0
 [<ffffffff8107d470>] ksoftirqd+0x80/0x110
 [<ffffffff8107d3f0>] ? ksoftirqd+0x0/0x110
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
Code: e0 48 39 d8 74 34 66 0f 1f 44 00 00 48 8d 78 f0 48 8b 4f 10 48 8b 57 18 48 89 51 08 48 89 0a 48 89 47 10 48 89 47 18 48 8b 47 38 <ff> 90 b0 00 00 00 48 8b 45 e0 48 39 d8 75 d2 48 83 c4 18 5b c9
RIP  [<ffffffff8127834f>] blk_done_softirq+0x7f/0xa0
 RSP <ffff88a070c03f18>
CR2: 00000000000000b0

Here is all previous debugging that was asked for, current jiffies are at the end.

Not sure which one was the crash, I have included a struct request ffff88adb2c49080 which is the out of bounds one.

      KERNEL: vmlinux
    DUMPFILE: vmcore_charabldb1_010716  [PARTIAL DUMP]
        CPUS: 160
        DATE: Wed Jan  6 16:12:18 2016
      UPTIME: 1 days, 23:27:12
LOAD AVERAGE: 38.86, 14.07, 6.71
       TASKS: 4163
    NODENAME: charabldb1
     RELEASE: 2.6.32-504.1.3.el6.x86_64
     VERSION: #1 SMP Fri Oct 31 11:37:10 EDT 2014
     MACHINE: x86_64  (2393 Mhz)
      MEMORY: 1006 GB
       PANIC: "Oops: 0000 [#1] SMP " (check log for details)
         PID: 205
     COMMAND: "ksoftirqd/50"
        TASK: ffff881fd3a3eaa0  [THREAD_INFO: ffff881fd3a42000]
         CPU: 50
       STATE: TASK_RUNNING (PANIC)

crash> where
No stack.
gdb: gdb request failed: where
crash> p
Usage:
  p [-x|-d][-u] [expression | symbol[:cpuspec]]
Enter "help p" for details.
crash> p ((struct scsi_cmnd *)0xffff888eaf5f3980)->device->host->hostt[0]
$1 = {
  module = 0xffffffffa030f560,
  name = 0xffffffffa02f5754 "lpfc",
  detect = 0x0,
  release = 0x0,
  info = 0xffffffffa02c97a0,
  ioctl = 0x0,
  compat_ioctl = 0x0,
  queuecommand = 0xffffffffa02cbcc0,
  transfer_response = 0x0,
  eh_abort_handler = 0xffffffffa02c8080,
  eh_device_reset_handler = 0xffffffffa02c7e40,
  eh_target_reset_handler = 0xffffffffa02c7bf0,
  eh_bus_reset_handler = 0xffffffffa02c77f0,
  eh_host_reset_handler = 0xffffffffa02c6da0,
  slave_alloc = 0xffffffffa02c9310,
  slave_configure = 0xffffffffa02cdec0,
  slave_destroy = 0xffffffffa02c90c0,
  target_alloc = 0x0,
  target_destroy = 0x0,
  scan_finished = 0xffffffffa02b40d0,
  scan_start = 0x0,
  change_queue_depth = 0xffffffffa02cb000,
  change_queue_type = 0xffffffffa02c6740,
  bios_param = 0x0,
  proc_info = 0x0,
  eh_timed_out = 0x0,
  proc_name = 0x0,
  proc_dir = 0x0,
  can_queue = 0,
  this_id = -1,
  sg_tablesize = 64,
  max_sectors = 65535,
  dma_boundary = 0,
  cmd_per_lun = 3,
  present = 0 '\000',
  supported_mode = 0,
  unchecked_isa_dma = 0,
  use_clustering = 1,
  emulated = 0,
  skip_settle_delay = 0,
  ordered_tag = 0,
  lockless = 0,
  max_host_blocked = 0,
  shost_attrs = 0xffffffffa030d680,
  sdev_attrs = 0x0,
  legacy_hosts = {
    next = 0x0,
    prev = 0x0
  },
  vendor_id = 72057594037932255
}
crash> list -r -H ffff88a070c03f18
ffff88ae8b7d14d0
ffff88dfce1f3b38
ffff88add004fcc0
ffff88aeb126b370
ffff88ae63fb1cc0
ffff88ae7bce1080
ffff88adb7ac91d0
ffff88add004f080
ffff883fd2aaccc0
ffff88ae7ef11e48
ffff88ae9114c330
ffff88cdd7542518
ffff88e9f984b828
ffff88ae148ba6a0
ffff88ae7bce16a0
ffff88ae66172390
ffff88adecabe370
ffff88adcad66518
ffff88adcad66390
ffff88ae8fc3b518
ffff88ae7bce1e48
ffff88adcad66208
ffff88bfd14d56a0
ffff88ae50d3e828
ffff88e999b25080
ffff88ad97f6c828
ffff88ad97f6c6a0
ffff88aeb13c8390
ffff88ae9138f6a0
ffff88ae907ee9b0
ffff88aeb13bcb70
ffff88ad97f6c080
ffff88ad97f6c208
ffff88add004f6a0
ffff88add004f9b0
ffff88aeb105c080
ffff88bfd1be1208
ffff88ae911aacc0
ffff88bfd196dcc0
ffff88ae912f76a0
ffff88aeb13c8cc0
ffff88adac7ff828
ffff88ada7246208
ffff88ae68f7bcd0
ffff88adac7ff9b0
ffff88adac7ffcc0
ffff88ae21d21370
ffff88ae10a21210
ffff88ae90eefe48
ffff88eaa1817080
ffff88cda8c889b0
ffff88adb2d489b0
ffff88ae90eefcc0
ffff88ae906cae48
ffff88ae14ab7630
ffff88aeb2bcde48
ffff88ae25263828
ffff88ae21d21790
ffff88ae66172828
ffff88bfd14d59b0
ffff88bfd14d5cc0
ffff88cdd77ef390
ffff88adb2d48518
ffff88ae7ed65cc0
ffff882dd458bb38
ffff88ce5cabd518
ffff88ae250e16a0
ffff88ae250e1e48
ffff88dfce1f3cc0
ffff88bfd2b6fcc0
ffff88bfd2b6f390
ffff88adb6738e48
ffff88ad93fb41d0
ffff88e9e0911518
ffff88ae66328e48
ffff88ade36a06a0
ffff882d95c7a080
ffff88ad93fb4070
ffff88e9746d0e48
ffff88aeb102c210
ffff88ae8e307b38
ffff882db52f3e48
ffff88ae8e307828
ffff88adb65ae6a0
ffff88ad93fae210
ffff88bfd2b6f080
ffff88bfd2b6f828
ffff88aeb19d7e48
ffff88ceaa1159b0
ffff88adfd1359b0
ffff88ae7679c208
ffff88bfd0d38b38
ffff88bfd0d38828
ffff88cda179ab38
ffff88adfd172518
ffff88ae7679c828
ffff88ae906ca080
ffff88add0150b70
ffff88ae906ca9b0
ffff88ceaaccf828
ffff88ae9111c390
ffff88ae0c49da10
ffff88dfd08c16a0
ffff88adecabe8f0
ffff88adc3abd518
ffff88ae2537e828
ffff88adb7803a10
ffff88adc3abd390
ffff88ae911aa390
ffff88ae68f7bb70
ffff88e9a7483b38
ffff88adf3bd3cd0
ffff88bfd1b31518
ffff88aeb1bf7080
ffff88ad93faf518
ffff88adb65a8e48
ffff88aeb13c8208
ffff88bfd0d38518
ffff88aeb2ada390
ffff88adc3abdcc0
ffff88bfd0905b38
ffff88ae25263080
ffff88bfd0905080
ffff880e94186390
ffff88ae25263518
ffff88ade1586390
ffff88ae25263208
ffff88ae5753c210
ffff88adb2c49080
ffff88aeb12bd518
ffff880e9eef7208
ffff886d1630c390
ffff88ae923cf390
ffff88cda8c88518
ffff88ae31b04e48
ffff88adb67389b0
ffff88ae17e926a0
ffff88aeb19e9cc0
ffff88adc3abd208
ffff88adf3809080
ffff88ae911bc828
ffff88aeb19e9518
ffff88ae906ca6a0
ffff88adf3809518
ffff88aeb19e9080
ffff88aeaeae26a0
ffff884eaf304b38
ffff88bfd2d2be48
ffff88aea9c6a080
ffff88adf3809e48
ffff88cda8c88828
ffff88aeb1331390
ffff88addadc9e48
ffff882d75dec6a0
ffff880e9eef79b0
ffff88ade36a09b0
ffff880cc28f99b0
ffff88ae7ef11390
ffff884c6dc6f9b0
ffff884c12dc3e48
ffff885fd08ddcc0
ffff88add406a6a0
ffff88bfd2067208
ffff88ae250e4cc0
ffff88ae250e4208
ffff88bfd0d92e48
ffff88ae8f98ae48
ffff886d0217de48
ffff886eab1ef080
ffff886ce6b22cc0
ffff880e9dc8d9b0
ffff880e8dad4390
ffff88ade791e518
ffff880db033db38
ffff88bfd20679b0
ffff88ae221679b0
ffff88ae66172208
ffff88aeaebf7080
ffff88ada720ae48
ffff88bfd29239b0
ffff88bfd2923cc0
ffff88ad9c712080
ffff88aeaebf7208
ffff88adf1563cc0
ffff88ade8ec7b38
ffff88ae7bcad390
ffff88cdd7543cc0
ffff884eaad80080
ffff88bfd1bc46a0
ffff884eaa444518
ffff88ae8ff2f390
ffff88ae21d24080
ffff88add01ee208
ffff88ae911aa518
ffff88adb67386a0
ffff88e99dc8e828
ffff88ad97f6ccc0
ffff88ae66172e48
ffff88ad97f6cb38
ffff884eaafb06a0
ffff88e9e09106a0
ffff88ae907eecc0
ffff88ae663289b0
ffff88ae907ee208
ffff88add007b9b0
ffff88ae265f4cc0
ffff884eac0f0cc0
ffff88adf752ccc0
ffff88ae84c5b9b0
ffff88aeb12bd208
ffff88bfd2923b38
ffff88ade8ec7080
ffff884dc4ec7828
ffff88adf752ce48
ffff88eab6f846a0
ffff884d3d465390
ffff88aeb1331828
ffff88aeb1331cc0
ffff88bfd1bc4390
ffff88e9635bc828
ffff88bfd114a6a0
ffff88adcb596080
ffff88aeb13c8518
ffff88bfd114ab38
ffff88e929f02518
ffff88ade8ec7518
ffff88ae910ef828
ffff88ae24062e48
ffff88adb79b1cc0
ffff88ae66172518
ffff88ada72476a0
ffff884bad02ab38
ffff88adb79b19b0
ffff884ddb078828
ffff88aeb12bd390
ffff88adf14d1080
ffff884ca3563390
ffff88adac7ff6a0
ffff88aeb1331208
ffff88ae661726a0
ffff88bfd2923e48
ffff884eabc9cb38
ffff88ade8e76828
ffff88aeb19e96a0
ffff88dfd27a1828
ffff884e3e8a5828
ffff88aeb12bd080
ffff88aeb12bde48
ffff88adf14d19b0
ffff88e929d91e48
ffff88ae250e1518
ffff88ad93fafb38
ffff885fcf29a208
ffff88e95d1fce48
ffff88ae90d95cc0
ffff88aeb1331518
ffff88bfd2e03208
ffff88ae250e1828
ffff88bfd0a40080
ffff88bfd1ff0828
ffff886cf997d6a0
ffff884eac132208
ffff88e9504af9b0
ffff88aeb13c8b38
ffff88adb2c49518
ffff88ade8e769b0
ffff88ae911bce48
ffff884e3eae3828
ffff88adfd134080
ffff88e9af9bc080
ffff887fd1a2e390
ffff88ae17db66a0
ffff884eafd48208
ffff88eaa1221518
ffff88adac785390
ffff88ae616abcc0
ffff88ada7246080
ffff88bfd29236a0
ffff88cdf26bb9b0
ffff884eaa7d76a0
ffff884eaa895390
ffff88adac785828
ffff88ae04b65cc0
ffff88bfd1ff0080
ffff88ae1ec7de48
ffff88bfd235e518
ffff88bfd235e6a0
ffff88e9f30f9390
ffff88adb65a8390
ffff88adb65a8208
ffff880db1feb208
ffff88ceaa488518
ffff880e9dcfeb38
ffff88ae91381b38
ffff88cdc72e9390
ffff88bfd235ecc0
ffff88e9a9af6518
ffff88ceaa3789b0
ffff88aeb13c8e48
ffff88cd7c904518
ffff88bfd1bc4080
ffff88adfd1349b0
ffff88adb2d48208
ffff88ae08f06208
ffff884d4c98d208
ffff88ae91140080
ffff88cd7c9046a0
ffff886ea7a13828
ffff88ae90c65208
ffff88ae19a32390
ffff88aeaebf76a0
ffff88ae04b65828
ffff88ae08f06390
ffff884eaf34e6a0
ffff88ae04b65b38
ffff88bfd0a9a080
ffff885fd08dd208
ffff881fcac1de48
ffff88ae9138fcc0
ffff88bfd2067390
ffff88ae50d3e080
ffff88bfd2067828
ffff88bfd2067080
ffff88bfd2067e48
ffff88aeb105cb38
ffff88aeb105c6a0
ffff88ae0ef5eb38
ffff881fcbb20b38
ffff88aeb105c828
ffff88ae90668cc0
ffff88ae7655b6a0
ffff88adb2c49cc0
ffff88adbccc3080
ffff88adb2c49e48
ffff88ad93fafcc0
ffff88bfd0d92b38
ffff88aeb1360518
ffff886e753c6b38
ffff88ae90d95e48
ffff88adb2c49828
ffff88aeb1360390
ffff88ae90c65080
ffff88e9cfc4d9b0
ffff88fbd356e6a0
ffff88aeb13609b0
ffff886d0db7a390
ffff88ae91140b38
ffff88ad9c712390
ffff88e9b1b3d9b0
ffff886e3558f390
ffff88ae1d3d6cc0
ffff88e950606cc0
ffff88ad9c712cc0
ffff88ae7ef11cc0
ffff88bfd1b31080
ffff88bfd2067518
ffff88aeb1360828
ffff88bfd2067cc0
ffff884ea9c949b0
ffff88adbccc3828
ffff88bfd2ae5208
ffff88ae9040fe48
ffff88ae04b65390
ffff88dfd09846a0
ffff88ade8ec7e48
ffff88eaa1221080
ffff88ae04b65e48
ffff88ae9139f828
ffff880db1febb38
ffff88ae92350208
ffff88ae66185390
ffff88ae04b65080
ffff88bfd2ae5080
ffff88ae04b65518
ffff88adcb5946a0
ffff88ae9139fb38
ffff88ae04b65208
ffff88ae7ef11b38
ffff88e9b1bba390
ffff880db1feb6a0
ffff88adfd172080
ffff88adb6738b38
ffff88ae2abbfb38
ffff88ae19ac86a0
ffff88ae7bcadcc0
ffff88aea9c856a0
ffff88bfd1b31cc0
ffff88ae161799b0
ffff88aeaeb68e48
ffff88bfd2d92080
ffff88bfd0d92390
ffff88bfd0d92208
ffff88add02ac208
ffff88ae16179cc0
ffff88ae911bccc0
ffff88ad9c712828
ffff88ae16179518
ffff88addbcd6828
ffff88addadc9208
ffff88ae8e307208
ffff88ae8e307080
ffff88ae9139f208
ffff88ae265f4b38
ffff88ade1586208
ffff88ae8e307390
ffff88adb2c49208
ffff88bfd0d389b0
ffff88cd8f5f56a0
ffff88ae1d3d6e48
ffff88ad93faf9b0
ffff88ae90c65390
ffff88adfd172828
ffff88ce47371cc0
ffff880cb9542390
ffff88ae7655b828
ffff884eaed1a390
ffff88ade791ee48
ffff88add02ace48
ffff88bfd0905390
ffff884eaa200e48
ffff88aea9c85828
ffff88add01ee6a0
ffff88ade791e9b0
ffff88ae90c65518
ffff88add02ac390
ffff88bfd0c80080
ffff88ae92350390
ffff886e7af02b38
ffff88ae220879b0
ffff88ae22087b38
ffff880d89d5c9b0
ffff880d5d7df208
ffff88adceeb6390
ffff880c82b4eb38
ffff88ad97f42cc0
ffff88adf14e9080
ffff88ae910ed518
ffff88adf3bc7518
ffff884ca14e7390
ffff88bfd0d38080
ffff88cdae95e208
ffff88bfd2923080
ffff880dbb94e6a0
ffff880c6d79eb38
ffff880dbbb52080
ffff88fbd6a23e48
ffff88ae17e92080
ffff884e1ae86390
ffff88cd8f5f5518
ffff884eb06e59b0
ffff88e9504b6208
ffff88eab511ae48
ffff88fbd5324208
ffff88adc15ce828
ffff88ae221cbb38
ffff88adb390b208
ffff880d5b42ce48
ffff88bfd0d92cc0
ffff88dfd1b1f390
ffff88adb390b9b0
ffff88ae90f66080
ffff88aea9c85cc0
ffff88a070c117e0
ffff88ae7ed9f518
ffff88aeaeae2080
ffff88ae22486cc0
ffff88aeb24e1518
ffff886eaac6b828
ffff88addad83390
ffff886eaa826b38
ffff88bfd2ae5b38
ffff88ce0c985828
ffff88adcae27b38
ffff88ae8e323390
ffff88adf1563080
ffff88aeb2adacc0
ffff88ae90eef828
ffff88e9ebe57390
ffff88adfd0d6b38
ffff88ae9111c6a0
ffff88add403a390
ffff88e975de4518
ffff886d2ccf2b38
ffff88e9b1b3d828
ffff88ae9111c208
ffff88cdc7005080
ffff88add02ac9b0
ffff88e9f4de2828
ffff88adfd0d6390
ffff88e9de77a9b0
ffff88addadc9390
ffff88ae9111ce48
ffff886ea759c390
ffff88ae22167518
ffff88ceaa115208
ffff88adf3bc7390
ffff88adf3bc7080
ffff88ae907ee6a0
ffff88eab56d8b38
ffff884b6d2e8080
ffff882db73d7390
ffff88fbd3589080
ffff886e4eeea208
ffff88ad97db7208
ffff88adb2d486a0
ffff88ad97db7390
ffff88e99bccb518
ffff88ae250e1208
ffff886ea6dc8828
ffff88adb2c49b38
ffff88adfd0d6208
ffff88ae907ee828
ffff88bfd1be1b38
ffff885fd0b82208
ffff88bfd0d38208
ffff88bfd0d38cc0
ffff886d35178cc0
ffff88ae148ba828
ffff88aeb2adae48
ffff88cdd7776b38
ffff88ae0edf8b38
ffff88cdeeaa2cc0
ffff88aeb13c89b0
ffff88ae21d24828
ffff88ae9131a9b0
ffff88ea1e277b38
ffff88ae905906a0
ffff88aeb2bcd828
ffff88cd9a917828
ffff88aeaea71b38
ffff88aeaea71cc0
ffff88ae911aab38
ffff884e8e773518
ffff88aeaea71828
ffff88bfd0a409b0
ffff88ae911aa6a0
ffff88e9504b6518
ffff88aeb1bd0518
ffff88fbd6a23b38
ffff884d7a193e48
ffff88ae265f4828
ffff88aeaeae2208
ffff88ae8ff2f080
ffff88cdd77ef828
ffff88ae8ff2fcc0
ffff88ae66172cc0
ffff88aeaea71e48
ffff88ae2537e6a0
ffff88bfd2b6f9b0
ffff88ae923cfcc0
ffff880d5b51bb38
ffff880e4d5da9b0
ffff88bfd0a40518
ffff88aeb12ad390
ffff88ae910f7e48
ffff885fd127b390
ffff88ad9c675390
ffff88eab66ff6a0
ffff88eab5714518
ffff88ae905909b0
ffff88adb7acc208
ffff88ae66172b38
ffff884eaca9e080
ffff886ce6b229b0
ffff88adb7acc080
ffff88ae7ef11208
ffff88ae22486080
ffff88ae90f66b38
ffff88ae7ed9fcc0
ffff88ae2536ab38
ffff88ae2537e080
ffff88adf1563518
ffff88ae7ed9f390
ffff88bfd0d38390
ffff88adbcfdc208
ffff88ada720ab38
ffff88ae616ab9b0
ffff88adb7accb38
ffff88adb390b6a0
ffff882dc623b390
ffff880db1feb390
ffff88ae25263b38
ffff88adb65ae9b0
ffff88adb79b1b38
ffff884ead8aa208
ffff880c6d79e6a0
ffff88aeb12adcc0
ffff88aeb1bd0b38
ffff88bfd0c80208
ffff88ae209bb080
ffff88addadc9b38
ffff88ae7bcad9b0
ffff88ae51ed0828
ffff88ae8fc3bcc0
ffff882e75511b38
ffff882d5df6e9b0
ffff882dd826f208
ffff882dd8a6f9b0
ffff882d75cb8080
ffff882d9f8636a0
ffff882ea8ca7390
ffff883fd0b3ab38
ffff88adb7acc518
ffff88ae22167b38
ffff882db924e6a0
ffff882db73d6b38
ffff882e7548eb38
ffff88ae9131a208
ffff883fd0ae3518
ffff882ea948b6a0
ffff88ae7ed65080
ffff88ae22486208
ffff88ae8f98a390
ffff88ae21d24518
ffff882dd7da1390
ffff88aeb19e9828
ffff88ada70f89b0
ffff88bfd114acc0
ffff88adf1563390
ffff88ae7ed65518
ffff88ae8f98ab38
ffff886cf9a4e6a0
ffff88ae224869b0
ffff88adfd134390
ffff88addad83cc0
ffff88ae7bcad080
ffff88adfd0d6518
ffff88bfd2a5f518
ffff88ae8f98a828
ffff88bfd2e03390
ffff88ae907ee080
ffff88addad836a0
ffff88ae8f98acc0
ffff88ae21d24390
ffff88ae51ed06a0
ffff88ae21d24e48
ffff88adf3809b38
ffff88ae8f98a518
ffff88aea9c85518
ffff88aeb1bf7828
ffff88aea9c85208
ffff88aeb1bf7390
ffff88ae21d24b38
ffff88ae072c1518
ffff886d054779b0
ffff88aeb19e9e48
ffff88adb390bb38
ffff88adb390b080
ffff88aeaeb689b0
ffff88aeaeb68cc0
ffff88aeb1bf79b0
ffff88ae7bcad518
ffff88ae221cb080
ffff88bfd0a40e48
ffff88bfd0a40828
ffff88adf38096a0
ffff88aeb1bf7518
ffff88aeb1bf7208
ffff88aeaeb686a0
ffff88ae072c1e48
ffff88ae9139fe48
ffff88bfd0a406a0
ffff88ae2536a080
ffff88ae2536a208
ffff88aea9c85b38
ffff88bfd0a40390
ffff88ae221cb828
ffff88addbcd66a0
ffff88aeb12bd9b0
ffff88ae910ed080
ffff88ae252636a0
ffff88ae252639b0
ffff884eac548080
ffff88ae2536a390
ffff88ae0edf8390
ffff88ae58504390
ffff88bfd0a40208
ffff88ae58504208
ffff88adb2c49390
ffff88aea9c859b0
ffff88ae7ed9f828
ffff88adcee819b0
ffff88add004f208
ffff88aeb12e6e48
ffff88ada720a6a0
ffff88aeb12e6b38
ffff88adcd691cc0
ffff88adcd691828
ffff88adb2d48cc0
ffff88bfd2d2bb38
ffff88adb2d48390
ffff88ae910ed6a0
ffff88bfd2d2b828
ffff88bfd2d2b208
ffff88adcd691b38
ffff88ae25263cc0
ffff88ada720a9b0
ffff88aeaeae29b0
ffff88ae910ed9b0
ffff88ae616ab6a0
ffff88ae616ab208
ffff88aeb12e6208
ffff88ae910edb38
ffff88ae25263e48
ffff88adb67a3390
ffff88adbcfdc080
ffff88adbcfdc518
ffff88adbcfdcb38
ffff88adbcfdc9b0
ffff88adbcfdc828
ffff88adbcfdccc0
ffff88adb7acc9b0
ffff88aeb12ad9b0
ffff88adb67a39b0
ffff88aeb12ad828
ffff88aeb12adb38
ffff88bfd0c80518
ffff88bfd2923208
ffff88ae7ef11828
ffff88bfd0c80390
ffff88aeb12ad080
ffff88adb2c496a0
ffff88ae7ed9f6a0
ffff88ad97f6ce48
ffff88ad97f6c390
ffff88aeb12bdb38
ffff88ae7ef119b0
ffff88aeb12bd828
ffff88bfd0c80b38
ffff88bfd2923518
ffff88bfd2b6f6a0
ffff88cdd77efe48
ffff88bfd2d2b390
ffff88bfd2d2b9b0
ffff88adf14e9390
ffff88adbccc3390
ffff88aeb2ada9b0
ffff88ae91296390
ffff88ae905046a0
ffff88ae9138f828
ffff88ae9138f9b0
ffff88aeb2ada828
ffff88aeb2ada518
ffff88adb67a3828
ffff88ad97f6c518
ffff88bfd0c80cc0
ffff88bfd14d5518
ffff88bfd0c806a0
ffff88ade791e208
ffff88bfd2923828
ffff88aeaeae2b38
ffff88ae2537e9b0
ffff88addad83828
ffff88adb2d48b38
ffff88ae90668518
ffff88ae91296828
ffff88addd186208
ffff88ae911bcb38
ffff88adb6739518
ffff88ae91296e48
ffff88adfd135390
ffff88bfd235e208
ffff88bfd235eb38
ffff88ae90504208
ffff88ae91296cc0
ffff88bfd2923390
ffff88ae911aa828
ffff88ae2536a9b0
ffff88ae90590b38
ffff88ae911aae48
ffff88aeb12ad518
ffff88bfd14d5208
ffff88adfd134828
ffff88add007bcc0
ffff88bfd1ab1828
ffff88bfd2e036a0
ffff88adfd134cc0
ffff88addbcd6cc0
ffff88aeb1e53e48
ffff88ade8e766a0
ffff88bfd0d929b0
ffff88adf1563e48
ffff88ade36a0390
ffff88adb7acce48
ffff88add02ac080
ffff88bfd114a9b0
ffff88adfd134e48
ffff88adf14e9e48
ffff88addd186e48
ffff88ae0ece3208
ffff88aeb24e1cc0
ffff88ae90c98080
ffff88ada70f8518
ffff88adcb594208
ffff88ae911bc208
ffff88ae265f4208
ffff88ae7ed9f208
ffff88adb390b518
ffff88ae90f666a0
ffff88ada70f8390
ffff88aeaea71080
ffff88adceeb6828
ffff88aeb19e99b0
ffff88ae7bcadb38
ffff88ae911406a0
ffff88ae2536acc0
ffff88ae19a329b0
ffff88adf1656cc0
ffff88adb7acc6a0
ffff88ae90590518
ffff88ae51ed0e48
ffff88ae7ef11518
ffff88ae1ec7d6a0
ffff88ae90504518
ffff88ad9c712b38
ffff88ae2536ae48
ffff88aeaebf79b0
ffff88adb65a8b38
ffff88ae2d0f0828
ffff88ae911bc9b0
ffff88aea9c85e48
ffff88aeaea71390
ffff88ae90668208
ffff88adf14e9cc0
ffff88adf752c208
ffff88aeaeb17cc0
ffff88adb67a3cc0
ffff88cda13109b0
ffff88adcad66828
ffff88ae7ed659b0
ffff88aeaea71518
ffff88adf14e9b38
ffff88adcb594e48
ffff88aeae8d0390
ffff88bfd1be1080
ffff886e744a1e48
ffff88ae2536a828
ffff88ae2536a6a0
ffff88adcae27208
ffff88ae923cfb38
ffff88addd1869b0
ffff88aea9e22828
ffff88adcb594828
ffff88ae8fc3bb38
ffff88ae91296080
ffff88aeb24e19b0
ffff88aea9c85080
ffff88adbccb9828
ffff88ae905049b0
ffff88ae585049b0
ffff88adb67a3b38
ffff88adb67a3e48
ffff88ae29d81080
ffff88adcee81b38
ffff88ae912966a0
ffff88ae2d0f0e48
ffff88ae923cf6a0
ffff88ae51ed0208
ffff88adcad669b0
ffff88bfd19a9cc0
ffff88adf14e9208
ffff88ae072c1cc0
ffff88ae7bce1b38
ffff88ae911aa208
ffff88ae22087828
ffff88ae8e3076a0
ffff88bfd196d828
ffff88ae7655b9b0
ffff88ae9040f518
ffff88ada70f8b38
ffff88ae7679c390
ffff88ade8e76cc0
ffff88bfd196d390
ffff88ae17db6208
ffff88ae9040f390
ffff88ae616ab828
ffff88ae19ac8b38
ffff88ae84c5b080
ffff88ae8b457080
ffff88ae8b457e48
ffff88ae8b457cc0
ffff88adb79b1518
ffff88aeb24e16a0
ffff88aeb1bf7b38
ffff88bfd2a5f080
ffff88bfd2a5fcc0
ffff88bfd2a5f9b0
ffff88bfd2a5f390
ffff88bfd2a5fe48
ffff88cdd77ef9b0
ffff88bfd2a5f828
ffff88ae250e4518
ffff88ae250e4b38
ffff88ae923cf9b0
ffff880cb52c3518
ffff88ae911bc6a0
ffff880dbb86fcc0
ffff88e925bd7828
ffff88adb2c499b0
ffff886d12663e48
ffff88bfd2b6fb38
ffff88eaa0fdd208
ffff88ae923cf518
ffff88ae923cf208
ffff88dfd08c1080
ffff884bfe6f9518
ffff88adb2c49080

list: duplicate list entry: ffff88adb2c49080
crash> kmem ffff88aeb1331b38
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88fbd6740380 dm_rq_target_io          392      47375     57320   5732     4k
SLAB              MEMORY            TOTAL  ALLOCATED  FREE
ffff88aeb1331000  ffff88aeb1331058     10          6     4
FREE / [ALLOCATED]
  [ffff88aeb1331b10]

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea02636c32b8 aeb1331000                0 ffff88ae34acf3c0  1 2c0000000000080 slab
crash>
crash> kmem ffff88adb2c49080
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88fbd6740380 dm_rq_target_io          392      47375     57320   5732     4k
SLAB              MEMORY            TOTAL  ALLOCATED  FREE
ffff88adb2c49000  ffff88adb2c49058     10         10     0
FREE / [ALLOCATED]
  [ffff88adb2c49058]

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea025ff1aff8 adb2c49000                0 ffff88addafbd100  1 2c0000000000080 slab
crash> kmem ffff884bfe6f9518
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88fbd6740380 dm_rq_target_io          392      47375     57320   5732     4k
SLAB              MEMORY            TOTAL  ALLOCATED  FREE
ffff884bfe6f9000  ffff884bfe6f9058     10          4     6
FREE / [ALLOCATED]
  [ffff884bfe6f94f0]

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0109fa8678 4bfe6f9000                0 ffff885fd2c63380  1 140000000000080 slab
crash> kmem ffff88aeb12bd518
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88fbd6740380 dm_rq_target_io          392      47375     57320   5732     4k
SLAB              MEMORY            TOTAL  ALLOCATED  FREE
ffff88aeb12bd000  ffff88aeb12bd058     10         10     0
FREE / [ALLOCATED]
  [ffff88aeb12bd4f0]

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea02636c1958 aeb12bd000                0 ffff88ae90ce8d80  1 2c0000000000080 slab
crash> struct request ffff88aeb1331b28
struct request {
  queuelist = {
    next = 0xffff881fcd733088,
    prev = 0xffff884eac5b3b28
  },
  csd = {
    list = {
      next = 0xffff88adb2c49080,
      prev = 0xffff88cd7cb87828
    },
    func = 0xffffffff81278370 <trigger_softirq>,
    info = 0xffff88aeb1331b28,
    flags = 0,
    priv = 0
  },
  q = 0xffff881fcd732ce8,
  cmd_flags = 16784965,
  cmd_type = REQ_TYPE_FS,
  atomic_flags = 1,
  cpu = 130,
  __data_len = 8192,
  __sector = 803116671,
  bio = 0xffff88ae209fabc0,
  biotail = 0xffff88ae209fabc0,
  hash = {
    next = 0x0,
    pprev = 0x0
  },
  {
    rb_node = {
      rb_parent_color = 18446612882611444648,
      rb_right = 0x0,
      rb_left = 0x0
    },
    completion_data = 0xffff88aeb1331ba8
  },
  {
    elevator_private = {0x0, 0x0, 0x0},
    flush = {
      seq = 0,
      list = {
        next = 0x0,
        prev = 0x0
      }
    }
  },
  rq_disk = 0xffff883fd1dff000,
  start_time = 4465498210,
  start_time_ns = 170936376757782,
  io_start_time_ns = 170936624155168,
  nr_phys_segments = 1,
  ioprio = 0,
  ref_count = 1,
  special = 0xffff88ceaa2399c0,
  buffer = 0x0,
  tag = 0,
  errors = 0,
  __cmd = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
  cmd = 0xffff88bfd18e6710 "*",
  cmd_len = 16,
  extra_len = 0,
  sense_len = 0,
  resid_len = 8192,
  sense = 0x0,
  deadline = 4465502457,
  timeout_list = {
    next = 0xffff884eac5b3c50,
    prev = 0xffff881fcd7330f8
  },
  timeout = 4000,
  retries = 0,
  end_io = 0xffffffffa0002a00,
  end_io_data = 0xffff88aeb1331b10,
  next_rq = 0x0,
  pad = 0x0
}
crash> struct request ffff88adb2c49070
struct request {
  queuelist = {
    next = 0xffff88adb67389a0,
    prev = 0xffff88adb2c499a0
  },
  csd = {
    list = {
      next = 0xffff884bfe6f9518,
      prev = 0xffff88aeb12bd518
    },
    func = 0xffffffff81278370 <trigger_softirq>,
    info = 0xffff88adb2c49070,
    flags = 1,
    priv = 0
  },
  q = 0xffff881fcd641328,
  cmd_flags = 16784965,
  cmd_type = REQ_TYPE_FS,
  atomic_flags = 1,
  cpu = 135,
  __data_len = 8192,
  __sector = 721182495,
  bio = 0xffff88adf39925c0,
  biotail = 0xffff88adf39925c0,
  hash = {
    next = 0x0,
    pprev = 0x0
  },
  {
    rb_node = {
      rb_parent_color = 18446612878342787312,
      rb_right = 0x0,
      rb_left = 0x0
    },
    completion_data = 0xffff88adb2c490f0
  },
  {
    elevator_private = {0x0, 0x0, 0x0},
    flush = {
      seq = 0,
      list = {
        next = 0x0,
        prev = 0x0
      }
    }
  },
  rq_disk = 0xffff883fd1dd6000,
  start_time = 4465499326,
  start_time_ns = 170937493383578,
  io_start_time_ns = 170938157020987,
  nr_phys_segments = 1,
  ioprio = 0,
  ref_count = 1,
  special = 0xffff888eaf5f3980,
  buffer = 0x0,
  tag = 11,
  errors = 0,
  __cmd = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
  cmd = 0xffff88ae63c08410 "*",
  cmd_len = 16,
  extra_len = 0,
  sense_len = 0,
  resid_len = 8192,
  sense = 0x0,
  deadline = 4465503989,
  timeout_list = {
    next = 0xffff88adb2c49ac8,
    prev = 0xffff88adb6738ac8
  },
  timeout = 4000,
  retries = 0,
  end_io = 0xffffffffa0002a00,
  end_io_data = 0xffff88adb2c49058,
  next_rq = 0x0,
  pad = 0x0
}
crash> struct request ffff884bfe6f9508
struct request {
  queuelist = {
    next = 0xffff88e9ee3aa690,
    prev = 0xffff88eaae2b0b28
  },
  csd = {
    list = {
      next = 0xffff88dfd08c1080,
      prev = 0xffff88adb2c49080
    },
    func = 0xffffffff81278370 <trigger_softirq>,
    info = 0xffff884bfe6f9508,
    flags = 1,
    priv = 0
  },
  q = 0xffff881fcd924ea8,
  cmd_flags = 16784965,
  cmd_type = REQ_TYPE_FS,
  atomic_flags = 1,
  cpu = 135,
  __data_len = 8192,
  __sector = 88864255,
  bio = 0xffff884d934b0780,
  biotail = 0xffff884d934b0780,
  hash = {
    next = 0x0,
    pprev = 0x0
  },
  {
    rb_node = {
      rb_parent_color = 18446612458705491336,
      rb_right = 0x0,
      rb_left = 0x0
    },
    completion_data = 0xffff884bfe6f9588
  },
  {
    elevator_private = {0x0, 0x0, 0x0},
    flush = {
      seq = 0,
      list = {
        next = 0x0,
        prev = 0x0
      }
    }
  },
  rq_disk = 0xffff883fd34ba800,
  start_time = 4465499330,
  start_time_ns = 170937497327657,
  io_start_time_ns = 170938154952269,
  nr_phys_segments = 1,
  ioprio = 0,
  ref_count = 1,
  special = 0xffff885fd220b680,
  buffer = 0x0,
  tag = 0,
  errors = 0,
  __cmd = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
  cmd = 0xffff88adb4da2710 "*",
  cmd_len = 16,
  extra_len = 0,
  sense_len = 0,
  resid_len = 8192,
  sense = 0x0,
  deadline = 4465503987,
  timeout_list = {
    next = 0xffff88eaae2b0c50,
    prev = 0xffff88e9ee3aa7b8
  },
  timeout = 4000,
  retries = 0,
  end_io = 0xffffffffa0002a00,
  end_io_data = 0xffff884bfe6f94f0,
  next_rq = 0x0,
  pad = 0x0
}
crash> struct request ffff88aeb12bd508
struct request {
  queuelist = {
    next = 0xffff88aeb12bd1f8,
    prev = 0xffff88e9ee3aa9a0
  },
  csd = {
    list = {
      next = 0xffff88adb2c49080,
      prev = 0xffff880e9eef7208
    },
    func = 0xffffffff81278370 <trigger_softirq>,
    info = 0xffff88aeb12bd508,
    flags = 1,
    priv = 0
  },
  q = 0xffff881fcdbb2238,
  cmd_flags = 16784965,
  cmd_type = REQ_TYPE_FS,
  atomic_flags = 1,
  cpu = 135,
  __data_len = 8192,
  __sector = 90617247,
  bio = 0xffff88aeb1109600,
  biotail = 0xffff88aeb1109600,
  hash = {
    next = 0x0,
    pprev = 0x0
  },
  {
    rb_node = {
      rb_parent_color = 18446612882610967944,
      rb_right = 0x0,
      rb_left = 0x0
    },
    completion_data = 0xffff88aeb12bd588
  },
  {
    elevator_private = {0x0, 0x0, 0x0},
    flush = {
      seq = 0,
      list = {
        next = 0x0,
        prev = 0x0
      }
    }
  },
  rq_disk = 0xffff883fd1d7dc00,
  start_time = 4465499327,
  start_time_ns = 170937494653426,
  io_start_time_ns = 170938156582742,
  nr_phys_segments = 1,
  ioprio = 0,
  ref_count = 1,
  special = 0xffff88e9ac078380,
  buffer = 0x0,
  tag = 10,
  errors = 0,
  __cmd = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
  cmd = 0xffff88ae76713c90 "*",
  cmd_len = 16,
  extra_len = 0,
  sense_len = 0,
  resid_len = 8192,
  sense = 0x0,
  deadline = 4465503989,
  timeout_list = {
    next = 0xffff88e9ee3aaac8,
    prev = 0xffff88aeb12bd320
  },
  timeout = 4000,
  retries = 0,
  end_io = 0xffffffffa0002a00,
  end_io_data = 0xffff88aeb12bd4f0,
  next_rq = 0x0,
  pad = 0x0
}
crash> list ffff88a070c03f18
ffff88a070c03f18
ffff88adba447490
ffff88aeb13bc1d0
ffff88cdc50f8b38
ffff88ce15a05518
ffff88aea9e7f790
ffff88ce94064208
ffff88ad93fae630
ffff884ead10d518
ffff88ae90eef208
ffff88bfd1b319b0
ffff88ae90eef080
ffff88adf15a9d10
ffff884d7a317080
ffff88e9bf547390
ffff88e9bf547080
ffff882db3077cc0
ffff88aeb105c518
ffff88dfce1f3518
ffff88ae63c08750
ffff88adac7ffe48
ffff88adb79b1208
ffff88ae90f66e48
ffff88ae2d0f0080
ffff88cd7cb87828
ffff88aeb1331b38
ffff88adb2c49080
ffff884bfe6f9518
ffff88dfd08c1080
ffff88ae923cf208
ffff88ae923cf518
ffff88eaa0fdd208
ffff88bfd2b6fb38
ffff886d12663e48
ffff88adb2c499b0
ffff88e925bd7828
ffff880dbb86fcc0
ffff88ae911bc6a0
ffff880cb52c3518
ffff88ae923cf9b0
ffff88ae250e4b38
ffff88ae250e4518
ffff88bfd2a5f828
ffff88cdd77ef9b0
ffff88bfd2a5fe48
ffff88bfd2a5f390
ffff88bfd2a5f9b0
ffff88bfd2a5fcc0
ffff88bfd2a5f080
ffff88aeb1bf7b38
ffff88aeb24e16a0
ffff88adb79b1518
ffff88ae8b457cc0
ffff88ae8b457e48
ffff88ae8b457080
ffff88ae84c5b080
ffff88ae19ac8b38
ffff88ae616ab828
ffff88ae9040f390
ffff88ae17db6208
ffff88bfd196d390
ffff88ade8e76cc0
ffff88ae7679c390
ffff88ada70f8b38
ffff88ae9040f518
ffff88ae7655b9b0
ffff88bfd196d828
ffff88ae8e3076a0
ffff88ae22087828
ffff88ae911aa208
ffff88ae7bce1b38
ffff88ae072c1cc0
ffff88adf14e9208
ffff88bfd19a9cc0
ffff88adcad669b0
ffff88ae51ed0208
ffff88ae923cf6a0
ffff88ae2d0f0e48
ffff88ae912966a0
ffff88adcee81b38
ffff88ae29d81080
ffff88adb67a3e48
ffff88adb67a3b38
ffff88ae585049b0
ffff88ae905049b0
ffff88adbccb9828
ffff88aea9c85080
ffff88aeb24e19b0
ffff88ae91296080
ffff88ae8fc3bb38
ffff88adcb594828
ffff88aea9e22828
ffff88addd1869b0
ffff88ae923cfb38
ffff88adcae27208
ffff88ae2536a6a0
ffff88ae2536a828
ffff886e744a1e48
ffff88bfd1be1080
ffff88aeae8d0390
ffff88adcb594e48
ffff88adf14e9b38
ffff88aeaea71518
ffff88ae7ed659b0
ffff88adcad66828
ffff88cda13109b0
ffff88adb67a3cc0
ffff88aeaeb17cc0
ffff88adf752c208
ffff88adf14e9cc0
ffff88ae90668208
ffff88aeaea71390
ffff88aea9c85e48
ffff88ae911bc9b0
ffff88ae2d0f0828
ffff88adb65a8b38
ffff88aeaebf79b0
ffff88ae2536ae48
ffff88ad9c712b38
ffff88ae90504518
ffff88ae1ec7d6a0
ffff88ae7ef11518
ffff88ae51ed0e48
ffff88ae90590518
ffff88adb7acc6a0
ffff88adf1656cc0
ffff88ae19a329b0
ffff88ae2536acc0
ffff88ae911406a0
ffff88ae7bcadb38
ffff88aeb19e99b0
ffff88adceeb6828
ffff88aeaea71080
ffff88ada70f8390
ffff88ae90f666a0
ffff88adb390b518
ffff88ae7ed9f208
ffff88ae265f4208
ffff88ae911bc208
ffff88adcb594208
ffff88ada70f8518
ffff88ae90c98080
ffff88aeb24e1cc0
ffff88ae0ece3208
ffff88addd186e48
ffff88adf14e9e48
ffff88adfd134e48
ffff88bfd114a9b0
ffff88add02ac080
ffff88adb7acce48
ffff88ade36a0390
ffff88adf1563e48
ffff88bfd0d929b0
ffff88ade8e766a0
ffff88aeb1e53e48
ffff88addbcd6cc0
ffff88adfd134cc0
ffff88bfd2e036a0
ffff88bfd1ab1828
ffff88add007bcc0
ffff88adfd134828
ffff88bfd14d5208
ffff88aeb12ad518
ffff88ae911aae48
ffff88ae90590b38
ffff88ae2536a9b0
ffff88ae911aa828
ffff88bfd2923390
ffff88ae91296cc0
ffff88ae90504208
ffff88bfd235eb38
ffff88bfd235e208
ffff88adfd135390
ffff88ae91296e48
ffff88adb6739518
ffff88ae911bcb38
ffff88addd186208
ffff88ae91296828
ffff88ae90668518
ffff88adb2d48b38
ffff88addad83828
ffff88ae2537e9b0
ffff88aeaeae2b38
ffff88bfd2923828
ffff88ade791e208
ffff88bfd0c806a0
ffff88bfd14d5518
ffff88bfd0c80cc0
ffff88ad97f6c518
ffff88adb67a3828
ffff88aeb2ada518
ffff88aeb2ada828
ffff88ae9138f9b0
ffff88ae9138f828
ffff88ae905046a0
ffff88ae91296390
ffff88aeb2ada9b0
ffff88adbccc3390
ffff88adf14e9390
ffff88bfd2d2b9b0
ffff88bfd2d2b390
ffff88cdd77efe48
ffff88bfd2b6f6a0
ffff88bfd2923518
ffff88bfd0c80b38
ffff88aeb12bd828
ffff88ae7ef119b0
ffff88aeb12bdb38
ffff88ad97f6c390
ffff88ad97f6ce48
ffff88ae7ed9f6a0
ffff88adb2c496a0
ffff88aeb12ad080
ffff88bfd0c80390
ffff88ae7ef11828
ffff88bfd2923208
ffff88bfd0c80518
ffff88aeb12adb38
ffff88aeb12ad828
ffff88adb67a39b0
ffff88aeb12ad9b0
ffff88adb7acc9b0
ffff88adbcfdccc0
ffff88adbcfdc828
ffff88adbcfdc9b0
ffff88adbcfdcb38
ffff88adbcfdc518
ffff88adbcfdc080
ffff88adb67a3390
ffff88ae25263e48
ffff88ae910edb38
ffff88aeb12e6208
ffff88ae616ab208
ffff88ae616ab6a0
ffff88ae910ed9b0
ffff88aeaeae29b0
ffff88ada720a9b0
ffff88ae25263cc0
ffff88adcd691b38
ffff88bfd2d2b208
ffff88bfd2d2b828
ffff88ae910ed6a0
ffff88adb2d48390
ffff88bfd2d2bb38
ffff88adb2d48cc0
ffff88adcd691828
ffff88adcd691cc0
ffff88aeb12e6b38
ffff88ada720a6a0
ffff88aeb12e6e48
ffff88add004f208
ffff88adcee819b0
ffff88ae7ed9f828
ffff88aea9c859b0
ffff88adb2c49390
ffff88ae58504208
ffff88bfd0a40208
ffff88ae58504390
ffff88ae0edf8390
ffff88ae2536a390
ffff884eac548080
ffff88ae252639b0
ffff88ae252636a0
ffff88ae910ed080
ffff88aeb12bd9b0
ffff88addbcd66a0
ffff88ae221cb828
ffff88bfd0a40390
ffff88aea9c85b38
ffff88ae2536a208
ffff88ae2536a080
ffff88bfd0a406a0
ffff88ae9139fe48
ffff88ae072c1e48
ffff88aeaeb686a0
ffff88aeb1bf7208
ffff88aeb1bf7518
ffff88adf38096a0
ffff88bfd0a40828
ffff88bfd0a40e48
ffff88ae221cb080
ffff88ae7bcad518
ffff88aeb1bf79b0
ffff88aeaeb68cc0
ffff88aeaeb689b0
ffff88adb390b080
ffff88adb390bb38
ffff88aeb19e9e48
ffff886d054779b0
ffff88ae072c1518
ffff88ae21d24b38
ffff88aeb1bf7390
ffff88aea9c85208
ffff88aeb1bf7828
ffff88aea9c85518
ffff88ae8f98a518
ffff88adf3809b38
ffff88ae21d24e48
ffff88ae51ed06a0
ffff88ae21d24390
ffff88ae8f98acc0
ffff88addad836a0
ffff88ae907ee080
ffff88bfd2e03390
ffff88ae8f98a828
ffff88bfd2a5f518
ffff88adfd0d6518
ffff88ae7bcad080
ffff88addad83cc0
ffff88adfd134390
ffff88ae224869b0
ffff886cf9a4e6a0
ffff88ae8f98ab38
ffff88ae7ed65518
ffff88adf1563390
ffff88bfd114acc0
ffff88ada70f89b0
ffff88aeb19e9828
ffff882dd7da1390
ffff88ae21d24518
ffff88ae8f98a390
ffff88ae22486208
ffff88ae7ed65080
ffff882ea948b6a0
ffff883fd0ae3518
ffff88ae9131a208
ffff882e7548eb38
ffff882db73d6b38
ffff882db924e6a0
ffff88ae22167b38
ffff88adb7acc518
ffff883fd0b3ab38
ffff882ea8ca7390
ffff882d9f8636a0
ffff882d75cb8080
ffff882dd8a6f9b0
ffff882dd826f208
ffff882d5df6e9b0
ffff882e75511b38
ffff88ae8fc3bcc0
ffff88ae51ed0828
ffff88ae7bcad9b0
ffff88addadc9b38
ffff88ae209bb080
ffff88bfd0c80208
ffff88aeb1bd0b38
ffff88aeb12adcc0
ffff880c6d79e6a0
ffff884ead8aa208
ffff88adb79b1b38
ffff88adb65ae9b0
ffff88ae25263b38
ffff880db1feb390
ffff882dc623b390
ffff88adb390b6a0
ffff88adb7accb38
ffff88ae616ab9b0
ffff88ada720ab38
ffff88adbcfdc208
ffff88bfd0d38390
ffff88ae7ed9f390
ffff88adf1563518
ffff88ae2537e080
ffff88ae2536ab38
ffff88ae7ed9fcc0
ffff88ae90f66b38
ffff88ae22486080
ffff88ae7ef11208
ffff88adb7acc080
ffff886ce6b229b0
ffff884eaca9e080
ffff88ae66172b38
ffff88adb7acc208
ffff88ae905909b0
ffff88eab5714518
ffff88eab66ff6a0
ffff88ad9c675390
ffff885fd127b390
ffff88ae910f7e48
ffff88aeb12ad390
ffff88bfd0a40518
ffff880e4d5da9b0
ffff880d5b51bb38
ffff88ae923cfcc0
ffff88bfd2b6f9b0
ffff88ae2537e6a0
ffff88aeaea71e48
ffff88ae66172cc0
ffff88ae8ff2fcc0
ffff88cdd77ef828
ffff88ae8ff2f080
ffff88aeaeae2208
ffff88ae265f4828
ffff884d7a193e48
ffff88fbd6a23b38
ffff88aeb1bd0518
ffff88e9504b6518
ffff88ae911aa6a0
ffff88bfd0a409b0
ffff88aeaea71828
ffff884e8e773518
ffff88ae911aab38
ffff88aeaea71cc0
ffff88aeaea71b38
ffff88cd9a917828
ffff88aeb2bcd828
ffff88ae905906a0
ffff88ea1e277b38
ffff88ae9131a9b0
ffff88ae21d24828
ffff88aeb13c89b0
ffff88cdeeaa2cc0
ffff88ae0edf8b38
ffff88cdd7776b38
ffff88aeb2adae48
ffff88ae148ba828
ffff886d35178cc0
ffff88bfd0d38cc0
ffff88bfd0d38208
ffff885fd0b82208
ffff88bfd1be1b38
ffff88ae907ee828
ffff88adfd0d6208
ffff88adb2c49b38
ffff886ea6dc8828
ffff88ae250e1208
ffff88e99bccb518
ffff88ad97db7390
ffff88adb2d486a0
ffff88ad97db7208
ffff886e4eeea208
ffff88fbd3589080
ffff882db73d7390
ffff884b6d2e8080
ffff88eab56d8b38
ffff88ae907ee6a0
ffff88adf3bc7080
ffff88adf3bc7390
ffff88ceaa115208
ffff88ae22167518
ffff886ea759c390
ffff88ae9111ce48
ffff88addadc9390
ffff88e9de77a9b0
ffff88adfd0d6390
ffff88e9f4de2828
ffff88add02ac9b0
ffff88cdc7005080
ffff88ae9111c208
ffff88e9b1b3d828
ffff886d2ccf2b38
ffff88e975de4518
ffff88add403a390
ffff88ae9111c6a0
ffff88adfd0d6b38
ffff88e9ebe57390
ffff88ae90eef828
ffff88aeb2adacc0
ffff88adf1563080
ffff88ae8e323390
ffff88adcae27b38
ffff88ce0c985828
ffff88bfd2ae5b38
ffff886eaa826b38
ffff88addad83390
ffff886eaac6b828
ffff88aeb24e1518
ffff88ae22486cc0
ffff88aeaeae2080
ffff88ae7ed9f518
ffff88a070c117e0
ffff88aea9c85cc0
ffff88ae90f66080
ffff88adb390b9b0
ffff88dfd1b1f390
ffff88bfd0d92cc0
ffff880d5b42ce48
ffff88adb390b208
ffff88ae221cbb38
ffff88adc15ce828
ffff88fbd5324208
ffff88eab511ae48
ffff88e9504b6208
ffff884eb06e59b0
ffff88cd8f5f5518
ffff884e1ae86390
ffff88ae17e92080
ffff88fbd6a23e48
ffff880dbbb52080
ffff880c6d79eb38
ffff880dbb94e6a0
ffff88bfd2923080
ffff88cdae95e208
ffff88bfd0d38080
ffff884ca14e7390
ffff88adf3bc7518
ffff88ae910ed518
ffff88adf14e9080
ffff88ad97f42cc0
ffff880c82b4eb38
ffff88adceeb6390
ffff880d5d7df208
ffff880d89d5c9b0
ffff88ae22087b38
ffff88ae220879b0
ffff886e7af02b38
ffff88ae92350390
ffff88bfd0c80080
ffff88add02ac390
ffff88ae90c65518
ffff88ade791e9b0
ffff88add01ee6a0
ffff88aea9c85828
ffff884eaa200e48
ffff88bfd0905390
ffff88add02ace48
ffff88ade791ee48
ffff884eaed1a390
ffff88ae7655b828
ffff880cb9542390
ffff88ce47371cc0
ffff88adfd172828
ffff88ae90c65390
ffff88ad93faf9b0
ffff88ae1d3d6e48
ffff88cd8f5f56a0
ffff88bfd0d389b0
ffff88adb2c49208
ffff88ae8e307390
ffff88ade1586208
ffff88ae265f4b38
ffff88ae9139f208
ffff88ae8e307080
ffff88ae8e307208
ffff88addadc9208
ffff88addbcd6828
ffff88ae16179518
ffff88ad9c712828
ffff88ae911bccc0
ffff88ae16179cc0
ffff88add02ac208
ffff88bfd0d92208
ffff88bfd0d92390
ffff88bfd2d92080
ffff88aeaeb68e48
ffff88ae161799b0
ffff88bfd1b31cc0
ffff88aea9c856a0
ffff88ae7bcadcc0
ffff88ae19ac86a0
ffff88ae2abbfb38
ffff88adb6738b38
ffff88adfd172080
ffff880db1feb6a0
ffff88e9b1bba390
ffff88ae7ef11b38
ffff88ae04b65208
ffff88ae9139fb38
ffff88adcb5946a0
ffff88ae04b65518
ffff88bfd2ae5080
ffff88ae04b65080
ffff88ae66185390
ffff88ae92350208
ffff880db1febb38
ffff88ae9139f828
ffff88ae04b65e48
ffff88eaa1221080
ffff88ade8ec7e48
ffff88dfd09846a0
ffff88ae04b65390
ffff88ae9040fe48
ffff88bfd2ae5208
ffff88adbccc3828
ffff884ea9c949b0
ffff88bfd2067cc0
ffff88aeb1360828
ffff88bfd2067518
ffff88bfd1b31080
ffff88ae7ef11cc0
ffff88ad9c712cc0
ffff88e950606cc0
ffff88ae1d3d6cc0
ffff886e3558f390
ffff88e9b1b3d9b0
ffff88ad9c712390
ffff88ae91140b38
ffff886d0db7a390
ffff88aeb13609b0
ffff88fbd356e6a0
ffff88e9cfc4d9b0
ffff88ae90c65080
ffff88aeb1360390
ffff88adb2c49828
ffff88ae90d95e48
ffff886e753c6b38
ffff88aeb1360518
ffff88bfd0d92b38
ffff88ad93fafcc0
ffff88adb2c49e48
ffff88adbccc3080
ffff88adb2c49cc0
ffff88ae7655b6a0
ffff88ae90668cc0
ffff88aeb105c828
ffff881fcbb20b38
ffff88ae0ef5eb38
ffff88aeb105c6a0
ffff88aeb105cb38
ffff88bfd2067e48
ffff88bfd2067080
ffff88bfd2067828
ffff88ae50d3e080
ffff88bfd2067390
ffff88ae9138fcc0
ffff881fcac1de48
ffff885fd08dd208
ffff88bfd0a9a080
ffff88ae04b65b38
ffff884eaf34e6a0
ffff88ae08f06390
ffff88ae04b65828
ffff88aeaebf76a0
ffff88ae19a32390
ffff88ae90c65208
ffff886ea7a13828
ffff88cd7c9046a0
ffff88ae91140080
ffff884d4c98d208
ffff88ae08f06208
ffff88adb2d48208
ffff88adfd1349b0
ffff88bfd1bc4080
ffff88cd7c904518
ffff88aeb13c8e48
ffff88ceaa3789b0
ffff88e9a9af6518
ffff88bfd235ecc0
ffff88cdc72e9390
ffff88ae91381b38
ffff880e9dcfeb38
ffff88ceaa488518
ffff880db1feb208
ffff88adb65a8208
ffff88adb65a8390
ffff88e9f30f9390
ffff88bfd235e6a0
ffff88bfd235e518
ffff88ae1ec7de48
ffff88bfd1ff0080
ffff88ae04b65cc0
ffff88adac785828
ffff884eaa895390
ffff884eaa7d76a0
ffff88cdf26bb9b0
ffff88bfd29236a0
ffff88ada7246080
ffff88ae616abcc0
ffff88adac785390
ffff88eaa1221518
ffff884eafd48208
ffff88ae17db66a0
ffff887fd1a2e390
ffff88e9af9bc080
ffff88adfd134080
ffff884e3eae3828
ffff88ae911bce48
ffff88ade8e769b0
ffff88adb2c49518
ffff88aeb13c8b38
ffff88e9504af9b0
ffff884eac132208
ffff886cf997d6a0
ffff88bfd1ff0828
ffff88bfd0a40080
ffff88ae250e1828
ffff88bfd2e03208
ffff88aeb1331518
ffff88ae90d95cc0
ffff88e95d1fce48
ffff885fcf29a208
ffff88ad93fafb38
ffff88ae250e1518
ffff88e929d91e48
ffff88adf14d19b0
ffff88aeb12bde48
ffff88aeb12bd080
ffff884e3e8a5828
ffff88dfd27a1828
ffff88aeb19e96a0
ffff88ade8e76828
ffff884eabc9cb38
ffff88bfd2923e48
ffff88ae661726a0
ffff88aeb1331208
ffff88adac7ff6a0
ffff884ca3563390
ffff88adf14d1080
ffff88aeb12bd390
ffff884ddb078828
ffff88adb79b19b0
ffff884bad02ab38
ffff88ada72476a0
ffff88ae66172518
ffff88adb79b1cc0
ffff88ae24062e48
ffff88ae910ef828
ffff88ade8ec7518
ffff88e929f02518
ffff88bfd114ab38
ffff88aeb13c8518
ffff88adcb596080
ffff88bfd114a6a0
ffff88e9635bc828
ffff88bfd1bc4390
ffff88aeb1331cc0
ffff88aeb1331828
ffff884d3d465390
ffff88eab6f846a0
ffff88adf752ce48
ffff884dc4ec7828
ffff88ade8ec7080
ffff88bfd2923b38
ffff88aeb12bd208
ffff88ae84c5b9b0
ffff88adf752ccc0
ffff884eac0f0cc0
ffff88ae265f4cc0
ffff88add007b9b0
ffff88ae907ee208
ffff88ae663289b0
ffff88ae907eecc0
ffff88e9e09106a0
ffff884eaafb06a0
ffff88ad97f6cb38
ffff88ae66172e48
ffff88ad97f6ccc0
ffff88e99dc8e828
ffff88adb67386a0
ffff88ae911aa518
ffff88add01ee208
ffff88ae21d24080
ffff88ae8ff2f390
ffff884eaa444518
ffff88bfd1bc46a0
ffff884eaad80080
ffff88cdd7543cc0
ffff88ae7bcad390
ffff88ade8ec7b38
ffff88adf1563cc0
ffff88aeaebf7208
ffff88ad9c712080
ffff88bfd2923cc0
ffff88bfd29239b0
ffff88ada720ae48
ffff88aeaebf7080
ffff88ae66172208
ffff88ae221679b0
ffff88bfd20679b0
ffff880db033db38
ffff88ade791e518
ffff880e8dad4390
ffff880e9dc8d9b0
ffff886ce6b22cc0
ffff886eab1ef080
ffff886d0217de48
ffff88ae8f98ae48
ffff88bfd0d92e48
ffff88ae250e4208
ffff88ae250e4cc0
ffff88bfd2067208
ffff88add406a6a0
ffff885fd08ddcc0
ffff884c12dc3e48
ffff884c6dc6f9b0
ffff88ae7ef11390
ffff880cc28f99b0
ffff88ade36a09b0
ffff880e9eef79b0
ffff882d75dec6a0
ffff88addadc9e48
ffff88aeb1331390
ffff88cda8c88828
ffff88adf3809e48
ffff88aea9c6a080
ffff88bfd2d2be48
ffff884eaf304b38
ffff88aeaeae26a0
ffff88aeb19e9080
ffff88adf3809518
ffff88ae906ca6a0
ffff88aeb19e9518
ffff88ae911bc828
ffff88adf3809080
ffff88adc3abd208
ffff88aeb19e9cc0
ffff88ae17e926a0
ffff88adb67389b0
ffff88ae31b04e48
ffff88cda8c88518
ffff88ae923cf390
ffff886d1630c390
ffff880e9eef7208
ffff88aeb12bd518
ffff88adb2c49080

list: duplicate list entry: ffff88adb2c49080
crash> list -r ffff88a070c03f18
ffff88a070c03f18
ffff88adba447490
ffff88aeb13bc1d0
ffff88cdc50f8b38
ffff88ce15a05518
ffff88aea9e7f790
ffff88ce94064208
ffff88ad93fae630
ffff884ead10d518
ffff88ae90eef208
ffff88bfd1b319b0
ffff88ae90eef080
ffff88adf15a9d10
ffff884d7a317080
ffff88e9bf547390
ffff88e9bf547080
ffff882db3077cc0
ffff88aeb105c518
ffff88dfce1f3518
ffff88ae63c08750
ffff88adac7ffe48
ffff88adb79b1208
ffff88ae90f66e48
ffff88ae2d0f0080
ffff88cd7cb87828
ffff88aeb1331b38
ffff88adb2c49080
ffff884bfe6f9518
ffff88dfd08c1080
ffff88ae923cf208
ffff88ae923cf518
ffff88eaa0fdd208
ffff88bfd2b6fb38
ffff886d12663e48
ffff88adb2c499b0
ffff88e925bd7828
ffff880dbb86fcc0
ffff88ae911bc6a0
ffff880cb52c3518
ffff88ae923cf9b0
ffff88ae250e4b38
ffff88ae250e4518
ffff88bfd2a5f828
ffff88cdd77ef9b0
ffff88bfd2a5fe48
ffff88bfd2a5f390
ffff88bfd2a5f9b0
ffff88bfd2a5fcc0
ffff88bfd2a5f080
ffff88aeb1bf7b38
ffff88aeb24e16a0
ffff88adb79b1518
ffff88ae8b457cc0
ffff88ae8b457e48
ffff88ae8b457080
ffff88ae84c5b080
ffff88ae19ac8b38
ffff88ae616ab828
ffff88ae9040f390
ffff88ae17db6208
ffff88bfd196d390
ffff88ade8e76cc0
ffff88ae7679c390
ffff88ada70f8b38
ffff88ae9040f518
ffff88ae7655b9b0
ffff88bfd196d828
ffff88ae8e3076a0
ffff88ae22087828
ffff88ae911aa208
ffff88ae7bce1b38
ffff88ae072c1cc0
ffff88adf14e9208
ffff88bfd19a9cc0
ffff88adcad669b0
ffff88ae51ed0208
ffff88ae923cf6a0
ffff88ae2d0f0e48
ffff88ae912966a0
ffff88adcee81b38
ffff88ae29d81080
ffff88adb67a3e48
ffff88adb67a3b38
ffff88ae585049b0
ffff88ae905049b0
ffff88adbccb9828
ffff88aea9c85080
ffff88aeb24e19b0
ffff88ae91296080
ffff88ae8fc3bb38
ffff88adcb594828
ffff88aea9e22828
ffff88addd1869b0
ffff88ae923cfb38
ffff88adcae27208
ffff88ae2536a6a0
ffff88ae2536a828
ffff886e744a1e48
ffff88bfd1be1080
ffff88aeae8d0390
ffff88adcb594e48
ffff88adf14e9b38
ffff88aeaea71518
ffff88ae7ed659b0
ffff88adcad66828
ffff88cda13109b0
ffff88adb67a3cc0
ffff88aeaeb17cc0
ffff88adf752c208
ffff88adf14e9cc0
ffff88ae90668208
ffff88aeaea71390
ffff88aea9c85e48
ffff88ae911bc9b0
ffff88ae2d0f0828
ffff88adb65a8b38
ffff88aeaebf79b0
ffff88ae2536ae48
ffff88ad9c712b38
ffff88ae90504518
ffff88ae1ec7d6a0
ffff88ae7ef11518
ffff88ae51ed0e48
ffff88ae90590518
ffff88adb7acc6a0
ffff88adf1656cc0
ffff88ae19a329b0
ffff88ae2536acc0
ffff88ae911406a0
ffff88ae7bcadb38
ffff88aeb19e99b0
ffff88adceeb6828
ffff88aeaea71080
ffff88ada70f8390
ffff88ae90f666a0
ffff88adb390b518
ffff88ae7ed9f208
ffff88ae265f4208
ffff88ae911bc208
ffff88adcb594208
ffff88ada70f8518
ffff88ae90c98080
ffff88aeb24e1cc0
ffff88ae0ece3208
ffff88addd186e48
ffff88adf14e9e48
ffff88adfd134e48
ffff88bfd114a9b0
ffff88add02ac080
ffff88adb7acce48
ffff88ade36a0390
ffff88adf1563e48
ffff88bfd0d929b0
ffff88ade8e766a0
ffff88aeb1e53e48
ffff88addbcd6cc0
ffff88adfd134cc0
ffff88bfd2e036a0
ffff88bfd1ab1828
ffff88add007bcc0
ffff88adfd134828
ffff88bfd14d5208
ffff88aeb12ad518
ffff88ae911aae48
ffff88ae90590b38
ffff88ae2536a9b0
ffff88ae911aa828
ffff88bfd2923390
ffff88ae91296cc0
ffff88ae90504208
ffff88bfd235eb38
ffff88bfd235e208
ffff88adfd135390
ffff88ae91296e48
ffff88adb6739518
ffff88ae911bcb38
ffff88addd186208
ffff88ae91296828
ffff88ae90668518
ffff88adb2d48b38
ffff88addad83828
ffff88ae2537e9b0
ffff88aeaeae2b38
ffff88bfd2923828
ffff88ade791e208
ffff88bfd0c806a0
ffff88bfd14d5518
ffff88bfd0c80cc0
ffff88ad97f6c518
ffff88adb67a3828
ffff88aeb2ada518
ffff88aeb2ada828
ffff88ae9138f9b0
ffff88ae9138f828
ffff88ae905046a0
ffff88ae91296390
ffff88aeb2ada9b0
ffff88adbccc3390
ffff88adf14e9390
ffff88bfd2d2b9b0
ffff88bfd2d2b390
ffff88cdd77efe48
ffff88bfd2b6f6a0
ffff88bfd2923518
ffff88bfd0c80b38
ffff88aeb12bd828
ffff88ae7ef119b0
ffff88aeb12bdb38
ffff88ad97f6c390
ffff88ad97f6ce48
ffff88ae7ed9f6a0
ffff88adb2c496a0
ffff88aeb12ad080
ffff88bfd0c80390
ffff88ae7ef11828
ffff88bfd2923208
ffff88bfd0c80518
ffff88aeb12adb38
ffff88aeb12ad828
ffff88adb67a39b0
ffff88aeb12ad9b0
ffff88adb7acc9b0
ffff88adbcfdccc0
ffff88adbcfdc828
ffff88adbcfdc9b0
ffff88adbcfdcb38
ffff88adbcfdc518
ffff88adbcfdc080
ffff88adb67a3390
ffff88ae25263e48
ffff88ae910edb38
ffff88aeb12e6208
ffff88ae616ab208
ffff88ae616ab6a0
ffff88ae910ed9b0
ffff88aeaeae29b0
ffff88ada720a9b0
ffff88ae25263cc0
ffff88adcd691b38
ffff88bfd2d2b208
ffff88bfd2d2b828
ffff88ae910ed6a0
ffff88adb2d48390
ffff88bfd2d2bb38
ffff88adb2d48cc0
ffff88adcd691828
ffff88adcd691cc0
ffff88aeb12e6b38
ffff88ada720a6a0
ffff88aeb12e6e48
ffff88add004f208
ffff88adcee819b0
ffff88ae7ed9f828
ffff88aea9c859b0
ffff88adb2c49390
ffff88ae58504208
ffff88bfd0a40208
ffff88ae58504390
ffff88ae0edf8390
ffff88ae2536a390
ffff884eac548080
ffff88ae252639b0
ffff88ae252636a0
ffff88ae910ed080
ffff88aeb12bd9b0
ffff88addbcd66a0
ffff88ae221cb828
ffff88bfd0a40390
ffff88aea9c85b38
ffff88ae2536a208
ffff88ae2536a080
ffff88bfd0a406a0
ffff88ae9139fe48
ffff88ae072c1e48
ffff88aeaeb686a0
ffff88aeb1bf7208
ffff88aeb1bf7518
ffff88adf38096a0
ffff88bfd0a40828
ffff88bfd0a40e48
ffff88ae221cb080
ffff88ae7bcad518
ffff88aeb1bf79b0
ffff88aeaeb68cc0
ffff88aeaeb689b0
ffff88adb390b080
ffff88adb390bb38
ffff88aeb19e9e48
ffff886d054779b0
ffff88ae072c1518
ffff88ae21d24b38
ffff88aeb1bf7390
ffff88aea9c85208
ffff88aeb1bf7828
ffff88aea9c85518
ffff88ae8f98a518
ffff88adf3809b38
ffff88ae21d24e48
ffff88ae51ed06a0
ffff88ae21d24390
ffff88ae8f98acc0
ffff88addad836a0
ffff88ae907ee080
ffff88bfd2e03390
ffff88ae8f98a828
ffff88bfd2a5f518
ffff88adfd0d6518
ffff88ae7bcad080
ffff88addad83cc0
ffff88adfd134390
ffff88ae224869b0
ffff886cf9a4e6a0
ffff88ae8f98ab38
ffff88ae7ed65518
ffff88adf1563390
ffff88bfd114acc0
ffff88ada70f89b0
ffff88aeb19e9828
ffff882dd7da1390
ffff88ae21d24518
ffff88ae8f98a390
ffff88ae22486208
ffff88ae7ed65080
ffff882ea948b6a0
ffff883fd0ae3518
ffff88ae9131a208
ffff882e7548eb38
ffff882db73d6b38
ffff882db924e6a0
ffff88ae22167b38
ffff88adb7acc518
ffff883fd0b3ab38
ffff882ea8ca7390
ffff882d9f8636a0
ffff882d75cb8080
ffff882dd8a6f9b0
ffff882dd826f208
ffff882d5df6e9b0
ffff882e75511b38
ffff88ae8fc3bcc0
ffff88ae51ed0828
ffff88ae7bcad9b0
ffff88addadc9b38
ffff88ae209bb080
ffff88bfd0c80208
ffff88aeb1bd0b38
ffff88aeb12adcc0
ffff880c6d79e6a0
ffff884ead8aa208
ffff88adb79b1b38
ffff88adb65ae9b0
ffff88ae25263b38
ffff880db1feb390
ffff882dc623b390
ffff88adb390b6a0
ffff88adb7accb38
ffff88ae616ab9b0
ffff88ada720ab38
ffff88adbcfdc208
ffff88bfd0d38390
ffff88ae7ed9f390
ffff88adf1563518
ffff88ae2537e080
ffff88ae2536ab38
ffff88ae7ed9fcc0
ffff88ae90f66b38
ffff88ae22486080
ffff88ae7ef11208
ffff88adb7acc080
ffff886ce6b229b0
ffff884eaca9e080
ffff88ae66172b38
ffff88adb7acc208
ffff88ae905909b0
ffff88eab5714518
ffff88eab66ff6a0
ffff88ad9c675390
ffff885fd127b390
ffff88ae910f7e48
ffff88aeb12ad390
ffff88bfd0a40518
ffff880e4d5da9b0
ffff880d5b51bb38
ffff88ae923cfcc0
ffff88bfd2b6f9b0
ffff88ae2537e6a0
ffff88aeaea71e48
ffff88ae66172cc0
ffff88ae8ff2fcc0
ffff88cdd77ef828
ffff88ae8ff2f080
ffff88aeaeae2208
ffff88ae265f4828
ffff884d7a193e48
ffff88fbd6a23b38
ffff88aeb1bd0518
ffff88e9504b6518
ffff88ae911aa6a0
ffff88bfd0a409b0
ffff88aeaea71828
ffff884e8e773518
ffff88ae911aab38
ffff88aeaea71cc0
ffff88aeaea71b38
ffff88cd9a917828
ffff88aeb2bcd828
ffff88ae905906a0
ffff88ea1e277b38
ffff88ae9131a9b0
ffff88ae21d24828
ffff88aeb13c89b0
ffff88cdeeaa2cc0
ffff88ae0edf8b38
ffff88cdd7776b38
ffff88aeb2adae48
ffff88ae148ba828
ffff886d35178cc0
ffff88bfd0d38cc0
ffff88bfd0d38208
ffff885fd0b82208
ffff88bfd1be1b38
ffff88ae907ee828
ffff88adfd0d6208
ffff88adb2c49b38
ffff886ea6dc8828
ffff88ae250e1208
ffff88e99bccb518
ffff88ad97db7390
ffff88adb2d486a0
ffff88ad97db7208
ffff886e4eeea208
ffff88fbd3589080
ffff882db73d7390
ffff884b6d2e8080
ffff88eab56d8b38
ffff88ae907ee6a0
ffff88adf3bc7080
ffff88adf3bc7390
ffff88ceaa115208
ffff88ae22167518
ffff886ea759c390
ffff88ae9111ce48
ffff88addadc9390
ffff88e9de77a9b0
ffff88adfd0d6390
ffff88e9f4de2828
ffff88add02ac9b0
ffff88cdc7005080
ffff88ae9111c208
ffff88e9b1b3d828
ffff886d2ccf2b38
ffff88e975de4518
ffff88add403a390
ffff88ae9111c6a0
ffff88adfd0d6b38
ffff88e9ebe57390
ffff88ae90eef828
ffff88aeb2adacc0
ffff88adf1563080
ffff88ae8e323390
ffff88adcae27b38
ffff88ce0c985828
ffff88bfd2ae5b38
ffff886eaa826b38
ffff88addad83390
ffff886eaac6b828
ffff88aeb24e1518
ffff88ae22486cc0
ffff88aeaeae2080
ffff88ae7ed9f518
ffff88a070c117e0
ffff88aea9c85cc0
ffff88ae90f66080
ffff88adb390b9b0
ffff88dfd1b1f390
ffff88bfd0d92cc0
ffff880d5b42ce48
ffff88adb390b208
ffff88ae221cbb38
ffff88adc15ce828
ffff88fbd5324208
ffff88eab511ae48
ffff88e9504b6208
ffff884eb06e59b0
ffff88cd8f5f5518
ffff884e1ae86390
ffff88ae17e92080
ffff88fbd6a23e48
ffff880dbbb52080
ffff880c6d79eb38
ffff880dbb94e6a0
ffff88bfd2923080
ffff88cdae95e208
ffff88bfd0d38080
ffff884ca14e7390
ffff88adf3bc7518
ffff88ae910ed518
ffff88adf14e9080
ffff88ad97f42cc0
ffff880c82b4eb38
ffff88adceeb6390
ffff880d5d7df208
ffff880d89d5c9b0
ffff88ae22087b38
ffff88ae220879b0
ffff886e7af02b38
ffff88ae92350390
ffff88bfd0c80080
ffff88add02ac390
ffff88ae90c65518
ffff88ade791e9b0
ffff88add01ee6a0
ffff88aea9c85828
ffff884eaa200e48
ffff88bfd0905390
ffff88add02ace48
ffff88ade791ee48
ffff884eaed1a390
ffff88ae7655b828
ffff880cb9542390
ffff88ce47371cc0
ffff88adfd172828
ffff88ae90c65390
ffff88ad93faf9b0
ffff88ae1d3d6e48
ffff88cd8f5f56a0
ffff88bfd0d389b0
ffff88adb2c49208
ffff88ae8e307390
ffff88ade1586208
ffff88ae265f4b38
ffff88ae9139f208
ffff88ae8e307080
ffff88ae8e307208
ffff88addadc9208
ffff88addbcd6828
ffff88ae16179518
ffff88ad9c712828
ffff88ae911bccc0
ffff88ae16179cc0
ffff88add02ac208
ffff88bfd0d92208
ffff88bfd0d92390
ffff88bfd2d92080
ffff88aeaeb68e48
ffff88ae161799b0
ffff88bfd1b31cc0
ffff88aea9c856a0
ffff88ae7bcadcc0
ffff88ae19ac86a0
ffff88ae2abbfb38
ffff88adb6738b38
ffff88adfd172080
ffff880db1feb6a0
ffff88e9b1bba390
ffff88ae7ef11b38
ffff88ae04b65208
ffff88ae9139fb38
ffff88adcb5946a0
ffff88ae04b65518
ffff88bfd2ae5080
ffff88ae04b65080
ffff88ae66185390
ffff88ae92350208
ffff880db1febb38
ffff88ae9139f828
ffff88ae04b65e48
ffff88eaa1221080
ffff88ade8ec7e48
ffff88dfd09846a0
ffff88ae04b65390
ffff88ae9040fe48
ffff88bfd2ae5208
ffff88adbccc3828
ffff884ea9c949b0
ffff88bfd2067cc0
ffff88aeb1360828
ffff88bfd2067518
ffff88bfd1b31080
ffff88ae7ef11cc0
ffff88ad9c712cc0
ffff88e950606cc0
ffff88ae1d3d6cc0
ffff886e3558f390
ffff88e9b1b3d9b0
ffff88ad9c712390
ffff88ae91140b38
ffff886d0db7a390
ffff88aeb13609b0
ffff88fbd356e6a0
ffff88e9cfc4d9b0
ffff88ae90c65080
ffff88aeb1360390
ffff88adb2c49828
ffff88ae90d95e48
ffff886e753c6b38
ffff88aeb1360518
ffff88bfd0d92b38
ffff88ad93fafcc0
ffff88adb2c49e48
ffff88adbccc3080
ffff88adb2c49cc0
ffff88ae7655b6a0
ffff88ae90668cc0
ffff88aeb105c828
ffff881fcbb20b38
ffff88ae0ef5eb38
ffff88aeb105c6a0
ffff88aeb105cb38
ffff88bfd2067e48
ffff88bfd2067080
ffff88bfd2067828
ffff88ae50d3e080
ffff88bfd2067390
ffff88ae9138fcc0
ffff881fcac1de48
ffff885fd08dd208
ffff88bfd0a9a080
ffff88ae04b65b38
ffff884eaf34e6a0
ffff88ae08f06390
ffff88ae04b65828
ffff88aeaebf76a0
ffff88ae19a32390
ffff88ae90c65208
ffff886ea7a13828
ffff88cd7c9046a0
ffff88ae91140080
ffff884d4c98d208
ffff88ae08f06208
ffff88adb2d48208
ffff88adfd1349b0
ffff88bfd1bc4080
ffff88cd7c904518
ffff88aeb13c8e48
ffff88ceaa3789b0
ffff88e9a9af6518
ffff88bfd235ecc0
ffff88cdc72e9390
ffff88ae91381b38
ffff880e9dcfeb38
ffff88ceaa488518
ffff880db1feb208
ffff88adb65a8208
ffff88adb65a8390
ffff88e9f30f9390
ffff88bfd235e6a0
ffff88bfd235e518
ffff88ae1ec7de48
ffff88bfd1ff0080
ffff88ae04b65cc0
ffff88adac785828
ffff884eaa895390
ffff884eaa7d76a0
ffff88cdf26bb9b0
ffff88bfd29236a0
ffff88ada7246080
ffff88ae616abcc0
ffff88adac785390
ffff88eaa1221518
ffff884eafd48208
ffff88ae17db66a0
ffff887fd1a2e390
ffff88e9af9bc080
ffff88adfd134080
ffff884e3eae3828
ffff88ae911bce48
ffff88ade8e769b0
ffff88adb2c49518
ffff88aeb13c8b38
ffff88e9504af9b0
ffff884eac132208
ffff886cf997d6a0
ffff88bfd1ff0828
ffff88bfd0a40080
ffff88ae250e1828
ffff88bfd2e03208
ffff88aeb1331518
ffff88ae90d95cc0
ffff88e95d1fce48
ffff885fcf29a208
ffff88ad93fafb38
ffff88ae250e1518
ffff88e929d91e48
ffff88adf14d19b0
ffff88aeb12bde48
ffff88aeb12bd080
ffff884e3e8a5828
ffff88dfd27a1828
ffff88aeb19e96a0
ffff88ade8e76828
ffff884eabc9cb38
ffff88bfd2923e48
ffff88ae661726a0
ffff88aeb1331208
ffff88adac7ff6a0
ffff884ca3563390
ffff88adf14d1080
ffff88aeb12bd390
ffff884ddb078828
ffff88adb79b19b0
ffff884bad02ab38
ffff88ada72476a0
ffff88ae66172518
ffff88adb79b1cc0
ffff88ae24062e48
ffff88ae910ef828
ffff88ade8ec7518
ffff88e929f02518
ffff88bfd114ab38
ffff88aeb13c8518
ffff88adcb596080
ffff88bfd114a6a0
ffff88e9635bc828
ffff88bfd1bc4390
ffff88aeb1331cc0
ffff88aeb1331828
ffff884d3d465390
ffff88eab6f846a0
ffff88adf752ce48
ffff884dc4ec7828
ffff88ade8ec7080
ffff88bfd2923b38
ffff88aeb12bd208
ffff88ae84c5b9b0
ffff88adf752ccc0
ffff884eac0f0cc0
ffff88ae265f4cc0
ffff88add007b9b0
ffff88ae907ee208
ffff88ae663289b0
ffff88ae907eecc0
ffff88e9e09106a0
ffff884eaafb06a0
ffff88ad97f6cb38
ffff88ae66172e48
ffff88ad97f6ccc0
ffff88e99dc8e828
ffff88adb67386a0
ffff88ae911aa518
ffff88add01ee208
ffff88ae21d24080
ffff88ae8ff2f390
ffff884eaa444518
ffff88bfd1bc46a0
ffff884eaad80080
ffff88cdd7543cc0
ffff88ae7bcad390
ffff88ade8ec7b38
ffff88adf1563cc0
ffff88aeaebf7208
ffff88ad9c712080
ffff88bfd2923cc0
ffff88bfd29239b0
ffff88ada720ae48
ffff88aeaebf7080
ffff88ae66172208
ffff88ae221679b0
ffff88bfd20679b0
ffff880db033db38
ffff88ade791e518
ffff880e8dad4390
ffff880e9dc8d9b0
ffff886ce6b22cc0
ffff886eab1ef080
ffff886d0217de48
ffff88ae8f98ae48
ffff88bfd0d92e48
ffff88ae250e4208
ffff88ae250e4cc0
ffff88bfd2067208
ffff88add406a6a0
ffff885fd08ddcc0
ffff884c12dc3e48
ffff884c6dc6f9b0
ffff88ae7ef11390
ffff880cc28f99b0
ffff88ade36a09b0
ffff880e9eef79b0
ffff882d75dec6a0
ffff88addadc9e48
ffff88aeb1331390
ffff88cda8c88828
ffff88adf3809e48
ffff88aea9c6a080
ffff88bfd2d2be48
ffff884eaf304b38
ffff88aeaeae26a0
ffff88aeb19e9080
ffff88adf3809518
ffff88ae906ca6a0
ffff88aeb19e9518
ffff88ae911bc828
ffff88adf3809080
ffff88adc3abd208
ffff88aeb19e9cc0
ffff88ae17e926a0
ffff88adb67389b0
ffff88ae31b04e48
ffff88cda8c88518
ffff88ae923cf390
ffff886d1630c390
ffff880e9eef7208
ffff88aeb12bd518
ffff88adb2c49080

list: duplicate list entry: ffff88adb2c49080

crash> p jiffies
jiffies = $2 = 4465500172

crash> struct request ffff88adb2c49080
struct request {
  queuelist = {
    next = 0xffff884bfe6f9518,
    prev = 0xffff88aeb12bd518
  },
  csd = {
    list = {
      next = 0xffffffff81278370 <trigger_softirq>,
      prev = 0xffff88adb2c49070
    },
    func = 0x1,
    info = 0xffff881fcd641328,
    flags = 7749,
    priv = 256
  },
  q = 0x1,
  cmd_flags = 135,
  cmd_type = 8192,
  atomic_flags = 721182495,
  cpu = -208067136,
  __data_len = 4294936749,
  __sector = 18446612879430460864,
  bio = 0x0,
  biotail = 0x0,
  hash = {
    next = 0xffff88adb2c490f0,
    pprev = 0x0
  },
  {
    rb_node = {
      rb_parent_color = 0,
      rb_right = 0x0,
      rb_left = 0x0
    },
    completion_data = 0x0
  },
  {
    elevator_private = {0x0, 0xffff883fd1dd6000, 0x10a2a1cbe},
    flush = {
      seq = 0,
      list = {
        next = 0xffff883fd1dd6000,
        prev = 0x10a2a1cbe
      }
    }
  },
  rq_disk = 0x9b777c92699a,
  start_time = 170938157020987,
  start_time_ns = 4294967297,
  io_start_time_ns = 18446612745141827968,
  nr_phys_segments = 0,
  ioprio = 0,
  ref_count = 0,
  special = 0xb,
  buffer = 0x0,
  tag = 0,
  errors = 0,
  __cmd = "\020\204\300c\256\210\377\377\020\000\000\000\000\000\000",
  cmd = 0x200000000000 <Address 0x200000000000 out of bounds>,
  cmd_len = 0,
  extra_len = 0,
  sense_len = 170536693,
  resid_len = 1,
  sense = 0xffff88adb2c49ac8,
  deadline = 18446612878404586184,
  timeout_list = {
    next = 0xfa0,
    prev = 0xffffffffa0002a00
  },
  timeout = 2999226456,
  retries = -30547,
  end_io = 0x0,
  end_io_data = 0x0,
  next_rq = 0xffffffff00000000,
  pad = 0xffff88ae616aaae8
}

Comment 9 Ewan D. Milne 2016-04-11 19:09:59 UTC

Thank you.  Can you provide the output of

> struct request ffff88a070c137c0

> foreach bt

Comment 10 Roger Heflin 2016-04-11 20:54:42 UTC

Created attachment 1146143 [details]
struct request ffff88a070c137c0 and foreach bt output.

struct request ffff88a070c137c0 and foreach bt output from corefile.

Comment 11 Ewan D. Milne 2016-04-15 19:33:56 UTC

OK, thank you very much.

Comment 12 Roger Heflin 2016-04-21 21:31:44 UTC

If the below looks like the bug, then we have reproduced it with the same test node load without the imperva module load.  An oracle flashback was in progress doing in excess of 10000write-iops/second (480MB/second) to a flash array.  iostats was reporting queue: values on the disks of up to 27 about the time the panic happened, with BUSY values of 95+, with avgresp of 15 or so and avgwait of as high of 472, so the io subsystem was being pushed hard during the flashback.

<4>[86408.053121] ------------[ cut here ]------------
<2>[86408.062550] kernel BUG at block/blk-core.c:1144!
<4>[86408.074664] invalid opcode: 0000 [#1] SMP
<4>[86408.087949] last sysfs file: /sys/devices/virtual/net/bond0/carrier
<4>[86408.099259] CPU 20
<4>[86408.101943] Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) hangcheck_timer mptctl mptbase nfsd exportfs oracleasm(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp stp llc ipv6 ext3 jbd dm_round_robin iTCO_wdt iTCO_vendor_support be2net ixgbe dca mdio e1000e ptp pps_core microcode ipmi_devintf serio_raw lpc_ich mfd_core hpilo hpwdt i7core_edac edac_core sg power_meter acpi_ipmi ipmi_si ipmi_msghandler bnx2 shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod lpfc scsi_transport_fc scsi_tgt crc_t10dif pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>[86408.214181]
<4>[86408.220736] Pid: 990, comm: kblockd/20 Tainted: P           ---------------    2.6.32-504.1.3.el6.x86_64 #1 HP ProLiant DL980 G7
<4>[86408.243917] RIP: 0010:[<ffffffff8126ee84>]  [<ffffffff8126ee84>] blk_requeue_request+0x94/0xa0
<4>[86408.258790] RSP: 0018:ffff881fd2a11bc0  EFLAGS: 00010002
<4>[86408.269941] RAX: ffff88e9edbb5630 RBX: ffff88e9edbb5508 RCX: ffff88e9edbb5630
<4>[86408.283143] RDX: ffff88e9edbb5630 RSI: ffff88e9edbb5508 RDI: ffff88e9edbb5508
<4>[86408.296269] RBP: ffff881fd2a11be0 R08: 0000000000000001 R09: 0000000000000002
<4>[86408.309316] R10: 0000000000000000 R11: 0000000000000000 R12: ffff889fcff0ab68

Comment 13 Roger Heflin 2016-04-22 14:38:13 UTC

Since the purpose of the scsi timeout appears to be to detect a bad/slow scsi device, shouldn't the timer start once the io is submitted to the actual device rather than when the io is submitted to the scsi layer and starts working its way through?   Or maybe do we need 2 timeouts, one for the on the wire time and one for this layer.  The purpose of the timeout in our usage is to deal with underlying physical SAN issues so a timer that timed things closer to the actual SAN device would do a better job for our primary usage.

Comment 17 Tore H. Larsen 2016-09-23 06:54:03 UTC

cc. 2.6.32-504.16.2.el6.x86_64

filename:       /lib/modules/2.6.32-504.16.2.el6.x86_64/weak-updates/elx-lpfc/lpfc.ko
version:        0:10.4.255.16

Think we hit this one under moderate tape IO.

Comment 18 Ewan D. Milne 2016-09-23 13:21:26 UTC

(In reply to Tore H. Larsen from comment #17)
> cc. 2.6.32-504.16.2.el6.x86_64
> 
> filename:      
> /lib/modules/2.6.32-504.16.2.el6.x86_64/weak-updates/elx-lpfc/lpfc.ko
> version:        0:10.4.255.16
> 
> Think we hit this one under moderate tape IO.

OK, thank you.  Do you have a crash dump we could examine?

Comment 19 Tore H. Larsen 2016-09-23 13:32:38 UTC

Unfortunately no dump. Have to admit that this was emulex elx-lpfc driver pulled from Emulex, as I had issues with the defaulte. Kernel also tainted with cxfs (sgi) 7.3.0.3 and lin_tape (ibm) 2.9.4 as well.

[root@pem-adm1 ~]# systool -c scsi_host -v |grep -i fwrev
    fwrev               = "10.0.803.25, sli-4:2:b"
    fwrev               = "10.0.803.25, sli-4:2:b"
    fwrev               = "10.0.803.25, sli-4:2:b"
    fwrev               = "10.0.803.25, sli-4:2:b"
    fwrev               = "10.0.803.25, sli-4:2:b"
    fwrev               = "10.0.803.25, sli-4:2:b"

[root@pem-adm1 ~]# systool -c scsi_host -v |grep -i driv
    lpfc_drvr_version   = "Emulex LightPulse Fibre Channel SCSI driver 10.4.255.16"
    lpfc_drvr_version   = "Emulex LightPulse Fibre Channel SCSI driver 10.4.255.16"
    lpfc_drvr_version   = "Emulex LightPulse Fibre Channel SCSI driver 10.4.255.16"
    lpfc_drvr_version   = "Emulex LightPulse Fibre Channel SCSI driver 10.4.255.16"
    lpfc_drvr_version   = "Emulex LightPulse Fibre Channel SCSI driver 10.4.255.16"
    lpfc_drvr_version   = "Emulex LightPulse Fibre Channel SCSI driver 10.4.255.16"

[root@pem-adm1 ~]# systool -c scsi_host -v |grep -i model
    modeldesc           = "Emulex LPe16002B-M6 PCIe 2-port 16Gb Fibre Channel Adapter"
    modelname           = "LPe16002B-M6"
    modeldesc           = "Emulex LPe16002B-M6 PCIe 2-port 16Gb Fibre Channel Adapter"
    modelname           = "LPe16002B-M6"
    modeldesc           = "Emulex LightPulse LPe16004-M6 4-Port 16Gb Fibre Channel Adapter"
    modelname           = "LPe16004-M6"
    modeldesc           = "Emulex LightPulse LPe16004-M6 4-Port 16Gb Fibre Channel Adapter"
    modelname           = "LPe16004-M6"
    modeldesc           = "Emulex LightPulse LPe16004-M6 4-Port 16Gb Fibre Channel Adapter"
    modelname           = "LPe16004-M6"
    modeldesc           = "Emulex LightPulse LPe16004-M6 4-Port 16Gb Fibre Channel Adapter"
    modelname           = "LPe16004-M6"

[root@pem-adm1 modprobe.d]# more lpfc.conf 
options lpfc lpfc_sg_seg_cnt=256 \
        lpfc_fcp_io_channel=4 \
        lpfc_fcp_io_sched=1 \
        lpfc_fcp_imax=500000 \
        lpfc_lun_queue_depth=32  \
        lpfc_fcp2_no_tgt_reset=1


messages log prior to panic:

Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:231: [sdes] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: lpfc 0000:81:00.1: 1:(0):0722 Target Reset rport failure: rdata xffff882062770fe8
Sep 23 05:26:58 pem-adm1 kernel: end_request: I/O error, dev sdeu, sector 26437910144
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:232: [sdet] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:233: [sdeu] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:234: [sdev] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:240: [sdew] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:241: [sdex] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:242: [sdey] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:243: [sdez] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:244: [sdfa] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:245: [sdfb] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:246: [sdfc] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:247: [sdfd] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:248: [sdfe] Synchronizing SCSI cache
Sep 23 05:26:58 pem-adm1 kernel: sd 12:0:2:249: [sdff] Synchronizing SCSI cache
Sep 23 05:27:31 pem-adm1 kernel: lpfc 0000:81:00.1: 1:(0):2756 LOGO failure DID:011A00 Status:x3/x31000002
Sep 23 05:28:34 pem-adm1 kernel: lpfc 0000:81:00.1: 1:(0):2753 PLOGI failure DID:011A00 Status:x3/x31000002
Sep 23 05:29:39 pem-adm1 kernel: lpfc 0000:84:00.1: 3:(0):0727 TMF FCP_LUN_RESET to TGT 1 LUN 231 failed (3, 805306372) iocb_flag x6
Sep 23 05:29:39 pem-adm1 kernel: lpfc 0000:84:00.1: 3:(0):0713 SCSI layer issued Device Reset (1, 231) return x2003
Sep 23 05:30:09 pem-adm1 kernel: lpfc 0000:84:00.1: 3:(0):0203 Devloss timeout on WWPN 20:32:00:80:e5:29:a1:38 NPort x011a00 Data: x100 x5 x6
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:231: [sdnb] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: lpfc 0000:84:00.1: 3:(0):0722 Target Reset rport failure: rdata xffff88205e5f37e8
Sep 23 05:30:09 pem-adm1 kernel: end_request: I/O error, dev sdnb, sector 26439833216
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:232: [sdnc] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:233: [sdnd] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:234: [sdne] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:240: [sdnf] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:241: [sdng] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:242: [sdnh] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:243: [sdni] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:244: [sdnj] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:245: [sdnk] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:246: [sdnl] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:247: [sdnm] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:248: [sdnn] Synchronizing SCSI cache
Sep 23 05:30:09 pem-adm1 kernel: sd 14:0:1:249: [sdno] Synchronizing SCSI cache
Sep 23 05:30:38 pem-adm1 kernel: lpfc 0000:85:00.1: 5:(0):0727 TMF FCP_LUN_RESET to TGT 7 LUN 231 failed (3, 805306372) iocb_flag x6
Sep 23 05:30:38 pem-adm1 kernel: lpfc 0000:85:00.1: 5:(0):0713 SCSI layer issued Device Reset (7, 231) return x2003
Sep 23 05:30:42 pem-adm1 kernel: lpfc 0000:84:00.1: 3:(0):2756 LOGO failure DID:011A00 Status:x3/x31000002
Sep 23 06:25:11 pem-adm1 kernel: imklog 5.8.10, log source = /proc/kmsg started.

No indication on NetApp/SGI E5660F storage side eventlog. Have to admit that the array which includes lun's 231 and 233 had a complete drawer failure some weeks back, but due to enough GHS and enclosure protection it rebuilt completely in the background. Storage array firmware 08.10.15.00. Planning to go to latest qualified 08.10.19.00 after survey.  As well as kernel 2.6.32-504.30.3.

Comment 22 Ewan D. Milne 2016-10-20 13:44:36 UTC

(In reply to Roger Heflin from comment #13)
> Since the purpose of the scsi timeout appears to be to detect a bad/slow
> scsi device, shouldn't the timer start once the io is submitted to the
> actual device rather than when the io is submitted to the scsi layer and
> starts working its way through?   Or maybe do we need 2 timeouts, one for
> the on the wire time and one for this layer.  The purpose of the timeout in
> our usage is to deal with underlying physical SAN issues so a timer that
> timed things closer to the actual SAN device would do a better job for our
> primary usage.

So, the timeout is started when the request is taken off the queue and
submitted to the HBA to go out on the fabric, it is not running while the
request is on the queue, just to be clear.

This happens when scsi_request_fn() calls blk_start_request() -> blk_add_timer().
scsi_request_fn() then calls scsi_dispatch_cmd() which calls the lpfc driver's
lpfc_queuecommand() function.

If at any point the command cannot be started, the timer is stopped and the
command is requeued.  The timer will be started again from the beginning the
next time it is attempted to be issued.

The rare race that David Jeffery refers to is that there is a finite amount of
processing performed between the call to blk_add_timer() and the time when
the driver adds the command to its internal data structures.  Until this is
done, the abort logic (described below) would not work correctly.

A timeout, when it happens, results in a call to abort the command.  The
abort call into the driver will check to see if the command is still pending
(i.e. if it is still in the driver's internal data structures).  If it is,
an ABTS is issued to the fabric and we wait for completion.  The logic is
designed to ensure that the driver is not still using the command prior to
the abort call completion.  After this, the SCSI midlayer performs error
recovery to attempt to regain connectivity to the device.

If the command is in the process of being added to the driver and it times
out, an abort call would succeed, and the normal command processing would also
succeed.  In practice this does not happen because there is not (usually) any
significant delay in the driver's _queuecommand() routine.

There is also a small amount of processing performed if the driver's
_queuecommand() routine rejects the command, causing it to be requeued,
or if one of the checks for (host, target, device) blocked causes it to
be requeued without being issued.  In those cases the timer is stopped
shortly after being started.

There is also the possibility for a race condition upon the completion of
a command, when the driver is removing the reference to the command from its
internal data structures.  In the case of lpfc, however, this is not done
until after ->scsi_done() is called, so it does not look like it would be
the cause of the problem.  It might be a problem for other drivers though.

---

It is difficult to see how we would not make it to the point of calling
lpfc_get_scsi_buf() and setting lpfc_cmd->pCmd and cmnd->host_scribble in
lpfc_queuecommand() within a 4 second timeout.  The sources of delay are:

   spin_lock(shost->host_lock);  [ in scsi_request_fn() ]

   spin_lock_irqsave(host->host_lock, flags);  [ in scsi_dispatch_cmd() ]

   spin_lock_irqsave(&phba->scsi_buf_list_get_lock, iflag);

---

Comment 23 Roger Heflin 2016-10-20 14:15:11 UTC

Here is some new stuff:

We have 4 nodes with each slightly different loads on them that we can reproduce this issue with the 4 sec timeout.  2 of those nodes stop being able to reproduce the kernel panic with the timeout set to 15 seconds.   

2 of the other nodes have still reproduced the scsi crash with the timeout set as high as 180 seconds.  In both of those cases the SAN is very likely losing random requests as there are a number of hosts sharing the ports on the array so there is a high probability of requests being lost and the abort code having to run when this is happening.  This exact same setup was not having any issues under a rhel5.10 errata kernel with the same badly overused SAN and the 4 second timeout.

All nodes reproducing the issue either have extreme load (the first one that reproduced it was doing an oracle flashback and I believe is successfully overloading the write cache on a flash array, it writes at 800mb/second for quite a while (a number of minutes) with good response times and then the response times go up >10x and we hit the issue if the timeout is 4sec, but don't hit it when we have higher timeouts).   In this case the SAN array and ports are dedicated to the given hosts so I would not expect there to be loss of SAN requests, just slow SAN requests because of the array possibly having its write cache overrun.  The other cases are not doing sustained writes for long periods of time but are also rapidly applying oracle archive logs every so often that takes a minute or more of intensive writes.

Given the 180second timeout hitting it it may be more a simple case of the abort is running at the exact same time something else is attempting to work on the same request.  I would be surprised if in the overused SAN case if a request were actually coming back as complete after 180seconds, but that may actually be possible that the array had been trying to send back the completely request for a while and just finally gets it back.    

In all of these cases though, we do see extremely high response times and wait times and I am unable to get my disk response monitoring tool to be able to successfully write stuff to a local internal disk (HP cciss/hpsa driver) for large periods of time when this is happening, so it does appear that the entire disk IO subsystem was backed up and not responding in a reasonable time. I am able to run that same tool with output to the screen being saved on another machine, so it is not an issue with the underlying iostat command I am using to collect my data.   It does show very high response and wait times >4000's of ms when the issue happens.

Comment 26 Ewan D. Milne 2016-10-21 17:04:27 UTC

It would be helpful to at least get the stack traces from the crashes you
mention, particularly those where the timeout values were much higher.

Comment 27 Roger Heflin 2016-10-21 17:27:25 UTC

This is from the serial console we have logging all on the troublesome machines: 

kernel BUG at block/blk-core.c:2166!^M
invalid opcode: 0000 [#1] SMP ^M
last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:0b:00.0/host2/rport-2:0-6/target2:0:4/2:0:4:32/state^M
CPU 55 ^M
Modules linked in: bridge oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) hangcheck_timer krg_11_0_0_1130_impRHEL6K1smp-x86_64(P)(U) mptctl mptbase oracleasm(U) bonding 8021q garp stp llc ipv6 ext3 jbd microcode be2net iTCO_wdt iTCO_vendor_support serio_raw lpc_ich mfd_core hpwdt hpilo i7core_edac edac_core e1000e ptp pps_core ses enclosure ipmi_devintf power_meter acpi_ipmi ipmi_si ipmi_msghandler sg bnx2 shpchp ext4 jbd2 mbcache dm_round_robin sr_mod cdrom sd_mod pata_acpi ata_generic ata_piix lpfc scsi_transport_fc scsi_tgt crc_t10dif hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]^M
^M
Pid: 833, comm: kblockd/55 Tainted: P           ---------------    2.6.32-504.1.3.el6.x86_64 #1 HP ProLiant DL980 G7^M
RIP: 0010:[<ffffffff8126eafb>]  [<ffffffff8126eafb>] blk_start_request+0x4b/0x50^M
RSP: 0018:ffff881fd2f59c50  EFLAGS: 00010002^M
RAX: 0000000000000000 RBX: ffff880668431690 RCX: 000000000000cb38^M
RDX: 000107c1fd5fbd2e RSI: ffff88c070dc0000 RDI: ffff880668431690^M
RBP: ffff881fd2f59c60 R08: 0000000000000000 R09: 0000000000000000^M
R10: 0000000000000001 R11: 0000000000000000 R12: ffff881fced65e20^M
R13: 000000000000000e R14: 000000000000000e R15: ffff881fced94e68^M
FS:  0000000000000000(0000) GS:ffff88c070dc0000(0000) knlGS:0000000000000000^M
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b^M
CR2: 00007f8714c81978 CR3: 000000c6c4dc7000 CR4: 00000000000007e0^M
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000^M
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400^M
Process kblockd/55 (pid: 833, threadinfo ffff881fd2f58000, task ffff881fd2f57500)^M
Stack:^M
 ffff881fd2f59c70 ffff880668431690 ffff881fd2f59ca0 ffffffff81273a39^M
<d> ffff881fd2f59c90 ffff881fced6f000 ffff881fced94e68 ffff880668431690^M
<d> ffff881fd1973000 ffff88c070dd9ac8 ffff881fd2f59d10 ffffffff813875c1^M
Call Trace:^M
 [<ffffffff81273a39>] blk_queue_start_tag+0x89/0x120^M
 [<ffffffff813875c1>] scsi_request_fn+0x131/0x750^M
 [<ffffffff8108748d>] ? del_timer+0x7d/0xe0^M
 [<ffffffff8126f562>] __generic_unplug_device+0x32/0x40^M
 [<ffffffff8126f59e>] generic_unplug_device+0x2e/0x50^M
 [<ffffffff8126b3e4>] blk_unplug+0x34/0x70^M
 [<ffffffffa000461c>] dm_table_unplug_all+0x5c/0x100 [dm_mod]^M
 [<ffffffff8126b440>] ? blk_unplug_work+0x0/0x70^M
 [<ffffffff8126f562>] ? __generic_unplug_device+0x32/0x40^M
 [<ffffffff8126b440>] ? blk_unplug_work+0x0/0x70^M
 [<ffffffffa0000fa6>] dm_unplug_all+0x36/0x50 [dm_mod]^M
 [<ffffffff8126b476>] blk_unplug_work+0x36/0x70^M
 [<ffffffff8126b440>] ? blk_unplug_work+0x0/0x70^M
 [<ffffffff81097fe0>] worker_thread+0x170/0x2a0^M
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40^M
 [<ffffffff81097e70>] ? worker_thread+0x0/0x2a0^M
 [<ffffffff8109e66e>] kthread+0x9e/0xc0^M
 [<ffffffff8100c20a>] child_rip+0xa/0x20^M
 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0^M
 [<ffffffff8100c200>] ? child_rip+0x0/0x20^M
Code: 8b 83 50 01 00 00 48 85 c0 75 15 f6 43 48 01 75 1a 48 89 df e8 f7 9a 00 00 48 83 c4 08 5b c9 c3 8b 50 54 89 90 14 01 00 00 eb e0 <0f> 0b eb fe 90 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 31 c0 ^M
RIP  [<ffffffff8126eafb>] blk_start_request+0x4b/0x50^M
 RSP <ffff881fd2f59c50>^M

I only have this one, we have the timeouts set such that oracle usually boots the node before the panic as the panic leaves the node hung up.   Our kdump setup does not appear to have memory allocated to kdump (it ooms when bring in the ~900 SAN luns), and hangs the node rather that doing anything.    Not sure how to change that in rhel6 since that is all supposed to be automagic. probably a bug in rhel6.6 kdump I assume.

If we were to really need much more I will have to have them upgrade to 6.7 as 6.6 attempts to dump hugepages which is a significant size on this node, and that bug is fixed in 6.7+, and possibly the other kdump issue.

Comment 28 Ewan D. Milne 2016-10-21 18:56:28 UTC

void blk_start_request(struct request *req)
{
        blk_dequeue_request(req);

        /*                                                                                                                                                                                                         
         * We are now handing the request to the hardware, initialize                                                                                                                                              
         * resid_len to full count and add the timeout handler.                                                                                                                                                    
         */
        req->resid_len = blk_rq_bytes(req);
        if (unlikely(blk_bidi_rq(req)))
                req->next_rq->resid_len = blk_rq_bytes(req->next_rq);

        BUG_ON(test_bit(REQ_ATOM_COMPLETE, &req->atomic_flags));       <==== HERE
        blk_add_timer(req);
}

Machine crashed because request already had the completion bit set, before
the timer was started.  The bit is cleared when the request is allocated and
initialized (memset to 0), or when it is requeued (after the timer is stopped),
or when the timer is restarted in certain cases (e.g. due to an FC port being
in the blocked state and the devloss timer not yet having expired).

This will be very difficult to diagnose without a crash dump to examine.
I suspect it was caused by the lpfc driver somehow completing the request
after it had been aborted and requeued (thus the REQ_ATOM_COMPLETE bit had
been reset).

There might be some evidence of this earlier in the console output, is there
anything in there about any aborted commands?

Comment 29 Roger Heflin 2016-10-21 19:18:23 UTC

No aborts on the console or in messages documented in the previous several hours to the crash.  

I have 3 other similar crashes crash with the timeout set low.

<2>kernel BUG at block/blk-core.c:1144!
<4>invalid opcode: 0000 [#1] SMP
<4>last sysfs file: /sys/devices/system/cpu/online
<4>CPU 60
<4>Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) hangcheck_timer krg_11_0_0_1130_impRHEL6K1smp-x86_64(P)(U) mptctl mptbase nfsd exportfs oracleasm(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp stp llc ipv6 ext3 jbd dm_round_robin iTCO_wdt iTCO_vendor_support be2net ixgbe dca mdio e1000e ptp pps_core microcode ipmi_devintf serio_raw lpc_ich mfd_core hpilo hpwdt i7core_edac edac_core sg power_meter acpi_ipmi ipmi_si ipmi_msghandler bnx2 shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod lpfc scsi_transport_fc scsi_tgt crc_t10dif pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 1030, comm: kblockd/60 Tainted: P           ---------------    2.6.32-504.1.3.el6.x86_64 #1 HP ProLiant DL980 G7
<4>RIP: 0010:[<ffffffff8126ee84>]  [<ffffffff8126ee84>] blk_requeue_request+0x94/0xa0
<4>RSP: 0018:ffff881fd2aa3bc0  EFLAGS: 00010006
<4>RAX: ffff88c3bec68f60 RBX: ffff88c3bec68e38 RCX: ffff88c3bec68f60
<4>RDX: ffff88c3bec68f60 RSI: ffff88c3bec68e38 RDI: ffff88c3bec68e38
<4>RBP: ffff881fd2aa3be0 R08: 0000000000000001 R09: 000000000000003c
<4>R10: 0000000000000001 R11: 0000000000000000 R12: ffff881fce404678
<4>R13: 0000000000000000 R14: ffff881fcf314000 R15: ffff886e9ed89580
<4>FS:  0000000000000000(0000) GS:ffff88c070c00000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<4>CR2: 0000000089ca3014 CR3: 0000000e9f403000 CR4: 00000000000007e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process kblockd/60 (pid: 1030, threadinfo ffff881fd2aa2000, task ffff881fd2aa1500)
<4>Stack:
<4> ffffffffffffff04 ffff881fcebf4000 ffff881fce404678 ffff88c3bec68e38
<4><d> ffff881fd2aa3c50 ffffffff81387665 ffff881fd2aa3d00 ffff88c73cd0e070
<4><d> ffff883fd2a90280 0000000000000004 ffff881fcebf4138 ffff881fcebf4048
<4>Call Trace:
<4> [<ffffffff81387665>] scsi_request_fn+0x1d5/0x750
<4> [<ffffffff8126f3c1>] __blk_run_queue+0x31/0x40
<4> [<ffffffff8126a89a>] elv_insert+0xfa/0x190
<4> [<ffffffff8126a970>] __elv_add_request+0x40/0x90
<4> [<ffffffff8126edad>] blk_insert_cloned_request+0x7d/0xc0
<4> [<ffffffffa000315c>] dm_dispatch_request+0x3c/0x70 [dm_mod]
<4> [<ffffffffa0003c37>] dm_request_fn+0x187/0x2f0 [dm_mod]
<4> [<ffffffff8126b440>] ? blk_unplug_work+0x0/0x70
<4> [<ffffffff8126f562>] __generic_unplug_device+0x32/0x40
<4> [<ffffffff8126f59e>] generic_unplug_device+0x2e/0x50
<4> [<ffffffff8126b440>] ? blk_unplug_work+0x0/0x70
<4> [<ffffffffa0000fb5>] dm_unplug_all+0x45/0x50 [dm_mod]
<4> [<ffffffff8126b476>] blk_unplug_work+0x36/0x70
<4> [<ffffffff8126b440>] ? blk_unplug_work+0x0/0x70
<4> [<ffffffff81097fe0>] worker_thread+0x170/0x2a0
<4> [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
<4> [<ffffffff81097e70>] ? worker_thread+0x0/0x2a0
<4> [<ffffffff8109e66e>] kthread+0x9e/0xc0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
<4>Code: 00 00 eb d1 4c 8b 2d 1c 3a 96 00 4d 85 ed 74 bf 49 8b 45 00 49 83 c5 08 48 89 de 4c 89 e7 ff d0 49 8b 45 00 48 85 c0 75 eb eb a4 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00
<1>RIP  [<ffffffff8126ee84>] blk_requeue_request+0x94/0xa0
<4> RSP <ffff881fd2aa3bc0>


And I have this one:

<2>kernel BUG at block/blk-core.c:1144!
<4>invalid opcode: 0000 [#1] SMP
<4>last sysfs file: /sys/devices/system/cpu/online
<4>CPU 91
<4>Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) hangcheck_timer mptctl mptbase oracleasm(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp stp llc ipv6 ext3 jbd dm_round_robin iTCO_wdt iTCO_vendor_support microcode ipmi_devintf serio_raw lpc_ich mfd_core hpilo hpwdt i7core_edac edac_core sg power_meter acpi_ipmi ipmi_si ipmi_msghandler ixgbe dca ptp pps_core mdio be2net bnx2 shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod lpfc scsi_transport_fc scsi_tgt crc_t10dif pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 3310, comm: scsi_eh_3 Tainted: P           ---------------    2.6.32-504.1.3.el6.x86_64 #1 HP ProLiant DL980 G7
<4>RIP: 0010:[<ffffffff8126ee84>]  [<ffffffff8126ee84>] blk_requeue_request+0x94/0xa0
<4>RSP: 0018:ffff881fce925d70  EFLAGS: 00010097
<4>RAX: ffff880e536f54a8 RBX: ffff880e536f5380 RCX: ffff880e536f54a8
<4>RDX: ffff880e536f54a8 RSI: ffff880e536f5380 RDI: ffff880e536f5380
<4>RBP: ffff881fce925d90 R08: ffff881fce925e90 R09: 0000000000000000
<4>R10: 0000000000000002 R11: 0000000000000000 R12: ffff881fce419328
<4>R13: 0000000000000000 R14: ffff881fce419328 R15: ffff881fd169b000
<4>FS:  0000000000000000(0000) GS:ffff882070ec0000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<4>CR2: 00007fa7239e32b8 CR3: 0000006e20455000 CR4: 00000000000007e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process scsi_eh_3 (pid: 3310, threadinfo ffff881fce924000, task ffff881fcefde040)
<4>Stack:
<4> 0000000000001057 0000000000000286 ffff882ddcefac80 ffff881fce404800
<4><d> ffff881fce925de0 ffffffff8138881b ffff881fce925dd0 ffff881fce925e80
<4><d> ffff881fce404800 ffff882ddcefac80 ffff881fce925e78 ffff881fce925e90
<4>Call Trace:
<4> [<ffffffff8138881b>] __scsi_queue_insert+0x9b/0x140
<4> [<ffffffff81388f93>] scsi_queue_insert+0x13/0x20
<4> [<ffffffff81384093>] scsi_eh_flush_done_q+0x93/0x150
<4> [<ffffffff81385e11>] scsi_error_handler+0x3d1/0x7c0
<4> [<ffffffff81385a40>] ? scsi_error_handler+0x0/0x7c0
<4> [<ffffffff8109e66e>] kthread+0x9e/0xc0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
<4>Code: 00 00 eb d1 4c 8b 2d 1c 3a 96 00 4d 85 ed 74 bf 49 8b 45 00 49 83 c5 08 48 89 de 4c 89 e7 ff d0 49 8b 45 00 48 85 c0 75 eb eb a4 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00
<1>RIP  [<ffffffff8126ee84>] blk_requeue_request+0x94/0xa0


And this one:
<2>kernel BUG at block/blk-core.c:1144!
<4>invalid opcode: 0000 [#1] SMP
<4>last sysfs file: /sys/devices/virtual/net/bond0/carrier
<4>CPU 20
<4>Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) hangcheck_timer mptctl mptbase oracleasm(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp stp llc ipv6 ext3 jbd dm_round_robin iTCO_wdt iTCO_vendor_support microcode ipmi_devintf serio_raw lpc_ich mfd_core hpilo hpwdt i7core_edac edac_core sg power_meter acpi_ipmi ipmi_si ipmi_msghandler ixgbe dca ptp pps_core mdio be2net bnx2 shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod lpfc scsi_transport_fc scsi_tgt crc_t10dif pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 3346, comm: scsi_eh_5 Tainted: P           ---------------    2.6.32-504.1.3.el6.x86_64 #1 HP ProLiant DL980 G7
<4>RIP: 0010:[<ffffffff8126ee84>]  [<ffffffff8126ee84>] blk_requeue_request+0x94/0xa0
<4>RSP: 0018:ffff881fd00c9d70  EFLAGS: 00010006
<4>RAX: ffff884de19a44a8 RBX: ffff884de19a4380 RCX: ffff884de19a44a8
<4>RDX: ffff884de19a44a8 RSI: ffff884de19a4380 RDI: ffff884de19a4380
<4>RBP: ffff881fd00c9d90 R08: ffff881fd00c9e90 R09: 0000000000000000
<4>R10: 0000000000000002 R11: 0000000000000000 R12: ffff885fd1dbe778
<4>R13: 0000000000000000 R14: ffff885fd1dbe778 R15: ffff885fd2b50000
<4>FS:  0000000000000000(0000) GS:ffff884070c00000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<4>CR2: 00007fe796932008 CR3: 0000002e67737000 CR4: 00000000000007e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process scsi_eh_5 (pid: 3346, threadinfo ffff881fd00c8000, task ffff881fd0845500)
<4>Stack:
<4> 0000000000001057 0000000000000286 ffff884de1f07dc0 ffff885fd3fdc000
<4><d> ffff881fd00c9de0 ffffffff8138881b ffff881fd00c9dd0 ffff881fd00c9e80
<4><d> ffff885fd3fdc000 ffff884de1f07dc0 ffff881fd00c9e78 ffff881fd00c9e90
<4>Call Trace:
<4> [<ffffffff8138881b>] __scsi_queue_insert+0x9b/0x140
<4> [<ffffffff81388f93>] scsi_queue_insert+0x13/0x20
<4> [<ffffffff81384093>] scsi_eh_flush_done_q+0x93/0x150
<4> [<ffffffff81385e11>] scsi_error_handler+0x3d1/0x7c0
<4> [<ffffffff81385a40>] ? scsi_error_handler+0x0/0x7c0
<4> [<ffffffff8109e66e>] kthread+0x9e/0xc0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
<4>Code: 00 00 eb d1 4c 8b 2d 1c 3a 96 00 4d 85 ed 74 bf 49 8b 45 00 49 83 c5 08 48 89 de 4c 89 e7 ff d0 49 8b 45 00 48 85 c0 75 eb eb a4 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00
<1>RIP  [<ffffffff8126ee84>] blk_requeue_request+0x94/0xa0
<4> RSP <ffff881fd00c9d70>

Comment 30 loberman 2016-10-26 20:05:52 UTC

Hello Roger

A couple of quick questions because I have been watching the BZ here and I have fallen behind the current configuration status.

What is the current configuration with respect to the settings for eh_deadline and eh_timeout
What is the current setting within multipath.conf for dev_loss_tmo etc.
What is the current Oracle disk heartbeat timer set to here.

You are statistically within a very tight margin here with these very low scsi timeouts where you are now exposed to these timing races.

How long is the maximum SCSI error recovery time allowed here before we have evictions.
I know you have these very low SCSI timeouts for that reason.

Many Thanks
Laurence Oberman

Comment 31 Roger Heflin 2016-10-26 21:42:48 UTC

we have not change eh* from default.  dev_loss_tmo is I believe 30.

We are not getting any declared timeouts on the SAN.  No scsi errors in general.  when we do get scsi errors they are 24 seconds 

The disk timeout on some of the nodes is as high as 120 seconds with similar values for oracles hb timers.   When it hits 120 seconds usually (but not always) oracle is evicting the node.  If we set the oracle timeout a few seconds higher we hit the scsi bugs.

Comment 33 John Pittman 2016-11-04 13:44:59 UTC

Created attachment 1217410 [details]
Commands requested through 01557267

Comment 39 Roger Heflin 2016-12-09 15:56:49 UTC

"Some of the array vendors have a SAN "jammer".  I believe it is basically a SAN analyzer with some sort of license option that allows one to cause a SAN packet to get misplaces.  Based on what we think is going on losing a read or a write often enough may be enough to reproduce this error as that is what our cleaning up would have reduced.

Not sure if RedHat can borrow the device from an array or switch vendor or if RedHat knows of someone else with the device.   One might be able to insert something in the kernel to randomly lose a SAN read or write packet just the attempt to simulate the situation of SAN packet loss because of congestion in the SAN fabric.  Thinking about it adding some magic in the kernel may be simpler. The vendor indicated the "license" was close to the cost of the analyzer.    Something like the scsi_dh module that has a ability to make a clean SAN dirty and possible has some limited ability to drop packets based on a rule.   That may give redhat a better ability to run test against updated/fixed things in the modules to make sure things correctly handle the sort of weirdness one gets in a large SAN when things are sub-optimal.

Comment 40 loberman 2016-12-09 16:53:06 UTC

Roger 

I have developed such an option that runs via the tcm_qla2xxx target driver that allows SCSI commands to be discarded with some control to a tcm array.

I have been chatting with Ewan about possible setting something up to reproduce this.

How many discards or drops do you think are needed to reproduce.

Any details you can provide will help in my efforts to use my jammer here.

Thanks
Laurence

Comment 43 loberman 2016-12-12 17:42:36 UTC

Hi Ewan

After realizing latest 4.9 is unstable with the jammer patch, I reverted to the one I used when testing prior to upstream submission. Version 4.5.1. 

I have 11 LIO Target LUNS.

I am running 11 direct_io reads to the mpath devices, and 1 oflag write job to an FS on one of the mpaths.

I have tried with eh_timeout and scsi timeout at 2s and with defaults of 10s and 30s.

I am discarding only NON TUR commands on the jammer and see the I/O stall, then recover.
Depending on how long I drop the commands I will get a stall, then it continues with no logging, or I will get the recovery of the adapter with a reset (lpfc) and this is logged.

[root@fcoe-test-rhel6 data1]# uname -a
Linux fcoe-test-rhel6 2.6.32-504.1.3.el6.x86_64 #1 SMP Fri Oct 31 11:37:10 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

[root@fcoe-test-rhel6 ~]# cat ./set_eh_timeout+scsi_timeout.sh
#!/bin/bash
for d in /sys/block/sd*
do
	echo 2 > $d/device/eh_timeout
	echo 2 > $d/device/timeout
done

With the above tuning and 15s blocks I get the hard sd device I/O errors

This will temporarly lose the path on this failure

[ 2062.043034] lpfc 0000:08:00.0: 0:(0):3053 lpfc_log_verbose changed from 0 (x0) to 4115 (x1013)
[ 2062.091842] lpfc 0000:05:00.0: 1:(0):3053 lpfc_log_verbose changed from 0 (x0) to 4115 (x1013)
[ 2218.681126] sd 3:0:0:6: [sdad]  Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK
[ 2218.723973] sd 3:0:0:6: [sdad] CDB: Write(10): 2a 00 00 2b fb a0 00 00 10 00
[ 2218.763554] end_request: I/O error, dev sdad, sector 2882464
[ 2218.795537] sd 3:0:0:6: [sdad]  
[ 2218.795589] device-mapper: multipath: Failing path 65:208.
[ 2218.844046] Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK
[ 2218.875699] sd 3:0:0:6: [sdad] CDB: Write(10): 2a 00 00 2c 2b a0 00 00 10 00
[ 2218.918841] end_request: I/O error, dev sdad, sector 2894752
[ 2218.950428] sd 3:0:0:6: [sdad]  Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK
[ 2218.991945] sd 3:0:0:6: [sdad] CDB: Write(10): 2a 00 00 2c 99 80 00 00 10 00
[ 2219.036910] end_request: I/O error, dev sdad, sector 2922880
[ 2220.585006] sd 3:0:0:6: alua: port group 00 state A non-preferred supports TOlUSNA


However I have been unable to reproduce this panic so far.
Thanks
Laurence

Comment 44 Ewan D. Milne 2016-12-12 17:45:29 UTC

OK, thanks.  Let's let it run for a while and see what we get.

Comment 45 loberman 2016-12-14 17:44:04 UTC

Still busy here.
Moved to a dual port 8G LPFC, was only running a single port before because one of the ports was faulty.

Thanks
Laurence

Comment 46 loberman 2016-12-15 14:53:33 UTC

Hello Roger

After many attempts I have not yet had success in reproducing here.

I have tried with very short timeouts per below

#!/bin/bash
for d in /sys/block/sd*
do
echo 2 > $d/device/eh_timeout
echo 2 > $d/device/timeout
done

I have tried with no eh_deadline or timeout tuning as well, so at defaults and still have been unable to reproduce.

I watch the kernel and see the impact of the discards but after a timeout period the stack is recovering.

What is clear to me, and I have been down this same road recently with customers running large Oracle RAC configurations, is that you have to tune the css misscount.

The oracle default of 27s is simply too low for multipath to reconfigure on path loss etc.

Most customers these days are now running with Oracle voting heartbeats at 90s and above.

If you have any more details to share that you think would be helpful about your configuration and how this plays out please let me know.

My configuration (most recent)

Host with 100 LUNS, 4 paths so 400 sd devices accessed via F/C lpfc 4G on a targetLIO aray running my jammer code.

host port 1 -> DS5000 switch->targetLIO array
host port 2 -> DS5000 switch->targetLIO array

Running multibus

3600140570c3b35be40245618bef4e8e9 dm-15 LIO-ORG,block-6
size=20G features='0' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=50 status=active
|- 12:0:1:5 sdz 65:144 active ready running
|- 13:0:0:5 sdap 66:144 active ready running
|- 12:0:0:5 sdj 8:144 active ready running
`- 13:0:1:5 sdbf 67:144 active ready running

Jamming is done to a specific array port and commands are discarded.
This was done jamming all commands and then only data movement commands.

Thanks
Laurence and Ewan

Comment 58 Ewan D. Milne 2017-02-10 16:06:10 UTC

Dick, the following patch went into RHEL7.4 as part of the lpfc driver update:

commit 8ed2b039a3a7751d5b436b57fcaa5a256b04bcd9
Author: Rob Evers <revers>
Date:   Tue Jan 31 18:18:39 2017 -0500

    [scsi] lpfc: Correct panics with eh_timeout and eh_deadline
    
    Message-id: <1485886737-16352-39-git-send-email-revers>
    Patchwork-id: 164440
    O-Subject: [RHEL7.4 e-stor PATCH 38/56] scsi: lpfc: Correct panics with eh_timeout and eh_deadline
    Bugzilla: 1382101
    RH-Acked-by: Ewan Milne <emilne>
    RH-Acked-by: Maurizio Lombardi <mlombard>
    RH-Acked-by: Jarod Wilson <jarod>
    
    From: James Smart <james.smart>
    
    Correct panics with eh_timeout and eh_deadline
    
    We were having double completions on our SLI-3 version of adapters.
    Solved by clearing our command pointer before calling scsi_done.
    
    The eh paths potentially ran simulatenously and would see the non-null
    value and invoke scsi_done again.
    
    Signed-off-by: Dick Kennedy <dick.kennedy>
    Signed-off-by: James Smart <james.smart>
    Reviewed-by: Johannes Thumshirn <jthumshirn>
    Reviewed-by: Hannes Reinecke <hare>
    Signed-off-by: Martin K. Petersen <martin.petersen>
    (cherry picked from commit 89533e9be08aeda5cdc4600d46c1540c7b440299)
    Signed-off-by: Rob Evers <revers>
    Signed-off-by: Rafael Aquini <aquini>

Do you think it would be wise to put this into RHEL6?  We have seen some cases
of duplicate completions with lpfc on RHEL6, especially with short timeouts.

Comment 60 loberman 2017-04-10 14:49:10 UTC

Ewan,

I have a critical customer case seeing a similar issue with sli3.

2.6.32-642.6.2.el6.x86_64

Host Template

hostt = 0xffffffffa0116740 <lpfc_template_s3>,

We will need to get this into RHEL6.8+


crash> bt
PID: 0      TASK: ffffffff81a95020  CPU: 0   COMMAND: "swapper"
 #0 [ffff88009a203a90] machine_kexec at ffffffff8103fdcb
 #1 [ffff88009a203af0] crash_kexec at ffffffff810d1dc2
 #2 [ffff88009a203bc0] oops_end at ffffffff8154d110
 #3 [ffff88009a203bf0] die at ffffffff8101102b
 #4 [ffff88009a203c20] do_trap at ffffffff8154c964
 #5 [ffff88009a203c80] do_invalid_op at ffffffff8100cd95
 #6 [ffff88009a203d20] invalid_op at ffffffff8100c01b
    [exception RIP: blk_requeue_request+148]
    RIP: ffffffff8127c554  RSP: ffff88009a203dd0  RFLAGS: 00010093
    RAX: ffff8802ef189a08  RBX: ffff8802ef1898e0  RCX: ffff8802ef189a08
    RDX: ffff8802ef189a08  RSI: ffff8802ef1898e0  RDI: ffff8802ef1898e0
    RBP: ffff88009a203df0   R8: ffff88200b0a1178   R9: 0000000000000000
    R10: ffff88200f64f5e0  R11: 0000000000000000  R12: ffff88200a52cba8
    R13: 0000000000000000  R14: ffff88200a52cba8  R15: ffff88200f652000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff88009a203df8] __scsi_queue_insert at ffffffff813a1cdb
 #8 [ffff88009a203e48] scsi_queue_insert at ffffffff813a2453
 #9 [ffff88009a203e58] scsi_softirq_done at ffffffff813a253d
#10 [ffff88009a203e88] blk_done_softirq at ffffffff81285a05
#11 [ffff88009a203eb8] __do_softirq at ffffffff81085275
#12 [ffff88009a203f38] call_softirq at ffffffff8100c38c
#13 [ffff88009a203f50] do_softirq at ffffffff8100fca5
#14 [ffff88009a203f70] irq_exit at ffffffff81085105
#15 [ffff88009a203f80] do_IRQ at ffffffff81552c65
--- <IRQ stack> ---
#16 [ffffffff81a03d68] ret_from_intr at ffffffff8100ba53
    [exception RIP: intel_idle+254]
    RIP: ffffffff812fc0be  RSP: ffffffff81a03e18  RFLAGS: 00000206
    RAX: 0000000000000000  RBX: ffffffff81a03ea8  RCX: 0000000000000000
    RDX: 00000000000002f5  RSI: 0000000000000000  RDI: 00000000000b8d36
    RBP: ffffffff8100ba4e   R8: 0000000000000005   R9: 0000000000000386
    R10: 000149477ed15797  R11: 0000000000000000  R12: ffff88009a211b80
    R13: ffffffff81a03da8  R14: ffffffff810b71ec  R15: ffffffff81a03d98
    ORIG_RAX: ffffffffffffffac  CS: 0010  SS: 0018
#17 [ffffffff81a03eb0] cpuidle_idle_call at ffffffff81441d0a
#18 [ffffffff81a03ed0] cpu_idle at ffffffff81009fe6


void blk_requeue_request(struct request_queue *, struct request *);
crash> request ffff8802ef1898e0
struct request {
..
..
  q = 0xffff88200a52cba8, 
..
rq_disk = 0xffff882009f62400,

crash> gendisk 0xffff882009f62400
struct gendisk {
  major = 135, 
  first_minor = 48, 
  minors = 16, 
  disk_name = "sdij\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000ij", 

crash> dev -d | grep sdij
  135 ffff882009f62400   sdij       ffff88200a52cba8       1     0     1     0

In vmcore sdij is online

360050768018e02d61800000000000a18  (360050768018e02d61800000000000a18)  dm-2     IBM       2145
size=512000.00M  features='1 queue_if_no_path'  hwhandler=None
+- policy='round-robin'
  `- 1:0:0:0 sdb 8:16           [scsi_device: 0xffff884013303000 sdev_state: SDEV_RUNNING]
  `- 2:0:0:0 sdrr 134:336       [scsi_device: 0xffff884013276000 sdev_state: SDEV_RUNNING]
+- policy='round-robin'
  `- 1:0:1:0 sdij 135:48        [scsi_device: 0xffff88200a50f000 sdev_state: SDEV_RUNNING]
  `- 2:0:1:0 sdaaz 133:624      [scsi_device: 0xffff882008ddd800 sdev_state: SDEV_RUNNING]

Why did we panic

/**
 * blk_requeue_request - put a request back on queue
 * @q:          request queue where request should be inserted
 * @rq:         request to be inserted
 *
 * Description:
 *    Drivers often keep queueing requests until the hardware cannot accept
 *    more, when that condition happens we need to put the request back
 *    on the queue. Must be called with queue lock held.
 */
void blk_requeue_request(struct request_queue *q, struct request *rq)
{
        blk_delete_timer(rq);
        blk_clear_rq_complete(rq);
        trace_block_rq_requeue(q, rq);

        if (blk_rq_tagged(rq))
                blk_queue_end_tag(q, rq);

        BUG_ON(blk_queued_rq(rq));                  ************ UD2

        elv_requeue_request(q, rq);
}

List is empty so we call BUG_ON

struct request {
  queuelist = {
    next = 0xffff88200a52cba8, 
    prev = 0xffff88200a52cba8


#define blk_queued_rq(rq)       (!list_empty(&(rq)->queuelist))  Note (return head->next == head;)

Other notes
------------

We had issues on both scsi host1 and scsi host2
We started with offfline devices on host1 and then we timed on on host2

sd 1:0:1:65: Device offlined - not ready after error recovery
sd 1:0:1:1: Device offlined - not ready after error recovery
sd 1:0:1:1: Device offlined - not ready after error recovery
sd 1:0:1:1: Device offlined - not ready after error recovery
sd 1:0:1:1: Device offlined - not ready after error recovery
sd 1:0:1:1: Device offlined - not ready after error recovery
sd 1:0:1:1: Device offlined - not ready after error recovery
sd 1:0:1:65: [sdkw]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 1:0:1:65: [sdkw] CDB: Write(10): 2a 00 15 56 dd b0 00 00 18 00
end_request: I/O error, dev sdkw, sector 358014384
device-mapper: multipath: Failing path 67:320.
sd 1:0:1:1: [sdik]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 1:0:1:1: [sdik] CDB: Read(10): 28 00 00 58 0a 00 00 02 00 00
end_request: I/O error, dev sdik, sector 5769728
sd 1:0:1:1: [sdik]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 1:0:1:1: [sdik] CDB: Read(10): 28 00 00 58 0e 00 00 02 00 00
end_request: I/O error, dev sdik, sector 5770752
sd 1:0:1:1: [sdik]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 1:0:1:1: [sdik] CDB: Read(10): 28 00 00 58 08 00 00 02 00 00
end_request: I/O error, dev sdik, sector 5769216
sd 1:0:1:1: [sdik]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 1:0:1:1: [sdik] CDB: Read(10): 28 00 00 58 0c 00 00 02 00 00
end_request: I/O error, dev sdik, sector 5770240
sd 1:0:1:1: [sdik]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 1:0:1:1: [sdik] CDB: Read(10): 28 00 00 58 08 00 00 02 00 00
end_request: I/O error, dev sdik, sector 5769216
sd 1:0:1:1: [sdik]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 1:0:1:1: [sdik] CDB: Read(10): 28 00 00 58 0c 00 00 02 00 00
end_request: I/O error, dev sdik, sector 5770240

device-mapper: multipath: Failing path 135:64.
sd 2:0:1:65: Device offlined - not ready after error recovery
sd 2:0:1:1: Device offlined - not ready after error recovery
sd 2:0:1:1: Device offlined - not ready after error recovery
sd 2:0:1:1: Device offlined - not ready after error recovery
sd 2:0:1:1: Device offlined - not ready after error recovery
sd 2:0:1:1: Device offlined - not ready after error recovery
sd 2:0:1:1: Device offlined - not ready after error recovery
sd 2:0:1:1: Device offlined - not ready after error recovery
sd 2:0:1:65: [sdadm]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 2:0:1:65: [sdadm] CDB: Write(10): 2a 00 16 86 e8 28 00 00 28 00
end_request: I/O error, dev sdadm, sector 377940008
sd 2:0:1:1: [sdaba]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 2:0:1:1: [sdaba] CDB: Read(10): 28 00 00 58 08 00 00 02 00 00
end_request: I/O error, dev sdaba, sector 5769216
device-mapper: multipath: Failing path 65:896.
sd 2:0:1:1: [sdaba]  Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
sd 2:0:1:1: [sdaba] CDB: Read(10): 28 00 00 58 0c 00 00 02 00 00
end_request: I/O error, dev sdaba, sector 5770240
device-mapper: multipath: Failing path 133:640.
sd 2:0:1:8: timing out command, waited 7s
sd 2:0:1:152: timing out command, waited 7s
device-mapper: multipath: Failing path 133:752.
device-mapper: multipath: Failing path 70:1008.
sd 1:0:1:212: timing out command, waited 7s
device-mapper: multipath: Failing path 132:368.
sd 2:0:1:169: timing out command, waited 7s
device-mapper: multipath: Failing path 128:768.
sd 1:0:1:216: timing out command, waited 7s
device-mapper: multipath: Failing path 132:432.
sd 2:0:1:170: timing out command, waited 7s
device-mapper: multipath: Failing path 128:784.
sd 1:0:1:11: timing out command, waited 7s
sd 1:0:1:158: timing out command, waited 7s
sd 1:0:1:172: timing out command, waited 7s
device-mapper: multipath: Failing path 135:224.
device-mapper: multipath: Failing path 129:272.
device-mapper: multipath: Failing path 129:496.
sd 2:0:1:209: timing out command, waited 7s
device-mapper: multipath: Failing path 134:544.
device-mapper: multipath: Failing path 134:688.
device-mapper: multipath: Failing path 70:896.
device-mapper: multipath: Failing path 70:944.
device-mapper: multipath: Failing path 70:880.
device-mapper: multipath: Failing path 71:768.
device-mapper: multipath: Failing path 71:896.
device-mapper: multipath: Failing path 71:928.
device-mapper: multipath: Failing path 128:832.
device-mapper: multipath: Failing path 128:864.
device-mapper: multipath: Failing path 129:784.
device-mapper: multipath: Failing path 130:896.
sd 1:0:1:124: timing out command, waited 7s
sd 1:0:1:200: timing out command, waited 7s
device-mapper: multipath: Failing path 70:496.
device-mapper: multipath: Failing path 131:432.

At the time of the panic host2 was in recovery

===
HOST      DRIVER
NAME      NAME                               Scsi_Host                shost_data               &.hostdata[0]           
-------------------------------------------------------------------------------------------------------------------------
host2     lpfc                               ffff88200f64f000         ffff88200b27f000         ffff88200f64f5e0

   DRIVER VERSION      : 0:11.0.0.4
   HOST BUSY           : 226
   HOST BLOCKED        : 0
   HOST FAILED         : 224
   SELF BLOCKED        : 0
   SHOST STATE         : SHOST_RECOVERY
   MAX LUN             : 4096
   CMD/LUN             : 3
   WORK Q NAME         : scsi_wq_2

Comment 67 Ewan D. Milne 2017-05-02 18:01:24 UTC

Dick, see comment # 58.  This patch should go into RHEL6.10.

Comment 68 Dick Kennedy 2017-05-17 18:20:45 UTC

(In reply to Ewan D. Milne from comment #67)
> Dick, see comment # 58.  This patch should go into RHEL6.10.

Yes it makes sense to put it in for the case that Laurence reported in comment 60, I am not sure if it will do anything for the original customer because they were running on Lancer. 

For the sli4 IOs, if you are running lots of IOs and things start to get aborted then the RRQ timer will be started for that Tgt/Lun/XRI combo. While the timer is active that XRI cannot be used by that Tgt/Lun combo. If you have enough IOs in flight and enough that are waiting for RRQ to clear them, then the lpfc_get_scsi_buf could fail to get a usable buffer/XRI and fail the command in 
queue command? I was thinking about the driver trying to find a buffer when this timeout happens and the abort is sent. The driver should fail the abort because it does not know about the command. But what if the timing is perfect and it finds a buffer right after the abort was sent. Then we are in trouble. how would we resolve this? Is the blk layer holding the shost lock when it processes the queue for lpfc? In this race condition, the abort could beat the IO?

The abort_handler is just using the hbalock for synching and the queuecommand is relying that the caller holds the shost lock. The get_scsi_buf is using a get_buf_list lock. 
Ewan can you verify what locks are held by the mid-layer for issuing an IO and aborting it?

Comment 69 Ewan D. Milne 2017-05-17 18:58:12 UTC

In RHEL6, the host_lock will be held during the call to lpfc_queuecommand because
you do not specify .lockless = 1 in the host template.  (A few RHEL6 drivers use
this, but lpfc does not.)  The calls to the various ->eh_ routines, including
->eh_abort_handler(), occur while the SCSI EH thread is running and all other
I/O is stopped on RHEL6, so you will not see other ->queuecommand() calls except
for the ones that the EH generates (e.g. TEST UNIT READY), but these all occur
in the same EH thread and are synchronous, you should not see any concurrency.

(RHEL7 is different, the ->eh_abort_handler() calls are made when the command
timeout expires.  It is possible to disable this by setting .no_async_abort = 1
in the host template, but so far we have not seen any problems that we are sure
were caused by this.  It remains an option for RHEL7, but will go away in the
next major release.)

Where you can run into trouble is in the FC transport code, because there are
multiple worker threads that can be running, and the different code paths take
locks, we have seen problems with that.  But I don't think that is the case here.

If the driver cannot issue an abort or handle one of the other ->eh functions
you should return a failure code and the SCSI EH will escalate.  In the end you
will get reset if nothing else works.

Comment 70 Dick Kennedy 2017-06-08 17:39:16 UTC

I am not sure how i can help fix this issue if the command has not reached the lpfc driver?

I don't remember seeing a issue like this but we have several io path fixes to correct locking. 

I would like to point out that the customer is still on a 10.0.x.x fw and the driver is 10.4.x.x, someone should get them up to date or at least in sync. The fw of a 10.x release should have the driver from that release.

Comment 71 John Pittman 2017-07-14 13:13:49 UTC

Thanks Dick,

I relayed the requests to my customer.  Will report back any news.

Comment 72 John Pittman 2017-07-17 15:02:24 UTC

Please be aware, I have added the customer here within the CC for this bug.

I passed along the suggestions to and they have some concerns.  The customer sited that "if they have specific firmware requirements for a given driver that they *MUST* bring the firmware with the driver and install it on driver load".  I believe this is already done with qlogic.

Additionally an explanation was requested on the request of syncing the firmware and driver versions.  Is there a known error/issue at play?

Comment 73 John Pittman 2017-07-17 15:08:21 UTC

Dick, could you respond to the concerns in comment #72?

Comment 74 Dick Kennedy 2017-07-27 17:29:16 UTC

The driver does not load fw on lpfc. We have always used an like lputil and ocmanager to download fw. Most of the HBAs require a reload of the driver to activate the new fw. 

The management tool:
https://www.broadcom.com/support/download-search/?pg=Storage+Adapters,+Controllers,+and+ICs&pf=Fibre+Channel+Host+Bus+Adapters&pn=LPe12002+FC+Host+Bus+Adapter&po=Emulex&pa=&dk=

The fw image to download: LPe16000-series firmware and boot code version 10.4.255.23

https://www.broadcom.com/support/download-search/?pg=&pf=Fibre+Channel+Host+Bus+Adapters&pn=LPe16002B+FC+Host+Bus+Adapter&po=Emulex&pa=&dk=

Comment 75 Roger Heflin 2017-07-27 18:10:59 UTC

Since we get the driver with RedHat (inbox driver) firmware does not come in at all that way. *IF* there is a firmware dependency in the driver the driver should be informing us of that during startup, or should be loading the firmware itself.

Is this you know what firmware will not work, or is this a case that you have not certified the combination?      

Note we will have firmware/driver combinations all over the place in our environment, and *ONLY* the machines with abnormally high batch loads and SAN's that tend to sometimes lose packets have this crashing issue (3 pairs of database nodes out of >1000 database nodes).

So are you saying that this rhel7 bug/patch:
[scsi] lpfc: Correct panics with eh_timeout and eh_deadline

is not applicable to rhel6.[789]?  Because the description of the bug matches exactly what is happening when the machine crashes (timeouts and deadlines are triggering because of SAN packet loss).

And we really cannot test this in prod, right now we have mostly stabilized it by increasing the timeouts to 120seconds and rearranging the SAN to reduce the incident of packet loss.

Comment 76 loberman 2017-07-27 18:38:46 UTC

Hello Roger

I hope all is well.

As far as FW is concerned, yes its harder on Emulex because the FW is loaded from flash.
Typically HPE for example, if you were using their driver, would provide the matching driver and firmware but with the inbox its a tougher ask.

This fix below is valid for the SL3 adapters

Correct panics with eh_timeout and eh_deadline

We were having double completions on our SLI-3 version of adapters.
Solved by clearing our command pointer before calling scsi_done.

The eh paths potentially ran simulatenously and would see the non-null
value and invoke scsi_done again.

Signed-off-by: Dick Kennedy <dick.kennedy>
Signed-off-by: James Smart <james.smart>

drivers/scsi/lpfc/lpfc_scsi.c | 6 +++---
drivers/scsi/lpfc/lpfc_sli.c | 12 ++++++++----
2 files changed, 11 insertions(+), 7 deletions(-)

We had other issues as well with eh_deadline no longer being settable on RHEL7 where the echo would fail.
Whereas on RHEL6 we have the silent issue where the SLI3 adapters have the eh_host_reset_handler removed in the host template but if you still had it enabled you could fall through to NULL pointer and panic. This is not yet fixed in RHEL6 kernels but is in progress. Another one found by the genius of David Jeffery.

Your very low time-outs set prior are the catalyst for this race in this BZ.

I think Dick Kennedy was just making a point that the FW is way behind the driver here but we do of course understand this can get out of sync sometimes.

"
I would like to point out that the customer is still on a 10.0.x.x fw and the driver is 10.4.x.x, someone should get them up to date or at least in sync. The fw of a 10.x release should have the driver from that release.
"

I don't think the firmware not matching is the root of this very low time-out setting race as described originally by David Jeffery.

The fix you called out is valid and needs to get into RHEL6.9 zstream if its possible.

Thanks
Laurence

Comment 77 Laurie Barry 2017-08-21 17:39:17 UTC

We'll include this patch in the RHEL6.10 and RHEL7.5 patchset.

Laurie

Comment 78 Darren Lawrence 2017-10-27 20:26:09 UTC

Will the patches also, or can they be included in the 6.9 zstream and possible even in the 7.3 zstream? I know the Customer here will be pressing hard for it in 6.9 and possible 7.3.

Comment 79 Rob Evers 2017-11-15 21:07:27 UTC

From initial description:

Actual results:
System crashes when using a very short SCSI timeout

Expected results:
System should continue to run even if an overly short SCSI timeout is selected.

The timeout value is unreasonably low.

This problem is not going to be resolved in rhel6.10.

Comment 80 Rob Evers 2017-11-15 21:08:31 UTC

Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/