Bug 1505942 - runner/tcmu: returns incorrect response sizes for non RW commands
Status: ASSIGNED
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: iSCSI
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 4.0
Assigned To: Mike Christie
QA Contact: Tejas
Docs Contact: Bara Ancincova
Depends On:
Blocks: 1494421
Reported: 2017-10-24 11:40 EDT by Mike Christie
Modified: 2018-10-20 13:03 EDT

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
.The iSCSI gateway can fail to scan or set up LUNs
When using the iSCSI gateway, Linux initiators can report `kzalloc` failures because the requested buffers are too large. In addition, VMware ESX initiators can report `READ_CAP` failures because the data cannot be copied. As a consequence, the iSCSI gateway fails to scan or set up Logical Unit Numbers (LUNs), to find or rediscover devices, and to add the devices back after path failures.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug


Attachments: None
Description Mike Christie 2017-10-24 11:40:29 EDT
Description of problem:

On the Linux initiator side, you might see kzalloc failures because the requested buffers are too large. On ESX, you might see READ_CAP failures because it cannot copy the data. LUN scanning and setup might then fail, so devices are not found, or are not rediscovered and added back after path failures.

Comment 10 Jason Dillaman 2017-10-26 15:48:46 EDT
Oct 10 18:16:09 localhost kernel: WARNING: CPU: 12 PID: 9299 at mm/page_alloc.c:2902 __alloc_pages_slowpath+0x6f/0x724
Oct 10 18:16:09 localhost kernel: Modules linked in: ext4 mbcache jbd2 dm_queue_length iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd mei_me iTCO_wdt sg iTCO_vendor_support joydev mxm_wmi lpc_ich mei pcspkr i2c_i801 shpchp ipmi_ssif ipmi_si ipmi_devintf wmi ipmi_msghandler acpi_power_meter nfsd auth_rpcgss nfs_acl lockd dm_multipath grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ixgbe drm ahci libahci libata crct10dif_pclmul be2net megaraid_sas crct10dif_common crc32c_intel mdio ptp i2c_core pps_core dca dm_mirror dm_region_hash
Oct 10 18:16:09 localhost kernel: dm_log dm_mod
Oct 10 18:16:09 localhost kernel: CPU: 12 PID: 9299 Comm: kworker/u40:2 Not tainted 3.10.0-693.el7.x86_64 #1
Oct 10 18:16:09 localhost kernel: Hardware name: FUJITSU PRIMERGY RX2530 M2/D3279-B1, BIOS V5.0.0.11 R1.7.0 for D3279-B1x                     04/21/2016
Oct 10 18:16:09 localhost kernel: Workqueue: kmpath_handlerd activate_path [dm_multipath]
Oct 10 18:16:09 localhost kernel: 0000000000000000 000000001dde5f0b ffff8807766379e0 ffffffff816a3d91
Oct 10 18:16:09 localhost kernel: ffff880776637a20 ffffffff810879c8 00000b5681033619 0000000000100010
Oct 10 18:16:09 localhost kernel: 0000000000104010 ffff88087ffd7000 0000000000000000 0000000000124010
Oct 10 18:16:09 localhost kernel: Call Trace:
Oct 10 18:16:09 localhost kernel: [<ffffffff816a3d91>] dump_stack+0x19/0x1b
Oct 10 18:16:09 localhost kernel: [<ffffffff810879c8>] __warn+0xd8/0x100
Oct 10 18:16:09 localhost kernel: [<ffffffff81087b0d>] warn_slowpath_null+0x1d/0x20
Oct 10 18:16:09 localhost kernel: [<ffffffff8169f723>] __alloc_pages_slowpath+0x6f/0x724
Oct 10 18:16:09 localhost kernel: [<ffffffff8109927e>] ? try_to_del_timer_sync+0x5e/0x90
Oct 10 18:16:09 localhost kernel: [<ffffffff8118cd85>] __alloc_pages_nodemask+0x405/0x420
Oct 10 18:16:09 localhost kernel: [<ffffffff811d1108>] alloc_pages_current+0x98/0x110
Oct 10 18:16:09 localhost kernel: [<ffffffff8118760e>] __get_free_pages+0xe/0x40
Oct 10 18:16:09 localhost kernel: [<ffffffff811dcaae>] kmalloc_order_trace+0x2e/0xa0
Oct 10 18:16:09 localhost kernel: [<ffffffff811e0641>] __kmalloc+0x211/0x230
Oct 10 18:16:09 localhost kernel: [<ffffffff8147b376>] realloc_buffer+0x36/0x70
Oct 10 18:16:09 localhost kernel: [<ffffffff8147b8bb>] alua_rtpg+0x50b/0x630
Oct 10 18:16:09 localhost kernel: [<ffffffff810cd794>] ? update_curr+0x104/0x190
Oct 10 18:16:09 localhost kernel: [<ffffffff810ca29e>] ? account_entity_dequeue+0xae/0xd0
Oct 10 18:16:09 localhost kernel: [<ffffffff810cdc7c>] ? dequeue_entity+0x11c/0x5d0
Oct 10 18:16:09 localhost kernel: [<ffffffffc00dfc20>] ? reinstate_path+0x180/0x180 [dm_multipath]
Oct 10 18:16:09 localhost kernel: [<ffffffff8147ba17>] alua_activate+0x37/0x2a0
Oct 10 18:16:09 localhost kernel: [<ffffffffc00dfc20>] ? reinstate_path+0x180/0x180 [dm_multipath]
Oct 10 18:16:09 localhost kernel: [<ffffffff81477793>] scsi_dh_activate+0xc3/0x160
Oct 10 18:16:09 localhost kernel: [<ffffffffc00dfeaa>] activate_path+0x5a/0x60 [dm_multipath]
Oct 10 18:16:09 localhost kernel: [<ffffffff810a881a>] process_one_work+0x17a/0x440
Oct 10 18:16:09 localhost kernel: [<ffffffff810a94e6>] worker_thread+0x126/0x3c0
Oct 10 18:16:09 localhost kernel: [<ffffffff810a93c0>] ? manage_workers.isra.24+0x2a0/0x2a0
Oct 10 18:16:09 localhost kernel: [<ffffffff810b098f>] kthread+0xcf/0xe0
Oct 10 18:16:09 localhost kernel: [<ffffffff8108ddeb>] ? do_exit+0x6bb/0xa40
Oct 10 18:16:09 localhost kernel: [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
Oct 10 18:16:09 localhost kernel: [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
Oct 10 18:16:09 localhost kernel: [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
Oct 10 18:16:09 localhost kernel: ---[ end trace 560d82aec7daf644 ]---
Oct 10 18:16:09 localhost kernel: sd 11:0:0:40: alua_rtpg: kmalloc buffer failed
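
The trace above ends with alua_rtpg calling realloc_buffer, which suggests the initiator tried to grow its buffer to a length taken from the target's response. The Python sketch below illustrates that mechanism under stated assumptions; it is not the kernel code. A REPORT TARGET PORT GROUPS (RTPG) response begins with a 4-byte big-endian RETURN DATA LENGTH field, and an initiator typically sizes its buffer from that field. The function names, the 128-byte starting buffer, and the 4 MiB allocation cap are hypothetical and chosen only for illustration. The bug report does not say which field or size tcmu-runner reported incorrectly.

```python
import struct

# Assumed allocation cap for illustration only; the real kernel limit
# depends on the page size and kernel configuration.
ASSUMED_ALLOC_CAP = 4 * 1024 * 1024


def rtpg_required_len(response: bytes) -> int:
    """Return the buffer size implied by an RTPG response header.

    Bytes 0-3 hold RETURN DATA LENGTH (big-endian), which counts the
    bytes that follow the 4-byte header, so the full size is that + 4.
    """
    (data_len,) = struct.unpack_from(">I", response, 0)
    return data_len + 4


def initiator_rtpg(response: bytes, bufflen: int = 128) -> int:
    """Hypothetical initiator step: grow the buffer if the target asks.

    Returns the buffer length the initiator would use for the retry.
    Raises MemoryError when the requested size exceeds the assumed cap,
    standing in for a failed kernel allocation.
    """
    needed = rtpg_required_len(response)
    if needed <= bufflen:
        return bufflen  # current buffer is already large enough
    if needed > ASSUMED_ALLOC_CAP:
        raise MemoryError(f"cannot allocate {needed} bytes for RTPG buffer")
    return needed


if __name__ == "__main__":
    # Well-formed response: one 8-byte group descriptor plus one 4-byte port entry.
    good = struct.pack(">I", 12) + bytes(12)
    print(initiator_rtpg(good))

    # Response whose length field is far larger than any real payload.
    bogus = struct.pack(">I", 0xFFFFFFF0)
    try:
        initiator_rtpg(bogus)
    except MemoryError as exc:
        print("allocation failed:", exc)
```

In this sketch a well-formed response leaves the buffer unchanged, while a response that advertises an oversized length hits the cap. That is loosely analogous to the "kmalloc buffer failed" message in the log, after which the path cannot be activated.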
Comment 18 Brett Niver 2017-10-30 09:49:14 EDT
Moving to 3.1
