RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1678785 - Loading table with too big physical size spams of type 6 VIO producing System Error -5 (plus soft lockup)
Summary: Loading table with too big physical size spams of type 6 VIO producing System...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kmod-kvdo
Version: 8.0
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: 8.0
Assignee: corwin
QA Contact: vdo-qe
URL:
Whiteboard:
Depends On: 1682560
Blocks: 1687005
TreeView+ depends on / blocked
 
Reported: 2019-02-19 15:18 UTC by Jakub Krysl
Modified: 2021-09-06 15:17 UTC (History)
4 users (show)

Fixed In Version: 6.2.1.23
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1687005 (view as bug list)
Environment:
Last Closed: 2019-11-05 22:12:27 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3548 0 None None None 2019-11-05 22:12:42 UTC

Description Jakub Krysl 2019-02-19 15:18:52 UTC
Description of problem:
Loading new table with bigger physical size (without growing the device under) produces these errors:
[ 6425.517509] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108652269 with error: System error -5 (-5) 

This message seems to be produced for every extra physical block (that just is not there), so if the given size is much bigger, this is lots of spam.

Plus if there is iscsi device connected on the system, I am getting iscsid soft lockup trace (leaving the -5 errors in between as they appeared in the log):
[ 6352.289203]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4301009255, last ping 4301014529, now 4301019648 
[ 6356.005253] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651718 with error: System error -5 (-5) 
[ 6379.976348] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [iscsid:962] 
[ 6380.104410] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651719 with error: System error -5 (-5) 
[ 6380.152540] Modules linked in: nfsv3 nfs_acl dm_service_time rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache kvdo(O) uds(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc dm_multipath intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp 
[ 6380.203568] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651720 with error: System error -5 (-5) 
[ 6380.254415]  coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support 
[ 6380.305065] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651721 with error: System error -5 (-5) 
[ 6380.355872]  intel_cstate ipmi_ssif intel_uncore intel_rapl_perf i2c_i801 pcspkr sg 
[ 6380.406400] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651722 with error: System error -5 (-5) 
[ 6380.460047]  lpc_ich hpilo hpwdt ipmi_si ipmi_devintf acpi_tad ipmi_msghandler ioatdma wmi acpi_power_meter xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci libahci drm libata ixgbe crc32c_intel serio_raw tg3 mdio dca dm_mirror dm_region_hash dm_log dm_mod 
[ 6380.460072] CPU: 3 PID: 962 Comm: iscsid Kdump: loaded Tainted: G           O     --------- -  - 4.18.0-67.el8.x86_64 #1 
[ 6380.510838] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651723 with error: System error -5 (-5) 
[ 6380.563913] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 10/25/2017 
[ 6380.563918] RIP: 0010:native_queued_spin_lock_slowpath+0x1d9/0x1f0 
[ 6380.563920] Code: ff ff ff f3 90 48 8b 02 48 85 c0 74 f6 48 89 c6 e9 60 ff ff ff 89 c1 e9 6d fe ff ff 85 c9 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 c3 66 2e 0f 1f 84 00 00 00 00 
[ 6380.563921] RSP: 0018:ffffb092c175fe58 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13 
[ 6380.563923] RAX: 0000000000140101 RBX: ffff9531ead1ec88 RCX: 0000000000000001 
[ 6380.563924] RDX: 0000000000000101 RSI: 0000000000000000 RDI: ffff9531ead1ec88 
[ 6380.563925] RBP: ffff9531ead1ec00 R08: 0000000000000008 R09: 0000000000000000 
[ 6380.563926] R10: 0000000000000000 R11: ffff9531ead1ec00 R12: 00007ffe885746c0 
[ 6380.563927] R13: 000000000000000d R14: 0000000000000001 R15: 0000000000000008 
[ 6380.563929] FS:  00007f910f669d00(0000) GS:ffff953237ac0000(0000) knlGS:0000000000000000 
[ 6380.563933] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[ 6380.615645] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651724 with error: System error -5 (-5) 
[ 6380.670862] CR2: 000055bf6f526094 CR3: 000000026610c006 CR4: 00000000001606e0 
[ 6380.670863] Call Trace: 
[ 6380.670868]  _raw_spin_lock_bh+0x27/0x30 
[ 6380.670873]  lock_sock_nested+0x1e/0x60 
[ 6380.721490] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651725 with error: System error -5 (-5) 
[ 6380.773148]  sock_setsockopt+0x6d/0xa10 
[ 6380.773152]  __sys_setsockopt+0xce/0xe0 
[ 6380.822598] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651726 with error: System error -5 (-5) 
[ 6380.874459]  __x64_sys_setsockopt+0x20/0x30 
[ 6380.874463]  do_syscall_64+0x5b/0x1b0 
[ 6380.925584] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651727 with error: System error -5 (-5) 
[ 6380.976692]  entry_SYSCALL_64_after_hwframe+0x65/0xca 
[ 6380.976695] RIP: 0033:0x7f910e19608e 
[ 6380.976696] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b1 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ca fd 2b 00 f7 d8 64 89 01 48 
[ 6380.976697] RSP: 002b:00007ffe885746b8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036 
[ 6380.976699] RAX: ffffffffffffffda RBX: 000055bf6f535c98 RCX: 00007f910e19608e 
[ 6380.976700] RDX: 000000000000000d RSI: 0000000000000001 RDI: 0000000000000009 
[ 6380.976701] RBP: 000055bf6f530220 R08: 0000000000000008 R09: 0000000000000005 
[ 6380.976702] R10: 00007ffe885746c0 R11: 0000000000000293 R12: 000055bf6f538428 
[ 6380.976702] R13: 0000000000000003 R14: 0000000000000001 R15: 000055bf6f538160

Version-Release number of selected component (if applicable):
kernel-4.18.0-67.el8.x86_64
vdo-6.2.0.293-10.el8.x86_64
kmod-kvdo-6.2.0.293-47.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. vdo create --name vdo --device /dev/mapper/rhel_storageqe--74-home
2. dmsetup table                                                                                                                                                                               
mpatha: 0 20981760 multipath 0 1 alua 1 1 service-time 0 1 2 8:16 1 1           
rhel_storageqe--74-home: 0 853467136 linear 8:2 16345088                                                                                                                                                           
rhel_storageqe--74-swap: 0 16343040 linear 8:2 2048                                                                                                                                                                
rhel_storageqe--74-root: 0 104857600 linear 8:2 869812224                                
vdo: 0 845632240 vdo V2 /dev/disk/by-id/dm-uuid-LVM-t9CAkhzrxCmr3d9xS6dx1zjmExag3plHk8LIlJWgv0kmRu26GLtQqwbotyDVPMcM 106683392 4096 32768 16380 on auto vdo maxDiscard 1 ack 1 bio 4 bioRotationInterval 64 cpu 2 h
ash 1 logical 1 physical 1
3. dmsetup suspend vdo
# load table with size 106683392 -> 108683392
4. dmsetup reload vdo --table '0 845632240 vdo V2 /dev/disk/by-id/dm-uuid-LVM-t9CAkhzrxCmr3d9xS6dx1zjmExag3plHk8LIlJWgv0kmRu26GLtQqwbotyDVPMcM 108683392 4096 32768 16380 on auto vdo maxDiscar
d 1 ack 1 bio 4 bioRotationInterval 64 cpu 2 hash 1 logical 1 physical 1'                                                                                                                                          
5. dmsetup resume vdo                   
device-mapper: resume ioctl on vdo  failed: Input/output error                                                                                                                                                     
Command failed.              

Actual results:
spam of [ 6380.510838] kvdo1:journalQ: Completing write VIO of type 6 for physical block 108651723 with error: System error -5 (-5)
plus [ 6379.976348] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [iscsid:962]

Expected results:
No spam, no soft lockup

Additional info:
Placed the spam and soft lockup into single ticket as per discussion with sweettea.

Comment 5 Jakub Krysl 2019-07-25 10:10:10 UTC
# vdo create --name vdo --device /dev/sda
Creating VDO vdo
Starting VDO vdo
Starting compression on VDO vdo
VDO instance 2 volume is ready at /dev/mapper/vdo
# dmsetup table
vdo: 0 19487242776 vdo V2 /dev/disk/by-id/scsi-35000cca26c1a0a38 2441609216 4096 32768 16380 on auto vdo maxDiscard 1 ack 1 bio 4 bioRotationInterval 64 cpu 2 hash 1 logical 1 physical 1
# dmsetup suspend vdo
# dmsetup reload vdo --table '0 19487242776 vdo V2 /dev/disk/by-id/scsi-35000cca26c1a0a38 2851209216 4096 32768 16380 on auto vdo maxDiscard 1 ack 1 bio 4 bioRotationInterval 64 cpu 2 hash 1 logical 1 physical 1'
# dmsetup suspend vdo
# dmsetup resume vdo
device-mapper: resume ioctl on vdo  failed: Input/output error
Command failed.

syslog:
[15577.848792] updateVIOErrorStats: 2038 callbacks suppressed
[15577.848805] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176448 with error: System error -5 (-5)
[15577.848808] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176384 with error: System error -5 (-5)
[15577.848812] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176449 with error: System error -5 (-5)
[15577.848814] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176512 with error: System error -5 (-5)
[15577.848818] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176385 with error: System error -5 (-5)
[15577.848822] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176513 with error: System error -5 (-5)
[15577.848825] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176450 with error: System error -5 (-5)
[15577.848828] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176451 with error: System error -5 (-5)
[15577.848830] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176386 with error: System error -5 (-5)
[15577.848832] kvdo2:journalQ: Completing write VIO of type 6 for physical block 2851176514 with error: System error -5 (-5)
[15577.970134] kvdo2:dmsetup: resize operation failed, result = -5
[15577.976060] kvdo2:dmsetup: Commit of modifications to device 'vdo' failed: System error -5 (-5)
[15577.984763] kvdo2:journalQ: Unrecoverable error, entering read-only mode: System error -5 (-5)
[15577.993423] device-mapper: table: 253:0: vdo: preresume failed, error = -5

The spam is gone, only last 10 messages get printed. No more soft lockups because of that.

Note: The number of callbacks suppressed is proportional to metadata size, not number of extra nonexistent physical blocks.

Comment 8 errata-xmlrpc 2019-11-05 22:12:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3548


Note You need to log in before you can comment on or make changes to this bug.