Bug 1501367 - lsmcli local-disk-list causes filesystem crash on Dell Precision 690 with disk HDS725050KLA360
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libstoragemgmt
Version: 7.4
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
: ---
Assignee: Gris Ge
QA Contact: Jakub Krysl
URL:
Whiteboard:
Duplicates: 1508198
Depends On:
Blocks: 1508198 1511467 1511468
Reported: 2017-10-12 12:58 UTC by Jakub Krysl
Modified: 2021-09-03 14:11 UTC (History)

Fixed In Version: libstoragemgmt-1.6.0-1.el7
Doc Type: Bug Fix
Doc Text:
Previously, when using the mptsas drivers, the "lsmcli local-disk-list" and "lsmcli ldl" commands in some cases caused the file system to terminate unexpectedly. With this update, the underlying code has been fixed, which prevents the described problem from occurring.
Clone Of:
: 1508198 1511467 1511468 (view as bug list)
Environment:
Last Closed: 2018-04-10 15:37:02 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2018:0864 | 0 | None | None | None | 2018-04-10 15:37:15 UTC

Description Jakub Krysl 2017-10-12 12:58:37 UTC
Description of problem:
Running 'lsmcli local-disk-list' on a Dell Precision 690 with disk HDS725050KLA360 takes longer than usual and causes the filesystem to crash:
[  520.800699] XFS (dm-0): metadata I/O error: block 0x3202479 ("xlog_iodone") error 5 numblks 64
[  520.809374] XFS (dm-0): Log I/O Error Detected.  Shutting down filesystem
[  520.816163] XFS (dm-0): Please umount the filesystem and rectify the problem(s)
[  522.248594] beah-beaker-backend[1326]: Traceback (most recent call last):
[  522.251278] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/logging/handlers.py", line 863, in emit
[  522.251635] beah-beaker-backend[1326]: gaierror: [Errno -2] Name or service not known
[  522.251957] beah-beaker-backend[1326]: Logged from file log.py, line 453
[  522.252376] beah-beaker-backend[1326]: 2017-10-12 12:45:45,711 backend.twisted emit: ERROR Unhandled error in Deferred:
[  522.252727] beah-beaker-backend[1326]: Traceback (most recent call last):
[  522.253072] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/logging/__init__.py", line 875, in emit
[  522.253415] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/logging/__init__.py", line 835, in flush
[  522.253746] beah-beaker-backend[1326]: IOError: [Errno 5] Input/output error
[  522.254084] beah-beaker-backend[1326]: Logged from file log.py, line 453
[  522.254399] beah-beaker-backend[1326]: Unhandled error in Deferred:
[  522.294435] beah-beaker-backend[1326]: Traceback (most recent call last):
[  522.294837] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/logging/handlers.py", line 863, in emit
[  522.295197] beah-beaker-backend[1326]: gaierror: [Errno -2] Name or service not known
[  522.295518] beah-beaker-backend[1326]: Logged from file log.py, line 453
[  522.295923] beah-beaker-backend[1326]: 2017-10-12 12:45:45,762 backend.twisted emit: ERROR Unhandled Error
[  522.296273] beah-beaker-backend[1326]: Traceback (most recent call last):
[  522.296598] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1169, in run
[  522.296919] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1178, in mainLoop
[  522.297262] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 800, in runUntilCurrent
[  522.297588] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/task.py", line 215, in __call__
[  522.297909] beah-beaker-backend[1326]: --- <exception caught here> ---
[  522.298244] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/defer.py", line 134, in maybeDeferred
[  522.298563] beah-beaker-backend[1326]: File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 1591, in flush
[  522.298910] beah-beaker-backend[1326]: File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 1845, in runtime_sync
[  522.299281] beah-beaker-backend[1326]: File "/usr/lib/python2.7/site-packages/beah/misc/runtimes.py", line 344, in sync
[  522.299600] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/shelve.py", line 169, in sync
[  522.299919] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/bsddb/__init__.py", line 347, in sync
[  522.300274] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/bsddb/dbutils.py", line 68, in DeadlockWrap
[  522.300600] beah-beaker-backend[1326]: bsddb.db.DBError: (5, 'Input/output error -- BDB0151 fsync: Input/output error')
[  522.300911] beah-beaker-backend[1326]: Traceback (most recent call last):
[  522.301829] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/logging/__init__.py", line 875, in emit
[  522.302214] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/logging/__init__.py", line 835, in flush
[  522.302578] beah-beaker-backend[1326]: IOError: [Errno 5] Input/output error
[  522.302898] beah-beaker-backend[1326]: Logged from file log.py, line 453
[  522.303267] beah-beaker-backend[1326]: Unhandled Error
[  522.303588] beah-beaker-backend[1326]: Traceback (most recent call last):
[  522.303896] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1169, in run
[  522.304254] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1178, in mainLoop
[  522.304572] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 800, in runUntilCurrent
[  522.304935] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/task.py", line 215, in __call__
[  522.305283] beah-beaker-backend[1326]: --- <exception caught here> ---
[  522.305597] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/site-packages/twisted/internet/defer.py", line 134, in maybeDeferred
[  522.305943] beah-beaker-backend[1326]: File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 1591, in flush
[  522.306279] beah-beaker-backend[1326]: File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 1845, in runtime_sync
[  522.306592] beah-beaker-backend[1326]: File "/usr/lib/python2.7/site-packages/beah/misc/runtimes.py", line 344, in sync
[  522.639149] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/shelve.py", line 169, in sync
[  522.639507] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/bsddb/__init__.py", line 347, in sync
[  522.639841] beah-beaker-backend[1326]: File "/usr/lib64/python2.7/bsddb/dbutils.py", line 68, in DeadlockWrap
[  522.640201] beah-beaker-backend[1326]: bsddb.db.DBError: (5, 'Input/output error -- BDB0151 fsync: Input/output error')

The same console output appears with libstoragemgmt versions 1.4.0-3 and 1.5.0-2, but the command-line error differs:
lsm 1.4.0-3:
# lsmcli local-disk-list
LIB_BUG(1): BUG: Got incorrect VPD page code '2e', should be 0x83
lsm 1.5.0-2:
# lsmcli local-disk-list
WARN: rpm_get('/dev/sda'): 1 BUG: Failed to open /dev/sda, error: 6, No such device or addressWARN: link$
-------------------------------------------------------------------------
/dev/sda | 5000cca20dde580e | No Support | KRVN67ZAJ4SA3F | Unknown

With the latter, it takes a few seconds longer for the filesystem to crash.

Note: This bug is probably not caused by lsm itself, but I do not know which command invoked by 'lsmcli local-disk-list' triggers it, so I cannot assign it to the right component.

Version-Release number of selected component (if applicable):
both libstoragemgmt-1.4.0-3 and libstoragemgmt-1.5.0-2

How reproducible:
100%

Steps to Reproduce:
1. Run 'lsmcli local-disk-list'
2. Wait 10 seconds
3. Observe the filesystem crash

Actual results:
Error on the command line and a filesystem crash within a few seconds

Expected results:
No error and no fs crash

Additional info:
# lshw -class disk
  *-disk
       description: ATA Disk
       product: HDS725050KLA360
       physical id: 0.0.0
       bus info: scsi@8:0.0.0
       logical name: /dev/sda
       version: AB5A
       serial: ##############
       size: 465GiB (500GB)
       capacity: 465GiB (500GB)
       capabilities: 15000rpm partitioned partitioned:dos
       configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512 signature=000ed8d2
  *-cdrom:0
       description: CD-R/CD-RW writer
       product: CD-RW GCE-8487B
       vendor: HL-DT-ST
       physical id: 0.0.0
       bus info: scsi@0:0.0.0
       logical name: /dev/cdrom
       logical name: /dev/sr0
       version: F109
       capabilities: removable audio cd-r cd-rw
       configuration: ansiversion=5 status=nodisc
  *-cdrom:1
       description: DVD writer
       product: DVD+-RW GWA4164B
       vendor: HL-DT-ST
       physical id: 0.1.0
       bus info: scsi@0:0.1.0
       logical name: /dev/sr1
       version: E111
       serial: [
       capabilities: removable audio cd-r cd-rw dvd dvd-r
       configuration: ansiversion=5 status=nodisc

# lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
fd0                            2:0    1     4K  0 disk
sda                            8:0    0 465.8G  0 disk
├─sda1                         8:1    0     1G  0 part /boot
└─sda2                         8:2    0 464.8G  0 part
  ├─rhel_dell--p690--01-root 253:0    0    50G  0 lvm  /
  ├─rhel_dell--p690--01-swap 253:1    0     2G  0 lvm  [SWAP]
  └─rhel_dell--p690--01-home 253:2    0 412.8G  0 lvm  /home
sr0                           11:0    1  1024M  0 rom
sr1                           11:1    1  1024M  0 rom

# df -h 
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/mapper/rhel_dell--p690--01-root              50G  2.0G   49G   4% /
devtmpfs                                         908M     0  908M   0% /dev
tmpfs                                            919M     0  919M   0% /dev/shm
tmpfs                                            919M  8.6M  911M   1% /run
tmpfs                                            919M     0  919M   0% /sys/fs/cgroup
/dev/sda1                                       1014M  146M  869M  15% /boot
/dev/mapper/rhel_dell--p690--01-home             413G   33M  413G   1% /home
NetApp###################:/qe-data/kdump_cores  973G  145G  829G  15% /var/crash
tmpfs                                            184M     0  184M   0% /run/user/0

Comment 3 Gris Ge 2017-10-17 15:39:43 UTC
Root cause:

To query the link speed of an ATA device, libstoragemgmt sends the CDB Inquiry `12 01 89 ff ff 00`, where the data size is 0xffff, whereas the data size of page 0x89 should be 572 (0x023c) according to SAT-4.

A patch (for libstoragemgmt) has been sent upstream:
https://github.com/libstorage/libstoragemgmt/pull/316
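
To make the byte layout in the root cause concrete, here is a minimal sketch of how a 6-byte INQUIRY CDB for a VPD page is assembled. The function and constant names are hypothetical and are not taken from libstoragemgmt; the 572-byte length is the value quoted above, and this report does not state which value the upstream patch actually uses.

```python
import struct

INQUIRY_OPCODE = 0x12      # SCSI INQUIRY
ATA_INFO_VPD_PAGE = 0x89   # VPD page queried in this bug
ATA_INFO_VPD_LEN = 572     # 0x023c, the size quoted from SAT-4 above


def build_vpd_inquiry_cdb(page_code: int, alloc_len: int) -> bytes:
    """Build a 6-byte INQUIRY CDB requesting a VPD page (hypothetical helper)."""
    if not 0 <= page_code <= 0xFF:
        raise ValueError("page code must fit in one byte")
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in two bytes")
    evpd = 0x01      # EVPD bit set: request a vital product data page
    control = 0x00
    # Layout: opcode, EVPD, page code, big-endian 16-bit length, control
    return struct.pack(">BBBHB", INQUIRY_OPCODE, evpd, page_code, alloc_len, control)


def cdb_hex(cdb: bytes) -> str:
    """Format a CDB the way the kernel log prints it."""
    return " ".join(f"{b:02x}" for b in cdb)


# The request described in the root cause, with data size 0xffff
old_cdb = build_vpd_inquiry_cdb(ATA_INFO_VPD_PAGE, 0xFFFF)
# The same request sized to the 572 bytes quoted above
new_cdb = build_vpd_inquiry_cdb(ATA_INFO_VPD_PAGE, ATA_INFO_VPD_LEN)

print(cdb_hex(old_cdb))
print(cdb_hex(new_cdb))
```

The only difference between the two requests is bytes 3 and 4, which carry the data size.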

Comment 4 Gris Ge 2017-10-17 15:43:37 UTC
The kernel message might be helpful:

[ 2753.713018] mptscsih: ioc0: attempting task abort! (sc=ffff880079b5fa00)
[ 2753.719722] sd 8:0:0:0: [sda] CDB: Inquiry 12 01 89 ff ff 00
[ 2757.481079] mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000) cb_idx mptscsih_io_done
[ 2757.482071] mptscsih: ioc0: task abort: SUCCESS (rv=2002) (sc=ffff880079b5fa00)
[ 2757.505052] sd 8:0:0:0: [sda] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2757.505056] sd 8:0:0:0: [sda] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2757.505068] sd 8:0:0:0: [sda] CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[ 2757.505076] blk_update_request: I/O error, dev sda, sector 0
[ 2757.505133] XFS (dm-0): metadata I/O error: block 0x320b79a ("xlog_iodone") error 5 numblks 64
[ 2757.505138] XFS (dm-0): xfs_do_force_shutdown(0x2) called from line 1200 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffc0471ea0
[ 2757.505186] XFS (dm-0): Log I/O Error Detected.  Shutting down filesystem
[ 2757.505188] XFS (dm-0): Please umount the filesystem and rectify the problem(s)
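
When scanning logs like the one above on other machines, a small parser can pick out INQUIRY CDBs and report the requested data size. This is only a sketch: the regular expression and the `decode_inquiry` helper are assumptions based on the two "CDB:" lines shown here, not on any documented kernel log format.

```python
import re

# Matches e.g. "... CDB: Inquiry 12 01 89 ff ff 00" (pattern inferred from this log)
CDB_LINE = re.compile(r"CDB: (?P<name>.+?) (?P<bytes>(?:[0-9a-f]{2} ?)+)$")


def decode_inquiry(line: str):
    """Return the INQUIRY fields found in a kernel log line, or None."""
    match = CDB_LINE.search(line.strip())
    if match is None:
        return None
    cdb = bytes.fromhex(match.group("bytes"))
    # Only 6-byte CDBs with opcode 0x12 are INQUIRY commands
    if len(cdb) != 6 or cdb[0] != 0x12:
        return None
    return {
        "evpd": bool(cdb[1] & 0x01),
        "page_code": cdb[2],
        "alloc_len": int.from_bytes(cdb[3:5], "big"),
    }


inquiry = decode_inquiry("[ 2753.719722] sd 8:0:0:0: [sda] CDB: Inquiry 12 01 89 ff ff 00")
sync_cache = decode_inquiry(
    "[ 2757.505068] sd 8:0:0:0: [sda] CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00"
)
print(inquiry)
print(sync_cache)
```

For the Inquiry line the decoded data size is 0xffff, which matches the root cause given in comment 3; the Synchronize Cache line is ignored because it is not an INQUIRY.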

Comment 7 Jakub Krysl 2017-10-24 11:34:23 UTC
No longer crashes the filesystem, but there is a python ImportError:

# lsmcli local-disk-list
Traceback (most recent call last):
  File "/usr/bin/lsmcli", line 18, in <module>
    from lsm.lsmcli import cmd_line_wrapper
  File "/usr/lib/python2.7/site-packages/lsm/__init__.py", line 21, in <module>
    from lsm._local_disk import LocalDisk
  File "/usr/lib/python2.7/site-packages/lsm/_local_disk.py", line 22, in <module>
    from lsm._clib import (_local_disk_vpd83_search, _local_disk_vpd83_get,
ImportError: /usr/lib64/libyajl.so.2: file too short
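
A "file too short" error from the loader usually points to a truncated library file. As a rough way to check a suspect file, the sketch below tests only whether it is large enough to hold a 64-bit ELF header and starts with the ELF magic bytes. The helper name is hypothetical, and passing this check does not prove the library is intact.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"
ELF64_HEADER_SIZE = 64  # size of the ELF header on x86_64


def elf_looks_damaged(path: str) -> bool:
    """Return True if the file cannot contain a complete, valid-looking ELF header."""
    if os.path.getsize(path) < ELF64_HEADER_SIZE:
        return True
    with open(path, "rb") as handle:
        return handle.read(len(ELF_MAGIC)) != ELF_MAGIC


# Demonstration on temporary files rather than on a real library
with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty.so")
    open(empty, "wb").close()              # zero-length file, as a crash may leave behind

    padded = os.path.join(tmp, "padded.so")
    with open(padded, "wb") as handle:
        handle.write(ELF_MAGIC + b"\x00" * 60)  # header-sized file with ELF magic

    empty_damaged = elf_looks_damaged(empty)
    padded_damaged = elf_looks_damaged(padded)

print(empty_damaged, padded_damaged)
```

On an affected system the function would be pointed at the path named in the ImportError, for example `elf_looks_damaged("/usr/lib64/libyajl.so.2")`.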

Comment 8 Gris Ge 2017-10-31 16:18:00 UTC
Hi Jakub,

It looks like /usr/lib64/libyajl.so.2 was corrupted by the previous filesystem crash.
I tried the `lsmcli ldl` command on my VM, and it works well.

Could you install OS from scratch and try again?

Thanks.

Comment 10 Gris Ge 2017-11-02 16:28:00 UTC
*** Bug 1508198 has been marked as a duplicate of this bug. ***

Comment 11 Gris Ge 2017-11-03 13:06:42 UTC
Related kernel bug is:
https://bugzilla.redhat.com/show_bug.cgi?id=1504597

Comment 13 Jakub Krysl 2017-11-06 11:07:27 UTC
Gris,

Sorry it took so long; I could not get my hands on said server. I tested lsm 1.6.0-1 on a fresh install and the issue is fixed. You are probably correct that /usr/lib64/libyajl.so.2 was corrupted by the previous filesystem crash. I could not reproduce it again: I crashed the filesystem with the old version, restarted, and then tried to crash it with the new one.

So setting this to VERIFIED as it works with said version libstoragemgmt-1.6.0-1.el7.

Comment 18 errata-xmlrpc 2018-04-10 15:37:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0864

