Bug 1585361
Summary: | FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Sam Yangsao <syangsao> | ||||
Component: | smartmontools | Assignee: | Michal Hlavinka <mhlavink> | ||||
Status: | CLOSED WONTFIX | QA Contact: | BaseOS QE - Apps <qe-baseos-apps> | ||||
Severity: | low | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 7.4 | CC: | mahbubur.rahman | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2020-06-10 10:19:20 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Sam Yangsao
2018-06-02 01:41:03 UTC
Turned off smartd, and this is not the cause of the issue. Looks to be related to udisks, per [1], switching component in this bz. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1351305#c13 This has no real impact: no customer runs udisks since it's been replace by udisks2 long ago. There is no upstream either... We should not spend any resources on this component at all. Can someone explain why this is occurring? Is it udisks or udisks2 related? I'm seeing this occur every 30 minutes in RHEL 7. Seems like a Bug that should be reviewed, harmless or not. Thanks! Reassigned to the new udisks/udisks2 maintainer. Sam, is this really udisks or udisks2? Can you grep running processes? Also, is this a USB HDD? If so, could you please attach `udevadm info /dev/sdX`? Notice that the device path changes, per the output in the first comment. That may indicate the HDD reconnecting on the USB bus. In that case could you please attach full dmesg output? Here's one of them. [root@tatooine ~]# udevadm info /dev/sdc P: /devices/pci0000:00/0000:00:01.0/0000:07:00.0/host8/target8:0:0/8:0:0:0/block/sdc N: sdc S: disk/by-id/ata-PNY_CS1311_240GB_SSD_PNY11162171570202C13 S: disk/by-id/wwn-0x5f8db4c111602c13 S: disk/by-path/pci-0000:07:00.0-scsi-0:0:0:0 E: DEVLINKS=/dev/disk/by-id/ata-PNY_CS1311_240GB_SSD_PNY11162171570202C13 /dev/disk/by-id/wwn-0x5f8db4c111602c13 /dev/disk/by-path/pci-0000:07:00.0-scsi-0:0:0:0 E: DEVNAME=/dev/sdc E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host8/target8:0:0/8:0:0:0/block/sdc E: DEVTYPE=disk E: ID_ATA=1 E: ID_ATA_DOWNLOAD_MICROCODE=1 E: ID_ATA_FEATURE_SET_APM=1 E: ID_ATA_FEATURE_SET_APM_CURRENT_VALUE=254 E: ID_ATA_FEATURE_SET_APM_ENABLED=1 E: ID_ATA_FEATURE_SET_HPA=1 E: ID_ATA_FEATURE_SET_HPA_ENABLED=1 E: ID_ATA_FEATURE_SET_PM=1 E: ID_ATA_FEATURE_SET_PM_ENABLED=1 E: ID_ATA_FEATURE_SET_SECURITY=1 E: ID_ATA_FEATURE_SET_SECURITY_ENABLED=0 E: ID_ATA_FEATURE_SET_SECURITY_ENHANCED_ERASE_UNIT_MIN=2 E: ID_ATA_FEATURE_SET_SECURITY_ERASE_UNIT_MIN=2 E: ID_ATA_FEATURE_SET_SMART=1 E: ID_ATA_FEATURE_SET_SMART_ENABLED=1 E: ID_ATA_ROTATION_RATE_RPM=0 E: ID_ATA_SATA=1 E: ID_ATA_SATA_SIGNAL_RATE_GEN1=1 E: ID_ATA_SATA_SIGNAL_RATE_GEN2=1 E: ID_ATA_WRITE_CACHE=1 E: ID_ATA_WRITE_CACHE_ENABLED=1 E: ID_BUS=ata E: ID_FS_LABEL=tatooine.lab.msp.redhat.com:126 E: ID_FS_LABEL_ENC=tatooine.lab.msp.redhat.com:126 E: ID_FS_TYPE=linux_raid_member E: ID_FS_USAGE=raid E: ID_FS_UUID=2902a605-1425-2e05-2507-3abc068826ce E: ID_FS_UUID_ENC=2902a605-1425-2e05-2507-3abc068826ce E: ID_FS_UUID_SUB=5db8a4f1-f98e-5791-bf23-c64cbf01802f E: ID_FS_UUID_SUB_ENC=5db8a4f1-f98e-5791-bf23-c64cbf01802f E: ID_FS_VERSION=1.2 E: ID_MODEL=PNY_CS1311_240GB_SSD E: ID_MODEL_ENC=PNY\x20CS1311\x20240GB\x20SSD\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20 E: ID_PATH=pci-0000:07:00.0-scsi-0:0:0:0 E: ID_PATH_TAG=pci-0000_07_00_0-scsi-0_0_0_0 E: ID_REVISION=CS131122 E: ID_SERIAL=PNY_CS1311_240GB_SSD_PNY11162171570202C13 E: ID_SERIAL_SHORT=PNY11162171570202C13 E: ID_TYPE=disk E: ID_WWN=0x5f8db4c111602c13 E: ID_WWN_WITH_EXTENSION=0x5f8db4c111602c13 E: MAJOR=8 E: MINOR=32 E: SUBSYSTEM=block E: TAGS=:systemd: E: USEC_INITIALIZED=84724 Created attachment 1453731 [details]
dmesg output
(In reply to Tomáš Bžatek from comment #6) > Sam, is this really udisks or udisks2? Can you grep running processes? Doesn't look like it's either from reviewing the process tree > Also, is this a USB HDD? If so, could you please attach `udevadm info > /dev/sdX`? Notice that the device path changes, per the output in the first > comment. That may indicate the HDD reconnecting on the USB bus. In that case > could you please attach full dmesg output? These are not USB HDD's. Outputs attached above. Thanks for the information provided and for the private access to your system, very helpful! Looks like you don't have udisks installed at all, not even the older storaged packages. Spawning smartctl -a /dev/sdc generates one more error message. The smartd daemon is running and probably does drive monitoring every 30 minutes. Looking at the /var/log/messages-20180603 log: > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] Sense Key : No Sense [current] [descriptor] > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] Add. Sense: No additional sense information > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00 > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] Sense Key : No Sense [current] [descriptor] > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] Add. Sense: No additional sense information > May 28 04:58:07 tatooine kernel: sd 8:0:3:0: [sdf] CDB: ATA command pass through(16) 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 > May 28 04:58:07 tatooine smartd[1073]: Device: /dev/sdf [SAT], 1 Currently unreadable (pending) sectors > May 28 04:58:07 tatooine smartd[1073]: Device: /dev/sdf [SAT], 1 Offline uncorrectable sectors Notice the timestamp coincidence. Also: > Jun 2 11:30:23 tatooine smartd[7855]: Device: /dev/sdc, type changed from 'scsi' to 'sat' > Jun 2 11:30:23 tatooine smartd[7855]: Device: /dev/sdc [SAT], opened > Jun 2 11:30:23 tatooine smartd[7855]: Device: /dev/sdc [SAT], PNY CS1311 240GB SSD, S/N:PNY11162171570202C13, WWN:5-f8db4c-111602c13, FW:CS131122, 240 GB > Jun 2 11:30:23 tatooine smartd[7855]: Device: /dev/sdc [SAT], not found in smartd database. > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] Sense Key : No Sense [current] [descriptor] > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] Add. Sense: No additional sense information > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] CDB: ATA command pass through(16) 85 06 0c 00 d8 00 00 00 01 00 4f 00 c2 00 b0 00 > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] Sense Key : No Sense [current] [descriptor] > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] Add. Sense: No additional sense information > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] CDB: ATA command pass through(16) 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 > Jun 2 11:30:23 tatooine smartd[7855]: Device: /dev/sdc [SAT], can't monitor Current_Pending_Sector count - no Attribute 197 > Jun 2 11:30:23 tatooine smartd[7855]: Device: /dev/sdc [SAT], can't monitor Offline_Uncorrectable count - no Attribute 198 > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] Sense Key : No Sense [current] [descriptor] > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] Add. Sense: No additional sense information > Jun 2 11:30:23 tatooine kernel: sd 8:0:0:0: [sdc] CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00 > Jun 2 11:30:23 tatooine smartd[7855]: Device: /dev/sdc [SAT], is SMART capable. Adding to "monitor" list. Notice that the currently running smartd process still has the PID 7855 after 23 days of system uptime. You mentioned you tried turning smartd off and it didn't help. One other option to look for is a periodical check from cron jobs. I am getting similar error messages in the messages log, one is shown below: Aug 16 12:46:20 txentnbu5230 kernel: scsi 3:0:0:9: Direct-Access HITACHI OPEN-V 5001 PQ: 0 ANSI: 2 Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: sg_alloc: dev=39 Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] Very big device. Trying to use READ CAPACITY(16). Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: Attached scsi generic sg39 type 0 Aug 16 12:46:20 txentnbu5230 kernel: [42544]: scst: Attached to scsi3, channel 0, id 0, lun 9, type 0 Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] Using 0xffffffff as device size Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] 4294967296 512-byte logical blocks: (2.19 TB/2.00 TiB) Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] Write Protect is off Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] Very big device. Trying to use READ CAPACITY(16). Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] Using 0xffffffff as device size Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] tag#0 Sense Key : Illegal Request [current] Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] tag#0 Add. Sense: Invalid field in cdb Aug 16 12:46:20 txentnbu5230 kernel: sd 3:0:0:9: [sdal] tag#0 CDB: Read(16) 88 20 00 00 00 00 00 00 00 00 00 00 00 08 00 00 Aug 16 12:46:20 txentnbu5230 kernel: blk_update_request: critical target error, dev sdal, sector 0 Aug 16 12:46:20 txentnbu5230 kernel: Buffer I/O error on dev sdal, logical block 0, async page read Interestingly the same devices do not get automatically discovered and I have to run ‘"echo "c t l" > /sys/class/scsi_host/host#/scan' command to add them manually. For example for host3, it discovers all the following devices automatically: 3:0:0:0 through 3:0:0:7 But the FC scan fails to create other target/luns connected to the server: 3:0:0:8 through 3:0:0:30 Anyone can put any light why the discovery process discovers only 8 luns per host# and not all. And when I add the remaining manually, they keep reporting the above errors but the customer reports that they are not having any problem using these additional devices. (In reply to Mahbub Rahman from comment #12) > Anyone can put any light why the discovery process discovers only 8 luns per > host# and not all. And when I add the remaining manually, they keep > reporting the above errors but the customer reports that they are not having > any problem using these additional devices. Mahbub, this looks like a different issue. Please open separate bugreport and assign it to kernel as this doesn't look udisks-related from a first sight. Reassigning to smartmontools per my comment 11. Red Hat Enterprise Linux version 7 entered the Maintenance Support Phase. In this phase only qualified Critical and Important Security errata advisories (RHSAs) and Urgent Priority Bug Fix errata advisories (RHBAs) may be released as they become available. Other errata advisories may be delivered as appropriate. This bug has been reviewed by Engineering representative and does not meet the inclusion criteria for Maintenance Support Phase. For more information about Red Hat Enterprise Linux Lifecycle, please see https://access.redhat.com/support/policy/updates/errata/ |