Description of problem: With a faulty disk the following is shown in "/var/log/messages": ... 894 Nov 12 17:39:26 myhostname kernel: sd 0:0:9:0: timing out command, waited 180s 895 Nov 12 17:39:26 myhostname kernel: sd 0:0:9:0: [sdc] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK 896 Nov 12 17:39:26 myhostname kernel: sd 0:0:9:0: [sdc] CDB: Read(16) 88 00 00 00 00 00 00 01 0e 00 00 00 00 80 00 00 897 Nov 12 17:39:26 myhostname kernel: blk_update_request: I/O error, dev sdc, sector 69120 898 Nov 12 17:40:01 myhostname systemd: Started Session 2150 of user root. 899 Nov 12 17:42:26 myhostname kernel: sd 0:0:9:0: timing out command, waited 180s 900 Nov 12 17:42:46 myhostname kernel: sd 0:0:9:0: timing out command, waited 20s 901 Nov 12 17:43:06 myhostname kernel: sd 0:0:9:0: timing out command, waited 20s 902 Nov 12 17:43:26 myhostname kernel: sd 0:0:9:0: timing out command, waited 20s ... 912 Nov 12 17:46:26 myhostname kernel: sd 0:0:9:0: timing out command, waited 180s 913 Nov 12 17:46:46 myhostname kernel: sd 0:0:9:0: timing out command, waited 20s 914 Nov 12 17:47:06 myhostname kernel: sd 0:0:9:0: timing out command, waited 20s 915 Nov 12 17:47:26 myhostname kernel: sd 0:0:9:0: timing out command, waited 60s 916 Nov 12 17:47:26 myhostname kernel: sd 0:0:9:0: timing out command, waited 20s ... "timing out command, waited XXXs" is not showing which command is "timing out". Version-Release number of selected component (if applicable): RHEL 7.6 How reproducible: RHEL 7.6 with a faulty disk Actual results: The logs show errors in regards to a (possible faulty) disk. Some infos are generic ("timing out command, waited 20s") and the details are missing (which command timed out?). Expected results: With the "timing out command, waited XXs" info, also show which command timed out.
So the customer would like to see the SCSI command (i.e. log message like: 896 Nov 12 17:39:26 myhostname kernel: sd 0:0:9:0: [sdc] CDB: Read(16) 88 00 00 00 00 00 00 01 0e 00 00 00 00 80 00 00 in the output for the "timing out command" message? Is that it? To help with upstream acceptance of such a change, can the customer identify why this information would be useful? Also, please provide some information about what type of storage device the customer was using (the INQUIRY information from sg_inq would help).
(In reply to Ewan D. Milne from comment #3) > So the customer would like to see the SCSI command (i.e. log message like: > > 896 Nov 12 17:39:26 myhostname kernel: sd 0:0:9:0: [sdc] CDB: Read(16) 88 00 > 00 00 00 00 00 01 0e 00 00 00 00 80 00 00 > > in the output for the "timing out command" message? Is that it? > > To help with upstream acceptance of such a change, can the customer identify > why this information would be useful? > > Also, please provide some information about what type of storage device > the customer was using (the INQUIRY information from sg_inq would help). Following command is good example of what command timed out. 894 Nov 12 17:39:26 host01 kernel: sd 0:0:9:0: timing out command, waited 180s 895 Nov 12 17:39:26 host01 kernel: sd 0:0:9:0: [sdc] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK 896 Nov 12 17:39:26 host01 kernel: sd 0:0:9:0: [sdc] CDB: Read(16) 88 00 00 00 00 00 00 01 0e 00 00 00 00 80 00 00 897 Nov 12 17:39:26 host01 kernel: blk_update_request: I/O error, dev sdc, sector 69120 So line 894 to 897 provides information what times out. But there are a lot of additional message entries, where it's not possible to identify, what command timed out. for example. 912 Nov 12 17:46:26 evesp07 kernel: sd 0:0:9:0: timing out command, waited 180s 913 Nov 12 17:46:46 evesp07 kernel: sd 0:0:9:0: timing out command, waited 20s 914 Nov 12 17:47:06 evesp07 kernel: sd 0:0:9:0: timing out command, waited 20s 915 Nov 12 17:47:26 evesp07 kernel: sd 0:0:9:0: timing out command, waited 60s 916 Nov 12 17:47:26 evesp07 kernel: sd 0:0:9:0: timing out command, waited 20s here in line 912 to 916 it's unable to identify which command timed out. It seems there are 5 different commands sent, which all timed out. Here the request is, to show the command always in form like above (line 894-897 or similar), when there is a command timing out. This would help in case of debugging. Does this make sense?
Hello Is there any test steps here if I don't have faulty disk? or just do sanity testing thanks Guazhang
IMO the patch is so trivial that sanity testing is sufficient.
(In reply to Maurizio Lombardi from comment #10) > IMO the patch is so trivial that sanity testing is sufficient. Ok, thanks your responds
Patch(es) committed on kernel-3.10.0-1065.el7
I did code review on fixed kernel and the regression testing, After check, all modified can be found in source https://beaker.engineering.redhat.com/jobs/3722823 this is from auto test bug bot
I did code review on fixed kernel and the regression testing, After check, all modified can be found in source https://beaker.engineering.redhat.com/jobs/3722824 this is from auto test bug bot
I did code review on fixed kernel and the regression testing, After check, all modified can be found in source https://beaker.engineering.redhat.com/jobs/3722828 this is from auto test bug bot
I did code review on fixed kernel and the regression testing, After check, all modified can be found in source https://beaker.engineering.redhat.com/jobs/3722830 this is from auto test bug bot
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:1016