Back to bug 1623601
| Who | When | What | Removed | Added |
|---|---|---|---|---|
| Red Hat Bugzilla Rules Engine | 2018-08-29 17:46:20 UTC | Target Release | 3.0 | 3.* |
| Vasu Kulkarni | 2018-08-29 17:54:09 UTC | Priority | unspecified | high |
| | | Target Release | 3.* | 3.1 |
| | | CC | vakulkar | |
| | | Flags | automate_bug+ | |
| Red Hat Bugzilla Rules Engine | 2018-08-29 17:54:15 UTC | Target Release | 3.1 | 3.0 |
| Vasu Kulkarni | 2018-08-29 17:59:10 UTC | Target Release | 3.0 | 3.1 |
| Jason Dillaman | 2018-08-29 19:56:03 UTC | CC | jdillama | |
| | | Target Milestone | rc | z1 |
| Jason Dillaman | 2018-08-29 19:56:44 UTC | CC | mkasturi | |
| | | Flags | needinfo?(mkasturi) | |
| Madhavi Kasturi | 2018-08-30 05:50:19 UTC | Flags | needinfo?(mkasturi) | needinfo+ |
| Mike Christie | 2018-08-30 07:54:12 UTC | Link ID | Github https://github.com/open-iscsi/tcmu-runner/pull/471 | |
| Mike Christie | 2018-08-30 17:41:14 UTC | Status | NEW | POST |
| Mike Christie | 2018-08-30 17:43:38 UTC | Blocks | 1624040 | |
| Mike Christie | 2018-09-05 05:06:53 UTC | Blocks | 1624040 | |
| Harish NV Rao | 2018-09-05 07:28:02 UTC | CC | hnallurv, mchristi | |
| | | Flags | needinfo?(mchristi) | |
| Mike Christie | 2018-09-05 16:01:33 UTC | Doc Text | Cause: The RHEL 7.5 kernel's ALUA layer reduced the number of it times an initiator retries the SCSI sense code ALUA State Transition. This is returned from the target side by tcmu-runner when it is taking the rbd exclusive lock during failover/failback and device discovery. Consequence: We can run out of retries before failover/discovery has completed, and the SCSI layer will return a failure to the multipath layer. The multipath layer will try another path and we can hit the same problem. The multipath layer will then bounce between paths resulting in slow or failed IO, management operations to the multipath device failing, in the initiator side logs you will see messages about paths being failed and removed then immediately re-added while IO is being performed to the multipath device. Workaround (if any): The ALUA layer change was added in RHEL 7.5. Downgrading the initiator's kernel to the RHEL 7.4 kernel will workaround the problem. Result: IO should not be failed from the SCSI layer to the multipath layer when performing IO and all paths are initially in the active and enabled dm-multipath state. | |
| | | Doc Type | If docs needed, set a value | Known Issue |
| | | Flags | needinfo?(mchristi) | |
| Mike Christie | 2018-09-05 20:30:37 UTC | Status | POST | ASSIGNED |
| Vikhyat Umrao | 2018-09-09 17:33:30 UTC | CC | vumrao | |
| Tomas Petr | 2018-09-12 06:40:56 UTC | CC | tpetr | |
| Tomas Petr | 2018-09-12 15:15:10 UTC | CC | tserlin | |
| | | Flags | needinfo?(mchristi) | |
| Mike Christie | 2018-09-12 15:42:46 UTC | Link ID | Github https://github.com/open-iscsi/tcmu-runner/pull/471 | Github open-iscsi/tcmu-runner/pull/471 |
| | | Flags | needinfo?(mchristi) | |
| Jason Dillaman | 2018-09-12 17:54:05 UTC | Flags | needinfo?(mchristi) | |
| | | Flags | needinfo?(jdillama) | |
| | | Flags | needinfo?(mchristi) needinfo?(jdillama) | |
| Mike Christie | 2018-09-12 21:38:47 UTC | Flags | needinfo?(mchristi) | |
| | | Flags | needinfo?(mchristi) | |
| Vikhyat Umrao | 2018-09-12 22:14:02 UTC | Flags | needinfo?(mchristi) | |
| Mike Christie | 2018-09-13 06:29:09 UTC | Flags | needinfo?(mchristi) | |
| Tomas Petr | 2018-09-13 06:36:11 UTC | Flags | needinfo?(mchristi) | |
| Vikhyat Umrao | 2018-09-13 16:29:56 UTC | Flags | needinfo?(mchristi) | |
| Mike Christie | 2018-09-13 16:46:02 UTC | Flags | needinfo?(mchristi) needinfo?(mchristi) | |
| Harish NV Rao | 2018-09-17 12:19:14 UTC | Blocks | 1584264 | |
| Aron Gunn | 2018-09-17 20:35:20 UTC | CC | agunn | |
| | | Docs Contact | agunn | |
| | | Doc Text | Cause: The RHEL 7.5 kernel's ALUA layer reduced the number of it times an initiator retries the SCSI sense code ALUA State Transition. This is returned from the target side by tcmu-runner when it is taking the rbd exclusive lock during failover/failback and device discovery. Consequence: We can run out of retries before failover/discovery has completed, and the SCSI layer will return a failure to the multipath layer. The multipath layer will try another path and we can hit the same problem. The multipath layer will then bounce between paths resulting in slow or failed IO, management operations to the multipath device failing, in the initiator side logs you will see messages about paths being failed and removed then immediately re-added while IO is being performed to the multipath device. Workaround (if any): The ALUA layer change was added in RHEL 7.5. Downgrading the initiator's kernel to the RHEL 7.4 kernel will workaround the problem. Result: IO should not be failed from the SCSI layer to the multipath layer when performing IO and all paths are initially in the active and enabled dm-multipath state. | .An iSCSI device is busy according to the `systemd-udevd` service In the Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and when doing a device discovery. As a consequence, the maximum number of retries occurs before the discovery process has completed, and the SCSI layer will return a failure to the multipath IO layer. The multipath IO layer will try the next available path, and the same problem will occur. This causes a loop of path checking, resulting in failed IO, and management operations to the multipath device to fail. The logs on the initiator node will print messages about devices being removed and then re-added. To workaround this issued, downgrade the initiator's kernel to Red Hat Enterprise Linux 7.4. |
| Aron Gunn | 2018-09-17 20:43:07 UTC | Doc Text | .An iSCSI device is busy according to the `systemd-udevd` service In the Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and when doing a device discovery. As a consequence, the maximum number of retries occurs before the discovery process has completed, and the SCSI layer will return a failure to the multipath IO layer. The multipath IO layer will try the next available path, and the same problem will occur. This causes a loop of path checking, resulting in failed IO, and management operations to the multipath device to fail. The logs on the initiator node will print messages about devices being removed and then re-added. To workaround this issued, downgrade the initiator's kernel to Red Hat Enterprise Linux 7.4. | .An iSCSI device is busy according to the `systemd-udevd` service In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and when doing a device discovery. As a consequence, the maximum number of retries occurs before the discovery process has completed, and the SCSI layer will return a failure to the multipath IO layer. The multipath IO layer will try the next available path, and the same problem will occur. This causes a loop of path checking, resulting in failed IO, and management operations to the multipath device to fail. The logs on the initiator node will print messages about devices being removed and then re-added. To workaround this issued, downgrade the initiator's kernel to Red Hat Enterprise Linux 7.4. |
| Mike Christie | 2018-09-22 16:38:12 UTC | Status | ASSIGNED | MODIFIED |
| | | Fixed In Version | tcmu-runner-1.4.0-0.3.el7cp | |
| errata-xmlrpc | 2018-10-02 15:43:49 UTC | CC | dn-infra-peta-pers | |
| | | Status | MODIFIED | ON_QA |
| Bara Ancincova | 2018-10-10 17:08:03 UTC | Docs Contact | agunn | bancinco |
| | | Doc Text | .An iSCSI device is busy according to the `systemd-udevd` service In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and when doing a device discovery. As a consequence, the maximum number of retries occurs before the discovery process has completed, and the SCSI layer will return a failure to the multipath IO layer. The multipath IO layer will try the next available path, and the same problem will occur. This causes a loop of path checking, resulting in failed IO, and management operations to the multipath device to fail. The logs on the initiator node will print messages about devices being removed and then re-added. To workaround this issued, downgrade the initiator's kernel to Red Hat Enterprise Linux 7.4. | In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| | | Doc Type | Known Issue | Bug Fix |
| | | Flags | needinfo?(mchristi) | |
| Bara Ancincova | 2018-10-10 17:33:34 UTC | Doc Text | In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. | .An iSCSI device is no longer busy according to the `systemd-udevd` In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| Mike Christie | 2018-10-10 18:42:22 UTC | Flags | needinfo?(mchristi) | |
| Bara Ancincova | 2018-10-16 13:27:16 UTC | Doc Text | .An iSCSI device is no longer busy according to the `systemd-udevd` In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. | .The `dm-multipath` device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| Bara Ancincova | 2018-10-16 13:37:03 UTC | Doc Text | .The `dm-multipath` device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| Tejas | 2018-10-22 06:01:34 UTC | CC | tchandra | |
| | | QA Contact | mkasturi | mmurthy |
| Bara Ancincova | 2018-10-23 16:59:19 UTC | Doc Text | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues _[fixed by 3.1z1]_ In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| Manohar Murthy | 2018-10-24 10:47:44 UTC | Status | ON_QA | VERIFIED |
| Bara Ancincova | 2018-11-05 18:58:30 UTC | Doc Text | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues _[fixed by 3.1z1]_ In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues _[fixed in 3.1z1]_ In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| Bara Ancincova | 2018-11-06 16:34:31 UTC | Doc Text | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues _[fixed in 3.1z1]_ In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues _[fixed by 3.1z1]_ In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| Bara Ancincova | 2018-11-07 19:12:45 UTC | Doc Text | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues _[fixed by 3.1z1]_ In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. | .The DM-Multipath device's path no longer bounces between the failed and active state causing I/O failures, hangs, and performance issues In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during a device discovery. As a consequence, the maximum number of retries had occurred before the discovery process was completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O operations and management operations to the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail. |
| errata-xmlrpc | 2018-11-08 18:50:16 UTC | Status | VERIFIED | RELEASE_PENDING |
| errata-xmlrpc | 2018-11-09 00:59:32 UTC | Status | RELEASE_PENDING | CLOSED |
| | | Resolution | --- | ERRATA |
| | | Last Closed | 2018-11-08 19:59:32 UTC | |
| errata-xmlrpc | 2018-11-09 01:00:30 UTC | Link ID | Red Hat Product Errata RHBA-2018:3530 | |