We (Prasanna and I) had a look at this setup; observations below.

*) Login/discovery don't work on the targets. To isolate the issue, we tried to log in from one of the other nodes (the OCP master, a fresh initiator) and discovery didn't work from there either.

*) On the target side, we could see the `tcmu-runner` service is hung and not responding at all. There are errors on the glfs handlers as well.

The interesting bit is that these targets were previously able to discover, log in, attach to a pod, and write. At least I don't expect this is due to a network/iptables issue, because the nodes were NOT rebooted, and discovery and login previously worked with the same iptables rules in place.

A couple of things to note here:
- There was a block volume 'delete' before Karthick noticed this issue; not sure whether it is related.
- There are also soft lockups and panic strings on these servers; the timestamps are yet to be matched.

~~~
[1925749.244007] INFO: task iscsi_np:82714 blocked for more than 120 seconds.
[1925749.258478] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1925749.262048] iscsi_np D 0000000000000001 0 82714 2 0x00000084
[1925749.262055] ffff8802acf43bc0 0000000000000046 ffff8802927ccf10 ffff8802acf43fd8
[1925749.262058] ffff8802acf43fd8 ffff8802acf43fd8 ffff8802927ccf10 ffff8802702f31a8
[1925749.262061] 7fffffffffffffff ffff8802702f31a0 ffff8802927ccf10 0000000000000001
[1925749.262065] Call Trace:
[1925749.262077] [<ffffffff816a94c9>] schedule+0x29/0x70
[1925749.262083] [<ffffffff816a6fd9>] schedule_timeout+0x239/0x2c0
[1925749.262090] [<ffffffff8109c3be>] ? __send_signal+0x18e/0x490
[1925749.262093] [<ffffffff816a987d>] wait_for_completion+0xfd/0x140
[1925749.262098] [<ffffffff810c4850>] ? wake_up_state+0x20/0x20
[1925749.262115] [<ffffffffc069e0fe>] iscsit_cause_connection_reinstatement+0x9e/0x100 [iscsi_target_mod]
[1925749.262126] [<ffffffffc06b15c1>] iscsit_stop_session+0x121/0x1e0 [iscsi_target_mod]
[1925749.262134] [<ffffffffc06a1ef8>] iscsi_check_for_session_reinstatement+0x208/0x280 [iscsi_target_mod]
[1925749.262142] [<ffffffffc06a4c95>] iscsi_target_check_for_existing_instances+0x35/0x40 [iscsi_target_mod]
[1925749.262150] [<ffffffffc06a4de1>] iscsi_target_do_login+0x141/0x670 [iscsi_target_mod]
[1925749.262157] [<ffffffffc06a603c>] iscsi_target_start_negotiation+0x1c/0xb0 [iscsi_target_mod]
[1925749.262165] [<ffffffffc06a3c6f>] iscsi_target_login_thread+0xb0f/0x1060 [iscsi_target_mod]
[1925749.262168] [<ffffffff816a8f00>] ? __schedule+0x310/0x8b0
[1925749.262179] [<ffffffffc06a3160>] ? iscsi_target_login_sess_out+0x250/0x250 [iscsi_target_mod]
[1925749.262183] [<ffffffff810b09cf>] kthread+0xcf/0xe0
[1925749.262186] [<ffffffff810b0900>] ? insert_kthread_work+0x40/0x40
[1925749.262190] [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
[1925749.262193] [<ffffffff810b0900>] ? insert_kthread_work+0x40/0x40
~~~

~~~
[91397.324059] CPU: 24 PID: 47197 Comm: iscsi_trx Tainted: G B ------------ 3.10.0-693.el7.x86_64 #1
[91397.324059] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
[91397.324061] task: ffff880a07f89fa0 ti: ffff88086d3ec000 task.ti: ffff88086d3ec000
[91397.324072] RIP: 0010:[<ffffffffc0035650>] [<ffffffffc0035650>] target_free_sgl+0x20/0x60 [target_core_mod]
[91397.324073] RSP: 0018:ffff88086d3efda8 EFLAGS: 00010202
[91397.324074] RAX: 0000000000040109 RBX: 0000000000000000 RCX: dead000000000200
[91397.324074] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000000
[91397.324075] RBP: ffff88086d3efdc8 R08: ffff88086d3efd58 R09: 00000001056dff1a
[91397.324076] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[91397.324076] R13: 0000000000000002 R14: 0000000000000000 R15: ffff88085d2eec00
[91397.324077] FS: 0000000000000000(0000) GS:ffff880c0dc00000(0000) knlGS:0000000000000000
[91397.324078] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[91397.324079] CR2: 0000000000000000 CR3: 0000000bc82b8000 CR4: 00000000003407e0
[91397.324127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[91397.324128] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 67
[91397.324153] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 117
[91397.324168] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[91397.324169] Stack:
[91397.324171] ffff88085ae9eb50 0000000000000001 ffff88085ae9eb50 0000000000000001
[91397.324173] ffff88086d3efdf0 ffffffffc0036ade ffff88085ae9e940 0000000000000001
[91397.324174] ffff88085ae9eb50 ffff88086d3efe20 ffffffffc0702e93 ffff880a5a3fe800
[91397.324175] Call Trace:
[91397.324192] [<ffffffffc0036ade>] transport_generic_free_cmd+0x14e/0x260 [target_core_mod]
[91397.324192] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 43
[91397.324203] [<ffffffffc0702e93>] iscsit_free_cmd+0x73/0x140 [iscsi_target_mod]
[91397.324204] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 49
[91397.324211] [<ffffffffc070aa93>] iscsit_close_connection+0x413/0x820 [iscsi_target_mod]
[91397.324216] [<ffffffffc06f8083>] iscsit_take_action_for_connection_exit+0x83/0x110 [iscsi_target_mod]
[91397.324221] [<ffffffffc070a25f>] iscsi_target_rx_thread+0x9f/0xb0 [iscsi_target_mod]
[91397.324225] [<ffffffffc070a1c0>] ? iscsi_target_tx_thread+0x210/0x210 [iscsi_target_mod]
[91397.324230] [<ffffffff810b098f>] kthread+0xcf/0xe0
[91397.324232] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[91397.324235] [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
[91397.324236] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[91397.324248] Code: c0 e8 45 51 31 c1 e9 55 ff ff ff 0f 1f 44 00 00 55 48 89 e5 41 56 49 89 fe 41 55 41 89 f5 41 54 49 89 fc 53 31 db 85 f6 7e 23 90 <49> 8b 3c 24 31 f6 83 c3 01 48 83 e7 fc e8 3e 61 15 c1 4c 89 e7
[91397.324259] RIP [<ffffffffc0035650>] target_free_sgl+0x20/0x60 [target_core_mod]
[91397.324259] RSP <ffff88086d3efda8>
[91397.324260] CR2: 0000000000000000
~~~

Kernel: Linux dhcp47-72.lab.eng.blr.redhat.com 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
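For context, the first trace above is the kernel's hung-task watchdog: it fires when a task sits in uninterruptible sleep (state `D`) for longer than `/proc/sys/kernel/hung_task_timeout_secs` (120 s here). When triaging many such reports, a small helper to pick out the offending task can be handy; this is a hedged sketch of ours (the function and regex are not from any existing tool):

```python
import re

# Matches the watchdog's "INFO: task <name>:<pid> blocked for more than <N> seconds." line.
HUNG_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

def parse_hung_task(line):
    """Return (task_name, pid, seconds) for a hung-task report line, or None."""
    m = HUNG_RE.search(line)
    if not m:
        return None
    return m.group("name"), int(m.group("pid")), int(m.group("secs"))

line = "[1925749.244007] INFO: task iscsi_np:82714 blocked for more than 120 seconds."
print(parse_hung_task(line))  # ('iscsi_np', 82714, 120)
```

Here it immediately identifies `iscsi_np` (the iSCSI login/negotiation thread) as the blocked task, which matches the login/discovery hang observed.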
The steps after which the issue was hit were missing some details. Although this has been discussed among Humble, Prasanna, and me, updating the bz for any future reference.

Steps to Reproduce:
1. Create 10 app pods with gluster-block as PVC.
2. Run IOs.
3. Create an 11th app pod with a block size exceeding the block-hosting volume (i.e., on a 500Gb block-hosting volume, tried to create a 600Gb block device). This failed, as expected.
4. Deleted the 11th pod and PVC.
5. Create a 12th app pod with a block size equal to that of the block-hosting volume (i.e., on a 500Gb block-hosting volume, tried to create a 500Gb block device). This worked, as expected.
6. Deleted the 12th app pod and PVC.
7. Restart the app pod (I had deleted the app pod and the dc had respun a new pod).
8. Check if the app pods are running.
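The size check exercised in steps 3 and 5 behaves as observed: a block device larger than the block-hosting volume is rejected, while one equal to the full size is allowed. A minimal sketch of that observed behavior (this is our illustration, not gluster-block's actual validation code):

```python
def can_create_block(requested_gb, hosting_volume_gb):
    """Model of the size check observed above: gluster-block rejects a block
    device larger than its block-hosting volume but allows one equal to it."""
    return requested_gb <= hosting_volume_gb

print(can_create_block(600, 500))  # False -- step 3 fails as expected
print(can_create_block(500, 500))  # True  -- step 5 works as expected
```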
As an additional comment: regardless of the discovery part, `tcmu-runner` status, restart, etc. were hung on this setup. Do we know why this service was hung?
This issue was also hit on the test setup used for testing the EFK stack. One of the Elasticsearch pods using a gluster-block device crashed with the same error seen in this issue.

~~~
Events:
  FirstSeen  LastSeen  Count  From                                        SubObjectPath  Type     Reason       Message
  ---------  --------  -----  ----                                        -------------  ----     ------       -------
  10m        19s       6      kubelet, dhcp46-100.lab.eng.blr.redhat.com                 Warning  FailedSync   Error syncing pod
  8m         19s       5      kubelet, dhcp46-100.lab.eng.blr.redhat.com                 Warning  FailedMount  Unable to mount volumes for pod "logging-es-data-master-p1h7hxxi-1-399sq_logging(0adba18a-77fe-11e7-9fc8-005056a51dd3)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-es-data-master-p1h7hxxi-1-399sq". list of unattached/unmounted volumes=[elasticsearch-storage]
~~~

Only 6 gluster-block devices were created, and IOs were run on them continuously. Other than that, no other specific failure was induced.
~~~
oc get pods
NAME                                          READY     STATUS    RESTARTS   AGE
logging-curator-1-m4r99                       1/1       Running   0          1d
logging-curator-ops-1-qfvdz                   1/1       Running   1          1d
logging-es-data-master-dj9ey1jh-1-6835j       1/1       Running   0          1d
logging-es-data-master-p1h7hxxi-1-399sq       0/1       Error     0          1d
logging-es-data-master-uaaqo2e8-1-phcg1       0/1       Error     0          1d
logging-es-ops-data-master-7u5z9m9j-1-n95j1   1/1       Running   0          1d
logging-es-ops-data-master-lt229hum-1-z0z16   0/1       Error     0          1d
logging-es-ops-data-master-vw10nnol-1-bz5rx   1/1       Running   1          1d
logging-fluentd-pg3g7                         1/1       Running   1          1d
logging-fluentd-r6w3t                         1/1       Running   0          1d
logging-fluentd-sk6hx                         1/1       Running   0          1d
logging-fluentd-tk335                         1/1       Running   1          1d
logging-kibana-3-6jft3                        2/2       Running   0          1d
logging-kibana-ops-2-07x4d                    2/2       Running   17         1d
~~~

Once this issue is hit, app pods using gluster-block devices seem to be failing one by one.
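To track the "failing one by one" progression, a quick triage sketch that pulls the pods in `Error` state out of an `oc get pods` listing (assuming the standard column layout; the helper name and the trimmed listing are ours):

```python
def failing_pods(listing):
    """Return names of pods whose STATUS column reads 'Error'."""
    bad = []
    for line in listing.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "Error":
            bad.append(fields[0])
    return bad

# Excerpt of the listing above, kept short for illustration.
LISTING = """\
NAME                                          READY     STATUS    RESTARTS   AGE
logging-es-data-master-dj9ey1jh-1-6835j       1/1       Running   0          1d
logging-es-data-master-p1h7hxxi-1-399sq       0/1       Error     0          1d
logging-es-data-master-uaaqo2e8-1-phcg1       0/1       Error     0          1d
logging-es-ops-data-master-lt229hum-1-z0z16   0/1       Error     0          1d
"""

print(failing_pods(LISTING))
```

Run periodically, a growing result list would confirm the one-by-one failure pattern.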
However, discovery seems to be failing with a different error now:

~~~
[root@dhcp46-100 ~]# iscsiadm -m discovery -t st -p 10.70.46.100
iscsiadm: cannot make connection to 10.70.46.100: Connection refused
iscsiadm: cannot make connection to 10.70.46.100: Connection refused
iscsiadm: cannot make connection to 10.70.46.100: Connection refused
iscsiadm: cannot make connection to 10.70.46.100: Connection refused
iscsiadm: cannot make connection to 10.70.46.100: Connection refused
iscsiadm: cannot make connection to 10.70.46.100: Connection refused
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: Could not perform SendTargets discovery: encountered connection failure
~~~
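"Connection refused" is a useful signal in itself: the node answered with a TCP RST, meaning nothing is listening on the portal port, whereas a silent iptables drop would show up as a timeout instead. A hedged sketch (helper name is ours) of a probe that makes this distinction without needing iscsiadm:

```python
import socket

def probe_portal(host, port=3260, timeout=5):
    """Classify a TCP probe of an iSCSI portal:
    'open'     -- something is listening (target service is up),
    'refused'  -- node reachable but nothing listening (as seen above),
    'filtered' -- no answer at all (e.g. packets dropped by a firewall)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "filtered"
    finally:
        s.close()
```

Under this reading, 'refused' here points at the target service (LIO listener) being gone on 10.70.46.100 rather than at the network path.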
Ramky and I had a look at the setup, and it looks like multipathing was not configured. At the moment we are not sure whether that could be a reason, but Ramky will give it another try after configuring multipath.

===============================================================================
The following steps need to be performed to configure multipathing on all the initiator nodes:

~~~
# mpathconf --enable
# cat >> /etc/multipath.conf
# LIO iSCSI
devices {
        device {
                vendor "LIO-ORG"
                user_friendly_names "yes" # names like mpatha
                path_grouping_policy "failover" # one path per group
                path_selector "round-robin 0"
                failback immediate
                path_checker "tur"
                prio "const"
                no_path_retry 120
                rr_weight "uniform"
        }
}
Ctrl^C
# systemctl restart multipathd
~~~
================================================================================

These steps are
For the above pod, see this:

~~~
[root@dhcp43-165 ~]# cd /var/lib/origin/openshift.local.volumes/pods/428bc478-9484-11e7-86b3-5254001958c3/volumes/kubernetes.io~iscsi/pvc-413e172b-9484-11e7-86b3-5254001958c3
[root@dhcp43-165 pvc-413e172b-9484-11e7-86b3-5254001958c3]# ls
ls: cannot open directory .: Input/output error
[root@dhcp43-165 pvc-413e172b-9484-11e7-86b3-5254001958c3]#
~~~
(In reply to Humble Chirammal from comment #38)
> For above pod see this
>
> [root@dhcp43-165 ~]# cd
> /var/lib/origin/openshift.local.volumes/pods/428bc478-9484-11e7-86b3-
> 5254001958c3/volumes/kubernetes.io~iscsi/pvc-413e172b-9484-11e7-86b3-
> 5254001958c3
> [root@dhcp43-165 pvc-413e172b-9484-11e7-86b3-5254001958c3]# ls
> ls: cannot open directory .: Input/output error
> [root@dhcp43-165 pvc-413e172b-9484-11e7-86b3-5254001958c3]#

For this particular device, the I/O error count is 0x13 and eh_timeout is 10. Ideally this node would have been configured for mpath, but at present it is operating on a single path and multipath is not configured.
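Note that the counter quoted above is in hex: SCSI device counters under sysfs (e.g. `/sys/class/scsi_device/*/device/ioerr_cnt`) are exported as hex strings, so 0x13 means 19 I/O errors on this device. A one-line decode, sketched for reference:

```python
def decode_scsi_counter(hex_str):
    """SCSI sysfs counters such as ioerr_cnt are hex strings like '0x13'."""
    return int(hex_str, 16)

print(decode_scsi_counter("0x13"))  # 19 I/O errors for this device
```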
Created attachment 1324933 [details]
iscsi messages from problematic node.

iSCSI messages from "dhcp43-165"
Apart from the iscsi errors attached, I/O errors are reported on this device.

--snip--
~~~
Sep 12 11:28:41 dhcp43-165 kernel: sd 8:0:0:0: rejecting I/O to offline device
Sep 12 11:28:41 dhcp43-165 kernel: sd 8:0:0:0: rejecting I/O to offline device
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065162
Sep 12 11:28:41 dhcp43-165 kernel: sd 8:0:0:0: rejecting I/O to offline device
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065169
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065177
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065181
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065185
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065189
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065157
Sep 12 11:28:41 dhcp43-165 kernel: blk_update_request: I/O error, dev sdg, sector 1065157
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): metadata I/O error: block 0x1040ca ("xlog_iodone") error 5 numblks 64
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): Log I/O Error Detected. Shutting down filesystem
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): metadata I/O error: block 0x1040d1 ("xlog_iodone") error 5 numblks 64
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): metadata I/O error: block 0x1040d9 ("xlog_iodone") error 5 numblks 64
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): metadata I/O error: block 0x1040dd ("xlog_iodone") error 5 numblks 64
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): metadata I/O error: block 0x1040e1 ("xlog_iodone") error 5 numblks 64
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): metadata I/O error: block 0x1040e5 ("xlog_iodone") error 5 numblks 64
Sep 12 11:28:41 dhcp43-165 kernel: XFS (sdg): metadata I/O error: block 0x1040c5 ("xlog_iodone") error 5 numblks 64
~~~
--snip--

So it's rejecting I/O, and it seems to me we are into the same issue we encountered in previous iterations.
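The "error 5" in the xlog_iodone messages is errno EIO, i.e. the same "Input/output error" that later surfaces to userspace (for example from `ls` on the mounted PVC directory). A small sketch mapping the numeric code in those XFS messages to its errno name and text:

```python
import errno
import os

def kernel_error_name(code):
    """Map the numeric error in an XFS 'metadata I/O error ... error N'
    message to its errno symbol and human-readable text."""
    return errno.errorcode[code], os.strerror(code)

print(kernel_error_name(5))  # ('EIO', 'Input/output error')
```

This ties the target-side rejection ("rejecting I/O to offline device") to the filesystem shutdown and the application-visible failures: they are all the same EIO propagating up the stack.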
Comment >=39 refers to "sdg" that too got same errors.
(In reply to Humble Chirammal from comment #48)
> Comment >=39 refers to "sdg" that too got same errors.

The i/o errors and issue pattern look same to me against the last iteration of this issue and the dependent (1480192) bug analysis.
(In reply to Humble Chirammal from comment #49)
> (In reply to Humble Chirammal from comment #48)
> > Comment >=39 refers to "sdg" that too got same errors.
>
> The i/o errors and issue pattern look same to me against the last iteration
> of this issue and the dependent (1480192) bug analysis.

As per https://bugzilla.redhat.com/show_bug.cgi?id=1480192#c19, the issue is seen without gluster in picture, where as this particular iteration I/O errors happened because of ping-timeouts in gluster rpc layer. So they are not same even though end result is I/O errors in dmesg.
(In reply to Pranith Kumar K from comment #50)
> (In reply to Humble Chirammal from comment #49)
> > (In reply to Humble Chirammal from comment #48)
> > > Comment >=39 refers to "sdg" that too got same errors.
> >
> > The i/o errors and issue pattern look same to me against the last iteration
> > of this issue and the dependent (1480192) bug analysis.
>
> As per https://bugzilla.redhat.com/show_bug.cgi?id=1480192#c19, the issue is
> seen without gluster in picture,

Well, thats why the entire 'software iscsi' stack ( target and initiator) was in doubt from start and we were/are trying to isolate where the culprit hide or the triggering point.

> where as this particular iteration I/O
> errors happened because of ping-timeouts in gluster rpc layer. So they are
> not same even though end result is I/O errors in dmesg.

My assumption is this:

The "unmount" of the share fails in initiators due to the I/O error which caused "file exists" in next container respawn.

If all the mounted shares/targets rejects I/O at unmount time or started to reject I/O at some point ( may be after serving i/o for some time or after a change in target side) this need to be analysed.
(In reply to Humble Chirammal from comment #52)
> (In reply to Pranith Kumar K from comment #50)
> > (In reply to Humble Chirammal from comment #49)
> > > (In reply to Humble Chirammal from comment #48)
> > > > Comment >=39 refers to "sdg" that too got same errors.
> > >
> > > The i/o errors and issue pattern look same to me against the last iteration
> > > of this issue and the dependent (1480192) bug analysis.
> >
> > As per https://bugzilla.redhat.com/show_bug.cgi?id=1480192#c19, the issue is
> > seen without gluster in picture,
>
> Well, thats why the entire 'software iscsi' stack ( target and initiator)
> was in doubt from start and we were/are trying to isolate where the culprit
> hide or the triggering point.

I think we are going in circles. This iteration of the reproducer is not what happened when krk raised this bz. I think that part of the analysis is conclusive.

> > where as this particular iteration I/O
> > errors happened because of ping-timeouts in gluster rpc layer. So they are
> > not same even though end result is I/O errors in dmesg.
>
> My assumption is this:
>
> The "unmount" of the share fails in initiators due to the I/O error which
> caused "file exists" in next container respawn.

Could you help to confirm this? That would give a good data point.

> If all the mounted shares/targets rejects I/O at unmount time or started to
> reject I/O at some point ( may be after serving i/o for some time or after a
> change in target side) this need to be analysed.
[Side note] As mentioned in a previous comment, we should have configured multipath on these systems; it is missing in this setup (at least on the server that the latest comments about the sdg and sda devices refer to), compared to previous iterations.
It looks like commands do get stuck, but handler_glfs gets a reply and sends it to LIO. The problem seems to be that something is deleting the target from LIO. We see these errors in LIO:

~~~
Sep 12 11:26:58 dhcp42-226 kernel: Unable to locate Target IQN: iqn.2016-12.org.gluster-block:df906719-ba77-45ad-814e-01a2589360ae in Storage Node
Sep 12 11:26:58 dhcp42-226 kernel: iSCSI Login negotiation failed.
~~~

There is no runner or LIO code that would do this that I know of. However, it does not make sense that we would see those errors yet have a login from another node succeed, unless targets are being created/deleted dynamically.
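To chase down which targets are disappearing, the missing IQNs can be extracted from dmesg and cross-checked against what is actually configured (e.g. the output of `targetcli ls`). A hedged sketch of that extraction (the regex and helper are ours, not from any existing tool):

```python
import re

# LIO logs this exact phrase when a login names a target it no longer has.
IQN_RE = re.compile(r"Unable to locate Target IQN: (?P<iqn>\S+) in Storage Node")

def missing_iqns(log_text):
    """Return the sorted, de-duplicated set of IQNs LIO reported as missing."""
    return sorted({m.group("iqn") for m in IQN_RE.finditer(log_text)})

LOG = ("Sep 12 11:26:58 dhcp42-226 kernel: Unable to locate Target IQN: "
       "iqn.2016-12.org.gluster-block:df906719-ba77-45ad-814e-01a2589360ae "
       "in Storage Node")
print(missing_iqns(LOG))
```

Any IQN this reports that is absent from the live LIO configuration would confirm the target was deleted between the initiator learning about it and the login attempt.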
Created attachment 1325590 [details]
kernel logs; naming kernel_logs_$host_$devname
Ran the same test mentioned in comment 80, no app pod crash was seen. Moving the bug to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:2773