Created attachment 1110464 [details]
Information about the machine when the issue occurred

Description of problem:
-----------------------
Login task on atomic host running RHGS container is blocked for a long time, see call trace below -

[83287.004116] INFO: task login:26187 blocked for more than 120 seconds.
[83287.005441] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[83287.007711] login D ffff88030c6d0270 0 26187 19423 0x00000004
[83287.009135] ffff88030b9dfd80 0000000000000082 ffff88030781e780 ffff88030b9dffd8
[83287.011544] ffff88030b9dffd8 ffff88030b9dffd8 ffff88030781e780 ffff88030781e780
[83287.013958] 7fffffffffffffff ffff88030c6d0278 0000000000000001 ffff88030c6d0270
[83287.016309] Call Trace:
[83287.017394] [<ffffffff8163a889>] schedule+0x29/0x70
[83287.018618] [<ffffffff81638579>] schedule_timeout+0x209/0x2d0
[83287.020361] [<ffffffff810b8a56>] ? try_to_wake_up+0x1b6/0x300
[83287.021642] [<ffffffff810b8bf0>] ? wake_up_state+0x10/0x20
[83287.022934] [<ffffffff8163c62a>] ldsem_down_write+0xea/0x255
[83287.024232] [<ffffffff8163cce8>] tty_ldisc_lock_pair_timeout+0x88/0x120
[83287.025561] [<ffffffff813b63ac>] tty_ldisc_hangup+0xcc/0x230
[83287.026872] [<ffffffff813adb54>] __tty_hangup+0x344/0x490
[83287.028167] [<ffffffff813adfb1>] tty_vhangup_self+0x21/0x50
[83287.029472] [<ffffffff811dd8d3>] sys_vhangup+0x23/0x30
[83287.030717] [<ffffffff816458c9>] system_call_fastpath+0x16/0x1b

I found this solution in our Knowledgebase - https://access.redhat.com/solutions/31453

I have collected relevant data from the machine as described in that solution. Unfortunately, I don't have enough information to reproduce this issue. I will update this BZ when I have more information.
Version-Release number of selected component (if applicable):
-------------------------------------------------------------
Red Hat Enterprise Linux Atomic Host release 7.2
rhgs-server-rhel7:3.1.2-3

How reproducible:
-----------------
Intermittently

Steps to Reproduce:
-------------------
Clear steps not available at the moment.

Actual results:
---------------
Login task is hung.
Created attachment 1110465 [details] Logs from /var/log/dmesg*
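For anyone triaging the attached dmesg logs, here is a minimal sketch that pulls out the hung-task reports. It assumes the message format matches the "INFO: task ... blocked for more than N seconds." line quoted in the description; the function name `find_hung_tasks` and the returned dictionary keys are hypothetical and not part of any existing tool.

```python
import re

# Assumed format, taken from the line quoted in the description:
# "[83287.004116] INFO: task login:26187 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(dmesg_text):
    """Return one dict per hung-task report found in the given dmesg text."""
    reports = []
    for match in HUNG_TASK_RE.finditer(dmesg_text):
        reports.append({
            "timestamp": float(match.group("ts")),      # seconds since boot
            "task": match.group("comm"),                # task name
            "pid": int(match.group("pid")),
            "blocked_secs": int(match.group("secs")),   # reporting threshold
        })
    return reports


if __name__ == "__main__":
    sample = (
        "[83287.004116] INFO: task login:26187 "
        "blocked for more than 120 seconds."
    )
    for report in find_hung_tasks(sample):
        print(report)
```

To run it against a saved log, read the file contents and pass the text to `find_hung_tasks`; the sketch only reports the header line of each event and does not parse the call trace that follows it.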
(In reply to Shruti Sampat from comment #0)
> Login task on atomic host running RHGS container is blocked for a long time,
> see call trace below -
>
> [83287.004116] INFO: task login:26187 blocked for more than 120 seconds.
> [83287.005441] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [83287.007711] login D ffff88030c6d0270 0 26187 19423 0x00000004
> [83287.009135] ffff88030b9dfd80 0000000000000082 ffff88030781e780 ffff88030b9dffd8
> [83287.011544] ffff88030b9dffd8 ffff88030b9dffd8 ffff88030781e780 ffff88030781e780
> [83287.013958] 7fffffffffffffff ffff88030c6d0278 0000000000000001 ffff88030c6d0270
> [83287.016309] Call Trace:
> [83287.017394] [<ffffffff8163a889>] schedule+0x29/0x70
> [83287.018618] [<ffffffff81638579>] schedule_timeout+0x209/0x2d0
> [83287.020361] [<ffffffff810b8a56>] ? try_to_wake_up+0x1b6/0x300
> [83287.021642] [<ffffffff810b8bf0>] ? wake_up_state+0x10/0x20
> [83287.022934] [<ffffffff8163c62a>] ldsem_down_write+0xea/0x255
> [83287.024232] [<ffffffff8163cce8>] tty_ldisc_lock_pair_timeout+0x88/0x120
> [83287.025561] [<ffffffff813b63ac>] tty_ldisc_hangup+0xcc/0x230
> [83287.026872] [<ffffffff813adb54>] __tty_hangup+0x344/0x490
> [83287.028167] [<ffffffff813adfb1>] tty_vhangup_self+0x21/0x50
> [83287.029472] [<ffffffff811dd8d3>] sys_vhangup+0x23/0x30
> [83287.030717] [<ffffffff816458c9>] system_call_fastpath+0x16/0x1b
>

At this stage, it looks like resource congestion to me.

Are you testing this in a VM? If yes, how many CPUs are assigned to this VM? Can you set the number of vCPUs assigned to the VM equal to the number of host CPUs and try to reproduce this issue?
(In reply to Humble Chirammal from comment #3)
>
> At this stage, it looks like a resource congestion to me.
>
> Are you testing this in a VM ? If yes, how many CPUs are assigned to this VM
> ? Can you assign total number of vCPUs to the number of host CPUs and
> reproduce this issue ?

I tested with a VM whose number of vCPUs equals the number of host CPUs and kept it running for about a week. I haven't been able to reproduce this issue.
The issue appears to be due to resource congestion, and it is not reproducible when the VM is assigned a number of vCPUs equal to the number of host CPUs. Hence closing the bug. If the issue becomes reproducible, please create a new RHBZ.