+++ This bug was initially created as a clone of Bug #1833486 +++

Cleaned out the cloned comment log before Marko's report of high rate failures.

--- Additional comment from Marko Karg on 2020-08-13 06:09:03 UTC ---

I can confirm that logging at 2.5k/s works fine on 4.5.5 but going for higher rates leaves me with messages stuck in the fluentd pods and never getting delivered to ES.

--- Additional comment from Marko Karg on 2020-08-13 08:11:47 UTC ---

I've rerun a test with 100 pods in 100 namespaces, logging at 50000 msgs/sec in total for 20 minutes, which should result in 600000 messages in every index. This is what I got:

green open project.logtest98.fdfae9ef-a84a-4c7e-8f70-e580ddf60159.2020.08.13 f3roeu5rR1utKqlASrfTlQ 3 1 451750 0 617.1mb 308.6mb
green open project.logtest83.bfbfa57a-2794-4450-90f2-7f9d283c5365.2020.08.13 Lozpgwd9SLmfImn4OQW9fg 3 1 103500 0 143.3mb 72.2mb
green open project.logtest86.dcc4f112-0328-49e4-b177-97fb2527b5d8.2020.08.13 POLDYc3HTXyBB30PFZ4H8g 3 1 203000 0 277.4mb 138mb
green open project.logtest61.cc3784c7-7702-4b64-84bc-bb5b52235677.2020.08.13 2-mxFKBpTdCoa1C_exjQSw 3 1 100250 0 139.4mb 69.8mb
green open project.logtest82.abc253a2-965d-4493-bc30-cfda648d0124.2020.08.13 0MXvzY-qTw6A201EdWoaCA 3 1 201500 0 255.4mb 127.6mb
green open project.logtest57.9111cd70-40fa-464d-8a9c-1303ee60e31b.2020.08.13 n7wWZum-S6-6EdpAjoPfzw 3 1 151750 0 208.1mb 103.8mb
green open project.logtest40.e924f170-4805-4a71-b10a-c17167d56a58.2020.08.13 n7f_RiqPQHO2O_WyALjLxQ 3 1 200750 0 273.7mb 137mb
green open project.logtest37.226c2a8f-e2bd-4154-bc07-7d07e0835389.2020.08.13 QRO2_AEzT_WXwZYfh-ElXw 3 1 252750 0 346mb 173mb
green open project.logtest5.c9521dac-510d-40a8-b7b6-94d74851d28a.2020.08.13 8jsCUniUSpu8s4JbXYGdVg 3 1 201500 0 277.4mb 138.5mb
green open project.logtest34.1c1679fb-ba08-460d-bc81-1103ffbac4d3.2020.08.13 Hqu7M6C9Q5GUu4EpsYs2_g 3 1 251500 0 344.3mb 171.4mb
green open project.logtest97.1f516add-6c98-4be1-9c87-3fd35aa1ec9b.2020.08.13 VmNbro9mTAWxBOz-N0PVog 3 1 151750 0 208.5mb 104.2mb
green open project.logtest28.0872992b-4ca1-43ba-84de-dea2b1284098.2020.08.13 w5543si2Rv-Z4yWE0IwKnw 3 1 152000 0 209.5mb 104.8mb
green open project.logtest71.f293aeec-656f-421d-be27-a806e68b8dea.2020.08.13 PnSpUG2yQWaBNflO1Dx-jA 3 1 352000 0 481.1mb 240.6mb
green open project.logtest70.2e02f50f-a5ea-4e77-89ad-d033370e2066.2020.08.13 k8DIiOaxQ7WDL9LaVZzGKw 3 1 100000 0 138.2mb 68.5mb
green open project.logtest8.017152b5-d477-4147-8518-f02358fcec21.2020.08.13 AzHHg0LPTAG_nWlYUhRMFg 3 1 16000 0 24.4mb 12.1mb
green open project.logtest75.0e3a3449-e98b-4994-959d-500553bff5bc.2020.08.13 LV-IwSTxSl-9vpGOAnWEqA 3 1 100000 0 137.5mb 68.5mb
green open project.logtest20.5da98cdb-7f58-40ad-a92d-59d6cb20a4d0.2020.08.13 qjOK8yd0RtinDDi9zGz44w 3 1 103500 0 142.7mb 71.6mb
green open project.logtest30.0099fef1-28b3-4902-9e1c-9a23aefd78bc.2020.08.13 0vYDJT90Q7eTg5wFUItXeg 3 1 204000 0 280.2mb 140.5mb
green open project.logtest84.e6e79c7f-142a-42c3-a353-f68a6372634b.2020.08.13 81ft2GteTh2i9WphdqlzJQ 3 1 202750 0 277.8mb 138.8mb
green open project.logtest63.2a8f44bb-2307-4399-8479-6796c77a8908.2020.08.13 1j9mFcZYRlqsQogrN5N0Ag 3 1 202500 0 278.3mb 138.6mb
green open project.logtest87.4019da71-7b97-47dd-b473-5baf49d90f21.2020.08.13 FfCJF2IvSU2BHW1ohTIPLg 3 1 200000 0 273.9mb 137.1mb
green open project.logtest22.14853e9d-dd11-4aef-ba4e-3c19395a0d95.2020.08.13 T8aBvfFfStOdkEvczUItpQ 3 1 203000 0 278.7mb 139.2mb
green open project.logtest58.230cf278-d81e-4957-981e-0f994dc24c22.2020.08.13 eAKnTToATtOiiF0mTXmRtg 3 1 103750 0 142.2mb 70.9mb
green open project.logtest6.12b8e87a-de40-4e5b-a85e-8acbe52519c9.2020.08.13 Gr2GwGXyRpOfxH_VPzvgIQ 3 1 16000 0 24.3mb 12.1mb
green open project.logtest91.4b837eda-5c21-4b38-8351-d797afcf2182.2020.08.13 L_K32ayrTJyOl89c8oKKMg 3 1 252500 0 346.5mb 172.9mb
green open project.logtest42.38ce51a4-88e2-4e88-a820-62b5b4275c1c.2020.08.13 7KaHvVMhT7G66jkVdPvHPA 3 1 203000 0 277mb 138.5mb
green open project.logtest67.95bf0ab6-ac87-4a38-8ffa-e637163c157d.2020.08.13 yFAgph3xR2CnwuKMgv91UQ 3 1 100750 0 139.3mb 69.7mb
green open project.logtest12.2eb97788-8aa8-4dfa-a5f7-6b4b9f463c8b.2020.08.13 RnEbfFOkRvWWLnD2cHEYSA 3 1 501000 0 684.3mb 342.2mb
green open project.logtest79.d4d778fe-78df-4d2a-a941-1c6f641c2875.2020.08.13 NiDbcNIdQAm_cWEyM2ldsQ 3 1 149750 0 207.6mb 103.9mb
green open project.logtest47.600e17ea-c88f-447e-a9f1-1a610c1fa5d4.2020.08.13 8dATJn1AT0e3WMVIWYkQQg 3 1 204000 0 279.8mb 139.8mb
green open project.logtest41.572529c5-6cdb-4389-bd2f-7c78c98b3ba9.2020.08.13 lMoEmd_KQaC6_mu0L03fLw 3 1 352750 0 483.4mb 241.6mb
green open project.logtest2.f60f1a2c-8c92-4eda-961c-284eff3c862c.2020.08.13 gdcmORREQD6LGcivPOqGIw 3 1 302000 0 413mb 206.2mb
green open project.logtest49.bd65fda1-fcda-4811-9921-dbf857c53687.2020.08.13 YhpsDLrSS9KKwQcj4AV9dw 3 1 202805 0 278.4mb 138.9mb
green open project.logtest32.bfec867f-d805-48cd-938f-a3b219027850.2020.08.13 P0ja2ulNRQKBri8jPuM04Q 3 1 101000 0 140.2mb 69.9mb
green open project.logtest38.fb0a2902-89b4-47f5-a447-e30323ddc4db.2020.08.13 jqwzyEBhQ_S3kpXJechkuA 3 1 252750 0 344.9mb 172.7mb
green open project.logtest62.b4cc0582-835e-4a6a-b355-e603590789fd.2020.08.13 mYjNpdrxRxeKdRlTOrE6vw 3 1 202500 0 276.7mb 138.2mb
green open project.logtest9.bd3ab32d-a801-4ecf-bb86-080bf26729f8.2020.08.13 YZ9dUEVVTR-TWO4V2fERrg 3 1 252250 0 345.5mb 172.5mb
green open project.logtest21.ef9394cb-6ed8-48f1-a0e0-8366c3705fee.2020.08.13 jscgX1nERCKkhxr0Kofjew 3 1 351750 0 482mb 240.9mb
green open project.logtest95.3bf7b5ca-7cf6-45a1-b529-476158cc20ea.2020.08.13 cGGkP_U9SIuf1JyeH2jiZA 3 1 252250 0 345.1mb 172.7mb
green open project.logtest1.721c4164-e3ec-42ca-a1ee-738abec432d6.2020.08.13 G-11jZRVQkqs270rRy7OmQ 3 1 600000 0 819mb 409.6mb
green open project.logtest4.3ee3b7b0-724c-4c23-bbcd-c3fc1bfe266a.2020.08.13 KGEtDaNCRQa9qCeawA0uDw 3 1 501000 0 685.5mb 342.8mb
green open project.logtest24.3e88b5a3-3754-49ae-9aa5-1c9f643c255c.2020.08.13 o_CWjoAITdOPis1m_vvQ9w 3 1 600000 0 817.5mb 408.6mb
green open project.logtest81.e19926d0-59bb-4850-b469-b775b9dd9661.2020.08.13 VR5A1qAIRoOPZx0eMdGOaw 3 1 256723 0 326.3mb 163.5mb
green open project.logtest60.f6a9c1ac-0302-496b-9d14-4de7ef78eeb9.2020.08.13 T-Mfo8uwTD6ctrhcxmo3PA 3 1 202630 0 278.3mb 138.5mb
green open project.logtest77.536c1a73-7c38-4825-94a1-f4ed32c65641.2020.08.13 QRRjY5YNTuSltgVM-lMTPg 3 1 102000 0 141.3mb 70.7mb
green open project.logtest11.c6482750-a9e8-4dbc-b3f4-1402ce1d401e.2020.08.13 4AiBAfsFTPalXf623vyTIQ 3 1 253750 0 347.5mb 173.7mb
green open project.logtest43.3cd984b9-c2da-4bbe-b513-7f821e818ec9.2020.08.13 6WUk-8IbQoCpgNYH0lRfJg 3 1 153750 0 211.4mb 105.8mb
green open project.logtest72.ba3aa48b-8358-4a4c-a512-717f236142cb.2020.08.13 wIwwNQ77Qd62RjVQWCWSMA 3 1 251750 0 318.3mb 159mb
green open project.logtest52.c88f7265-fa2d-4813-a36a-f8a2054a55f6.2020.08.13 Wfdnm6TyQG2_xjhjm0HqKA 3 1 203000 0 278.7mb 139.6mb
green open project.logtest78.bb3de4aa-9f42-4f58-911a-383aa3a47b39.2020.08.13 NvHzfFJ7TTGeq3-xKTu07Q 3 1 152000 0 194.3mb 97.2mb
green open project.logtest15.dda34629-0e1d-46ef-8c56-f7d7c4b9c781.2020.08.13 doiW1BI2R3moeky_I1Hw7g 3 1 252750 0 348.1mb 174.1mb
green open project.logtest46.38645aac-c60a-4175-95be-b0936e44b63f.2020.08.13 WGhDKeaCSIOLAMKcB8AoYw 3 1 402000 0 548.6mb 274.6mb
green open project.logtest55.fbfc4a80-f54f-4946-9897-c0089dda8520.2020.08.13 Gk7nV1vuRSeQPbFunOib_g 3 1 402000 0 550.3mb 275.2mb
green open project.logtest54.532a44d6-2841-45f3-91fc-d3ee96033d57.2020.08.13 d5ESPmkeTrufgS222NxY5Q 3 1 151580 0 208.9mb 104.5mb
green open project.logtest45.ee24e156-b682-482a-a99f-d5aa2b6447df.2020.08.13 NJE7qfr9QDKZ2-z17CnGuQ 3 1 100500 0 139.4mb 69.5mb
green open project.logtest89.347a235a-5682-4bca-8a29-b0de6b18c59d.2020.08.13 gSP26jjpTre5uMWbOT1QhQ 3 1 51000 0 71.2mb 35.7mb
green open project.logtest19.bc3dbbd3-e996-4b28-8a38-dbf2a6959929.2020.08.13 i7CpFQ93StGJV_ZNUXfc7Q 3 1 203250 0 279.6mb 139.6mb
green open project.logtest25.a47ec1ff-d1ab-4d85-af73-79e6876df681.2020.08.13 bR2ZqQMWQQqrVeSPwVk8hA 3 1 252750 0 345.8mb 173.2mb
green open project.logtest73.b0fa5917-6c01-48e6-997b-4b8fec79258d.2020.08.13 9MaOc2B2SFG5akTSzmLXJg 3 1 203000 0 278.2mb 139.3mb
green open project.logtest66.5ff3ca9b-f769-41a5-abbf-f649b4053d24.2020.08.13 AxDKp6PuRwaBG3JrC4RnhA 3 1 102000 0 141.1mb 71.1mb
green open project.logtest4.4a8cf193-b552-467f-bd39-42e15dea26a0.2020.08.13 h1VuZy2jSDWsiDxgeBTGBA 3 1 15500 0 23.9mb 11.9mb
green open project.logtest44.37095fb0-5272-4baf-8a43-94b5f01c5cbc.2020.08.13 opqhAbfDR3OtduE7fsq2FQ 3 1 402000 0 550.3mb 275.1mb
green open project.logtest6.1c1f097c-0a56-4e80-b2ff-bcb166a080d1.2020.08.13 svsRcu6kTU2Osi_X8xqx9g 3 1 402000 0 548.5mb 274.2mb
green open project.logtest39.42a22cd5-3bb7-40fa-a39e-c63ea81e7f8b.2020.08.13 1EeMCqihRaOwhS9QSslX3Q 3 1 153250 0 211.5mb 105.4mb
green open project.logtest80.4dcf3c47-67c3-4af8-a20e-00ce32377cc7.2020.08.13 Kl2Tkha1Qg-ETwdQ-0gKdQ 3 1 202750 0 277.5mb 138.4mb
green open project.logtest90.39846254-42da-4b5d-be34-82f14c5975e3.2020.08.13 4DZ5a3C9R8aE5_EXeBYzBg 3 1 153250 0 210.2mb 104.6mb
green open project.logtest13.af6f76f8-0eca-4475-9626-b2dcac6b4b73.2020.08.13 BRbBWcLdQqKZ_SO_VCokTQ 3 1 203000 0 278.8mb 139.2mb
green open project.logtest14.06a31b33-301b-47cb-a78e-b742af36e96f.2020.08.13 6gNjA0iNTJGDVFYfkKhM1Q 3 1 204000 0 279.7mb 140.1mb
green open project.logtest29.c862fa9f-9537-4d25-b49a-76211a8f99b2.2020.08.13 OuMXpr5dRQGy3XH3WHa32w 3 1 204000 0 280.3mb 140.5mb
green open project.logtest3.b5a5874d-fb17-4b05-b0b0-a9cd68260b83.2020.08.13 C6NOizcZSYuWEzYRCqdarw 3 1 15500 0 23.1mb 11.4mb
green open project.logtest16.b58bcfe5-93e4-4a8b-87ab-ae4bd3122946.2020.08.13 syyaAOjzSfadd7_hsoD4Mw 3 1 302750 0 414.5mb 207.1mb
green open project.logtest36.d81afb6c-4002-4789-b602-bf636d7debc6.2020.08.13 3TvDkL9pQBqxDLd7Oa9sXQ 3 1 451750 0 618.7mb 309.1mb
green open project.logtest96.d47b7e6f-95b7-4acc-82bd-21121c4e1d51.2020.08.13 1GAD7hAJQ7aArMo2Ei8TrA 3 1 200500 0 276.1mb 137.9mb
green open project.logtest27.832873fc-807c-497a-a90c-ad43668c1058.2020.08.13 _aatrWOpRryuv1yc3_CN-g 3 1 253500 0 346.9mb 173.3mb
green open project.logtest48.ed54b2e3-ec08-48f6-948a-4e559aa27e76.2020.08.13 MdEZZWRMRdyC-6ihLV7QuA 3 1 202750 0 277.9mb 139mb
green open project.logtest50.50f673b5-33dc-464b-a11b-3d0f54f39d03.2020.08.13 5q7ibP-BTPG8ZVvjx5sBiQ 3 1 103750 0 143.4mb 71.5mb
green open project.logtest7.403606b2-be6b-4376-af12-b9ae8b857ee2.2020.08.13 M1eMjVDkQ5SaxOObxW7Uqg 3 1 303000 0 415.3mb 207.7mb
green open project.logtest8.3dbf737e-2aa8-4f63-ae6e-2a1bdd777713.2020.08.13 R4nm9EaOTViH8t98APETHA 3 1 251750 0 345.6mb 172.8mb
green open project.logtest99.9c8736df-c850-41b5-a107-d7af31b0c478.2020.08.13 WIGEqQyKTIaVBgUrOT4dSQ 3 1 203000 0 279.2mb 139.5mb
green open project.logtest0.bf7f1765-e5b7-424e-b741-0fd2f73c3b1d.2020.08.13 _NTYjQ11SN-Qyspr0DwqJA 3 1 200500 0 274.7mb 137.9mb
green open project.logtest31.327f9312-48f5-4687-8355-030cc116b5bb.2020.08.13 zCK7RdQwTHi7gSxwbhGR0A 3 1 303250 0 415.1mb 208.1mb
green open project.logtest17.a535b1b2-c638-43aa-9aee-9e9515dec5b6.2020.08.13 cSPYYwxeTf6wecdto8MBQw 3 1 550500 0 750.3mb 375.2mb
green open project.logtest85.3bc6a3fd-4e32-4a05-8211-ec175ea5ec64.2020.08.13 _sCuf0-9Rgmxz3S_N8bDKQ 3 1 103567 0 143.7mb 71.8mb
green open project.logtest18.3e85e952-e453-4b52-8923-b9416fdfd93c.2020.08.13 zOUeK3LSSkupxfQ_nRCMKg 3 1 352500 0 482.1mb 241.1mb
green open project.logtest3.e802146c-6399-467e-8d1e-fcdaccbcb6ed.2020.08.13 bNpeZbMoTD6zqj6Ybf0Log 3 1 151750 0 208.8mb 104.1mb
green open project.logtest26.a589cb51-4f5d-4ee8-9d7a-d01f90844eb5.2020.08.13 1kAgoux2QDyav81JWgTLzA 3 1 303000 0 415.5mb 207.8mb
green open project.logtest33.9d74063d-b622-49f7-ab7b-f8946660eaa2.2020.08.13 RIBzZ8NhSEmFp2AXsMFDSQ 3 1 253500 0 346.2mb 173mb
green open project.logtest74.095753e8-81bd-4679-9096-ac66b121825b.2020.08.13 r5P19K2dRkKj4j_NMA-G_A 3 1 100500 0 140.1mb 69.9mb
green open project.logtest23.bde0e437-ba92-43cf-994d-ea33dc8de76b.2020.08.13 YxG7ukVBRROVMxu2rUV29Q 3 1 352500 0 482.6mb 241.1mb
green open project.logtest64.96351148-6dc3-4996-b621-a77dd1cd2f00.2020.08.13 lDh0h2_wRk-xwIGpAJSolg 3 1 100000 0 139.7mb 69.5mb
green open project.logtest51.4172ecc8-b056-4677-bc9e-62f1f0c1b876.2020.08.13 m3uK8QN0R22ZV9wh-o1tsQ 3 1 203000 0 279.3mb 139.3mb
green open project.logtest35.a70d6498-bccf-4c0c-9b51-6fe96fe7d4e9.2020.08.13 mqXZO-UrSs-gsHylijnOaQ 3 1 201750 0 276.4mb 138.2mb
green open project.logtest53.f1074e11-6822-4851-8309-c9844245126a.2020.08.13 28R0VM6dQLC8EFoFwhn23A 3 1 202500 0 277.8mb 138.6mb
green open project.logtest94.5366d218-f1b7-4bd1-9c68-35b7058b3295.2020.08.13 5Km_OoT_T9avlomaNu33uA 3 1 152000 0 208.8mb 104.6mb
green open project.logtest69.d8aa5b46-57ce-4e8f-9921-7d26b9fb2595.2020.08.13 b9whDhvvTK2CEkTLa2YuAg 3 1 302750 0 414.3mb 206.8mb
green open project.logtest10.a31bb02a-3834-404d-93db-4d028ff90c3a.2020.08.13 1eUSUANATRWR20MmKg9Oow 3 1 351750 0 482.7mb 241.2mb
green open project.logtest92.e75fbec3-b659-4ed3-b0f8-06b34db01c10.2020.08.13 IFyGQVoaR6KWQ6NPQFdjug 3 1 252500 0 344.7mb 172.6mb
green open project.logtest93.a5781f58-63bf-47e8-8150-a5f898aecb3d.2020.08.13 0zqD198BSe-g1GfVthW82Q 3 1 154000 0 211.2mb 105.6mb
green open project.logtest76.bca080d1-7a0b-49fe-b706-01a54ddbaf54.2020.08.13 KcrgXBXdS7etK5P1VTAdfQ 3 1 252000 0 345.5mb 172.8mb
green open project.logtest68.f9aeca84-3084-4d5f-986f-29f47d576c0b.2020.08.13 9NkFUMsPQYSxx3K7NUtk5w 3 1 202750 0 277.7mb 138.6mb
green open project.logtest88.829a9619-62c9-47ee-a734-99d2ed35996f.2020.08.13 KuAN_CVrTMqJdBE0sky22g 3 1 200257 0 274.4mb 137.3mb
green open project.logtest0.f16f49bf-17f3-46cc-819d-27cc99930bd9.2020.08.13 xdfxlUYLSkC1VAUCRGHnsA 3 1 16000 0 24.2mb 12mb
green open project.logtest56.98a9c838-64f6-4daa-922d-6f1fdad64ea4.2020.08.13 iH5z9zeFSQOA6I3qh1MsvQ 3 1 154000 0 212.4mb 106.2mb
green open project.logtest59.daa14fbd-3ec7-48eb-9a95-fab00e21fd2a.2020.08.13 K-LmMZkFRy2F_5FeqmEIQw 3 1 104000 0 143.7mb 72.1mb
green open project.logtest65.054d59bf-3b61-4f48-aff8-9c43bff7a1d9.2020.08.13 AqbSKE4STOiIYLGtIIep6w 3 1 100500 0 139.5mb 69.5mb

Only 2 out of the 100 indices actually got the expected 600000 messages; the others are lacking a significant amount, even 15 minutes after the logging test stopped.

Looking closer at the last one, logtest65:

[kni@e16-h18-b03-fc640 ~]$ oc get pods -A -o wide | grep logtest65
logtest65 centos-logtest-8nq8h 1/1 Running 0 40m 10.130.24.7 worker035 <none> <none>

Checking the fluentd dir on node worker035:

[kni@e16-h18-b03-fc640 ~]$ oc debug node/worker035
Starting pod/worker035-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.222.48
If you don't see a command prompt, try pressing enter.
sh-4.2#
sh-4.2# chroot /host
sh-4.4# ls -ltr /var/lib/fluentd/ clo_default_output_es/ retry_clo_default_output_es/
sh-4.4# ls -ltr /var/lib/fluentd/ clo_default_output_es/ retry_clo_default_output_es/
sh-4.4# ls -ltr /var/lib/fluentd/clo_default_output_es/
total 8
-rw-r--r--. 1 root root 221 Aug 13 07:05 buffer.b5acbcee6812adcfb761bf2c22534d4a2.log.meta
-rw-r--r--. 1 root root 1187 Aug 13 07:05 buffer.b5acbcee6812adcfb761bf2c22534d4a2.log

To me it looks like fluentd is not sending buffered messages to ES anymore.
A must-gather can be found at http://file.str.redhat.com/mkarg/bz1833486/must-gather.tgz
Please let me know if you need any further information from the cluster.

--- Additional comment from Yaniv Joseph on 2020-09-10 13:21:32 UTC ---

Hi Jeff,

As the case is still unresolved, can you re-open the BZ?

Thanks,
Yaniv
Closing duplicate. Per the comments:

> I can confirm that logging at 2.5k/s works fine on 4.5.5 but going for higher rates leaves me with messages stuck in the fluentd pods and never getting delivered to ES.

This is a known limitation that cannot be addressed without figuring out how to miss fewer log rotations, which we have a plan to do; fluentd just isn't the solution.

*** This bug has been marked as a duplicate of bug 1872465 ***
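
For anyone re-checking the numbers in Marko's report: the expected 600000 messages per index follows from 50000 msgs/sec spread evenly over 100 pods for 20 minutes. Below is a minimal Python sketch of that arithmetic and of flagging short indices. It assumes the index listing in the report is `_cat/indices` output in the default column order (health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size, pri.store.size). The helper names `parse_doc_counts` and `short_indices` are hypothetical and not part of any tool mentioned in this bug; the three sample rows are copied from the report.

```python
# Assumed test parameters, taken from the report:
# 50000 msgs/sec in total, 100 pods, 20 minutes.
TOTAL_RATE = 50_000
PODS = 100
DURATION_SECONDS = 20 * 60
EXPECTED_PER_INDEX = (TOTAL_RATE // PODS) * DURATION_SECONDS

# Three rows copied from the listing in the report.
SAMPLE = """\
green open project.logtest1.721c4164-e3ec-42ca-a1ee-738abec432d6.2020.08.13 G-11jZRVQkqs270rRy7OmQ 3 1 600000 0 819mb 409.6mb
green open project.logtest65.054d59bf-3b61-4f48-aff8-9c43bff7a1d9.2020.08.13 AqbSKE4STOiIYLGtIIep6w 3 1 100500 0 139.5mb 69.5mb
green open project.logtest89.347a235a-5682-4bca-8a29-b0de6b18c59d.2020.08.13 gSP26jjpTre5uMWbOT1QhQ 3 1 51000 0 71.2mb 35.7mb
"""


def parse_doc_counts(listing):
    """Map index name to document count, assuming 10 whitespace-separated columns."""
    counts = {}
    for line in listing.splitlines():
        fields = line.split()
        # Skip anything that does not look like a data row (blank or header lines).
        if len(fields) != 10 or not fields[6].isdigit():
            continue
        counts[fields[2]] = int(fields[6])
    return counts


def short_indices(counts, expected):
    """Return {index name: number of missing documents} for indices below expected."""
    return {name: expected - docs for name, docs in counts.items() if docs < expected}


if __name__ == "__main__":
    missing = short_indices(parse_doc_counts(SAMPLE), EXPECTED_PER_INDEX)
    for name, gap in sorted(missing.items()):
        print(f"{name}: {gap} messages short of {EXPECTED_PER_INDEX}")
```

Keying on the full index name rather than the namespace keeps indices that share a namespace but carry different UUIDs separate, as several do in the listing.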