+++ This bug was initially created as a clone of Bug #1591118 +++

Description of problem:
Tons of "invalid character 'c' looking for beginning of value" errors appear in the node log when the cluster is running with CRI-O.

Jun 13 23:58:12 ip-172-18-14-79.ec2.internal atomic-openshift-node[18924]: ERROR: logging before flag.Parse: W0613 23:58:12.971203 18924 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod2262d6c2_6f87_11e8_86dd_0eb037b2774e.slice/crio-b188fba45a612b6fd7cbb00dd6150e182e86ecef6f0447abd67210f08e2a456f.scope WatchSource:0}: invalid character 'c' looking for beginning of value
Jun 13 23:58:16 ip-172-18-14-79.ec2.internal atomic-openshift-node[18924]: ERROR: logging before flag.Parse: W0613 23:58:16.489277 18924 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod2262d6c2_6f87_11e8_86dd_0eb037b2774e.slice/crio-2cb7f38521db0cd9584b4d52c7d5e76e39aea96f226b7f4fadf811252e3c5e5a.scope WatchSource:0}: invalid character 'c' looking for beginning of value
Jun 13 23:59:40 ip-172-18-14-79.ec2.internal atomic-openshift-node[18924]: ERROR: logging before flag.Parse: W0613 23:59:40.225422 18924 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod5b7ccb19_6f87_11e8_86dd_0eb037b2774e.slice/crio-e6ccc38071c3b403accd153c11775f7fd933305f31b551d4b6c6597692ce777d.scope WatchSource:0}: invalid character 'c' looking for beginning of value
Jun 13 23:59:54 ip-172-18-14-79.ec2.internal atomic-openshift-node[18924]: ERROR: logging before flag.Parse: W0613 23:59:54.687828 18924 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod5b7ccb19_6f87_11e8_86dd_0eb037b2774e.slice/crio-ea3d01c5538a2b238bbdacf534aa30b77454cecde90d47c782a9e2c2722ae297.scope WatchSource:0}: invalid character 'c' looking for beginning of value

Version-Release number of selected component (if applicable):
v3.10.0-0.67.0

How reproducible:
Always

Steps to Reproduce:
1. Set up an OCP cluster with CRI-O
2. Check the node log

Actual results:
Tons of "invalid character 'c' looking for beginning of value" errors

Expected results:
Should not have such errors

Additional info:
There is no such problem in a Docker container environment.

--- Additional comment from Antonio Murdaca on 2018-06-14 08:04:16 UTC ---

Fix is here: https://github.com/kubernetes-incubator/cri-o/pull/1615

--- Additional comment from Antonio Murdaca on 2018-06-14 10:02:39 UTC ---

That was not the fix, my bad; I'm working on one right now.

--- Additional comment from Antonio Murdaca on 2018-06-14 10:38:18 UTC ---

Fix here: https://github.com/kubernetes-incubator/cri-o/pull/1616

--- Additional comment from Mrunal Patel on 2018-06-19 17:48:04 UTC ---

This is fixed in CRI-O 1.10.3.

--- Additional comment from Mrunal Patel on 2018-06-20 12:50:38 UTC ---

Can this be tested and moved to verified?
--- Additional comment from Peter Ruan on 2018-06-20 17:26:00 UTC ---

Still seeing the errors in the node logs with crio version 1.10.3:

[root@qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 ~]# crio --version
crio version 1.10.3
commit: "54383a69903e1b46526be3dbe0bea71b14f00e46-dirty"
[root@qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 ~]# journalctl -u atomic-openshift-node.service | grep "Failed to process watch event"
Jun 18 16:30:00 qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 atomic-openshift-node[22829]: W0618 16:30:00.331301 22829 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podf8ddfa563d27a7f9ca95644fe2f8cdd0.slice/crio-28263d9946f502dc7ed03ecb134d8f13f55b6cc3863aad0e3e4f42ea8bfcfc34.scope WatchSource:0}: invalid character 'c' looking for beginning of value
Jun 18 16:30:00 qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 atomic-openshift-node[22829]: W0618 16:30:00.333551 22829 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod83e9c11b241c6539016d50a962142eda.slice/crio-075774ac13d768cf7f4729bba748570a5eaa329ca63eae303417d28ffcfe494a.scope WatchSource:0}: invalid character 'c' looking for beginning of value

--- Additional comment from Antonio Murdaca on 2018-06-21 07:59:01 UTC ---

Could you provide the full CRI-O log for the above run, please?
--- Additional comment from Antonio Murdaca on 2018-06-21 08:58:40 UTC ---

(In reply to Peter Ruan from comment #6)
> still seeing the errors in the node logs with crio version 1.10.3
>
> [root@qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 ~]# crio --version
> crio version 1.10.3
> commit: "54383a69903e1b46526be3dbe0bea71b14f00e46-dirty"
> [root@qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 ~]# journalctl -u atomic-openshift-node.service | grep "Failed to process watch event"
> Jun 18 16:30:00 qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 atomic-openshift-node[22829]: W0618 16:30:00.331301 22829 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podf8ddfa563d27a7f9ca95644fe2f8cdd0.slice/crio-28263d9946f502dc7ed03ecb134d8f13f55b6cc3863aad0e3e4f42ea8bfcfc34.scope WatchSource:0}: invalid character 'c' looking for beginning of value
> Jun 18 16:30:00 qe-pruan-auto-310-all-in-one-crio-preserve-merrn-1 atomic-openshift-node[22829]: W0618 16:30:00.333551 22829 manager.go:1178] Failed to process watch event {EventType:0 Name:/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod83e9c11b241c6539016d50a962142eda.slice/crio-075774ac13d768cf7f4729bba748570a5eaa329ca63eae303417d28ffcfe494a.scope WatchSource:0}: invalid character 'c' looking for beginning of value

Could you also provide access to the test machine?
--- Additional comment from Meng Bo on 2018-06-21 11:05:48 UTC ---

Yeah, I can see the problem on OCP 3.10.2-1 and cri-o 1.10.3:

Jun 21 17:25:40 ocp310-node2.bmeng.local atomic-openshift-node[1663]: ERROR: logging before flag.Parse: E0621 17:25:36.856474 1663 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod32c61a68_6881_11e8_8add_525400c1455c.slice/crio-10cb6cd6ad5a692d5a7cd31ee355447ebda9d3d16026768ae4c4978ddf643e60.scope: invalid character 'c' looking for beginning of value
Jun 21 17:25:40 ocp310-node2.bmeng.local atomic-openshift-node[1663]: ERROR: logging before flag.Parse: E0621 17:25:36.857341 1663 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod32c61a68_6881_11e8_8add_525400c1455c.slice/crio-60e955109ec274dc6b4d3e7237097e1eb106aa2e2255d94a2ae221b69b18873c.scope: invalid character 'c' looking for beginning of value

--- Additional comment from Antonio Murdaca on 2018-06-25 14:44:50 UTC ---

Tentatively moving this to 3.10.z because I don't believe this is a blocker (though we're still actively working on it!). The issue here is that the cgroups show up earlier than cadvisor querying the CRI-O endpoint, resulting in that error in the logs. The next time cadvisor queries the CRI-O endpoint, everything succeeds.

--- Additional comment from Antonio Murdaca on 2018-08-27 13:50:01 UTC ---

Can you re-test with the latest openshift+crio? If it's still not working, could you please provide a test VM where I can reproduce it myself? Thanks.

--- Additional comment from Meng Bo on 2018-08-29 05:59:37 UTC ---

I cannot see the ERRORs on OCP v3.10.34 with cri-o-1.10.6-1.rhaos3.10.git56d7d9a.el7.x86_64.

--- Additional comment from Antonio Murdaca on 2018-08-29 09:07:06 UTC ---

Can you move this to verified?
Looks like that issue is still present, at least running 3.11 / cri-o 1.11.7-1:

```
[root@master1 ~]# grep 'Failed to create' /var/log/messages | tail -5
Mar 19 08:36:37 master1 origin-node: E0319 08:36:37.844187 97021 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod47be93eaccaa069bb51ece81ac83d4dd.slice/crio-a404d24228a566d7868275171610473cc576bad2186767e6cd12dd958b2986cf.scope: invalid character 'c' looking for beginning of value
Mar 19 08:36:37 master1 origin-node: E0319 08:36:37.845545 97021 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podda15a24e_3d44_11e9_b6a9_52540016f6b8.slice/crio-048fa6c53c26e783ddb6d23c921552ed63182468c150863e3176f4092f8f2e76.scope: invalid character 'c' looking for beginning of value
Mar 19 08:36:37 master1 origin-node: E0319 08:36:37.850936 97021 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-pod6584060d_426b_11e9_b846_525400ddfbf0.slice/crio-0c255b94555d6263c8f776b87e9a6d77cc89062f84b7c5ddd99e212a628b3103.scope: invalid character 'c' looking for beginning of value
Mar 19 08:36:37 master1 origin-node: E0319 08:36:37.853019 97021 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-pod42752a46_321d_11e9_844d_52540016f6b8.slice/crio-b4423ddd6c52fc11c3fa9bb9b7abaf94d04a09fa2953c11da2871bb1a8c12b81.scope: invalid character 'c' looking for beginning of value
Mar 19 08:36:37 master1 origin-node: E0319 08:36:37.854713 97021 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode6f04d9558a3d478e7971667ef0cc6f4.slice/crio-f9e540c0da1dea83e1664030aa33cc3373c1e72623baf990540a6eddc50a43dc.scope: invalid character 'c' looking for beginning of value
[root@master1 ~]# grep 'Failed to create' /var/log/messages | head -1
Mar 14 18:47:25 master1 origin-node: E0314 18:47:25.598035 4205 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podc96e1c1a_3205_11e9_b6f4_525400ddfbf0.slice/crio-36002b3eb498cc6805b72c8593659eb73e1c1a61940a6d067697bb3082598050.scope: invalid character 'c' looking for beginning of value
[root@master1 ~]# grep 'Failed to create' /var/log/messages | wc -l
50681
```

According to GitHub, 1.11.7 was released mid-October 2018. Since the fix discussed for 1.10 was validated in August, I've been hoping the commits were cherry-picked, and maybe they were. Yet looking at my logs, those "invalid character 'c'" errors are still there, spamming.
We have a 1.11.11 release. Can you try this out? https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=829231
Sure, doesn't seem to help though:

[root@master3 ~]# grep 'Failed to create' /var/log/messages | tail -5 ; uptime ; rpm -qa | grep cri-o
Mar 19 15:48:05 master3 origin-node: E0319 15:48:05.882733 26633 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod6b449b4a_3202_11e9_b6f4_525400ddfbf0.slice/crio-3263424e308267d201f2b63fb2c415acd40f80b77a45298a19ba70b99b93096b.scope: invalid character 'c' looking for beginning of value
Mar 19 15:48:05 master3 origin-node: E0319 15:48:05.885366 26633 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod857c7e5cc6472964d18dacaed15e0068.slice/crio-5939681e5e03ff1b6effa7c9496d9facff00128cf69f0cc0cff3e2fb02c6a1da.scope: invalid character 'c' looking for beginning of value
Mar 19 15:48:05 master3 origin-node: E0319 15:48:05.887436 26633 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod7fbd123d_3202_11e9_b6f4_525400ddfbf0.slice/crio-f4e1906fef97a7bf7692a36c0375ebb6427fbed32aef0bc3f71c438b34e05430.scope: invalid character 'c' looking for beginning of value
Mar 19 15:48:05 master3 origin-node: E0319 15:48:05.889320 26633 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf08aaa63_3204_11e9_b6f4_525400ddfbf0.slice/crio-a630dedc5ca6c46df97e01be98e8400094947117dfc69fe513dedc9669d92630.scope: invalid character 'c' looking for beginning of value
Mar 19 15:48:05 master3 origin-node: E0319 15:48:05.890950 26633 manager.go:1130] Failed to create existing container: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-poda02ff2c5f7bde3d006c9b493d09fc61e.slice/crio-d014ae537449af1c60cde717bc26fc3168ce1f6abff5ad156711c8b907ddc81d.scope: invalid character 'c' looking for beginning of value
15:48:47 up 29 min, 1 user, load average: 2.53, 2.83, 3.29
cri-o-1.11.11-1.rhaos3.11.git474f73d.el7.x86_64

Regards.
Meanwhile, I should point out that this 1.11.11 package seems to generate an awful lot of logs. My /var/log/messages is rotated weekly; the average size of previous copies is something like 50M, uncompressed, while my current /var/log/messages is almost 1G today.

Taking one log at random that comes back again and again:

[root@master1 log]# grep /opt/cni/bin messages | wc -l
876318
[root@master1 log]# grep /opt/cni/bin messages | tail -3
Mar 21 15:18:54 master1 crio: time="2019-03-21 15:18:54.116562963+01:00" level=error msg="Error adding network: failed to find plugin "bridge" in path [/opt/cni/bin/ /opt/bridge/bin]"
Mar 21 15:18:54 master1 crio: time="2019-03-21 15:18:54.116628238+01:00" level=error msg="Error while adding to cni network: failed to find plugin "bridge" in path [/opt/cni/bin/ /opt/bridge/bin]"
Mar 21 15:18:54 master1 crio: time="2019-03-21 15:18:54.116729436+01:00" level=error msg="Error deleting network: failed to find plugin "bridge" in path [/opt/cni/bin/ /opt/bridge/bin]"

I'd suggest not publishing that package as is.

Regards.
Hey Samuel,

I've set up a cluster to reproduce (3.11 with 1.11.11) and haven't been able to. The second set of logs looks like a faulty CNI config, which may be a problem with how the cluster was set up. Can you send me the contents of your plugin_dir and network_dir? You should be able to find which directories they are by looking at /etc/crio/crio.conf.

Best,
Peter
Also, can you give me more details on how your cluster was brought up?
I'm tracking this issue for a customer having this problem on OpenShift 3.11 and CRI-O 1.11.10.

> Since the fix discussed for 1.10 was validated in august, I've been hoping they were cherry-picked.

I see the fix from 1.10 is at least present in 1.11, so perhaps we're seeing this issue manifest in another way?

3.11 -> https://github.com/cri-o/cri-o/commit/d0e48a80d
Verified 3.10 fix for comparison -> https://github.com/cri-o/cri-o/pull/1616/files
As a follow-up, I've got another customer complaining about those messages (3.11.98).

> How was that cluster brought up?

Nothing out of the ordinary:

os_sdn_network_plugin_name: redhat/openshift-ovs-multitenant
os_firewall_use_firewalld: True
openshift_master_cluster_hostname: xxx
openshift_master_cluster_public_hostname: xxx
openshift_master_default_subdomain: xxx
openshift_master_api_port: '443'
openshift_master_console_port: "{{ openshift_master_api_port }}"
openshift_master_session_name: openshift-internal
openshift_master_named_certificates:
- certfile: xxx
  keyfile: xxx
  cafile: xxx
  names:
  - xxx
openshift_master_overwrite_named_certificates: true
oreg_url: "xxx/openshift3/ose-${component}:${version}"
oreg_auth_user: foo
oreg_auth_password: bar
openshift_examples_modify_imagestreams: true
openshift_docker_blocked_registries: all
openshift_docker_insecure_registries: []
openshift_metrics_install_metrics: False
openshift_metrics_server_install: False
openshift_logging_install_eventrouter: False
openshift_logging_install_logging: False
openshift_cluster_monitoring_operator_install: False
openshift_crio_var_sock: /var/run/crio/crio.sock
openshift_use_crio: True
openshift_use_crio_only: False
openshift_crio_edits:
- key: kubeletArguments.container-runtime
  value: [ remote ]
- key: kubeletArguments.container-runtime-endpoint
  value: [ "{{ openshift_crio_var_sock }}" ]
- key: kubeletArguments.image-service-endpoint
  value: [ "{{ openshift_crio_var_sock }}" ]
- key: kubeletArguments.runtime-request-timeout
  value: [ 10m ]
#openshift_node_groups: (don't have a copy; the main diff with the default CRI-O node groups is a zone label)

Last time I saw that issue, openshift_use_crio_only was set to True and use_firewalld to False. We also had some non-default openshift_node_groups.