> - However, along with the 'netplugin failed with no error message: signal: killed' and 'name is reserved' messages, they observed the apiserver reporting panic.
> - As a workaround, they restart the kube-apiserver pods and everything works as expected for a while.

I would like the following to be tracked:
- Do a grep on the kube-apiserver logs and give us a count of the total number of panics seen, along with the time range (a sketch of such a grep is below).
- Can you have the customer run the following Prometheus query on the web console (time range = starting with the time the master was rebooted and going back 48 hours) and share the screenshot with us?

> sum(apiserver_flowcontrol_current_executing_requests) by (flowSchema,priorityLevel)
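For the panic count, a minimal sketch of such a grep (assuming the logs come from a must-gather, so reusing the namespaces/openshift-kube-apiserver/ path from the grep requested in the next comment):

> grep -rni "panic" namespaces/openshift-kube-apiserver/* | wc -l

The timestamps on the first and last matching lines (drop the "| wc -l" to see them) would give the time range.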
apjagtap, requests for new data:

Grep the current kube-apiserver logs (all instances):

> grep -rni -E "timeout.go:(132|134)" namespaces/openshift-kube-apiserver/*

Please run the following Prometheus queries and share the entire screenshot with me:

> topk(25, sum(apiserver_flowcontrol_current_executing_requests) by (priorityLevel,instance))
> topk(25, sum(apiserver_flowcontrol_request_concurrency_limit) by (priorityLevel,instance))

Thanks!
apjagtap,

> Should I open another bug and share it over

Sounds good to me, and please follow the instructions for data capture from this: https://bugzilla.redhat.com/show_bug.cgi?id=1908383#c19
*** This bug has been marked as a duplicate of bug 1924741 ***