Bug 1918691

Summary: must-gather collects too much data
Product: OpenShift Container Platform Reporter: Francesco Romani <fromani>
Component: Performance Addon Operator Assignee: Francesco Romani <fromani>
Status: CLOSED ERRATA QA Contact: Gowrishankar Rajaiyan <grajaiya>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.7 CC: aos-bugs, grajaiya, kquinn, mniranja, shajmakh
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: must-gather collects an unbounded amount of kubelet logs on all nodes.
Consequence: an excessive amount of data is transferred and collected, with no clear benefit for the user.
Fix: collect a bounded amount (last 8 hours) of kubelet logs, and only on worker nodes (not on control plane nodes).
Result: there is an upper bound on the data collected.
Story Points: ---
Clone Of:
: 1918693 Environment:
Last Closed: 2022-08-26 14:47:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1918693    

Description Francesco Romani 2021-01-21 11:52:47 UTC
Description of problem:

must-gather collects too much data. The result is unnecessarily large and doesn't help troubleshooting; in fact, the sheer size makes the investigation harder.
It also increases transfer time.
Most notably, kubelet logs are unnecessarily big because there is no bound on how much history we collect (= we get everything that is available on the node!).

Version-Release number of selected component (if applicable):
4.7.0

How reproducible:
always

Steps to Reproduce:
1. run must-gather against a cluster

Actual results:
must-gather takes too long to collect data, and the collected data is larger than needed (= no clear need for it). Most notably, the kubelet logs are too big.

Expected results:
must-gather collects a bounded amount of kubelet logs (not everything)

Additional info:
It is unclear how much history is enough. Unless there is a clear reason to go further back in time, any chosen amount is a heuristic.
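
For illustration only, a bounded collection could be sketched as below. The 8-hour window and the worker-only scope come from the Doc Text of this bug. The helper name `kubelet_log_command` is hypothetical, and the use of `oc adm node-logs` with `--role`, `-u` and a relative `--since` value is an assumption about how the bound could be expressed, not the actual must-gather script.

```python
from typing import List


def kubelet_log_command(hours: int = 8, role: str = "worker") -> List[str]:
    """Build a command line that fetches a bounded window of kubelet logs.

    Hypothetical helper: the real must-gather script may differ.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    return [
        "oc", "adm", "node-logs",
        f"--role={role}",        # only worker nodes, not control plane nodes
        "-u", "kubelet",         # only the kubelet unit
        f"--since=-{hours}h",    # assumed relative form: last N hours
    ]


if __name__ == "__main__":
    # Print the command instead of running it; no cluster is needed.
    print(" ".join(kubelet_log_command()))
```

Keeping the window as a parameter makes the heuristic easy to change if a clear reason to go further back in time comes up.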

Comment 1 Francesco Romani 2021-01-21 12:19:04 UTC
To reproduce:
1. run must-gather (repeatedly)
2. verify it still collects kubelet logs
3. verify the kubelet logs cover approximately the last 8 hours
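
The checks in steps 2 and 3 could be automated along these lines. This is a minimal sketch, assuming the timestamps of the collected kubelet log entries have already been parsed into `datetime` objects; the function name `within_window` and the 30-minute tolerance are hypothetical choices, not part of the fix.

```python
from datetime import datetime, timedelta
from typing import Iterable


def within_window(
    timestamps: Iterable[datetime],
    collected_at: datetime,
    window: timedelta = timedelta(hours=8),
    tolerance: timedelta = timedelta(minutes=30),
) -> bool:
    """Return True if logs exist and none is older than the window.

    A node with less than 8 hours of history still passes, because
    only the upper bound on the collected history is checked.
    """
    entries = sorted(timestamps)
    if not entries:
        # Step 2: kubelet logs must still be collected.
        return False
    oldest = entries[0]
    # Step 3: the oldest entry is at most (window + tolerance) old.
    return collected_at - oldest <= window + tolerance


if __name__ == "__main__":
    now = datetime(2021, 1, 21, 12, 0, 0)
    sample = [now - timedelta(hours=7, minutes=50), now - timedelta(minutes=1)]
    print(within_window(sample, now))
```

The tolerance absorbs the "approximately" in step 3, since collection itself takes some time.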