Bug 1635035
| Summary: | Running sosreport on OCP cluster node fails with IndexError from kubernetes plugin trying to get namespaces | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Candace Sheremeta <cshereme> |
| Component: | sos | Assignee: | Pavel Moravec <pmoravec> |
| Status: | CLOSED ERRATA | QA Contact: | Miroslav Hradílek <mhradile> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.5 | CC: | agk, bmr, cshereme, cww, gavin, klaas, pdwyer, plambri, sbradley |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | sos-3.7-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-08-06 13:15:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1594286, 1648022 | ||
Deferring to 7.7; it is too late for 7.6 (where the bug is presumably present as well, since the code hasn't changed since sos 3.5).
If you can reproduce it on your system, could you please provide me access to it?
Or, at least, provide one of the following:
1) Add a print statement to /usr/lib/python2.7/site-packages/sos/plugins/kubernetes.py, between the two existing lines shown here:
    kn = self.get_command_output('%s get namespaces' % kube_cmd)
    print("kube_cmd='%s', kn=%s" % (kube_cmd, kn))
    knsps = [n.split()[0] for n in kn['output'].splitlines()[1:] if n]
then re-run the failing sosreport and provide its stdout output.
2) Or provide the output of the following commands (this is the output whose parsing for namespaces is failing):
    kubectl get namespaces
    kubectl --kubeconfig=/etc/origin/master/admin.kubeconfig get namespaces
(FYI in 02193703, I think the sosreport was killed by OOM killer as it was executing another plugin (logs) - see https://bugzilla.redhat.com/show_bug.cgi?id=1183244)
     if self.check_is_master():
         kube_cmd = "kubectl "
         if path.exists('/etc/origin/master/admin.kubeconfig'):
-            kube_cmd += "--config=/etc/origin/master/admin.kubeconfig"
+            kube_cmd += "--kubeconfig=/etc/origin/master/admin.kubeconfig"

Greetings
Klaas
(In reply to Klaas Demter from comment #3)
> if self.check_is_master():
>     kube_cmd = "kubectl "
>     if path.exists('/etc/origin/master/admin.kubeconfig'):
> -        kube_cmd += "--config=/etc/origin/master/admin.kubeconfig"
> +        kube_cmd += "--kubeconfig=/etc/origin/master/admin.kubeconfig"
>
> Greetings
> Klaas

That is upstream commit https://github.com/sosreport/sos/commit/63ad6c2, included in 3.6, which we rebase to in RHEL 7.6. Do you suggest this fixes this BZ?

Yeah, that fixes it for me (Case 02192312)

Hey Pavel, I can confirm that https://github.com/sosreport/sos/commit/63ad6c2 fixed this issue within my environment as well.

.. and here is the problem that is basically only worked around in sos 3.6:

python
Python 2.7.15 (default, Sep 21 2018, 23:26:48)
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> kn = {'status': 1, 'output': u'Error: unknown flag: --config\n\n\nExamples:\n'}
>>> kn
{'status': 1, 'output': u'Error: unknown flag: --config\n\n\nExamples:\n'}
>>> [n.split()[0] for n in kn['output'].splitlines()[1:] if n]
[u'Examples:']
>>>

(so far so good; let's use a longer output snippet, with a line containing spaces)

>>> kn={'status': 1, 'output': u'Error: unknown flag: --config\n\n\nExamples:\n # List all pods in ps output format.\n kubectl get pods\n \n # List all pods in ps output format with more information (such as node name).\n kubectl get pods -o wide\n \n'}
>>> [n.split()[0] for n in kn['output'].splitlines()[1:] if n]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>>

So the problem occurs when the output contains a '\n \n' substring (a line with spaces only), where:
- this line is nonempty, so "if n" is True
- splitting this line returns an empty list
- we try to access the first item of that list

So the code *is* wrong, and it works in 3.6 only because the output does not contain lines with spaces only.
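The mechanism described in the interpreter session above can be isolated in a few lines. This is only an illustrative sketch: `offending_lines` and `sample` are hypothetical names that are not part of sos, and the sample text is a shortened version of the kubectl usage output quoted above.

```python
def offending_lines(output):
    """Return lines that pass an `if n` filter but split into an empty list.

    Hypothetical helper for illustration; it is not part of the sos plugin.
    """
    return [n for n in output.splitlines()[1:] if n and not n.split()]


# Shortened stand-in for the kubectl error/usage output quoted above.
sample = (
    "Error: unknown flag: --config\n\n\nExamples:\n"
    "  # List all pods in ps output format.\n"
    "  kubectl get pods\n"
    "  \n"
)

# A whitespace-only line is truthy, yet str.split() yields [] for it.
bad = offending_lines(sample)
print(repr(bad))

# The comprehension from kubernetes.py fails on exactly those lines.
try:
    [n.split()[0] for n in sample.splitlines()[1:] if n]
except IndexError as exc:
    print("IndexError: %s" % exc)
```

Empty lines are filtered out by `if n`, so only lines made up entirely of whitespace reach `n.split()[0]` with an empty list.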
Ideally the assignment should be like:

    knsps = [n.split()[0] for n in kn['output'].splitlines()[1:] if n and len(n.split())]

So until something gets broken, the code in 3.6 *will* work well. But I will leave the BZ open, to be fixed by the improvement in https://github.com/sosreport/sos/pull/1442

For verification:
1) Mimic a Kubernetes master and fake the kubectl command so that it returns the problematic output:
    mkdir -p /etc/origin/master/
    echo "echo 'nonempty line'; echo ' '; echo 'another nonempty'" > /usr/bin/kubectl
    chmod a+x /usr/bin/kubectl
2) Run sosreport:
    sosreport -o kubernetes --batch --build
3) Check that the kubernetes plugin does not raise the above exception from the Description.

posted to upstream

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2019:2295
Description of problem:
Running a sosreport on an OCP cluster node fails with the following error from the kubernetes plugin:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1252, in setup
    plug.setup()
  File "/usr/lib/python2.7/site-packages/sos/plugins/kubernetes.py", line 75, in setup
    knsps = [n.split()[0] for n in kn['output'].splitlines()[1:] if n]
IndexError: list index out of range

Version-Release number of selected component (if applicable):
sos-3.5-9.el7_5 seen on OCP 3.9 and 3.10 clusters

How reproducible:
100%

Steps to Reproduce:
1. Set up OCP cluster (I have customers reporting this issue from both 3.9 and 3.10 clusters, and was able to reproduce the issue on master nodes for both versions)
2. Run sosreport
3.

Actual results:
~~~
# sosreport

sosreport (version 3.5)

This command will collect diagnostic and configuration information from this Red Hat Enterprise Linux system and installed applications.
...
Setting up archive ...
Setting up plugins ...
caught exception in plugin method "kubernetes.setup()"
writing traceback to sos_logs/kubernetes-plugin-errors.txt
Running plugins. Please wait ...
~~~

sos_logs/kubernetes-plugin-errors.txt shows:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/sos/sosreport.py", line 1252, in setup
    plug.setup()
  File "/usr/lib/python2.7/site-packages/sos/plugins/kubernetes.py", line 75, in setup
    knsps = [n.split()[0] for n in kn['output'].splitlines()[1:] if n]
IndexError: list index out of range

Expected results:
No exception

Additional info:
N/A
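For reference, a guarded variant of the failing line in the traceback could look roughly like the sketch below. It is not the change merged upstream: `parse_namespaces` is a hypothetical name, the input is assumed to be a dict with 'status' and 'output' keys like the `kn` value shown in this report, and the early return on a non-zero status is an extra assumption that the thread does not propose.

```python
def parse_namespaces(result):
    """Extract namespace names from a {'status': int, 'output': str} dict.

    Hypothetical helper, not the sos implementation. The first output line
    is treated as the header of `kubectl get namespaces` and skipped.
    """
    # Assumption: error or usage output from a failed command is not parsed.
    if result.get('status') != 0:
        return []
    names = []
    for line in result.get('output', '').splitlines()[1:]:
        fields = line.split()
        # Empty and whitespace-only lines both give an empty list here.
        if fields:
            names.append(fields[0])
    return names


ok = {
    'status': 0,
    'output': 'NAME STATUS AGE\ndefault Active 1d\n  \nkube-system Active 1d\n',
}
failed = {
    'status': 1,
    'output': 'Error: unknown flag: --config\n\n\nExamples:\n  \n',
}

print(parse_namespaces(ok))
print(parse_namespaces(failed))
```

Splitting each line once and testing the resulting list covers both empty and whitespace-only lines, which is the same condition as the `if n and len(n.split())` filter suggested in the comments.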