Bug 1910158
| Summary: | Create a multi-arch resource monitor to auto-detect and clean-up leaked clusters | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jeremy Poulin <jpoulin> |
| Component: | Multi-Arch | Assignee: | Basavaraju <bgirriam> |
| Multi-Arch sub component: | IBM P / Z | QA Contact: | Deep Mistry <dmistry> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | low | ||
| Priority: | low | CC: | aos-bugs, clnperez, dslavens, mhamzy, rdossant, skuznets, wking |
| Version: | 4.6 | Keywords: | TestOnly |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | s390x | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1882785 | Environment: | |
| Last Closed: | 2022-08-30 16:08:45 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1882785 | ||
| Bug Blocks: | |||
Description
Jeremy Poulin
2020-12-22 21:45:38 UTC
As part of bug triage, I'm changing the status to "Assigned" as I see that the bug is currently assigned to Deep.
Hi Deep, do you think this bug will be resolved before the end of this sprint (January 16th)? If not, can we add "UpcomingSprint"?
Hi Deep, do you know if this bug will be resolved before the end of this sprint (Feb. 6th)? If not, can we set the "Reviewed-In-Sprint" flag to "+"?
At the moment we require more input from the testplatform team, so this bug will not be resolved in this sprint.
Hi Deep, do you think this bug will be resolved by the end of this sprint (Feb 27th)? If not, can we set "Reviewed-in-Sprint"?
Hi Deep, do you think this bug will be resolved by the end of this sprint (Mar 20th)? If not, can we set "Reviewed-in-Sprint"?
Hi Deep, do you think this bug will be resolved by the end of this sprint (Apr 10th)? If not, can we set "Reviewed-in-Sprint"?
Hi Deep, do you think this bug will be resolved by the end of this sprint (May 1st)? If not, can we set "Reviewed-in-Sprint"?
Some progress has been made after the initial investigation. @steve kuz, can you provide any info on how we can test the controller locally? cc @mhamzy cc @skuznets
More discussion on the progress: https://coreos.slack.com/archives/CBN38N3MW/p1627399369095300
Hi Deep, do you think this bug will be resolved before the end of the current sprint (Sep 24th)? If not, can we add the "reviewed-in-sprint" flag?
Hi Deep, do you think this bug will be resolved before the end of the current sprint (Nov 27th)? If not, can we set the "reviewed-in-sprint" flag?
Hi Deep, do you think this bug will be resolved before the end of the current sprint (January 8th)? If not, can we set "reviewed-in-sprint"?
Hi Deep, it was mentioned during backlog refinement that the assignee for this bug might change. Can we change the assignee to the person currently working on this bug?
Hi Basava, do you think this bug will be resolved before the end of the current sprint (January 29th)? If not, can we set the "reviewed-in-sprint" flag to indicate that we have looked at the bug?
Adding reviewed-in-sprint, as it was mentioned during yesterday's sprint planning that Basava will continue to work on this bug.
Hi Basava, do you think this bug will be resolved before the end of the current sprint (February 19th)? If not, can we set the "reviewed-in-sprint" flag to indicate that we have looked at the bug and will continue to work on it?
Basava indicated that he will continue to work on this in the next sprint, so I am setting the flag.
Chatted with Basava - this bug will continue in the next sprint. Keeping the "reviewed-in-sprint+" label.
Hi Basava, do you know if this bug will be resolved before the end of the current sprint (April 23rd)? If not, can we set the "reviewed-in-sprint" flag?
Chatted with Basava and found out that this is in QA testing. Marking the status as ON_QA.
Basava's latest results:
In recent testing, the janitor failed to delete the resources due to missing libvirt binaries:
{"component":"janitor","error":"Post \"http://boskos.test-pods.svc.cluster.local./acquire?dest=cleaning\u0026owner=Janitor\u0026state=dirty\u0026type=libvirt-ppc64le-quota-slice\": dial tcp: lookup boskos.test-pods.svc.cluster.local.: no such host","file":"/go/src/app/cmd/janitor/janitor.go:137","func":"main.run","level":"info","msg":"no available resource libvirt-ppc64le-quota-slice","severity":"info","time":"2022-05-06T07:09:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:146","func":"main.run","level":"info","msg":"Acquired resources libvirt-ppc64le-0-2 of type libvirt-ppc64le-quota-slice","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:101","func":"main.janitorClean","level":"info","msg":"executing janitor: /root/libvirt-ppc64le-janitor.sh --slice=libvirt-ppc64le-0-2 --hours=0","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:146","func":"main.run","level":"info","msg":"Acquired resources libvirt-ppc64le-0-0 of type libvirt-ppc64le-quota-slice","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:101","func":"main.janitorClean","level":"info","msg":"executing janitor: /root/libvirt-ppc64le-janitor.sh --slice=libvirt-ppc64le-0-0 --hours=0","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:146","func":"main.run","level":"info","msg":"Acquired resources libvirt-ppc64le-1-0 of type libvirt-ppc64le-quota-slice","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:101","func":"main.janitorClean","level":"info","msg":"executing janitor: /root/libvirt-ppc64le-janitor.sh --slice=libvirt-ppc64le-1-0 --hours=0","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:146","func":"main.run","level":"info","msg":"Acquired resources libvirt-ppc64le-0-1 of type libvirt-ppc64le-quota-slice","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","file":"/go/src/app/cmd/janitor/janitor.go:101","func":"main.janitorClean","level":"info","msg":"executing janitor: /root/libvirt-ppc64le-janitor.sh --slice=libvirt-ppc64le-0-1 --hours=0","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","error":"resources not found","file":"/go/src/app/cmd/janitor/janitor.go:137","func":"main.run","level":"info","msg":"no available resource libvirt-ppc64le-quota-slice","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","error":"exit status 127","file":"/go/src/app/cmd/janitor/janitor.go:105","func":"main.janitorClean","level":"info","msg":"failed to clean up project libvirt-ppc64le-0-1, error info: libvirtcli command not found, installing it.\nlibvirtcli: error while loading shared libraries: libvirt-lxc.so.0: cannot open shared object file: No such file or directory\n","severity":"info","time":"2022-05-06T07:10:55Z"}
{"component":"janitor","error":"exit status 1","file":"/go/src/app/cmd/janitor/janitor.go:105","func":"main.janitorClean","level":"info","msg":"failed to clean up project libvirt-ppc64le-0-0, error info: libvirtcli command not found, installing it.\nmv: cannot stat './libvirtcli': No such file or directory\n","severity":"info","time":"2022-05-06T07:10:56Z"}
{"component":"janitor","error":"exit status 1","file":"/go/src/app/cmd/janitor/janitor.go:105","func":"main.janitorClean","level":"info","msg":"failed to clean up project libvirt-ppc64le-0-2, error info: libvirtcli command not found, installing it.\nmv: cannot stat './libvirtcli': No such file or directory\n","severity":"info","time":"2022-05-06T07:10:56Z"}
{"component":"janitor","error":"exit status 127","file":"/go/src/app/cmd/janitor/janitor.go:105","func":"main.janitorClean","level":"info","msg":"failed to clean up project libvirt-ppc64le-1-0, error info: libvirtcli command not found, installing it.\nlibvirtcli: error while loading shared libraries: libvirt-lxc.so.0: cannot open shared object file: No such file or directory\n","severity":"info","time":"2022-05-06T07:10:56Z"}
Resources are marked dirty:
root@basavarg-boskos-testing:~/dev/test-infra/config/prow/cluster# kubectl get resources -n test-pods
NAME                  TYPE                          STATE   OWNER   LAST-UPDATED
libvirt-ppc64le-0-0   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-0-1   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-0-2   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-0-3   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-1-0   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-1-1   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-1-2   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-2-0   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-2-1   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-2-2   libvirt-ppc64le-quota-slice   dirty           3s
libvirt-ppc64le-2-3   libvirt-ppc64le-quota-slice   dirty           3s
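To check quickly how many slices are still dirty after a janitor run, the listing can be tallied by state. The sketch below is hypothetical: it assumes it is given only the table text (header plus rows, no shell prompt lines), with whitespace-separated columns where OWNER may be blank, so STATE is always the third field. The name `count_states` is invented for illustration.

```python
from collections import Counter


def count_states(table_text):
    """Count resources per STATE in `kubectl get resources` table text.

    Assumes columns NAME TYPE STATE [OWNER] LAST-UPDATED; OWNER may be
    empty, which is why STATE is read by position from the left.
    """
    counts = Counter()
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] == "NAME":
            continue  # skip the header row and blank or short lines
        counts[fields[2]] += 1
    return counts
```

For the listing above this would report 11 resources in state `dirty`.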
I am working on fixing the shared library issue.
The libvirt client is currently in my repo:
https://github.com/Basavaraju-G/janitor
Talked to Deep and Florian; this bug was verified and can be closed.