Bug 1426009
| Summary: | 3.5: oc get commands on large numbers of resources 90-300% slower than OCP 3.4 | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> | ||||
| Component: | oc | Assignee: | Maciej Szulik <maszulik> | ||||
| Status: | CLOSED DEFERRED | QA Contact: | Mike Fiedler <mifiedle> | ||||
| Severity: | low | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 3.5.0 | CC: | aos-bugs, ccoleman, deads, emarcian, jokerman, mfojtik, mifiedle, mmccomas, vlaad | ||||
| Target Milestone: | --- | Keywords: | Regression | ||||
| Target Release: | 3.5.z | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | aos-scalability-35 | ||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2019-02-26 15:38:03 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Description
Mike Fiedler
2017-02-23 01:38:04 UTC
Contact me for clusters reproducing this. I will try to keep them up through the week's end.

Adding David, Andy and Clayton for opinions. The issue here is the change that made the basic CRUD commands agnostic of registered types, which happened in the 3.4->3.5 timeframe. In summary, using unstructured objects means unmarshalling JSON into maps of interfaces, which is known to perform poorly and, multiplied by thousands of objects as is the case here, can lead to many seconds spent on plain unmarshalling. I was able to confirm that by debugging and analyzing profiling reports generated by pprof. There are basically two places where it hurts badly in the 'get' command:

1. The actual unmarshalling to unstructured objects when the Visitor's visits happen (a single call to `Infos()` took 18 seconds for 29k secrets).
2. The conversion needed for printing here[1], which took around 12s for the same sample.

We could use a typed structure and fall back to unstructured in the case of TPR and others, but that would lead to a lot of special-case code. Another option would be to try to solve the second problem by making the printer understand and "map" unstructured to its proper kind while printing, which would improve things but only solve part of the issue. Or we could postpone this until we move get printing to the server side. Any other suggestions about how to fix this?

[1] https://github.com/deads2k/kubernetes/blob/61673c4b39606fc7e1de9a3cdd4ff5aaaebc0f31/pkg/kubectl/resource_printer.go#L2263-L2272

Also: changing to a better-performing JSON library doesn't help here; I tried ffjson[1] and easyjson[2]. The problem is inherent to unmarshalling into interfaces, and better-performing libraries don't fix that; in most cases they just delegate to 'encoding/json' when dealing with structures like that.

[1] https://github.com/pquerna/ffjson
[2] https://github.com/mailru/easyjson

Moving printing to the server side fixes the bulk of this.
We should also have a way for the JSON printer to avoid having to decode the returned object if possible.

Ok, so we mark this UpcomingRelease? Is there anything we can do for 3.6 or do we need to wait for printing to move to the server?

> Is there anything we can do for 3.6 or do we need to wait for printing to move to the server?

Nothing substantial since UnstructuredObject is now at the core of client-side printers.
A fix is coming most likely with the rebase of Kube 1.8.

I'm closing this in favour of https://bugzilla.redhat.com/show_bug.cgi?id=1626291 which is currently tracking the performance impact.