+++ This bug was initially created as a clone of Bug #1775875 +++ Since it landed way back in 2019-04-04 [1], there have been some bugs in the extraction match-and-replace implementation: * It uses len(value) of the marker when searching the incoming bytes. Especially for short strings like version replacement (which is new in 4.3 [2]), this meant we were only looking for five bytes for 4.2.7, which is \x00_REL. That's not very specific. It does not distinguish between \x00_RELEASE_IMAGE_LOCATION_\x00XX... and \x00_RELEASE_VERSION_LOCATION_\x00XX.... It doesn't even distinguish between the defaultVersionPadded and defaultVersionPrefix constants in pkg/version. We should search for the full marker, regardless of len(value). * I has: nextOffset := end - len(marker) ... _, wErr := w.Write(buf[:end-nextOffset]) ... copy(buf[:nextOffset], buf[end-nextOffset:end]) offset = nextOffset That is problematic for a few reasons: * It doesn't write much data. Substituting for nextOffset, we were writing to: end - nextOffset = end - (end - len(marker)) = len(marker) ``` this makes the buffer size largely meaningless, and means lots of inefficient, small reads/writes. We should try to write everything except the *last* `len(marker)`. * It ignores the amount of data written. You can want to write 1k bytes but only actually write 50 bytes with a given Write call. That didn't happen often before; because of the previous list entry we were only attempting to write a hundred or so bytes at a time. But once we start trying to write the bulk of the buffer size, it will happen more often. We should keep attempting Write until it errors out on us or we finish pushing all the bytes we no longer need. [1]: https://github.com/openshift/origin/pull/22439#event-2252366741 [2]: https://github.com/openshift/oc/pull/88
Compared with old oc client and latest oc client, can't see the larger buffer size: [root@dhcp-140-138 ~]# ./oc-old version -oyaml clientVersion: buildDate: "2019-09-02T15:22:22Z" compiler: gc gitCommit: 9ff96feb1aea1217938e2f1aeaf0be091cc59728 gitTreeState: clean gitVersion: openshift-clients-4.2.0-201909020729 goVersion: go1.12.8 major: "" minor: "" platform: linux/amd64 strace -fo trace-old ./oc-old adm release extract --command=oc quay.io/openshift-release-dev/ocp-release-nightly@sha256:82863b275ab64ea830eb719bfd4d5d7166bdaa0c15890a57f4bd717f369d768b [root@dhcp-140-138 ~]# oc version -o yaml clientVersion: buildDate: "2019-12-06T11:24:15Z" compiler: gc gitCommit: 3fa38ed2761a75d4c648acfc8ac14df930219ed5 gitTreeState: clean gitVersion: v4.3.0 goVersion: go1.12.12 major: "" minor: "" platform: linux/amd64 [root@dhcp-140-138 ~]# strace -fo strace-old ./oc-old adm release extract --command=oc quay.io/openshift-release-dev/ocp-release-nightly@sha256:82863b275ab64ea830eb719bfd4d5d7166bdaa0c15890a57f4bd717f369d768b [root@dhcp-140-138 ~]# strace -fo strace-new oc adm release extract --command=oc quay.io/openshift-release-dev/ocp-release-nightly@sha256:82863b275ab64ea830eb719bfd4d5d7166bdaa0c15890a57f4bd717f369d768b [root@dhcp-140-138 ~]# grep write strace-old |tail -n 20 7234 write(9, "'kubectl cp' will fail.\\n\"\n\"\\n\"\n"..., 16384) = 16384 7234 write(9, "ame or stdin\"\nmsgstr \"Apply a co"..., 16384) = 16384 7234 write(9, "ntainer that the service should "..., 16384 <unfinished ...> 7234 <... write resumed> ) = 16384 7234 write(8, "\27\3\3\0%\0\0\0\0\0\0\4\377\36\30\35\275\301(\4\239\3410'\353e\353\246\377\324\225"..., 42 <unfinished ...> 7234 <... write resumed> ) = 42 7234 write(9, "kg/kubectl/cmd/create_secret.go:"..., 16384) = 16384 7234 write(9, "\n\"\\t\\t# Create a service for a r"..., 16384 <unfinished ...> 7234 <... write resumed> ) = 16384 7234 write(9, "tkubectl cluster-info\"\n\n# https:"..., 16384 <unfinished ...> 7234 <... write resumed> ) = 16384 7234 write(9, "ge=image:v2\\n\"\n\"\\n\"\n\"\\t\\t# Abort"..., 16384) = 16384 7234 write(9, "ocesses running on different mac"..., 16384 <unfinished ...> 7234 <... write resumed> ) = 16384 7234 write(9, "b/HEAD/docs/api-reference/v1/def"..., 16384) = 16384 7234 write(9, "cmd/create_quota.go:61\nmsgid \"\"\n"..., 16384 <unfinished ...> 7234 <... write resumed> ) = 16384 7234 write(9, "\"\"\n\"If non-empty, set the sessio"..., 16384) = 16384 7234 write(9, "quests depending on the server c"..., 7832 <unfinished ...> 7234 <... write resumed> ) = 7832 [root@dhcp-140-138 ~]# grep write strace-new |tail -n 20 7252 write(10, "nt 20 lines of output in pod ngi"..., 16384) = 16384 7252 write(10, "est version.\\n\"\n\"\\n\"\n\"\\t\\tThe de"..., 16384 <unfinished ...> 7252 <... write resumed> ) = 16384 7252 write(10, "ed and working on \"\n\"the server."..., 16384) = 16384 7252 write(10, "'kubectl cp' will fail.\\n\"\n\"\\n\"\n"..., 16384) = 16384 7252 write(10, "ame or stdin\"\nmsgstr \"Apply a co"..., 16384) = 16384 7252 write(10, "ntainer that the service should "..., 16384) = 16384 7252 write(10, "kg/kubectl/cmd/create_secret.go:"..., 16384) = 16384 7252 write(10, "\n\"\\t\\t# Create a service for a r"..., 16384 <unfinished ...> 7252 <... write resumed> ) = 16384 7252 write(10, "tkubectl cluster-info\"\n\n# https:"..., 16384) = 16384 7252 write(10, "ge=image:v2\\n\"\n\"\\n\"\n\"\\t\\t# Abort"..., 16384) = 16384 7252 write(10, "ocesses running on different mac"..., 16384 <unfinished ...> 7252 <... write resumed> ) = 16384 7252 write(10, "b/HEAD/docs/api-reference/v1/def"..., 16384) = 16384 7252 write(10, "cmd/create_quota.go:61\nmsgid \"\"\n"..., 16384 <unfinished ...> 7252 <... write resumed> ) = 16384 7252 write(10, "\"\"\n\"If non-empty, set the sessio"..., 16384) = 16384 7252 write(10, "quests depending on the server c"..., 7832 <unfinished ...> 7252 <... write resumed> ) = 7832
Given it's code freeze today, I'm moving this to .z
I doubt we'll be backporting this fix to 4.3, given it's still being discussed if the solution is complete in master.