Bug 1342293
| Summary: | Swift Storage for Registry produces error when scaled past 1 pod | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Eric Jones <erjones> |
| Component: | Image Registry | Assignee: | Michal Minar <miminar> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | zhou ying <yinzhou> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.2.0 | CC: | aos-bugs, mfojtik, misalunk, pweil, tdawson, yinzhou |
| Target Milestone: | --- | Keywords: | Reopened, Unconfirmed |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | atomic-openshift-3.3.0.35-1.git.0.d7bd9b6.el7 | Doc Type: | Bug Fix |
| Doc Text: | Cause: the original upstream Swift storage driver was fragile under concurrent pushes. Consequence: when the registry was scaled past 1 pod, image pushes could fail because files became corrupted. Fix: the registry was updated to a more recent version with a more robust Swift storage driver. Result: files are now uploaded consistently even with multiple registry replicas. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-22 23:25:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
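The failure mode in the Doc Text can be triggered roughly as follows. This is a hedged command sketch, not a transcript from this report: the deploymentconfig name and `default` namespace are the OCP 3.x defaults, and the registry address is taken from the QA comment below.

```shell
# Scale the integrated registry past one replica (OCP 3.x default names assumed)
oc scale dc/docker-registry --replicas=2 -n default

# With the fragile Swift driver, a subsequent push could then fail with
# err.detail="swift: Object Corrupted"
docker push 172.30.216.221:5000/zhouy/busybox
```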
Description
Eric Jones
2016-06-02 22:11:45 UTC
*** This bug has been marked as a duplicate of bug 1330890 ***

Pasting the error message here from the attached registry log:

```
level=error msg="response completed with error" err.code=UNKNOWN err.detail="swift: Object Corrupted"
```
The error is described in Swift's upstream driver:

https://github.com/docker/distribution/blob/master/vendor/github.com/ncw/swift/swift.go#L1186

It seems the registry computes the MD5 sum of an uploaded object differently than the Swift server does.
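To illustrate the check that fails here: Swift returns an object's MD5 hex digest as the `ETag` header, and the ncw/swift client compares it against a digest computed while streaming the upload, reporting "Object Corrupted" on mismatch. A minimal Python sketch of that comparison (the function name is hypothetical, not part of the driver):

```python
import hashlib

def verify_swift_etag(chunks, server_etag):
    # Accumulate an MD5 digest over the streamed chunks, as the Swift
    # client library does during an upload.
    md5 = hashlib.md5()
    for chunk in chunks:
        md5.update(chunk)
    # Swift returns the object's MD5 hex digest as the ETag header; on
    # mismatch the ncw/swift library reports "swift: Object Corrupted".
    if md5.hexdigest() != server_etag.strip('"'):
        raise IOError("swift: Object Corrupted")
    return md5.hexdigest()
```

If two registry replicas write interleaved chunks to the same object, the digest computed by either writer no longer matches the stored content, which is consistent with the error above.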
Confirmed with the latest OCP 3.2.1.16 that the issue has been fixed:

```
[root@openshift-121 master]# openshift version
openshift v3.2.1.16
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5
[root@openshift-121 master]# oc get po
NAME                      READY     STATUS    RESTARTS   AGE
docker-registry-7-q5kzx   1/1       Running   0          19m
docker-registry-7-s9pne   1/1       Running   0          52m
[root@openshift-121 master]# docker push 172.30.216.221:5000/zhouy/busybox
The push refers to a repository [172.30.216.221:5000/zhouy/busybox]
e88b3f82283b: Pushed
latest: digest: sha256:0421c0ebccb1716cb69b992e946493ceff0de1b4a365f95d4f9d3e45c064ffa6 size: 2095
[root@zhouy testjson]# oc get po
NAME              READY     STATUS      RESTARTS   AGE
ruby-ex-1-build   0/1       Completed   0          29s
ruby-ex-2-build   0/1       Completed   0          <invalid>
```

Confirmed with OCP 3.3; I can't reproduce this issue:

```
[root@host-8-174-28 ~]# openshift version
openshift v3.3.1.4
kubernetes v1.3.0+52492b4
etcd 2.3.0+git
[root@host-8-174-28 ~]# oc get po
NAME                      READY     STATUS    RESTARTS   AGE
docker-registry-7-id5k8   1/1       Running   0          57m
docker-registry-7-lebse   1/1       Running   0          1h
[root@host-8-174-28 ~]# docker push 172.30.48.163:5000/zhouy/busybox:latst
The push refers to a repository [172.30.48.163:5000/zhouy/busybox]
e88b3f82283b: Pushing [==================================================>] 1.294 MB
^C
[root@host-8-174-28 ~]# docker push 172.30.48.163:5000/zhouy/busybox:latst
The push refers to a repository [172.30.48.163:5000/zhouy/busybox]
e88b3f82283b: Layer already exists
latst: digest: sha256:fa62410b18a14e3a14d24488fdcf8607749b40f3005737b5bdd52484024f107d size: 2094
[root@zhouy 1101]# oc get build
NAME        TYPE      FROM          STATUS     STARTED          DURATION
ruby-ex-1   Source    Git@f63d076   Complete   52 minutes ago   1m19s
```