Description of problem:
Once in 3-4 test runs, PVC clone fails on the hostpath-csi-pvc-block storage class backed by OCS. This is only seen after CDI certificate renewal (couldn't reproduce without this step - ran the test 10 times).

Version-Release number of selected component (if applicable):
4.13

How reproducible:
1 in 3-4 test runs

Steps to Reproduce:
1. Create a source DV
2. Renew the CDI certificates (update the annotation on CDI's secret; this triggers the cert renewal)
3. Clone the source DV to a target DV

Actual results:
Clone fails once in 3-4 runs.

This is only seen on BM with hostpath-csi-pvc-block (with underlying OCS):
- The test passed 7 times in a row on BM with the hostpath-csi-basic (local) storage class.
- The test passed 10 times in a row on a PSI cluster with hostpath-csi-pvc-block (with underlying LSO).

I wasn't able to reproduce this behavior without CDI certificate renewal.

Expected results:
Clone succeeds

Additional info:
Test name: test_import_clone_after_certs_renewal

$ oc get dv -A
test-cdi-certificate   dv-target   CloneInProgress   N/A   4   2m2s

$ oc get pods -n test-cdi-certificate
NAME                                              READY   STATUS             RESTARTS      AGE
98744b32-3ada-4551-9511-8bd14332ac76-source-pod   0/1     CrashLoopBackOff   4 (34s ago)   2m5s
cdi-upload-dv-target                              1/1     Running            0             2m11s

$ oc logs -n test-cdi-certificate cdi-upload-dv-target
I0501 14:23:49.166069 1 uploadserver.go:74] Running server on 0.0.0.0:8443
I0501 14:23:54.709746 1 uploadserver.go:378] Content type header is "filesystem-clone"
I0501 14:23:54.709889 1 util.go:204] begin untar to /data...
I0501 14:23:54.709898 1 util.go:210] running untar cmd: [/usr/bin/tar --preserve-permissions --no-same-owner -xvC /data]
E0501 14:23:54.782846 1 util.go:219] exit status 2
E0501 14:23:54.782883 1 uploadserver.go:391] Saving stream failed: error unarchiving to /data: exit status 2
I0501 14:23:56.795716 1 uploadserver.go:378] Content type header is "filesystem-clone"
I0501 14:23:56.795824 1 util.go:204] begin untar to /data...
I0501 14:23:56.795832 1 util.go:210] running untar cmd: [/usr/bin/tar --preserve-permissions --no-same-owner -xvC /data]
E0501 14:23:56.879546 1 util.go:219] exit status 2
E0501 14:23:56.879578 1 uploadserver.go:391] Saving stream failed: error unarchiving to /data: exit status 2
I0501 14:24:11.303774 1 uploadserver.go:378] Content type header is "filesystem-clone"
I0501 14:24:11.303888 1 util.go:204] begin untar to /data...
I0501 14:24:11.303897 1 util.go:210] running untar cmd: [/usr/bin/tar --preserve-permissions --no-same-owner -xvC /data]
E0501 14:24:11.459124 1 util.go:219] exit status 2
E0501 14:24:11.459161 1 uploadserver.go:391] Saving stream failed: error unarchiving to /data: exit status 2
I0501 14:24:38.322763 1 uploadserver.go:378] Content type header is "filesystem-clone"
I0501 14:24:38.322848 1 util.go:204] begin untar to /data...
I0501 14:24:38.322857 1 util.go:210] running untar cmd: [/usr/bin/tar --preserve-permissions --no-same-owner -xvC /data]
E0501 14:24:38.471889 1 util.go:219] exit status 2
E0501 14:24:38.471915 1 uploadserver.go:391] Saving stream failed: error unarchiving to /data: exit status 2
I0501 14:25:22.163380 1 uploadserver.go:378] Content type header is "filesystem-clone"
I0501 14:25:22.163471 1 util.go:204] begin untar to /data...
I0501 14:25:22.163480 1 util.go:210] running untar cmd: [/usr/bin/tar --preserve-permissions --no-same-owner -xvC /data]
E0501 14:25:22.273202 1 util.go:219] exit status 2
E0501 14:25:22.273227 1 uploadserver.go:391] Saving stream failed: error unarchiving to /data: exit status 2

$ oc logs -n test-cdi-certificate 98744b32-3ada-4551-9511-8bd14332ac76-source-pod
VOLUME_MODE=filesystem
MOUNT_POINT=/var/run/cdi/clone/source
/var/run/cdi/clone/source /
UPLOAD_BYTES=1073741846
I0501 14:25:21.793028 10 clone-source.go:220] content-type is "filesystem-clone"
I0501 14:25:21.793093 10 clone-source.go:221] mount is "/var/run/cdi/clone/source"
I0501 14:25:21.793097 10 clone-source.go:222] upload-bytes is 1073741846
I0501 14:25:21.793110 10 clone-source.go:239] Starting cloner target
I0501 14:25:21.793170 10 clone-source.go:177] Executing [/usr/bin/tar cv -S disk.img]
I0501 14:25:22.127411 10 clone-source.go:251] Set header to filesystem-clone
I0501 14:25:22.272567 10 clone-source.go:127] Wrote 13711360 bytes
F0501 14:25:22.273388 10 clone-source.go:260] Unexpected status code 500
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0x1)
	/remote-source/app/vendor/k8s.io/klog/v2/klog.go:1038 +0x89
k8s.io/klog/v2.(*loggingT).output(0x149d920, 0x3, 0x0, 0xc000547d50, 0x0, {0x1060733?, 0x1?}, 0xc0004ec530?, 0x0)
	/remote-source/app/vendor/k8s.io/klog/v2/klog.go:987 +0x5fc
k8s.io/klog/v2.(*loggingT).printf(0x12?, 0x5200b?, 0x0, {0x0, 0x0}, {0xd857eb, 0x19}, {0xc000708ad0, 0x1, 0x1})
	/remote-source/app/vendor/k8s.io/klog/v2/klog.go:753 +0x1c5
k8s.io/klog/v2.Fatalf(...)
	/remote-source/app/vendor/k8s.io/klog/v2/klog.go:1532
main.main()
	/remote-source/app/cmd/cdi-cloner/clone-source.go:260 +0xc54

goroutine 37 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0xc0003b4840?)
	/remote-source/app/vendor/k8s.io/klog/v2/klog.go:1181 +0x6a
created by k8s.io/klog/v2.init.0
	/remote-source/app/vendor/k8s.io/klog/v2/klog.go:420 +0xf6

goroutine 293 [sleep]:
time.Sleep(0x3b9aca00)
	/usr/lib/golang/src/runtime/time.go:195 +0x135
kubevirt.io/containerized-data-importer/pkg/util/prometheus.(*ProgressReader).timedUpdateProgress(0xc000314720?)
	/remote-source/app/pkg/util/prometheus/prometheus.go:55 +0x2c
created by kubevirt.io/containerized-data-importer/pkg/util/prometheus.(*ProgressReader).StartTimedUpdate
	/remote-source/app/pkg/util/prometheus/prometheus.go:48 +0x56

goroutine 295 [IO wait]:
internal/poll.runtime_pollWait(0x7fd2804e6c08, 0x72)
	/usr/lib/golang/src/runtime/netpoll.go:305 +0x89
internal/poll.(*pollDesc).wait(0xc000700080?, 0x0?, 0x0)
	/usr/lib/golang/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/usr/lib/golang/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc000700080)
	/usr/lib/golang/src/internal/poll/fd_unix.go:614 +0x234
net.(*netFD).accept(0xc000700080)
	/usr/lib/golang/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc000502240)
	/usr/lib/golang/src/net/tcpsock_posix.go:142 +0x28
net.(*TCPListener).Accept(0xc000502240)
	/usr/lib/golang/src/net/tcpsock.go:288 +0x3d
crypto/tls.(*listener).Accept(0xc0005022e8)
	/usr/lib/golang/src/crypto/tls/tls.go:66 +0x2d
net/http.(*Server).Serve(0xc00046a0f0, {0xee5680, 0xc0005022e8})
	/usr/lib/golang/src/net/http/server.go:3070 +0x385
net/http.(*Server).ServeTLS(0xc00046a0f0, {0xee8ec0?, 0xc000502240}, {0xc000318f60, 0x1e}, {0xc000318fc0, 0x1e})
	/usr/lib/golang/src/net/http/server.go:3142 +0x3fd
net/http.(*Server).ListenAndServeTLS(0xc00046a0f0, {0xc000318f60, 0x1e}, {0xc000318fc0, 0x1e})
	/usr/lib/golang/src/net/http/server.go:3299 +0x12f
net/http.ListenAndServeTLS(...)
	/usr/lib/golang/src/net/http/server.go:3265
kubevirt.io/containerized-data-importer/pkg/util/prometheus.StartPrometheusEndpoint.func1()
	/remote-source/app/pkg/util/prometheus/prometheus.go:113 +0xa5
created by kubevirt.io/containerized-data-importer/pkg/util/prometheus.StartPrometheusEndpoint
	/remote-source/app/pkg/util/prometheus/prometheus.go:111 +0x23b

goroutine 308 [IO wait]:
internal/poll.runtime_pollWait(0x7fd2804e6b18, 0x72)
	/usr/lib/golang/src/runtime/netpoll.go:305 +0x89
internal/poll.(*pollDesc).wait(0xc000700400?, 0xc00081e900?, 0x0)
	/usr/lib/golang/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/usr/lib/golang/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000700400, {0xc00081e900, 0x240, 0x240})
	/usr/lib/golang/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc000700400, {0xc00081e900?, 0x1441b60?, 0x3?})
	/usr/lib/golang/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000506090, {0xc00081e900?, 0x67db15?, 0xc0005c1640?})
	/usr/lib/golang/src/net/net.go:183 +0x45
crypto/tls.(*atLeastReader).Read(0xc0003a2de0, {0xc00081e900?, 0x0?, 0x0?})
	/usr/lib/golang/src/crypto/tls/conn.go:787 +0x3d
bytes.(*Buffer).ReadFrom(0xc0003c25f8, {0xedf900, 0xc0003a2de0})
	/usr/lib/golang/src/bytes/buffer.go:202 +0x98
crypto/tls.(*Conn).readFromUntil(0xc0003c2380, {0xee0960?, 0xc000506090}, 0x800?)
	/usr/lib/golang/src/crypto/tls/conn.go:809 +0xe5
crypto/tls.(*Conn).readRecordOrCCS(0xc0003c2380, 0x0)
	/usr/lib/golang/src/crypto/tls/conn.go:616 +0x116
crypto/tls.(*Conn).readRecord(...)
	/usr/lib/golang/src/crypto/tls/conn.go:582
crypto/tls.(*Conn).Read(0xc0003c2380, {0xc00088c000, 0x1000, 0x203000?})
	/usr/lib/golang/src/crypto/tls/conn.go:1315 +0x16f
net/http.(*connReader).Read(0xc0007bb620, {0xc00088c000, 0x1000, 0x1000})
	/usr/lib/golang/src/net/http/server.go:786 +0x171
bufio.(*Reader).fill(0xc000315440)
	/usr/lib/golang/src/bufio/bufio.go:106 +0xff
bufio.(*Reader).ReadSlice(0xc000315440, 0x0?)
	/usr/lib/golang/src/bufio/bufio.go:372 +0x2f
bufio.(*Reader).ReadLine(0xc000315440)
	/usr/lib/golang/src/bufio/bufio.go:401 +0x27
net/textproto.(*Reader).readLineSlice(0xc0009261e0)
	/usr/lib/golang/src/net/textproto/reader.go:58 +0x99
net/textproto.(*Reader).ReadLine(...)
	/usr/lib/golang/src/net/textproto/reader.go:39
net/http.readRequest(0xc0007e9a08?)
	/usr/lib/golang/src/net/http/request.go:1036 +0x79
net/http.(*conn).readRequest(0xc0003a6280, {0xee9a30, 0xc0005c1600})
	/usr/lib/golang/src/net/http/server.go:994 +0x24a
net/http.(*conn).serve(0xc0003a6280, {0xee9ad8, 0xc0004ee780})
	/usr/lib/golang/src/net/http/server.go:1916 +0x345
created by net/http.(*Server).Serve
	/usr/lib/golang/src/net/http/server.go:3102 +0x4db

goroutine 309 [IO wait]:
internal/poll.runtime_pollWait(0x7fd2804e6de8, 0x72)
	/usr/lib/golang/src/runtime/netpoll.go:305 +0x89
internal/poll.(*pollDesc).wait(0xc000700380?, 0xc000846000?, 0x0)
	/usr/lib/golang/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/usr/lib/golang/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000700380, {0xc000846000, 0x1000, 0x1000})
	/usr/lib/golang/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc000700380, {0xc000846000?, 0xc0003bcfe0?, 0xc000846005?})
	/usr/lib/golang/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc000506088, {0xc000846000?, 0x0?, 0x0?})
	/usr/lib/golang/src/net/net.go:183 +0x45
crypto/tls.(*atLeastReader).Read(0xc0003a2df8, {0xc000846000?, 0x0?, 0x0?})
	/usr/lib/golang/src/crypto/tls/conn.go:787 +0x3d
bytes.(*Buffer).ReadFrom(0xc0003c2278, {0xedf900, 0xc0003a2df8})
	/usr/lib/golang/src/bytes/buffer.go:202 +0x98
crypto/tls.(*Conn).readFromUntil(0xc0003c2000, {0xee0960?, 0xc000506088}, 0x100a?)
	/usr/lib/golang/src/crypto/tls/conn.go:809 +0xe5
crypto/tls.(*Conn).readRecordOrCCS(0xc0003c2000, 0x0)
	/usr/lib/golang/src/crypto/tls/conn.go:616 +0x116
crypto/tls.(*Conn).readRecord(...)
	/usr/lib/golang/src/crypto/tls/conn.go:582
crypto/tls.(*Conn).Read(0xc0003c2000, {0xc000847000, 0x1000, 0x1?})
	/usr/lib/golang/src/crypto/tls/conn.go:1315 +0x16f
net/http.(*persistConn).Read(0xc000464fc0, {0xc000847000?, 0x4520a0?, 0xc0007cdec8?})
	/usr/lib/golang/src/net/http/transport.go:1929 +0x4e
bufio.(*Reader).fill(0xc000501b00)
	/usr/lib/golang/src/bufio/bufio.go:106 +0xff
bufio.(*Reader).Peek(0xc000501b00, 0x1)
	/usr/lib/golang/src/bufio/bufio.go:144 +0x5d
net/http.(*persistConn).readLoop(0xc000464fc0)
	/usr/lib/golang/src/net/http/transport.go:2093 +0x1ac
created by net/http.(*Transport).dialConn
	/usr/lib/golang/src/net/http/transport.go:1751 +0x173e

goroutine 310 [select]:
net/http.(*persistConn).writeLoop(0xc000464fc0)
	/usr/lib/golang/src/net/http/transport.go:2392 +0xf5
created by net/http.(*Transport).dialConn
	/usr/lib/golang/src/net/http/transport.go:1752 +0x1791

YAMLs:

DV-SOURCE:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  labels:
    created-by-dynamic-class-creator: 'Yes'
  name: dv-source
  namespace: test-cdi-certificate
spec:
  contentType: kubevirt
  source:
    http:
      url: http:<server>/files/cnv-tests/cirros-images/cirros-0.4.0-x86_64-disk.qcow2
  storage:
    resources:
      requests:
        storage: 1Gi
    storageClassName: hostpath-csi-pvc-block

DV-TARGET:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  labels:
    created-by-dynamic-class-creator: 'Yes'
  name: dv-target
  namespace: test-cdi-certificate
spec:
  contentType: kubevirt
  source:
    pvc:
      name: dv-source
      namespace: test-cdi-certificate
  storage:
    resources:
      requests:
        storage: 1Gi
    storageClassName: hostpath-csi-pvc-block
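For reference, step 2 of the reproduction (triggering CDI cert renewal by touching the secret annotations) could be scripted roughly as below. This is a sketch only: the namespace and annotation key are assumptions, not necessarily what the test uses, and the script no-ops when `oc` is unavailable.

```shell
#!/bin/sh
# Hypothetical sketch of "renew CDI certificates": CDI re-issues its certs
# when its cert secrets change, so bumping an annotation on each one forces
# a renewal. Namespace and annotation key below are assumptions.
if command -v oc >/dev/null 2>&1; then
  ns=openshift-cnv   # assumed CDI namespace
  for s in $(oc get secrets -n "$ns" -o name | grep 'cdi-'); do
    oc annotate -n "$ns" "$s" cert-renewal-bump="$(date +%s)" --overwrite
  done
  echo "renewal triggered"
else
  echo "oc not available; skipping"
fi
```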
What is common to the log of each failure (caught this log 3 times) is that it only gets to write 13711360 bytes:

$ oc logs -n test-cdi-certificate 2d47ca14-d69c-4dd0-bd83-a4caa9092ae8-source-pod
...
UPLOAD_BYTES=1073741846
...
I0430 17:52:22.457305 10 clone-source.go:127] Wrote 13711360 bytes
F0430 17:52:22.457925 10 clone-source.go:260] Unexpected status code 500
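Mechanically, the logs show the source pod tarring the disk image and streaming it to cdi-upload-dv-target; the server-side untar fails ("exit status 2"), the upload server answers 500, and clone-source exits fatally on the non-2xx status (clone-source.go:260), producing the CrashLoopBackOff. A minimal Python analogue of that handshake (not CDI's actual code; all names here are illustrative) shows the failure shape:

```python
# Toy model of the clone-source / upload-server exchange: the "server" stands
# in for cdi-upload-dv-target and always answers 500, as if its untar failed;
# the "client" stands in for the clone-source pod and treats that as fatal.
import http.server
import threading
import urllib.error
import urllib.request


class UploadHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the streamed body, then report the (simulated) untar failure.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(500)
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass


server = http.server.HTTPServer(("127.0.0.1", 0), UploadHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/upload"

req = urllib.request.Request(url, data=b"tar-stream-bytes", method="POST")
try:
    urllib.request.urlopen(req)
    outcome = "clone succeeded"
except urllib.error.HTTPError as e:
    # Mirrors "F... clone-source.go:260] Unexpected status code 500"
    outcome = f"fatal: unexpected status code {e.code}"
server.shutdown()
print(outcome)
```

Each retry by the upload server produces the same 500/fatal-exit pair, which is why the source pod keeps crashing until the DV gives up.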
Could be related to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=2172853
Sounds similar to https://bugzilla.redhat.com/show_bug.cgi?id=2172853#c6 - does this test scenario have a first consumer with a different group that could race the cloner pod?
Yes, it had a first consumer with a different fsGroup, and since we changed it to 107 we no longer see this failure (verified over several tier2 runs).
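For context, the test-side fix amounts to giving the first-consumer pod the same fsGroup the CDI pods run with (107, the qemu group in KubeVirt/CDI images), so volume ownership doesn't get flipped under the cloner. A hypothetical pod fragment (illustrative only, not the actual test code):

```yaml
# Hypothetical first-consumer pod: aligning fsGroup with the CDI pods (107)
# avoids the ownership race on the shared filesystem PVC.
apiVersion: v1
kind: Pod
metadata:
  name: first-consumer          # illustrative name
  namespace: test-cdi-certificate
spec:
  securityContext:
    fsGroup: 107                # qemu group used by KubeVirt/CDI images
  containers:
    - name: consumer
      image: registry.example/test-image   # placeholder
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: dv-source
```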
(In reply to Jenia Peimer from comment #5)
> Yes, it had a first consumer with a different fsGroup, and since we changed
> it to 107 - we don't see this failure anymore (had several tier2 runs).

So this is a dup of https://bugzilla.redhat.com/show_bug.cgi?id=2172853?
Right, closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 2172853 ***