Bug 1369588

Summary: Conformance Test failures for OCP
Product: OpenShift Container Platform
Reporter: Jason DeTiberus <jdetiber>
Component: Test Infrastructure
Assignee: Avesh Agarwal <avagarwa>
Status: CLOSED DEFERRED
QA Contact: libra bugs <libra-bugs>
Severity: low
Docs Contact:
Priority: high
Version: 3.3.0
CC: abhgupta, aos-bugs, bparees, ccoleman, eparis, erich, jdetiber, jliggitt, jokerman, jvyas, mmccomas, mnewby, sdodson, tstclair
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-16 19:54:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1378171

Description Jason DeTiberus 2016-08-23 20:59:44 UTC
Description of problem:

A cluster installed with openshift-ansible does not pass the conformance tests. Some tests assume that files are available in a particular location or that AllowAll auth is enabled.

Version-Release number of selected component (if applicable):

3.3

How reproducible:

100%

Steps to Reproduce:
1. Install a cluster using oo-install or openshift-ansible
2. Run 'KUBECONFIG=/etc/origin/master/admin.kubeconfig /usr/libexec/atomic-openshift/extended.test --ginkgo.v=true --ginkgo.skip="" --ginkgo.focus="Conformance"'

Actual results:

(dgoodwin@wrx ~) $ cat conformance.log2| grep "Failure " | grep -v OnFailure
• Failure in Spec Setup (BeforeEach) [0.008 seconds]
• Failure in Spec Setup (BeforeEach) [0.017 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.011 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.009 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.007 seconds]
• Failure in Spec Setup (BeforeEach) [0.006 seconds]
• Failure [67.671 seconds]
• Failure [67.810 seconds]
• Failure [15.328 seconds]
• Failure [7.091 seconds]
• Failure [5.186 seconds]
• Failure [5.176 seconds]
• Failure in Spec Setup (BeforeEach) [0.008 seconds]
• Failure in Spec Setup (BeforeEach) [0.014 seconds]
• Failure in Spec Setup (BeforeEach) [0.008 seconds]
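
The same tally can be produced in one pass over the log. This is a minimal Python sketch, assuming each failed spec is marked by a line starting with "• Failure" as in the output above; the function name tally_failures is hypothetical and not part of any test tooling:

```python
import re
from collections import Counter

# Matches Ginkgo failure markers such as:
#   • Failure [67.671 seconds]
#   • Failure in Spec Setup (BeforeEach) [0.008 seconds]
# Lines that merely contain "OnFailure" do not match.
FAILURE_RE = re.compile(
    r"^\s*•\s+Failure(?P<where> in Spec Setup \(BeforeEach\))?\s+\["
)


def tally_failures(lines):
    """Count setup (BeforeEach) failures and test-body failures separately."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.match(line)
        if match is None:
            continue
        kind = "setup" if match.group("where") else "spec"
        counts[kind] += 1
    return counts
```

Passing an open log file, for example tally_failures(open("conformance.log2", encoding="utf-8")), returns a Counter with "setup" and "spec" keys, which separates the identical BeforeEach failures from the ones that need individual investigation.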

The spec setup ones all look identical:

STEP: Building a namespace api object
Aug 18 15:09:48.925: INFO: challenger chose not to retry the request
Aug 18 15:09:48.926: INFO: Unknown output type: . Skipping.
Aug 18 15:09:48.926: INFO: Waiting up to 1m0s for all nodes to be ready

• Failure in Spec Setup (BeforeEach) [0.008 seconds]
[builds][Conformance] s2i build with a quota
/builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/test/extended/builds/s2i_quota.go:52
  Building from a template [BeforeEach]
  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/test/extended/builds/s2i_quota.go:51
    should create an s2i build with a quota and run it
    /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/test/extended/builds/s2i_quota.go:50

    Aug 18 15:09:48.925: challenger chose not to retry the request

    /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/test/extended/util/cli.go:390



The remaining 6 failures look legitimate and varied: some DNS timeouts,
and possibly a network issue as well. You can isolate them by searching
for "Failure \[", but they are also listed below:

This one appears twice:

• Failure [67.671 seconds]
[k8s.io] DNS
/builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:685
  should provide DNS for the cluster [Conformance] [It]
  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/dns.go:304

  Expected error:
      <*errors.errorString | 0xc8200ea0a0>: {
          s: "timed out waiting for the condition",
      }
      timed out waiting for the condition
  not to have occurred

  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/dns.go:204
------------------------------
[k8s.io] DNS
  should provide DNS for services [Conformance]
  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/dns.go:352
STEP: Creating a kubernetes client
Aug 18 14:47:05.925: INFO: >>> kubeConfig: /etc/origin/master/admin.kubeconfig




• Failure [15.328 seconds]
[k8s.io] ClusterDns [Feature:Example]
/builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:685
  should create pod that uses dns [Conformance] [It]
  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/example_cluster_dns.go:151

  Expected error:
      <*errors.errorString | 0xc821c264f0>: {
          s: "Error running &{/usr/bin/kubectl [kubectl
--server=https://172.16.132.37:8443
--kubeconfig=/etc/origin/master/admin.kubeconfig create -f
examples/cluster-dns/dns-backend-rc.yaml
--namespace=e2e-tests-dnsexample0-onwj4] []  <nil>  the path
\"examples/cluster-dns/dns-backend-rc.yaml\" does not exist\n [] <nil>
0xc82028aa00 exit status 1 <nil> true [0xc8204149c0 0xc8204149e0
0xc820414b30] [0xc8204149c0 0xc8204149e0 0xc820414b30] [0xc8204149d8
0xc820414b28] [0xa915d0 0xa915d0] 0xc8211ea360}:\nCommand
stdout:\n\nstderr:\nthe path
\"examples/cluster-dns/dns-backend-rc.yaml\" does not
exist\n\nerror:\nexit status 1\n",
      }
      Error running &{/usr/bin/kubectl [kubectl
--server=https://172.16.132.37:8443
--kubeconfig=/etc/origin/master/admin.kubeconfig create -f
examples/cluster-dns/dns-backend-rc.yaml
--namespace=e2e-tests-dnsexample0-onwj4] []  <nil>  the path
"examples/cluster-dns/dns-backend-rc.yaml" does not exist
       [] <nil> 0xc82028aa00 exit status 1 <nil> true [0xc8204149c0
0xc8204149e0 0xc820414b30] [0xc8204149c0 0xc8204149e0 0xc820414b30]
[0xc8204149d8 0xc820414b28] [0xa915d0 0xa915d0] 0xc8211ea360}:
      Command stdout:

      stderr:
      the path "examples/cluster-dns/dns-backend-rc.yaml" does not exist

      error:
      exit status 1

  not to have occurred

  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/util.go:2003




• Failure [7.091 seconds]
[k8s.io] hostPath
/builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:685
  should support subPath [Conformance] [It]
  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/host_path.go:121

  Expected error:
      <*errors.errorString | 0xc82199d710>: {
          s: "pod 'pod-host-path-test' terminated with failure:
&{ExitCode:1 Signal:0 Reason:Error Message: StartedAt:{Time:2016-08-18
14:54:46 -0400 EDT} FinishedAt:{Time:2016-08-18 14:54:46 -0400 EDT}
ContainerID:docker://9fb9c131e9055330065cc58c25c7c322e0fba2be1bbb115182ae191498b9f07a}",
      }
      pod 'pod-host-path-test' terminated with failure: &{ExitCode:1
Signal:0 Reason:Error Message: StartedAt:{Time:2016-08-18 14:54:46
-0400 EDT} FinishedAt:{Time:2016-08-18 14:54:46 -0400 EDT}
ContainerID:docker://9fb9c131e9055330065cc58c25c7c322e0fba2be1bbb115182ae191498b9f07a}
  not to have occurred

  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/util.go:2116



• Failure [5.186 seconds]
[k8s.io] Proxy
/builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:685
  version v1
  /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/proxy.go:41
    should proxy to cadvisor [Conformance] [It]
    /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/proxy.go:62

    Expected error:
        <*errors.StatusError | 0xc821c31900>: {
            ErrStatus: {
                TypeMeta: {Kind: "", APIVersion: ""},
                ListMeta: {SelfLink: "", ResourceVersion: ""},
                Status: "Failure",
                Message: "an error on the server has prevented the
request from succeeding",
                Reason: "InternalError",
                Details: {
                    Name: "",
                    Group: "",
                    Kind: "",
                    Causes: [
                        {
                            Type: "UnexpectedServerResponse",
                            Message: "Error: 'dial tcp
172.16.132.35:4194: getsockopt: no route to host'\nTrying to reach:
'http://172.16.132.35:4194/containers/'",
                            Field: "",
                        },
                    ],
                    RetryAfterSeconds: 0,
                },
                Code: 503,
            },
        }
        an error on the server has prevented the request from succeeding
    not to have occurred

    /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/proxy.go:313


Expected results:

Conformance tests pass

Additional info:

Digging in a bit, I think I've identified the cause of the OpenShift-specific test failures. The NewCLI function from origin/test/extended/util/cli.go returns a CLI with the username 'admin' to the tests that use it, and the cluster is configured for DenyAll auth, which would explain the error seen for those tests: "challenger chose not to retry the request".
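
One way to confirm which auth mode a cluster is using is to look at the identity provider kinds in the master config. This is a minimal Python sketch under two assumptions that are not stated in this report: that the config lives at /etc/origin/master/master-config.yaml, and that providers appear as "kind: ...IdentityProvider" lines (for example DenyAllPasswordIdentityProvider or AllowAllPasswordIdentityProvider). The function names are hypothetical, and the scan is a plain regex rather than a YAML parse:

```python
import re

# Matches lines such as "kind: DenyAllPasswordIdentityProvider".
PROVIDER_KIND_RE = re.compile(
    r"^\s*kind:\s*(\w*IdentityProvider)\s*$", re.MULTILINE
)


def identity_provider_kinds(master_config_text):
    """Return the identity provider kinds named in the config text, in order."""
    return PROVIDER_KIND_RE.findall(master_config_text)


def allows_any_password(master_config_text):
    """True only if an AllowAll password provider is configured."""
    kinds = identity_provider_kinds(master_config_text)
    return "AllowAllPasswordIdentityProvider" in kinds
```

If allows_any_password returns False for the installed cluster's config, tests that log in with an arbitrary username would be expected to fail during setup, which is consistent with the BeforeEach failures above.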

Some additional info about the failures from the Kubernetes e2e tests:
- These two Kubernetes DNS e2e failures look legitimate; the test output shows multiple attempts to query DNS before failing.
  - [Fail] [k8s.io] DNS [It] should provide DNS for the cluster [Conformance] 
    /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/dns.go:204

  - [Fail] [k8s.io] DNS [It] should provide DNS for services [Conformance] 
    /builddir/build/BUILD/atomic-openshift-git-0.be35299/_build/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/dns.go:204


- [Fail] [k8s.io] ClusterDns [Feature:Example] [It] should create pod that uses dns [Conformance] 
  - Error running &{/usr/bin/kubectl [kubectl --server=https://172.16.132.37:8443 --kubeconfig=/etc/origin/master/admin.kubeconfig create -f examples/cluster-dns/dns-backend-rc.yaml --namespace=e2e-tests-dnsexample0-onwj4] []  <nil>  the path "examples/cluster-dns/dns-backend-rc.yaml" does not exist
  - I don't think we are packaging the examples up, and I suspect even if we were, we'd need to have a way to make sure that the test is run in the proper working directory to find them.

- [Fail] [k8s.io] hostPath [It] should support subPath [Conformance] 
  - Aug 18 14:54:45.720: INFO: No Status.Info for container 'test-container-1' in pod 'pod-host-path-test' yet
  - Aug 18 14:54:45.720: INFO: Waiting for pod pod-host-path-test in namespace 'e2e-tests-hostpath-qxcmb' status to be 'success or failure'(found phase: "Pending", readiness: false) (2.504625ms elapsed)
  - Aug 18 14:54:47.723: INFO: Unexpected error occurred: pod 'pod-host-path-test' terminated with failure: &{ExitCode:1 Signal:0 Reason: Error Message: StartedAt:{Time:2016-08-18 14:54:46 -0400 EDT} FinishedAt:{Time:2016-08-18 14:54:46 -0400 EDT} ContainerID:docker://9fb9c131e9055330065cc58c25c7c322e0fba2be1bbb115182ae191498b9f07a}


- cadvisor proxy tests:
  - [Fail] [k8s.io] Proxy version v1 [It] should proxy to cadvisor [Conformance]
    - Aug 18 14:55:03.175: INFO: /api/v1/proxy/nodes/172.16.132.35:4194/containers/: no body (0; 8.174766ms)
    - No service on the host is listening on 4194 and we do not open the firewall port for 4194
    - This test is also listed in EXCLUDED_TESTS in test/extended/setup.sh; the test below should probably be added there as well.
  - [Fail] [k8s.io] Proxy version v1 [It] should proxy to cadvisor using proxy subresource [Conformance] 
    - Same as above.
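
The "nothing listening on 4194" observation can be checked directly from the master. This is a minimal Python sketch using only the standard library; the function name port_is_open is hypothetical:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and "no route to host".
        return False
```

For the node in the log above, port_is_open("172.16.132.35", 4194) would be the call to make. A False result does not distinguish between a closed firewall port and a missing listener, so it only confirms that the proxy test cannot succeed as the cluster is configured.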

Comment 1 Avesh Agarwal 2016-08-25 16:09:09 UTC
I can reproduce the following 4 out of 6:

[Fail] [k8s.io] hostPath [It] should support subPath [Conformance]
[Fail] [k8s.io] ClusterDns [Feature:Example] [It] should create pod that uses dns [Conformance]
[Fail] [k8s.io] Proxy version v1 [It] should proxy to cadvisor [Conformance]
[Fail] [k8s.io] Proxy version v1 [It] should proxy to cadvisor using proxy subresource [Conformance]

I cannot reproduce these 2:
should provide DNS for the cluster [Conformance]
should provide DNS for services [Conformance]

Comment 2 Avesh Agarwal 2016-08-25 16:17:40 UTC
On my f24 VM the following test:
 
[Fail] [k8s.io] hostPath [It] should support subPath [Conformance]

is failing due to SELinux AVC denials: the test passes in SELinux permissive mode.
I will test on RHEL 7 too to make sure.

Comment 3 Avesh Agarwal 2016-08-25 16:19:23 UTC
time->Thu Aug 25 12:18:26 2016
type=AVC msg=audit(1472141906.557:3523): avc:  denied  { write } for  pid=15682 comm="mt" name="test-file" dev="tmpfs" ino=317712 scontext=system_u:system_r:svirt_lxc_net_t:s0:c29,c183 tcontext=system_u:object_r:docker_tmp_t:s0 tclass=file permissive=0
----
time->Thu Aug 25 12:18:27 2016
type=AVC msg=audit(1472141907.814:3527): avc:  denied  { open } for  pid=15771 comm="mt" path="/test-volume/sub-path/test-file" dev="tmpfs" ino=317712 scontext=system_u:system_r:svirt_lxc_net_t:s0:c143,c496 tcontext=system_u:object_r:docker_tmp_t:s0 tclass=file permissive=0
----
time->Thu Aug 25 12:18:29 2016
type=AVC msg=audit(1472141909.814:3528): avc:  denied  { open } for  pid=15771 comm="mt" path="/test-volume/sub-path/test-file" dev="tmpfs" ino=317712 scontext=system_u:system_r:svirt_lxc_net_t:s0:c143,c496 tcontext=system_u:object_r:docker_tmp_t:s0 tclass=file permissive=0
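
To compare denials across hosts (for example f24 versus RHEL 7), it can help to reduce each audit record to the fields that matter. This is a minimal Python sketch, assuming records in the single-line format shown above; the name parse_avc is hypothetical:

```python
import re

# Pulls the denied permissions, the command, and the source/target
# contexts out of a single-line AVC audit record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"comm=\"(?P<comm>[^\"]*)\".*?"
    r"scontext=(?P<scontext>\S+)\s+"
    r"tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def parse_avc(line):
    """Return a dict of AVC fields, or None if the line is not a denial."""
    match = AVC_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["perms"] = fields["perms"].split()
    return fields
```

The third colon-separated component of scontext and tcontext is the SELinux type, so parse_avc(line)["tcontext"].split(":")[2] gives the type of the file the container process was denied access to.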

Comment 4 Avesh Agarwal 2016-08-25 16:49:52 UTC
I have sent a PR to origin to exclude the test "should proxy to cadvisor using proxy subresource":

https://github.com/openshift/origin/pull/10647

Comment 5 Avesh Agarwal 2016-08-25 17:10:29 UTC
As per discussion with Andy Goldstein, I think the right way to run these tests is by invoking test/extended/core.sh.

Comment 6 Timothy St. Clair 2016-08-26 14:37:20 UTC
That is not in our rpms, and the name is very confusing to an average user.

Comment 7 Jason DeTiberus 2016-08-26 18:31:51 UTC
Once we have a direction forward for this and are ready for the packaging work, this bug can be reassigned to Troy Dawson or Scott Dodson.