Bug 1723914 - 4.2 CI failed with - Not all desired DNS DaemonSets available
Summary: 4.2 CI failed with - Not all desired DNS DaemonSets available
Keywords:
Status: CLOSED DUPLICATE of bug 1725832
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: DNS
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Dan Mace
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-06-25 17:55 UTC by Ben Bennett
Modified: 2019-09-13 08:05 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-13 08:05:15 UTC
Target Upstream Version:



Description Ben Bennett 2019-06-25 17:55:30 UTC
Description of problem:

A 4.2 CI run failed with "Not all desired DNS DaemonSets available".

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.2/56


Version-Release number of selected component (if applicable):

release:4.2.0-0.nightly-2019-06-25-143607


How reproducible:

Only seen once.

Comment 1 Dan Mace 2019-06-25 19:35:54 UTC
https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.2/56/artifacts/e2e-aws/must-gather/namespaces/openshift-dns/core/events.yaml
https://storage.googleapis.com/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.2/56/artifacts/e2e-aws/must-gather/namespaces/openshift-dns/apps/daemonsets.yaml
https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.2/56/artifacts/e2e-aws/must-gather/namespaces/openshift-dns/pods/

The current line of inquiry is dns-default-cg574, which appears in the event log but not in the state dump.

    message: 'Failed create pod sandbox: rpc error: code = Unknown desc = failed to
    create pod network sandbox k8s_dns-default-cg574_openshift-dns_80dff9ab-975b-11e9-a1aa-0a349e682728_0(d64b9d47df6414e11f0ad4cbed67d6f216bc38d3aff6812670239fc22e20ec03):
    netplugin failed but error parsing its diagnostic message "": unexpected end of
    JSON input'
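
The pod, namespace, UID, and attempt can be read straight out of the sandbox name in that message. A minimal sketch, assuming the name follows the `k8s_<pod>_<namespace>_<uid>_<attempt>(<sandbox id>)` layout visible above; the regex and the `parse_sandbox` helper are illustrative, not part of any OpenShift tooling:

```python
import re

# Assumed layout, inferred from the message above:
#   k8s_<pod>_<namespace>_<uid>_<attempt>(<sandbox id>)
SANDBOX_RE = re.compile(
    r"k8s_(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<uid>[0-9a-f-]+)_(?P<attempt>\d+)"
    r"\((?P<sandbox_id>[0-9a-f]+)\)"
)


def parse_sandbox(message):
    """Return the sandbox name fields as a dict, or None if nothing matches."""
    match = SANDBOX_RE.search(message)
    return match.groupdict() if match else None


message = (
    "failed to create pod network sandbox "
    "k8s_dns-default-cg574_openshift-dns_80dff9ab-975b-11e9-a1aa-0a349e682728_0"
    "(d64b9d47df6414e11f0ad4cbed67d6f216bc38d3aff6812670239fc22e20ec03): "
    "netplugin failed"
)
fields = parse_sandbox(message)
print(fields["pod"], fields["namespace"], fields["attempt"])
```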

There are lots of other SDN errors for the pods that do exist, logged before those pods were finally created.

Still looking around, just wanted to communicate some notes.
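
The cross-check between the event log and the state dump can be scripted. A rough sketch, assuming events.yaml and the pod listing have already been loaded into plain Python structures and that each event carries the standard `involvedObject` reference; the helper name and every sample value other than dns-default-cg574 are hypothetical:

```python
def pods_missing_from_dump(events, dumped_pod_names):
    """Return pod names that events reference but the state dump lacks."""
    referenced = {
        event["involvedObject"]["name"]
        for event in events
        if event.get("involvedObject", {}).get("kind") == "Pod"
    }
    return sorted(referenced - set(dumped_pod_names))


# Hypothetical sample data shaped like Kubernetes Event objects.
events = [
    {"involvedObject": {"kind": "Pod", "name": "dns-default-cg574"}},
    {"involvedObject": {"kind": "Pod", "name": "dns-default-aaaaa"}},
    {"involvedObject": {"kind": "DaemonSet", "name": "dns-default"}},
]
dumped = ["dns-default-aaaaa"]

print(pods_missing_from_dump(events, dumped))
```

A pod that shows up only in the event list was either deleted or never finished being created before must-gather ran, which makes it a reasonable place to start.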

Comment 3 Dan Mace 2019-06-25 19:49:07 UTC
Is our status reporting here correct? We're reporting degraded, which seems appropriate.

Comment 4 Dan Mace 2019-07-30 14:06:10 UTC
All evidence so far points to some transient SDN issue. If this is still happening, feel free to re-open against SDN.

Comment 5 W. Trevor King 2019-09-13 04:01:36 UTC
New bug filed as bug 1751246. Marking this one as a dup of the new one so they have a structured Bugzilla connection ;)

*** This bug has been marked as a duplicate of bug 1751246 ***

Comment 6 Casey Callendrello 2019-09-13 08:04:04 UTC
This is definitely not a duplicate of the other bug: loopback is coredumping.

Jun 25 15:12:12 ip-10-0-155-42 systemd-coredump[2686]: Process 2642 (loopback) of user 0 dumped core.
                                                       
                                                       Stack trace of thread 2642:
                                                       #0  0x00007f61c09960d3 _dl_relocate_object (/usr/lib64/ld-2.28.so)
                                                       #1  0x00007f61c098e1af dl_main (/usr/lib64/ld-2.28.so)
                                                       #2  0x00007f61c09a3b00 _dl_sysdep_start (/usr/lib64/ld-2.28.so)
                                                       #3  0x00007f61c098c0f8 _dl_start (/usr/lib64/ld-2.28.so)
                                                       #4  0x00007f61c098b038 _start (/usr/lib64/ld-2.28.so)


It looks like another instance of https://bugzilla.redhat.com/show_bug.cgi?id=1725832
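
To check other nodes for the same crash, the journal can be filtered for coredump entries. A small sketch, assuming the `systemd-coredump` line layout shown above; the `coredumps` helper and the second sample line are hypothetical:

```python
import re

# Matches the summary line systemd-coredump writes, as seen in this comment.
COREDUMP_RE = re.compile(
    r"systemd-coredump\[\d+\]: Process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"of user (?P<uid>\d+) dumped core\."
)


def coredumps(lines, comm=None):
    """Return (pid, command) pairs for coredump lines, optionally filtered by command."""
    hits = []
    for line in lines:
        match = COREDUMP_RE.search(line)
        if match and (comm is None or match.group("comm") == comm):
            hits.append((int(match.group("pid")), match.group("comm")))
    return hits


journal = [
    "Jun 25 15:12:12 ip-10-0-155-42 systemd-coredump[2686]: "
    "Process 2642 (loopback) of user 0 dumped core.",
    # Hypothetical unrelated line, included to show it is skipped.
    "Jun 25 15:12:13 ip-10-0-155-42 example[1]: unrelated line",
]

print(coredumps(journal, comm="loopback"))
```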

Comment 7 Casey Callendrello 2019-09-13 08:05:15 UTC

*** This bug has been marked as a duplicate of bug 1725832 ***

