Bug 1976387
Summary: | sdn containers - readiness probe failing - nonexisting /etc/cni/net.d/80-openshift-network.conf file | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jan Chaloupka <jchaloup> |
Component: | Networking | Assignee: | Mohamed Mahmoud <mmahmoud> |
Networking sub component: | openshift-sdn | QA Contact: | zhaozhanqi <zzhao> |
Status: | CLOSED DUPLICATE | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | astoycos, bbennett, pehunt |
Version: | 4.9 | ||
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-07-15 14:09:37 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Jan Chaloupka
2021-06-25 22:16:53 UTC
Also, this is not localized to a single job; there are others. Below is the number of "Probe failed" lines for "openshift-sdn", summed across all nodes, for each job run.

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-ci-4.9-e2e-aws
1408499853863948288 -> 55
1408137236276318208 -> 54
1407774700578279424 -> 49
1407412255246520320 -> 48
1407049775383056384 -> 46
1406687261579284480 -> 0
1406324869435494400 -> 0
1405962259166924800 -> 47
1405599673976098817 -> 0
1405237081269080064 -> 0
1404874471273140224 -> 44 (6-15 11:52)
1404511939131871232 -> 0
1404149543762661376 -> 0
1403787152164130816 -> 2
1403424759693185024 -> 0
1403062368040128512 -> 0
1402699862897594368 -> 0
1402337406559981568 -> 0
1401974886473142272 -> 0
1401612310719500288 -> 0
1401249710009749504 -> 0
1400887319149416448 -> 0
1400524803466596352 -> 3
1400162364472430592 -> 4
1399799932109459456 -> 5

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-ci-4.9-e2e-azure
1408499853989777408 -> 49
1408137236427313152 -> 53
1407774700871880704 -> 47
1407412256152489984 -> 90
1407049775517274112 -> 0
1406687261705113600 -> 57
1406324869557129216 -> 58
1405962259393417216 -> 48
1405599674072567808 -> 7
1405237081441046528 -> 77
1404874471415746560 -> 57 (6-15 11:52)
1404511939651964928 -> 0
1404149543888490496 -> 0
1403787152361263104 -> 0
1403424759898705920 -> 0
1403062368266620928 -> 0
1402699863031812096 -> 0
1402337406731948032 -> 0
1401974886645108736 -> 0
1401612311164096512 -> 0
1401249710789890048 -> 0
1400887319367520256 -> 1
1400524803760197632 -> 0
1400162365361623040 -> 0
1399799940405792768 -> 0

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-serial
1408499854119800832 -> 8
1408137236557336576 -> 5
1407774701412945920 -> 8
1407412256626446336 -> 6
1407049775655686144 -> 6
1406687262351036416 -> 5
1406324870190469120 -> 8
1405962260718817280 -> 8
1405599674206785536 -> 7
1405237081633984512 -> 8
1404874471545769984 -> 12 (6-15 11:53)
1404511939865874432 -> 0
1404149544530219008 -> 0
1403787152927494144 -> 2
1403424760028729344 -> 0
1403062368509890560 -> 0
1402699863178612736 -> 0
1402337407499505664 -> 0
1401974887265865728 -> 0
1401612312141369344 -> 0
1401249718952005632 -> 0
1400887323310166016 -> 0
1400524811716792320 -> 0
1400162365575532544 -> 0
1399799949662621696 -> 0

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-nightly-4.9-e2e-gcp
1408523026344972288 -> 0
1408426965920124928 -> 57
1408399956833734656 -> 50
1408366545993732096 -> 100
1408333565816475648 -> 37
1408290809438015488 -> 0
1408215655278186496 -> 43
1408188377177526272 -> 119
1408112656614690816 -> 47
1408071249019539456 -> 26
1408042914583416832 -> 35
1407979468877729792 -> 30
1407965466604867584 -> 35
1407909975249915904 -> 71
1407826111701716992 -> 61
1407779446575861760 -> 42
1407730976733270016 -> 40
1407679009713557504 -> 49
1407617307236110336 -> 36
1407422783708729344 -> 27
1407368834712604672 -> 23
1407286713490870272 -> 26
1407256069159260160 -> 26
1407158247877513216 -> 16
1407141805060788224 -> 20
1407056138754592768 -> 25
1407022796508237824 -> 24
1406990828429119488 -> 16
1406965032025067520 -> 0
1406897173617971200 -> 0
1406877277626568704 -> 0
1406534725148872704 -> 18
1406509018096078848 -> 2
1406431749080092672 -> 23
1406400607555686400 -> 32
1406344090815041536 -> 23
1406301013836566528 -> 17
1406260944895479808 -> 17
1406238368362139648 -> 25
1406175443307991040 -> 15
1406153596495466496 -> 20
1406097444055289856 -> 36
1406035929683988480 -> 37
1405981743948763136 -> 26
1405780442413535232 -> 25
1405770476600430592 -> 23
1405684672725258240 -> 19
1405661817270702080 -> 31
1405616686576439296 -> 26
1405527699652349952 -> 24
1405508999712870400 -> 25
1405453021546024960 -> 37
1405424083033657344 -> 23
1405397031186337792 -> 29
1405349724776566784 -> 36
1405264231183421440 -> 32
1405248914688315392 -> 24
1405156016810627072 -> 16
1405110770324213760 -> 24
1405091023717142528 -> 0
1405047589841145856 -> 54
1405033782586642432 -> 21
1404992772485681152 -> 21
1404974946408468480 -> 30
1404872603113361408 -> 34
1404526770002071552 -> 18
1404508939034300416 -> 29 (6-14 11:40)
1404195348502548480 -> 0
1403470568166002688 -> 0
1402745667595538432 -> 0
1402020678927912960 -> 0

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-nightly-4.9-e2e-gcp-rt
1408523029662666752 -> 0
1408426946819264512 -> 548
1408399958633091072 -> 455
1408366552603955200 -> 461
1408333569146753024 -> 488
1408290814869639168 -> 0
1408215657794768896 -> 323
1408188380587495424 -> 509
1408112660024659968 -> 485
1408071254895759360 -> 425
1408042924704272384 -> 382
1407979473063645184 -> 421
1407965469939339264 -> 295
1407909976097165312 -> 441
1407826114226688000 -> 449
1407779446768799744 -> 469
1407730981636411392 -> 465
1407679017875673088 -> 445
1407617318988550144 -> 372
1407422770530226176 -> 96
1407368845160615936 -> 79
1407256072518897664 -> 147
1407158251291676672 -> 105
1407141808449785856 -> 119
1407056139593453568 -> 102
1407022799905624064 -> 114
1406990832505982976 -> 162
1406965032100564992 -> 0
1406897176881139712 -> 0
1406877281799901184 -> 0
1406534730026848256 -> 89
1406509021455716352 -> 139
1406431752439730176 -> 129
1406400610902740992 -> 114
1406344094812213248 -> 122
1406301017192009728 -> 128
1406260945075834880 -> 112
1406238348372086784 -> 132
1406175448244686848 -> 129
1406153600513609728 -> 89
1406097444885762048 -> 155
1406035908980903936 -> 125
1405780442572918784 -> 117
1405770480123645952 -> 165
1405684675984232448 -> 113
1405661820621950976 -> 137
1405616691475386368 -> 120
1405527699706875904 -> 83
1405509000614645760 -> 108
1405453021898346496 -> 96
1405424078902267904 -> 180
1405397031219892224 -> 116
1405349725611233280 -> 131
1405264227836366848 -> 134
1405248915506204672 -> 125
1405156033394905088 -> 106
1405110754855620608 -> 120
1405090997704069120 -> 72
1405047570882891776 -> 112
1405033772675502080 -> 95
1404992764315176960 -> 98
1404974920785465344 -> 137
1404872622969196544 -> 96

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#periodic-ci-openshift-release-master-nightly-4.9-e2e-vsphere
1408533142368686080 -> 0
1408523026382721024 -> 0
1408455104478056448 -> 34
1408426967753035776 -> 6
1408399952639430656 -> 13
1408366547570790400 -> 15
1408364467191812096 -> 10
1408333567477420032 -> 10
1408290810675335168 -> 0
1408273825409273856 -> 15
1408215657039794176 -> 15
1408188379769606144 -> 18
1408183217189556224 -> 23
1408112658313383936 -> 8
1408071251561287680 -> 13
1408042917049667584 -> 10
1407979470547062784 -> 15
1407965468270006272 -> 18
1407909971059806208 -> 0
1407826113379438592 -> 28
1407779446651359232 -> 8
1407730979124023296 -> 0
1407730464503894016 -> 7
1407679012892839936 -> 34
1407617315037515776 -> 52
1407549227453648896 -> 3
1407458618147606528 -> 11
1407422791266865152 -> 9
1407368842723725312 -> 9
1407368005796499456 -> 7
1407286722638647296 -> 13
1407186691839496192 -> 12
1407158255431454720 -> 8
1407141813474562048 -> 3
1407096090112561152 -> 12
1407056143800340480 -> 3
1407022804217368576 -> 10
1407005659186073600 -> 0
1406990846770810880 -> 0
1406965035137241088 -> 13
1406897181897527296 -> 5
1406877286845648896 -> 5
1406824431300382720 -> 7
1406733823135191040 -> 0
1406643266799013888 -> 4
1406534735080984576 -> 4
1406509001989951488 -> 3
1406462038309343232 -> 16
1406431756634034176 -> 3
1406400615940100096 -> 2
1406371430127374336 -> 28
1406344099010711552 -> 5
1406301022229368832 -> 5
1406260952050962432 -> 5
1406238349148033024 -> 12
1406175453277851648 -> 5
1406153605597106176 -> 3
1406097449088454656 -> 2
1406035909350002688 -> 4
1406009075359027200 -> 11
1405981753193009152 -> 5
1405918513444425728 -> 7
1405780443298533376 -> 10
1405770485215531008 -> 0
1405737286787665920 -> 10
1405684680161759232 -> 5
1405661800703201280 -> 8
1405646487823585280 -> 6
1405616696584048640 -> 11
1405527699794956288 -> 10
1405508997385031680 -> 3
1405453023580262400 -> 3
1405424080579989504 -> 6
1405397031106646016 -> 12
1405349727280566272 -> 2
1405264229514088448 -> 0
1405248917204897792 -> 12
1405193584629518336 -> 3
1405156015153876992 -> 5
1405110762778660864 -> 12
1405091002804342784 -> 4
1405047577249845248 -> 6
1405033775854784512 -> 2
1404992765158232064 -> 8
1404974929681584128 -> 5
1404921787745046528 -> 5
1404872632188276736 -> 5
1404831229940862976 -> 4
1404740595171201024 -> 7
1404650009047076864 -> 9 (6-14 21:00)
1404559350072086528 -> 0
1404526756559327232 -> 4
1404508953294934016 -> 7 (6-14 11:40)
1404378314050637824 -> 0
1404287719776980992 -> 0
1404197112677142528 -> 0
1404106554839404544 -> 0
1404015920824717312 -> 0
1403925325989023744 -> 0
1403834720696930304 -> 0
1403744159000432640 -> 4
1403653527003205632 -> 7
1403562933157367808 -> 0
1403472327181602816 -> 0
1403381772640587776 -> 0
1403291136822349824 -> 0
1403200542074736640 -> 0
1403109934266060800 -> 0

Based on the increased occurrence of the "Probe failed" logs, there's a chance something changed or got merged around (6-15 11:52)/(6-14 11:40). It could be worth investigating what it was.
The following script was used to collect the occurrences:

```
# Set JOB to exactly one of the jobs below; the script was run once per job.
#JOB=periodic-ci-openshift-release-master-ci-4.9-e2e-azure
#JOB=periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-serial
#JOB=periodic-ci-openshift-release-master-nightly-4.9-e2e-gcp
#JOB=periodic-ci-openshift-release-master-nightly-4.9-e2e-gcp-rt
JOB=periodic-ci-openshift-release-master-nightly-4.9-e2e-vsphere

# Assumption: the artifacts directory is named after the e2e-* suffix of the
# job name (e.g. e2e-vsphere, e2e-gcp-rt). The original hard-coded e2e-vsphere.
TARGET="e2e-${JOB##*-e2e-}"

# List the run IDs shown on the testgrid tab for this job.
IDS=$(curl "https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing/table?tab=${JOB}" 2>/dev/null | jq --raw-output ".changelists[]")

# Sum the "Probe failed" lines mentioning openshift-sdn over all node journals of one run.
pullData() {
  ID=$1
  total=0
  for dir in $(gsutil ls gs://origin-ci-test/logs/${JOB}/${ID}/artifacts/${TARGET}/gather-extra/artifacts/nodes 2>/dev/null); do
    count=$(gsutil cp ${dir}journal - 2>/dev/null | gunzip -c | grep "Probe failed" | grep -c "openshift-sdn")
    total=$(( total + count ))
  done
  echo " ${ID} -> $total"
}

# Process the runs in parallel, in batches of 30.
i=0
for ID in ${IDS}; do
  #echo -e "Checking https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/${JOB}/${ID}"
  pullData "$ID" &
  i=$(( i + 1 ))
  if [ $i -ge 30 ]; then
    # The original skipped the ID that triggered the wait; start it first, then wait.
    wait
    i=0
  fi
done
wait  # do not exit before the last batch has finished
```

The script was run for each job separately. The results were then sorted by job id.

All the mentioned jobs have must-gather in the bucket. E.g. https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-gcp-rt/1407422770530226176/artifacts/e2e-gcp-rt/gather-must-gather/artifacts/

It seems that recent CRI-O versions (1.20+) don't respect the probe's timeout seconds setting. That is why, according to the logs, the readiness probe check took ~3 s but still timed out even though the limit was set to 5 s.

The CRI-O team has a PR to fix this issue: https://bugzilla.redhat.com/show_bug.cgi?id=1978268
Once they confirm, this BZ will be closed as a duplicate of 1978268.

Yeah, this does look like a dup.

*** This bug has been marked as a duplicate of bug 1978268 ***
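Because the collection script prints results from background jobs, its output arrives in no particular order, which is presumably why the results were sorted by job id afterwards. The sorting step is not shown in this report; a minimal sketch of it, with an extra summary helper, might look like the following. The function names `sort_results` and `summarize` are hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original script.
# Both read lines of the form " <ID> -> <count>" on stdin.

# Sort by job id, newest (largest id) first, matching the listings above.
sort_results() {
  sort -k1,1nr
}

# Report how many of the runs had at least one matching "Probe failed" line.
summarize() {
  awk '$2 == "->" { runs++; if ($3 > 0) failing++ }
       END { printf "%d of %d runs had probe failures\n", failing, runs }'
}
```

Assuming the collection script's output was saved to a file such as `results.txt` (a hypothetical name), `sort_results < results.txt` would give the ordered listing and `summarize < results.txt` a one-line overview for that job.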