Bug 1786374

Summary: 4.2.z upgrades failing: Node didn't have enough resource: ephemeral-storage
Product: OpenShift Container Platform
Reporter: Ben Parees <bparees>
Component: Installer
Assignee: Abhinav Dahiya <adahiya>
Installer sub component: openshift-installer
QA Contact: Johnny Liu <jialiu>
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
CC: lmohanty, vrutkovs, wking
Version: 4.2.z
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-01-02 17:11:37 UTC
Type: Bug

Description Ben Parees 2019-12-24 19:11:54 UTC
Description of problem:
4.2 upgrade jobs on Azure are failing, and the cluster reports that available updates cannot be retrieved:

https://prow.svc.ci.openshift.org/job-history/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade-4.2

example:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade-4.2/307

https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade-4.2/307/artifacts/e2e-azure-upgrade/clusterversion.json


                    {
                        "lastTransitionTime": "2019-12-24T16:48:06Z",
                        "message": "Working towards registry.svc.ci.openshift.org/ocp/release:4.2.0-0.ci-2019-12-23-142302: downloading update",
                        "reason": "DownloadingUpdate",
                        "status": "True",
                        "type": "Progressing"
                    },
                    {
                        "lastTransitionTime": "2019-12-24T16:29:16Z",
                        "message": "Unable to retrieve available updates: currently installed version 4.2.0-0.ci-2019-12-20-203443 not found in the \"stable-4.2\" channel",
                        "reason": "RemoteFailed",
                        "status": "False",
                        "type": "RetrievedUpdates"
                    }
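
For triage, the two conditions quoted above can be reduced to a compact summary. The following is a minimal Python sketch, assuming the condition objects have already been pulled out of clusterversion.json into a plain list (where exactly they sit in the artifact is not shown here). The helper name `summarize_conditions` is hypothetical.

```python
import json

# The two conditions quoted above, wrapped in a JSON array so they can be parsed.
CONDITIONS_JSON = r"""
[
  {
    "lastTransitionTime": "2019-12-24T16:48:06Z",
    "message": "Working towards registry.svc.ci.openshift.org/ocp/release:4.2.0-0.ci-2019-12-23-142302: downloading update",
    "reason": "DownloadingUpdate",
    "status": "True",
    "type": "Progressing"
  },
  {
    "lastTransitionTime": "2019-12-24T16:29:16Z",
    "message": "Unable to retrieve available updates: currently installed version 4.2.0-0.ci-2019-12-20-203443 not found in the \"stable-4.2\" channel",
    "reason": "RemoteFailed",
    "status": "False",
    "type": "RetrievedUpdates"
  }
]
"""


def summarize_conditions(conditions):
    """Map each condition type to its (status, reason) pair."""
    return {c["type"]: (c["status"], c["reason"]) for c in conditions}


summary = summarize_conditions(json.loads(CONDITIONS_JSON))
for cond_type, (status, reason) in sorted(summary.items()):
    print(f"{cond_type}: status={status} reason={reason}")
```

Read this way, RetrievedUpdates=False only records that the lookup of available updates failed, while Progressing=True shows the cluster was still trying to download the target payload.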

Comment 1 Ben Parees 2019-12-24 19:14:20 UTC
GCP upgrades are also failing with the same error:

https://prow.svc.ci.openshift.org/job-history/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.2

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.2/302

https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.2/302/artifacts/e2e-gcp-upgrade/clusterversion.json


                    {
                        "lastTransitionTime": "2019-12-24T16:34:26Z",
                        "message": "Working towards registry.svc.ci.openshift.org/ocp/release:4.2.0-0.ci-2019-12-23-142302: downloading update",
                        "reason": "DownloadingUpdate",
                        "status": "True",
                        "type": "Progressing"
                    },
                    {
                        "lastTransitionTime": "2019-12-24T16:19:57Z",
                        "message": "Unable to retrieve available updates: currently installed version 4.2.0-0.ci-2019-12-20-203443 not found in the \"stable-4.2\" channel",
                        "reason": "RemoteFailed",
                        "status": "False",
                        "type": "RetrievedUpdates"
                    }

Comment 2 Lalatendu Mohanty 2020-01-02 16:37:39 UTC
The installed version is "4.2.0-0.ci-2019-12-20-203443" which looks like a CI build or nightly build, so I do not think it should be in the stable channel. Looks like a test issue.
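
The distinction drawn here, a CI or nightly build versus a released version, can be sketched as a simple string check. This is only an illustration: the pattern is inferred from the version strings quoted in this bug, the `nightly` variant is an assumption, and `classify` is a hypothetical helper, not part of any OpenShift tooling.

```python
import re

# Assumed naming pattern, inferred from the version strings quoted in this bug
# (for example 4.2.0-0.ci-2019-12-20-203443). The "nightly" form is an assumption.
CI_BUILD = re.compile(r"^\d+\.\d+\.\d+-0\.(ci|nightly)-\d{4}-\d{2}-\d{2}-\d{6}$")
RELEASE = re.compile(r"^\d+\.\d+\.\d+$")


def classify(version):
    """Return a rough label for a version string."""
    if CI_BUILD.match(version):
        return "ci-or-nightly"
    if RELEASE.match(version):
        return "release"
    return "unknown"


for v in ("4.2.0-0.ci-2019-12-20-203443", "4.2.12"):
    print(v, "->", classify(v))
```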

Comment 3 Lalatendu Mohanty 2020-01-02 16:39:48 UTC
Versions currently available as updates from 4.2.0 in the stable-4.2 channel:

$ curl --silent --header 'Accept:application/json' 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.2' | jq '. as $graph | $graph.nodes | map(.version == "4.2.0") | index(true) as $orig | $graph.edges | map(select(.[0] == $orig)[1]) | map($graph.nodes[.])'

[
  {
    "version": "4.2.12",
    "payload": "quay.io/openshift-release-dev/ocp-release@sha256:77ade34c373062c6a6c869e0e56ef93b2faaa373adadaac1430b29484a24d843",
    "metadata": {
      "io.openshift.upgrades.graph.release.manifestref": "sha256:77ade34c373062c6a6c869e0e56ef93b2faaa373adadaac1430b29484a24d843",
      "description": "",
      "url": "https://access.redhat.com/errata/RHBA-2019:4181",
      "io.openshift.upgrades.graph.release.channels": "candidate-4.2,fast-4.2,stable-4.2"
    }
  },
  {
    "version": "4.2.9",
    "payload": "quay.io/openshift-release-dev/ocp-release@sha256:f28cbabd1227352fe704a00df796a4511880174042dece96233036a10ac61639",
    "metadata": {
      "url": "https://access.redhat.com/errata/RHBA-2019:3953",
      "description": "",
      "io.openshift.upgrades.graph.release.manifestref": "sha256:f28cbabd1227352fe704a00df796a4511880174042dece96233036a10ac61639",
      "io.openshift.upgrades.graph.release.channels": "candidate-4.2,fast-4.2,stable-4.2"
    }
  },
  {
    "version": "4.2.10",
    "payload": "quay.io/openshift-release-dev/ocp-release@sha256:dc2e38fb00085d6b7f722475f8b7b758a0cb3a02ba42d9acf8a8298a6d510d9c",
    "metadata": {
      "url": "https://access.redhat.com/errata/RHBA-2019:4093",
      "description": "",
      "io.openshift.upgrades.graph.release.manifestref": "sha256:dc2e38fb00085d6b7f722475f8b7b758a0cb3a02ba42d9acf8a8298a6d510d9c",
      "io.openshift.upgrades.graph.release.channels": "candidate-4.2,fast-4.2,stable-4.2"
    }
  },
  {
    "version": "4.2.7",
    "payload": "quay.io/openshift-release-dev/ocp-release@sha256:bac62983757570b9b8f8bc84c740782984a255c16372b3e30cfc8b52c0a187b9",
    "metadata": {
      "description": "",
      "io.openshift.upgrades.graph.release.channels": "candidate-4.2,fast-4.2,stable-4.2",
      "url": "https://access.redhat.com/errata/RHBA-2019:3869",
      "io.openshift.upgrades.graph.release.manifestref": "sha256:bac62983757570b9b8f8bc84c740782984a255c16372b3e30cfc8b52c0a187b9"
    }
  },
  {
    "version": "4.2.8",
    "payload": "quay.io/openshift-release-dev/ocp-release@sha256:4bf307b98beba4d42da3316464013eac120c6e5a398646863ef92b0e2c621230",
    "metadata": {
      "url": "https://access.redhat.com/errata/RHBA-2019:3919",
      "description": "",
      "io.openshift.upgrades.graph.release.channels": "candidate-4.2,fast-4.2,stable-4.2",
      "io.openshift.upgrades.graph.release.manifestref": "sha256:4bf307b98beba4d42da3316464013eac120c6e5a398646863ef92b0e2c621230"
    }
  },
  {
    "version": "4.2.2",
    "payload": "quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0",
    "metadata": {
      "io.openshift.upgrades.graph.release.manifestref": "sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0",
      "description": "",
      "io.openshift.upgrades.graph.release.channels": "candidate-4.2,fast-4.2,stable-4.2",
      "url": "https://access.redhat.com/errata/RHBA-2019:3151"
    }
  },
  {
    "version": "4.2.11",
    "payload": "quay.io/openshift-release-dev/ocp-release@sha256:49ee20ee3102b15a7cf4c019fd8875134fda41ccda1dc27b6e4483ded2aa8a5c",
    "metadata": {
      "description": "",
      "url": "https://access.redhat.com/errata/RHBA-2019:4181",
      "io.openshift.upgrades.graph.release.channels": "candidate-4.2,fast-4.2,stable-4.2",
      "io.openshift.upgrades.graph.release.manifestref": "sha256:49ee20ee3102b15a7cf4c019fd8875134fda41ccda1dc27b6e4483ded2aa8a5c"
    }
  }
]
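
For readers less familiar with jq, the lookup performed by the filter above can be sketched in Python. As far as the command shows, the graph has a `nodes` list and an `edges` list of `[from, to]` index pairs; the filter finds the index of the starting version and returns the nodes its outgoing edges point to. The function name `update_targets` and the toy graph are hypothetical, and the edges in particular are made up for illustration, not real channel data. Unlike the jq filter, this sketch returns `None` when the starting version is missing, so that "not in the channel" can be told apart from "no updates available".

```python
def update_targets(graph, version):
    """Return the nodes one edge away from `version`, or None if it is absent."""
    versions = [node["version"] for node in graph["nodes"]]
    if version not in versions:
        return None  # the starting version is not part of this channel's graph
    origin = versions.index(version)
    # Each edge is a [from_index, to_index] pair into graph["nodes"].
    return [graph["nodes"][dst] for src, dst in graph["edges"] if src == origin]


# Toy graph for illustration only: the edges are invented, not real data.
graph = {
    "nodes": [{"version": "4.2.0"}, {"version": "4.2.2"}, {"version": "4.2.7"}],
    "edges": [[0, 1], [0, 2], [1, 2]],
}

released = update_targets(graph, "4.2.0")
ci_build = update_targets(graph, "4.2.0-0.ci-2019-12-20-203443")
print([node["version"] for node in released])
print(ci_build)
```

A CI build such as 4.2.0-0.ci-2019-12-20-203443 has no node in a graph like this, which matches the "not found in the stable-4.2 channel" message reported by the cluster.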

Comment 4 Vadim Rutkovsky 2020-01-02 16:45:42 UTC
This seems to be a red herring. The real issue is lack of ephemeral storage:

```
Dec 24 16:53:38.962 I ns/openshift-cluster-version pod/version--x9n5n-g8bdd node/ci-op-z64nnxtd-43a0f-lvpxh-master-1 created
Dec 24 16:53:38.979 I ns/openshift-cluster-version job/version--x9n5n Created pod: version--x9n5n-g8bdd
Dec 24 16:53:38.992 W ns/openshift-cluster-version pod/version--x9n5n-g8bdd Node didn't have enough resource: ephemeral-storage, requested: 2097152, used: 0, capacity: 0
Dec 24 16:53:39.006 I ns/openshift-cluster-version pod/version--x9n5n-9p5pl node/ci-op-z64nnxtd-43a0f-lvpxh-master-1 created
Dec 24 16:53:39.013 I ns/openshift-cluster-version job/version--x9n5n Created pod: version--x9n5n-9p5pl
Dec 24 16:53:39.061 W ns/openshift-cluster-version pod/version--x9n5n-9p5pl Node didn't have enough resource: ephemeral-storage, requested: 2097152, used: 0, capacity: 0
Dec 24 16:53:40.614 W clusterversion/version changed Failing to False
Dec 24 16:53:49.125 I ns/openshift-cluster-version pod/version--x9n5n-v24dt node/ci-op-z64nnxtd-43a0f-lvpxh-master-1 created
Dec 24 16:53:49.132 I ns/openshift-cluster-version job/version--x9n5n Created pod: version--x9n5n-v24dt
Dec 24 16:53:49.172 W ns/openshift-cluster-version pod/version--x9n5n-v24dt Node didn't have enough resource: ephemeral-storage, requested: 2097152, used: 0, capacity: 0
Dec 24 16:54:09.184 I ns/openshift-cluster-version pod/version--x9n5n-d4j7c node/ci-op-z64nnxtd-43a0f-lvpxh-master-1 created
Dec 24 16:54:09.191 I ns/openshift-cluster-version job/version--x9n5n Created pod: version--x9n5n-d4j7c
Dec 24 16:54:09.217 W ns/openshift-cluster-version pod/version--x9n5n-d4j7c Node didn't have enough resource: ephemeral-storage, requested: 2097152, used: 0, capacity: 0
Dec 24 16:54:49.244 I ns/openshift-cluster-version pod/version--x9n5n-kv7cr node/ci-op-z64nnxtd-43a0f-lvpxh-master-1 created
Dec 24 16:54:49.261 I ns/openshift-cluster-version job/version--x9n5n Created pod: version--x9n5n-kv7cr
Dec 24 16:54:49.267 W ns/openshift-cluster-version pod/version--x9n5n-kv7cr Node didn't have enough resource: ephemeral-storage, requested: 2097152, used: 0, capacity: 0
Dec 24 16:54:51.859 W ns/openshift-kube-apiserver pod/kube-apiserver-ci-op-z64nnxtd-43a0f-lvpxh-master-0 Removed file for secret: /%!(EXTRA *errors.StatusError=secrets "user-serving-cert-009" not found) (4 times)
Dec 24 16:56:09.284 W ns/openshift-cluster-version job/version--x9n5n Job was active longer than specified deadline
Dec 24 16:56:25.625 E clusterversion/version changed Failing to True: UpdatePayloadRetrievalFailed: Unable to download and prepare the update: deadline exceeded, reason: "DeadlineExceeded", message: "Job was active longer than specified deadline"
```
So it looks like a duplicate of bug 1786315.
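
The pattern in the log above can be pulled out mechanically. The following is a rough Python sketch that extracts the "Node didn't have enough resource" lines and reports the requested amount and node capacity; the regular expression is written against the lines quoted in this comment only, and `find_rejections` is a hypothetical helper. The sample embeds three of the quoted lines rather than the whole log.

```python
import re

# Written against the rejection lines quoted in this comment; other log
# formats are not covered.
REJECTION = re.compile(
    r"pod/(?P<pod>\S+) Node didn't have enough resource: (?P<resource>[\w-]+), "
    r"requested: (?P<requested>\d+), used: (?P<used>\d+), capacity: (?P<capacity>\d+)"
)


def find_rejections(log_text):
    """Return one dict per pod that was rejected for lack of a resource."""
    rejections = []
    for line in log_text.splitlines():
        match = REJECTION.search(line)
        if match:
            rejections.append({
                "pod": match["pod"],
                "resource": match["resource"],
                "requested": int(match["requested"]),
                "used": int(match["used"]),
                "capacity": int(match["capacity"]),
            })
    return rejections


LOG = """\
Dec 24 16:53:38.992 W ns/openshift-cluster-version pod/version--x9n5n-g8bdd Node didn't have enough resource: ephemeral-storage, requested: 2097152, used: 0, capacity: 0
Dec 24 16:53:40.614 W clusterversion/version changed Failing to False
Dec 24 16:54:49.267 W ns/openshift-cluster-version pod/version--x9n5n-kv7cr Node didn't have enough resource: ephemeral-storage, requested: 2097152, used: 0, capacity: 0
"""

rejections = find_rejections(LOG)
for r in rejections:
    mib = r["requested"] / 2**20  # bytes to MiB
    print(f"{r['pod']}: requested {mib:.0f} MiB of {r['resource']}, capacity {r['capacity']}")
```

Each rejected pod asked for only 2097152 bytes (2 MiB) of ephemeral-storage, while the node reported a capacity of 0, so the update job could never start a pod and eventually hit its deadline.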

Comment 5 W. Trevor King 2020-01-02 17:11:37 UTC
I agree with comment 4.  Closing as a dup.

*** This bug has been marked as a duplicate of bug 1786315 ***