Bug 1297521
| Summary: | Scaling up pod causes loop with Node is out of disk | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ryan Howe <rhowe> | ||||
| Component: | Node | Assignee: | Andy Goldstein <agoldste> | ||||
| Status: | CLOSED ERRATA | QA Contact: | DeShuai Ma <dma> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 3.1.0 | CC: | aos-bugs, decarr, eparis, erich, jokerman, mmccomas, pep, tdawson, xtian | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | atomic-openshift-3.1.1.900-1.git.1.bacd67f.el7 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-05-12 16:26:35 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1267746 | ||||||
| Attachments: |
Description
Ryan Howe
2016-01-11 18:45:35 UTC
This should be resolved with the next rebase into origin. The following upstream PRs add the ability to prevent scheduling to nodes that are out of disk:
https://github.com/kubernetes/kubernetes/pull/16178
https://github.com/kubernetes/kubernetes/pull/16179
Not a 3.1.1 blocker. The upstream fixes merged Oct 29 and Nov 2; this will be fixed when the rebase lands.
The upstream PRs have landed in the openshift/origin repository.
Verified on openshift v3.1.1.905
steps:
1. Get the node
[root@openshift-115 dma]# oc get node
NAME STATUS AGE
openshift-115.lab.sjc.redhat.com Ready,SchedulingDisabled 1d
openshift-136.lab.sjc.redhat.com Ready 1d
2. Create an rc and scale it to replicas=0
[root@openshift-115 dma]# oc get rc -n dma
CONTROLLER CONTAINER(S) IMAGE(S) SELECTOR REPLICAS AGE
mysql-1 mysql brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhscl/mysql-56-rhel7:latest deployment=mysql-1,deploymentconfig=mysql,name=mysql 0 18m
3. Create a large file to fill the disk to 100% usage
[root@openshift-136 ~]# df -lh
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rhel72-root 10G 10G 20K 100% /
devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs 1.9G 190M 1.7G 11% /run
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/vda1 497M 197M 300M 40% /boot
tmpfs 380M 0 380M 0% /run/user/0
4. Scale the rc to replicas=3
# oc scale rc/mysql-1 --replicas=3 -n dma
5. Check the pod status
[root@openshift-115 dma]# oc get pod -n dma
NAME READY STATUS RESTARTS AGE
mysql-1-8ss17 0/1 Pending 0 1m
mysql-1-aj620 0/1 Pending 0 1m
mysql-1-ufryk 0/1 Pending 0 1m
[root@openshift-115 dma]# oc describe pod/mysql-1-8ss17 -n dma|grep FailedScheduling
1m 33s 7 {default-scheduler } Warning FailedScheduling no nodes available to schedule pods
[root@openshift-115 dma]# oc describe pod/mysql-1-aj620 -n dma|grep FailedScheduling
2m 11s 12 {default-scheduler } Warning FailedScheduling no nodes available to schedule pods
[root@openshift-115 dma]# oc describe pod/mysql-1-ufryk -n dma|grep FailedScheduling
2m 14s 13 {default-scheduler } Warning FailedScheduling no nodes available to schedule pods
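The disk check in step 3 and the pod check in step 5 can be scripted. The following is a minimal sketch and was not part of the original verification: `full_filesystems` and `pending_pods` are hypothetical helper names, and both assume the column layouts shown in the output above (Use% in column 5 and the mount point in column 6 of `df`, STATUS in column 3 of `oc get pod`).

```shell
# Print mount points whose usage is at or above a threshold (default 100).
# Reads df-style output on stdin; the first line is treated as the header.
full_filesystems() {
    threshold="${1:-100}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)          # strip the trailing percent sign
        if (use + 0 >= t + 0) print $6
    }'
}

# Print the names of pods whose STATUS column is Pending.
# Reads "oc get pod" table output on stdin; the first line is treated as the header.
pending_pods() {
    awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Possible usage (assumes the same namespace as the steps above):
#   df -lh | full_filesystems
#   for p in $(oc get pod -n dma | pending_pods); do
#       oc describe "pod/$p" -n dma | grep FailedScheduling
#   done
```

Parsing the table output keeps the sketch close to the commands used in the steps; it will break if the column order differs in another client version.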
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2016:1064