Bug 1474274 - Pod stays in ContainerCreating status when an invalid value is set for pod bandwidth
Status: NEW
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.6.0
Hardware/OS: Unspecified
Priority: medium  Severity: low
Assigned To: Ben Bennett
QA Contact: Meng Bo
Reported: 2017-07-24 05:30 EDT by Yan Du
Modified: 2017-07-26 10:54 EDT
CC: 3 users

Doc Type: If docs needed, set a value
Type: Bug

Attachments: None
Description Yan Du 2017-07-24 05:30:21 EDT
Description of problem:
Pod stays in ContainerCreating status when an invalid value is set for pod bandwidth.
# oc get pod -n d1
NAME      READY     STATUS              RESTARTS   AGE
iperf     0/1       ContainerCreating   0          1h


Version-Release number of selected component (if applicable):
openshift v3.6.153
kubernetes v1.6.1+5115d708d7

How reproducible:
Always

Steps to Reproduce:
1. Create a pod with an invalid pod bandwidth annotation
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/egress-ingress/invalid-iperf.json
2. Check the pod
# oc get pod
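For reference, the pod definition behind that URL looks roughly like the following. This is a reconstruction from the annotations and image shown in the `oc describe` output below, not the verbatim test file, which may contain additional fields; the negative quantities are what trigger the failure:

```json
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "iperf",
    "annotations": {
      "kubernetes.io/ingress-bandwidth": "-3M",
      "kubernetes.io/egress-bandwidth": "-10M"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "iperf",
        "image": "yadu/hello-openshift-iperf"
      }
    ]
  }
}
```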



Actual results:
# oc get pod -n d1
NAME      READY     STATUS              RESTARTS   AGE
iperf     0/1       ContainerCreating   0          1h

# oc describe pod iperf -n d1
Name:            iperf
Namespace:        d1
Security Policy:    anyuid
Node:            host-8-174-69.host.centralci.eng.rdu2.redhat.com/10.8.174.69
Start Time:        Mon, 24 Jul 2017 02:52:44 -0400
Labels:            <none>
Annotations:        kubernetes.io/egress-bandwidth=-10M
            kubernetes.io/ingress-bandwidth=-3M
            openshift.io/scc=anyuid
Status:            Pending
IP:            
Controllers:        <none>
Containers:
  iperf:
    Container ID:    
    Image:        yadu/hello-openshift-iperf
    Image ID:        
    Port:        
    State:        Waiting
      Reason:        ContainerCreating
    Ready:        False
    Restart Count:    0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-hf0cm (ro)
Conditions:
  Type        Status
  Initialized     True 
  Ready     False 
  PodScheduled     True 
Volumes:
  default-token-hf0cm:
    Type:    Secret (a volume populated by a Secret)
    SecretName:    default-token-hf0cm
    Optional:    false
QoS Class:    BestEffort
Node-Selectors:    <none>
Tolerations:    <none>
Events:
  FirstSeen    LastSeen    Count    From                                SubObjectPath    Type        Reason            Message
  ---------    --------    -----    ----                                -------------    --------    ------            -------
  1h        1h        1    default-scheduler                                Normal        Scheduled        Successfully assigned iperf to host-8-174-69.host.centralci.eng.rdu2.redhat.com
  1h        59m        9    kubelet, host-8-174-69.host.centralci.eng.rdu2.redhat.com            Warning        DNSSearchForming    Found and omitted duplicated dns domain in host search line: 'cluster.local' during merging with cluster dns domains
  1h        8m        114    kubelet, host-8-174-69.host.centralci.eng.rdu2.redhat.com            Normal        SandboxChanged        Pod sandbox changed, it will be killed and re-created.
  1h        3m        125    kubelet, host-8-174-69.host.centralci.eng.rdu2.redhat.com            Warning        FailedSync        Error syncing pod


Expected results:
Pod status should be Error or something similar instead of remaining in ContainerCreating.
Before 3.5, when we set an invalid value in pod bandwidth, we got a warning like "resource is unreasonably small (< 1kbit)" in the events log.

In 3.6, the FailedSync event was intentionally changed to reduce etcd event spam, per this PR: https://github.com/openshift/origin/pull/14693. Now, when we set an invalid bandwidth on a pod, there is no meaningful warning in the events log and the pod stays in ContainerCreating status. This may confuse users who don't have permission to check the node log.

Additional info:
Comment 1 Derek Carr 2017-07-25 10:05:18 EDT
Invalid values should be caught in validation, not at runtime.
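Derek's point can be illustrated with a small, self-contained sketch. This is not the actual Kubernetes code (which parses these annotations via `resource.Quantity` at admission and in the kubelet); `parseBandwidth` and its suffix handling are hypothetical, but the rejection mirrors the pre-3.5 warning quoted above:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBandwidth is a simplified, hypothetical stand-in for the quantity
// parsing Kubernetes performs on the bandwidth annotations. It accepts
// values like "10M" or "3k" and rejects anything below 1 kbit, which
// covers the negative values from this bug ("-10M", "-3M").
func parseBandwidth(v string) (int64, error) {
	suffixes := map[string]int64{"k": 1000, "M": 1000000, "G": 1000000000}
	mult := int64(1)
	for s, m := range suffixes {
		if strings.HasSuffix(v, s) {
			mult = m
			v = strings.TrimSuffix(v, s)
			break
		}
	}
	n, err := strconv.ParseInt(v, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid bandwidth %q: %v", v, err)
	}
	bits := n * mult
	if bits < 1000 {
		// Mirrors the pre-3.5 warning seen in the events log.
		return 0, fmt.Errorf("resource is unreasonably small (< 1kbit)")
	}
	return bits, nil
}

func main() {
	for _, v := range []string{"10M", "-10M", "-3M"} {
		if bits, err := parseBandwidth(v); err != nil {
			fmt.Printf("%s rejected: %v\n", v, err)
		} else {
			fmt.Printf("%s ok: %d bits/s\n", v, bits)
		}
	}
}
```

Running a check like this at admission time would reject the pod at `oc create`, before it ever reaches a node.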
Comment 2 Aleksandar Kostadinov 2017-07-25 12:33:18 EDT
Derek, makes sense to me. But for now we need to make sure the user receives some feedback. Even admins can have trouble diagnosing such issues when they don't already suspect what the cause could be.

I don't know if `ingress-bandwidth` is the only annotation that can have this problem. IMO we need to be sure to send feedback for any post-validation issues now and in the future.

While I agree that we shouldn't have post-validation issues, we obviously do, and new features can introduce them at any time. Implementing a way for such errors to be propagated back to the user is essential for a reasonable UX.
Comment 3 Derek Carr 2017-07-26 10:54:24 EDT
You can always send back a new event specific to invalid bandwidth settings. Piggybacking on the FailedSync event is not ideal. Honestly, I think the FailedSync event should go away, as it means nothing to a user. An InvalidBandwidth event is much more meaningful.
