Bug 1273149 - openshift-master keeps getting killed due to memory usage after upgrade
Summary: openshift-master keeps getting killed due to memory usage after upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Jason DeTiberus
QA Contact: Anping Li
URL:
Whiteboard:
Duplicates: 1273749, 1275983
Depends On:
Blocks: OSOPS_V3
 
Reported: 2015-10-19 18:35 UTC by Wesley Hearn
Modified: 2016-05-12 16:24 UTC
CC List: 16 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-12 16:24:48 UTC
Target Upstream Version:
Embargoed:


Attachments
Untruncated log (2.19 KB, application/x-xz), 2015-10-20 19:24 UTC, Wesley Hearn
go top. (13.38 KB, application/x-xz), 2015-10-21 15:32 UTC, Wesley Hearn
go top 2 (13.11 KB, application/x-xz), 2015-10-21 15:35 UTC, Wesley Hearn
ps output (26.63 KB, application/x-xz), 2015-10-21 17:32 UTC, Wesley Hearn
go top (112.90 KB, application/x-xz), 2015-10-21 17:32 UTC, Wesley Hearn
cpu.pprof and free_memory (144.84 KB, application/x-xz), 2015-10-27 20:49 UTC, Wesley Hearn
mem profile (316.76 KB, application/x-xz), 2015-10-28 15:56 UTC, Wesley Hearn
master log and free_memory (41.21 KB, application/x-xz), 2015-10-28 18:47 UTC, Wesley Hearn


Links
Red Hat Product Errata RHSA-2016:1064 (SHIPPED_LIVE): Important: Red Hat OpenShift Enterprise 3.2 security, bug fix, and enhancement update, last updated 2016-05-12 20:19:17 UTC

Description Wesley Hearn 2015-10-19 18:35:15 UTC
Description of problem:
After upgrading from 3.0.0.1 to 3.0.2.0, the master will not stay up for longer than 10-15 minutes. It ends up getting killed because memory runs out on the host.

Version-Release number of selected component (if applicable):
openshift-master-3.0.2.0-0.git.16.4d626fe.el7ose.x86_64

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:
[root@ip-172-31-5-247 ~]# service openshift-master status
Redirecting to /bin/systemctl status  openshift-master.service
openshift-master.service - OpenShift Master
   Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
   Active: failed (Result: signal) since Mon 2015-10-19 13:36:38 EDT; 34min ago
     Docs: https://github.com/openshift/origin
  Process: 28342 ExecStart=/usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS (code=killed, signal=KILL)
 Main PID: 28342 (code=killed, signal=KILL)
   CGroup: /system.slice/openshift-master.service

Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: E1019 13:36:27.087612   28342 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-i2u5y" is invalid: spec.containers[0].ports[...
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.087635   28342 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.093861   28342 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: E1019 13:36:27.093876   28342 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-5y7cx" is invalid: spec.containers[0].ports[...
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.093894   28342 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.100491   28342 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: E1019 13:36:27.100507   28342 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-p6gtf" is invalid: spec.containers[0].ports[...
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.100526   28342 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal systemd[1]: openshift-master.service: main process exited, code=killed, status=9/KILL
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal systemd[1]: Unit openshift-master.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

---------------------------
Looking at the full journal log for kernel messages
---------------------------
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: openshift invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: openshift cpuset=/ mems_allowed=0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU: 0 PID: 28388 Comm: openshift Not tainted 3.10.0-229.14.1.el7.x86_64 #1
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 06/01/2015
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  ffff8800df9f0000 000000002d5a2a76 ffff8800e539ba58 ffffffff81604516
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  ffff8800e539bae8 ffffffff815ff4df ffff880035541ad0 ffff880035541ae8
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  ffffffff00000206 fffeefff00000000 0000000000000001 ffffffff81117b03
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Call Trace:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff81604516>] dump_stack+0x19/0x1b
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff815ff4df>] dump_header+0x8e/0x214
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff81117b03>] ? proc_do_uts_string+0xf3/0x130
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff8115a44e>] oom_kill_process+0x24e/0x3b0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff8115ac76>] out_of_memory+0x4b6/0x4f0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff81160e35>] __alloc_pages_nodemask+0xa95/0xb90
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff811a287a>] alloc_pages_vma+0x9a/0x140
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff81182f6f>] handle_mm_fault+0x9ef/0xd60
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff8160f7e6>] __do_page_fault+0x156/0x520
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff810b1c67>] ? update_curr+0x67/0x150
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff810ad7a6>] ? __dequeue_entity+0x26/0x40
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff81012609>] ? __switch_to+0x179/0x4a0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff8160fbca>] do_page_fault+0x1a/0x70
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel:  [<ffffffff8160be08>] page_fault+0x28/0x30
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Mem-Info:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA per-cpu:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU    0: hi:    0, btch:   1 usd:   0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU    1: hi:    0, btch:   1 usd:   0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA32 per-cpu:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU    0: hi:  186, btch:  31 usd:   0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU    1: hi:  186, btch:  31 usd:   0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 Normal per-cpu:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU    0: hi:   90, btch:  15 usd:   0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU    1: hi:   90, btch:  15 usd:   0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: active_anon:826239 inactive_anon:13873 isolated_anon:0
                                                      active_file:0 inactive_file:0 isolated_file:0
                                                      unevictable:12811 dirty:0 writeback:0 unstable:0
                                                      free:20067 slab_reclaimable:5859 slab_unreclaimable:9656
                                                      mapped:5833 shmem:45848 pagetables:3158 bounce:0
                                                      free_cma:0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA free:15344kB min:276kB low:344kB high:412kB active_anon:404kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB presen
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: lowmem_reserve[]: 0 3584 3773 3773
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA32 free:64684kB min:63928kB low:79908kB high:95892kB active_anon:3304552kB inactive_anon:55492kB active_file:0kB inactive_file:0kB unevictable:51244kB isolated(anon):0kB isola
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: lowmem_reserve[]: 0 0 189 189
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 Normal free:240kB min:3376kB low:4220kB high:5064kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB pres
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: lowmem_reserve[]: 0 0 0 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA: 3*4kB (UEM) 2*8kB (UE) 3*16kB (UE) 3*32kB (UE) 1*64kB (M) 2*128kB (UE) 2*256kB (UE) 2*512kB (EM) 3*1024kB (UEM) 3*2048kB (EMR) 1*4096kB (M) = 15340kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA32: 1364*4kB (UEM) 1025*8kB (UEM) 680*16kB (UEM) 368*32kB (UE) 202*64kB (UEM) 91*128kB (UEM) 11*256kB (U) 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 64728kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 Normal: 2*4kB (E) 3*8kB (E) 9*16kB (UE) 2*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 240kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 48968 total pagecache pages
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 0 pages in swap cache
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Swap cache stats: add 0, delete 0, find 0/0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Free swap  = 0kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Total swap = 0kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 1048477 pages RAM
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 0 pages HighMem/MovableOnly


--------
Master stays up for about 10 minutes
-----------
[root@ip-172-31-5-247 ~]# service openshift-master start
Redirecting to /bin/systemctl start  openshift-master.service
[root@ip-172-31-5-247 ~]# service openshift-master status
Redirecting to /bin/systemctl status  openshift-master.service
openshift-master.service - OpenShift Master
   Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
   Active: active (running) since Mon 2015-10-19 14:15:44 EDT; 8s ago
     Docs: https://github.com/openshift/origin
 Main PID: 65260 (openshift)
   CGroup: /system.slice/openshift-master.service
           └─65260 /usr/bin/openshift start master --config=/etc/openshift/master/master-config.yaml --loglevel=2

Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:15:50.194270   65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-bhcgn" is invalid: spec.containers[0].ports[...
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.194290   65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.194756   65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:15:50.194769   65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-5dmmn" is invalid: spec.containers[0].ports[...
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.194786   65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.195770   65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:15:50.195784   65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-rxnct" is invalid: spec.containers[0].ports[...
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.195942   65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.207065   65260 replication_controller.go:432] Failed to update replica count for controller springmlb-1, requeuing
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.207130   65260 replication_controller.go:334] Too few "jboss"/"springmlb-1" replicas, need 1, creating 1

[root@ip-172-31-5-247 ~]# service openshift-master status
Redirecting to /bin/systemctl status  openshift-master.service
openshift-master.service - OpenShift Master
   Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
   Active: failed (Result: signal) since Mon 2015-10-19 14:24:48 EDT; 25s ago
     Docs: https://github.com/openshift/origin
 Main PID: 65260 (code=killed, signal=ABRT)

Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.795387   65260 replication_controller.go:432] Failed to update replica count for controller kitchensink-example-2110, requeuing
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.795483   65260 replication_controller.go:334] Too few "test"/"kitchensink-example-2110" replicas, need 5, creating 5
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.821717   65260 replication_controller.go:432] Failed to update replica count for controller nodejs-ex5-1, requeuing
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.821797   65260 replication_controller.go:334] Too few "rettoris"/"nodejs-ex5-1" replicas, need 1, creating 1
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.828359   65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:23:39.828381   65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-q542a" is invalid: spec.containers[0].ports[...
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.828403   65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.834791   65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:24:48 ip-172-31-5-247.ec2.internal systemd[1]: openshift-master.service: main process exited, code=killed, status=6/ABRT
Oct 19 14:24:48 ip-172-31-5-247.ec2.internal systemd[1]: Unit openshift-master.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

Expected results:


Additional info:
[root@ip-172-31-5-247 ~]# oc get users | egrep -v 'NAME' | wc -l
8
[root@ip-172-31-5-247 ~]# oc get projects | egrep -v 'NAME|default|openshift|openshift-infra' | wc -l
18
[root@ip-172-31-5-247 ~]# oadm manage-node --list-pods --selector='' | grep Running | wc -l
8
[root@ip-172-31-5-247 ~]# oadm manage-node --list-pods --selector=type=compute | grep Running | wc -l
4
[root@ip-172-31-5-247 ~]# oc get nodes
NAME            LABELS                                                                          STATUS                        AGE
172.31.12.197   color=black,kubernetes.io/hostname=172.31.12.197,region=us-east-1,type=infra    Ready                         4d
172.31.12.198   color=black,kubernetes.io/hostname=172.31.12.198,region=us-east-1,type=infra    Ready                         4d
172.31.14.43    color=black,kubernetes.io/hostname=172.31.14.43,region=us-east-1,type=compute   Ready                         4d
172.31.14.44    color=black,kubernetes.io/hostname=172.31.14.44,region=us-east-1,type=compute   Ready                         4d
172.31.14.45    color=black,kubernetes.io/hostname=172.31.14.45,region=us-east-1,type=compute   Ready                         4d
172.31.5.247    kubernetes.io/hostname=172.31.5.247,region=us-east-1,type=master                NotReady,SchedulingDisabled   86d

Comment 2 Derek Carr 2015-10-20 15:31:49 UTC
Is it possible to get non-truncated log output for the following:

unable to create pod replica: Pod "kitchensink-example-2110-i2u5y" is invalid: spec.containers[0].ports[...

Comment 3 Wesley Hearn 2015-10-20 19:24:59 UTC
Created attachment 1084916 [details]
Untruncated log

Also, can we get this bumped to urgent, or high at minimum, as it is blocking us from upgrading other clusters?

Comment 4 Derek Carr 2015-10-20 20:26:06 UTC
$ export OPENSHIFT_PROFILE=web
$ openshift start master

At 1-minute intervals, can you run the following:

$ go tool pprof --top http://127.0.0.1:6060/debug/pprof/heap > top-<timestamp>

And make them available in an archive.

Comment 5 Wesley Hearn 2015-10-21 15:32:31 UTC
Created attachment 1085203 [details]
go top.

There was one timeout, and I have also seen free memory on the box drop below 90M. It hasn't crashed yet this time, but the box has gone unresponsive (it goes unresponsive for a few minutes and then bounces back to normal; it did that a few times. You can see when by looking at the file names).
[root@ip-172-31-5-247 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           3534        3190          89         179         253          18
Swap:             0           0           0
[root@ip-172-31-5-247 ~]# 


[root@ip-172-31-5-247 top]# while true; do sleep 60; go tool pprof --top http://127.0.0.1:6060/debug/pprof/heap > top-$(date +%s); done
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.001.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.002.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.003.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.004.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.005.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.006.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.007.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.008.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.009.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.010.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.011.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.012.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.013.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.014.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.015.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.016.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.017.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.018.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.019.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.020.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
http fetch http://127.0.0.1:6060/debug/pprof/heap: Get http://127.0.0.1:6060/debug/pprof/heap: net/http: timeout awaiting response headers
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.021.pb.gz

Comment 6 Wesley Hearn 2015-10-21 15:35:58 UTC
Created attachment 1085204 [details]
go top 2

As soon as I posted that, it ended up getting killed.
Here are all the top logs.

Comment 7 Derek Carr 2015-10-21 15:55:13 UTC
So on a 3GB machine, it appears that openshift-master is consuming ~600MB of heap.

I assume you are not manually limiting the amount of memory available to openshift-master, so the system itself is reaching an OOM condition and the kernel OOM killer is simply targeting openshift-master first.

Can you do the following?

Modify the openshift-master unit file to set the OOMScoreAdjust to -999.

Reduce the reporting interval for the top output to 15s.

At each reporting interval, also take a snapshot of the following:

$ ps aux --sort -rss > ps-<timestamp>

And provide them in an archive.

I suspect another process is causing runaway memory usage right now, and the kernel is targeting openshift-master first because its OOMScoreAdjust defaults to 0.
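For reference, a rough sketch of that setup (the drop-in path and the 15s loop below are illustrative, not taken from this system; the unit file can also be edited directly):

# /etc/systemd/system/openshift-master.service.d/oom.conf
[Service]
OOMScoreAdjust=-999

$ systemctl daemon-reload && systemctl restart openshift-master
$ while true; do ts=$(date +%s); go tool pprof --top http://127.0.0.1:6060/debug/pprof/heap > top-$ts; ps aux --sort -rss > ps-$ts; sleep 15; done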

Comment 8 Wesley Hearn 2015-10-21 17:32:15 UTC
Created attachment 1085258 [details]
ps output

Comment 9 Wesley Hearn 2015-10-21 17:32:36 UTC
Created attachment 1085259 [details]
go top

Comment 10 Derek Carr 2015-10-21 19:46:48 UTC
I will need a full mem.pprof output now after the process dies.

$ export OPENSHIFT_PROFILE=mem
$ openshift start master

If/when the process dies, a mem.pprof file will be created.

Please provide the mem.pprof and the binary in a tar so I can resolve symbols.
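For example (a minimal sketch; the exact location of the generated mem.pprof depends on the working directory):

$ export OPENSHIFT_PROFILE=mem
$ openshift start master --config=/etc/openshift/master/master-config.yaml
# ... after the process dies, collect the profile together with the binary:
$ tar cJf mem-profile.tar.xz mem.pprof /usr/bin/openshift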

Comment 12 Derek Carr 2015-10-22 17:35:40 UTC
Can we get a script to capture when free RAM dips below 100MB?
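Something along these lines should do it (a rough sketch; the 15s sampling interval is an assumption), logging every sample so the dips below 100M stand out:

$ while true; do echo "$(date) - $(free -m | awk '/^Mem:/ {print $4}')M" >> free_memory; sleep 15; done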

Comment 14 Derek Carr 2015-10-27 20:06:32 UTC
The oscillation in reported free memory is interesting.

Thu Oct 22 13:48:29 EDT 2015 - 1667M
Thu Oct 22 14:15:45 EDT 2015 - 2351M
Thu Oct 22 14:15:50 EDT 2015 - 257M
Thu Oct 22 15:31:23 EDT 2015 - 238M
Thu Oct 22 15:31:38 EDT 2015 - 1363M
Thu Oct 22 15:31:53 EDT 2015 - 1580M
Thu Oct 22 15:32:08 EDT 2015 - 2245M
Thu Oct 22 15:32:23 EDT 2015 - 157M
Thu Oct 22 15:37:44 EDT 2015 - 994M
Thu Oct 22 15:37:59 EDT 2015 - 1477M
Thu Oct 22 15:38:14 EDT 2015 - 78M

If I look at -inuse_space in the profile, we are using about 100MB of space.

If I look at -alloc_space, we are allocating 360GB of content over the run.  

A common area of allocation aligns with the need to allocate routes:

(pprof) list RouteAllocator
Total: 360.87GB
ROUTINE ======================== github.com/openshift/origin/pkg/cmd/server/origin.(*MasterConfig).RouteAllocator in /home/decarr/go/src/github.com/openshift/ose/_output/local/go/src/github.com/openshift/origin/pkg/cmd/server/origin/master.go
   16.56GB    16.60GB (flat, cum)  4.60% of Total
         .          .    608:	factory := routeallocationcontroller.RouteAllocationControllerFactory{
         .          .    609:		OSClient:   osclient,
         .          .    610:		KubeClient: kclient,
         .          .    611:	}
         .          .    612:
         .    43.31MB    613:	plugin, err := routeplugin.NewSimpleAllocationPlugin(c.Options.RoutingConfig.Subdomain)
         .          .    614:	if err != nil {
   16.56GB    16.56GB    615:		glog.Fatalf("Route plugin initialization failed: %v", err)
         .          .    616:	}
         .          .    617:
         .          .    618:	return factory.Create(plugin)
         .          .    619:}
         .          .    620:

As you can see, the failure to allocate a route plugin is resulting in 16GB of allocated usage to log errors.  Is there any message logged to this effect in the journal that could point to a misconfiguration?

Another area of unexpected allocation is occurring here:

(pprof) list thirdpartyapi
Total: 360.87GB
ROUTINE ======================== k8s.io/kubernetes/pkg/master.(*Master).thirdpartyapi in /home/decarr/go/src/github.com/openshift/ose/Godeps/_workspace/src/k8s.io/kubernetes/pkg/master/master.go
    6.71GB     6.71GB (flat, cum)  1.86% of Total
         .          .    818:func (m *Master) thirdpartyapi(group, kind, version string) *apiserver.APIGroupVersion {
         .          .    819:	resourceStorage := thirdpartyresourcedataetcd.NewREST(m.thirdPartyStorage, group, kind)
         .          .    820:
         .          .    821:	apiRoot := "/thirdparty/" + group + "/"
         .          .    822:
    4.49MB     4.49MB    823:	storage := map[string]rest.Storage{
         .          .    824:		strings.ToLower(kind) + "s": resourceStorage,
         .          .    825:	}
         .          .    826:
         .          .    827:	return &apiserver.APIGroupVersion{
         .          .    828:		Root: apiRoot,
         .          .    829:
         .    16.03kB    830:		Creater:   thirdpartyresourcedata.NewObjectCreator(version, api.Scheme),
         .          .    831:		Convertor: api.Scheme,
         .          .    832:		Typer:     api.Scheme,
         .          .    833:
  368.22kB     1.01MB    834:		Mapper:  thirdpartyresourcedata.NewMapper(explatest.RESTMapper, kind, version),
         .          .    835:		Codec:   explatest.Codec,
         .          .    836:		Linker:  explatest.SelfLinker,
         .          .    837:		Storage: storage,
         .          .    838:		Version: version,
         .          .    839:
         .          .    840:		Admit:   m.admissionControl,
         .          .    841:		Context: m.requestContextMapper,
         .          .    842:
    6.70GB     6.70GB    843:		MinRequestTimeout: m.minRequestTimeout,

For some reason, the third party api (which is not even used) is cumulatively consuming 6.7GB of allocations for MinRequestTimeout.

Finally, another interesting outlier implies that someone is making large numbers of invalid log requests for a container in a pod.

(pprof) list pod.streamLocation
Total: 360.87GB
ROUTINE ======================== k8s.io/kubernetes/pkg/registry/pod.streamLocation in /home/decarr/go/src/github.com/openshift/ose/Godeps/_workspace/src/k8s.io/kubernetes/pkg/registry/pod/strategy.go
    5.35GB    45.99GB (flat, cum) 12.75% of Total
         .          .    320:	// Try to figure out a container
         .          .    321:	if container == "" {
         .          .    322:		if len(pod.Spec.Containers) == 1 {
         .          .    323:			container = pod.Spec.Containers[0].Name
         .          .    324:		} else {
         .    39.71GB    325:			return nil, nil, errors.NewBadRequest(fmt.Sprintf("a container name must be specified for pod %s", name))
         .          .    326:		}
         .          .    327:	}
         .          .    328:	nodeHost := pod.Spec.NodeName
         .          .    329:	if len(nodeHost) == 0 {
         .          .    330:		// If pod has not been assigned a host, return an empty location
         .          .    331:		return nil, nil, errors.NewBadRequest(fmt.Sprintf("pod %s does not have a host assigned", name))
         .          .    332:	}
         .          .    333:	nodeScheme, nodePort, nodeTransport, err := connInfo.GetConnectionInfo(nodeHost)
         .          .    334:	if err != nil {
         .   836.88kB    335:		return nil, nil, err
         .          .    336:	}
         .          .    337:	params := url.Values{}
         .          .    338:	if err := streamParams(params, opts); err != nil {
         .          .    339:		return nil, nil, err
         .          .    340:	}
         .          .    341:	loc := &url.URL{
         .          .    342:		Scheme:   nodeScheme,
         .          .    343:		Host:     fmt.Sprintf("%s:%d", nodeHost, nodePort),
         .          .    344:		Path:     fmt.Sprintf("/%s/%s/%s/%s", path, pod.Namespace, name, container),
         .   957.92MB    345:		RawQuery: params.Encode(),
         .          .    346:	}
    5.35GB     5.35GB    347:	return loc, nodeTransport, nil
         .          .    348:}
         .          .    349:
         .          .    350:// PortForwardLocation returns the port-forward URL for a pod.
         .          .    351:func PortForwardLocation(getter ResourceGetter, connInfo client.ConnectionInfoGetter, ctx api.Context, name string) (*url.URL, http.RoundTripper, error) {
         .          .    352:	pod, err := getPod(getter, ctx, name)
(pprof) 

It would be good if we could capture a CPU profile during one of those oscillation periods, to see the garbage collector in action and who is causing the largest usage of runtime.mallocgc.

$ export OPENSHIFT_PROFILE=cpu
$ openshift start master

Send the cpu.pprof and the free_mem output.

Based on your free_mem output, I am most interested in periods with a large delta in free memory: if we start and run for a while with 2.3GB free and then drop below 300M, I want that period. So basically, update your script to capture the profile when free memory first drops below 300M.
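One way to trigger that (a sketch, assuming the master is still running with OPENSHIFT_PROFILE=web so the pprof endpoint is up; the 30s profile duration is an assumption, the 300M threshold is as discussed above):

$ while true; do f=$(free -m | awk '/^Mem:/ {print $4}'); echo "$(date) - ${f}M" >> free_memory; if [ "$f" -lt 300 ]; then curl -s -o cpu-$(date +%s).pprof "http://127.0.0.1:6060/debug/pprof/profile?seconds=30"; break; fi; sleep 15; done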

Comment 15 Derek Carr 2015-10-27 20:19:49 UTC
Hmm, my line numbers may have been off in symbols.  Looking again.

Comment 16 Derek Carr 2015-10-27 20:37:12 UTC
Ok, my symbols are off.  I would still like to get the cpu.pprof as requested.

Comment 17 Derek Carr 2015-10-27 20:42:56 UTC
Issue #1 - etcd helper is churning through lots of garbage

https://github.com/kubernetes/kubernetes/pull/16384

I don't think that is the conclusive issue, but it was 14% of allocated space over the course of the run.

Comment 18 Wesley Hearn 2015-10-27 20:49:54 UTC
Created attachment 1087056 [details]
cpu.pprof and free_memory

Comment 20 Wesley Hearn 2015-10-28 15:56:34 UTC
Created attachment 1087263 [details]
mem profile

Here is the memory profile with the latest code

Comment 21 Derek Carr 2015-10-28 18:38:46 UTC
Scott / Wes,

The fix did eliminate one major source of allocations: the etcd prefix key generation went from being in the top 10 of allocators to no longer appearing in the top 100 of alloc_space. The bad news is that there is obviously still an issue elsewhere.

Wes - can you re-run the scenario, but also set the following environment variable:

$ export GODEBUG=gctrace=2

Then include the journal output for the openshift master in the tar. For each gc invocation you should see new output that looks like the following (gc<num>):

I1028 14:34:42.600323   14583 master.go:229] Started Swagger Schema API at 0.0.0.0:8443/swaggerapi/
gc145(4): 3+5+5924+3 us, 17 -> 35 MB, 208925 (2038337-1829412) objects, 92 goroutines, 4185/1546/0 sweeps, 35(2649) handoff, 9(38) steal, 246/85/8 yields
gc146(4): 0+3634+5485+173 us, 20 -> 20 MB, 99434 (2038337-1938903) objects, 92 goroutines, 4186/0/4146 sweeps, 25(1840) handoff, 11(43) steal, 156/25/2 yields
gc147(4): 9+8+5955+7 us, 20 -> 40 MB, 274409 (2236168-1961759) objects, 101 goroutines, 4323/960/0 sweeps, 29(1915) handoff, 12(55) steal, 196/43/4 yields
gc148(4): 1+3440+5876+4 us, 21 -> 21 MB, 101353 (2236168-2134815) objects, 101 goroutines, 4324/0/4282 sweeps, 23(2009) handoff, 8(42) steal, 149/36/3 yields
W1028 14:34:43.062622   14583 run_components.go:166] Could not start DNS: listen tcp4 0.0.0.0:53: bind: address already in use

Thanks!
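If the master is running under systemd rather than started by hand, one way to get the variable into the process is a drop-in (a sketch; the drop-in path is an assumption), after which the gc lines can be pulled out of the journal:

# /etc/systemd/system/openshift-master.service.d/gctrace.conf
[Service]
Environment=GODEBUG=gctrace=2

$ systemctl daemon-reload && systemctl restart openshift-master
$ journalctl -u openshift-master --no-pager > openshift-master.journal
$ grep -E 'gc[0-9]+\(' openshift-master.journal | tail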

Comment 22 Wesley Hearn 2015-10-28 18:47:39 UTC
Created attachment 1087307 [details]
master log and free_memory

Comment 23 Derek Carr 2015-10-29 15:47:14 UTC
snippet:

Oct 28 14:44:57 ip-172-31-5-247.ec2.internal openshift-master[73835]: gc325(2): 43+2+31642+23 us, 57 -> 114 MB, 856087 (73408681-72552594) objects, 1084 goroutines, 13462/0/0 sweeps, 5(331) handoff, 2(65) steal, 30/10/1 yields
Oct 28 14:44:57 ip-172-31-5-247.ec2.internal openshift-master[73835]: gc326(2): 0+29975+26136+11 us, 57 -> 57 MB, 286272 (73408681-73122409) objects, 1084 goroutines, 13462/0/12500 sweeps, 4(353) handoff, 2(73) steal, 44/6/0 yields

analysis:
For others reviewing the bug, details on the gc output format are here:
https://golang.org/pkg/runtime/

The gc debug trace shows that we are not growing the heap - it hovers between 57 and 114MB for the entire run.

We may be growing goroutines. Prior to the replication controller getting to work, we have ~730 active goroutines; by the end of the run we have 1038, and it appears to peak earlier at ~1100 goroutines.

Looking at the logs, it appears the replication controller is in a failure loop:

Oct 28 14:43:03 ip-172-31-5-247.ec2.internal openshift-master[73835]: E1028 14:43:03.573749   73835 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-b7rcs" is invalid: spec.containers[0].ports[0].name: invalid value 'kitchensink-example-tcp-8080', Details: must be an IANA_SVC_NAME (at most 15 characters, matching regex [a-z0-9]([a-z0-9-]*[a-z0-9])*, it must contain at least one letter [a-z], and hyphens cannot be adjacent to other hyphens): e.g. "http"
Oct 28 14:43:03 ip-172-31-5-247.ec2.internal openshift-master[73835]: I1028 14:43:03.573769   73835 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", UID:"4eaaa51d-71ea-11e5-9a7b-0aeb67874ee5", APIVersion:"v1", ResourceVersion:"9723983", FieldPath:""}): reason: 'failedCreate' Error creating: Pod "kitchensink-example-2110-b7rcs" is invalid: spec.containers[0].ports[0].name: invalid value 'kitchensink-example-tcp-8080', Details: must be an IANA_SVC_NAME (at most 15 characters, matching regex [a-z0-9]([a-z0-9-]*[a-z0-9])*, it must contain at least one letter [a-z], and hyphens cannot be adjacent to other hyphens): e.g. "http"

This occurs approximately 864 times before the run completes and fails.

At minimum, we need to get the templates fixed up to use valid container port names.

Comment 24 Clayton Coleman 2015-10-29 19:24:18 UTC
I've seen up to 4k goroutines on the stage environment with everything still running fine.

Comment 25 Ben Parees 2015-10-29 19:59:29 UTC
Just to get it recorded: there are no templates in the environment that define the invalid ports. The DCs that exist with invalid ports were created via the web console flow, where you specify a source, choose a builder, and the object gets generated.

Since the web console has been updated, we shouldn't see more of those created, but the existing ones need to be fixed, either by manually editing all the bad DCs (and triggering a new deployment if one does not happen automatically) or by doing a scripted migration.

The scripted migration needs to either edit etcd directly or automate the editing/patching of all the existing deploymentconfigs. I'm not the expert in this space for either of those things; David Eads might be a good person to talk to.
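For the audit step, something like the following could list the offending port names before deciding between manual edits and a scripted patch (a rough sketch, not a tested migration; it assumes python 2 on the master and only checks the 15-character limit, not the full IANA_SVC_NAME regex):

for kind in dc rc; do
  echo "== $kind =="
  oc get $kind --all-namespaces -o json | python -c '
import json, sys
for item in json.load(sys.stdin).get("items", []):
    ns = item["metadata"]["namespace"]
    name = item["metadata"]["name"]
    containers = item["spec"].get("template", {}).get("spec", {}).get("containers", [])
    for c in containers:
        for p in c.get("ports", []):
            pname = p.get("name", "")
            if len(pname) > 15:
                print "%s/%s: container %s has invalid port name %s" % (ns, name, c["name"], pname)
'
done
# offenders can then be fixed with "oc edit dc/<name> -n <namespace>" (for example,
# renaming the port to "http"); a new deployment may still need to be triggered manually.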

Comment 26 Luke Meyer 2015-10-30 12:06:21 UTC
*** Bug 1273749 has been marked as a duplicate of this bug. ***

Comment 27 Luke Meyer 2015-10-30 12:20:40 UTC
*** Bug 1275983 has been marked as a duplicate of this bug. ***

Comment 28 Derek Carr 2015-10-30 14:47:47 UTC
Further analysis appears to show the following:

1. etcd and the openshift master were co-located on a machine with ~3.5GB RAM
2. etcd was consuming 2GB of RAM and openshift master 1GB RAM

The machine appears to have been undersized for the etcd workload. It appears that a snapshot of the data set would spike memory usage and potentially trigger the kernel oom_killer, which would then target a more or less random process (etcd, openshift master, etc.).

Further investigation shows that there may be a large number of events being generated and persisted in etcd. Events should expire via TTL in etcd after 2 hours by default.

Looking at the log, the presence of bad data over the period:

Mon 2015-10-19 15:26:58 EDT, end at Wed 2015-10-28 14:45:56 EDT

should have resulted in 866 event records being generated, which is not a quantity that should have overloaded the system. However, live debugging of the system appeared to show much larger numbers of events, which could mean they are being generated by other components, or that the etcd TTL is not expiring them as expected.

Can we collect the following data:

$ oc get events --all-namespaces | wc -l
$ oc get events --all-namespaces -o yaml | grep deletionTimestamp

All events should fall within the last 2-hour window.

Comment 29 Derek Carr 2015-10-30 15:01:27 UTC
In addition, I would expect:

curl -k --cert /openshift.local.config/master/master.etcd-client.crt --key /openshift.local.config/master/master.etcd-client.key https://localhost:4001/v2/keys/?recursive=true | python -mjson.tool | grep expiration | wc -l

to return the same value as "oc get events --all-namespaces | wc -l", minus 1 for the header line.

I would like to see whether, if we turned off the master (to stop creation of new events) and just left etcd running, the etcd expiration actually occurs.

I would expect that the number of values returned by:

curl -k --cert /openshift.local.config/master/master.etcd-client.crt --key /openshift.local.config/master/master.etcd-client.key https://localhost:4001/v2/keys/?recursive=true | python -mjson.tool | grep expiration | wc -l

to decrease over time as the events reach their expiration window.

If not, we need to further analyze why.
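A simple way to watch that over time (a sketch reusing the query above; the 10-minute interval is arbitrary):

$ while true; do date; curl -sk --cert /openshift.local.config/master/master.etcd-client.crt --key /openshift.local.config/master/master.etcd-client.key "https://localhost:4001/v2/keys/?recursive=true" | python -mjson.tool | grep -c expiration; sleep 600; done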

Comment 30 Derek Carr 2015-10-30 17:22:40 UTC
Summary of findings:

OpenShift 3.0 shipped prior to Kubernetes 1.0.

Validation of pod.spec.container.port.name changed in the lead-up to Kubernetes 1.0 and the web console/templates had produced invalid output.  The web console and templates have since been updated to no longer produce this output.

The impact of this is that when upgrading to OpenShift 3.0.2, users with older data had replication controllers that could not be updated, and whose pod templates would always fail validation.  The only way to fix this was via manual manipulation of the template. We should document this as a migration step for users coming from OpenShift 3.0.0, or provide some utility to ensure it is done for them.

As for the behavior of the replication controller and the proliferation of events, it was correlated to how events are keyed in the compaction algorithm. Right now, event.Message is included in the key, but in this case the replication controller's message was unique each time because it embedded a uniquely generated pod name that failed validation.

As a result, the number of events produced by this one bad actor within the 2-hour window became overwhelming.

I am evaluating not using event.Message in the key for identifying uniqueness.

Comment 31 Derek Carr 2015-11-02 20:31:12 UTC
I am coding up a fix in Kube upstream as part of this issue:
https://github.com/kubernetes/kubernetes/issues/16600

I will cherry-pick it into Origin, as part of Kubernetes 1.1, Origin 3.1, and Origin 3.0.2, to keep other event producers from generating so many similar events in a short period of time.

Do we want to keep this bugzilla open now that memory usage has stabilized over the weekend?

Comment 32 Wesley Hearn 2015-11-02 22:52:40 UTC
I vote we keep the bug open and have QE verify that upgrades from 3.0.0.1 -> 3.0.2.X with this patch do not OOM, and that the messages increment the event count instead of generating new events.

Comment 33 Derek Carr 2015-11-03 16:09:27 UTC
Documentation PR describing validation change is here:
https://github.com/openshift/openshift-docs/pull/1138

Comment 34 Eric Paris 2015-11-04 14:17:38 UTC
Marking this BZ as 'upcoming release'. Derek's events rewrite is targeted after 3.1. For 3.1 we are going to

a) document the port name restrictions
b) create a new 'pre-upgrade' script which will warn about this situation before users get into it. The upgrade installer will block if this script fails and expect the user to update their port names.

Comment 37 Andy Goldstein 2016-01-07 03:06:32 UTC
Moving to MODIFIED since https://github.com/kubernetes/kubernetes/pull/16384 merged upstream. We'll move to ON_QA once it's in origin (will land in the rebase currently in progress).

Comment 38 Derek Carr 2016-02-03 14:55:42 UTC
This has landed in last origin rebase.

Comment 40 Anping Li 2016-02-15 08:18:28 UTC
The system works well after upgrading from v3.0.0.1 to the latest GA OSE release, v3.1.1.6, so moving the bug to VERIFIED.

Comment 42 errata-xmlrpc 2016-05-12 16:24:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064

