Description of problem:
After upgrading from 3.0.0.1 to 3.0.2.0, the master will not stay up for longer than 10-15 minutes. It ends up getting killed when memory runs out on the host.

Version-Release number of selected component (if applicable):
openshift-master-3.0.2.0-0.git.16.4d626fe.el7ose.x86_64

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:

[root@ip-172-31-5-247 ~]# service openshift-master status
Redirecting to /bin/systemctl status openshift-master.service
openshift-master.service - OpenShift Master
   Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
   Active: failed (Result: signal) since Mon 2015-10-19 13:36:38 EDT; 34min ago
     Docs: https://github.com/openshift/origin
  Process: 28342 ExecStart=/usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS (code=killed, signal=KILL)
 Main PID: 28342 (code=killed, signal=KILL)
   CGroup: /system.slice/openshift-master.service

Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: E1019 13:36:27.087612 28342 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-i2u5y" is invalid: spec.containers[0].ports[...
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.087635 28342 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.093861 28342 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: E1019 13:36:27.093876 28342 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-5y7cx" is invalid: spec.containers[0].ports[...
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.093894 28342 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.100491 28342 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: E1019 13:36:27.100507 28342 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-p6gtf" is invalid: spec.containers[0].ports[...
Oct 19 13:36:27 ip-172-31-5-247.ec2.internal openshift-master[28342]: I1019 13:36:27.100526 28342 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal systemd[1]: openshift-master.service: main process exited, code=killed, status=9/KILL
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal systemd[1]: Unit openshift-master.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

--------------------------- Looking at the full journal log for kernel messages ---------------------------

Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: openshift invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: openshift cpuset=/ mems_allowed=0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU: 0 PID: 28388 Comm: openshift Not tainted 3.10.0-229.14.1.el7.x86_64 #1
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 06/01/2015
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: ffff8800df9f0000 000000002d5a2a76 ffff8800e539ba58 ffffffff81604516
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: ffff8800e539bae8 ffffffff815ff4df ffff880035541ad0 ffff880035541ae8
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: ffffffff00000206 fffeefff00000000 0000000000000001 ffffffff81117b03
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Call Trace:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff81604516>] dump_stack+0x19/0x1b
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff815ff4df>] dump_header+0x8e/0x214
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff81117b03>] ? proc_do_uts_string+0xf3/0x130
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff8115a44e>] oom_kill_process+0x24e/0x3b0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff8115ac76>] out_of_memory+0x4b6/0x4f0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff81160e35>] __alloc_pages_nodemask+0xa95/0xb90
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff811a287a>] alloc_pages_vma+0x9a/0x140
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff81182f6f>] handle_mm_fault+0x9ef/0xd60
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff8160f7e6>] __do_page_fault+0x156/0x520
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff810b1c67>] ? update_curr+0x67/0x150
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff810ad7a6>] ? __dequeue_entity+0x26/0x40
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff81012609>] ? __switch_to+0x179/0x4a0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff8160fbca>] do_page_fault+0x1a/0x70
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: [<ffffffff8160be08>] page_fault+0x28/0x30
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Mem-Info:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA per-cpu:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU 0: hi: 0, btch: 1 usd: 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU 1: hi: 0, btch: 1 usd: 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA32 per-cpu:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU 0: hi: 186, btch: 31 usd: 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU 1: hi: 186, btch: 31 usd: 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 Normal per-cpu:
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU 0: hi: 90, btch: 15 usd: 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: CPU 1: hi: 90, btch: 15 usd: 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: active_anon:826239 inactive_anon:13873 isolated_anon:0 active_file:0 inactive_file:0 isolated_file:0 unevictable:12811 dirty:0 writeback:0 unstable:0 free:20067 slab_reclaimable:5859 slab_unreclaimable:9656 mapped:5833 shmem:45848 pagetables:3158 bounce:0 free_cma:0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA free:15344kB min:276kB low:344kB high:412kB active_anon:404kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB presen
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: lowmem_reserve[]: 0 3584 3773 3773
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA32 free:64684kB min:63928kB low:79908kB high:95892kB active_anon:3304552kB inactive_anon:55492kB active_file:0kB inactive_file:0kB unevictable:51244kB isolated(anon):0kB isola
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: lowmem_reserve[]: 0 0 189 189
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 Normal free:240kB min:3376kB low:4220kB high:5064kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB pres
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: lowmem_reserve[]: 0 0 0 0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA: 3*4kB (UEM) 2*8kB (UE) 3*16kB (UE) 3*32kB (UE) 1*64kB (M) 2*128kB (UE) 2*256kB (UE) 2*512kB (EM) 3*1024kB (UEM) 3*2048kB (EMR) 1*4096kB (M) = 15340kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 DMA32: 1364*4kB (UEM) 1025*8kB (UEM) 680*16kB (UEM) 368*32kB (UE) 202*64kB (UEM) 91*128kB (UEM) 11*256kB (U) 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 64728kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 Normal: 2*4kB (E) 3*8kB (E) 9*16kB (UE) 2*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 240kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 48968 total pagecache pages
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 0 pages in swap cache
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Swap cache stats: add 0, delete 0, find 0/0
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Free swap = 0kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: Total swap = 0kB
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 1048477 pages RAM
Oct 19 13:36:38 ip-172-31-5-247.ec2.internal kernel: 0 pages HighMem/MovableOnly

-------- Master stays up about 10mins -----------

[root@ip-172-31-5-247 ~]# service openshift-master start
Redirecting to /bin/systemctl start openshift-master.service
[root@ip-172-31-5-247 ~]# service openshift-master status
Redirecting to /bin/systemctl status openshift-master.service
openshift-master.service - OpenShift Master
   Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
   Active: active (running) since Mon 2015-10-19 14:15:44 EDT; 8s ago
     Docs: https://github.com/openshift/origin
 Main PID: 65260 (openshift)
   CGroup: /system.slice/openshift-master.service
           └─65260 /usr/bin/openshift start master --config=/etc/openshift/master/master-config.yaml --loglevel=2

Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:15:50.194270 65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-bhcgn" is invalid: spec.containers[0].ports[...
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.194290 65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.194756 65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:15:50.194769 65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-5dmmn" is invalid: spec.containers[0].ports[...
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.194786 65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.195770 65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:15:50.195784 65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-rxnct" is invalid: spec.containers[0].ports[...
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.195942 65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.207065 65260 replication_controller.go:432] Failed to update replica count for controller springmlb-1, requeuing
Oct 19 14:15:50 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:15:50.207130 65260 replication_controller.go:334] Too few "jboss"/"springmlb-1" replicas, need 1, creating 1

[root@ip-172-31-5-247 ~]# service openshift-master status
Redirecting to /bin/systemctl status openshift-master.service
openshift-master.service - OpenShift Master
   Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
   Active: failed (Result: signal) since Mon 2015-10-19 14:24:48 EDT; 25s ago
     Docs: https://github.com/openshift/origin
 Main PID: 65260 (code=killed, signal=ABRT)

Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.795387 65260 replication_controller.go:432] Failed to update replica count for controller kitchensink-example-2110, requeuing
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.795483 65260 replication_controller.go:334] Too few "test"/"kitchensink-example-2110" replicas, need 5, creating 5
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.821717 65260 replication_controller.go:432] Failed to update replica count for controller nodejs-ex5-1, requeuing
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.821797 65260 replication_controller.go:334] Too few "rettoris"/"nodejs-ex5-1" replicas, need 1, creating 1
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.828359 65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: E1019 14:23:39.828381 65260 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-q542a" is invalid: spec.containers[0].ports[...
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.828403 65260 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", ...rror creating:
Oct 19 14:23:39 ip-172-31-5-247.ec2.internal openshift-master[65260]: I1019 14:23:39.834791 65260 replication_controller.go:340] Failed creation, decrementing expectations for controller "test"/"kitchensink-example-2110"
Oct 19 14:24:48 ip-172-31-5-247.ec2.internal systemd[1]: openshift-master.service: main process exited, code=killed, status=6/ABRT
Oct 19 14:24:48 ip-172-31-5-247.ec2.internal systemd[1]: Unit openshift-master.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

Expected results:

Additional info:

[root@ip-172-31-5-247 ~]# oc get users | egrep -v 'NAME' | wc -l
8
[root@ip-172-31-5-247 ~]# oc get projects | egrep -v 'NAME|default|openshift|openshift-infra' | wc -l
18
[root@ip-172-31-5-247 ~]# oadm manage-node --list-pods --selector='' | grep Running | wc -l
8
[root@ip-172-31-5-247 ~]# oadm manage-node --list-pods --selector=type=compute | grep Running | wc -l
4
[root@ip-172-31-5-247 ~]# oc get nodes
NAME            LABELS                                                                          STATUS                        AGE
172.31.12.197   color=black,kubernetes.io/hostname=172.31.12.197,region=us-east-1,type=infra    Ready                         4d
172.31.12.198   color=black,kubernetes.io/hostname=172.31.12.198,region=us-east-1,type=infra    Ready                         4d
172.31.14.43    color=black,kubernetes.io/hostname=172.31.14.43,region=us-east-1,type=compute   Ready                         4d
172.31.14.44    color=black,kubernetes.io/hostname=172.31.14.44,region=us-east-1,type=compute   Ready                         4d
172.31.14.45    color=black,kubernetes.io/hostname=172.31.14.45,region=us-east-1,type=compute   Ready                         4d
172.31.5.247    kubernetes.io/hostname=172.31.5.247,region=us-east-1,type=master                NotReady,SchedulingDisabled   86d
Is it possible to get non-truncated log output for the following?

unable to create pod replica: Pod "kitchensink-example-2110-i2u5y" is invalid: spec.containers[0].ports[...
Created attachment 1084916 [details]
Untruncated log

Also, can we get this bumped to urgent, or high at minimum? It is blocking us from upgrading other clusters.
$ export OPENSHIFT_PROFILE=web
$ openshift start master

At 1-minute intervals, can you do the following:

$ go tool pprof --top http://127.0.0.1:6060/debug/pprof/heap > top-<timestamp>

and make the results available in an archive.
Created attachment 1085203 [details]
go top

There is one timeout in the capture, and I have also seen free memory on the box drop below 90M. It hasn't crashed this time yet, but the box has gone unresponsive (it goes unresponsive for a few minutes, then bounces back to normal; it did that a few times, and you can see when by looking at the file names).

[root@ip-172-31-5-247 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           3534        3190          89         179         253          18
Swap:             0           0           0
[root@ip-172-31-5-247 ~]#

[root@ip-172-31-5-247 top]# while true; do sleep 60; go tool pprof --top http://127.0.0.1:6060/debug/pprof/heap > top-$(date +%s); done
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.001.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.002.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.003.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.004.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.005.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.006.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.007.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.008.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.009.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.010.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.011.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.012.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.013.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.014.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.015.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.016.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.017.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.018.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.019.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.020.pb.gz
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
http fetch http://127.0.0.1:6060/debug/pprof/heap: Get http://127.0.0.1:6060/debug/pprof/heap: net/http: timeout awaiting response headers
Fetching profile from http://127.0.0.1:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.127.0.0.1:6060.inuse_objects.inuse_space.021.pb.gz
Created attachment 1085204 [details]
go top 2

As soon as I posted, it ended up getting killed. Here are all the top logs.
So on a 3GB machine, it appears that openshift-master is consuming ~600MB of heap. I assume you are not manually limiting the amount of memory available to openshift-master, so the system itself is reaching an OOM event, and the OOM killer in the kernel is simply targeting openshift-master first.

Can you do the following? (A sketch of the setup follows below.)

1. Modify the openshift-master unit file to set OOMScoreAdjust to -999.
2. Increase the reporting interval to 15s for the top output.
3. At each reporting interval, also take a snapshot of the following:

$ ps aux --sort -rss > ps-<timestamp>

And provide the results in an archive. I suspect another process is causing runaway memory usage right now, and the kernel is targeting openshift-master first because its OOMScoreAdjust defaults to 0.
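A minimal sketch of that setup. The drop-in directory and file names are my own choices, not from this report; OOMScoreAdjust is a standard systemd [Service] directive:

# Apply the OOM score adjustment via a systemd drop-in.
mkdir -p /etc/systemd/system/openshift-master.service.d
cat > /etc/systemd/system/openshift-master.service.d/oom.conf <<'EOF'
[Service]
OOMScoreAdjust=-999
EOF
systemctl daemon-reload
systemctl restart openshift-master

# Every 15s, capture heap top output plus an RSS-sorted process snapshot.
while true; do
    ts=$(date +%s)
    go tool pprof --top http://127.0.0.1:6060/debug/pprof/heap > top-$ts
    ps aux --sort -rss > ps-$ts
    sleep 15
done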
Created attachment 1085258 [details]
ps output
Created attachment 1085259 [details]
go top
I will need a full mem.pprof output now, after the process dies.

$ export OPENSHIFT_PROFILE=mem
$ openshift start master

If/when the process dies, you will get a mem.pprof file created. Please provide the mem.pprof and the binary (so I can view symbols) in a tar; an example of the packaging follows below.
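A hedged example of the packaging and of how the profile would be read back; the tarball name is a placeholder, and /usr/bin/openshift is assumed to be the running binary:

# On the affected host, bundle the binary together with the profile:
tar czf master-mem-profile.tar.gz /usr/bin/openshift mem.pprof

# On the analysis side, pass the binary so pprof can resolve symbols:
go tool pprof --top /usr/bin/openshift mem.pprof
go tool pprof --alloc_space --top /usr/bin/openshift mem.pprof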
Can we get a script to capture when free RAM dips below 100MB?
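A minimal sketch of such a watcher; the 15s interval, log file name, and the exact warning text are arbitrary choices:

while true; do
    # Column 4 of `free -m` on RHEL 7 is the "free" value in MB.
    free_mb=$(free -m | awk '/^Mem:/ {print $4}')
    echo "$(date) - ${free_mb}M" >> free_mem
    if [ "$free_mb" -lt 100 ]; then
        echo "$(date) - WARNING: free RAM below 100MB (${free_mb}M)" >> free_mem
    fi
    sleep 15
done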
The oscillation in reported free memory is interesting.

Thu Oct 22 13:48:29 EDT 2015 - 1667M
Thu Oct 22 14:15:45 EDT 2015 - 2351M
Thu Oct 22 14:15:50 EDT 2015 - 257M
Thu Oct 22 15:31:23 EDT 2015 - 238M
Thu Oct 22 15:31:38 EDT 2015 - 1363M
Thu Oct 22 15:31:53 EDT 2015 - 1580M
Thu Oct 22 15:32:08 EDT 2015 - 2245M
Thu Oct 22 15:32:23 EDT 2015 - 157M
Thu Oct 22 15:37:44 EDT 2015 - 994M
Thu Oct 22 15:37:59 EDT 2015 - 1477M
Thu Oct 22 15:38:14 EDT 2015 - 78M

If I look at -inuse_space in the profile, we are using about 100MB of space. If I look at -alloc_space, we are allocating 360GB of content over the run.

A common area of allocation aligns with the need to allocate routes:

(pprof) list RouteAllocator
Total: 360.87GB
ROUTINE ======================== github.com/openshift/origin/pkg/cmd/server/origin.(*MasterConfig).RouteAllocator in /home/decarr/go/src/github.com/openshift/ose/_output/local/go/src/github.com/openshift/origin/pkg/cmd/server/origin/master.go
 16.56GB    16.60GB (flat, cum)  4.60% of Total
       .          .    608:	factory := routeallocationcontroller.RouteAllocationControllerFactory{
       .          .    609:		OSClient:   osclient,
       .          .    610:		KubeClient: kclient,
       .          .    611:	}
       .          .    612:
       .    43.31MB    613:	plugin, err := routeplugin.NewSimpleAllocationPlugin(c.Options.RoutingConfig.Subdomain)
       .          .    614:	if err != nil {
 16.56GB    16.56GB    615:		glog.Fatalf("Route plugin initialization failed: %v", err)
       .          .    616:	}
       .          .    617:
       .          .    618:	return factory.Create(plugin)
       .          .    619:}
       .          .    620:

As you can see, the failure to allocate a route plugin is resulting in 16GB of allocated usage to log errors. Is there any message logged to this effect in the journal that could point to a misconfiguration?

Another area of unexpected allocation is occurring here:

(pprof) list thirdpartyapi
Total: 360.87GB
ROUTINE ======================== k8s.io/kubernetes/pkg/master.(*Master).thirdpartyapi in /home/decarr/go/src/github.com/openshift/ose/Godeps/_workspace/src/k8s.io/kubernetes/pkg/master/master.go
  6.71GB     6.71GB (flat, cum)  1.86% of Total
       .          .    818:func (m *Master) thirdpartyapi(group, kind, version string) *apiserver.APIGroupVersion {
       .          .    819:	resourceStorage := thirdpartyresourcedataetcd.NewREST(m.thirdPartyStorage, group, kind)
       .          .    820:
       .          .    821:	apiRoot := "/thirdparty/" + group + "/"
       .          .    822:
  4.49MB     4.49MB    823:	storage := map[string]rest.Storage{
       .          .    824:		strings.ToLower(kind) + "s": resourceStorage,
       .          .    825:	}
       .          .    826:
       .          .    827:	return &apiserver.APIGroupVersion{
       .          .    828:		Root: apiRoot,
       .          .    829:
       .    16.03kB    830:		Creater:   thirdpartyresourcedata.NewObjectCreator(version, api.Scheme),
       .          .    831:		Convertor: api.Scheme,
       .          .    832:		Typer:     api.Scheme,
       .          .    833:
368.22kB     1.01MB    834:		Mapper: thirdpartyresourcedata.NewMapper(explatest.RESTMapper, kind, version),
       .          .    835:		Codec:  explatest.Codec,
       .          .    836:		Linker: explatest.SelfLinker,
       .          .    837:		Storage: storage,
       .          .    838:		Version: version,
       .          .    839:
       .          .    840:		Admit:   m.admissionControl,
       .          .    841:		Context: m.requestContextMapper,
       .          .    842:
  6.70GB     6.70GB    843:		MinRequestTimeout: m.minRequestTimeout,

For some reason, the third party API (which is not even used) is cumulatively consuming 6.7GB of allocations for MinRequestTimeout.

Finally, another interesting outlier implies that someone is making large numbers of invalid log requests for a container in a pod.
(pprof) list pod.streamLocation
Total: 360.87GB
ROUTINE ======================== k8s.io/kubernetes/pkg/registry/pod.streamLocation in /home/decarr/go/src/github.com/openshift/ose/Godeps/_workspace/src/k8s.io/kubernetes/pkg/registry/pod/strategy.go
  5.35GB    45.99GB (flat, cum) 12.75% of Total
       .          .    320:	// Try to figure out a container
       .          .    321:	if container == "" {
       .          .    322:		if len(pod.Spec.Containers) == 1 {
       .          .    323:			container = pod.Spec.Containers[0].Name
       .          .    324:		} else {
       .    39.71GB    325:			return nil, nil, errors.NewBadRequest(fmt.Sprintf("a container name must be specified for pod %s", name))
       .          .    326:		}
       .          .    327:	}
       .          .    328:	nodeHost := pod.Spec.NodeName
       .          .    329:	if len(nodeHost) == 0 {
       .          .    330:		// If pod has not been assigned a host, return an empty location
       .          .    331:		return nil, nil, errors.NewBadRequest(fmt.Sprintf("pod %s does not have a host assigned", name))
       .          .    332:	}
       .          .    333:	nodeScheme, nodePort, nodeTransport, err := connInfo.GetConnectionInfo(nodeHost)
       .          .    334:	if err != nil {
       .   836.88kB    335:		return nil, nil, err
       .          .    336:	}
       .          .    337:	params := url.Values{}
       .          .    338:	if err := streamParams(params, opts); err != nil {
       .          .    339:		return nil, nil, err
       .          .    340:	}
       .          .    341:	loc := &url.URL{
       .          .    342:		Scheme:   nodeScheme,
       .          .    343:		Host:     fmt.Sprintf("%s:%d", nodeHost, nodePort),
       .          .    344:		Path:     fmt.Sprintf("/%s/%s/%s/%s", path, pod.Namespace, name, container),
       .   957.92MB    345:		RawQuery: params.Encode(),
       .          .    346:	}
  5.35GB     5.35GB    347:	return loc, nodeTransport, nil
       .          .    348:}
       .          .    349:
       .          .    350:// PortForwardLocation returns the port-forward URL for a pod.
       .          .    351:func PortForwardLocation(getter ResourceGetter, connInfo client.ConnectionInfoGetter, ctx api.Context, name string) (*url.URL, http.RoundTripper, error) {
       .          .    352:	pod, err := getPod(getter, ctx, name)
(pprof)

It would be good if we could capture the output at one of those oscillation periods for CPU, to see the garbage collector in action and who is causing the largest usage of runtime.mallocgc.

$ export OPENSHIFT_PROFILE=cpu
$ openshift start master

Send cpu.pprof and free_mem. Based on your free_mem output, I am most interested in periods with a large delta in free memory: if we start and run for a period with 2.3GB free and drop below 300M, I want that period. So basically, update your script to send the profile when we first drop below 300M; a sketch of that update follows below.
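A sketch of the updated watcher, with two assumptions not confirmed in this thread: that cpu.pprof is flushed when the master process exits (as described for the mem profile earlier), and that it is written to the master's working directory:

while true; do
    free_mb=$(free -m | awk '/^Mem:/ {print $4}')
    echo "$(date) - ${free_mb}M" >> free_mem
    if [ "$free_mb" -lt 300 ]; then
        # Stop the master so the CPU profile is flushed, then archive it.
        systemctl stop openshift-master
        tar czf cpu-profile-$(date +%s).tar.gz cpu.pprof free_mem
        break
    fi
    sleep 15
done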
Hmm, my line numbers may have been off in symbols. Looking again.
Ok, my symbols are off. I would still like to get the cpu.pprof as requested.
Issue #1 - the etcd helper is churning through lots of garbage:
https://github.com/kubernetes/kubernetes/pull/16384

I don't think that is the conclusive issue, but it was 14% of allocated space over the course of the run.
Created attachment 1087056 [details]
cpu.pprof and free_memory
Created attachment 1087263 [details]
mem profile

Here is the memory profile with the latest code.
Scott / Wes,

The fix did eliminate one major source of allocations: the etcd prefix key code went from being in the top 10 of allocators to no longer in the top 100 of alloc_space. The bad news is that there is obviously still an issue elsewhere.

Wes - can you re-run the scenario, but also set the following environment variable:

$ export GODEBUG=gctrace=2

Then include the journal output for the openshift master in the tar. You should see new output for each gc invocation (gc<num>) that matches the following:

I1028 14:34:42.600323 14583 master.go:229] Started Swagger Schema API at 0.0.0.0:8443/swaggerapi/
gc145(4): 3+5+5924+3 us, 17 -> 35 MB, 208925 (2038337-1829412) objects, 92 goroutines, 4185/1546/0 sweeps, 35(2649) handoff, 9(38) steal, 246/85/8 yields
gc146(4): 0+3634+5485+173 us, 20 -> 20 MB, 99434 (2038337-1938903) objects, 92 goroutines, 4186/0/4146 sweeps, 25(1840) handoff, 11(43) steal, 156/25/2 yields
gc147(4): 9+8+5955+7 us, 20 -> 40 MB, 274409 (2236168-1961759) objects, 101 goroutines, 4323/960/0 sweeps, 29(1915) handoff, 12(55) steal, 196/43/4 yields
gc148(4): 1+3440+5876+4 us, 21 -> 21 MB, 101353 (2236168-2134815) objects, 101 goroutines, 4324/0/4282 sweeps, 23(2009) handoff, 8(42) steal, 149/36/3 yields
W1028 14:34:43.062622 14583 run_components.go:166] Could not start DNS: listen tcp4 0.0.0.0:53: bind: address already in use

Thanks!
Created attachment 1087307 [details]
master log and free_memory
snippet:

Oct 28 14:44:57 ip-172-31-5-247.ec2.internal openshift-master[73835]: gc325(2): 43+2+31642+23 us, 57 -> 114 MB, 856087 (73408681-72552594) objects, 1084 goroutines, 13462/0/0 sweeps, 5(331) handoff, 2(65) steal, 30/10/1 yields
Oct 28 14:44:57 ip-172-31-5-247.ec2.internal openshift-master[73835]: gc326(2): 0+29975+26136+11 us, 57 -> 57 MB, 286272 (73408681-73122409) objects, 1084 goroutines, 13462/0/12500 sweeps, 4(353) handoff, 2(73) steal, 44/6/0 yields

analysis:

For others who review this bug, see details on the gc output here: https://golang.org/pkg/runtime/

The gc debug trace shows that we are not growing the heap - it hovers between 57 and 114MB for the entire run. We may be growing goroutines: prior to the replication controller getting to work we have ~730 active goroutines, by the end of the run we have 1038, and the count appears to peak earlier at ~1100 goroutines.

Looking at the logs, it appears the replication controller is in a fail loop:

Oct 28 14:43:03 ip-172-31-5-247.ec2.internal openshift-master[73835]: E1028 14:43:03.573749 73835 replication_controller.go:342] unable to create pod replica: Pod "kitchensink-example-2110-b7rcs" is invalid: spec.containers[0].ports[0].name: invalid value 'kitchensink-example-tcp-8080', Details: must be an IANA_SVC_NAME (at most 15 characters, matching regex [a-z0-9]([a-z0-9-]*[a-z0-9])*, it must contain at least one letter [a-z], and hyphens cannot be adjacent to other hyphens): e.g. "http"
Oct 28 14:43:03 ip-172-31-5-247.ec2.internal openshift-master[73835]: I1028 14:43:03.573769 73835 event.go:203] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"test", Name:"kitchensink-example-2110", UID:"4eaaa51d-71ea-11e5-9a7b-0aeb67874ee5", APIVersion:"v1", ResourceVersion:"9723983", FieldPath:""}): reason: 'failedCreate' Error creating: Pod "kitchensink-example-2110-b7rcs" is invalid: spec.containers[0].ports[0].name: invalid value 'kitchensink-example-tcp-8080', Details: must be an IANA_SVC_NAME (at most 15 characters, matching regex [a-z0-9]([a-z0-9-]*[a-z0-9])*, it must contain at least one letter [a-z], and hyphens cannot be adjacent to other hyphens): e.g. "http"

This occurs approximately 864 times before the run completes and fails. At minimum, we need to get the templates fixed up to use valid container port names.
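For reference, a failure count of that kind can be reproduced with a one-liner along these lines; the --since value is illustrative, not from the report:

journalctl -u openshift-master --since "2015-10-28 14:15:00" | grep -c "unable to create pod replica"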
I've seen up to 4k goroutines on the stage environment with everything still running fine.
Just to get it recorded... there are no templates in the environment that define the invalid ports. The DCs that exist with invalid ports were created via the web console flow, where you specify source, choose a builder, and the object gets generated. Since the web console is updated now, we shouldn't see more of those created, but the existing ones need to be fixed, either by manually editing all the bad DCs (and triggering a new deployment if one does not automatically happen) or by doing a scripted migration.

The scripted migration needs to either edit etcd directly or automate the editing/patching of all the existing deploymentconfigs (a rough sketch of the latter follows below). I'm not the expert in this space for either of those things; David Eads might be a good person to talk to.
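One possible shape for that scripted migration, strictly as a sketch: it assumes jq is available and an oc client whose get/replace subcommands accept these flags, and it only handles the over-length case by truncating port names to 15 characters. A real migration would also need to enforce the character and hyphen rules and trigger redeployments:

for ns in $(oc get projects -o name | cut -d/ -f2); do
    for dc in $(oc get dc -n "$ns" -o name | cut -d/ -f2); do
        # Rewrite each DC's pod template, truncating over-long port names.
        oc get dc "$dc" -n "$ns" -o json \
            | jq '.spec.template.spec.containers[].ports[]?.name |= (if . and (length > 15) then .[:15] else . end)' \
            | oc replace -f - -n "$ns"
    done
done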
*** Bug 1273749 has been marked as a duplicate of this bug. ***
*** Bug 1275983 has been marked as a duplicate of this bug. ***
Further analysis appears to show the following:

1. etcd and the openshift master were co-located on a machine with ~3.5GB RAM.
2. etcd was consuming 2GB of RAM and the openshift master 1GB of RAM.

The machine appears to have been undersized to support the etcd workload. It appears that a snapshot of the data set would spike memory usage and potentially trigger an oom_kill in the kernel, with a random process targeted by the oom_killer (etcd / openshift master / etc.).

Further investigation shows that there may potentially be a large number of events being generated and persisted in etcd. Events should expire via TTL in etcd after 2 hrs by default. Looking at the log, the presence of bad data over the period:

Mon 2015-10-19 15:26:58 EDT, end at Wed 2015-10-28 14:45:56 EDT

should have resulted in 866 event records being generated, which is not a quantity that should have overloaded the system. However, live debugging of the system appeared to show much larger numbers of events, which could mean they are being generated by other components, or that the etcd TTL is not expiring as expected.

Can we collect the following data:

$ oc get events --all-namespaces | wc -l
$ oc get events --all-namespaces -o yaml | grep deletionTimestamp

Events should all be in the last 2 hr window.
In addition, I would expect:

curl -k --cert /openshift.local.config/master/master.etcd-client.crt --key /openshift.local.config/master/master.etcd-client.key https://localhost:4001/v2/keys/?recursive=true | python -mjson.tool | grep expiration | wc -l

to return the same value as:

$ oc get events --all-namespaces | wc -l

subtracting 1 for the header line.

I would like to see whether, if we turned off the master (to stop the creation of new events) and just left etcd running, the etcd expiration is actually occurring. I would expect the number of values returned by the curl command above to decrease over time as the events reach their expiration window. If not, we need to further analyze why.
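A small sketch for watching that count over time after stopping the master; the 5-minute interval is an arbitrary choice:

systemctl stop openshift-master
while true; do
    # Count etcd keys that carry an expiration, i.e. TTL'd event records.
    n=$(curl -sk \
        --cert /openshift.local.config/master/master.etcd-client.crt \
        --key /openshift.local.config/master/master.etcd-client.key \
        "https://localhost:4001/v2/keys/?recursive=true" \
        | python -mjson.tool | grep -c expiration)
    echo "$(date) - ${n} keys with an expiration set"
    sleep 300
done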
Summary of findings:

OpenShift 3.0 shipped prior to Kubernetes 1.0. Validation of pod.spec.container.port.name changed in the lead-up to Kubernetes 1.0, and the web console/templates had produced output that is now invalid. The web console and templates have since been updated to no longer produce this output.

The impact is that when upgrading to OpenShift 3.0.2, users with older data had replication controllers that could not be updated and whose pod templates would always fail validation. The only way to fix this was via manual manipulation of the template. We should document this as a migration step for users coming from OpenShift 3.0.0, or provide some utility to ensure it is done for them.

As for the behavior of the replication controller and the proliferation of events, it was correlated to how events are keyed in the compaction algorithm. Right now, event.Message is included in the key, but for the replication controller in this circumstance the message was unique in each instance, because it embedded a uniquely generated pod name that failed validation. As a result each failure produced a distinct event, inflating the number of events a single bad actor could produce in the 2 hr window. I am evaluating not using event.Message in the key for identifying uniqueness.
I am coding up a fix upstream in Kube as part of this issue: https://github.com/kubernetes/kubernetes/issues/16600

I will cherry-pick it into Origin so that other event producers cannot generate so many similar events in a short period of time; it is targeted at Kubernetes 1.1, Origin 3.1, and Origin 3.0.2.

Do we want to keep this bugzilla open now that memory usage has stabilized over the weekend?
I vote we keep the bug open and have QE verify that upgrades from 3.0.0.1 to 3.0.2.X with this patch do not OOM, and that the messages increment a count instead of generating new messages.
Documentation PR describing validation change is here: https://github.com/openshift/openshift-docs/pull/1138
Marking this BZ as 'upcoming release'. Derek's events rewrite is targeted after 3.1. For 3.1 we are going to a) document the port name restrictions and b) create a new 'pre-upgrade' script which will warn about this situation before users get into it (a rough sketch of such a check follows below). The upgrade installer will block if this script fails and expect the user to update their port names.
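For illustration only (this is not the actual installer script), such a pre-upgrade check might look roughly like this. It assumes jq is available; the regex is the IANA_SVC_NAME rule quoted in the validation error above:

rc=0
for ns in $(oc get projects -o name | cut -d/ -f2); do
    for name in $(oc get dc -n "$ns" -o json \
            | jq -r '.items[].spec.template.spec.containers[].ports[]?.name // empty'); do
        # At most 15 chars, matching [a-z0-9]([a-z0-9-]*[a-z0-9])*, containing
        # at least one letter, with no adjacent hyphens.
        if [ ${#name} -gt 15 ] \
           || ! printf '%s\n' "$name" | grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])*$' \
           || ! printf '%s\n' "$name" | grep -q '[a-z]' \
           || printf '%s\n' "$name" | grep -q -- '--'; then
            echo "namespace $ns: invalid container port name '$name'"
            rc=1
        fi
    done
done
exit $rc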
Moving to MODIFIED since https://github.com/kubernetes/kubernetes/pull/16384 merged upstream. We'll move to ON_QA once it's in Origin (it will land in the rebase currently in progress).
This has landed in the last Origin rebase.
The system works well after upgrading from v3.0.0.1 to the latest GA OSE release, v3.1.1.6, so moving the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064