Bug 1333667
| Summary: | 2606 salt-minion processes (fork bomb) after ceph package upgrade to Red Hat Ceph Storage 1.3.2 | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Vikhyat Umrao <vumrao> |
| Component: | Calamari | Assignee: | Christina Meno <gmeno> |
| Calamari sub component: | Minions | QA Contact: | Tejas <tchandra> |
| Status: | CLOSED ERRATA | Docs Contact: | Bara Ancincova <bancinco> |
| Severity: | high | | |
| Priority: | high | CC: | bsingh, ceph-eng-bugs, gmeno, hnallurv, kdreyer, linuxkidd, mhackett, vumrao |
| Version: | 1.3.2 | | |
| Target Milestone: | rc | | |
| Target Release: | 2.2 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | RHEL: calamari-server-1.5.0-1.el7cp; Ubuntu: calamari_1.5.0-2redhat1xenial | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-14 15:43:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Vikhyat Umrao, 2016-05-06 06:45:26 UTC
- We have found a similar discussion upstream: https://github.com/saltstack/salt/issues/8435
- That issue was closed with the upstream commits in https://github.com/saltstack/salt/pull/8222:

      -    'minutes': opts['mine_interval']
      +    'minutes': opts['mine_interval'],
      +    'jid_include' : True,
      +    'maxrunning' : 2

  These commits add 'maxrunning' : 2, but we are not sure whether that helps here.
- As we verified, the latest salt-minion package we ship is salt-minion-2014.1.5-3.el7cp.noarch.
- This package carries the commits from https://github.com/saltstack/salt/pull/8222, though not on the same lines of code; the downstream salt-minion-2014.1.5-3.el7cp.noarch code base may differ slightly from the upstream tree that contains this fix.
- When these Ceph systems were upgraded, all Ceph and RHEL packages were updated, but salt-minion was not, since salt-minion-2014.1.5-3.el7cp.noarch is already the latest version we have.
- In the sos report of one of the systems, the ceph package is from Fri Apr 29 18:23:29 2016:

      $ cat installed-rpms | grep ceph
      ceph-0.94.5-9.el7cp.x86_64    Fri Apr 29 18:23:29 2016

  But salt-minion is from Tue Jun 16 19:57:28 2015, as it was not upgraded:

      $ cat installed-rpms | grep salt-minion
      salt-minion-2014.1.5-3.el7cp.noarch    Tue Jun 16 19:57:28 2015

Seems like the culprit is https://github.com/saltstack/salt/pull/32373, so we'll need to investigate bumping the version of salt to 2015.8 or later.

(In reply to Gregory Meno from comment #9)
> seems like the culprit is https://github.com/saltstack/salt/pull/32373
>
> so we'll need to investigate bumping version of salt to 2015.8 or later

Thank you, Gregory. I have verified that in 2.0 we ship salt packages at version 2015.5.5-1:

    # rpm -qa | grep salt
    salt-2015.5.5-1.el7.noarch
    salt-minion-2015.5.5-1.el7.noarch

I am moving this bug to the 2.2 release, and we can rebase salt to 2015.8 or later in the 2.2 downstream.

Hello Gregory,

We have another customer experiencing this issue after adding new monitors to their cluster running RHCS 1.3.3. A day after adding the new monitors, the cluster started experiencing very slow or unresponsive MONs. On some of the MONs the customer was able to log in and briefly run top, which showed many salt-minion processes in 'D' state, so they could not be killed. The machines became so slow that they required a reboot, after which the problem cleared. This happened on several, but not all, MONs, and it was not isolated to the newly added MONs. The provided journalctl log shows large numbers of salt-minion processes, with the rgw and mon services becoming OOM-kill victims.

    # rpm -qa | grep salt
    salt-2014.1.5-3.el7cp
    salt-minion-2014.1.5-3.el7cp

The minion logs show large numbers of the following message:

    2016-11-16 14:30:59,161 [salt.minion ][ERROR ] Exception [Errno 32] Broken pipe occurred in scheduled job

I am raising this bug's priority to HIGH: this customer runs co-located MONs and RGWs, and having to reboot the nodes periodically impacts their RGW production traffic.

journalctl log file location: https://api.access.redhat.com/rs/cases/01742412/attachments/ed91a041-6dad-46ab-9d9c-5e2553a8ed08

Mike, are they actively using Calamari? If not, we could disable it. That would be, on each monitor:

    systemctl disable salt-minion
    systemctl stop salt-minion

Adding a monitor seems to have caused it? Interesting... I don't understand why that is the case. It seems Saltstack doesn't either: https://github.com/saltstack/salt/issues/32349

I'll try adding a monitor in my test setup and see if I can reproduce.
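A minimal shell sketch of that stop-gap, assuming Calamari is not actively needed on the affected monitors; the pgrep process count is an illustrative addition, not taken from the case notes:

```sh
# Count the piled-up salt-minion processes on an affected MON (illustrative check),
# then stop and disable the service as suggested above.
pgrep -cf salt-minion            # an affected MON shows an unusually high count
sudo systemctl stop salt-minion
sudo systemctl disable salt-minion
pgrep -cf salt-minion            # re-check; processes already stuck in 'D' state
                                 # may not exit until the node is rebooted
```

As noted in the customer report above, processes in uninterruptible sleep cannot be killed, so a reboot may still be required to fully clear an already affected node.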
Here is what you can do to un-schedule that heartbeat task from the admin node:

    sudo su -c 'echo "" > /opt/calamari/salt/pillar/top.sls'
    sudo salt '*' pillar.items | grep ceph.heartbeat   # verify it is gone

Here is the data I'd love to have if you see this again, run from the admin node:

    sudo salt-run jobs.list_jobs | grep heartbeat -C2
    sudo salt '*' ceph.get_heartbeats

Re: Mike, no luck reproducing, and I noticed that I missed a step in c22. This is what you should run if it happens again:

    sudo su -c 'echo "" > /opt/calamari/salt/pillar/top.sls'
    sudo salt '*' saltutil.sync_all
    sudo salt '*' pillar.items | grep ceph.heartbeat   # verify it is gone

Harish, you're probably right that saltstack won't be installed in 2.2. That being said, I don't think I'm taking any action to remove it. I probably need to test with an upgraded system, so I won't close or retarget this just yet.

@Gregory,
a) What is the decision on this bug?
b) Please share the steps to verify the fix if it's going to be fixed in 2.2.

A. Plan to fix it.
B. Steps to reproduce: take a cluster on RHCS 1.3, upgrade it to 2.2, and then add a monitor. Monitors should not be killed as out of memory. Check the "free" command before and after the addition of the monitor; the memory-used figures should be similar after 1, 5, and 15 minutes. (A sketch of this check appears at the end of this report.)

Steps followed:
1. Upgraded a Ceph 1.3.3 cluster to 2.2.
2. Added a MON after the upgrade.
3. Checked the memory usage after the addition of the MON.

The memory usage does not change much with time, and the MON does not get killed due to OOM, so I am moving this bug to verified.

Thanks,
Tejas

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0514.html
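As a footnote to the verification steps above, here is a minimal shell sketch of the memory check, assuming it is run on the node hosting the monitors; the temporary file name, the sleep intervals, and the pgrep check are illustrative assumptions rather than part of the recorded test:

```sh
# Baseline memory use before adding the new MON
# (/tmp/free-before.txt is an illustrative file name).
free -m | tee /tmp/free-before.txt

# ... add the new monitor to the cluster here ...

# Re-check at roughly 1, 5, and 15 minutes after the addition: the used-memory
# figures should stay close to the baseline, the MON should not be OOM-killed,
# and the salt-minion process count should not climb.
sleep 60;  echo "== 1 min ==";  free -m; pgrep -cf salt-minion
sleep 240; echo "== 5 min ==";  free -m; pgrep -cf salt-minion
sleep 600; echo "== 15 min =="; free -m; pgrep -cf salt-minion
```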