Bug 1040089

Summary: Salt minion crashes on startup
Product: [Fedora] Fedora EPEL
Reporter: Johan Dorland <johand>
Component: salt
Assignee: Clint Savage <herlo1>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Docs Contact:
Priority: unspecified
Version: el6
CC: andrewniemants, erik, herlo1
Target Milestone: ---   
Target Release: ---   
Hardware: noarch   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-02-11 15:02:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Johan Dorland 2013-12-10 17:04:10 UTC
Description of problem:
When the salt-minion starts it crashes with the following error:

[ERROR   ] An un-handled exception was caught by salt's global exception handler:
TypeError: argument of type 'NoneType' is not iterable
Traceback (most recent call last):
  File "/usr/bin/salt-minion", line 14, in <module>
    salt_minion()
  File "/usr/lib/python2.6/site-packages/salt/scripts.py", line 30, in salt_minion
    minion.start()
  File "/usr/lib/python2.6/site-packages/salt/__init__.py", line 219, in start
    self.prepare()
  File "/usr/lib/python2.6/site-packages/salt/__init__.py", line 207, in prepare
    self.minion = salt.minion.Minion(self.config)
  File "/usr/lib/python2.6/site-packages/salt/minion.py", line 514, in __init__
    self.returners)
  File "/usr/lib/python2.6/site-packages/salt/utils/schedule.py", line 69, in __init__
    clean_proc_dir(opts)
  File "/usr/lib/python2.6/site-packages/salt/utils/schedule.py", line 259, in clean_proc_dir
    if 'pid' in job:
TypeError: argument of type 'NoneType' is not iterable
Traceback (most recent call last):
  File "/usr/bin/salt-minion", line 14, in <module>
    salt_minion()
  File "/usr/lib/python2.6/site-packages/salt/scripts.py", line 30, in salt_minion
    minion.start()
  File "/usr/lib/python2.6/site-packages/salt/__init__.py", line 219, in start
    self.prepare()
  File "/usr/lib/python2.6/site-packages/salt/__init__.py", line 207, in prepare
    self.minion = salt.minion.Minion(self.config)
  File "/usr/lib/python2.6/site-packages/salt/minion.py", line 514, in __init__
    self.returners)
  File "/usr/lib/python2.6/site-packages/salt/utils/schedule.py", line 69, in __init__
    clean_proc_dir(opts)
  File "/usr/lib/python2.6/site-packages/salt/utils/schedule.py", line 259, in clean_proc_dir
    if 'pid' in job:
TypeError: argument of type 'NoneType' is not iterable


Version-Release number of selected component (if applicable):
0.17.2

How reproducible:
From what I have seen, it affects all salt-minions running version 0.17.2. We use SaltStack to administer roughly 10 virtual machines, and they all crash with the exact same error. Our VMs that still run 0.17.1 and have not been upgraded to 0.17.2 run fine, however.

Additional info:
On startup, the salt minion seems to try to remove the pid files of minion jobs that are no longer running. During this cleanup, however, it appears to run a membership test against a None value and crashes.
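
The TypeError itself is what Python raises for a membership test against None. A minimal sketch, independent of Salt (has_pid is a hypothetical helper that only mirrors the failing check from the traceback):

```python
def has_pid(job):
    """Mirror the failing check from the traceback: 'pid' in job."""
    return 'pid' in job


# A dict supports the membership test.
print(has_pid({'pid': 4242}))

# None does not, so the same test raises TypeError.
try:
    has_pid(None)
except TypeError as exc:
    print("TypeError:", exc)
```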

Comment 1 Andrew Niemantsverdriet 2013-12-10 18:44:16 UTC
Can you give some additional information on how to reproduce this? I am unable to reproduce it myself.

* Have the keys been accepted on the salt-master?
* Is salt-master running 0.17.2 as well?

I see the problem but am curious as to how it happened.

Comment 2 Johan Dorland 2013-12-10 22:02:06 UTC
The keys have been accepted by the salt master. Our current setup was working fine when running 0.17.1 (and before that also older versions). It was only after upgrading to 0.17.2 that this problem started to occur.
The salt master is also running 0.17.2.

I think I have found the problem, however.
On startup, the salt-minion executes clean_proc_dir():

    "Loop through jid files in the minion proc directory (default /var/cache/salt/minion/proc)
    and remove any that refer to processes that no longer exist"

On all our minions this directory was not empty.

job = salt.payload.Serial(opts).load(fp_)

This call returns None, however, so the subsequent check "if 'pid' in job:" raises the TypeError.

When I remove all the files in /var/cache/salt/minion/proc/ the salt-minion starts properly.
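
A guard along the following lines would avoid the crash. This is only a sketch under stated assumptions, not the upstream fix: load_job() is a hypothetical JSON-based stand-in for salt.payload.Serial(opts).load(fp_), is_running is a hypothetical callback that reports whether a pid is still alive, and removing unreadable files is my assumption about the desired behaviour.

```python
import json
import os
import tempfile


def load_job(path):
    """Hypothetical stand-in for salt.payload.Serial(opts).load(fp_).

    Returns None for an empty or undecodable file.
    """
    with open(path, "rb") as fp_:
        data = fp_.read()
    if not data:
        return None
    try:
        return json.loads(data.decode("utf-8"))
    except ValueError:
        return None


def clean_proc_dir_sketch(proc_dir, is_running):
    """Remove jid files whose job data is unreadable or whose pid is gone.

    Returns the sorted list of file names that were removed.
    """
    removed = []
    for name in sorted(os.listdir(proc_dir)):
        path = os.path.join(proc_dir, name)
        job = load_job(path)
        # Guard: only run the membership test on an actual dict.
        valid = isinstance(job, dict) and 'pid' in job
        if not valid or not is_running(job['pid']):
            os.remove(path)
            removed.append(name)
    return removed


if __name__ == "__main__":
    proc_dir = tempfile.mkdtemp()
    # An empty jid file, like the ones found on the affected minions.
    open(os.path.join(proc_dir, "12345"), "w").close()
    print(clean_proc_dir_sketch(proc_dir, lambda pid: False))
```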

Comment 3 Johan Dorland 2013-12-12 14:33:49 UTC
The bug can easily be reproduced by creating an empty file in /var/cache/salt/minion/proc and restarting the salt-minion. This may not reflect a real-world scenario, but it causes the exact same exception.

service salt-minion stop
touch /var/cache/salt/minion/proc/12345
salt-minion

The bug occurs on both CentOS 6.5 and Fedora 19, each with salt-minion 0.17.2.

In our setup, the jid files in /var/cache/salt/minion/proc were probably left behind by an unexpected power loss while a state.highstate was running, or by someone manually killing yum after running yum update without -y while trying to update our virtual machines.
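
To check whether a minion has such leftover files before restarting it, something like the following can be used. It is a sketch only: find_empty_jid_files is a hypothetical helper, not part of Salt, and it assumes that a zero-length file (as created by the touch command above) is the problematic case.

```python
import os

# Default proc directory quoted in the clean_proc_dir() docstring.
DEFAULT_PROC_DIR = "/var/cache/salt/minion/proc"


def find_empty_jid_files(proc_dir=DEFAULT_PROC_DIR):
    """Return the sorted paths of zero-length regular files in proc_dir."""
    if not os.path.isdir(proc_dir):
        return []
    empty = []
    for name in sorted(os.listdir(proc_dir)):
        path = os.path.join(proc_dir, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(path)
    return empty


if __name__ == "__main__":
    for path in find_empty_jid_files():
        print(path)
```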

Comment 4 Erik Johnson 2013-12-12 16:02:59 UTC
Thanks for the information. We'll look into this. Have you filed an issue on the Salt issue tracker on GitHub? If not, I will.

Comment 5 Johan Dorland 2013-12-17 14:48:02 UTC
An issue has been created on GitHub:

https://github.com/saltstack/salt/issues/9316

Comment 6 Erik Johnson 2014-02-05 17:42:41 UTC
I haven't been able to reproduce this in 0.17.4 or 0.17.5.

Comment 7 Johan Dorland 2014-02-11 15:02:17 UTC
The bug does indeed seem to be fixed in 0.17.4.