Bug 967631

Summary:

swift keeps spamming console when it cannot access config.

Product:

Red Hat OpenStack

Reporter:

Jaroslav Henner <jhenner>

Component:

openstack-swift

Assignee:

Pete Zaitcev <zaitcev>

Status:

CLOSED ERRATA

QA Contact:

Jaroslav Henner <jhenner>

Severity:

urgent

Priority:

urgent

Version:

3.0

CC:

apevec, derekh, hateya, jhenner, lhh, mkollaro, zaitcev

Target Release:

3.0

Hardware:

x86_64

OS:

Linux

Fixed In Version:

openstack-swift-1.8.0-6.el6ost

Doc Type:

Bug Fix

Last Closed:

2013-06-27 16:49:31 UTC

Type:

Bug

Attachments:

Description	Flags
rsyslog.conf	none
Console spamming band-aid	none
follow-up patch to add swift homedir for signing_dir	derekh: review+

Description Jaroslav Henner 2013-05-27 17:16:30 UTC

Description of problem:
I got a flood of:
Message from syslogd@controller at May 24 18:27:59 ...
 ¿<130>proxy-server UNCAUGHT EXCEPTION#012Traceback (most recent call last):#012  File "/usr/bin/swift-proxy-server", line 22, in <module>#012    run_wsgi(conf_file, 'proxy-server', default_port=8080, **options)#012  File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 187, in run_wsgi#012    run_server()#012  File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 149, in run_server#012    global_conf={'log_name': log_name})#012  File "/usr/lib/python2.6/site-packages/PasteDeploy-1.5.0-py2.6.egg/paste/deploy/loadwsgi.py", line 247, in loadapp#012    return loadobj(APP, uri, name=name, **kw)#012  File "/usr/lib/python2.6/site-packages/PasteDeploy-1.5.0-py2.6.egg/paste/deploy/loadwsgi.py", line 271, in loadobj#012    global_conf=global_conf)#012  File "/usr/lib/python2.6/site-packages/PasteDeploy-1.5.0-py2.6.egg/paste/deploy/loadwsgi.py", line 296, in loadcontext#012    global_conf=global_conf)#012  File "/usr/lib/python2.6/site-packages/PasteDeploy-1.5.0-py2.6.egg/paste/deploy/loadwsgi.py", line 317, in _loadconfig#012    loader = ConfigLoader(path)#012  File "/usr/lib/python2.6/site-packages/PasteDeploy-1.5.0-py2.6.egg/paste/deploy/loadwsgi.py", line 393, in __init__#012    with open(filename) as f:#012IOError: [Errno 13] Permission denied: '/etc/swift/proxy-server.conf'

Fixed with chown swift:swift /etc/swift.


Version-Release number of selected component (if applicable):
openstack-swift-1.8.0-2.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. chown root:root /etc/swift
2. /etc/init.d/openstack-swift-proxy restart

Actual results:
Spam flood

Expected results:
The restart fails.

Additional info:
annoying because it floods the admin console

Comment 2 Pete Zaitcev 2013-05-28 23:23:55 UTC

This is all quite strange. First of all, uncaught exceptions should not spam
consoles, because they are logged at the priority "crit", whereas our default
rsyslog.conf spams with "emerg". So, how could this happen?
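
For reference, the <130> prefix in the reported message is a syslog PRI value, which packs facility and severity as facility * 8 + severity. A minimal sketch of decoding it follows. The function name decode_pri is made up for illustration, and the names used for facilities 12 to 15 are informal labels.

```python
# Syslog PRI = facility * 8 + severity (RFC 3164 / RFC 5424).
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

FACILITIES = ["kern", "user", "mail", "daemon", "auth", "syslog",
              "lpr", "news", "uucp", "cron", "authpriv", "ftp",
              # informal labels for facilities 12-15
              "ntp", "audit", "alert", "clock"]
FACILITIES += ["local%d" % n for n in range(8)]  # facilities 16-23


def decode_pri(pri):
    """Split a numeric PRI value into (facility name, severity name)."""
    facility, severity = divmod(pri, 8)
    return FACILITIES[facility], SEVERITIES[severity]


print(decode_pri(130))
```

Since 130 = 16 * 8 + 2, this prints ('local0', 'crit'), which agrees with the "crit" priority mentioned above.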

More to the point, however, I am unable to reproduce the symptom where bad
permissions cause uncaught exceptions. This is because Swift daemons
check their configs before forking. If permissions do not allow reading,
they bail with "Error: unable to locate /etc/swift/proxy-server.conf"
in /var/log/messages. I have just tested this, using the instructions
in this bug report, and there's no uncaught exception.
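
A start-up check of the kind described here can be sketched as follows. This is an illustration only, not Swift's actual code: the function names are hypothetical, and only the error text is taken from this comment. Note that such a check runs with the privileges of whoever starts the daemon, so passing it as root does not show that a less privileged worker could read the same file.

```python
import sys


def check_config_readable(conf_path):
    """Return None if conf_path can be opened for reading, else an error string."""
    try:
        with open(conf_path):
            return None
    except (IOError, OSError):
        return "Error: unable to locate %s" % conf_path


def main(conf_path):
    """Hypothetical entry point: bail out before any worker is started."""
    problem = check_config_readable(conf_path)
    if problem:
        sys.stderr.write(problem + "\n")
        return 1
    # Forking and starting workers would go here.
    return 0
```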

Jaroslav, I need more information to fix this.

1. Can you reproduce at will? What are permissions of /etc/swift
right now? ls -l

2. Please reproduce the problem and save /var/log/audit/audit.log.
I want to see if SElinux makes us pass the start-up check and trips
us later.

3a. Please attach your rsyslog.conf and anything from /etc/rsyslog.d.
I want to find out why and how this ends up on the console.

3b. Please run  logger -p local0.crit test-crit. Does that end up on the console?

Comment 3 Jaroslav Henner 2013-05-29 06:20:15 UTC

(In reply to Pete Zaitcev from comment #2)
> This is all quite strange. First of all, uncaught exceptions should not spam
> consoles, because they are logged at the priority "crit", whereas our default
> rsyslog.conf spams with "emerg". So, how could this happen?

Obviously, it happens before the process enters the try/except block that does the logging, or before it registers the sys.excepthook.

From the symptoms, I think the check is incomplete: it is not checking the user:group, or it is not checking the directory permissions.
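
To illustrate why ownership of the directory matters, here is a simplified model of the classic Unix permission check. It is a sketch, not code from Swift or the kernel: the function name and the uid/gid value 160 for the swift user are assumptions for the example, and ACLs and SELinux are ignored. With /etc/swift at mode 0700, a non-owner cannot search the directory, so it cannot open proxy-server.conf inside it, whatever the file's own mode is.

```python
READ, WRITE, SEARCH = 0o4, 0o2, 0o1


def may_access(mode, owner_uid, owner_gid, uid, gids, want):
    """Simplified Unix check: does (uid, gids) get the 'want' bits on an entry?"""
    if uid == 0:
        # Simplification: root bypasses read and search checks.
        return True
    if uid == owner_uid:
        granted = (mode >> 6) & 0o7   # owner bits
    elif owner_gid in gids:
        granted = (mode >> 3) & 0o7   # group bits
    else:
        granted = mode & 0o7          # other bits
    return (granted & want) == want


SWIFT_UID = SWIFT_GID = 160  # hypothetical ids for the swift user

# After "chown root:root /etc/swift" with mode 0700:
print(may_access(0o700, 0, 0, SWIFT_UID, [SWIFT_GID], SEARCH))
# After "chown swift:swift /etc/swift" with mode 0700:
print(may_access(0o700, SWIFT_UID, SWIFT_GID, SWIFT_UID, [SWIFT_GID], SEARCH))
```

The first call prints False and the second prints True. A check made as root would return True in both cases, which is one way a start-up check could pass while a later open as the swift user fails.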

BTW, it happens at line 393 of
/usr/lib/python2.6/site-packages/PasteDeploy-1.5.0-py2.6.egg/paste/deploy/loadwsgi.py:
IOError: [Errno 13] Permission denied: '/etc/swift/proxy-server.conf'

> 
> More to the point, however, I am unable to reproduce the symptom where bad
> permissions cause uncaught exceptions. This is because Swift daemons
> check their configs before forking. If permissions do not allow reading,
> they bail with "Error: unable to locate /etc/swift/proxy-server.conf"
> in /var/log/messages. I have just tested this, using the instructions
> in this bug report, and there's no uncaught exception.

I tried chmod a-r /etc/swift, but it didn't fail. Swift changed the permissions (which I would actually consider a bug as well) and started.
Please check that you are really changing the user:group on the directory:
chown root:root /etc/swift

> 
> Jaroslav, I need more information to fix this.
> 
> 1. Can you reproduce at will? What are permissions of /etc/swift
> right now? ls -l

drwx------. swift swift system_u:object_r:etc_t:s0       /etc/swift

> 
> 2. Please reproduce the problem and save /var/log/audit/audit.log.
> I want to see if SElinux makes us pass the start-up check and trips
> us later.
setenforce 0
chown root:root /etc/swift
restart the proxy
SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM

This proves selinux is not the issue.

> 
> 3a. Please attach your rsyslog.conf and anything from /etc/rsyslog.d.
> I want to find out why and how this ends up on the console.

Console or not, it is flooding /var/log/messages as well. Disks are not endless...

> 
> 3b. Please run  logger -p local0.crit test-crit. Does that end up on the console?

nope, only to the messages file

Comment 4 Jaroslav Henner 2013-05-29 06:21:56 UTC

Created attachment 754194
rsyslog.conf

Nothing else in the .d files

Comment 5 Pete Zaitcev 2013-05-30 03:06:00 UTC

I have reproduced the issue of the proxy server looping and spewing the
exception message. It seems like a regression in 1.8.0. The 1.7.4 is not
affected.
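
The "looping and spewing" pattern can be pictured with a toy model: a supervisor that restarts a worker each time it dies, while the worker fails the same way on every start. This is not Swift's code and the report does not spell out the exact mechanism. The names supervise and failing_worker are hypothetical, and the restart count is capped here only so the example terminates.

```python
def supervise(start_worker, max_restarts):
    """Restart a worker whenever it fails; return the messages it logged."""
    log = []
    for _ in range(max_restarts):
        try:
            start_worker()
            break  # the worker ran to completion, nothing to restart
        except IOError as err:
            # Each failed start produces one more identical log entry.
            log.append("UNCAUGHT EXCEPTION: %s" % err)
    return log


def failing_worker():
    """Stand-in for a worker that cannot read its config."""
    raise IOError(13, "Permission denied", "/etc/swift/proxy-server.conf")


for line in supervise(failing_worker, 3):
    print(line)
```

Without the cap, every restart would add another copy of the same message, which is the flood described in this report. Checking the failing condition once before the restart loop begins avoids it.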

The issue of this message getting to consoles I'm going to ignore for now,
it does not happen here. Perhaps the /etc/rsyslogd.conf being in DOS
format on Jaroslav's system does something bad.

Comment 6 Jaroslav Henner 2013-05-30 09:22:38 UTC

(In reply to Pete Zaitcev from comment #5)
> I have reproduced the issue of the proxy server looping and spewing the
> exception message. It seems like a regression in 1.8.0. The 1.7.4 is not
> affected.
> 
> The issue of this message getting to consoles I'm going to ignore for now,
> it does not happen here. Perhaps the /etc/rsyslogd.conf being in DOS
> format on Jaroslav's system does something bad.

What do you mean by rsyslogd.conf in DOS format? Should I consider it as an insult? (; 

Perhaps some CR/LF got changed when copying and pasting it from the terminal into this BZ. Maybe I should open a separate bug just for the logging to console. It also happens when some swift nodes become unreachable, but I didn't pay much attention to that.

Comment 7 Pete Zaitcev 2013-05-30 16:33:01 UTC

I agree that a separate bug for logging to consoles is needed. Either
clone this one, or file anew and mention bug 967631 in the report.
Please find a reproducer which uses some other way to trigger it,
e.g. when nodes are unreachable, because I'm going to fix this one
and that'll cut your ability to reproduce.

Comment 8 Pete Zaitcev 2013-05-30 19:14:50 UTC

N.B. Upstream review - https://review.openstack.org/31075

Comment 9 Lon Hohberger 2013-06-17 13:53:26 UTC

*** Bug 975047 has been marked as a duplicate of this bug. ***

Comment 12 Pete Zaitcev 2013-06-19 20:44:12 UTC

Filed bug 976081 against Packstack for signing_dir=/etc/swift.

Comment 13 Pete Zaitcev 2013-06-20 00:30:53 UTC

Created attachment 763175
Console spamming band-aid

I do not think we should apply this, but just so we have it, here's
a patch that uncodes the syslog message, so that the BOM is not transmitted,
and thus consoles aren't spammed. This is verified to work on Lon's
reproducer.
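
As a rough picture of what "BOM is not transmitted" means on the wire, the sketch below shows a syslog payload with a leading UTF-8 byte order mark, and the same payload without it. This is not the attached patch. The helper names are hypothetical and the payload is modelled on the message in the description, where a stray character appears before <130>.

```python
import codecs


def strip_utf8_bom(payload):
    """Return payload (bytes) without a leading UTF-8 byte order mark."""
    if payload.startswith(codecs.BOM_UTF8):
        return payload[len(codecs.BOM_UTF8):]
    return payload


def starts_with_pri(payload):
    """True if the payload begins with a syslog '<NNN>' priority field."""
    if not payload.startswith(b"<"):
        return False
    end = payload.find(b">")
    return 1 < end <= 4 and payload[1:end].isdigit()


raw = codecs.BOM_UTF8 + b"<130>proxy-server UNCAUGHT EXCEPTION"
print(starts_with_pri(raw))
print(starts_with_pri(strip_utf8_bom(raw)))
```

This prints False and then True: with the BOM in front, the payload no longer starts with its priority field, and once the BOM is gone the <130> is the first thing a receiver sees.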

The key fix, however, is to get signing_dir out of puppet modules.

Comment 14 Alan Pevec 2013-06-20 09:54:40 UTC

Created attachment 763388
follow-up patch to add swift homedir for signing_dir

Comment 15 Derek Higgins 2013-06-20 10:04:56 UTC

Comment on attachment 763388
follow-up patch to add swift homedir for signing_dir

lgtm

Comment 16 Haim 2013-06-23 16:08:52 UTC

Verified on latest puddle:

[root@nott-vdsb yum.repos.d(keystone_admin)]# chown root:root /etc/swift
[root@nott-vdsb yum.repos.d(keystone_admin)]# /etc/init.d/openstack-swift-proxy restart
Stopping openstack-swift-proxy:                            [FAILED]
Starting openstack-swift-proxy:                            [  OK  ]

[root@nott-vdsb yum.repos.d(keystone_admin)]# /etc/init.d/openstack-swift-proxy status
openstack-swift-proxy dead but pid file exists

[root@nott-vdsb yum.repos.d(keystone_admin)]# chown swift:swift /etc/swift
[root@nott-vdsb yum.repos.d(keystone_admin)]# /etc/init.d/openstack-swift-proxy start 
Starting openstack-swift-proxy:                            [  OK  ]
[root@nott-vdsb yum.repos.d(keystone_admin)]# /etc/init.d/openstack-swift-proxy status
openstack-swift-proxy (pid  7443) is running...

Comment 18 errata-xmlrpc 2013-06-27 16:49:31 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0993.html