Description of problem:
koschei-watcher.service has recently died with the following error:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/koschei/main.py", line 41, in <module>
    service().run_service()
  File "/usr/lib/python2.7/site-packages/koschei/service.py", line 61, in run_service
    self.main()
  File "/usr/lib/python2.7/site-packages/koschei/watcher.py", line 91, in main
    for _, _, topic, msg in self.fedmsg.tail_messages():
  File "/usr/lib/python2.7/site-packages/fedmsg/core.py", line 374, in tail_messages
    if not validate or fedmsg.crypto.validate(msg, **self.c):
  File "/usr/lib/python2.7/site-packages/fedmsg/crypto/__init__.py", line 245, in validate
    return backend.validate(message, **cfg)
  File "/usr/lib/python2.7/site-packages/fedmsg/crypto/x509.py", line 144, in validate
    crl = M2Crypto.X509.load_crl(crl)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/X509.py", line 1097, in load_crl
    f=BIO.openfile(file)
  File "/usr/lib64/python2.7/site-packages/M2Crypto/BIO.py", line 186, in openfile
    return File(open(filename, mode))
IOError: [Errno 2] No such file or directory: '/var/run/fedmsg/crl.pem'
Can you provide more detail when it happened?
at startup or after some time?
which koschei version?
on fedora cloud machine or locally?
is it reproducible?

Koschei just calls fedmsg's public API. It doesn't pass it any configuration; fedmsg loads its global configuration from /etc/fedmsg.d. Such an error seems to be caused by a misconfigured fedmsg, which is outside of koschei's scope. I just tried it with the default configuration that ships in Fedora 21 and it seems to work properly.
(In reply to Michael Simacek from comment #1)
> Can you provide more detail when it happened?
> at startup or after some time?

After some time koschei-watcher.service just died.

> which koschei version?
> on fedora cloud machine or locally?

It was in production on koschei.cloud.fedoraproject.org. I think it was the latest upstream version.

> is it reproducible?

No, I didn't see it again.
I found the problem in fedmsg, reassigning.

Description of the problem:
In fedmsg.crypto.x509._load_remote_cert, when there is a network failure and the certificate hasn't been downloaded yet, the function ignores the error and returns the cache path as if the file were there. This causes a failure later, because the file doesn't exist and opening it raises IOError. The client application cannot handle an IOError reliably, because it has no way of knowing what kind of error it is, and therefore cannot simply restart itself.

Expected result:
If there is a network error and there's no cached version of the certificate, fedmsg should raise a more specific exception, so that the application can determine it was a network failure and act appropriately.

Additional question:
If it had been running for a longer time, as the original report says, why wasn't there a cached file already?
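The failure mode described in this comment can be illustrated with a minimal, self-contained sketch. It does not use fedmsg or requests: `NetworkError`, `failing_fetch`, and `load_remote_cert_swallowing` are hypothetical names standing in for `requests.exceptions.ConnectionError` and the loader, and the URL is a placeholder. It only shows how swallowing the network error turns into an unrelated-looking "No such file or directory" later on.

```python
import errno
import os
import tempfile


class NetworkError(Exception):
    """Hypothetical stand-in for requests.exceptions.ConnectionError."""


def failing_fetch(location):
    # Simulates a network outage while downloading the certificate or CRL.
    raise NetworkError("could not reach %r" % location)


def load_remote_cert_swallowing(location, cache, fetch):
    """Mimics the reported behavior: ignore the network error, return the path."""
    try:
        content = fetch(location)
        with open(cache, "w") as f:
            f.write(content)
    except NetworkError:
        pass  # the error is dropped, so nothing was written to `cache`
    return cache  # returned even though the file may not exist


def demo():
    # Use a fresh directory so no previously cached file exists.
    cache = os.path.join(tempfile.mkdtemp(), "crl.pem")
    path = load_remote_cert_swallowing(
        "https://example.invalid/crl.pem", cache, failing_fetch)
    try:
        open(path).close()
    except IOError as e:
        # The caller only sees ENOENT; the network failure is invisible here.
        return e.errno
    return None
```

Calling `demo()` returns `errno.ENOENT`, which is the same error class the watcher hit: nothing in it tells the caller that a retry after a network hiccup would have helped.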
How about this for a solution? Have fedmsg re-raise the original `requests.exceptions.ConnectionError` during the call to `fedmsg.crypto.validate(msg, **config)`:

diff --git a/fedmsg/crypto/x509.py b/fedmsg/crypto/x509.py
index f15f183..4d02f5f 100644
--- a/fedmsg/crypto/x509.py
+++ b/fedmsg/crypto/x509.py
@@ -246,7 +246,8 @@ def _load_remote_cert(location, cache, cache_expiry, **config):
         with open(cache, 'w') as f:
             f.write(response.content)
     except requests.exceptions.ConnectionError:
-        log.warn("Could not access %r" % location)
+        log.error("Could not access %r" % location)
+        raise
     except IOError as e:
         # If we couldn't write to the specified cache location, try a
         # similar place but inside our home directory instead.

I'll wait for feedback before committing and preparing a release.
(In reply to Ralph Bean from comment #4)
> How about this for a solution? Have fedmsg re-raise the original
> `requests.exceptions.ConnectionError` during the call to
> `fedmsg.crypto.validate(msg, **config)`:

ConnectionError should be fine.

> diff --git a/fedmsg/crypto/x509.py b/fedmsg/crypto/x509.py
> index f15f183..4d02f5f 100644
> --- a/fedmsg/crypto/x509.py
> +++ b/fedmsg/crypto/x509.py
> @@ -246,7 +246,8 @@ def _load_remote_cert(location, cache, cache_expiry,
> **config):
>          with open(cache, 'w') as f:
>              f.write(response.content)
>      except requests.exceptions.ConnectionError:
> -        log.warn("Could not access %r" % location)
> +        log.error("Could not access %r" % location)
> +        raise
>      except IOError as e:
>          # If we couldn't write to the specified cache location, try a
>          # similar place but inside our home directory instead.
>
> I'll wait for feedback before committing and preparing a release.
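For illustration, here is a rough sketch of how a consumer could react once the network error is propagated instead of swallowed. This is not koschei's or fedmsg's actual code: `NetworkError` is a hypothetical stand-in for `requests.exceptions.ConnectionError`, and `load_remote_cert_reraising`, `consume_with_retry`, `max_attempts`, and `delay` are names and parameters made up for this example.

```python
import logging
import time

log = logging.getLogger("sketch")


class NetworkError(Exception):
    """Hypothetical stand-in for requests.exceptions.ConnectionError."""


def load_remote_cert_reraising(location, cache, fetch):
    """Same idea as the proposed patch: log the failure, then re-raise it."""
    try:
        content = fetch(location)
        with open(cache, "w") as f:
            f.write(content)
    except NetworkError:
        log.error("Could not access %r", location)
        raise
    return cache


def consume_with_retry(validate, messages, max_attempts=3, delay=5.0,
                       sleep=time.sleep):
    """Validate each message, retrying only when the failure is a network error."""
    accepted = []
    for msg in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                if validate(msg):
                    accepted.append(msg)
                break  # validated (or rejected) without a network error
            except NetworkError:
                if attempt == max_attempts:
                    raise  # give up and let the service manager restart us
                sleep(delay)
    return accepted
```

Because the exception type now identifies the cause, the caller can retry or restart on network failures only, and still treat any other IOError as a genuine bug or misconfiguration.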
Upstream: https://github.com/fedora-infra/fedmsg/pull/316
fedmsg-0.12.0-1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/fedmsg-0.12.0-1.fc21
fedmsg-0.12.0-1.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/fedmsg-0.12.0-1.fc20
fedmsg-0.12.0-1.el7 has been submitted as an update for Fedora EPEL 7. https://admin.fedoraproject.org/updates/fedmsg-0.12.0-1.el7
fedmsg-0.12.0-1.el6 has been submitted as an update for Fedora EPEL 6. https://admin.fedoraproject.org/updates/fedmsg-0.12.0-1.el6
And.. it's in the infra repo now, too.
fedmsg-0.12.1-1.el7 has been submitted as an update for Fedora EPEL 7. https://admin.fedoraproject.org/updates/fedmsg-0.12.1-1.el7
fedmsg-0.12.1-1.el6 has been submitted as an update for Fedora EPEL 6. https://admin.fedoraproject.org/updates/fedmsg-0.12.1-1.el6
fedmsg-0.12.1-1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/fedmsg-0.12.1-1.fc21
fedmsg-0.12.1-1.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/fedmsg-0.12.1-1.fc20
fedmsg-0.12.2-1.el7 has been submitted as an update for Fedora EPEL 7. https://admin.fedoraproject.org/updates/fedmsg-0.12.2-1.el7
fedmsg-0.12.2-1.el6 has been submitted as an update for Fedora EPEL 6. https://admin.fedoraproject.org/updates/fedmsg-0.12.2-1.el6
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle. Changing version to '22'. More information and reason for this action is here: https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22
fedmsg-0.12.2-1.el6 has been pushed to the Fedora EPEL 6 stable repository. If problems still persist, please make note of it in this bug report.
fedmsg-0.12.2-1.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.
fedmsg-0.12.1-1.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
fedmsg-0.12.1-1.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.