Bug 1272535
Summary: | Server and bridge regularly crash when releng is signing rpms in batched mode | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Amanda Carter <acarter> | ||||||
Component: | sigul | Assignee: | Patrick Uiterwijk <puiterwijk> | ||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | unspecified | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | rawhide | CC: | acarter, dennis, kevin, lmacken, mitr, nmavrogi, opensource | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2016-08-15 16:42:50 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Amanda Carter
2015-10-16 16:24:26 UTC
The while loop is to deal with a case where the client is waiting on a final ack that the bridge thinks it has been sent. it causes the client to deadlock waiting on something that will never come. *** Bug 801168 has been marked as a duplicate of this bug. *** See some details from previous reports of similar problems in BZ#801168 marked as a dupe of this one. ok. I sat down to gather more info today. The releng scripting basically boils down to calling: sigul --batch sign-rpm --store-in-koji --koji-only --v3-signature <keyname> rpm (when rpm is 1 rpm), and 'sign-rpms' if it's more than one. So, for testing I did the following: * Gathered a list of all the rpms produced by a fedora-23 texlive build: https://koji.fedoraproject.org/koji/buildinfo?buildID=686733 (5159 rpms) * ran: sigul sign-rpms --output /home/fedora/kevin/foobar --v3-signature fedora-23 `cat texlive-rpms-list` * Entered my passphrase for the f23 key. * Waited until it hung. I can attach the entire server and bridge logs if you like. The last few lines of each are: bridge: 2015-11-09 20:34:45,266 DEBUG: sign-rpms:send replies: Started handling 'id' = '\x00\x00\x02\x16', 'status' = '\x00\x00\x00\x00' 2015-11-09 20:34:45,266 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x03\xcd', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-cmsd-doc', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = '\xad/I\xb3K\xd7\xf4\x8brcN\xc8!\x88;\xac', 'rpm-version' = 'svn18787.0' 2015-11-09 20:34:45,266 DEBUG: Subreply: 'id' = '\x00\x00\x02A', 'status' = '\x00\x00\x00\x00' 2015-11-09 20:34:45,268 INFO: Subrequest: 'id' = '\x00\x00\x03\xcd', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-cmsd-doc', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = '\xad/I\xb3K\xd7\xf4\x8brcN\xc8!\x88;\xac', 'rpm-version' = 'svn18787.0' 2015-11-09 20:34:45,452 DEBUG: sign-rpms:koji requests: Started handling reply 'id' = '\x00\x00\x02\x17', 'status' = '\x00\x00\x00\x00' 2015-11-09 20:34:45,453 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x03\x93', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-circuitikz', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = '\xf6lu$"Q\'t\x18ak6W\x8f\x80\xcc', 'rpm-version' = 'svn28665.0.3.0' 2015-11-09 20:34:45,453 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x03\xce', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-cmtiup', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = '\xa8\xffS\x91\x95\xa6\xcf\xbe;\x9e\xa5;>\x95\xea\xd3', 'rpm-version' = 'svn20512.1.3a' 2015-11-09 20:34:45,453 INFO: Subrequest: 'id' = '\x00\x00\x03\xce', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-cmtiup', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = '\xa8\xffS\x91\x95\xa6\xcf\xbe;\x9e\xa5;>\x95\xea\xd3', 'rpm-version' = 'svn20512.1.3a' 2015-11-09 20:34:45,454 DEBUG: sign-rpms:send replies: Started handling 'id' = '\x00\x00\x02\x17', 'status' = '\x00\x00\x00\x00' 2015-11-09 20:34:45,455 DEBUG: sign-rpms:read replies: Started handling 'id' = '\x00\x00\x02B', 'status' = '\x00\x00\x00\x00' 2015-11-09 20:34:45,455 DEBUG: Subreply: 'id' = '\x00\x00\x02B', 'status' = '\x00\x00\x00\x00' vault: 2015-11-09 20:27:35,217 INFO: Subrequest: 'id' = '\x00\x00\x03\x0e', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-canoniclayout', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = 'D53\xde9oW\xa5\xbaFF\xa3\xf4\x93\xe1\xd3', 'rpm-version' = 'svn24523.0.4' 2015-11-09 20:27:35,595 INFO: Signed RPM ('texlive-boisik', 4L, 'svn15878.0.5', '13.fc23', 'noarch', '364acb3259369b950069591310cf76aded1b5ae6130ce9571e2d312cfb1a0c77c8c7ac1ad862f01576340dc82e369d4c8d7ecef47c30a6c6ece2ed4e9f8958ef') with key fedora-23 2015-11-09 20:27:35,595 DEBUG: sign-rpms:signing: Started handling ('texlive-boisik-doc', 4L, 'svn15878.0.5', '13.fc23', 'noarch', '15db7859449d3e8ac6e54f109b71bb8c87421978880c11a050bfc6e910287bdd4817aab9d4bc30012e2c9623e5abe3b232dd88c4a9c09dc0ce555004699ec665') 2015-11-09 20:27:35,597 DEBUG: sign-rpms:requests: Started handling 'id' = '\x00\x00\x03\x0f', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-canoniclayout-doc', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = '\xfa\xf0\xc9\xda\xd6?xf\x82\x04\xf7z\xb7\xe9D\xcb', 'rpm-version' = 'svn24523.0.4' 2015-11-09 20:27:35,597 INFO: Subrequest: 'id' = '\x00\x00\x03\x0f', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-canoniclayout-doc', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = '\xfa\xf0\xc9\xda\xd6?xf\x82\x04\xf7z\xb7\xe9D\xcb', 'rpm-version' = 'svn24523.0.4' 2015-11-09 20:27:35,984 INFO: Signed RPM ('texlive-boisik-doc', 4L, 'svn15878.0.5', '13.fc23', 'noarch', '15db7859449d3e8ac6e54f109b71bb8c87421978880c11a050bfc6e910287bdd4817aab9d4bc30012e2c9623e5abe3b232dd88c4a9c09dc0ce555004699ec665') with key fedora-23 2015-11-09 20:27:35,984 DEBUG: sign-rpms:signing: Started handling ('texlive-boites', 4L, 'svn32235.1.1', '13.fc23', 'noarch', '99e46617a9ad6c5bf0017a8fb4a61e0341a9224fc1840aacba758e024ae7c80e0f53e34bcc130cabf1ad375369902e3f89d6a8414b8ff4f33ff678902d69f7fb') 2015-11-09 20:27:35,985 DEBUG: sign-rpms:requests: Started handling 'id' = '\x00\x00\x03\x10', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-cantarell', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = 'M\xcb\x10\x19\xd1l\xa1J\xb07\x13\xa4a\xad\xd0&', 'rpm-version' = 'svn27066.2.4' 2015-11-09 20:27:35,986 INFO: Subrequest: 'id' = '\x00\x00\x03\x10', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-cantarell', 'rpm-release' = '13.fc23', 'rpm-sigmd5' = 'M\xcb\x10\x19\xd1l\xa1J\xb07\x13\xa4a\xad\xd0&', 'rpm-version' = 'svn27066.2.4' 2015-11-09 20:27:36,360 INFO: Signed RPM ('texlive-boites', 4L, 'svn32235.1.1', '13.fc23', 'noarch', '99e46617a9ad6c5bf0017a8fb4a61e0341a9224fc1840aacba758e024ae7c80e0f53e34bcc130cabf1ad375369902e3f89d6a8414b8ff4f33ff678902d69f7fb') with key fedora-23 Let me know if you want the full logs or other debugging info... Using pyrasite on the vault I see its threads are: Thread 0x7f6fdaa8a700 File "/usr/lib64/python2.7/threading.py", line 784, in __bootstrap self.__bootstrap_inner() File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner self.run() File "<string>", line 167, in run File "/usr/lib64/python2.7/code.py", line 243, in interact more = self.push(line) File "/usr/lib64/python2.7/code.py", line 265, in push more = self.runsource(source, self.filename) File "/usr/lib64/python2.7/code.py", line 87, in runsource self.runcode(code) File "/usr/lib64/python2.7/code.py", line 103, in runcode exec code in self.locals File "<console>", line 3, in <module> Thread 0x7f6feb00d740 File "/usr/share/sigul/server.py", line 1552, in <module> main() File "/usr/share/sigul/server.py", line 1532, in main (_, status) = os.waitpid(child_pid, 0) Not sure that helps, but thought I would add it. I was able to reproduce the issue even without --store-in-koji. Just with the same amount of packages. Hopefully we get something useful out of it soon. (In reply to Kevin Fenzi from comment #5) > Using pyrasite on the vault I see its threads are: FWIW on the vault, almost nothing interesting is happening in the top-level process; every request is handled by a new child, and that child has another helper process (and then there is RPM and GPG). Backtraces from all relevant processes (all sigul Python processes, for starters) might be more useful. (Trying to reproduce the reproduction by Nikos…) (In reply to Kevin Fenzi from comment #4) > ok. I sat down to gather more info today. > > The releng scripting basically boils down to calling: > > sigul --batch sign-rpm --store-in-koji --koji-only --v3-signature <keyname> > rpm > > (when rpm is 1 rpm), and 'sign-rpms' if it's more than one. > > So, for testing I did the following: > > * Gathered a list of all the rpms produced by a fedora-23 texlive build: > https://koji.fedoraproject.org/koji/buildinfo?buildID=686733 (5159 rpms) > > * ran: > sigul sign-rpms --output /home/fedora/kevin/foobar --v3-signature fedora-23 > `cat texlive-rpms-list` > > * Entered my passphrase for the f23 key. > > * Waited until it hung. Hi Kevin, could you clarify what do you mean by it hung? Which process hung? Are the server and bridge unable to serve any other connections? What is the output of the client in the "hung" state when run with the -v -v options? (In reply to Nikos Mavrogiannopoulos from comment #8) > Hi Kevin, could you clarify what do you mean by it hung? Which process hung? > Are the server and bridge unable to serve any other connections? What is the > output of the client in the "hung" state when run with the -v -v options? Everything stopped processing. No more logs on the vault/server, bridge or client. New client attempts just hang and don't get any answer from the bridge. In order to bring things back up, the server and bridge must both be restarted. The situation when it blocks is the following: Connections: client <--established--> bridge <--established--> server client has two processes. The parent has a socket with the child in FIN_WAIT2 mode, that is waiting a termination from the child. The child has its socket in CLOSE_WAIT mode, that is, it has not yet called close in that socket. The client's child which is blocked indefinitely on nspr_poll() - that explains why it hasn't called close in the previous socket. As he has only one more socket with the bridge, it is most likely waiting input from the bridge. bridge has two connections hanging indefinitely in CLOSE_WAIT mode. They are connections to koji and kojipkg. For some reason I cannot perform lsof/gdb/ or strace in that system. I'll have to retry with the bridge in F23. Created attachment 1094863 [details]
backtrace using pyrasite and gdb of sigul-bridge
Attaching the backtrace of the sigul bridge.
OK, thanks to Nikos arranging for a good testing environment, sigul-0.101 is available at http://people.redhat.com/mitr/rpmsigner/rhel6/ . tl:dr; Please give this a try; it should fix sign-rpms and do nothing for sign-rpm. Detailed ChangeLog at https://git.fedorahosted.org/cgit/sigul.git/tree/NEWS?h=sigul-0.101 . Notes: * This fixes a hang in (sigul sign-rpms) which we we able to reproduce. (As a side-effect, the bridge may now be performing two Koji requests concurrently, instead of only one at a time.) * In case of some exceptions, the processes could hang without printing anything. Now the exception is properly logged, but the hang has not been fixed yet (it is, AFAIK, theoretical [reproduced for me by a permission issue you are certainly not running into], and we wanted to have something for you to test ASAP), tracked as #1283364 . Note that if this scenario happens, the logged exception may not be at the bottom of the log! Search for “Unexpected error” or “Terminated by an exception”.) * Both of these affect _only_ (sigul sign-rpms) as that has been the provided reproducer. (I’ll now try running (sigul ign-rpm) on the same set of RPMs overnight, we’ll see how that fares.) So, I updated to 0.101 on the bridge, vault and bodhi-backend01 client. I was able to do a full run of texlive signing (5200+ packages) in about 33 minutes with no lockup. ;) I then did a run of 471 packages via the releng scripts that use koji, etc. That ran fine for a while, but then seems to have gotten stuck. On the bridge: 2015-11-19 21:51:23,808 DEBUG: sign-rpms:send requests: Started handling 'id' = '\x00\x00\x01\xd5', 'rpm-arch' = 'i686', 'rpm-epoch' = '', 'rpm-name' = 'gchemtable', 'rpm-release' = '10.fc23', 'rpm-sigmd5' = 'h\xc2[SV)\x17\x91\x1b&;\xa65\x89\xc2\xea', 'rpm-version' = '0.14.10' 2015-11-19 21:51:23,872 DEBUG: sign-rpms:send requests: Started handling 'id' = '\x00\x00\x01\xd6', 'rpm-arch' = 'x86_64', 'rpm-epoch' = '2', 'rpm-name' = 'libpng-debuginfo', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '0\xac\x1aki\x1c\x08\x96\x1d\xf6%\xee\x16\xdd\x17\xc9', 'rpm-version' = '1.6.19' 2015-11-19 21:51:23,938 DEBUG: sign-rpms:send requests finished, exc_info: None 2015-11-19 21:51:23,938 DEBUG: Sending final EOFs to sign-rpms:read replies... 2015-11-19 21:51:23,939 DEBUG: Waiting for sign-rpms:read replies... 2015-11-19 21:59:20,062 ERROR: I/O error: NSPR connection reset On the vault: 015-11-19 21:43:06,247 INFO: Subrequest: 'id' = '\x00\x00\x01\xcf', 'rpm-arch' = 'armv7hl', 'rpm-epoch' = '', 'rpm-name' = 'geany-plugins-geanygendoc', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '\x8bE=\x99-`\x91\xb5\x90\x94\xf6\xa8\xc2\x90\xbfQ', 'rpm-version' = '1.26' 2015-11-19 21:43:06,279 DEBUG: sign-rpms:requests: Started handling 'id' = '\x00\x00\x01\xd0', 'rpm-arch' = 'x86_64', 'rpm-epoch' = '', 'rpm-name' = 'pypy-devel', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '\x9c+\xdd\xe2\xf1\xd7\n>\xac\x1a\x99\x13tE\xac\x11', 'rpm-version' = '4.0.0' 2015-11-19 21:43:06,279 INFO: Subrequest: 'id' = '\x00\x00\x01\xd0', 'rpm-arch' = 'x86_64', 'rpm-epoch' = '', 'rpm-name' = 'pypy-devel', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '\x9c+\xdd\xe2\xf1\xd7\n>\xac\x1a\x99\x13tE\xac\x11', 'rpm-version' = '4.0.0' I then killed the bridge and restarted it and on the vault got: 2015-11-19 21:57:48,171 DEBUG: (child) _CombiningBuffer: data dropped 2015-11-19 21:57:51,387 DEBUG: sign-rpms:replies: Started handling ('gcrystal', None, '0.14.10', '10.fc23', 'armv7hl', '98ecdf8fcd018a5367c44a86c48e1cbd56dde6616e16a2dbace8773e38971a5d49474f785802865a51401cee58577916e6a2657df5610b3871b6c2b666eac771') This is using a different path/command line here... sigul --batch sign-rpms --store-in-koji --koji-only --v3-signature fedora-23 compiz-0.8.9-2.fc23.x86_64.rpm nordugrid-arc-cache-service-5.0.4-1.fc23.armv7hl.rpm nordugrid-arc-5.0.4-1.fc 23.x86_64.rpm babl-devel-0.1.14-1.fc23.armv7hl.rpm compiz-devel-0.8.9-2.fc23.x86_64.rpm lxdm-debuginfo-0.5.2-2.D 20151121gitd5b7ae6b.fc23.armv7hl.rpm perl-thrift-0.9.1-17.fc23.noarch.rpm...413more Can you test the koji path's there? Or would that be something you would need a staging instance in our infrastructure for? I just tried to duplicate the issue again today and couldn't. I will see if I can get a reproducer or at least see it again. Thanks. Created attachment 1098145 [details] pyrasite.patch Just to make sure, reiterating: > Note that if this scenario happens, the logged exception may not be at the > bottom of the log! Search for “Unexpected error” or “Terminated by an > exception”.) We can test a from-Koji path, but not --store-in-koji, so also not --koji-only; the RPMs are being sent to clients. Also, for completeness, we must have > koji_do_proxy_auth = False in sigul/settings.py; not appropriate for your use. As for more information, thanks for pointing out the existence of pyrasite; that seems to be the best tool available at least so far. What I found useful: The “Dumping thread stacks” example from http://pyrasite.readthedocs.org/en/latest/Payloads.html is a good starting point, if possible when run on _all_ processes on all three hosts. Those backtraces do not capture the exceptions from already terminated threads, though; so at that point interactive examination may be needed, e.g. once I ended up running (pyrasite shell $pid) and things like > sys._current_frames().items()[2][1].f_back.f_back.f_back.f_locals > sys._current_frames().items()[2][1].f_back.f_back.f_back.f_locals['koji_queue']._qsize() > sys._current_frames().items()[2][1].f_back.f_back.f_locals['threads'][2].exc_info > traceback.print_exception(*sys._current_frames().items()[2][1].f_back.f_back.f_locals['threads'][2].exc_info) Also, running pyrasite as root on a sigul-owned processes did not work because the sigul-owned processes could not connect back to the root-owned socket; the attached pyrasite patch (very insecure on a multi-user machine!) helped. (In reply to Miloslav Trmač from comment #14) > …and things like > … (To clarify, this is not a request for capturing the output of precisely those commands; that would be backtrace-specific.) I got a lockup today and logs on the bridge had: 2015-11-25 18:01:11,312 DEBUG: sign-rpms:send replies: Started handling 'id' = '\x00\x00\x00\xf7', 'status' = '\x00\x00\x00\x00' 2015-11-25 18:01:11,313 ERROR: NSPR error Traceback (most recent call last): File "/usr/share/sigul/utils.py", line 559, in run self._real_run() File "/usr/share/sigul/bridge.py", line 878, in _real_run self.__handle_one_rpm(rpm) File "/usr/share/sigul/bridge.py", line 884, in __handle_one_rpm self.__conn.client_buf.write(rpm.reply_header_data) File "/usr/share/sigul/double_tls.py", line 880, in write self.__socket.send(utils.u32_pack(len(data))) NSPRError: (PR_HOST_UNREACHABLE_ERROR) Host is unreachable. but then it kept going for a while. Still don't have a solid reproducer. ;( (In reply to Kevin Fenzi from comment #16) > I got a lockup today and logs on the bridge had: > > 2015-11-25 18:01:11,312 DEBUG: sign-rpms:send replies: Started handling 'id' > = '\x00\x00\x00\xf7', 'status' = '\x00\x00\x00\x00' > 2015-11-25 18:01:11,313 ERROR: NSPR error > Traceback (most recent call last): > File "/usr/share/sigul/utils.py", line 559, in run > self._real_run() > File "/usr/share/sigul/bridge.py", line 878, in _real_run > self.__handle_one_rpm(rpm) > File "/usr/share/sigul/bridge.py", line 884, in __handle_one_rpm > self.__conn.client_buf.write(rpm.reply_header_data) > File "/usr/share/sigul/double_tls.py", line 880, in write > self.__socket.send(utils.u32_pack(len(data))) > NSPRError: (PR_HOST_UNREACHABLE_ERROR) Host is unreachable. Thanks, this is sufficient to show that #1283364 will need to be fixed. And that I was wrong about ‘Search for “Unexpected error” or “Terminated by an exception”.)', the error messages can have other forms, like the above. > but then it kept going for a while. That is expected; in this case things hang only after the vault can’t sign any more because too many replies are in flight, which eventually prevents the bridge from queuing more incoming requests. > Still don't have a solid reproducer. ;( FWIW, the bridge got “host unreachable” (or something NSPR mapped to “host unreachable”) while trying to send data to the client. Could this be related to a signing client shutting down/suspending a laptop, a VPN dropping, or something like that? (Not that that would fix sigul, of course.) (In reply to Miloslav Trmač from comment #18) > FWIW, the bridge got “host unreachable” (or something NSPR mapped to “host > unreachable”) while trying to send data to the client. Could this be related > to a signing client shutting down/suspending a laptop, a VPN dropping, or > something like that? (Not that that would fix sigul, of course.) Shouldn't be. The bridge is a vm on a cisco UCS blade center, and the client host is another vm on the same blade center (although a different blade). There's no vpns or suspends going on. sigul-0.102 is available at http://people.redhat.com/mitr/rpmsigner/rhel6/ , with the same deployment mechanism and/or issues as the last time. Again, this only affects sign-rpms and does nothing for sign-rpm. Detailed ChangeLog at https://git.fedorahosted.org/cgit/sigul.git/tree/NEWS?h=sigul-0.102 . With this, exceptions in any of the worker threads should cause the whole processing pipeline to eventually terminate. Note that it takes the Linux kernel, per tcp(7), “13 to 30 minutes” for it to detect that a host is completely dead and not responding to incoming data at all, and only then is user-space notified. The RPM items will in the mean time queue in the various socket buffers, and eventually in inter-thread queues; so depending on the processing speed, between the time a host goes away and sigul terminates, the logs may show activity or they may show no progress. The Recv-Q and Send-Q columns in (ss -nat) may suggest such queueing, or high values in retry counts within (ss -nato). (FWIW a possible way to reproduce something similar, if not exactly equivalent, in the harness by Nikos is to (docker stop sigul-client); that will result in “connection reset” instead of “host unreachable”.) For example, for me, with the bridge downloading RPMs from koji but sending them back to the client, (docker stop sigul-client) causes: A The bridge→client socket buffers fill up, within a minute B Bridge processing of replies continues until ~200 RPMs are queued C Then vault→bridge socket buffers fill up D Then the two vault queues fill up (until ~200 RPMs queued, again) (Vault stops logging activity at this point) E Then the bridge→vault socket buffers fill up F Request processing in the bridge continues until ~200 RPMs are queued (Bridge stops logging activity at this point) G The client→bridge socket buffers, usually full for me most of the time because the client processing is faster than the bridge, would eventually empty and the bridge would block until more data from the (actually missing) client arrives. I have seen this process terminate at D and F so far; i.e. there has always been visible activity logged on the bridge at least up to the point of an ECONNRESET and the bridge and server terminating the processing of this request; but that may not be the case for you. Still, the process with data queued in its send buffer should terminate within 30 minutes or so if the peer completely goes away. [This is different from a peer hanging; a peer hanging but continuing to ACK that it can not receive more data will keep the connection open indefinitely. But that is clearly not the case for your PR_HOST_UNREACHABLE_ERROR.] FWIW all of this will “only” convert server hangs into signing failures; it will still be necessary to either resolve the cause of the “host unreachable” errors (which are unlikely to be within sigul) or to have clients retry after that network error goes away. Thanks for all the info. I have updated to 0.102 here. Will let you know any results. With both client and brige updated to 0.102 I get the following error on the bridge trying so sign Rawhide: 2015-11-28 08:52:27,341 ERROR: sign-rpms:send replies: I/O error: NSPR connection reset 2015-11-28 08:52:41,424 DEBUG: sign-rpms:koji replies: Sending queue EOF failed, queue already orphaned 2015-11-28 08:52:41,424 INFO: sign-rpms:koji replies: Writing to work queue failed, queue orphaned 2015-11-28 08:52:42,362 DEBUG: sign-rpms:read replies: Exception, shutting down write side as well 2015-11-28 08:52:42,362 DEBUG: sign-rpms:read replies: Sending queue EOF failed, queue already orphaned 2015-11-28 08:52:42,362 INFO: sign-rpms:read replies: Writing to work queue failed, queue orphaned 2015-11-28 08:52:42,363 ERROR: sign-rpms:send requests: I/O error: NSPR connection reset 2015-11-28 08:52:56,447 DEBUG: sign-rpms:koji requests: Sending queue EOF failed, queue already orphaned 2015-11-28 08:52:56,447 INFO: sign-rpms:koji requests: Writing to work queue failed, queue orphaned it seems like sigul was not done writing errors when I posted comment:22 - bug 1283364 contains the full error. So, the last week I was pushing updates the batches all worked fine for me aside from one case where the client didn't see that the server was all done. Till: Can you test batches with f24 again and see what issues (if any) remain? (Since it looks like 1283364 was fixed by iptables rule adjustments?) I got a lockup on the client today client side shows 14961 pts/8 S+ 0:04 | \_ /usr/bin/python ./sigulsign_unsigned.py fedora-23 -v --write-all --sigul-batch-size=50 libtool-2.4.6-8.fc23 llvm-3.7.0-3.fc23 elfutils-0.165-2.fc23 gnome-calendar-3.18.2.1-2.fc23 nfs-utils-1.3.3-6.rc3.fc23 kernel-4.3.3-301.fc23 python-pandas-0.17.1-1.fc23 ardour4-4.6.0-1.fc23 nodejs-mysql-2.6.2-3.fc23 perl-WWW-Twilio-TwiML-1.05-1.fc23 libdkimpp-1.0.8-1.fc23 python-flower-0.8.3-5.fc23 qmmp-0.9.6-1.fc23 pcre2-10.21-1.fc23 php-twig-1.23.3-1.fc23 docker-compose-1.5.2-4.fc23 openscap-daemon-0.1.1-2.fc23 perl-Net-GitHub-0.82-1.fc23 python-boto3-1.2.3-1.fc23 perl-Getopt-Lucid-1.06-1.fc23 perl-Date-Holidays-DE-1.7-1.fc23 gofer-2.7.1-1.fc23 mk-files-20151111-1.fc23 fontopia-1.1-2.fc23 vdr-2.2.0-8.fc23 python-ivi-0.14.9-3.fc23 dcap-2.47.10-1.fc23 wordwarvi-1.1-1.git6beed31.fc23 nordugrid-arc-5.0.5-1.fc23 nordugrid-arc-doc-2.0.6-1.fc23 coreboot-utils-4.2-1.fc23 ocserv-0.10.11-1.fc23 perl-Razor-Agent-2.85-21.fc23 python-voluptuous-0.8.8-1.fc23 python-photutils-0.2-1.fc23 weechat-1.4-1.fc23 batctl-2015.2-1.fc23 fastd-17-4.fc23 libuecc-6-1.fc23 openttd-1.5.3-1.fc23 gphotoframe-2.0.2-2.hg2084299dffb6.fc23 yash-2.40-1.fc23 prosody-0.9.9-2.fc23 php-di-symfony2-bridge-1.1.0-1.fc23 php-di-phpdoc-reader-2.0.1-1.fc23 php-di-invoker-1.2.0-1.fc23 php-mnapoli-phpunit-easymock-0.2.1-1.fc23 php-container-interop-1.1.0-1.fc23 php-di-5.2.0-1.fc23 checkstyle-6.13-1.fc23 rubygem-pkg-config-1.1.7-1.fc23 dislocker-0.5.1-2.fc23 libsigrok-0.3.0-4.fc23 hash-slinger-2.7-1.fc23 docker-client-3.1.9-1.fc23 freeciv-2.5.2-1.fc23 metis-5.1.0-10.fc23 voms-2.0.12-7.fc23 php-pear-PHP-CodeSniffer-2.5.1-1.fc23 golang-github-DataDog-datadog-go-0-0.1.gitb050cd8.fc23 mediainfo-0.7.81-1.fc23 libmediainfo-0.7.81-1.fc23 perl-System-Command-1.117-1.fc23 python-ldap3-0.9.8.6-3.fc23 ibus-anthy-1.5.8-1.fc23 elementary-1.16.1-1.fc23 perl-Log-Report-Optional-1.02-1.fc23 yoshimi-1.3.8.1-1.fc23 scap-security-guide-0.1.28-1.fc23 eclipse-pydev-4.5.1-1.fc23 gsi-openssh-7.1p2-1.fc23 evas-generic-loaders-1.16.0-2.fc23 texlive-2014-18.20140525_r34255.fc23 python-gammu-2.5-1.fc23 nsd-4.1.7-4.fc23 efl-1.16.1-2.fc23 mg-20160118-1.fc23 pidgin-window-merge-0.3-5.fc23 sssd-1.13.3-2.fc23 docker-registry-0.9.1-4.fc23 scalapack-2.0.2-12.fc23 cutter-1.2.5-1.fc23 qucs-0.0.18-7.fc23 python-bcrypt-2.0.0-3.fc23 pcsc-lite-1.8.15-1.fc23 vagrant-libvirt-0.0.32-1.fc23 perl-Number-Fraction-2.00-1.fc23 mkvtoolnix-8.8.0-1.fc23 colorhug-client-0.2.8-1.fc23 python-monotonic-0.5-1.fc23 hwdata-0.285-2.fc23 gnome-2048-3.18.2-2.fc23 rubygem-test-unit-rr-1.0.5-1.fc23 rubygem-test-unit-3.1.7-100.fc23 mozilla-noscript-2.9.0.2-1.fc23 php-pecl-propro-1.0.2-1.fc23 php-pecl-raphf-1.1.2-1.fc23 gnome-abrt-1.2.1-1.fc23 libinput-1.1.4-3.fc23 lxqt-notificationd-0.10.0-3.fc23 liblxqt-0.10.0-7.fc23 lxqt-panel-0.10.0-4.fc23 lxqt-config-0.10.0-3.fc23 lxqt-common-0.10.0-7.fc23 lxqt-powermanagement-0.10.0-3.fc23 lxqt-policykit-0.10.0-3.fc23 lxqt-globalkeys-0.10.0-4.fc23 lxqt-session-0.10.0-3.fc23 lxqt-about-0.10.0-4.fc23 libqtxdg-1.3.0-2.fc23 lxqt-sudo-0.10.0-4.fc23 pcmanfm-qt-0.10.0-2.fc23 19440 pts/8 Sl+ 0:00 | \_ /usr/bin/python /usr/share/sigul/client.py --batch sign-rpms --store-in-koji --koji-only --v3-signature fedora-23 eclipse-pydev-debuginfo-4.5.1-1.fc23.x86_64.rpm texlive-tabvar-svn28908.1.7-18.fc23.noarch.rpm texlive-supertabular-svn15878.4.1a-18.fc23.noarch.rpm texlive-dtk-doc-svn27443.1.29-18.fc23.noarch.rpm texlive-listings-svn33095.1.5c-18.fc23.noarch.rpm texlive-bidi-doc-svn32896.14.6-18.fc23.noarch.rpm texlive-figsize-svn18784.0.1-18.fc23.noarch.rpm texlive-newfile-doc-svn15878.1.0c-18.fc23.noarch.rpm efl-debuginfo-1.16.1-2.fc23.armv7hl.rpm texlive-babel-albanian-svn30254.1.0c-18.fc23.noarch.rpm texlive-threeparttable-doc-svn17383.0-18.fc23.noarch.rpm texlive-pxbase-svn28516.0.5-18.fc23.noarch.rpm qucs-0.0.18-7.fc23.x86_64.rpm texlive-latexdiff-doc-svn30961.1.0.3-18.fc23.noarch.rpm texlive-digiconfigs-svn15878.0.5-18.fc23.noarch.rpm texlive-vwcol-doc-svn15878.0.1-18.fc23.noarch.rpm texlive-ogham-svn24876.0-18.fc23.noarch.rpm texlive-sourcecodepro-doc-svn32006.2.2-18.fc23.noarch.rpm texlive-jfontmaps-bin-svn29848.0-18.20140525_r34255.fc23.noarch.rpm texlive-ifplatform-svn21156.0.3a-18.fc23.noarch.rpm texlive-etoolbox-doc-svn20922.2.1-18.fc23.noarch.rpm texlive-dialogl-svn28946.0-18.fc23.noarch.rpm texlive-slantsc-svn25007.2.11-18.fc23.noarch.rpm texlive-covington-doc-svn17750.0-18.fc23.noarch.rpm texlive-hepparticles-svn15878.0-18.fc23.noarch.rpm texlive-pst-2dplot-svn15878.1.5-18.fc23.noarch.rpm texlive-biblatex-dw-svn31752.1.6a-18.fc23.noarch.rpm texlive-lgreek-svn21818.0-18.fc23.noarch.rpm scalapack-mpich-2.0.2-12.fc23.armv7hl.rpm texlive-edmac-svn15878.0-18.fc23.noarch.rpm texlive-altfont-doc-svn15878.1.1-18.fc23.noarch.rpm texlive-chemcono-doc-svn17119.1.3-18.fc23.noarch.rpm texlive-memoir-doc-svn31044.3.7b-18.fc23.noarch.rpm texlive-chextras-doc-svn27118.1.01-18.fc23.noarch.rpm texlive-lfb-svn15878.1.0-18.fc23.noarch.rpm texlive-gfsbaskerville-svn19440.1.0-18.fc23.noarch.rpm texlive-tpic2pdftex-bin-svn29741.0-18.20140525_r34255.fc23.noarch.rpm texlive-bibleref-svn21145.1.14-18.fc23.noarch.rpm texlive-auto-pst-pdf-doc-svn23723.0.6-18.fc23.noarch.rpm texlive-uri-svn21608.0-18.fc23.noarch.rpm texlive-libris-svn19409.1.007-18.fc23.noarch.rpm texlive-todonotes-doc-svn28362.0-18.fc23.noarch.rpm texlive-shadowtext-svn26522.0.3-18.fc23.noarch.rpm texlive-sfmath-svn15878.0.8-18.fc23.noarch.rpm texlive-thumbs-doc-svn33134.1.0n-18.fc23.noarch.rpm texlive-unamthesis-doc-svn29799.2.02-18.fc23.noarch.rpm texlive-tkz-linknodes-doc-svn22833.1.0c-18.fc23.noarch.rpm texlive-mdframed-doc-svn31075.1.9b-18.fc23.noarch.rpm texlive-bgteubner-svn25892.2.02-18.fc23.noarch.rpm texlive-gsftopk-bin-svn30088.0-18.20140525_r34255.fc23.x86_64.rpm 19441 pts/8 S+ 0:00 | \_ /usr/bin/python /usr/share/sigul/client.py --batch sign-rpms --store-in-koji --koji-only --v3-signature fedora-23 eclipse-pydev-debuginfo-4.5.1-1.fc23.x86_64.rpm texlive-tabvar-svn28908.1.7-18.fc23.noarch.rpm texlive-supertabular-svn15878.4.1a-18.fc23.noarch.rpm texlive-dtk-doc-svn27443.1.29-18.fc23.noarch.rpm texlive-listings-svn33095.1.5c-18.fc23.noarch.rpm texlive-bidi-doc-svn32896.14.6-18.fc23.noarch.rpm texlive-figsize-svn18784.0.1-18.fc23.noarch.rpm texlive-newfile-doc-svn15878.1.0c-18.fc23.noarch.rpm efl-debuginfo-1.16.1-2.fc23.armv7hl.rpm texlive-babel-albanian-svn30254.1.0c-18.fc23.noarch.rpm texlive-threeparttable-doc-svn17383.0-18.fc23.noarch.rpm texlive-pxbase-svn28516.0.5-18.fc23.noarch.rpm qucs-0.0.18-7.fc23.x86_64.rpm texlive-latexdiff-doc-svn30961.1.0.3-18.fc23.noarch.rpm texlive-digiconfigs-svn15878.0.5-18.fc23.noarch.rpm texlive-vwcol-doc-svn15878.0.1-18.fc23.noarch.rpm texlive-ogham-svn24876.0-18.fc23.noarch.rpm texlive-sourcecodepro-doc-svn32006.2.2-18.fc23.noarch.rpm texlive-jfontmaps-bin-svn29848.0-18.20140525_r34255.fc23.noarch.rpm texlive-ifplatform-svn21156.0.3a-18.fc23.noarch.rpm texlive-etoolbox-doc-svn20922.2.1-18.fc23.noarch.rpm texlive-dialogl-svn28946.0-18.fc23.noarch.rpm texlive-slantsc-svn25007.2.11-18.fc23.noarch.rpm texlive-covington-doc-svn17750.0-18.fc23.noarch.rpm texlive-hepparticles-svn15878.0-18.fc23.noarch.rpm texlive-pst-2dplot-svn15878.1.5-18.fc23.noarch.rpm texlive-biblatex-dw-svn31752.1.6a-18.fc23.noarch.rpm texlive-lgreek-svn21818.0-18.fc23.noarch.rpm scalapack-mpich-2.0.2-12.fc23.armv7hl.rpm texlive-edmac-svn15878.0-18.fc23.noarch.rpm texlive-altfont-doc-svn15878.1.1-18.fc23.noarch.rpm texlive-chemcono-doc-svn17119.1.3-18.fc23.noarch.rpm texlive-memoir-doc-svn31044.3.7b-18.fc23.noarch.rpm texlive-chextras-doc-svn27118.1.01-18.fc23.noarch.rpm texlive-lfb-svn15878.1.0-18.fc23.noarch.rpm texlive-gfsbaskerville-svn19440.1.0-18.fc23.noarch.rpm texlive-tpic2pdftex-bin-svn29741.0-18.20140525_r34255.fc23.noarch.rpm texlive-bibleref-svn21145.1.14-18.fc23.noarch.rpm texlive-auto-pst-pdf-doc-svn23723.0.6-18.fc23.noarch.rpm texlive-uri-svn21608.0-18.fc23.noarch.rpm texlive-libris-svn19409.1.007-18.fc23.noarch.rpm texlive-todonotes-doc-svn28362.0-18.fc23.noarch.rpm texlive-shadowtext-svn26522.0.3-18.fc23.noarch.rpm texlive-sfmath-svn15878.0.8-18.fc23.noarch.rpm texlive-thumbs-doc-svn33134.1.0n-18.fc23.noarch.rpm texlive-unamthesis-doc-svn29799.2.02-18.fc23.noarch.rpm texlive-tkz-linknodes-doc-svn22833.1.0c-18.fc23.noarch.rpm texlive-mdframed-doc-svn31075.1.9b-18.fc23.noarch.rpm texlive-bgteubner-svn25892.2.02-18.fc23.noarch.rpm texlive-gsftopk-bin-svn30088.0-18.20140525_r34255.fc23.x86_64.rpm [ausil@bodhi-backend01 scripts]$ lsof |grep python|grep TCP|grep FIN_WAIT2 python 19449 ausil 7u IPv4 2846994384 0t0 TCP localhost:52957->localhost:55554 (FIN_WAIT2) python 19449 19452 ausil 7u IPv4 2846994384 0t0 TCP localhost:52957->localhost:55554 (FIN_WAIT2) python 19450 ausil 7u IPv4 2846761826 0t0 TCP bodhi-backend01.phx2.fedoraproject.org:56028->sign-bridge1:44334 (FIN_WAIT2) on the bridge tail -50 /var/log/sigul_bridge.log-20160117 2016-01-20 17:28:51,869 INFO: Subrequest: 'id' = '\x00\x00\x00 ', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-arabxetex-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\x04\xafryX\x1e\x80q2!\xfd\xf2\xd1)\xa8\x84', 'rpm-version' = 'svn17470.v1.1.4' 2016-01-20 17:28:51,880 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00!', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-l2tabu-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\x1cy\tc\xfa\x94\x80\x1c\xda\x0f\x0c*.\xd7\x1a\x02', 'rpm-version' = 'svn24038.2.3' 2016-01-20 17:28:51,880 INFO: Subrequest: 'id' = '\x00\x00\x00!', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-l2tabu-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\x1cy\tc\xfa\x94\x80\x1c\xda\x0f\x0c*.\xd7\x1a\x02', 'rpm-version' = 'svn24038.2.3' 2016-01-20 17:28:51,886 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00"', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-kotex-utils', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = 'L\x02V&[\x16\xe8,m\x01\x8d\x16<\x05\x02\x98', 'rpm-version' = 'svn32101.2.0.1' 2016-01-20 17:28:51,886 INFO: Subrequest: 'id' = '\x00\x00\x00"', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-kotex-utils', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = 'L\x02V&[\x16\xe8,m\x01\x8d\x16<\x05\x02\x98', 'rpm-version' = 'svn32101.2.0.1' 2016-01-20 17:28:51,892 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00#', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-hitec', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\x06|Y\xc5-\xa2\x89\xfb\xd8*\xc2\xeaa(\n\x02', 'rpm-version' = 'svn15878.0.0_beta_' 2016-01-20 17:28:51,892 INFO: Subrequest: 'id' = '\x00\x00\x00#', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-hitec', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\x06|Y\xc5-\xa2\x89\xfb\xd8*\xc2\xeaa(\n\x02', 'rpm-version' = 'svn15878.0.0_beta_' 2016-01-20 17:28:51,898 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00$', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-feynmf', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xeakE\xe3\xc6\x99\n2*\xd16^\x8f\xe6nT', 'rpm-version' = 'svn17259.1.08' 2016-01-20 17:28:51,898 INFO: Subrequest: 'id' = '\x00\x00\x00$', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-feynmf', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xeakE\xe3\xc6\x99\n2*\xd16^\x8f\xe6nT', 'rpm-version' = 'svn17259.1.08' 2016-01-20 17:28:51,905 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00%', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-l3packages', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\x8b3f\x17\x85>1\x95F\x99\xdaC\x9e\xc4\xdd\xa8', 'rpm-version' = 'svn32537.SVN_4634' 2016-01-20 17:28:51,905 INFO: Subrequest: 'id' = '\x00\x00\x00%', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-l3packages', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\x8b3f\x17\x85>1\x95F\x99\xdaC\x9e\xc4\xdd\xa8', 'rpm-version' = 'svn32537.SVN_4634' 2016-01-20 17:28:51,911 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00&', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-aecc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xdc\x1b\xff\xfe\x8f\xbff\x86\xbfPZ\xa4u\x9c\xe5N', 'rpm-version' = 'svn28574.0' 2016-01-20 17:28:51,912 INFO: Subrequest: 'id' = '\x00\x00\x00&', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-aecc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xdc\x1b\xff\xfe\x8f\xbff\x86\xbfPZ\xa4u\x9c\xe5N', 'rpm-version' = 'svn28574.0' 2016-01-20 17:28:51,917 DEBUG: sign-rpms:read requests: Started handling 'id' = "\x00\x00\x00'", 'rpm-arch' = 'x86_64', 'rpm-epoch' = '', 'rpm-name' = 'blacs-common', 'rpm-release' = '12.fc23', 'rpm-sigmd5' = '\x03\x96\xff\xa0J8R\x98\x18\xf6\xfd\x99r\x04;\xbc', 'rpm-version' = '2.0.2' 2016-01-20 17:28:51,917 INFO: Subrequest: 'id' = "\x00\x00\x00'", 'rpm-arch' = 'x86_64', 'rpm-epoch' = '', 'rpm-name' = 'blacs-common', 'rpm-release' = '12.fc23', 'rpm-sigmd5' = '\x03\x96\xff\xa0J8R\x98\x18\xf6\xfd\x99r\x04;\xbc', 'rpm-version' = '2.0.2' 2016-01-20 17:28:51,923 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00(', 'rpm-arch' = 'i686', 'rpm-epoch' = '', 'rpm-name' = 'scalapack-debuginfo', 'rpm-release' = '12.fc23', 'rpm-sigmd5' = '"Ly\xa8)\x11\xef\xbe\xe3t\xf8As\xe0\xd2M', 'rpm-version' = '2.0.2' 2016-01-20 17:28:51,923 INFO: Subrequest: 'id' = '\x00\x00\x00(', 'rpm-arch' = 'i686', 'rpm-epoch' = '', 'rpm-name' = 'scalapack-debuginfo', 'rpm-release' = '12.fc23', 'rpm-sigmd5' = '"Ly\xa8)\x11\xef\xbe\xe3t\xf8As\xe0\xd2M', 'rpm-version' = '2.0.2' 2016-01-20 17:28:51,929 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00)', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-xetexfontinfo-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\'F\x9d$\x12\xc2\xe5\xa01\xc8\x01"\xb8j\xd7\x12', 'rpm-version' = 'svn15878.0' 2016-01-20 17:28:51,929 INFO: Subrequest: 'id' = '\x00\x00\x00)', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-xetexfontinfo-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\'F\x9d$\x12\xc2\xe5\xa01\xc8\x01"\xb8j\xd7\x12', 'rpm-version' = 'svn15878.0' 2016-01-20 17:28:51,954 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00*', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-icsv-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = "V\x18\x8d\x8bi\x9dg'\xbeH)\xdc\xa9\x95\xe3\x0f", 'rpm-version' = 'svn15878.0.2' 2016-01-20 17:28:51,954 INFO: Subrequest: 'id' = '\x00\x00\x00*', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-icsv-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = "V\x18\x8d\x8bi\x9dg'\xbeH)\xdc\xa9\x95\xe3\x0f", 'rpm-version' = 'svn15878.0.2' 2016-01-20 17:28:51,965 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00+', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-tdsfrmath', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xff\xda\xcd\x97\x1d\xdcQ;7\xdb(\x7f\x8a\x8b\xd4\x01', 'rpm-version' = 'svn15878.1.3' 2016-01-20 17:28:51,965 INFO: Subrequest: 'id' = '\x00\x00\x00+', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-tdsfrmath', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xff\xda\xcd\x97\x1d\xdcQ;7\xdb(\x7f\x8a\x8b\xd4\x01', 'rpm-version' = 'svn15878.1.3' 2016-01-20 17:28:51,972 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00,', 'rpm-arch' = 'noarch', 'rpm-epoch' = '', 'rpm-name' = 'scap-security-guide', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '\x00\xd8\xcd\x8f\xd4\x87\x15B\x14lD&\xd0>K\xa7', 'rpm-version' = '0.1.28' 2016-01-20 17:28:51,972 INFO: Subrequest: 'id' = '\x00\x00\x00,', 'rpm-arch' = 'noarch', 'rpm-epoch' = '', 'rpm-name' = 'scap-security-guide', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '\x00\xd8\xcd\x8f\xd4\x87\x15B\x14lD&\xd0>K\xa7', 'rpm-version' = '0.1.28' 2016-01-20 17:28:51,978 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00-', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-arabi', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = 'J\xe9\xfd\xbep\xf0\xa9QZ\xcaD\xd8\xc4yZ\x94', 'rpm-version' = 'svn25095.1.1' 2016-01-20 17:28:51,978 INFO: Subrequest: 'id' = '\x00\x00\x00-', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-arabi', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = 'J\xe9\xfd\xbep\xf0\xa9QZ\xcaD\xd8\xc4yZ\x94', 'rpm-version' = 'svn25095.1.1' 2016-01-20 17:28:51,984 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00.', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-lewis-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = "qpi\xfc\xd8,4b\xb8'\x9ce\xf07\xa9\x02", 'rpm-version' = 'svn15878.0.1' 2016-01-20 17:28:51,984 INFO: Subrequest: 'id' = '\x00\x00\x00.', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-lewis-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = "qpi\xfc\xd8,4b\xb8'\x9ce\xf07\xa9\x02", 'rpm-version' = 'svn15878.0.1' 2016-01-20 17:28:51,990 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x00/', 'rpm-arch' = 'x86_64', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-kpathsea-bin', 'rpm-release' = '18.20140525_r34255.fc23', 'rpm-sigmd5' = "N)\x94m;\xb3|\xa2N\xc5\xfe\xd6\x0e\xd7'9", 'rpm-version' = 'svn30088.0' 2016-01-20 17:28:51,990 INFO: Subrequest: 'id' = '\x00\x00\x00/', 'rpm-arch' = 'x86_64', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-kpathsea-bin', 'rpm-release' = '18.20140525_r34255.fc23', 'rpm-sigmd5' = "N)\x94m;\xb3|\xa2N\xc5\xfe\xd6\x0e\xd7'9", 'rpm-version' = 'svn30088.0' 2016-01-20 17:28:51,998 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x000', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-scheme-minimal', 'rpm-release' = '18.20140525_r34255.fc23', 'rpm-sigmd5' = 'q\x07\x88oU\xb5(\x1ew\t\x83\xa3\x8c\xefnm', 'rpm-version' = 'svn13822.0' 2016-01-20 17:28:51,998 INFO: Subrequest: 'id' = '\x00\x00\x000', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-scheme-minimal', 'rpm-release' = '18.20140525_r34255.fc23', 'rpm-sigmd5' = 'q\x07\x88oU\xb5(\x1ew\t\x83\xa3\x8c\xefnm', 'rpm-version' = 'svn13822.0' 2016-01-20 17:28:52,004 DEBUG: sign-rpms:read requests: Started handling 'id' = '\x00\x00\x001', 'rpm-arch' = 'i686', 'rpm-epoch' = '', 'rpm-name' = 'mkvtoolnix', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '%\x07K8rq\xbd\x94\x11\xae8\xf4G\x17s\xa2', 'rpm-version' = '8.8.0' 2016-01-20 17:28:52,004 INFO: Subrequest: 'id' = '\x00\x00\x001', 'rpm-arch' = 'i686', 'rpm-epoch' = '', 'rpm-name' = 'mkvtoolnix', 'rpm-release' = '1.fc23', 'rpm-sigmd5' = '%\x07K8rq\xbd\x94\x11\xae8\xf4G\x17s\xa2', 'rpm-version' = '8.8.0' 2016-01-20 17:28:52,005 DEBUG: sign-rpms:read requests finished, exc_info: None 2016-01-20 17:28:52,005 DEBUG: Sending final EOFs to sign-rpms:koji requests... 2016-01-20 17:28:52,005 DEBUG: Waiting for sign-rpms:koji requests... 2016-01-20 17:28:53,221 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x01', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-hyphen-uppersorbian', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xfe\x0e3\x8e\x80\xb9\x80\x80%\xc1T\xc7rT\xfe)', 'rpm-version' = 'svn23085.0' 2016-01-20 17:28:53,221 DEBUG: sign-rpms:send requests: Started handling 'id' = '\x00\x00\x00\x00', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-gmverb', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = 'J\xd9\xcc\xc9\x05\xe7\xbb\x1b\x11\xb7-\xc1cu\x9e?', 'rpm-version' = 'svn24288.v0.98' 2016-01-20 17:28:53,398 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x02', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-figflow', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = ')l\xad$\xea\x85\x11\x0fX\xcc\x03q\xf6-\x01\xb7', 'rpm-version' = 'svn21462.0' 2016-01-20 17:28:53,582 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x03', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-pst-ob3d', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = "\x12\xa0\xd58|\x1d\x0cY\xee'\xc7w\xebs\x15\xac", 'rpm-version' = 'svn15878.0.21' 2016-01-20 17:28:53,787 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x04', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-plates-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '?\x01a\x875\x85\xaf`\xc4\xd8\x89\xaf\xe0O\x08<', 'rpm-version' = 'svn15878.0.1' 2016-01-20 17:28:54,161 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x05', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-actuarialangle-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '~\xca\xa6`}R\x95`5H\xd0\x84O\x8b\x1es', 'rpm-version' = 'svn28004.0' 2016-01-20 17:28:54,348 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x06', 'rpm-arch' = 'i686', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-pstools-bin', 'rpm-release' = '18.20140525_r34255.fc23', 'rpm-sigmd5' = '\xcb3zO\x8a\xb1\xf7\xda\x82\xe1\x13$\xef\xaf\xb6J', 'rpm-version' = 'svn30088.0' 2016-01-20 17:28:54,534 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x07', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-pkuthss-doc', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = 'C\xafe\x03\x8d\xac\xa7\x05\xd5 \xc5\xcc\x8a\xf9\x9d\x82', 'rpm-version' = 'svn33464.1.5.2' 2016-01-20 17:28:54,709 DEBUG: sign-rpms:koji requests: Started handling request 'id' = '\x00\x00\x00\x08', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-munich', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xdf\x80c5(\x03kU{\xce\xcd\x93\xdf\xd1\xb3\xf2', 'rpm-version' = 'svn15878.0' 2016-01-20 17:28:54,758 DEBUG: sign-rpms:send requests: Started handling 'id' = '\x00\x00\x00\x01', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-hyphen-uppersorbian', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = '\xfe\x0e3\x8e\x80\xb9\x80\x80%\xc1T\xc7rT\xfe)', 'rpm-version' = 'svn23085.0' 2016-01-20 17:28:54,785 DEBUG: sign-rpms:send requests: Started handling 'id' = '\x00\x00\x00\x02', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-figflow', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = ')l\xad$\xea\x85\x11\x0fX\xcc\x03q\xf6-\x01\xb7', 'rpm-version' = 'svn21462.0' 2016-01-20 17:28:54,807 DEBUG: sign-rpms:send requests: Started handling 'id' = '\x00\x00\x00\x03', 'rpm-arch' = 'noarch', 'rpm-epoch' = '4', 'rpm-name' = 'texlive-pst-ob3d', 'rpm-release' = '18.fc23', 'rpm-sigmd5' = "\x12\xa0\xd58|\x1d\x0cY\xee'\xc7w\xebs\x15\xac", 'rpm-version' = 'svn15878.0.21' @Dennis: If there is a stateful firewall on bodhi-backend01 it will most likely be the reason why the client hung. Ansible contains the firewall rules for autosign01 that I used to work around this. I put these same rules in place on bodhi-backend now. it seems that there are no crashs with the firewall disabled. However it appears to me that the client exists without errors even though not all packages were signed (not 100% sure). btw. did you notice https://fedorahosted.org/sigul/ticket/2 regardig a different patch required for sigul now? This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle. Changing version to '24'. More information and reason for this action is here: https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component. (In reply to Till Maas from comment #28) > it seems that there are no crashs with the firewall disabled. However it > appears to me that the client exists without errors even though not all > packages were signed (not 100% sure). Did you ever manage to dig into this? Either discover that this is still a real problem or that it is not? nirik | to my knowledge that is fixed. Kevin says that this issue was fixed, so I'll close this bug out. If it turns out that this bug still happens, feel free to reopen. |