Bug 801168

Summary: Sigul hangs from time to time when signing.
Product: [Fedora] Fedora EPEL
Component: sigul
Version: el6
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Kevin Fenzi <kevin>
Assignee: Miloslav Trmač <mitr>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: dcantrell, dennis, mitr
Doc Type: Bug Fix
Last Closed: 2012-07-17 19:43:12 UTC

Attachments:
Make sure to close the server socket in all cases (flags: none)

Description Kevin Fenzi 2012-03-07 20:09:55 UTC
When signing a bunch of packages one at a time, from time to time the process will just stall. 

On the sign vault we see: 

27892 ?        Ss     0:03 /usr/bin/python /usr/share/sigul/server.py -v -v -d
14275 ?        S      0:00  \_ /usr/bin/python /usr/share/sigul/server.py -v -v -d
14276 ?        S      0:00      \_ /usr/bin/python /usr/share/sigul/server.py -v -v -d

If we kill -9 the child process (14276), things continue fine.

In the logs on the vault:

2012-03-07 18:21:17,934 INFO: Signed RPM ('nautilus-debuginfo', None, '3.3.91', '1.fc17', 'x86_64', 
...snip...') with key fedora-17
2012-03-07 18:44:03,171 DEBUG: Child exited with status 9

In the logs on the bridge:

2012-03-07 18:22:17,356 DEBUG: Request handling finished
2012-03-07 18:22:17,357 DEBUG: Waiting for the server to connect
2012-03-07 18:44:01,354 DEBUG: Waiting for the client to connect

sigul-0.99-0.1.el6.noarch
using scripts at: 
http://git.fedorahosted.org/git/?p=releng

Comment 1 Dennis Gilmore 2012-06-08 18:04:08 UTC
I have done some further digging here:

2012-06-08 17:38:25,587 DEBUG: sign-rpms:signing finished, exc_info: None
2012-06-08 17:38:25,587 DEBUG: Sending final EOFs to sign-rpms:replies...
2012-06-08 17:38:25,587 DEBUG: Waiting for sign-rpms:replies...
2012-06-08 17:38:25,589 DEBUG: sign-rpms:replies finished, exc_info: None
^C
[root@sign-vault02 ~]# lsof|grep TCP|grep python
python    4217     sigul    7u     IPv4              29980      0t0        TCP sign-vault1:55199->sign-bridge1:44333 (FIN_WAIT2)
[root@sign-vault02 ~]# ps axf|grep sigu
 4486 pts/0    S+     0:00                          \_ grep sigu
 1785 ?        Ss     0:00 /usr/bin/python /usr/share/sigul/server.py -v -d -v
 4216 ?        S      0:03  \_ /usr/bin/python /usr/share/sigul/server.py -v -d -v
 4217 ?        S      0:08      \_ /usr/bin/python /usr/share/sigul/server.py -v -d -v

[root@sign-bridge02 ~]# lsof|grep TCP|grep python
python    1309   sigul    6u     IPv4              12099      0t0        TCP *:44333 (LISTEN)
python    1309   sigul    7u     IPv4              12100      0t0        TCP *:44334 (LISTEN)
python    1309   sigul   10u     IPv4              17079      0t0        TCP sign-bridge1:42468->kojipkgs.fedoraproject.org:http (CLOSE_WAIT)

The bridge has closed the connection, but the server is still waiting on a FIN/ACK to close the socket.
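
For reference, FIN_WAIT2 is the state of the side that has sent its own FIN and had it acknowledged, but has not yet seen a FIN from the peer. Below is a minimal Python sketch of that half-closed situation. It is not sigul code: `half_close_demo` is a hypothetical helper, and it uses a throwaway loopback connection rather than the vault/bridge link.

```python
import socket

def half_close_demo():
    """Illustrative only: half-close a loopback TCP connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        client.shutdown(socket.SHUT_WR)   # client sends its FIN
        saw_eof = server.recv(1) == b""   # server reads EOF but is still open
        # Until the server closes, the client side sits in FIN_WAIT1/FIN_WAIT2
        # and the server-to-client direction keeps working.
        server.sendall(b"late reply")
        server.close()                    # the server's FIN ends the connection
        chunks = []
        while True:
            data = client.recv(1024)
            if not data:                  # EOF: the peer's FIN has arrived
                break
            chunks.append(data)
        return saw_eof, b"".join(chunks)
    finally:
        for s in (client, server, listener):
            s.close()                     # close() is safe to call twice
```

If the `server.close()` call never happened, the client end would stay in FIN_WAIT2 for as long as its process kept the socket open, which is the kind of state the lsof output above shows.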

When we kill the process on the server, the log shows:
2012-06-08 17:28:22,363 DEBUG: sign-rpms:replies finished, exc_info: None
2012-06-08 17:35:19,635 DEBUG: Child exited with status 9
2012-06-08 17:35:19,635 DEBUG: Request handling finished
2012-06-08 17:35:19,679 DEBUG: Waiting for a request

It will then gladly sign the next batch of RPMs. We need to ensure that the server closes the socket.

This seems to happen only with batch signing.
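
A minimal sketch of what "ensure that the server closes the socket" could look like in Python. This is not the sigul code or the attached patch: `handle_connection` and the `handler` callback are hypothetical names, and the only point is that the close runs on every exit path, including exceptions.

```python
import socket

def handle_connection(conn, handler):
    """Run handler(conn), then close conn however the handler exits."""
    try:
        return handler(conn)
    finally:
        # shutdown() can fail if the peer has already gone away;
        # close() must still run so the descriptor is released.
        try:
            conn.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass
        conn.close()
```

With this shape, a handler that raises halfway through a batch still leaves the peer with a clean EOF instead of a connection that lingers until the process is killed.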

Comment 2 Miloslav Trmač 2012-06-26 21:11:12 UTC
Created attachment 594600 [details]
Make sure to close the server socket in all cases

(In reply to comment #1)
> i have done some further digging here
> 
> 2012-06-08 17:38:25,589 DEBUG: sign-rpms:replies finished, exc_info: None
> python    4217     sigul    7u     IPv4              29980      0t0        TCP sign-vault1:55199->sign-bridge1:44333 (FIN_WAIT2)

> python    1309   sigul   10u     IPv4              17079      0t0        TCP sign-bridge1:42468->kojipkgs.fedoraproject.org:http (CLOSE_WAIT)

Thanks, this suggests one suspect code path.

Can you test this patch, please?

Comment 3 Miloslav Trmač 2012-07-17 19:43:12 UTC
AFAICT this should be fixed in sigul-0.100 (available in rawhide and at http://people.redhat.com/mitr/rpmsigner/rhel6/); please reopen if not.

Comment 4 Amanda Carter 2015-11-04 21:02:11 UTC

*** This bug has been marked as a duplicate of bug 1272535 ***