Description of problem: spacewalk-repo-sync Version-Release number of selected component (if applicable): 1.7 How reproducible: Always. Happens on multiple repositories, from various servers, the only common element is that those repositories are missing packages, or problems reading their signatures. Steps to Reproduce: 1. create a channel with repository pointing to one that may be missing packages, e.g.: http://download.opensuse.org/repositories/devel:/languages:/python/openSUSE_12.1/ 2. attempt to synchronize the repository, either manually with spacewalk-repo-sync or via the web interface. it seems to happen more often when executed via the web interface/taskomatic 3. see that spacewalk-repo-sync eventually hangs, and requires SIGKILL Actual results: sample session: tail -f /var/log/rhn/repo-sync/*python* shows our last signs of activity: failure: x86_64/python-pyrad-1.2-2.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. 638/904 : python-pyrss2gen-1.0.0-2.1-0.x86_64 failure: x86_64/python-pyrss2gen-1.0.0-2.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. 639/904 : python-pyst-0.2-6.1-0.x86_64 failure: x86_64/python-pyst-0.2-6.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. 640/904 : python-pyxmpp-debuginfo-1.1.1-2.1-0.x86_64 failure: x86_64/python-pyxmpp-debuginfo-1.1.1-2.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. ps shows: root 11755 4.2 0.6 422360 53996 ? S 09:12 0:18 /usr/bin/python -u /usr/bin/spacewalk-repo-sync --channel opensuse12.1-x86_64-python --type yum # strace -t -f -p 11755 Process 11755 attached - interrupt to quit 09:21:45 write(2, "failure: x86_64/python-pyxmpp-de"..., 130 # finally, after issuing kill -9, taskomatic's log shows: INFO | jvm 1 | 2012/05/10 09:28:00 | failure: x86_64/python-pymongo-debugsource-2.1.1-4.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. INFO | jvm 1 | 2012/05/10 09:28:00 | failure: x86_64/python-pypar-debuginfo-2.1.4_94-1.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. INFO | jvm 1 | 2012/05/10 09:28:00 | failure: x86_64/python-pypar-debugsource-2.1.4_94-1.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. INFO | jvm 1 | 2012/05/10 09:28:00 | failure: x86_64/python-pyrad-1.2-2.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. INFO | jvm 1 | 2012/05/10 09:28:00 | failure: x86_64/python-pyrss2gen-1.0.0-2.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. INFO | jvm 1 | 2012/05/10 09:28:00 | failure: x86_64/python-pyst-0.2-6.1.x86_64.rpm from opensuse12.1-x86_64-python: [Errno 256] No more mirrors to try. INFO | jvm 1 | 2012/05/10 09:28:00 | Traceback (most recent call last): INFO | jvm 1 | 2012/05/10 09:28:00 | Expected results: Additional info: Biggest issue here is that there can be only a single spacewalk-repo-sync process running, so a failure caused by one channel will block all the other updates.
Well, I can't reproduce the bug with reproducer above. And I can't reproduce the bug even with repo containing 3500+ missing packages. # spacewalk-repo-sync -c test_repo_channel Repo URL: http://download.opensuse.org/repositories/devel:/languages:/python/openSUSE_12.1/ Packages in repo: 3529 Packages already synced: 0 Packages to sync: 3529 ... 3527/3529 : man-pages-es-extra-1.55-14.el6-0.noarch failure: ../Packages/man-pages-es-extra-1.55-14.el6.noarch.rpm from spacewalk-n-centos6: [Errno 256] No more mirrors to try. 3528/3529 : iso-codes-3.16-2.el6-0.noarch failure: ../Packages/iso-codes-3.16-2.el6.noarch.rpm from spacewalk-n-centos6: [Errno 256] No more mirrors to try. 3529/3529 : tftp-0.49-7.el6-0.x86_64 failure: ../Packages/tftp-0.49-7.el6.x86_64.rpm from spacewalk-n-centos6: [Errno 256] No more mirrors to try. Repo http://xxx/pub/repo/ has 0 errata. Sync completed. Total time: 0:00:23 I guess this is your local network issue.
Network issue is very unlikely, since the issue can be repeated with same repositories mirrored on a nearby server. The issue is always reproducible, regardless of source, and much larger repos from same sources succeed. Only correlation between spacewalk-repo-sync hanging and not seem to be repos with large number of missing packages. One thing to note: this is almost always triggered when spacewalk-repo-sync is executed via taskomatic. I have a very hard time reproducing it when launching spacewalk-repo-sync directly from CLI. Is there anything else I can provide to aid with debugging this issue? Thanks!
The reproducer is actually pretty deterministic. The best way to see the behaviour is to just have a repodata with no packages. To achieve that, do: # mkdir -p /var/www/html/pub/fedora-18/repodata # cd /var/www/html/pub/fedora-18/repodata # wget --no-directories -r --level 1 --no-parent http://dl.fedoraproject.org/pub/fedora/linux/releases/18/Everything/x86_64/os/repodata/ Then create channel and repository and point that repository to http://localhost/pub/fedora-18. Then click Sync Now. Look at the log in /var/log/rhn/reposync -- it ends with 575/33868 : openldap-servers-2.4.33-3.fc18-0.x86_64 failure: Packages/o/openldap-servers-2.4.33-3.fc18.x86_64.rpm from fedora-18: [Errno 256] No more mirrors to try. 576/33868 : OpenImageIO-devel-1.0.9-1.fc18-0.i686 failure: Packages/o/OpenImageIO-devel-1.0.9-1.fc18.i686.rpm from fedora-18: [Errno 256] No more mirrors to try. The lines with the "failure" string amount to 64551 bytes, the other lines total 29029 bytes. I assume that spacewalk-repo-sync outputs stuff both to the log file and to its stdout and stderr, and taskomatic tries to read the stdout output first and then things get stuck on that stderr pipe buffer.
Fixed in spacewalk nightly by commit 1e7b903cbcb421a2e4eb74a0590618a71e0a263d 820612 - implemented executor with paralled stdout and stderr reading
+ checkstyle fix ... spacewalk.git: 0a28394bdbdea829e4cbdc0d13efca02deb1bb29
Fix for this bug is present in Spacewalk 2.0, closing this bug as CURRENTRELEASE.
*** Bug 965386 has been marked as a duplicate of this bug. ***
*** Bug 889885 has been marked as a duplicate of this bug. ***