Bug 738729

Summary: stuck in endless loop in bwstat_delay when using spice client
Product: [Fedora] Fedora
Reporter: Alon Levy <alevy>
Component: trickle
Assignee: Nicoleau Fabien <nicoleau.fabien>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 16
CC: dblechte, manuel.wolfshant, mtasaka, nicoleau.fabien
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: trickle-1.07-12.fc16
Doc Type: Bug Fix
Last Closed: 2011-10-09 19:51:38 UTC
Attachments:
bwstat_getdelay exit while loop if no packets (flags: none)

Description Alon Levy 2011-09-15 17:08:41 UTC
Description of problem:
The command:
 trickle -u 10 -d 20 `which spicec` -h localhost -p 9011

becomes unresponsive. This doesn't happen if I raise the download limit enough (500+). Attaching with gdb shows the process in an infinite loop inside a recv call, in bwstat_delay. The attached patch to the package includes a simple fix. I've tested it with the above command and, while it solves the hang, I get occasional crashes, so I'm not sure it is the right approach. The crashes seem to be unrelated, though, and only happened when using gnome-shell (I'm not sure how that relates; possibly gnome-shell causes a refresh of the spice client, which makes it try a recv).

Version-Release number of selected component (if applicable):
trickle-1.07-11.fc16.x86_64

How reproducible:
50% (1 in 2) with the command line below; 100% with a Win7 VM running the qxl driver.

Steps to Reproduce:
1. start spice server (qemu-kvm -spice disable-ticketing,port=9011)
2. trickle -u 10 -d 20 `which spicec` -h localhost -p 9011
  
Actual results:
The program gets stuck; no keyboard interaction is possible. Attaching gdb shows it stuck in
/usr/src/debug/trickle-1.07/bwstat.c, in the loop starting at line 189.

Expected results:
Works normally (a window appears with the BIOS; I can press Ctrl+B, press Tab, and use the PXE command line).

Additional info:

The following patch to the Fedora trickle package (it includes the patch to trickle 1.07; tested on Fedora 16) fixes the problem:


commit 4bd0d811fadd7789295dbf5a4304f5159637584e
Author: Alon Levy <alevy>
Date:   Thu Sep 15 18:08:55 2011 +0300

    1.07-12 fix endless loop (local patch)

diff --git a/trickle-1.07-bwsta_getdelay-stop-if-no-packets.patch b/trickle-1.07-bwsta_getdelay-stop-if-no-packets.patch
new file mode 100644
index 0000000..6760350
--- /dev/null
+++ b/trickle-1.07-bwsta_getdelay-stop-if-no-packets.patch
@@ -0,0 +1,25 @@
+From 3b22c327ff6ac3ee51332919e91ae63d47225a9b Mon Sep 17 00:00:00 2001
+From: Alon Levy <alevy>
+Date: Thu, 15 Sep 2011 18:05:13 +0300
+Subject: [PATCH] bwsta_getdelay: stop if no packets
+
+---
+ bwstat.c |    2 +-
+ 1 files changed, 1 insertions(+), 1 deletions(-)
+
+diff --git a/bwstat.c b/bwstat.c
+index a1c1085..9567275 100644
+--- a/bwstat.c
++++ b/bwstat.c
+@@ -210,7 +210,7 @@ bwstat_getdelay(struct bwstat *bs, size_t *len, uint lim, short which)
+ 
+ 			ent += xent;
+ 		}
+-	} while (pool > 0 && ncli > 0);
++	} while (pool > 0 && ncli > 0 && (TAILQ_FIRST(&poolq) != TAILQ_END(&poolq)));
+ 
+ 	/*
+ 	 * This is the case of a client that is not using its limit.
+-- 
+1.7.6.2
+
diff --git a/trickle.spec b/trickle.spec
index 1e883aa..530b9f1 100644
--- a/trickle.spec
+++ b/trickle.spec
@@ -1,6 +1,6 @@
 Name:           trickle
 Version:        1.07 
-Release:        11%{?dist}
+Release:        12%{?dist}
 Summary:        Portable lightweight userspace bandwidth shaper
 
 Group:          Applications/System
@@ -15,6 +15,7 @@ BuildRequires:  libevent-devel
 Patch0:         %{name}-%{version}-include_netdb.patch
 Patch1:         %{name}-%{version}-libdir.patch
 Patch2:         %{name}-%{version}-CVE-2009-0415.patch
+Patch3:         %{name}-%{version}-bwsta_getdelay-stop-if-no-packets.patch
 
 %description
 trickle is a portable lightweight userspace bandwidth shaper.
@@ -33,6 +34,7 @@ trickle runs entirely in userspace and does not require root privileges.
 %patch0 -p1 -b .include_netdb
 %patch1 -p1 -b .libdir
 %patch2 -p1 -b .cve
+%patch3 -p1
 touch -r configure aclocal.m4 Makefile.in stamp-h.in
 
 iconv -f ISO88591 -t UTF8 < README > README.UTF8
@@ -71,6 +73,8 @@ rm -rf $RPM_BUILD_ROOT
 
 
 %changelog
+* Thu Sep 15 2011 Alon Levy <alevy> 1.07-12
+- Fix endless loop in bwstat_delay (seen with spicec)
 * Sun Feb 13 2011 Nicoleau Fabien <nicoleau.fabien> 1.07-11
 - Revert to the working patch
 * Wed Feb 09 2011 Fedora Release Engineering <rel-eng.org> - 1.07-10
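
The one-line change to bwstat.c above adds an emptiness check on poolq to the loop condition. The following is a minimal sketch of why that can matter, under the assumption (not confirmed by the code quoted in this report) that pool and ncli can both stay positive while poolq is already empty, so a pass over the queue changes nothing. It is not the trickle code: struct client, share_pool, guard, and cap are hypothetical names for this illustration, and TAILQ_EMPTY(q) stands in for the patch's TAILQ_FIRST(&poolq) != TAILQ_END(&poolq) comparison, since TAILQ_END may not be available in every sys/queue.h.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a queued client; not trickle's struct bwstat. */
struct client {
	unsigned int got;		/* bytes handed to this client so far */
	TAILQ_ENTRY(client) link;
};
TAILQ_HEAD(clientq, client);

/*
 * Hand out `pool` bytes, one byte per queued client per pass.
 * `ncli` mirrors the counter in the loop condition and is kept separate
 * from the queue on purpose, so the two can disagree (the assumed
 * failure mode). `guard` selects the patched condition. `cap` bounds
 * the demonstration so the unpatched variant returns instead of
 * spinning. Returns the number of passes made.
 */
unsigned int
share_pool(struct clientq *q, unsigned int pool, unsigned int ncli,
    int guard, unsigned int cap)
{
	struct client *c;
	unsigned int passes = 0;

	assert(cap > 0);
	do {
		TAILQ_FOREACH(c, q, link) {
			if (pool == 0)
				break;
			c->got++;
			pool--;
		}
		passes++;
		/* With an empty queue, nothing above changed pool or ncli. */
	} while (pool > 0 && ncli > 0 &&
	    (!guard || !TAILQ_EMPTY(q)) && passes < cap);

	return passes;
}
```

Called on an empty queue with pool and ncli both positive, the unguarded variant only stops because of cap, while the guarded variant stops after a single pass. With at least one queued client, pool shrinks on every pass and both variants terminate on their own.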

Comment 1 Nicoleau Fabien 2011-09-18 21:02:34 UTC
Hello,
thank you for the report, and the patch.

However, I don't really understand how to apply this. I copied it into a file and applied it, but when I build, the patch doesn't work.

Can you simply provide the bwsta_getdelay-stop-if-no-packets.patch file?

Comment 2 Alon Levy 2011-09-19 08:05:06 UTC
Created attachment 523795
bwstat_getdelay exit while loop if no packets

Comment 3 Fedora Update System 2011-09-19 19:37:08 UTC
trickle-1.07-12.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/trickle-1.07-12.fc16

Comment 4 Fedora Update System 2011-09-20 19:03:50 UTC
Package trickle-1.07-12.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing trickle-1.07-12.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/trickle-1.07-12.fc16
then log in and leave karma (feedback).

Comment 5 Fedora Update System 2011-10-09 19:51:38 UTC
trickle-1.07-12.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.