1420844 – python-requests Response.iter_lines() is unreliable when delimiter is in a text chunk boundary


Keywords:
Status:	CLOSED UPSTREAM
Alias:	None
Product:	Red Hat Software Collections
Classification:	Red Hat
Component:	python33
Sub Component:
Version:	python33
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	3.1
Assignee:	Python Maintainers
QA Contact:	BaseOS QE - Apps
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-02-09 15:48 UTC by Paulo Andrade
Modified:	2020-12-14 08:09 UTC
CC List:	1 user
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-09-25 14:50:17 UTC
Target Upstream Version:
Embargoed:

Attachments
request_iter_lines_test.py (1.00 KB, text/plain) 2017-02-09 15:48 UTC, Paulo Andrade	no flags

Description Paulo Andrade 2017-02-09 15:48:26 UTC

Created attachment 1248855
request_iter_lines_test.py

The attached test should not report any errors if run as:

$ python3 -m unittest request_iter_lines_test.py

Instead, depending on where the delimiter falls in the input,
iter_lines() incorrectly yields an extra empty entry.
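
The behaviour can be reproduced without a network connection. The sketch
below is not the attached test and not the library code itself: it is a
simplified stand-in for the splitting loop visible in the patch context,
fed from a plain list of text chunks in place of Response.iter_content().
The name iter_lines_sketch and the final "yield pending" step are
assumptions made for illustration.

```python
def iter_lines_sketch(chunks, delimiter=None):
    """Simplified stand-in for the unpatched line-splitting loop (assumption)."""
    pending = None
    for chunk in chunks:
        if pending is not None:
            chunk = pending + chunk

        if delimiter:
            lines = chunk.split(delimiter)
        else:
            lines = chunk.splitlines()

        # Hold back the last piece only if it looks like an unfinished line.
        if lines and lines[-1] and chunk and lines[-1][-1] == chunk[-1]:
            pending = lines.pop()
        else:
            pending = None

        for line in lines:
            yield line

    # Assumed tail: flush whatever is still buffered at end of input.
    if pending is not None:
        yield pending


# A chunk that ends exactly on the delimiter yields a spurious "".
with_delimiter = list(iter_lines_sketch(["a\r\n", "b"], delimiter="\r\n"))

# A "\r\n" pair split across two chunks is seen as two line endings.
without_delimiter = list(iter_lines_sketch(["a\r", "\nb"]))

print(with_delimiter)     # ['a', '', 'b']
print(without_delimiter)  # ['a', '', 'b']
```

In both cases the input contains only the two lines "a" and "b", so the
empty string in the middle is the extra entry described above.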

The failures can be corrected by a quick test patch. For example,
on Fedora 25 with Python 3.5, the following patch fixes all failures
in the test case when a delimiter is specified:
"""
--- /usr/lib/python3.5/site-packages/requests/models.py.orig	2017-02-01 16:04:02.117318286 -0200
+++ /usr/lib/python3.5/site-packages/requests/models.py	2017-02-01 16:53:15.219410425 -0200
@@ -707,16 +707,29 @@
 
             if pending is not None:
                 chunk = pending + chunk
+                pending = None
 
             if delimiter:
+                wrap = False
+                for i in range(1, len(delimiter) + 1):
+                    if chunk.endswith(delimiter[:i]):
+                        wrap = True
+                        break
+                if wrap:
+                    if pending is not None:
+                        pending = pending + chunk
+                    else:
+                        pending = chunk
+                    continue
                 lines = chunk.split(delimiter)
             else:
                 lines = chunk.splitlines()
 
             if lines and lines[-1] and chunk and lines[-1][-1] == chunk[-1]:
-                pending = lines.pop()
-            else:
-                pending = None
+                if pending is not None:
+                    pending = pending + lines.pop()
+                else:
+                    pending = lines.pop()
 
             for line in lines:
                 yield line
"""

However, the test case still fails when no delimiter is specified,
lines are separated by '\r\n', and a text chunk ends in the middle of
that sequence, that is, one chunk ends in '\r' and the next chunk
starts with '\n'. So chunk.splitlines() possibly needs to override the
TextIOWrapper logic in this condition.
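
One possible way to handle the no-delimiter case is to hold back a
trailing '\r' until the next chunk shows whether a '\n' follows. The
sketch below illustrates that idea only; it is neither the quick patch
above nor the upstream fix, it handles str chunks only, and the name
iter_lines_crlf_safe is hypothetical.

```python
def iter_lines_crlf_safe(chunks):
    """Yield lines from str chunks; a '\\r\\n' split across chunks counts once."""
    pending = ""
    for chunk in chunks:
        if not chunk:
            continue
        chunk = pending + chunk
        pending = ""

        # keepends=True keeps each terminator, so we can tell whether the
        # last piece is a complete line or still needs more input.
        pieces = chunk.splitlines(keepends=True)
        last = pieces[-1]
        terminated = len(last.splitlines()[0]) < len(last)

        # Buffer an unfinished line, and also a line ending in a lone '\r',
        # because the matching '\n' may arrive in the next chunk.
        if not terminated or last.endswith("\r"):
            pending = pieces.pop()

        for piece in pieces:
            # splitlines() on a single piece strips its terminator.
            yield piece.splitlines()[0]

    # End of input: whatever is buffered is the final line.
    if pending:
        yield pending.splitlines()[0]


print(list(iter_lines_crlf_safe(["a\r", "\nb"])))  # ['a', 'b']
```

The cost of this approach is that a line ending in a bare '\r' is
delivered one chunk late, since it cannot be emitted until the next
chunk (or the end of the input) is seen.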

Comment 6 Paulo Andrade 2017-08-30 20:14:26 UTC

Issue reported upstream at https://github.com/requests/requests/issues/4271

Comment 7 Charalampos Stratakis 2017-09-25 14:50:17 UTC

This issue has been fixed upstream on the 3.0.0 branch [0] (not yet released), thus closing this bug.

[0] https://github.com/requests/requests/pull/3984
