Bug 964298 - yum's dead server detection logic interacts badly with malware-scanning network proxies
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: yum
Version: 18
Hardware: All Linux
Priority: unspecified
Severity: low
Assigned To: packaging-team-maint
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-05-17 14:56 EDT by James Ralston
Modified: 2014-02-05 18:09 EST (History)
Doc Type: Bug Fix
Last Closed: 2014-02-05 18:09:45 EST
Type: Bug


Description James Ralston 2013-05-17 14:56:43 EDT
Description of problem:

In our organization, we have begun experimenting with inline malware scanning in our enterprise network proxies.

When inline malware scanning is enabled and a client requests a download through the proxy, the proxy first downloads the entire file and scans it for malware before returning any portion of it to the client. If the scan finds no malware, the file is then returned to the client.

The proxy's actions produce behavior that is very non-intuitive to both humans and automated programs: an attempt to download a large file will appear to "hang" with no progress for many seconds, and then the entire file is downloaded at essentially LAN network speeds (>100 Mbit/s).

Because this behavior can be non-intuitive, the proxy supports a feature it calls "trickle first" mode: instead of serving no data to the client until the scan is complete, the proxy starts sending the file to the client at the (non-configurable) rate of 1 byte per second. The idea is to convince both humans and automated programs that the download is successfully in progress. Once the proxy completes the malware scan, the rest of the file is sent to the client at LAN network speeds.

Unfortunately, the logic yum uses to detect dead download servers interacts badly with the proxy's behavior.

If the proxy does not use trickle mode, yum will wait for the number of seconds defined by timeout (default: 30), give up, and attempt to use another server.

But the proxy's "trickle first" mode isn't enough to satisfy yum. That is because yum has a non-configurable download rate threshold of 1000 bytes per second. If the download falls below that rate for more than 5 seconds, yum assumes the server is dead.

Thus, the ONLY way to make yum work properly with our enterprise network proxy is to set yum's timeout to a value greater than the amount of time it will take for yum to download any package; that is, to completely disable yum's dead server detection algorithms. That's a work-around, but it's an undesirable one.
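
The interaction described above can be illustrated with a small simulation. This is only a simplified model of a low-speed check, not urlgrabber's or curl's actual code: the function names, the per-second sampling, and the 12.5 MB/s burst rate are assumptions made for the example, and the 30-second window assumes the low-speed window equals yum's timeout, as Comment 1 states.

```python
def transfer_survives(rates, low_speed_limit=1000, low_speed_time=30):
    """Return True if a transfer with the given per-second byte rates completes.

    Simplified model: the transfer is aborted once the rate has stayed
    below low_speed_limit for low_speed_time consecutive seconds.
    """
    slow_seconds = 0
    for rate in rates:
        if rate < low_speed_limit:
            slow_seconds += 1
            if slow_seconds >= low_speed_time:
                return False  # treated as a dead server
        else:
            slow_seconds = 0  # any fast second resets the window
    return True


def proxied_download(scan_seconds, trickle_rate, burst_seconds=2, burst_rate=12500000):
    """Per-second rates the client sees: a trickle during the scan, then a LAN-speed burst."""
    return [trickle_rate] * scan_seconds + [burst_rate] * burst_seconds


# A 20-second scan finishes inside the 30-second window, so the download completes.
print(transfer_survives(proxied_download(20, trickle_rate=1)))
# A 60-second scan at 1 byte/s stays under 1000 bytes/s for too long and is aborted.
print(transfer_survives(proxied_download(60, trickle_rate=1)))
# With a threshold of 1 byte/s, the same trickle is enough to keep the transfer alive.
print(transfer_survives(proxied_download(60, trickle_rate=1), low_speed_limit=1))
```

In this model, lowering the threshold to 1 byte per second lets the trickled download through, while a proxy or server that sends nothing at all is still treated as dead after the window expires.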

A slightly better work-around would be for yum to make the minimal download rate threshold configurable.  E.g.:

# The number of bytes per second at which a server must send data to prevent
# yum from assuming the server is too slow to use.  Default: 1000.
minrate=1000

This still isn't ideal, as it won't detect the case where a server sends part of the file before dying. But it will detect the case when a server is still answering pings, but isn't serving up any data. And it shouldn't be too difficult to implement. (If I could figure out where yum's logic is implemented, I'd take a crack at it myself.)

Version-Release number of selected component (if applicable):

0:python-urlgrabber-3.9.1-24.fc18.noarch
0:yum-3.4.3-54.fc18.noarch
Comment 1 Zdeněk Pavlas 2013-05-20 06:24:09 EDT
Hi, and thanks for a nice introduction to the problem!

> That is because yum has a non-configurable download rate threshold of 1000 bytes per second.

Yes, that's currently hardcoded in python-urlgrabber.

/usr/lib/python2.7/site-packages/urlgrabber/grabber.py:
        self.curl_obj.setopt(pycurl.LOW_SPEED_LIMIT, 1000)

> If the download falls below that rate for more than 5 seconds, yum assumes the server is dead.

The low speed detection code in curl uses a dedicated option, LOW_SPEED_TIME.  We set it to the same value as CONNECTTIMEOUT (30s default).

        self.curl_obj.setopt(pycurl.CONNECTTIMEOUT, timeout)
        self.curl_obj.setopt(pycurl.LOW_SPEED_TIME, timeout)

> # The number of bytes per second at which a server must send data to prevent
> # yum from assuming the server is too slow to use.  Default: 1000.
> minrate=1000

Yes, this should be useful, and not too hard to implement.  It used to be 1 byte/s, but people complained that yum didn't move on to the next mirror quickly enough.  See BZ 860181.
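
For illustration, a configurable threshold might be wired into the three setopt calls quoted above roughly as follows. This is a hedged sketch, not the actual urlgrabber change: the helper names and the minrate parameter are hypothetical, and the pycurl module is passed in as an argument so the sketch can be exercised without pycurl installed. Only the option names CONNECTTIMEOUT, LOW_SPEED_LIMIT and LOW_SPEED_TIME come from the code quoted in this comment.

```python
def low_speed_options(timeout=30, minrate=1000):
    """Map curl option names to values for a given timeout and minimum rate.

    minrate is the hypothetical configurable threshold proposed in this
    bug; 1000 matches the value currently hardcoded in grabber.py.
    """
    return {
        "CONNECTTIMEOUT": timeout,
        "LOW_SPEED_LIMIT": minrate,   # bytes per second
        "LOW_SPEED_TIME": timeout,    # seconds below the limit before giving up
    }


def apply_low_speed_options(curl_obj, pycurl_module, timeout=30, minrate=1000):
    """Apply the options to a curl handle via its setopt() method.

    pycurl_module is expected to expose the option constants by name,
    as the real pycurl module does.
    """
    for name, value in low_speed_options(timeout, minrate).items():
        curl_obj.setopt(getattr(pycurl_module, name), value)
```

With the real library this would be called as apply_low_speed_options(pycurl.Curl(), pycurl, minrate=1), which keeps the 30-second window but accepts a 1 byte/s trickle.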
Comment 2 Fedora End Of Life 2013-12-21 10:31:30 EST
This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 3 Fedora End Of Life 2014-02-05 18:09:45 EST
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
