Bug 1014814

Summary: grub2's tftp implementation doesn't re-transmit ACKs
Product: Fedora
Component: grub2
Version: 19
Hardware: x86_64
OS: Linux
Status: CLOSED EOL
Severity: unspecified
Priority: unspecified
Reporter: Tyson Whitehead <twhitehead>
Assignee: Peter Jones <pjones>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: bcl, dennis, mads, pjones
Type: Bug
Doc Type: Bug Fix
Last Closed: 2015-02-17 17:28:32 UTC
Attachments: Patch to add re-transmission of ACKs (flags: none)

Description Tyson Whitehead 2013-10-02 20:37:17 UTC
Description of problem:

Grub2's TFTP implementation does not send another ACK when it receives, for a second time, the data packet it has most recently acknowledged.  As a result, a single lost ACK stalls the TFTP transfer, which then ends in a timeout error.  The exchange looks like this:

server -> data 1 -> grub
grub -> ack 1 -> server
server -> data 2 -> grub
grub -> ack 2 -> server
.
.
.
server -> data k -> grub
grub -> ack k -> (lost in the network)
server -> data k -> grub
server -> data k -> grub
server -> data k -> grub
server -> data k -> grub
.
.
.

until Grub reports a timeout error waiting for data k+1.
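
To make the expected behaviour concrete, here is a minimal sketch of the receiver-side decision in Python. It is not GRUB's code and not the attached patch; the Receiver class, its fields and the resend_duplicate_acks switch are hypothetical names used only for illustration. It assumes strictly lock-step transfers and ignores block-number wrap-around.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical TFTP receiver state (illustration only, not GRUB's)."""
    last_block: int = 0  # highest block stored and ACKed so far
    data: bytearray = field(default_factory=bytearray)
    resend_duplicate_acks: bool = True  # False models the reported behaviour

    def on_data(self, block: int, payload: bytes):
        """Return the block number to ACK, or None to stay silent."""
        if block == self.last_block + 1:
            # The next block in sequence: store it and acknowledge it.
            self.data += payload
            self.last_block = block
            return block
        if block == self.last_block and block > 0 and self.resend_duplicate_acks:
            # Duplicate of the block just ACKed: the earlier ACK was
            # probably lost, so repeat it without storing the data again.
            return block
        # Anything else is ignored in this sketch.
        return None
```

With resend_duplicate_acks set to False, the second arrival of "data k" yields no ACK at all, which is the stall shown in the trace above.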

Version-Release number of selected component (if applicable):

1:2.00-23.fc19

How reproducible:

Always (if you have a busy network or a way of dropping some packets)

Steps to Reproduce:
1. Load a large initramfs via tftp on a busy network that drops some packets

Actual results:

The transfer times out: the server never moves on to the next block because grub does not re-transmit the lost ACK for the block it has already received.

Expected results:

Grub should re-transmit the dropped ACK so the server will send the next packet.

Additional info:

The relevant part of RFC 1350 (Section 2, Overview) is:

"If a packet gets lost in the network, the intended recipient will timeout and may retransmit his last packet (which may be data or an acknowledgment), thus causing the sender of the lost packet to retransmit that lost packet.
...
Notice that both machines involved in a transfer are considered senders and receivers.  One sends data and receives acknowledgments, the other sends acknowledgments and receives data."

http://www.ietf.org/rfc/rfc1350.txt
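
The effect of a single lost ACK can be illustrated with a small deterministic simulation in Python. This is a sketch under simplifying assumptions, not a model of any real server or of GRUB: data packets always arrive, only the ACKs listed in lost_acks are dropped, and the server gives up after max_retries retransmissions. The function name and all of its parameters are hypothetical.

```python
def transfer(num_blocks, lost_acks, resend_acks, max_retries=5):
    """Simulate a lock-step transfer; return True if every block is ACKed.

    lost_acks:   set of (block, attempt) pairs whose ACK is dropped.
    resend_acks: whether the receiver re-ACKs a duplicate data block.
    """
    received = 0  # highest block the receiver has stored
    for block in range(1, num_blocks + 1):
        for attempt in range(max_retries + 1):
            # The server (re)sends `block`; data always arrives in this model.
            if block == received + 1:
                received = block
                ack_sent = True
            else:
                # Duplicate of a block the receiver already holds.
                ack_sent = resend_acks
            if ack_sent and (block, attempt) not in lost_acks:
                break  # the server saw the ACK and moves to the next block
        else:
            return False  # retries exhausted: the transfer times out
    return True


# The first ACK for block 2 is lost.
stalls = transfer(3, {(2, 0)}, resend_acks=False)    # receiver stays silent
completes = transfer(3, {(2, 0)}, resend_acks=True)  # receiver repeats the ACK
```

In this model the transfer fails when the receiver never repeats an ACK and completes when it does, matching the actual and expected results described above.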

Comment 1 Tyson Whitehead 2013-10-17 18:49:33 UTC
Created attachment 813514
Patch to add re-transmission of ACKs

Comment 2 Tyson Whitehead 2013-10-17 18:51:01 UTC
I've attached a patch against the F19 srpm to implement ACK re-transmission.

I've also submitted it upstream for (hopefully) eventual inclusion:

https://savannah.gnu.org/bugs/?40293

Comment 3 Fedora End Of Life 2015-01-09 20:05:21 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 reached end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 4 Fedora End Of Life 2015-02-17 17:28:32 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.