Bug 1237395

Summary: extremely slow https clone performance
Product: Red Hat Enterprise Linux 6 Reporter: Ian Wienand <iwienand>
Component: git Assignee: Petr Stodulka <pstodulk>
Status: CLOSED ERRATA QA Contact: Andrej Dzilský <adzilsky>
Severity: unspecified Docs Contact: Lenka Špačková <lkuprova>
Priority: unspecified    
Version: 6.5 CC: adzilsky, lpol, pabelanger, psklenar, thozza
Target Milestone: rc Keywords: Patch
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: git-1.7.1-7.el6 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-21 10:00:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1254457, 1355829, 1359264    
Attachments:
Description                                Flags
git with http fix                          none
dependency                                 none
commit that modifies http wait behaviour   none

Description Ian Wienand 2015-07-01 01:51:12 UTC
Clones were taking a *really* long time.  When I ran strace on git (which ends up being git-remote-https), I saw:

---
$ git clone https://git.openstack.org/openstack-infra/system-config
...
4606  1435711642.024680 recvfrom(3, "\27\3\1\0 ", 5, 0, NULL, NULL) = 5
4606  1435711642.024715 recvfrom(3, "H\325\v>8\236A\277;\250K\365\33$W\1\302,M\23\\\ro\212%\321\335\6\357T\326\365", 32, 0, NULL, NULL) = 32
4606  1435711642.024791 select(0, [], [], [], {0, 50000}) = 0 (Timeout)
4606  1435711642.075059 poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
4606  1435711642.075166 recvfrom(3, "\27\3\1\0`", 5, 0, NULL, NULL) = 5
4606  1435711642.075210 recvfrom(3, "=\232\353\276@@\6\2\221wX\225\30\262\221\204\361\251M)\244\3602;\354\3674S\2776\233&"..., 96, 0, NULL, NULL) = 96
4606  1435711642.075306 poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
4606  1435711642.075346 recvfrom(3, "\27\3\1\0 ", 5, 0, NULL, NULL) = 5
4606  1435711642.075379 recvfrom(3, "\370\300\\\373\250N\375\30n]\205\371\16\372L\373\36;]W\237L\214\364\325\374\234\301^\237\31\5", 32, 0, NULL, NULL) = 32
4606  1435711642.075437 select(0, [], [], [], {0, 50000}) = 0 (Timeout)
---

I didn't wait for it to finish because apparently it ends up taking about 40 minutes.
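
To put a number on the pause visible in the trace above, here is a minimal sketch that measures how long each syscall blocked, taken as the gap until the next traced call. The four trace lines are copied from the excerpt with their payload arguments abbreviated to "..."; the helper names (parse, time_in_syscall) are made up for illustration and are not part of strace or git.

```python
import re
from decimal import Decimal

# Lines copied from the strace excerpt above; payload arguments are
# abbreviated to "..." because only the timestamps and call names matter.
TRACE = """\
4606  1435711642.024715 recvfrom(3, ..., 32, 0, NULL, NULL) = 32
4606  1435711642.024791 select(0, [], [], [], {0, 50000}) = 0 (Timeout)
4606  1435711642.075059 poll([{fd=3, ...}], 1, 0) = 1
4606  1435711642.075166 recvfrom(3, ..., 5, 0, NULL, NULL) = 5
"""

# pid, timestamp (seconds.microseconds), syscall name
LINE_RE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\w+)\(")


def parse(trace):
    """Return a list of (timestamp, syscall_name) tuples, in trace order."""
    events = []
    for line in trace.splitlines():
        match = LINE_RE.match(line)
        if match:
            # Decimal keeps the microsecond digits exact.
            events.append((Decimal(match.group(2)), match.group(3)))
    return events


def time_in_syscall(events, name):
    """Sum the gaps between each call to `name` and the next traced call."""
    total = Decimal(0)
    for (start, call), (next_start, _) in zip(events, events[1:]):
        if call == name:
            total += next_start - start
    return total


if __name__ == "__main__":
    events = parse(TRACE)
    span = events[-1][0] - events[0][0]
    print("select:", time_in_syscall(events, "select"), "s of", span, "s")
```

On these four lines the select() call accounts for 0.050268 s of a 0.050451 s window, so nearly all of the elapsed time is the 50 ms timeout rather than network reads.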

Comparing this to a "bare" curl call for the same thing, i.e.

---
$ curl -o /dev/null -H 'Pragma: no-cache' -v https://git.openstack.org/openstack-infra/system-config/info/refs?service=git-upload-pack
---

it goes *way* faster, finishing in a few seconds.  The strace there looks *almost* the same, but the telling difference is the lack of select() calls:

---
4679  1435711687.733287 recvfrom(3, "\27\3\1\0 ", 5, 0, NULL, NULL) = 5
4679  1435711687.733319 recvfrom(3, "<~\t$D+\353\245\341\223\303x\374\232\343+aMd\363\200|\307R\3m\211,i\22\336\\", 32, 0, NULL, NULL) = 32
4679  1435711687.733379 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
4679  1435711687.733427 poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 1000) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
4679  1435711687.733459 poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
4679  1435711687.733493 recvfrom(3, "\27\3\1\0 ", 5, 0, NULL, NULL) = 5
4679  1435711687.733524 recvfrom(3, "\275\20\200[\225\232\245\374n\24)\361\20\312\204\375\377T\227\220\315\\\344\3211\344^\251\266I\216\237", 32, 0, NULL, NULL) = 32
4679  1435711687.733579 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
4679  1435711687.733627 poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 1000) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
4679  1435711687.733659 poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
---

It was also noticed that git 1.8 doesn't do this, so I went digging through git's history to see what had changed in this respect.  I found [1] ("Use curl_multi_fdset to select on curl fds instead of just sleeping"), which seemed to describe the exact issue.  I applied it to a local build and things seem to work much better: the clone happens as quickly as with git://

This is a small change but it makes a big difference in clone time over https, so I think it should be considered.
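
As a rough illustration of why waiting on the file descriptor helps, the sketch below contrasts the two strategies on a local socket pair. It is not git's code and does not use libcurl; the function names and the chunk count are made up for the demonstration, and only the 50 ms interval is taken from the strace output above.

```python
import select
import socket
import time

CHUNKS = 20     # number of one-byte reads; arbitrary value for the demo
SLEEP_S = 0.05  # 50 ms, matching the {0, 50000} timeout seen in the strace


def wait_by_sleeping(reader):
    """Old behaviour (as traced): pause unconditionally, even if data is ready."""
    time.sleep(SLEEP_S)


def wait_on_fd(reader):
    """New behaviour (in spirit): return as soon as the socket is readable."""
    select.select([reader], [], [], SLEEP_S)


def run(wait):
    """Read CHUNKS bytes, calling wait() before each read; return elapsed seconds."""
    reader, writer = socket.socketpair()
    try:
        writer.sendall(b"x" * CHUNKS)  # all data is available up front
        start = time.monotonic()
        for _ in range(CHUNKS):
            wait(reader)
            reader.recv(1)
        return time.monotonic() - start
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    print("sleeping between reads: %.3f s" % run(wait_by_sleeping))
    print("selecting on the fd:    %.3f s" % run(wait_on_fd))
```

With data already buffered, the sleeping variant still pays the full 50 ms before every read, while the select-based variant returns immediately. The same timeout is kept as an upper bound, so an idle connection is still polled at the old rate.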

I will attach the patch and the packages I built for testing, in case they are of interest.

---
[root@cloud-server-01 test]# yum info git
Installed Packages
Name        : git
Arch        : x86_64
Version     : 1.7.1
Release     : 3.el6_4.1
Size        : 15 M
Repo        : installed
From repo   : base
---

[1] https://git.kernel.org/cgit/git/git.git/commit/?id=6f9dd67ffea3e86276a73e522ce1186a99bbe65d

Comment 2 Ian Wienand 2015-07-01 01:55:42 UTC
Created attachment 1044848 [details]
git with http fix

Comment 3 Ian Wienand 2015-07-01 01:56:21 UTC
Created attachment 1044849 [details]
dependency

Comment 4 Ian Wienand 2015-07-01 01:57:10 UTC
Created attachment 1044850 [details]
commit that modifies http wait behaviour

Comment 5 Petr Stodulka 2015-07-01 08:24:41 UTC
Thanks Ian, good work. The posted patch solves this bug. Is it suitable for z-stream? Otherwise it could be fixed in rhel-6.8.

Comment 6 Ian Wienand 2015-07-02 00:25:07 UTC
(In reply to pstodulk from comment #5)
> Is it suitable for z-stream? Otherwise it could be fixed in rhel-6.8.

Yeah, I'm not sure; I guess it depends on whether anyone else notices.  This might be due to a combination of the upstream repo (15200 refs, which might be a lot, I'm not sure; they come from Gerrit reviews in the repo) and recent changes in the upstream git setup, which maybe don't help.

I think you can probably switch to git:// if your remote supports it, and you're not behind a firewall, etc.

Comment 7 Ian Wienand 2015-07-03 05:09:25 UTC
Just to convince myself a little bit more on this, I reran the previously failing OpenStack CI job with the modified packages via [1]; after about 10 runs, it has passed every one.

[1] https://review.openstack.org/#/c/198177/

Comment 8 Paul Belanger 2015-07-06 14:05:14 UTC
Awesome job Ian. I applied your patchset locally and also saw the performance improvement.  Thanks for working on this while I was lounging on vacation.

Comment 14 Petr Stodulka 2016-12-19 13:11:28 UTC
No doc text is needed. It is just a fix, and the user doesn't have to make any changes.

Comment 16 errata-xmlrpc 2017-03-21 10:00:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0640.html