1355931 – not able to git pull the source

Summary: not able to git pull the source

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	project-infrastructure
Sub Component:
Version:	mainline
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Nigel Babu
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-07-13 04:29 UTC by Atin Mukherjee
Modified:	2016-08-02 12:26 UTC
CC List:	4 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2016-08-02 12:26:48 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description Atin Mukherjee 2016-07-13 04:29:02 UTC

Description of problem:

git pull is failing, and review.gluster.org is not responding either.

Comment 1 Nigel Babu 2016-07-13 04:51:59 UTC

Restarted gerrit to fix the issue.

Comment 2 Nigel Babu 2016-07-13 05:10:25 UTC

This looks like the problem:

org.apache.sshd.common.channel.WindowClosedException: Already closed
        at org.apache.sshd.common.channel.Window.waitForSpace(Window.java:163)
        at org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:116)
        at org.apache.sshd.common.channel.ChannelOutputStream.write(ChannelOutputStream.java:84)
        at java.io.OutputStream.write(OutputStream.java:75)
        at org.eclipse.jgit.transport.PacketLineOut.writePacket(PacketLineOut.java:119)
        at org.eclipse.jgit.transport.PacketLineOut.writeString(PacketLineOut.java:103)
        at org.eclipse.jgit.transport.RefAdvertiser$PacketLineOutRefAdvertiser.writeOne(RefAdvertiser.java:81)
        at org.eclipse.jgit.transport.RefAdvertiser.advertiseId(RefAdvertiser.java:294)
        at org.eclipse.jgit.transport.RefAdvertiser.advertiseAny(RefAdvertiser.java:258)
        at org.eclipse.jgit.transport.RefAdvertiser.send(RefAdvertiser.java:202)
        at org.eclipse.jgit.transport.UploadPack.sendAdvertisedRefs(UploadPack.java:901)
        at org.eclipse.jgit.transport.UploadPack.service(UploadPack.java:715)
        at org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:666)
        at com.google.gerrit.sshd.commands.Upload.runImpl(Upload.java:80)
        at com.google.gerrit.sshd.AbstractGitCommand.service(AbstractGitCommand.java:101)
        at com.google.gerrit.sshd.AbstractGitCommand.access$000(AbstractGitCommand.java:32)
        at com.google.gerrit.sshd.AbstractGitCommand$1.run(AbstractGitCommand.java:70)
        at com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:437)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:377)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[2016-07-12 01:01:15,928] [NioProcessor-1] WARN

Comment 3 Nigel Babu 2016-07-13 05:23:49 UTC

The solution seems to be to edit the config to set a timeout for cache diff as explained here: https://bugs.chromium.org/p/gerrit/issues/detail?id=3940

I'm not doing anything right now. But if this happens again, I'll modify the configuration.

Comment 4 Nigel Babu 2016-07-13 06:27:06 UTC

This has happened again. For some reason, the connection to formicary and the VM console are particularly slow. misc, do you know why that's happening?

Comment 5 Nigel Babu 2016-07-13 07:02:18 UTC

Kaushal has pointed out a good deal of packet loss at RDU, which is probably related to an ongoing outage there.
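
For anyone who wants to check connectivity from their own side, here is a minimal sketch that repeatedly opens a TCP connection and reports the fraction of attempts that failed. It is only a rough proxy for packet loss, since TCP retries lost packets before giving up, and it is not what was used to diagnose this outage. The function names, the timeout, and the attempt count are illustrative choices, not anything from the Gluster infrastructure.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


def failure_rate(probe: Callable[[], bool], attempts: int = 20) -> float:
    """Call `probe` `attempts` times and return the fraction of calls that failed."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    failures = sum(1 for _ in range(attempts) if not probe())
    return failures / attempts
```

As a usage example, failure_rate(lambda: tcp_probe("review.gluster.org", 29418)) would probe the Gerrit SSH endpoint. Port 29418 is Gerrit's default SSH port and is assumed here; the actual port for review.gluster.org may differ.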

Comment 6 M. Scherer 2016-07-13 07:58:35 UTC

So the network is back: RH IT switched to a different upstream network route, and as of 07:20 UTC this seems to be fine. Can we confirm things are back to normal now and close the ticket?

And for people asking how long it took to diagnose: it was seen quite fast, but switching providers involves a BGP change, which takes a while to propagate around the internet, like DNS. IT is still dealing with Time Warner (i.e., waiting on their support) to fix the primary route and network, but this should not impact us in any way.

Comment 7 M. Scherer 2016-07-13 10:30:47 UTC

grmblb, so it seems that the current network setup wasn't actually switched for some reason, and since the issue comes and goes, this was not seen right away. They are looking into it now.

Comment 8 M. Scherer 2016-07-13 10:45:14 UTC

So there is only a single link for those servers, so there was no switch. TWC is aware and working on it.

Comment 9 Nigel Babu 2016-07-15 03:02:42 UTC

The immediate issue is now fixed. This warrants a discussion about how we can avoid this in the future.

Comment 10 M. Scherer 2016-07-15 08:25:56 UTC

Having HA on gerrit could be a solution. Not sure what this entails, or if this will really fix a major part of the issue. This would also mean having a second hosting provider.
