git fetch command failed (connection reset) for big git repositories

Hello,

We are running TC Enterprise 8.1.5 and experience failed git fetch commands (connection reset) with big git repositories (3-5 GB). Smaller git repositories work fine. After a restart of the server this error disappears, but after a few hours it starts failing again. Please find attached a thread dump. Any help is appreciated.

Regards,
Helios

Attachment(s):
TCdumpgit.zip

Hi Helios,

First, please upgrade to the latest TeamCity version (9.1 was released yesterday). If the issue reproduces on the latest version, or you are not able to upgrade in the near future, please provide us with teamcity-server.log and teamcity-vcs.log.

Hi Alina, please find attached some logs from our server running TC 8.1.5. An upgrade to 9.1 is not yet planned.

Attachment(s):
logs.zip

Hi,

Judging by the error, your git server resets the connection. Please check the git server logs for clues as to why this happens. Do you have a proxy in front of the git server? If so, please check its connection timeout settings and make sure they are large enough to clone the big repo.

When I change something in the VCS settings, TC does a checkout without any problems. But once a new commit has been made, the check for changes fails. This is not reproducible in a new TC test setup.

Would cleaning the git caches help?

Can I manually delete only the map entries and directories of the failing repository and keep the other clones?

Hello,

It has been half a year since your original post. Could you please provide updated details of the issue? Which TeamCity version do you use? What errors/warnings do you see? Please attach the teamcity-vcs.log file.

We are now using the latest version (TC 9.1.6) and still have the same issue with one git repository. All other TC projects have no issues. The one with the issue fails with the following error (also listed in teamcity-vcs.log):

Failed to collect changes, error: Error collecting changes for VCS repository '"ProjectMaster" {instance id=2466, parent internal id=413, parent id=ProjectMaster_ProjectMaster, description: "https://.../ProjectMaster.git#master"}'
 'git fetch' command failed.
 stderr: Connection reset
 stdout: Receiving objects: 1%
 Receiving objects: 2%
 Receiving objects: 3%
 Receiving objects: 4%
 Receiving objects: 5%
 Receiving objects: 6%
 Receiving objects: 7%
 Receiving objects: 8%
 Receiving objects: 100%
 exit code: 1

This VCS root is listed in Administration - Diagnostics - VCS status with duration > 600s.

Note that the initial clone works fine. It fails once there is a new commit: TC can't check for or fetch the new changes. This git repository is ~3 GB, while the others are < 1.2 GB.

Please provide the teamcity-vcs.log from the TeamCity server covering the failed attempt to collect changes.

Please find below the relevant part of the error in the VCS log. Note that I have set up a second TC server to reproduce the issue, but there everything works fine (i.e. no errors; it automatically picks up new commits).

[2016-02-23 09:15:01,990]   WARN [l executor 2073] -
jetbrains.buildServer.VCS -
Error occurred on attempt to collect changes for VCS root "Sandbox" {instance id=2472, parent internal id=409, parent id=Sandbox_Sandbox, description: "https://.../project.git#refs/heads/master"}
from state RepositoryState{defaultBranch='refs/heads/master', revisions={refs/heads/master: c21a73378617c1850a0c69abceb41c20d831d3be}}
  to state RepositoryState{defaultBranch='refs/heads/master', revisions={refs/heads/master: c21a73378617c1850a0c69abceb41c20d831d3be}}
by combined checkout rule: +:=> (jetbrains.buildServer.vcs.VcsException: 'git fetch' command failed.
stderr: Connection reset
stdout: Receiving objects:         1%
Receiving objects:         2%
Receiving objects:         3%
Receiving objects:         4%
Receiving objects:         5%
Receiving objects:         6%
Receiving objects:         7%
Receiving objects:         8%
Receiving objects:         9%
Receiving objects:        10%
Receiving objects:        11%
Receiving objects:       100%
exit code: 1)
[2016-02-23 09:15:01,990]   WARN [l executor 2073] -
cs.ConnectionStateReporterImpl -
Unable to collect changes for [Sandbox :: Test : Git {id=Sandbox_TestGit, internal id=bt5880}]: jetbrains.buildServer.vcs.VcsRootVcsException:
Error collecting changes for VCS repository '"Sandbox" {instance id=2472, parent internal id=409, parent id=Sandbox_Sandbox, description: "https://.../project.git#refs/heads/master"}'
'git fetch' command failed.
stderr: Connection reset
stdout: Receiving objects:         1%
Receiving objects:         2%
Receiving objects:         3%
Receiving objects:         4%
Receiving objects:         5%
Receiving objects:         6%
Receiving objects:         7%
Receiving objects:         8%
Receiving objects:         9%
Receiving objects:        10%
Receiving objects:        11%
Receiving objects:       100%
exit code: 1
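
As a reading aid for the excerpt above, here is a minimal Python sketch that pulls the 'Receiving objects' percentages out of the captured stdout and reports the last value reached before the output jumps ahead. The function names and the sample text are illustrative and not part of TeamCity; the regular expression assumes progress lines shaped like the ones quoted in this thread.

```python
import re

# Matches progress lines such as "Receiving objects:        11%".
PROGRESS = re.compile(r"Receiving objects:\s*(\d+)%")


def receiving_progress(text):
    """Return the 'Receiving objects' percentages in order of appearance."""
    return [int(m.group(1)) for m in PROGRESS.finditer(text)]


def last_progress_before_jump(text):
    """Return the last percentage seen before a non-consecutive jump, or None."""
    values = receiving_progress(text)
    for prev, cur in zip(values, values[1:]):
        if cur - prev > 1:
            return prev
    return None


if __name__ == "__main__":
    # Sample shaped like the log excerpt: 1% .. 11%, then a final 100% line.
    sample = "\n".join(
        ["stdout: Receiving objects:         1%"]
        + [f"Receiving objects:        {n:2d}%" for n in range(2, 12)]
        + ["Receiving objects:       100%"]
    )
    print(last_progress_before_jump(sample))  # prints 11
```

In the two excerpts quoted in this thread the progress stops at 8% and 11% before a final 100% line. That final line may simply be how progress is reported when the fetch aborts; this is an interpretation, not something the thread confirms.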

Please try running 'git gc' in the repository clone on the TeamCity server machine. You can find it inside the <TeamCity data dir>/system/caches/git directory; the <TeamCity data dir>/system/caches/git/map file contains the mapping between repository URL and clone directory. Let me know if running 'git gc' helps.
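
To locate the right clone directory, the map file can be searched for the repository URL. The following is a minimal Python sketch under one assumption that should be checked against your own file first: each line of the map file has the form '<repository url> = <clone dir name>'. The function names and the command-line usage are hypothetical helpers, not a TeamCity tool.

```python
from pathlib import Path


def parse_map(text):
    """Parse lines assumed to look like 'url = dir' into (url, dir) pairs."""
    entries = []
    for line in text.splitlines():
        # Split on the last ' = ' so URLs that contain '=' stay intact.
        url, sep, clone_dir = line.rpartition(" = ")
        if sep:
            entries.append((url.strip(), clone_dir.strip()))
    return entries


def find_clone_dirs(entries, needle):
    """Return clone dirs whose URL contains `needle`, ignoring case."""
    needle = needle.lower()
    return [clone_dir for url, clone_dir in entries if needle in url.lower()]


if __name__ == "__main__":
    import sys

    # Usage: python find_clone.py <path to map file> <part of repository URL>
    if len(sys.argv) == 3:
        entries = parse_map(Path(sys.argv[1]).read_text())
        for clone_dir in find_clone_dirs(entries, sys.argv[2]):
            print(clone_dir)
```

The script only reads the map file. 'git gc' is then run manually inside the directory it prints.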

MINGW64 /d/.BuildServer/system/caches/git/git-0962C94D.git (BARE:master)
$ git gc
Nothing new to pack.

Did not help.
FYI, size does not seem to be the cause, because I now get this error on another newly added git repository of only 360 MB. This VCS root was created 2 days ago and nothing has been triggered yet. Now it suddenly shows the red error message 'error collecting changes... git fetch command failed'.

Any other ideas? Would cleaning the git cache help? If there are time-consuming tasks to try out, I would prefer to do them over the weekend.

So far it looks like your git server closes the connection before all data is downloaded. Please check whether your git hosting logs contain any clue as to why this happens. If it is possible in your setup, try using a different protocol for git roots, e.g. git:// or ssh:// instead of http.
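
One way to gather more clues is to repeat the fetch by hand on the TeamCity server machine and record how long it runs before the connection drops, which can then be compared with any timeout configured on the git server or a proxy. The Python sketch below does this with the command-line git client and git's GIT_TRACE and GIT_CURL_VERBOSE environment variables. The helper names, the default ref, and the idea of running it in a scratch repository are assumptions made for illustration. TeamCity may not perform its fetch the same way as the command-line client, so a successful run here does not rule out a server-side cause.

```python
import os
import subprocess
import time


def fetch_command(url, ref="refs/heads/master"):
    """Build a 'git fetch' command line for one ref of the given URL."""
    return ["git", "fetch", "--progress", url, ref]


def trace_env(base=None):
    """Copy the environment and switch on git's own tracing variables."""
    env = dict(os.environ if base is None else base)
    env["GIT_TRACE"] = "1"          # general trace messages on stderr
    env["GIT_CURL_VERBOSE"] = "1"   # HTTP request/response details on stderr
    return env


def timed_fetch(repo_dir, url, ref="refs/heads/master"):
    """Run the fetch in repo_dir; return (exit code, seconds elapsed, stderr)."""
    start = time.monotonic()
    result = subprocess.run(
        fetch_command(url, ref),
        cwd=repo_dir,
        env=trace_env(),
        capture_output=True,
        text=True,
    )
    return result.returncode, time.monotonic() - start, result.stderr


if __name__ == "__main__":
    import sys

    # Usage: python timed_fetch.py <scratch repository dir> <repository URL>
    if len(sys.argv) == 3:
        code, seconds, stderr = timed_fetch(sys.argv[1], sys.argv[2])
        print(f"exit code {code} after {seconds:.1f}s")
        print(stderr[-2000:])  # tail of the trace output
```

If the elapsed time at the moment of failure is close to a round number such as a configured timeout, that is a hint worth following up in the git server or proxy settings.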

Our git server is not configured to be used with other protocols. The TC test installation on a VM is still working fine with this git URL. By the way, the map file in the git cache contains several entries pointing to this git repository which differ only by case or by user name. For many of them the corresponding folder no longer exists in the git cache. Can we exclude the git cache as a possible cause of this issue?
Anything else I could try out (git properties)? Again, if it needs a TC service restart then the weekend is the only time I can try such changes.
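
The observation about the map file can be checked systematically. The Python sketch below lists entries whose clone directory is missing and groups entries that differ only by case or user name. It assumes each map line has the form '<repository url> = <clone dir name>', which should be verified against the actual file; all function names are hypothetical. It only reports and does not delete or modify anything.

```python
from pathlib import Path
from urllib.parse import urlsplit


def parse_map(text):
    """Parse lines assumed to look like 'url = dir' into (url, dir) pairs."""
    entries = []
    for line in text.splitlines():
        url, sep, clone_dir = line.rpartition(" = ")
        if sep:
            entries.append((url.strip(), clone_dir.strip()))
    return entries


def normalize(url):
    """Lower-case the URL and drop any 'user@' part so variants compare equal."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://{host}{port}{parts.path}".lower()


def group_variants(entries):
    """Map each normalized URL to the clone dirs of all its variants."""
    groups = {}
    for url, clone_dir in entries:
        groups.setdefault(normalize(url), []).append(clone_dir)
    return groups


def stale_entries(entries, cache_dir):
    """Return (url, dir) pairs whose clone dir is missing under cache_dir."""
    cache_dir = Path(cache_dir)
    return [(u, d) for u, d in entries if not (cache_dir / d).is_dir()]


if __name__ == "__main__":
    import sys

    # Usage: python check_map.py <path to the git cache directory>
    if len(sys.argv) == 2:
        cache = Path(sys.argv[1])
        entries = parse_map((cache / "map").read_text())
        for url, clone_dir in stale_entries(entries, cache):
            print(f"missing: {clone_dir}  ({url})")
        for key, dirs in group_variants(entries).items():
            if len(dirs) > 1:
                print(f"variants of {key}: {', '.join(dirs)}")
```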

I don't think redundant entries in the map file can cause connection resets. Changing git options is also unlikely to help, since the git server closes the connection. Are there any differences in network settings between the test and production installations? Please check your git server logs: do they contain any clues as to why the connection is closed?

The network settings are the same. The only difference is that our production environment has been running for around seven years and has been upgraded from TC 6 to 9.1.6, whereas a freshly installed TC 9.1.6 does not show these errors. The logs on our Bitbucket server do not show any failures. I just tried with another git repository of 1.8 GB and saw exactly the same behavior: the clone works, but after the first commit the connection error appears when collecting changes. Currently the teams are blocked. Any help is appreciated.

Please check whether switching to JDK HTTP connections fixes the problem. To do that, add the internal property

teamcity.git.httpConnectionFactory=jdk

and restart TeamCity.

 

Still the same issue, even with this property set.
