Agents have finished their builds, but the server is still waiting; can't kill the service either. How do I troubleshoot this?

This may be related to TeamCity 8.1.2 (Windows server 2008 R2); once in a while, TeamCity server cannot seem to get build info from the agents.  There are no commonalities

Symptoms:

1.  TeamCity web will show that a build is going on and on on the agent; however on the agents, I look at the logs and the build has finished! But the server is still waiting and shows that the build is now going into overtime.  The logs are partially completed and you can see more output is expected but nothing appears.

2.  Cannot kill the TeamCity service.  Normally I would restart the service but the service hangs as well, in the 'stopping' state.  I have to go into Process Explorer and kill the java processes myself.

On the agent logs (teamcity-agent.log) I see this repeating itself

[2014-05-07 11:12:52,091]   INFO - r.agent.impl.AgentLogProxyImpl - Ping OK and registered on the server again: 32894
[2014-05-07 11:12:52,091]   INFO - r.agent.impl.AgentLogProxyImpl - Pending remote commands count: 11
[2014-05-07 11:12:57,521]   INFO -    jetbrains.buildServer.AGENT - Done: Pre-processing artifacts if needed (archiving)
[2014-05-07 11:12:57,521]   INFO - ages.BuildFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.finishStages.WaitForArtifactsPublishedFStage
[2014-05-07 11:17:52,305]   INFO - r.agent.impl.AgentLogProxyImpl - Failed to perform remote command 100 log messages for build 32894, error: jetbrains.buildServer.xmlrpc.RemoteCallException: Call http://TCServer.internaldomain/RPC2 buildServer.log: org.apache.xmlrpc.XmlRpcException: jetbrains.buildServer.TeamCityRuntimeException: Timeout exceeded while waiting for messages processing
[2014-05-07 11:17:52,305]   INFO - r.agent.impl.AgentLogProxyImpl - Adding ping and re-register command on top of the remote command queue
[2014-05-07 11:17:57,539]   INFO - buildServer.AGENT.registration - Registering on server http://TCServer.internaldomain, AgentDetails{Name='TCAgent.internaldomain', AgentId=6, BuildId=32894, AgentOwnAddress='null', AlternativeAddresses=[10.55.109.166], Port=81, Version='29993', PluginsVersion='29993-md5-6046f906792a5ced1814676fc926e712', AvailableRunners=[Ant, dotnet-tools-dupfinder, dotnet-tools-inspectcode, Duplicator, FxCop, gradle-runner, Inspection, Ipr, jb.nuget.installer, jb.nuget.pack, jb.nuget.publish, jetbrains.dotNetGenericRunner, jetbrains.mspec, jetbrains_powershell, JPS, Maven2, MSBuild, MSTest, NAnt, NUnit, rake-runner, simpleRunner, sln2003, VS.Solution], AvailableVcs=[perforce, mercurial, jetbrains.git, tfs, svn, cvs], AuthorizationToken='a2a0473441c33f89a9a077afeec31b45', PingCode='c8QA4wCUP7tLfMwIO43eagdlkU2ne1wv'}
[2014-05-07 11:17:57,760]   INFO - buildServer.AGENT.registration - Server confirmed we are still registered
[2014-05-07 11:17:57,761]   INFO - r.agent.impl.AgentLogProxyImpl - Ping OK and registered on the server again: 32894
[2014-05-07 11:17:57,761]   INFO - r.agent.impl.AgentLogProxyImpl - Pending remote commands count: 21
[2014-05-07 11:22:57,947]   INFO - r.agent.impl.AgentLogProxyImpl - Failed to perform remote command 100 log messages for build 32894, error: jetbrains.buildServer.xmlrpc.RemoteCallException: Call http://TCServer.internaldomain/RPC2 buildServer.log: org.apache.xmlrpc.XmlRpcException: jetbrains.buildServer.TeamCityRuntimeException: Timeout exceeded while waiting for messages processing
[2014-05-07 11:22:57,948]   INFO - r.agent.impl.AgentLogProxyImpl - Adding ping and re-register command on top of the remote command queue
[2014-05-07 11:23:03,161]   INFO - buildServer.AGENT.registration - Registering on server http://TCServer.internaldomain, AgentDetails{Name='TCAgent.internaldomain', AgentId=6, BuildId=32894, AgentOwnAddress='null', AlternativeAddresses=[10.55.109.166], Port=81, Version='29993', PluginsVersion='29993-md5-6046f906792a5ced1814676fc926e712', AvailableRunners=[Ant, dotnet-tools-dupfinder, dotnet-tools-inspectcode, Duplicator, FxCop, gradle-runner, Inspection, Ipr, jb.nuget.installer, jb.nuget.pack, jb.nuget.publish, jetbrains.dotNetGenericRunner, jetbrains.mspec, jetbrains_powershell, JPS, Maven2, MSBuild, MSTest, NAnt, NUnit, rake-runner, simpleRunner, sln2003, VS.Solution], AvailableVcs=[perforce, mercurial, jetbrains.git, tfs, svn, cvs], AuthorizationToken='a2a0473441c33f89a9a077afeec31b45', PingCode='c8QA4wCUP7tLfMwIO43eagdlkU2ne1wv'}
[2014-05-07 11:23:03,380]   INFO - buildServer.AGENT.registration - Server confirmed we are still registered
[2014-05-07 11:23:03,381]   INFO - r.agent.impl.AgentLogProxyImpl - Ping OK and registered on the server again: 32894
[2014-05-07 11:23:03,381]   INFO - r.agent.impl.AgentLogProxyImpl - Pending remote commands count: 21
[2014-05-07 11:28:03,563]   INFO - r.agent.impl.AgentLogProxyImpl - Failed to perform remote command 100 log messages for build 32894, error: jetbrains.buildServer.xmlrpc.RemoteCallException: Call http://TCServer.internaldomain/RPC2 buildServer.log: org.apache.xmlrpc.XmlRpcException: jetbrains.buildServer.TeamCityRuntimeException: Timeout exceeded while waiting for messages processing
[2014-05-07 11:28:03,563]   INFO - r.agent.impl.AgentLogProxyImpl - Adding ping and re-register command on top of the remote command queue
[2014-05-07 11:28:08,776]   INFO - buildServer.AGENT.registration - Registering on server http://TCServer.internaldomain, AgentDetails{Name='TCAgent.internaldomain', AgentId=6, BuildId=32894, AgentOwnAddress='null', AlternativeAddresses=[10.55.109.166], Port=81, Version='29993', PluginsVersion='29993-md5-6046f906792a5ced1814676fc926e712', AvailableRunners=[Ant, dotnet-tools-dupfinder, dotnet-tools-inspectcode, Duplicator, FxCop, gradle-runner, Inspection, Ipr, jb.nuget.installer, jb.nuget.pack, jb.nuget.publish, jetbrains.dotNetGenericRunner, jetbrains.mspec, jetbrains_powershell, JPS, Maven2, MSBuild, MSTest, NAnt, NUnit, rake-runner, simpleRunner, sln2003, VS.Solution], AvailableVcs=[perforce, mercurial, jetbrains.git, tfs, svn, cvs], AuthorizationToken='a2a0473441c33f89a9a077afeec31b45', PingCode='c8QA4wCUP7tLfMwIO43eagdlkU2ne1wv'}
[2014-05-07 11:28:08,996]   INFO - buildServer.AGENT.registration - Server confirmed we are still registered
[2014-05-07 11:28:08,996]   INFO - r.agent.impl.AgentLogProxyImpl - Ping OK and registered on the server again: 32894
[2014-05-07 11:28:08,996]   INFO - r.agent.impl.AgentLogProxyImpl - Pending remote commands count: 21


This is on all agent logs, even the agent which is on the same machine as the server.  

On the TeamCity server itself, I looked at "teamcity-server.log" but I don't see anything unusual at those times.  Just a few of these

[2014-05-07 11:16:48,183]   INFO -   jetbrains.buildServer.SERVER - Agent description updated for agent: TCAgent {id=1, host=127.0.0.1:9090, agentTypeId=1, pool=Default}
[2014-05-07 11:17:06,750]   INFO -   jetbrains.buildServer.SERVER - Agent description updated for agent: TCAgent {id=1, host=127.0.0.1:9090, agentTypeId=1, pool=Default}
[2014-05-07 11:18:01,743]   INFO -   jetbrains.buildServer.SERVER - Agent description updated for agent: TCAgent {id=1, host=127.0.0.1:9090, agentTypeId=1, pool=Default}
[2014-05-07 11:19:04,637]   INFO -   jetbrains.buildServer.SERVER - Agent description updated for agent: TCAgent2 {id=5, host=10.55.109.167:8090, agentTypeId=5, pool=Default}
[2014-05-07 11:19:21,088]   INFO -   jetbrains.buildServer.SERVER - Agent description updated for agent: TCAgent2 {id=5, host=10.55.109.167:8090, agentTypeId=5, pool=Default}
[2014-05-07 11:19:31,857]   INFO -   jetbrains.buildServer.SERVER - Agent description updated for agent: TCAgent2 {id=5, host=10.55.109.167:8090, agentTypeId=5, pool=Default}
[2014-05-07 11:20:03,275]   INFO -   jetbrains.buildServer.SERVER - Agent description updated for agent: TCAgent2 {id=5, host=10.55.109.167:8090, agentTypeId=5, pool=Default}



I am a bit stuck because I cannot see what is wrong, but these hangups seem to be happening randomly.  We've only noticed this problem since upgrading to 8.1.2 but don't know if it's related specifically to this version.  

Is there anywhere else I could be looking? Any suggestions on how to fix this would also be helpful

Thanks

2 comments
Comment actions Permalink

Please attach full teamcity-agent.log.

0
Comment actions Permalink

hi, im facing the same issue. The work is done, and the status code is 0, but then, the server still keeps waiting for the job to complete, and goes into hanging state. I have given failure conditions as well. (not more then 15minutes), but even then the build hangs.

 

Here is the teamcity-agent.log snapshot:
-----------------------------------------
[2018-08-10 10:02:02,088] INFO - nner2.OsProcessHandlerListener - /Users/builder/Desktop/buildAgent2/temp/agentTmp/custom_script5594973487058699706
[2018-08-10 10:02:02,377] INFO - ildStages.RunnerStagesExecutor - Call stage jetbrains.buildServer.agent.impl.buildStages.runnerStages.start.AgentBuildStepStatusFixer
[2018-08-10 10:02:02,378] INFO - ges.RunnerFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.runnerStages.finish.UnsubscribePropertiesFileUpdaterRunnedFStage
[2018-08-10 10:02:02,378] INFO - ges.RunnerFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.runnerStages.finish.FlushBuildLogRunnerFStage
[2018-08-10 10:03:50,946] INFO - r.agent.impl.AgentLogProxyImpl - Failed to perform remote command 'Publishing 1 files to '.teamcity/properties'' for build with id 12479804, error: jetbrains.buildServer.agent.ArtifactPreprocessingFailedException: Error while sending OPTIONS request: jetbrains.buildServer.agent.ArtifactPreprocessingFailedException: Error while sending OPTIONS request (enable debug to see stacktrace)
[2018-08-10 10:03:50,947] INFO - r.agent.impl.AgentLogProxyImpl - Adding ping and re-register command on top of the remote command queue

 

INFO - buildServer.AGENT.registration - Server supports the following communication protocols: [polling]
[2018-08-10 10:03:51,365] INFO - buildServer.AGENT.registration - Trying to register on server using 'polling' protocol.
[2018-08-10 10:03:51,366] INFO - r.agent.impl.AgentLogProxyImpl - Failed to perform remote command 'pingAndReregister' for build with id 12479804, error: java.lang.Exception: Unable to register on server, pingable: java.lang.Exception: Unable to register on server, pingable (enable debug to see stacktrace)
[2018-08-10 10:03:51,367] INFO - ges.RunnerFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.runnerStages.finish.FireRunnerFinishedFStage
[2018-08-10 10:03:51,377] INFO - mpl$TeamCitySpringSubContainer - Closing SubContext: running build runner context id=12479804 runner 1 of 3: startup date [Fri Aug 10 10:02:01 IST 2018]; parent: SubContext: running build context id=12479804
[2018-08-10 10:03:51,382] INFO - ages.BuildFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.finishStages.FlushBuildLogFStage
[2018-08-10 10:03:51,581] INFO - r.agent.impl.AgentLogProxyImpl - Failed to perform remote command 'pingAndReregister' for build with id 12479804, error: java.lang.Exception: Unable to register on server, pingable: java.lang.Exception: Unable to register on server, pingable (enable debug to see stacktrace)
[2018-08-10 10:03:51,581] INFO - ages.BuildFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.finishStages.FireBeforeBuildFinishFStage
[2018-08-10 10:03:51,589] INFO - ages.BuildFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.finishStages.PublishBuildPropertiesFStage
[2018-08-10 10:03:51,606] INFO - ernalArtifactsProcessingLogger - Publishing artifacts process started
[2018-08-10 10:03:51,606] INFO - ernalArtifactsProcessingLogger - Start: Searching for artifacts to publish in [/Users/builder/Desktop/buildAgent2/temp/agentTmp/build.finish.properties.gz=>.teamcity/properties]
[2018-08-10 10:03:51,607] INFO - ernalArtifactsProcessingLogger - Done: Searching for artifacts to publish, found [1] file
[2018-08-10 10:03:51,607] INFO - ernalArtifactsProcessingLogger - Start: Pre-processing artifacts if needed (archiving)
[2018-08-10 10:03:51,607] INFO - ernalArtifactsProcessingLogger - Done: Pre-processing artifacts if needed (archiving)
[2018-08-10 10:03:51,607] INFO - ages.BuildFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.finishStages.PublishArtifactsFStage
[2018-08-10 10:03:51,607] INFO - ernalArtifactsProcessingLogger - Publishing artifacts process started
[2018-08-10 10:03:51,607] INFO - ernalArtifactsProcessingLogger - Start: Searching for artifacts to publish in [teamcity-info.xml]
[2018-08-10 10:03:51,608] INFO - jetbrains.buildServer.AGENT - Internal artifact path teamcity-info.xml not found
[2018-08-10 10:03:51,608] INFO - ernalArtifactsProcessingLogger - Done: Searching for artifacts to publish, found [0] files
[2018-08-10 10:03:51,608] INFO - ernalArtifactsProcessingLogger - Start: Pre-processing artifacts if needed (archiving)
[2018-08-10 10:03:51,608] INFO - ernalArtifactsProcessingLogger - Done: Pre-processing artifacts if needed (archiving)
[2018-08-10 10:03:51,608] INFO - ages.BuildFinishStagesExecutor - Call finish stage jetbrains.buildServer.agent.impl.buildStages.finishStages.WaitForArtifactsPublishedFStage
[2018-08-10 10:03:51,822] INFO - r.agent.impl.AgentLogProxyImpl - Failed to perform remote command 'pingAndReregister' for build with id 12479804, error: java.lang.Exception: Unable to register on server, pingable: java.lang.Exception: Unable to register on server, pingable

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------

teamcity-build.log:

WARN - % Total % Received % Xferd Average Speed Time Time Time Current
[2018-08-10 10:02:02,255] WARN - Dload Upload Total Spent Left Speed
[2018-08-10 10:02:02,372] WARN -
[2018-08-10 10:02:02,374] WARN - 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
[2018-08-10 10:02:02,374] WARN - 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
[2018-08-10 10:02:02,376] INFO - Process exited with code 0

0

Please sign in to leave a comment.