Agents not reconnecting after server/service shutdown

We're evaluating TeamCity and have run into an issue that we've been unable to troubleshoot, and searching the internet hasn't turned up an obvious answer. After the server or service shuts down (we haven't been able to pin down the exact steps to reproduce it, but that appears to be when it happens), the agents can no longer connect. This has happened three times now without any simple resolution (we've started from scratch twice and upgraded once). Upgrading from 9.1.7 to 10.0 did fix the issue (we'd hoped permanently), but it has happened again. We're running Windows Server 2012 R2 on vSphere.

These are the relevant (I think!) logs from the agent:

[2016-08-11 13:18:51,490] DEBUG - jetbrains.buildServer.XMLRPC - <<< XML-RPC response <<<
[2016-08-11 13:18:51,490] DEBUG - jetbrains.buildServer.XMLRPC -
[2016-08-11 13:18:51,693] DEBUG - buildServer.AGENT.registration - jetbrains.buildServer.xmlrpc.RemoteCallException: Call 'http://teamcity-app2:8080/RPC2', method 'buildServer.registerAgent3' failed: org.apache.xmlrpc.XmlRpcClientException: Error decoding XML-RPC response
jetbrains.buildServer.xmlrpc.RemoteCallException: Call 'http://teamcity-app2:8080/RPC2', method 'buildServer.registerAgent3' failed: org.apache.xmlrpc.XmlRpcClientException: Error decoding XML-RPC response
at jetbrains.buildServer.xmlrpc.AbstractXmlRpcTarget.call(AbstractXmlRpcTarget.java:86)
at jetbrains.buildServer.agent.impl.ServerXmlRpcProxy.registerAgent3(ServerXmlRpcProxy.java:80)
at jetbrains.buildServer.agent.impl.serverCommunication.XmlRpcProtocol.registerOnServer(XmlRpcProtocol.java:31)
at jetbrains.buildServer.agent.impl.BuildAgentImpl.doRegisterOnBuildServer(BuildAgentImpl.java:980)
at jetbrains.buildServer.agent.impl.BuildAgentImpl.registerOnBuildServer(BuildAgentImpl.java:943)
at jetbrains.buildServer.agent.impl.ServerMonitor.run(ServerMonitor.java:73)
Caused by: org.apache.xmlrpc.XmlRpcClientException: Error decoding XML-RPC response
at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:80)
at org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
at org.apache.xmlrpc.TCXmlRpcClient$1.execute(TCXmlRpcClient.java:84)
at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:194)
at jetbrains.buildServer.xmlrpc.impl.CommonsXmlRpcTargetImpl$1.execute(CommonsXmlRpcTargetImpl.java:72)
at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:178)
at jetbrains.buildServer.xmlrpc.AbstractXmlRpcTarget.call(AbstractXmlRpcTarget.java:82)
... 5 more
Caused by: org.xml.sax.SAXParseException; Premature end of file.
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at jetbrains.buildServer.TCNonValidatedSAXParser.parse(TCNonValidatedSAXParser.java:57)
at org.apache.xmlrpc.XmlRpc.parse(XmlRpc.java:472)
at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:68)
... 11 more
[2016-08-11 13:18:51,693] INFO - buildServer.AGENT.registration - Registration using 'xml-rpc' failed: jetbrains.buildServer.xmlrpc.RemoteCallException: Call 'http://teamcity-app2:8080/RPC2', method 'buildServer.registerAgent3' failed: org.apache.xmlrpc.XmlRpcClientException: Error decoding XML-RPC response
[2016-08-11 13:18:51,693] DEBUG - buildServer.AGENT.registration - Registration using 'xml-rpc' failed
jetbrains.buildServer.xmlrpc.RemoteCallException: Call 'http://teamcity-app2:8080/RPC2', method 'buildServer.registerAgent3' failed: org.apache.xmlrpc.XmlRpcClientException: Error decoding XML-RPC response
at jetbrains.buildServer.xmlrpc.AbstractXmlRpcTarget.call(AbstractXmlRpcTarget.java:86)
at jetbrains.buildServer.agent.impl.ServerXmlRpcProxy.registerAgent3(ServerXmlRpcProxy.java:80)
at jetbrains.buildServer.agent.impl.serverCommunication.XmlRpcProtocol.registerOnServer(XmlRpcProtocol.java:31)
at jetbrains.buildServer.agent.impl.BuildAgentImpl.doRegisterOnBuildServer(BuildAgentImpl.java:980)
at jetbrains.buildServer.agent.impl.BuildAgentImpl.registerOnBuildServer(BuildAgentImpl.java:943)
at jetbrains.buildServer.agent.impl.ServerMonitor.run(ServerMonitor.java:73)
Caused by: org.apache.xmlrpc.XmlRpcClientException: Error decoding XML-RPC response
at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:80)
at org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
at org.apache.xmlrpc.TCXmlRpcClient$1.execute(TCXmlRpcClient.java:84)
at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:194)
at jetbrains.buildServer.xmlrpc.impl.CommonsXmlRpcTargetImpl$1.execute(CommonsXmlRpcTargetImpl.java:72)
at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:178)
at jetbrains.buildServer.xmlrpc.AbstractXmlRpcTarget.call(AbstractXmlRpcTarget.java:82)
... 5 more
Caused by: org.xml.sax.SAXParseException; Premature end of file.
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at jetbrains.buildServer.TCNonValidatedSAXParser.parse(TCNonValidatedSAXParser.java:57)
at org.apache.xmlrpc.XmlRpc.parse(XmlRpc.java:472)
at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:68)
... 11 more
[2016-08-11 13:18:51,693] WARN - buildServer.AGENT.registration - Registration using all supported protocols failed
[2016-08-11 13:18:51,693] WARN - buildServer.AGENT.registration - Connection to TeamCity server is probably lost. Will be trying to restore it.

And from the Server:

[2016-08-11 13:24:15,294] DEBUG - ldServer.AGENT.PollingProtocol - Request from Agent is received. Path: /agents/v1/register, session: null
[2016-08-11 13:24:15,884] DEBUG - ains.buildServer.util.WatchDog - Vcs Monitor 1 start: 0 msec
[2016-08-11 13:24:15,884] DEBUG - ains.buildServer.util.WatchDog - Vcs Monitor 1 done: 0 msec
[2016-08-11 13:24:15,897] ERROR - jetbrains.buildServer.SERVER - Error java.net.SocketTimeoutException while processing request: POST '/runtimeError.html', from client :11027, user-agent Jakarta Commons-HttpClient/3.1, no associated user
java.net.SocketTimeoutException
at org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:202)
at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:246)
at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:227)
at org.apache.coyote.http11.InternalNioInputBuffer.readSocket(InternalNioInputBuffer.java:427)
at org.apache.coyote.http11.InternalNioInputBuffer.fill(InternalNioInputBuffer.java:799)
at org.apache.coyote.http11.InternalNioInputBuffer$SocketInputBuffer.doRead(InternalNioInputBuffer.java:824)
at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:137)
at org.apache.coyote.http11.AbstractInputBuffer.doRead(AbstractInputBuffer.java:339)
at org.apache.coyote.Request.doRead(Request.java:438)
at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:290)
at org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:449)
at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:315)
at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:167)
at com.intellij.openapi.util.io.StreamUtil.copyStreamContent(StreamUtil.java:34)
at com.intellij.openapi.util.io.StreamUtil.loadFromStream(StreamUtil.java:44)
at com.intellij.openapi.util.io.StreamUtil.readText(StreamUtil.java:53)
at jetbrains.buildServer.controllers.agentServer.AgentPollingProtocolController.registerAgent(AgentPollingProtocolController.java:6)
at jetbrains.buildServer.controllers.agentServer.AgentPollingProtocolController.dispatch(AgentPollingProtocolController.java:56)
at jetbrains.buildServer.controllers.agentServer.AgentPollingProtocolController.handleRequestInternal(AgentPollingProtocolController.java:31)
at org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:147)
at org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:50)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:961)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:895)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:967)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:869)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:650)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:843)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
at jetbrains.buildServer.maintenance.TeamCityDispatcherServlet.service(TeamCityDispatcherServlet.java:42)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at jetbrains.buildServer.web.DisableSessionIdFromUrlFilter.doFilter(DisableSessionIdFromUrlFilter.java:8)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.springframework.web.filter.CompositeFilter$VirtualFilterChain.doFilter(CompositeFilter.java:107)
at jetbrains.buildServer.diagnostic.web.DiagnosticFilter.doFilter(DiagnosticFilter.java:45)
at org.springframework.web.filter.CompositeFilter$VirtualFilterChain.doFilter(CompositeFilter.java:112)
at jetbrains.buildServer.web.DependencyParametersCalculationContextFilter.doFilter(DependencyParametersCalculationContextFilter.java:1)
at org.springframework.web.filter.CompositeFilter$VirtualFilterChain.doFilter(CompositeFilter.java:112)
at org.springframework.web.filter.CompositeFilter.doFilter(CompositeFilter.java:73)
at jetbrains.buildServer.web.DelegatingFilter.doFilter(DelegatingFilter.java:2)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at jetbrains.buildServer.web.ResponseFragmentFilter.doFilter(ResponseFragmentFilter.java:23)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:436)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:625)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1757)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1716)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
[2016-08-11 13:24:16,438] DEBUG - ains.buildServer.util.WatchDog - Ping 1 start: 0 msec
[2016-08-11 13:24:16,938] DEBUG - ains.buildServer.util.WatchDog - Ping 1 done: 500 msec
[2016-08-11 13:24:17,018] WARN - jetbrains.buildServer.SERVER - Timeout occurred while processing XML-RPC methodbuildServer.registerAgent3 call. Request details: POST '/RPC2', from client :11028, user-agent TeamCity Agent, no associated user.

Any ideas? Or any more information you'd need?

Thanks,

James
