BuildAgent Communication Issues v8.0.4

Over the weekend I noticed that a build had been hung for 63 hours. I tried stopping the build, but to no avail. I restarted all 3 of the build agents, but the same communication issues keep happening. This was originally on 8.0.3. To try to fix the issue, I went ahead and updated the server to 8.0.4. After doing so, I started each of the build agents, watched them unregister because of the new version, and then update themselves. The same thing is still happening. When I kick off a build, it shows the "Running..." status for a long time. If I look at the build agent, it doesn't appear that anything is going on, or that it even received notice that it was supposed to do a build.

My question is this: where do I start to see why the TeamCity Server is unable to communicate with its build agents? Nothing is jumping out at me in the logs except for this (in the console of the build agent):

[2013-09-30 13:30:25,929]   WARN -       org.apache.xmlrpc.XmlRpc - java.net.SocketTimeoutException: Read timed out

From the build server, I can ping each of the individual agents and vice versa, so they are able to see each other. Any help would be greatly appreciated.

Levi

Update:

This was in the teamcity-agent.log as well:

[2013-09-30 13:35:29,085]   WARN - buildServer.AGENT.registration - Call http://vm-tcserver:8080/RPC2 buildServer.registerAgent3: java.net.SocketTimeoutException: Read timed out
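
A successful ping only shows that the hosts can reach each other; it says nothing about whether the server answers HTTP on port 8080. A "Read timed out" usually means the connection was opened but no reply came back in time. The following is a rough Python sketch, not anything shipped with TeamCity, that tells those two cases apart. The function name `probe_http` and the 5-second default timeout are made up for illustration; the host, port and path in the example comment come from the log line above.

```python
import socket


def probe_http(host, port, path="/", timeout=5.0):
    """Return 'connect_failed', 'read_timeout', 'closed_without_reply',
    or the first line of the HTTP response."""
    try:
        # Step 1: can we open a TCP connection at all?
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect_failed"

    with sock:
        sock.settimeout(timeout)
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        try:
            # Step 2: does the server answer once we send a request?
            sock.sendall(request.encode("ascii"))
            data = sock.recv(1024)
        except socket.timeout:
            return "read_timeout"
        except OSError:
            return "connect_failed"

    if not data:
        return "closed_without_reply"
    # Only the status line is of interest, e.g. "HTTP/1.1 200 OK".
    return data.split(b"\r\n", 1)[0].decode("latin-1")


# Example (run from an agent machine):
#   print(probe_http("vm-tcserver", 8080, "/RPC2"))
```

If this reports `read_timeout` from the agent machine, the network path is fine and the server process itself is not responding, which points at the server side rather than at the agents.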

3 comments

I still cannot figure out what the issue is. It seems to be on the TC Server side, though. I pointed the build agent to another TC Server installation, and it connected right away. When I try stopping the TC Server, it hangs.


So, I think I've gotten to a point where it will process builds now. I had a backup from 6 days ago, and I went ahead and restored it into v8.0.4. After the restore, everything seems to be working. I'm not sure how this makes sense, but it seems like any time a build was queued in a project that had previously had a hanging build, nothing worked. Restoring the database to a state that didn't have any running (or hanging) builds seems to have resolved it. There may not be a correlation, but I cannot think of anything else.


False alarm. We are using a WebHooks plugin (http://netwolfuk.wordpress.com/teamcity-plugins/tcwebhooks/) in TeamCity. This has worked forever, but recently someone tried adding multiple WebHook URLs pointing to internal endpoints. We think this is what is hanging up the build process. I'm not sure if it's because we have multiple endpoints configured with that plugin or if it's the internal handlers for those WebHooks. Either way, it seems NOT to be an issue with TC itself, but with either the plugin or the internal WebHook handlers.
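
One way to narrow down which endpoint is the slow one is to call each configured WebHook URL by hand with a short timeout. The sketch below is only an illustration in Python and is not how the tcWebHooks plugin works internally; the function name `check_endpoint`, the empty JSON body and the example URLs are placeholders. It assumes the handlers accept a plain POST, so only point it at endpoints where a test request is harmless.

```python
import socket
import urllib.error
import urllib.request


def check_endpoint(url, timeout=5.0):
    """POST a tiny JSON body to url and classify the outcome as
    'ok', 'http_error', 'timeout' or 'unreachable'."""
    request = urllib.request.Request(
        url,
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError:
        # The endpoint answered, just not with a success status.
        return "http_error"
    except socket.timeout:
        # Connected, but no response arrived within the timeout.
        return "timeout"
    except urllib.error.URLError as exc:
        # A timeout while connecting is wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return "unreachable"
    except OSError:
        return "unreachable"


# Example with placeholder URLs (replace with the configured WebHook URLs):
#   for url in ["http://internal-host-a/hook", "http://internal-host-b/hook"]:
#       print(url, check_endpoint(url, timeout=3.0))
```

Any endpoint that comes back as `timeout` accepts the connection but never replies, which is the kind of handler that could stall a caller waiting on it without a timeout of its own.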
