Page Loading Extremely Slow, Intermittent Timeouts, Errors in logs
I'm not sure where to begin. I've been having lots of problems with extreme slowness ever since upgrading to 4.0. Loading any page takes a very long time. If a build is failing and I click on the link for that build, it can take minutes for the page to load. Builds are timing out sometimes after all of the tests are finished running. I have looked at the various logs and see errors but I don't know what they are all about.
For one build agent that times out, I get the following in the build logs before it times out:
In the teamcity-server.log I am constantly getting the error:
jetbrains.buildServer.SERVER - Failed to load finished build instance: Cannot find build promotion with ID 8187
In the build agent's error.log, I see the following, repeatedly:
[2009-01-07 15:54:47,158] WARN - org.apache.xmlrpc.XmlRpc - java.net.SocketTimeoutException: Read timed out
Haven't seen anything particularly interesting in the other logs.
I am currently using the internal database, have 3 build agents and the following environment variables set:
- TEAMCITY_AGENT_MEM_OPTS=-Xms512m -Xmx1024m -XX:MaxPermSize=256m
- TEAMCITY_SERVER_MEM_OPTS=-Xms512m -Xmx1024m -XX:MaxPermSize=256m
Help!
Please sign in to leave a comment.
First of all, I would recommend to upgrade to 4.0.1. Pages can open slowly for the first time if server compiles JSP. If this is not your case you can install profiler plugin: http://www.jetbrains.net/confluence/display/TW/Server+Profiling and send us profiling snapshot. Also please send us server logs.
As for build timeout, you can try to take a thread dump from a build while it is running. This can be done right from the build results page.
Finally, agent itself does not require so much memory. Minimum size of 512Mb is too high, I would suggest to reduce memory usage to maximum of 256Mb.
Pavel,
Thanks for your response and help. I upgraded to 4.0.1 and decreased the memory on the agents. The pages are still loading slowly after they have been compiled. Specifically, the build results page that has failed tests and the changes dropdown from the main project page are so slow its unusable. Whenever I try to load the build results page for a test run that has failing tests, I see the Failed to Load Build Instance error over 20 times in the logs. The first time I attempt to load the build results page for a run with failed tests for a project, I see the socket Read timed out exception in all of the agent's error.log. Sometimes I only get the error once before the page loads and other times it happens multiple times. I will install the profiler plugin next.
I will try to get a thread dump on the timeout but since it's not consistent I don't always know when it's going to happen.
--Melody
I guess that we need to see profiler snapshot to understand what happens. You can upload snapshot to ftp://ftp.intellij.net/.uploads and send name of the file to teamcity-feedback [at] jetbrains.com
I'm having trouble getting the profiler installed. Keep getting this error:
[YourKit Java Profiler 7.0.12] Cannot start: profiling must not be done via JVMPI on Java 5+; JVMTI should be used instead. I have no idea what that means but I am using jdk1.6.0_04.
How do you start your server? Did you specify -agentlib:yjpagent JVM argument?
Yes, I set the following environment variable: TEAMCITY_SERVER_OPTS=-agentlib:yjpagent. I was a little unclear about what the "Team City Home" directory was so I tried to put the .so file in the TeamCity/bin and in the ../TeamCity/bin (which is where the .BuildServer etc. are). I start my server with the following command: $HOME/TeamCity/bin/startup.sh. I have not been able to properly shut down my agents since I tried to install the profiler. I keep getting this error: jetbrains.buildServer.AGENT - Ping problem: Call http://teamcity.foliofn.com:8111/RPC2 buildServer.ping: java.net.ConnectException: Connection refused.
Could you please try to launch startup.sh by the following command:
sh -x startup.sh
and post here this command output.
Also please try to install latest JDK 6.
Here is the output.
-bash-3.00$ sh -x $HOME/TeamCity/bin/startup.sh
os400=false
darwin=false
+ uname
PRG=/home/teamcity/TeamCity/bin/startup.sh
+ [ -h /home/teamcity/TeamCity/bin/startup.sh ]
+ dirname /home/teamcity/TeamCity/bin/startup.sh
PRGDIR=/home/teamcity/TeamCity/bin
EXECUTABLE=catalina.sh
+ false
+ [ ! -x /home/teamcity/TeamCity/bin/catalina.sh ]
+ exec /home/teamcity/TeamCity/bin/catalina.sh start
Using CATALINA_BASE: /home/teamcity/TeamCity
Using CATALINA_HOME: /home/teamcity/TeamCity
Using CATALINA_TMPDIR: /home/teamcity/TeamCity/temp
Using JRE_HOME: /usr/jdk/jdk1.6.0_04
Well, I missed what script did you use to start TeamCity server. You should use teamcity-server.sh, otherwise parameters passed via TEAMCITY_SERVER_OPTS variable will not be used.
ok. did that but the server is not starting. Here is the output:
-bash-3.00$ sh -x teamcity-server.sh start
+ pwd
old_cwd=/home/teamcity/TeamCity/bin
+ dirname teamcity-server.sh
BIN=.
+ cd .
+ pwd
BIN=/home/teamcity/TeamCity/bin
+ mkdir ../logs
+ [ -Xms512m -Xmx1024m -XX:MaxPermSize=256m = ]
CATALINA_OPTS=-server -Xms512m -Xmx1024m -XX:MaxPermSize=256m -Dlog4j.configuration=file:/home/teamcity/TeamCity/conf/teamcity-server-log4j.xml -Dteamcity_logs=/home/teamcity/TeamCity/logs/ -Djava.awt.headless=true -agentlib:yjpagent -server -Xms512m -Xmx1024m -XX:MaxPermSize=256m -Dlog4j.configuration=file:/home/teamcity/TeamCity/bin/../conf/teamcity-server-log4j.xml -Dteamcity_logs=../logs/ -Djava.awt.headless=true
+ export CATALINA_OPTS
CATALINA_HOME=./..
CATALINA_BASE=./..
+ ./catalina.sh start
Using CATALINA_BASE: /home/teamcity/TeamCity
Using CATALINA_HOME: /home/teamcity/TeamCity
Using CATALINA_TMPDIR: /home/teamcity/TeamCity/temp
Using JRE_HOME: /usr/jdk/jdk1.6.0_04
+ cd /home/teamcity/TeamCity/bin
I think this is the problem: from the catalina.out:
Error occurred during initialization of VM
Could not find agent library on the library path or in the local directory: yjpagent
Are there any error messages in the log? Is there a file in TeamCIty/bin folder with name like hs_err_pid*.log ?
Try to start server without -agentlib:yjpagent parameter. Since you used incorrect script your memory settings were not used, and this could be cause of the slowdown. Also I would recommend to update JDK you are using.
What is the name of the .so file you copied to the TeamCity/bin directory?
libyjpagent.so
I don't see any other errors in the logs. I'll take out the agent and see what happens.
I just restarted the server. Some things do seem faster but the build results page is taking 45+ seconds to load even though I'm loading it multiple times in a row. I still see the Failed to load build instance errors and the Socket errors. I haven't had a chance to upgrade the JDK but will do that next.
Upgraded to latest jdk and same problems. still can't get the profiler to work. It still says it can't find it. I have it in the TeamCity/bin directory.
Ok, one more attempt, please add the following to the TEAMCITY_SERVER_OPTS
-Djava.library.path=<path to TeamCity/bin>
<path to TeamCity/bin> - replace this with path to a directory where you put .so file
no dice. my environment variable is as follows:
TEAMCITY_SERVER_OPTS=-Djava.library.path=/home/teamcity/bin -agentlib:yjpagent
Finally, I've found how java loads native libraries under Linux OS. You should modify LD_LIBRARY_PATH environment variable. Try to add path to /home/teamcity/bin to this variable, this should help (http://www.yourkit.com/docs/75/help/getting_started/running_with_profiler/agent.jsp).
OK. That page was very helpful. Thanks. I'm just going to use the full path java option. I'm starting things up now. I'll send results when I have them.
OK. Finally got something uploaded. I uploaded 2 files. I tried uploading the 2 fiiles initially and I don't think it worked so I renamed them and uploaded them again. They are called: openBuildResultsPage2.snapshot and openProjectOverviewPage3.snapshot. I'm not having the same consistency today as I was yesterday with the problems but I'm seeing all of the same errors in the logs. It's still timing out intermittently but it doesn't want to do it when I'm watching.
From snapshot I see that TeamCity is indexing tests for search. If you do not plan to search in which build a test ran, you can disable tests indexing by adding the following parameter to TEAMCITY_SERVER_OPTS:
-Dtc.search.index.tests=false
Or you can wait for some time till all history builds will be indexed. You can check how much builds need to be reindexed on the server configuration page (see builds queued at the bottom of the page). Of course indexing should not take so much time. Will see what we can do.
Pavel - thank you so much for your help this week. I will turn off the indexing for sure and see how things go. Does this have anything to do with the erorr messages (Failed to load finished build instance: Cannot find build promotion with ID 1990) that are spamming the logs? Do you kow what is causing those errors?
These warnings are harmless, they indicate that some builds may not be loaded because there is no record in the database containing build settings. Did you use EAP builds? There were some bugs which could cause such errors but they should be fixed now. Since search is traversing builds in history it can try to load such builds and produce such logs.
I'm using the latest production build. I've never used an EAP build.
In this case please attach your teamcity-server.log to this issue: http://jetbrains.net/tracker/issue/TW-5684
I just uploaded another snapshot since I added that option to my env and it is still crazy slow. Now reloading the projects page is taking a very long time. I'm out of here for the weekend so it's OK to check it out Monday. Thanks again for your support.
Oh yeah, the file is named: Bootstrap-2009-01-09(2).snapshot
done. just attached the log file.