Processes started by tests not terminated after move to new server and upgrade from 6.5 to 7.

Hi,

Summary: Our test projects are executed by the console runners of XUnit.NET and MSpec, and they start external processes. Handlers for the AppDomain.CurrentDomain.DomainUnload and AppDomain.CurrentDomain.ProcessExit events are supposed to terminate those processes, but since we upgraded from TC 6.5 to 7 and moved to a faster server they are no longer terminated.


More details:
We're using TC to build a .NET 3.5/4 solution. The artifacts are a number of class libraries. We're using Phantom to execute build scripts that run tests written using XUnit.NET and MSpec.

We recently made some modifications to our build server which is running on Windows Server 2008:

  • Upgraded TC from 6.5 to 7.
  • Moved it to a new, much faster server.
  • Both the build server and the agent now run under a different user. They used to run as the local administrator; now they run as an AD user that has local admin rights on the machine.
  • Switched from SSH to HTTPS for Git.


From TeamCity's perspective our build process is very simple: it just runs a console command. This console command compiles a solution using msbuild and then runs a total of eight test projects by invoking the console runner of either XUnit.NET or MSpec.

Some of these test projects run quite heavy integration tests that start various external services, for instance node.js. These are all started using the System.Diagnostics.Process.Start() method. After starting such a process we store it in a variable and assign handlers to both AppDomain.CurrentDomain.DomainUnload and AppDomain.CurrentDomain.ProcessExit. In the handlers for these events we invoke two methods on the process: process.CloseMainWindow() and process.WaitForExit(10000).
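
A minimal sketch of the pattern described above, not the actual test code. The class name ExternalService and its members are hypothetical; only the Process and AppDomain calls are the ones named in the text.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper illustrating the pattern; names are assumptions.
public static class ExternalService
{
    private static Process _process;

    // True while a started process is still alive.
    public static bool IsRunning
    {
        get { return _process != null && !_process.HasExited; }
    }

    public static void Start(string fileName, string arguments)
    {
        // Keep a reference so the handlers can reach the process later.
        _process = Process.Start(fileName, arguments);

        // DomainUnload is never raised for the default application domain,
        // so ProcessExit is registered as well to cover that case.
        AppDomain.CurrentDomain.DomainUnload += (sender, e) => Stop();
        AppDomain.CurrentDomain.ProcessExit += (sender, e) => Stop();
    }

    private static void Stop()
    {
        if (!IsRunning) return;

        // Ask the process to close by sending a close message to its main window,
        // then give it up to ten seconds to exit.
        _process.CloseMainWindow();
        _process.WaitForExit(10000);
    }
}
```

A test fixture would call something like ExternalService.Start("node", "server.js") once before the first test; the file name and arguments here are placeholders.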

We do this instead of terminating the dependency processes in some sort of assembly cleanup method for two reasons: it lets us use the same code for both our XUnit.NET and MSpec tests, and, if I remember correctly, I couldn't find such a method in XUnit.NET. It should also ensure that the processes are terminated even if an exception is thrown in other assembly cleanup code.

Now, this has always worked well when running the tests locally, whether using the console runners, TD.NET or ReSharper. It also always worked well on our old build server, before we made the above changes. Now, though, when the build runs on the build server, the processes aren't terminated. The build works fine in all other regards and no issues (that I can find) are reported. However, since the processes are still running, the next build fails, either because the work directory cannot be cleaned while files are locked or because multiple instances of external dependencies are running.

I'm in no way blaming TeamCity for this, but considering that this problem started after upgrading TC and changing the context in which it runs (faster server etc) I'm hoping that maybe, maybe, someone here might have an idea? Maybe something changed in TC 7 related to how command line build steps are run?

Any advice would be greatly appreciated!

3 comments

Every test runner tries to unload the app domain.
It looks like things are slightly different when you start it from the command line.

Have you checked whether your tear-down code works if you start the script from the console? It looks like the MSpec/XUnit.NET test runners do
not do the same cleanup that is implemented in TD.NET or ReSharper.

You may add an extra build step in TeamCity that performs the cleanup. You could simply write a file with information about all started processes
so that the cleanup step can parse it and perform the necessary cleanup.
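
One way such a file-based cleanup could look is sketched below. It assumes the file holds one process ID per line; the class name PidFile, its methods, and the file format are all hypothetical, not something TeamCity provides.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Diagnostics;
using System.IO;

// Hypothetical helper: the tests record PIDs, a later build step kills them.
public static class PidFile
{
    // Called by the test code right after Process.Start().
    public static void Record(string path, Process process)
    {
        File.AppendAllText(path, process.Id + Environment.NewLine);
    }

    // Extracts the valid process IDs, skipping blank or malformed lines.
    public static List<int> Parse(IEnumerable<string> lines)
    {
        var pids = new List<int>();
        foreach (var line in lines)
        {
            int pid;
            if (int.TryParse(line.Trim(), out pid)) pids.Add(pid);
        }
        return pids;
    }

    // Called by the cleanup build step.
    // Caveat: Windows can reuse PIDs, so a stale file may point at an unrelated process.
    public static void KillAll(string path)
    {
        if (!File.Exists(path)) return;

        foreach (var pid in Parse(File.ReadAllLines(path)))
        {
            try { Process.GetProcessById(pid).Kill(); }
            catch (ArgumentException) { }          // no process with this ID is running
            catch (InvalidOperationException) { }  // the process has already exited
            catch (Win32Exception) { }             // the process could not be terminated
        }
        File.Delete(path);
    }
}
```

The cleanup step would then be a small console program that calls PidFile.KillAll with the same path the tests wrote to.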


Sorry for the late reply.

I believe I've found the cause of our issues. In the new configuration the build agent runs as an AD user. This means that the "Allow service to interact with desktop" checkbox cannot be checked in the Log On tab of the properties window for the Windows service, which in turn means that the CloseMainWindow method fails to shut down the external services. Using Process.Kill works, BUT it only kills that specific process, so any processes that such a process has spawned (for instance by running a bat file, as in our case) won't be killed.
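
A rough sketch of falling back from CloseMainWindow to Process.Kill is shown below. The ProcessStopper class and the StopOrKill method are hypothetical names, and, as noted above, Kill only ends the one process and leaves any children it spawned running.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; names are assumptions for illustration.
public static class ProcessStopper
{
    // Returns true if the process had to be killed,
    // false if it was not running or exited after the close request.
    public static bool StopOrKill(Process process, int timeoutMilliseconds)
    {
        if (process == null || process.HasExited) return false;

        // CloseMainWindow returns false when no close message could be sent,
        // for example because the process has no main window.
        bool closeRequested = process.CloseMainWindow();
        if (closeRequested && process.WaitForExit(timeoutMilliseconds)) return false;

        try { process.Kill(); }
        catch (InvalidOperationException) { return false; } // exited in the meantime

        // Kill is asynchronous, so wait until the process is really gone.
        process.WaitForExit();
        return true;
    }
}
```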

It appears there are two possible solutions to this besides running the build agent as a local user:
1. Forcing Windows to allow the service/user to interact with the desktop, possibly as described here.
2. Figuring out the process tree for a started process and killing the entire tree, for instance as described here.

I have yet to try #1, but #2 seems to work for us as long as all processes, including the test runners, are either all 32-bit or all 64-bit processes.
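
As an illustration of killing a whole tree, the sketch below shells out to the Windows taskkill command with /T (include child processes) and /F (force). This is a swapped-in technique, not necessarily the approach from the link above, and the ProcessTree class name is hypothetical.

```csharp
using System.Diagnostics;

// Hypothetical helper; uses the built-in Windows taskkill command.
public static class ProcessTree
{
    // /PID selects the root process, /T includes its children, /F forces termination.
    public static string BuildTaskkillArguments(int pid)
    {
        return "/PID " + pid + " /T /F";
    }

    // Runs taskkill and returns its exit code; non-zero generally means
    // that taskkill could not terminate the tree.
    public static int Kill(int pid)
    {
        var startInfo = new ProcessStartInfo("taskkill", BuildTaskkillArguments(pid))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var taskkill = Process.Start(startInfo))
        {
            taskkill.WaitForExit();
            return taskkill.ExitCode;
        }
    }
}
```

In the event handlers this would replace the CloseMainWindow call, for example ProcessTree.Kill(process.Id).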


Hello,

Thank you for sharing this. We found a bug in the process termination logic that the build agent uses. It tries to build the process tree to find all child processes running under the build agent, which is close to approach #2. Still, there can be gaps in the process tree that cause some processes to be missed.
