Preventing duplicate builds in a large & complex project

Hi all,

We have a reasonably large and complex project that I am looking into moving to TeamCity. Ideally, I'd like to take full advantage of build chains, but I'm running into problems preventing "pointless" builds.

Our project is stored in a single large Perforce depot (think bigger than 100 GB). It's a rather complex interconnected web, which means that you really need to have the whole thing sync'd to be able to build anything reliably. We have a "main" project that builds our final exe and several hundred supporting projects that build things like tools & data.

I currently have an evaluation TeamCity server where I'm experimenting with the best way to handle this. At the moment, I have the following build configurations, all set up to use a single VCS root (because I don't want to have four duplicate copies of 100 GB of data):

    Code Build - Compiles the main project to produce an exe
    Data Build 1 - Builds a zip containing the data of type '1'
    Data Build 2 - Builds a zip containing the data of type '2'
    Test Build - Uses exe from "Code Build" and data from "Data Build 1" and "Data Build 2"

For the sake of argument, assume that each of the above steps takes 20 minutes to complete and that there is only one build agent (if we go ahead with adopting TeamCity, we'll have more agents but also more data build projects!). Data Build 1 and Data Build 2 only change once or twice a day, whereas the Code Build changes several times every hour.

I've configured "Test Build" to have snapshot dependencies upon "Data Build 1", "Data Build 2" and "Code Build". The problem is that this means every change causes a full rebuild of all the configurations - which takes a long time!

My ideal solution would be some way to tell the "Data Build" configurations that they only need to build when they see changes in specific folders of Perforce. If someone checks in a change that only impacts the code build, I want TeamCity to just re-use the last successful data build. Does anyone know of a way to achieve this?
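To make this concrete, what I'm imagining is a path-scoped VCS trigger on each data build, roughly like the sketch below in TeamCity's Kotlin DSL (the VCS root name and folder paths are made up):

    import jetbrains.buildServer.configs.kotlin.v2019_2.*
    import jetbrains.buildServer.configs.kotlin.v2019_2.triggers.vcs

    // Hypothetical data build that should only react to changes under its own folder.
    object DataBuild1 : BuildType({
        name = "Data Build 1"

        vcs {
            root(MainDepot)  // the single shared 100 GB VCS root (made-up name)
        }

        triggers {
            vcs {
                // Only fire this configuration when something under data/type1 changes.
                triggerRules = "+:data/type1/**"
            }
        }
    })

As far as I can tell, though, that only controls when Data Build 1 triggers on its own; I'm not sure it helps when the snapshot dependency from "Test Build" pulls it into the chain, which is the part I'm stuck on.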

3 comments

Hi Stephen,

if I understand your situation correctly, you actually need an artifact dependency between the Test Build and the other builds, not a snapshot dependency. If you really do need a snapshot dependency, for instance because the Test Build also needs data from the repository and you want full consistency, then I don't see a solution.

If not, maybe you could use the "Finish Build Trigger" to trigger the Test Build instead of snapshot dependencies. The only problem is that a single change that impacts several of the Code, Data 1 and Data 2 builds will trigger several Test builds.
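Roughly, the setup I have in mind would look like this in the Kotlin DSL - just a sketch, with placeholder build type names and artifact paths:

    import jetbrains.buildServer.configs.kotlin.v2019_2.*
    import jetbrains.buildServer.configs.kotlin.v2019_2.triggers.finishBuildTrigger

    object TestBuild : BuildType({
        name = "Test Build"

        dependencies {
            // Take the output of the last successful upstream builds instead of
            // forcing them to rebuild through snapshot dependencies.
            artifacts(CodeBuild) {
                buildRule = lastSuccessful()
                artifactRules = "main.exe => bin/"   // placeholder paths
            }
            artifacts(DataBuild1) {
                buildRule = lastSuccessful()
                artifactRules = "data1.zip => data/"
            }
            artifacts(DataBuild2) {
                buildRule = lastSuccessful()
                artifactRules = "data2.zip => data/"
            }
        }

        triggers {
            // One finish build trigger per upstream configuration; this is also
            // where the duplicate Test builds come from if a single change
            // touches more than one of them.
            finishBuildTrigger {
                buildType = "${CodeBuild.id}"
                successfulOnly = true
            }
            finishBuildTrigger {
                buildType = "${DataBuild1.id}"
                successfulOnly = true
            }
            finishBuildTrigger {
                buildType = "${DataBuild2.id}"
                successfulOnly = true
            }
        }
    })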

Olivier


Thanks for the suggestion Olivier! Unfortunately, our code and data are very closely tied together - people generally check in a code and data change in the same changelist, and we need everything to be built from the same version to avoid broken builds. It's just a shame that our data takes so long to build, as otherwise this wouldn't be such an issue for us.

I'm currently investigating whether I can do something clever with a PowerShell script to decide if it can re-use the previously built artifact or not (so in effect, if there are no important changes, it will just re-publish the artifact from the previous build - not ideal, but better than a full rebuild).
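For wiring that up, I'm picturing something along these lines (again just a Kotlin DSL sketch; the script path is hypothetical, and the script itself would do the actual "has anything relevant changed?" check and either re-publish the old zip or let the real build steps run):

    import jetbrains.buildServer.configs.kotlin.v2019_2.*
    import jetbrains.buildServer.configs.kotlin.v2019_2.buildSteps.powerShell

    object DataBuild1 : BuildType({
        name = "Data Build 1"

        steps {
            powerShell {
                name = "Reuse previous artifact if data is unchanged"
                scriptMode = file {
                    path = "build/Decide-ArtifactReuse.ps1"  // hypothetical script checked into the depot
                }
            }
            // ...followed by the real data build steps, which the script would
            // need some way to skip (e.g. by setting a parameter) when it has
            // already re-published the previous zip.
        }
    })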


You're welcome.

Maybe you should also consider solving this by brute force: with three fast agents available, your build time becomes constant whatever the changes are (the three upstream builds run in parallel for 20 minutes, then the Test Build takes another 20), which equals the best case when you only have one agent - 40 minutes with your figures.

Unfortunately, TeamCity's licensing model can make this option more expensive than it should be. A related issue: https://youtrack.jetbrains.com/issue/TW-34200

