Use file deduplication when publishing artefacts

Answered

We currently build our application for Any CPU, x86 and x64, as well as Net472 and NetCore.
We publish the build output, which can be rather large. The size is largely due to the fact that many files are duplicated across different builds. If TeamCity could remove duplicates when publishing artefacts, this would be really useful.

I tried an experiment with ZPAQ, and found that the build folder reduced from 7GB to around 500MB. Using zip, by comparison, reduces it only to 3.5GB.
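
For anyone wanting to check how much of a build folder is made up of identical files before trying an archiver, here is a minimal Python sketch. It is illustrative only and not anything TeamCity provides: the function names `file_digest` and `duplicate_report` are made up for this example, and it treats two files as duplicates only when their SHA-256 digests match (file-level, not block-level as ZPAQ does).

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicate_report(root):
    """Return (total_bytes, unique_bytes) for all regular files under root.

    unique_bytes counts each distinct file content once, so the difference
    is the space taken by file-level duplicates.
    """
    total = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            total += size
            digest = file_digest(path)
            if digest not in seen:
                seen.add(digest)
                unique += size
    return total, unique
```

Calling `duplicate_report` on the build output folder and comparing the two numbers gives a rough upper bound on what file-level deduplication alone could save, before any compression is applied.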

Is it possible TeamCity could provide some sort of support for file deduplication?

2 comments
Fedor Rumyantsev

Hello Rob,

Thank you for the suggestion! It appears that this feature request is already registered as https://youtrack.jetbrains.com/issue/TW-70254 (though that covers file-level deduplication rather than the block-level approach ZPAQ uses). I have added an internal reference to your request there; please feel free to vote or comment on it as you see fit.

Permanently deleted user

I think that request (which would be fantastic) deduplicates files across builds, whereas my suggestion here is to deduplicate artefacts within a single build's output. File-level or block-level makes no difference to me; ZPAQ happens to use block-level and was easy to find. I've heard (though haven't tried it) that using tar.gz may also help, as tar puts all the files into a single stream, which gzip can then compress across files.
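
The tar.gz idea can be tried with a few lines of Python using the standard `tarfile` module. This is only a sketch for running the experiment, with `pack_tar_gz` being a made-up helper name. One caveat worth checking against real output: gzip's DEFLATE format only looks back 32 KiB in the stream, so identical files that sit further apart than that are unlikely to be matched against each other, and the gain over zip may be small for large binaries.

```python
import os
import tarfile


def pack_tar_gz(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tar archive.

    Returns the size of the resulting archive in bytes. The archive
    should be written outside source_dir so it does not include itself.
    """
    # Store entries under the folder's own name rather than its full path.
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=top_level)
    return os.path.getsize(archive_path)
```

Comparing the returned size with the zip and ZPAQ figures for the same folder would show whether the single-stream approach helps in practice.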

My primary motivation is build speed. Our builds take around 20 mins, with a further 10 mins zipping and uploading artefacts.
We already use Windows Server block-level deduplication on our TeamCity Server artefacts store.
