Hidden artifacts not being cleaned up

Our TeamCity server is full; it seems the main cause is that artifacts are not being cleaned up correctly.

Each build produces logs and .NET coverage files, which total around 20 MB. With hundreds of builds, these files are occupying the majority of our available disk space.

Since these are hidden artifacts, they reside in the .teamcity folder. Our base clean-up rule is:

In other words, hidden artifacts more than 14 days older than the last build should be cleaned up (except on the default branch, which has special rules).
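
As a rough illustration of the rule as described above, the check amounts to something like the sketch below. The function name and parameters are hypothetical and chosen only for this example; this is not how TeamCity implements it internally.

```python
from datetime import datetime, timedelta


def is_eligible_for_cleanup(build_time, last_build_time, keep_days=14):
    """Return True if the build is more than keep_days older than the last build."""
    # Both arguments are datetime objects; the cut-off is measured back from
    # the most recent build, not from today's date.
    cutoff = last_build_time - timedelta(days=keep_days)
    return build_time < cutoff


last_build = datetime(2024, 3, 1)
print(is_eligible_for_cleanup(datetime(2024, 2, 1), last_build))   # a month older
print(is_eligible_for_cleanup(datetime(2024, 2, 25), last_build))  # within 14 days
```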

Our build config keep rules are:

The first keep rule applies only to the default branch. The second applies only to active branches. Therefore, builds in inactive branches should not be affected by these keep rules. (The inherited rule is the base clean-up rule, which is unchanged.)

So I would expect the hidden artifacts specified in the base clean-up rule (all files in the .NETCoverage and logs folders) to be cleaned up for all inactive branches.

But if I look at an inactive branch and check an old build (several weeks old, and also three weeks older than the most recent build in this branch):

then I see that the two artifacts subfolders are not being cleaned up.

On checking the teamcity-cleanup.log, the build ID of the above build seems not to be listed anywhere. It looks like this build (along with many like it) is being skipped, but it is not clear why.
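
For anyone wanting to repeat this check, a plain-text search of the log for the build ID can be sketched as follows. The log's line format is not assumed here: the helper (a hypothetical name) simply looks for the ID as a whole number on any line, and the path to teamcity-cleanup.log has to be supplied by the caller.

```python
import io
import re


def find_build_in_log(log_path, build_id):
    """Return (line_number, line) pairs for log lines that mention build_id."""
    # Word boundaries stop ID 12345 from matching inside 123456.
    pattern = re.compile(r'\b' + re.escape(str(build_id)) + r'\b')
    matches = []
    with io.open(log_path, 'r', encoding='utf-8', errors='replace') as log:
        for line_number, line in enumerate(log, start=1):
            if pattern.search(line):
                matches.append((line_number, line.rstrip()))
    return matches
```

An empty list means the build ID does not appear anywhere in that log file.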

We have no pinned builds and the build config is not used as an artifact dependency in any other build. It is used as a snapshot dependency, but we have selected ‘Do not prevent dependency artifacts cleanup’ in the base rule (and throughout all build configs), which according to the TeamCity documentation should be enough to ensure that the builds are cleaned up:

(I think “artifact of snapshot dependency” here is a typo for “artifact or snapshot dependency”.)

I have checked every setting and read every online article that I can find, but I can't find anything to help. It just looks like this isn't working as advertised.

Can anyone advise what I'm doing wrong, or suggest an alternative (automated) way to clean up these hidden artifacts?

(We are using TeamCity 2020.1.2 (build 78726). We'd love to upgrade, but we can't.)

8 comments
Hi Neil,

It seems that the behavior you described may be related to https://youtrack.jetbrains.com/issue/TW-59344/Cleanup-does-not-clean-an-artifact-dependency-build-despite-do-not-prevent-cleanup-option-in-the-using-build-configuration-if and/or https://youtrack.jetbrains.com/issue/TW-71735/Cleanup-if-some-keep-rule-has-the-Keep-artifact-dependencies-option-then-the-whole-build-configuration-is-considered-as-having.
Both issues were fixed as of 2021.2.

The recommended way to address this issue would be to update TeamCity. Along with the mentioned fixes, the 2023.11.1 version introduced the option to generate a clean-up debug report for a certain build. If the issue persists after the update, please generate a report for one of the affected builds as explained here (https://youtrack.jetbrains.com/issue/TW-2068/Provide-information-on-reason-builds-are-preserved-during-cleanup#focus=Comments-27-8734009.0-0). Then share the report with us by uploading it to https://uploads.jetbrains.com/ and sharing the upload ID.

It seems that the information from the documentation you are referring to is for the latest TeamCity version as well. The documentation for the 2020.1 version is available here: https://www.jetbrains.com/help/teamcity/2020.1/teamcity-documentation.html. 

"The artifact of snapshot dependency" means that the option applies to the artifacts of the snapshot dependency builds, not to artifacts of the build configuration where the option is set. Example build chain: A
Please let me know if you have any additional questions.

Best regards,
Anton

Hi Anton,

Many thanks for your response. It does seem that we have no choice but to upgrade if we want to fix this, which isn't possible for us right now. When we're able to do this, I'll check the behaviour as you suggest.

Regarding the documentation: the screenshot in my post is definitely from the 2020.1 documentation (to which you have linked). It would be really helpful if that page could be updated to explain that the clean-up rules for snapshot dependencies don't work properly and provide links to the bugs you have cited - this would have saved me at least 3 hours trying to work out what I'd done wrong!

Thanks again and best regards,

Neil

Dear Neil,

Could you please send me the link to this page in the 2020.1 documentation? I was looking through it and wasn't able to find this exact mention, which is why I assumed that it is from the current version's documentation.

Best regards,
Anton

Hi Anton,

Sure - the exact page/anchor is as follows (under Concepts > Clean-up):

https://www.jetbrains.com/help/teamcity/2020.1/clean-up.html#Deleted+Build+Configurations+Clean-up

Here's a larger screenshot showing the section I originally posted, including the TeamCity version and URL:

 

Best wishes,

Neil

Dear Neil,

Thank you for the link.
The issues that I provided as possible causes for the behavior you encountered were found and fixed after 2020.1 was released, and are listed in the Release Notes for the related versions:
TW-71735: https://www.jetbrains.com/help/teamcity/2021.1/teamcity-2021-1-2-release-notes.html#Usability+Problem
TW-59344: https://www.jetbrains.com/help/teamcity/2021.2/teamcity-2021-2-release-notes.html#Usability+Problem

The documentation for the 2021.2 version of TeamCity also includes a more detailed description of the fix in the Upgrade Notes: https://www.jetbrains.com/help/teamcity/2021.2/upgrade-notes.html#Fixed+inconsistency+in+build+chains+clean-up.

It is indeed an inconsistency and non-intended behavior; however, it is already fixed and is covered in the related documentation.

I appreciate your understanding.

Since the TeamCity update may be difficult in your environment, you could try setting up a scheduled task in your OS to clean up the files as a workaround. Alternatively, if you have a build agent with access to the TeamCity server, you can run a scheduled build on this agent with a script that clears the files when and if needed.

Best regards,
Anton Vakhtel

Hi Anton,

Thanks for your response.

My point really was that it would be very helpful for the older Clean-Up documentation (e.g. 2020.1.2) to be updated with a warning that bugs exist in the relevant functionality. If I were struggling with something but trying to follow official documentation, I wouldn't routinely check later versions of the docs to see whether any bugs had been identified (and I don't think many other users would either). This lack of a warning is what cost me so much time unnecessarily.

The workaround we have employed is to write a custom Python script (for Python 2.7) which we run on our TeamCity server's host machine. This script applies very similar clean-up rules to those used by TeamCity. I could share the script if it would help.

Thanks again for your help with this.

Best wishes,

Neil

Dear Neil,

While I agree that it would be useful, it's unrealistic to go through every old version of the documentation every time an issue that impacts that version is fixed or found. We try to document the most relevant issues in the known issues section of the documentation and the upgrade notes, while the release notes contain all the included bug fixes.

On a side note, after two major releases, previous releases are considered out of support, so unless something critical is found, no further changes to that release will be performed, including documentation. While it's rare, if the version is still under support, and the issue is prevalent, we will consider changing the documentation accordingly. Still, beyond that, it is not realistic. Thank you for your understanding.

The script would be useful and may help other users if they use an old version of TeamCity and encounter a similar issue with the cleanup. Please kindly share it in the comment if you wish to.

Best regards,
Anton

Hi Anton,

Thanks for your reply and I'm sorry for my slow response.

I have posted below the Python script we used on our TeamCity server to facilitate disk clean-up, working around the issue described in this thread. I hope this might be of help to another user in a similar position.

Best wishes,

Neil

Python script for TeamCity clean-up (provided without guarantee - use at own risk!):

import os
import gzip
import shutil
import time
from datetime import datetime, timedelta


# Note that if Python 2.7 is installed, Python 3 features like 'timestamp()' can't be used.
#   See Daniel Böckenhoff's answer here for the workaround used below: https://stackoverflow.com/questions/50650704/attributeerror-datetime-datetime-object-has-no-attribute-timestamp
# ToDo: Consider adding e.g. "elapsed time" / "space cleared" outputs to the end of the script.

def timestamp_workaround(date):
    return time.mktime(date.timetuple())

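def extract_metadata(file_path):
    # Builds with no 'teamcity.build.branch' property fall back to 'default',
    # which process_builds() always skips, so they are never cleaned up.
    branch_name = 'default'
    with gzip.open(file_path, 'rt') as f:
        for line in f:
            if line.startswith('teamcity.build.branch='):
                # Split on the first '=' only, in case the branch name contains one.
                branch_name = line.split('=', 1)[1].strip()
                break
    timestamp = os.path.getmtime(file_path)
    return branch_name, timestamp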

def delete_log_and_coverage_folders(build_folder_root, delete=False):
    folders_to_delete = [os.path.join(build_folder_root, '.teamcity/.NETCoverage'),
                        os.path.join(build_folder_root, '.teamcity/logs')]
    for folder in folders_to_delete:
        if os.path.exists(folder):
            if delete:
                print("Deleting: ", folder)
                shutil.rmtree(folder)
            else:
                print("Would delete: ", folder)
        else:
            print("Skipping (doesn't exist): ", folder)

def process_builds(main_folder_path, delete=False):
    epoch_time = datetime(1970, 1, 1)
    branch_folders = {}   # Dictionary to store most recent build for each branch
    for root, dirs, files in os.walk(main_folder_path):
        for build_folder in dirs:
            build_folder_path = os.path.join(root, build_folder)
            properties_file_path = os.path.join(build_folder_path, '.teamcity/properties/build.start.properties.gz')
            if os.path.exists(properties_file_path):
                branch_name, timestamp = extract_metadata(properties_file_path)
                if branch_name not in ['default', 'prerelease']:
                    if branch_name not in branch_folders or timestamp > branch_folders[branch_name][1]:
                        branch_folders[branch_name] = (build_folder_path, timestamp)

    for branch_name, (most_recent_branch_build_folder, most_recent_timestamp) in branch_folders.items():
        last_14_days = datetime.fromtimestamp(most_recent_timestamp) - timedelta(days=14)
        for root, dirs, files in os.walk(os.path.dirname(most_recent_branch_build_folder)):
            for dir_name in dirs:
                build_folder_path = os.path.join(root, dir_name)
                properties_file_path = os.path.join(build_folder_path, '.teamcity/properties/build.start.properties.gz')
                if os.path.exists(properties_file_path):
                    branch, timestamp = extract_metadata(properties_file_path)
                    if branch == branch_name and timestamp < timestamp_workaround(last_14_days):
                        delete_log_and_coverage_folders(build_folder_path, delete)
                        
# Insert appropriate project name and build config name into the placeholders below.
MAIN_FOLDER_PATH = "/mnt/efs/system/artifacts/<TEAMCITY PROJECT NAME>/<BUILD CONFIG NAME>"

process_builds(MAIN_FOLDER_PATH, delete=True)
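
The ToDo near the top of the script mentions reporting "space cleared". One possible way to do that, sketched here as a standalone helper that is not part of the script above, is to measure each folder just before it is removed and sum the results. The function name is hypothetical.

```python
import os


def folder_size_bytes(folder):
    """Return the total size in bytes of all regular files under folder."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so that linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Calling this on each folder before shutil.rmtree and adding up the return values would give the total space freed by a run.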
