MysqlData keeps growing(160GB for now), is there a way to clean up some out-of-date data through TeamCity?

MysqlData keeps growing(160GB for now), is there a way to clean up some out-of-date data through TeamCity?

Mysql Table size:

vcs_change: 43GB

0
31 comments
Hi,

TeamCity has cleanup functionality. Please see the documentation on how it can be set up and used: https://www.jetbrains.com/help/teamcity/2024.07/teamcity-data-clean-up.html.

Best regards,
Anton
0

Hi, Thx

I've set up the Base rule for <Root Project> and the clean-up runs every day. But it does not cut down the size of MySQL database size.

I cannot upload the screen shot of the rule, so I will show it with text:

Clean artifacts

keep history and statistical data

Default policy

Custom policy 

Older than 1 days since last build

10th successful build

Clean history

keep statistical data

Default policy

Custom policy

Older than 30 days since last build

100th successful build

Clean everything

Clean everything including artifacts, history, and statistical data

Default policy

Custom policy

Older than 30 days since last build

100th successful build

Dependencies

This setting affects clean-up of artifacts in builds of the dependency build configurations

Prevent clean-up

Do not prevent clean-up

0
Hi,

Is the base rule the same for all projects? You can check it by opening the Administration > Projects > Root project > Clean-up Rules and enabling the "Show clean-up rules from subprojects" option. Do you have any keep rules configured?

Best regards,
Anton
0

Hi,

Yes, I've check all the subprojects and all the Rules show as “1 inherited rule”.

And also I've trided to run the Clean-Up manually in Server Administration → Clean-Up Setting.

But the MySQL Data stays large as same, and the vcs_change table too.

In the directory of server's MySQL Data, there are over 100 files named “[computer-name].bin.000xxx” with size 1GB.

 

And I run following sql query in mysql server:

SELECT 
   table_schema, 
   table_name, 
   table_rows,
   round(((data_length + index_length) / 1024 / 1024), 2) as total_size_in_mb
FROM 
   information_schema.tables
WHERE 
   table_schema = "teamcity";

 

the result shows: "teamcity"    "vcs_change"    "173510406"    "43490.00"
 

Seems the cleanup process does not clean mysql data!

0
Thank you for the clarification!

“[computer-name].bin.000xxx” files look like the binary log, which is a sequential log of changes to your database.
It is used primarily for replication. A replica MySQL Server instance downloads these files and applies the same changes in the same order to the replica instance. The replica instance may be restarted or go offline temporarily, and when it reconnects to the source, it will resume downloading where it left off. So it can be handy to keep the binary log files around for some time. By default, they are set to expire automatically after 30 days (this is configurable). These files are not related to TeamCity, but please check if you replicate your MySQL Server and configure the expiration accordingly: https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_binlog_expire_logs_seconds.

Could you share teamcity-cleanup.log? You can upload it using https://uploads.jetbrains.com/ and share the upload ID.

Best regards,
Anton
0

Thx, I've clean the bin log files.

I upload two files with ID 2024_09_12_N5pSYeDrvvK8oX1iGhkDry 

One is teamcity-cleanup.log and the other is the screenshot of teamcity database files in mysql.

0
Thank you! The cleanup looks functional.
Regarding the vcs_change table. This table stores the data about commits, such as changed files. TeamCity unloads all commits older than 3 years from in memory cache and performs cleanup of the changed files (cleaning up the vcs_change table) for the commits older than 3.5 years, so it is not something controlled by the cleanup functionality.

Best regards,
Anton
0

So can I delete records in vcs_change table?

0
No, please avoid doing any manual modifications to the TeamCity database unless instructed otherwise. This data is necessary for TeamCity to work properly. As mentioned in the previous message, the records in this table are being cleaned up automatically when they become older than 3.5 years old.
The cleanup log you've sent me shows that the cleanup is working as expected and cleans up everything related to the builds that are affected be your cleanup settings.
Please let me know if you have additional questions.

Best regards,
Anton
0

Why 3.5 years? Is there a way to change it to like 0.5 year?

0
Hi,

There are the following internal properties (https://www.jetbrains.com/help/teamcity/2024.07/server-startup-properties.html#TeamCity+Internal+Properties) available.
`teamcity.vcsModification.maxAgeDays` - this one controls the timestamp of the older commit that should be loaded from DB on start, it is set to 3 years by default.
`teamcity.vcsModification.changedFilesCleaner.maxAgeDays` - this one specifies the date of the oldest commit from where TeamCity can remove the files from `vcs_change` table. By default, it is set to the value of the first property + 180 days. So you should be able to reduce the number of commits that TeamCity loads from DB by reducing `teamcity.vcsModification.maxAgeDays` property, then the second one will be adjusted to the same value + 180 days. 
If you don't want to see commits in UI older than, for example, 2 years old, then you should set:
`teamcity.vcsModification.maxAgeDays=730`.
And then cleanup will start removing data for commits older than 730+180 days.

Best regards,
Anton
0

thx, I will have a try

0

I've set `teamcity.vcsModification.maxAgeDays=7` in Internal Properties of Diagnostics in Administration.

And it still does not clear the changes older than 7+180 days in the next clean-up.

Do I have to restart TeamCity server?

0
Hi,

I strongly recommend against setting this parameter to such a low value. The default of 3 years is considered a reasonable value, and the example I provided is for 2 years. I wouldn't recommend going much lower than that.
And yes, sorry for not mentioning it before, the restart is required.

Best regards,
Anton
0

Hi,

I've tried to restart and it did not work. The vcs_change table stays large!

0
Hi,

What is the current value of teamcity.vcsModification.maxAgeDays?

Best regards,
Anton
0

teamcity.vcsModification.maxAgeDays=7


 

0
Thanks for the information. Could you execute the cleanup, wait for it to finish, and share the teamcity-cleanup.log file? 
You can upload it using https://uploads.jetbrains.com/ and share the upload ID.

Best regards,
Anton
0

Upload id: 2024_10_17_ff6RjnYZymMtfNf34YELFc (file: teamcity-cleanup.log)

0
Hi,

I can see in the provided logs that the cleaner has been invoked:

[2024-10-17 03:03:29,838] INFO - jetbrains.buildServer.CLEANUP - Executing afterCleanup extension (10 of 38): jetbrains.buildServer.vcs.impl.VcsChangesStorage$VcsChangeTableCleaner

And it did not produce any output which can mean two things: either it did not find anything to clean, or it removed less than 100k rows (because it produces output for every 100k removed rows).
Could you run the following SQL queries?

1. select max(register_date) from vcs_history where register_date < (1729674875000 - 16156800000)
2. select max(modification_id) from vcs_history where register_date = <result of query 1>
3. select count(modification_id) from vcs_change where modification_id <= <result of query 2>

These queries should compute the number of rows in vcs_change table which should be removed according to your settings if you start cleanup today.

And, as mentioned in one of the previous messages, I strongly recommend against setting this parameter to such a low value. The default of 3 years is considered a reasonable default. It should be at least hald a year, otherwise it could produce some nasty side effects, and we can't guarantee that TeamCity will work as expected with such a low value.

Best regards,
Anton
0

1. select max(register_date) from vcs_history where register_date < (1729674875000 - 16156800000) 

Result: 1713518065039

2. select max(modification_id) from vcs_history where register_date = 1713518065039

Result: 112094

3. select count(modification_id) from vcs_change where modification_id <= 112094

Result: 110015

0
Thank you! Could you run a new cleanup and then run the queries again?
0

select max(register_date) from vcs_history where register_date < (1729674875000 - 16156800000); 

1713518065039

select max(modification_id) from vcs_history where register_date = 1713518065039; 

112094

select count(modification_id) from vcs_change where modification_id <= 112094;

0

0
Thanks! The last result means that the rows were cleaned up. The previous result showed 110,015 rows that should be cleaned, and now it's 0. So the cleanup seems to be working correctly, and the table is getting cleaned up according to your settings. 
Could you share the new teamcity-cleanup.log just to make sure?

Best regards,
Anton
0

Upload id: 2024_10_28_XKoUdmEdxiPtQF6RYW7t4t (file: teamcity-cleanup.log)

But the vcs_change remains large(57GB)!

0
Could you check the actual number of rows in this table?
`show table status like 'vcs_change'`

Best regards,
Anton
0

        {
            "Name": "vcs_change",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Dynamic",
            "Rows": 220234503,
            "Avg_row_length": 267,
            "Data_length": 58820935680,
            "Max_data_length": 0,
            "Index_length": 0,
            "Data_free": 5242880,
            "Auto_increment": null,
            "Create_time": "2024-01-09 20:50:28",
            "Update_time": "2024-10-30 21:34:47",
            "Check_time": null,
            "Collation": "utf8mb4_0900_ai_ci",
            "Checksum": null,
            "Create_options": "row_format=DYNAMIC",
            "Comment": ""
        }

0
Hi,

The cleanup works, we can see it from the changed results of the query `select count(modification_id) from vcs_change where modification_id <= 112094` and the logs. But it seems it's working slower than expected. One possible cause would be an unusual distribution of data. Maybe there are many commits with lots of changed files.
Could you send the results of the following?

```
show table status like 'vcs_history'
select min(modification_id) from vcs_change
select min(modification_id) from vcs_history where changes_count > 0
```

Best regards,
Anton
0

show table status like ‘vcs_history’
        {
            "Name": "vcs_history",
            "Engine": "InnoDB",
            "Version": 10,
            "Row_format": "Dynamic",
            "Rows": 1274728,
            "Avg_row_length": 443,
            "Data_length": 565182464,
            "Max_data_length": 0,
            "Index_length": 64077824,
            "Data_free": 7340032,
            "Auto_increment": null,
            "Create_time": "2024-01-09 20:50:28",
            "Update_time": "2024-11-06 08:26:49",
            "Check_time": null,
            "Collation": "utf8mb4_0900_ai_ci",
            "Checksum": null,
            "Create_options": "row_format=DYNAMIC",
            "Comment": ""
        }

 

select min(modification_id) from vcs_change

137715


select min(modification_id) from vcs_history where changes_count > 0

1
 

 

0
Hi,

From what we can see, there are 1.2M VCS commits and 220M changed files, which means that on average, each commit changes almost 200 files. This is a lot and we would say unusual. It seems most likely that you have many huge commits.
In this case the size of the `vcs_change` table seems normal; TeamCity is already removing what is possible.

Best regards,
Anton
0

Please sign in to leave a comment.