MysqlData keeps growing(160GB for now), is there a way to clean up some out-of-date data through TeamCity?
MysqlData keeps growing(160GB for now), is there a way to clean up some out-of-date data through TeamCity?
Mysql Table size:
vcs_change: 43GB
Please sign in to leave a comment.
TeamCity has cleanup functionality. Please see the documentation on how it can be set up and used: https://www.jetbrains.com/help/teamcity/2024.07/teamcity-data-clean-up.html.
Best regards,
Anton
Hi, Thx
I've set up the Base rule for <Root Project> and the clean-up runs every day. But it does not cut down the size of MySQL database size.
I cannot upload the screen shot of the rule, so I will show it with text:
Clean artifacts
keep history and statistical data
Default policy
Custom policy
Older than 1 days since last build
10th successful build
Clean history
keep statistical data
Default policy
Custom policy
Older than 30 days since last build
100th successful build
Clean everything
Clean everything including artifacts, history, and statistical data
Default policy
Custom policy
Older than 30 days since last build
100th successful build
Dependencies
This setting affects clean-up of artifacts in builds of the dependency build configurations
Prevent clean-up
Do not prevent clean-up
Is the base rule the same for all projects? You can check it by opening the Administration > Projects > Root project > Clean-up Rules and enabling the "Show clean-up rules from subprojects" option. Do you have any keep rules configured?
Best regards,
Anton
Hi,
Yes, I've check all the subprojects and all the Rules show as “1 inherited rule”.
And also I've trided to run the Clean-Up manually in Server Administration → Clean-Up Setting.
But the MySQL Data stays large as same, and the vcs_change table too.
In the directory of server's MySQL Data, there are over 100 files named “[computer-name].bin.000xxx” with size 1GB.
And I run following sql query in mysql server:
SELECT
table_schema,
table_name,
table_rows,
round(((data_length + index_length) / 1024 / 1024), 2) as total_size_in_mb
FROM
information_schema.tables
WHERE
table_schema = "teamcity";
the result shows: "teamcity" "vcs_change" "173510406" "43490.00"
Seems the cleanup process does not clean mysql data!
“[computer-name].bin.000xxx” files look like the binary log, which is a sequential log of changes to your database.
It is used primarily for replication. A replica MySQL Server instance downloads these files and applies the same changes in the same order to the replica instance. The replica instance may be restarted or go offline temporarily, and when it reconnects to the source, it will resume downloading where it left off. So it can be handy to keep the binary log files around for some time. By default, they are set to expire automatically after 30 days (this is configurable). These files are not related to TeamCity, but please check if you replicate your MySQL Server and configure the expiration accordingly: https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_binlog_expire_logs_seconds.
Could you share teamcity-cleanup.log? You can upload it using https://uploads.jetbrains.com/ and share the upload ID.
Best regards,
Anton
Thx, I've clean the bin log files.
I upload two files with ID 2024_09_12_N5pSYeDrvvK8oX1iGhkDry
One is teamcity-cleanup.log and the other is the screenshot of teamcity database files in mysql.
Regarding the vcs_change table. This table stores the data about commits, such as changed files. TeamCity unloads all commits older than 3 years from in memory cache and performs cleanup of the changed files (cleaning up the vcs_change table) for the commits older than 3.5 years, so it is not something controlled by the cleanup functionality.
Best regards,
Anton
So can I delete records in vcs_change table?
The cleanup log you've sent me shows that the cleanup is working as expected and cleans up everything related to the builds that are affected be your cleanup settings.
Please let me know if you have additional questions.
Best regards,
Anton
Why 3.5 years? Is there a way to change it to like 0.5 year?
There are the following internal properties (https://www.jetbrains.com/help/teamcity/2024.07/server-startup-properties.html#TeamCity+Internal+Properties) available.
`teamcity.vcsModification.maxAgeDays` - this one controls the timestamp of the older commit that should be loaded from DB on start, it is set to 3 years by default.
`teamcity.vcsModification.changedFilesCleaner.maxAgeDays` - this one specifies the date of the oldest commit from where TeamCity can remove the files from `vcs_change` table. By default, it is set to the value of the first property + 180 days. So you should be able to reduce the number of commits that TeamCity loads from DB by reducing `teamcity.vcsModification.maxAgeDays` property, then the second one will be adjusted to the same value + 180 days.
If you don't want to see commits in UI older than, for example, 2 years old, then you should set:
`teamcity.vcsModification.maxAgeDays=730`.
And then cleanup will start removing data for commits older than 730+180 days.
Best regards,
Anton
thx, I will have a try
I've set `teamcity.vcsModification.maxAgeDays=7` in Internal Properties of Diagnostics in Administration.
And it still does not clear the changes older than 7+180 days in the next clean-up.
Do I have to restart TeamCity server?
I strongly recommend against setting this parameter to such a low value. The default of 3 years is considered a reasonable value, and the example I provided is for 2 years. I wouldn't recommend going much lower than that.
And yes, sorry for not mentioning it before, the restart is required.
Best regards,
Anton
Hi,
I've tried to restart and it did not work. The vcs_change table stays large!
What is the current value of teamcity.vcsModification.maxAgeDays?
Best regards,
Anton
teamcity.vcsModification.maxAgeDays=7
You can upload it using https://uploads.jetbrains.com/ and share the upload ID.
Best regards,
Anton
Upload id: 2024_10_17_ff6RjnYZymMtfNf34YELFc (file: teamcity-cleanup.log)
I can see in the provided logs that the cleaner has been invoked:
[2024-10-17 03:03:29,838] INFO - jetbrains.buildServer.CLEANUP - Executing afterCleanup extension (10 of 38): jetbrains.buildServer.vcs.impl.VcsChangesStorage$VcsChangeTableCleaner
And it did not produce any output which can mean two things: either it did not find anything to clean, or it removed less than 100k rows (because it produces output for every 100k removed rows).
Could you run the following SQL queries?
1. select max(register_date) from vcs_history where register_date < (1729674875000 - 16156800000)
2. select max(modification_id) from vcs_history where register_date = <result of query 1>
3. select count(modification_id) from vcs_change where modification_id <= <result of query 2>
These queries should compute the number of rows in vcs_change table which should be removed according to your settings if you start cleanup today.
And, as mentioned in one of the previous messages, I strongly recommend against setting this parameter to such a low value. The default of 3 years is considered a reasonable default. It should be at least hald a year, otherwise it could produce some nasty side effects, and we can't guarantee that TeamCity will work as expected with such a low value.
Best regards,
Anton
1. select max(register_date) from vcs_history where register_date < (1729674875000 - 16156800000)
Result: 1713518065039
2. select max(modification_id) from vcs_history where register_date = 1713518065039
Result: 112094
3. select count(modification_id) from vcs_change where modification_id <= 112094
Result: 110015
select max(register_date) from vcs_history where register_date < (1729674875000 - 16156800000);
1713518065039
select max(modification_id) from vcs_history where register_date = 1713518065039;
112094
select count(modification_id) from vcs_change where modification_id <= 112094;
0
Could you share the new teamcity-cleanup.log just to make sure?
Best regards,
Anton
Upload id: 2024_10_28_XKoUdmEdxiPtQF6RYW7t4t (file: teamcity-cleanup.log)
But the vcs_change remains large(57GB)!
`show table status like 'vcs_change'`
Best regards,
Anton
{
"Name": "vcs_change",
"Engine": "InnoDB",
"Version": 10,
"Row_format": "Dynamic",
"Rows": 220234503,
"Avg_row_length": 267,
"Data_length": 58820935680,
"Max_data_length": 0,
"Index_length": 0,
"Data_free": 5242880,
"Auto_increment": null,
"Create_time": "2024-01-09 20:50:28",
"Update_time": "2024-10-30 21:34:47",
"Check_time": null,
"Collation": "utf8mb4_0900_ai_ci",
"Checksum": null,
"Create_options": "row_format=DYNAMIC",
"Comment": ""
}
The cleanup works, we can see it from the changed results of the query `select count(modification_id) from vcs_change where modification_id <= 112094` and the logs. But it seems it's working slower than expected. One possible cause would be an unusual distribution of data. Maybe there are many commits with lots of changed files.
Could you send the results of the following?
```
show table status like 'vcs_history'
select min(modification_id) from vcs_change
select min(modification_id) from vcs_history where changes_count > 0
```
Best regards,
Anton
show table status like ‘vcs_history’
{
"Name": "vcs_history",
"Engine": "InnoDB",
"Version": 10,
"Row_format": "Dynamic",
"Rows": 1274728,
"Avg_row_length": 443,
"Data_length": 565182464,
"Max_data_length": 0,
"Index_length": 64077824,
"Data_free": 7340032,
"Auto_increment": null,
"Create_time": "2024-01-09 20:50:28",
"Update_time": "2024-11-06 08:26:49",
"Check_time": null,
"Collation": "utf8mb4_0900_ai_ci",
"Checksum": null,
"Create_options": "row_format=DYNAMIC",
"Comment": ""
}
select min(modification_id) from vcs_change
137715
select min(modification_id) from vcs_history where changes_count > 0
1
From what we can see, there are 1.2M VCS commits and 220M changed files, which means that on average, each commit changes almost 200 files. This is a lot and we would say unusual. It seems most likely that you have many huge commits.
In this case the size of the `vcs_change` table seems normal; TeamCity is already removing what is possible.
Best regards,
Anton