maintainDB fails with UTFDataFormatException in hsqldb code on upgrade
Some time ago our backups using `maintainDB` started failing due to missing heap space:
Failed: Unexpected exception: SQL error when doing 'Querying for a single value ' while performing SQL query: SQL SELECT: select max(mproc_id) from backup_info: java.sql.SQLException: java.lang.OutOfMemoryError: Java heap space. Caused by: org.hsqldb.HsqlException: java.lang.OutOfMemoryError: Java heap space
Today I had to try and fix this error, as I ran into it when upgrading TC from version 2019.1.x to 2020.1.3. Upgrading involves taking a backup, which invokes the maintainDB script...
As we have plenty of RAM (or so I thought), I increased the heap size to 1024MB. No luck. Then 2048MB. Still no luck. Then 4GB (still failing). Then lastly 8GB. This time it no longer crashed due to OOM errors, but it still failed ... this time with a UTFDataFormatException.
Failed: Unexpected exception: SQL error when doing 'Querying for a single value ' while performing SQL query: SQL SELECT: select max(mproc_id) from backup_info: java.sql.SQLException: org.hsqldb.HsqlException: java.io.UTFDataFormatException. Caused by: org.hsqldb.HsqlException: org.hsqldb.HsqlException: java.io.UTFDataFormatException
While this is not the same exception, the root cause seems to be the same: running `select max(mproc_id) from backup_info` makes the HSQLDB database crash. I am not sure how, but it does. How do I proceed from here? Currently, the system seems bricked, as it is stuck in upgrade mode.
Background details:
I try and invoke the `maintainDB` script like this when testing out the effects of different heap sizes:
time TEAMCITY_MAINTAINDB_MEM_OPTS="-Xmx8G" /opt/TeamCity/bin/maintainDB.sh backup
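The heap-size experiment described above can be scripted. This is only a minimal sketch: the function name `try_backup_heaps` and the `MAINTAIN_DB` variable are made up for illustration, and it assumes `maintainDB.sh` exits with a non-zero status when the backup fails. As this thread shows, a bigger heap may simply replace one error with another.

```shell
# Path taken from the command above; override MAINTAIN_DB for other installs.
MAINTAIN_DB="${MAINTAIN_DB:-/opt/TeamCity/bin/maintainDB.sh}"

# Try a backup with each heap size in turn and stop at the first success.
# Assumes maintainDB.sh returns non-zero on failure.
try_backup_heaps() {
  for heap in "$@"; do
    echo "Trying backup with -Xmx${heap}" >&2
    # maintainDB output goes to stderr so stdout only carries the result.
    if TEAMCITY_MAINTAINDB_MEM_OPTS="-Xmx${heap}" "$MAINTAIN_DB" backup >&2; then
      echo "$heap"   # print the heap size that worked
      return 0
    fi
  done
  return 1
}

# Example: try_backup_heaps 1G 2G 4G 8G
```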
Hello!
HSQLDB is ill-suited for databases larger than 200 MB; in general, we suggest migrating to an external database the moment you start to rely on TeamCity (https://www.jetbrains.com/help/teamcity/setting-up-an-external-database.html#Default+Internal+Database). Could you please confirm:
1) What is the actual size of the HSQLDB on your setup? (You can find the HSQLDB files in the <Data Directory>/system folder; the DB is named buildserver by default.)
2) If you run SELECT COUNT(*) FROM BACKUP_INFO, what does it yield? Does SELECT MAX(MPROC_ID) FROM BACKUP_INFO run for you when executed manually?
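For the first question, the on-disk size can be read with `du`. A minimal sketch, assuming the default database name `buildserver` mentioned above; the function name `hsqldb_size_kb` is hypothetical, and `du` reports disk usage, which may differ slightly from the apparent file sizes.

```shell
# Print the total disk usage, in KB, of the internal HSQLDB files
# (buildserver.*) under <Data Directory>/system.
hsqldb_size_kb() {
  # $1: path to the TeamCity Data Directory
  du -ck "$1"/system/buildserver.* | tail -n 1 | cut -f1
}

# Example (the path is a placeholder for your own Data Directory):
# hsqldb_size_kb /path/to/TeamCityData
```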
I am not sure how I can log into the HSQLDB database to run the query, but I would rather just try and migrate to Postgres or something. The DB is not more than 80MB:
OK, got a bit further, and what I got is kind of depressing. I cannot migrate to Postgres, because the `maintainDB migrate` command fails with the above exception(s), and I get the same exceptions when bypassing TC and using HSQLDB directly.
Not sure how to proceed from here, seeing that HSQLDB fails to handle the current data.
By the way, if anyone is wondering how I managed to connect to the HSQLDB:
I downloaded the HSQLDB distribution, with the sqltool.jar, from their website:
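A minimal sketch of how a query could be run through SqlTool, not necessarily the commands used in this post. It assumes the database still uses HSQLDB's default account (user `sa`, empty password) and that `sqltool.jar` sits next to `hsqldb.jar` as in the HSQLDB distribution; the function name `run_hsqldb_query` and all paths are placeholders. Working on a copy of the files is safer, since opening them with a different HSQLDB version may modify them.

```shell
# Run one SQL statement against an HSQLDB file database using SqlTool.
# Assumptions: default HSQLDB credentials (user "sa", empty password).
run_hsqldb_query() {
  # $1: path to sqltool.jar
  # $2: database path without extension, e.g. /tmp/tc-db-copy/buildserver
  # $3: SQL statement, terminated with a semicolon
  java -jar "$1" \
    --inlineRc="url=jdbc:hsqldb:file:$2,user=sa,password=" \
    --sql="$3"
}

# Example (paths are placeholders; query a copy, not the live files):
# cp -r "/path/to/TeamCityData/system" /tmp/tc-db-copy
# run_hsqldb_query /path/to/hsqldb/lib/sqltool.jar /tmp/tc-db-copy/buildserver \
#   "SELECT COUNT(*) FROM BACKUP_INFO;"
```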
Hello!
Thank you for the details, and apologies for the delay in answering. I believe the UTFDataFormatException in this context hints at data corruption, so I would advise switching to an external database without data migration (as per this article: https://www.jetbrains.com/help/teamcity/migrating-to-an-external-database.html#Switch+with+No+Data+Migration); you would lose data such as users and builds, but build configurations, projects and VCS roots are stored outside of the DB and do persist.
Alternatively, the HSQLDB documentation describes how the DB could be recovered: http://www.hsqldb.org/doc/1.8/guide/apc.html. However, those instructions assume that a recent backup file exists. (If you have one taken via maintainDB, you could also try restoring it into a new external DB.)
Thanks, I managed to find a working backup from July that held most of the database content and used it together with the latest TeamcityData directory. That got me 95% of the way. The HSQLDB guys also mentioned that recent versions of the command-line tools have some support for trying to fix corrupted data, but I never went into that abyss :)