How do I know the progress of maintainDB migrate

I am currently migrating our database from HSQLDB to PostgreSQL, using the maintainDB script. We should have done this a long time ago, before the database grew so large, so we know this is going to take a while. Now the script has been running for 14 hours, but there is no progress indication. (I know this is a problem you are aware of; this is not what I'm asking here.) The whole time it has been saying "Starting backup".

My question is if there is any other way to get a hint of whether the script is working or not. I tried attaching strace to it in Cygwin, but that crashed maintainDB when I exited strace, so I don't want to do that again.

In pgAdmin, I see there are still no tables in the teamcity database.

I then scrolled up a bit in the command line window, and saw several messages of the type "java.io.FileNotFoundException: C:\TeamCity\bin\..\logs\teamcity-maintenance.log (Access is denied)". I had not noticed these before, as the command line window had filled up with other messages about connecting to the database successfully, and that it was starting the backup.

Has the script failed? Should I quit, fix the access problems, and start it again? Or is it actually working now and I should let it sit there in order to not lose 14 hours of work?

Edit: The java process is using 32M of RAM, and 4% CPU, and has been doing so since last night. This is another reason why I suspect the script is actually not doing anything.

Edit2: Another clue: The command line window says "Intermediate backup file: F:\TeamCityData\backup\TeamCity_Backup_20130504_094550.zip", but that file still does not exist.


Anders,

Migration may take several hours if the data set is really huge, but in that case the tool does produce noticeable memory, CPU and network load.
The "Access is denied" error is a problem that must be fixed first. Please check all the required permissions and the account used.

If the issue still reproduces, please send us the full teamcity-maintenance.log file.


Hi Michael,

Thanks for looking into this! I did fix the permission issues, restarted the job, and left it running from Saturday morning until Sunday evening. Still no tables in the database, same low CPU and memory usage, still no temporary file. Our dataset is 1,4 GB, so we should definitely have migrated a long time ago.

I will send you the log by email, but it didn't say anything after "Starting backup."


Hi Anders,

> Our dataset is 1,4 GB

That's a lot!

It seems you will need to dedicate more memory to the tool. I would recommend running it on a machine with more than 4 GB of memory, making sure to use an x64 JDK (by setting the TEAMCITY_JRE environment variable to the JRE installation home), and changing "-Xmx512m" in the maintainDB.cmd script to something like "-Xmx3g".

I would also try to increase the memory a bit more.

If that does not help, please take several thread dumps from the running/hanging maintainDB command and attach them together with logs\teamcity-maintenance.log.
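For anyone who wants to script the thread dump step, here is a minimal Python sketch. It assumes the JDK's `jstack` tool is on the PATH and that it is run as the same user that owns the maintainDB java process. The function name, file names and default interval are made up for illustration and are not part of TeamCity.

```python
import subprocess
import time
from pathlib import Path


def capture_thread_dumps(pid, count=3, interval_s=30.0, out_dir=".", run=subprocess.run):
    """Save `count` thread dumps of the JVM with process id `pid`.

    Dumps are taken `interval_s` seconds apart so they can be compared later.
    `run` is injectable only so the function can be exercised without a JVM.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i in range(count):
        # `jstack <pid>` prints a thread dump of the target JVM to stdout.
        result = run(["jstack", str(pid)], capture_output=True, text=True, check=True)
        path = out / f"threaddump-{pid}-{i + 1}.txt"  # hypothetical naming scheme
        path.write_text(result.stdout)
        saved.append(path)
        if i < count - 1:
            time.sleep(interval_s)
    return saved
```

The process id can be looked up with `jps -l` or in Windows Task Manager, for example `capture_thread_dumps(1234, out_dir="dumps")` for a process with id 1234.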


Thanks, I'll try that as soon as possible (it is not easy to find a good time to take down the CI server for all our teams).

Is there any way I can get any indication whether the tool is working or not? Should the java process increase its memory/CPU usage right after the script is started? When can I expect to see tables being created in the PostgreSQL database? Is there anything else I could look for?
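The two signs of life mentioned in this thread are the intermediate backup file and logs\teamcity-maintenance.log. A rough way to watch them is to compare file size and modification time at two points in time. This is only a sketch under that assumption: the helper names are invented here, and unchanged files do not prove the tool is stuck.

```python
import os


def snapshot(paths):
    """Map each path to (size, mtime), or to None if it does not exist yet."""
    result = {}
    for p in paths:
        try:
            st = os.stat(p)
            result[p] = (st.st_size, st.st_mtime)
        except FileNotFoundError:
            result[p] = None
    return result


def changed(before, after):
    """Return the paths whose existence, size or mtime differs between snapshots."""
    return [p for p in after if before.get(p) != after[p]]
```

Usage would be to call `snapshot([...])` with the backup file and log file paths, wait a few minutes, call it again, and pass both results to `changed`. An empty list after a long wait suggests nothing is being written.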


Anders,

Actually, there should be logging to the console. I'm not sure I have anything more to say until we can get hold of the thread dumps...


I am now running on a 64-bit JVM, with -Xmx3G.

It still said nothing after "Starting backup", there are no tables in the database, and java.exe is running at ~4% CPU and ~100kb RAM according to Windows Task Manager.

I have sent Michael a thread dump, but here is the top of it. I took two more after a while, and they are all in this state. Could it be a deadlock?

Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.21-b01 mixed mode):

"HSQLDB Timer @742f6e67" daemon prio=6 tid=0x0000000014da5800 nid=0x1888 in Object.wait() [0x00000000143ce000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000074001c608> (a org.hsqldb.lib.HsqlTimer$TaskQueue)
        at org.hsqldb.lib.HsqlTimer$TaskQueue.park(HsqlTimer.java:882)
        - locked <0x000000074001c608> (a org.hsqldb.lib.HsqlTimer$TaskQueue)
        at org.hsqldb.lib.HsqlTimer.nextTask(HsqlTimer.java:530)
        - locked <0x000000074001c608> (a org.hsqldb.lib.HsqlTimer$TaskQueue)
        at org.hsqldb.lib.HsqlTimer$TaskRunner.run(HsqlTimer.java:610)
        at java.lang.Thread.run(Unknown Source)

"Service Thread" daemon prio=6 tid=0x0000000011c91000 nid=0x32c0 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
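
On the deadlock question: when HotSpot detects a monitor deadlock, the thread dump contains a section starting with "Found one Java-level deadlock". A small Python sketch for checking saved dumps for that marker and counting thread states follows. The function name is invented here, and the absence of the marker only rules out that kind of deadlock, not other kinds of hang.

```python
import re
from collections import Counter

# Matches lines such as:  java.lang.Thread.State: TIMED_WAITING (on object monitor)
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")
# Text HotSpot prints in a thread dump when it detects a monitor deadlock.
DEADLOCK_MARKER = "Found one Java-level deadlock"


def summarize_dump(text):
    """Return (thread-state counts, whether the JVM reported a deadlock)."""
    states = Counter(STATE_RE.findall(text))
    return states, DEADLOCK_MARKER in text
```

Running this over several dumps taken a few minutes apart shows whether the state counts ever change. Note that a thread waiting on native I/O is also reported as RUNNABLE, so that state alone does not mean work is being done.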


The discussion was moved to the feedback email, where full thread dumps were attached.
Likely a case of HSQLDB corruption.
