How does scheduling work for parallel execution of build configs in build chains, and can one influence it?

Created April 25, 2022 17:16

Hello,

let's say I have a simple build chain consisting of these steps:

compile -> run "unit tests", "functional tests" & "integration tests" in parallel -> reporting build config that waits for all tests to finish

How does the scheduling for the parallel execution work, i.e. in which order do the agents pick up these steps? If I have 3 idle agents, everything is fine and I don't care, but if I have only one or two idle agents, it would be best if they took the longest-running steps first.

At the moment, the order by duration (longest first) is as follows:

functional tests
unit tests
integration tests

This is the order an agent chooses if it is the only one available throughout the whole build chain:

unit tests
integration tests
functional tests

I think the best approach would be to start with the longest-running step that has not run yet, in case another agent can join later. Agents joining later can then execute the shorter tasks while the longest one is running, or, if they join after the longest task is already done, all agents working on the chain can process the remaining shorter tasks in parallel. In most (if not all) cases this would save time. Otherwise (as currently happens on my build chain) everyone has to wait for the agent that ended up taking the longest-running step last. If possible, I would like to influence the scheduling that decides which step is taken first.

Let's have a look at an example:

In my case it would be best if the first agent started with the functional tests (running 9 min). A second agent could take the unit tests (5 min) and afterwards the integration tests (4 min). If the second agent joins 2 min later, the overall execution time of the tests is 11 min. With the current behaviour it is more likely to end up at 14 min: the first agent runs the unit tests (0 to 5 min), the second joins 2 min later and runs the integration tests (2 to 6 min), and the first then runs the functional tests (5 to 14 min).
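
To make the arithmetic above easy to check, here is a small Python sketch that simulates both orderings. It is only a toy model and not how TeamCity actually schedules builds: I assume all three test steps are ready at minute 0 (compile is done), and that whenever an agent becomes free it simply takes the next step from a fixed list. The function name `makespan` and the variable names are made up for this illustration.

```python
import heapq

def makespan(durations, order, agent_join_times):
    """Toy model: each time an agent is free, it takes the next step in `order`.

    Returns the minute at which the last step finishes.
    """
    # Heap of (minute the agent becomes free, agent index).
    free_at = [(t, i) for i, t in enumerate(agent_join_times)]
    heapq.heapify(free_at)
    finish = 0
    for step in order:
        start, agent = heapq.heappop(free_at)  # earliest available agent
        end = start + durations[step]
        finish = max(finish, end)
        heapq.heappush(free_at, (end, agent))
    return finish

# Durations in minutes, taken from the example above.
durations = {"functional tests": 9, "unit tests": 5, "integration tests": 4}
joins = [0, 2]  # first agent at minute 0, second agent joins 2 minutes later

longest_first = sorted(durations, key=durations.get, reverse=True)
observed = ["unit tests", "integration tests", "functional tests"]

print(makespan(durations, longest_first, joins))  # 11
print(makespan(durations, observed, joins))       # 14
```

With a single agent both orders take 18 minutes in this model, so the order only matters once a second agent joins.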

I know this is a lightweight example with short times, but the effect will add up in bigger setups with more agents, far more build chains and build steps, and longer execution times. Even here, 14 vs. 11 minutes is an increase of about 27% in elapsed time. I'm sure it is not a realistic situation to have plenty of agents that immediately grab any ready build step and execute it one at a time, while a large share of the agents idle most of the time.

Best Regards

Daniel
