Drone-runner-kube slow to process spike in pipeline requests

colinhoglund · September 22, 2021, 7:32pm

Hey Brad, thanks for the feedback!

I pre-warmed a sandbox k8s cluster with enough capacity to support my test case, which is a .drone.yml with 50 parallel pipelines, each of which just runs a basic image build.
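For anyone who wants to reproduce something similar, here is a rough sketch of how such a test file could be generated. It is not my actual .drone.yml: the pipeline names, the `alpine:3` image, and the `echo` command are placeholders standing in for the real image build step, and it assumes pipelines without `depends_on` are started in parallel.

```python
def render_pipelines(count=50, image="alpine:3"):
    """Render a multi-document .drone.yml with `count` independent pipelines.

    Hypothetical stand-in for the real test file: the step below only echoes
    a message instead of building an image.
    """
    docs = []
    for i in range(1, count + 1):
        docs.append("\n".join([
            "kind: pipeline",
            "type: kubernetes",
            f"name: build-{i:02d}",  # placeholder naming scheme
            "",
            "steps:",
            "- name: build",
            f"  image: {image}",     # placeholder image
            "  commands:",
            "  - echo building",     # placeholder for the image build
        ]))
    # Each pipeline is its own YAML document, separated by '---'.
    return "---\n" + "\n---\n".join(docs) + "\n"


if __name__ == "__main__":
    with open(".drone.yml", "w") as fh:
        fh.write(render_pipelines())
```

Pushing a commit with the generated file should queue all 50 pipelines at once, which is the spike I am describing.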

When running with a single kube-runner replica, start times quickly balloon to 10x or more of their normal values, and build containers appear to be scheduled more or less sequentially.

Great, I did not know this was supported! I just tested this with a few iterations using the test method described above. When running with 3 kube-runner replicas, start times seemed to decrease by ~1/3, and when running with 20 kube-runner replicas, start times seemed to return to their normal expected values.

If scaling the kube-runner requires no special consideration, then we will probably just look at throwing more replicas at it for now. I suppose we could also look into setting up an HPA with custom metrics to scale using the drone_running_jobs metric. Do you think that would be the appropriate metric to determine if the kube-runner needs more capacity?
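To make the HPA idea a bit more concrete, here is a small sketch of the replica calculation the Kubernetes HPA applies (desired = ceil(current * currentMetric / targetMetric), with a default tolerance of 0.1), fed with a running-jobs count. It assumes drone_running_jobs is surfaced to the HPA through a metrics adapter and targeted as an average per replica; the target of 5 jobs per replica and the 1 to 20 replica bounds are made-up numbers, not recommendations.

```python
import math


def desired_replicas(current_replicas, running_jobs, target_per_replica=5,
                     min_replicas=1, max_replicas=20, tolerance=0.1):
    """Approximate the HPA replica calculation for an average-value target.

    running_jobs is the total reported by the metric (assumed here to be
    drone_running_jobs); target_per_replica and the bounds are hypothetical.
    """
    per_replica = running_jobs / current_replicas
    ratio = per_replica / target_per_replica
    # The HPA skips scaling when the ratio is within tolerance of 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured minReplicas / maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    for replicas, jobs in [(1, 50), (3, 15), (20, 0)]:
        print(replicas, jobs, desired_replicas(replicas, jobs))
```

With these assumed numbers, a burst of 50 running jobs against a single replica would ask for 10 replicas, and an idle system would settle back to the minimum.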
