I was considering switching from Jenkins to Drone to build Apache MXNet. It’s a complex CI setup that involves builds across different instance families: there are separate pools of CPU, GPU, and Windows instances in AWS, and they are autoscaled dynamically based on the number of queued jobs.
Would it be possible to cover this setup with Drone? Is it possible to have different pools of workers, for example CPU and GPU instances? Docker support on the Windows and GPU instances might be a problem, so at a minimum I would like to understand whether it would cover this use case for the Linux workloads.
Drone supports multiple runtime engines. The Docker runtime engine, which executes builds inside Docker containers [1], is the most popular option. We also have a host machine runtime engine that runs builds directly on the host [2], and an SSH runtime engine that runs builds on remote machines over the SSH protocol [3].
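For illustration only, here is a minimal sketch of how the runtime engine is selected in the pipeline yaml. This assumes the Drone 1.x format, where the type field picks the engine; the pipeline name and step contents are placeholders:

```yaml
# Minimal sketch (assumes the Drone 1.x yaml format).
# `type` selects the runtime engine: docker, exec (host machine), or ssh.
kind: pipeline
type: exec
name: windows-build

platform:
  os: windows
  arch: amd64

steps:
- name: build
  commands:
  - make build   # placeholder command
```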
Wow, thanks for your reply. We do autoscaling with a Lambda function that examines the job queue. Does Drone have a way to do something similar using an HTTP API? In that case we could attach agents directly, as we do with Jenkins.
Yes, there is a /api/queue endpoint on the Drone server that returns a list of pending and running jobs. The payload includes the os, arch, kernel, labels, etc., which can all be used for autoscaling. This is the same endpoint our autoscaling project uses.
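As a rough illustration (not the autoscaler’s actual code), a poller along these lines could take the place of the Lambda function’s Jenkins query. The server URL, token variable, and the pending-status field are assumptions to verify against your Drone version:

```python
# Hypothetical sketch of polling the Drone queue for autoscaling decisions.
# The /api/queue endpoint and the os/arch/labels fields come from the reply above;
# DRONE_SERVER, DRONE_TOKEN, and the "pending" status value are placeholders.
import os
from collections import Counter

import requests

DRONE_SERVER = os.environ.get("DRONE_SERVER", "https://drone.example.com")
DRONE_TOKEN = os.environ["DRONE_TOKEN"]


def pending_jobs_by_profile():
    resp = requests.get(
        f"{DRONE_SERVER}/api/queue",
        headers={"Authorization": f"Bearer {DRONE_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    counts = Counter()
    for item in resp.json():
        if item.get("status") == "pending":
            # os/arch/labels can drive per-pool scaling (CPU vs GPU vs Windows)
            profile = (item.get("os"), item.get("arch"), str(item.get("labels")))
            counts[profile] += 1
    return counts


if __name__ == "__main__":
    for profile, count in pending_jobs_by_profile().items():
        print(profile, count)
```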
If you look at our pipelines, they are quite big, and we run a single pipeline execution across different hosts. I don’t think the assumption that one pipeline can’t span multiple hosts is a good fit for us.
Thanks a lot for your answers. I would be excited if we could use drone.io.
The yaml configuration file can define multiple pipelines as an execution graph [1], where each pipeline is executed independently of the others and can be scheduled on any machine in the cluster that matches the pipeline requirements (os, arch, labels, etc.) [2]; see the sketch below.
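A minimal sketch of such a graph, assuming Drone 1.x yaml with platform/node routing and depends_on; the image names, labels, and script paths are placeholders:

```yaml
# Sketch of a multi-pipeline execution graph (assumes Drone 1.x yaml).
kind: pipeline
type: docker
name: build-cpu

platform:
  os: linux
  arch: amd64

steps:
- name: build
  image: ubuntu:18.04
  commands:
  - ./ci/build_cpu.sh   # placeholder script

---
kind: pipeline
type: docker
name: test-gpu

# route this pipeline to agents started with matching labels
node:
  gpu: "true"

steps:
- name: test
  image: nvidia/cuda:10.0-base
  commands:
  - ./ci/test_gpu.sh    # placeholder script

depends_on:
- build-cpu
```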
Given the complexity of your pipelines, you might also want to consider defining them with Starlark scripting [3] as opposed to yaml; a rough sketch follows.
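This Starlark sketch assumes the main(ctx) entry point used by Drone’s Starlark conversion and generates the same kind of graph programmatically; the names, images, and scripts are placeholders:

```starlark
# .drone.star sketch (Starlark is a Python-like dialect; this assumes the
# main(ctx) entry point returning a list of pipeline objects).
def main(ctx):
    return [
        pipeline("build-cpu", "linux", "amd64", ["./ci/build_cpu.sh"]),
        pipeline("build-arm", "linux", "arm64", ["./ci/build_arm.sh"], deps=["build-cpu"]),
    ]

def pipeline(name, os, arch, commands, deps=None):
    return {
        "kind": "pipeline",
        "type": "docker",
        "name": name,
        "platform": {"os": os, "arch": arch},
        "steps": [{
            "name": "build",
            "image": "ubuntu:18.04",  # placeholder image
            "commands": commands,
        }],
        "depends_on": deps or [],
    }
```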
What version do you suggest installing? The one in the Docker container seems very lightweight / demo-oriented, or is that the recommended way to run the master? I didn’t see any other binary packages provided. Do you recommend building it myself or using the Docker image?
Yes, the autoscaler can scale by label. More specifically, the autoscaler will only scale workloads that match its profile (os, arch, labels), so you would set up a separate autoscaler for each unique profile.