I was considering switching from Jenkins to Drone to build Apache MXNet. It’s a complex CI setup that involves builds across different instance families: there are separate pools of CPU, GPU, and Windows instances in AWS, and they are autoscaled dynamically based on the number of queued jobs.
Would it be possible to cover this setup with Drone? Is it possible to have different pools of workers, for example CPU and GPU instances? Docker support on the Windows and GPU instances might be a problem, so at a minimum I would like to understand whether it would cover this use case for the Linux workloads.
Drone supports multiple runtime engines. The Docker runtime engine, which executes builds inside Docker containers [1], is the most popular option. We also have a host machine runtime engine that runs builds directly on the host [2], and an SSH runtime engine that runs builds on remote machines over the SSH protocol [3].
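For illustration only, here is a minimal sketch of how the runtime engine is selected in the pipeline yaml. This assumes the Drone 1.x format, where the type field picks the engine; the pipeline name and step contents are placeholders:

```yaml
# Minimal sketch (assumes the Drone 1.x yaml format).
# `type` selects the runtime engine: docker, exec (host machine), or ssh.
kind: pipeline
type: exec
name: windows-build

platform:
  os: windows
  arch: amd64

steps:
- name: build
  commands:
  - make build   # placeholder command
```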
Wow, thanks for your reply. We do autoscaling with a Lambda function that examines the job queue. Does Drone have a way to do something similar using an HTTP API? In that case we could attach agents directly, as we do with Jenkins.
Yes, there is a /api/queue endpoint on the Drone server that returns a list of pending and running jobs. The payload includes the os, arch, kernel, labels, etc., which can all be used for autoscaling. This is the same endpoint our autoscaling project uses.
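As a rough illustration (not the autoscaler’s actual code), a poller along these lines could take the place of the Lambda function’s Jenkins query. The server URL, token variable, and the pending-status field are assumptions to verify against your Drone version:

```python
# Hypothetical sketch of polling the Drone queue for autoscaling decisions.
# The /api/queue endpoint and the os/arch/labels fields come from the reply above;
# DRONE_SERVER, DRONE_TOKEN, and the "pending" status value are placeholders.
import os
from collections import Counter

import requests

DRONE_SERVER = os.environ.get("DRONE_SERVER", "https://drone.example.com")
DRONE_TOKEN = os.environ["DRONE_TOKEN"]


def pending_jobs_by_profile():
    resp = requests.get(
        f"{DRONE_SERVER}/api/queue",
        headers={"Authorization": f"Bearer {DRONE_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    counts = Counter()
    for item in resp.json():
        if item.get("status") == "pending":
            # os/arch/labels can drive per-pool scaling (CPU vs GPU vs Windows)
            profile = (item.get("os"), item.get("arch"), str(item.get("labels")))
            counts[profile] += 1
    return counts


if __name__ == "__main__":
    for profile, count in pending_jobs_by_profile().items():
        print(profile, count)
```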
If you look at our pipelines, they are quite big, and we run a single pipeline execution across different hosts. I don’t think the assumption that one pipeline can’t span multiple hosts is a good fit for us.
Thanks a lot for your answers. I would be excited if we could use drone.io.
The yaml configuration file can define multiple pipelines as an execution graph [1], where each pipeline is executed independently of the others and can be scheduled on any machine in the cluster that matches the pipeline requirements (os, arch, labels, etc.) [2]; see the sketch below.
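A minimal sketch of such a graph, assuming Drone 1.x yaml with platform/node routing and depends_on; the image names, labels, and script paths are placeholders:

```yaml
# Sketch of a multi-pipeline execution graph (assumes Drone 1.x yaml).
kind: pipeline
type: docker
name: build-cpu

platform:
  os: linux
  arch: amd64

steps:
- name: build
  image: ubuntu:18.04
  commands:
  - ./ci/build_cpu.sh   # placeholder script

---
kind: pipeline
type: docker
name: test-gpu

# route this pipeline to agents started with matching labels
node:
  gpu: "true"

steps:
- name: test
  image: nvidia/cuda:10.0-base
  commands:
  - ./ci/test_gpu.sh    # placeholder script

depends_on:
- build-cpu
```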
Given the complexity of your pipelines, you might also want to consider defining them with Starlark scripting [3] as opposed to yaml; a rough sketch follows.
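This Starlark sketch assumes the main(ctx) entry point used by Drone’s Starlark conversion and generates the same kind of graph programmatically; the names, images, and scripts are placeholders:

```starlark
# .drone.star sketch (Starlark is a Python-like dialect; this assumes the
# main(ctx) entry point returning a list of pipeline objects).
def main(ctx):
    return [
        pipeline("build-cpu", "linux", "amd64", ["./ci/build_cpu.sh"]),
        pipeline("build-arm", "linux", "arm64", ["./ci/build_arm.sh"], deps=["build-cpu"]),
    ]

def pipeline(name, os, arch, commands, deps=None):
    return {
        "kind": "pipeline",
        "type": "docker",
        "name": name,
        "platform": {"os": os, "arch": arch},
        "steps": [{
            "name": "build",
            "image": "ubuntu:18.04",  # placeholder image
            "commands": commands,
        }],
        "depends_on": deps or [],
    }
```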
What version do you suggest installing? The one in the Docker container seems very lightweight / demo-oriented, or is that the recommended way to run the master? I didn’t see any other binary packages provided. Do you recommend building it myself or using the Docker image?
Yes, the autoscaler can scale by label. More specifically, the autoscaler will only scale workloads that match its profile (os, arch, labels), so you would set up a separate autoscaler for each unique profile.