Kubernetes runner image building and caching best practices?

We are evaluating a move from an older Drone 1 dind setup to Drone 2 with the Kubernetes runner. We are on GCP, have ~60 pipelines, and obviously want to introduce as few changes as possible. We also want to be able to autoscale our GKE worker nodes based on CPU and memory. Currently, we have 2 blockers that we are unsure how to solve:

  • First is Docker image caching. If a pipeline runs on the same worker, everything works fine because the image exists locally. When new workers are introduced, though, everything needs to be rebuilt from scratch. I wonder what the suggested way forward is here. I understand that image caching is a difficult problem to solve in the Kubernetes ecosystem, but the runner becomes much less attractive if there is no way around it.

  • Second is image building. Our drone.yaml file includes many different pipelines. One of them builds the Docker image locally, and the rest use it for all kinds of things (testing, building, etc.). With multiple worker nodes, a pipeline can be picked up by a different worker that doesn’t have the image locally (because another worker built it), and the pipeline fails. I guess we could push the image to GCR when we build it so each pipeline pulls it from there, but this sounds like an “expensive” workaround, especially without proper caching in place.
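For reference, the push-to-GCR workaround described above could look roughly like the sketch below, using the community GCR plugin in the build pipeline and then referencing the pushed image in later pipelines. The project ID, repo name, and secret name are placeholders, not our actual configuration:

```yaml
kind: pipeline
type: kubernetes
name: build-image

steps:
  - name: build-and-push
    image: plugins/gcr
    settings:
      registry: gcr.io
      repo: my-project/my-app          # hypothetical project/repo
      tags: ${DRONE_COMMIT_SHA}
      json_key:
        from_secret: gcr_json_key      # hypothetical secret name
```

A downstream pipeline would then run its steps directly from the pushed image, e.g. `image: gcr.io/my-project/my-app:${DRONE_COMMIT_SHA}`, so it no longer matters which worker built it.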

Is anyone using the Kubernetes runner at a large production scale and willing to share some ideas?

For the first point, we are proceeding with the --cache-from and --cache-to options, storing the cache on a shared Filestore instance that is mounted to every new worker node when GKE autoscales.
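A minimal sketch of what such a step might look like, assuming the Filestore share is mounted at the same host path on every node (the paths and image tags below are placeholders, not our real values):

```yaml
kind: pipeline
type: kubernetes
name: build

volumes:
  - name: buildcache
    host:
      path: /mnt/filestore/buildcache   # assumed Filestore mount point on each node

steps:
  - name: docker-build
    image: docker:24
    volumes:
      - name: buildcache
        path: /cache
    commands:
      # BuildKit reads prior layers from the shared cache and writes
      # updated layers back, so new autoscaled nodes start warm.
      - docker buildx build
          --cache-from type=local,src=/cache/myapp
          --cache-to type=local,dest=/cache/myapp,mode=max
          -t myapp:latest .
```

With `mode=max`, intermediate layers are exported as well, which costs more cache storage but avoids rebuilding multi-stage intermediates on fresh nodes.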

There is no easy way to solve the second point without playing around with volume mounts, which we want to avoid: we have ~60 configurations and some of them have a large number of steps.

However, we are now wondering: is there a way to force all pipelines in a single file to be scheduled on the same Kubernetes worker node? That would solve our issue, since the image would always exist on the worker where the docker build command runs.
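One option worth noting: the Kubernetes runner supports a `node_selector` attribute at the pipeline level, which constrains where the pipeline pod is scheduled. A sketch, assuming a hypothetical node label; note that this pins pipelines to a set of matching nodes rather than guaranteeing a single node, so it only fully solves the problem if the selector matches exactly one node (or a one-node pool):

```yaml
kind: pipeline
type: kubernetes
name: test

node_selector:
  drone-builds: dedicated   # hypothetical label applied to the build node(s)

steps:
  - name: run-tests
    image: myapp:latest     # expects the locally built image to be present
    commands:
      - make test
```

The same `node_selector` block would need to be repeated in every pipeline of the file, and it trades away the autoscaling benefit for those pipelines, so it may be more of a stopgap than a long-term answer.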