Steps repeatedly failing with clone:skipped

We recently deployed a second runner in our CI cluster, and things haven't been going great since.

At first, jobs would randomly fail with the clone step marked skipped, and restarting them would sometimes fix it. Now, some projects keep failing to build with no useful logs at all.

Even with the simplest possible pipeline configuration, the error doesn't change: the pipeline environment gets destroyed as soon as it's started.

.drone.jsonnet

[
  {
    kind: 'pipeline',
    type: 'docker',
    name: 'Base',
    steps: [
      {
        commands: ['echo EHLO'],
        image: 'node:14-alpine',
        name: 'echo',
      },
    ],
    trigger: { event: ['push'] },
  },
]

Converted to .drone.yml

---
{
   "kind": "pipeline",
   "name": "Base",
   "steps": [
      {
         "commands": [
            "echo EHLO"
         ],
         "image": "node:14-alpine",
         "name": "echo"
      }
   ],
   "trigger": {
      "event": [
         "push"
      ]
   },
   "type": "docker"
}
---
kind: signature
hmac: 88611f11ae869c614ab4a045a94f0292d11a1ff99a5b74dbd997896657db855d

...
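
(For context, the YAML above is generated and signed with the drone CLI, roughly like this; the repo slug is the one that appears in the runner logs below:)

# convert .drone.jsonnet into the .drone.yml document stream shown above
drone jsonnet --stream
# append the kind: signature / hmac document
drone sign TS/modbus-manager --save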

Runner Logs

time="2021-11-30T15:31:06Z" level=debug msg="stage received" stage.id=25718 stage.name=Base stage.number=1 thread=2
time="2021-11-30T15:31:06Z" level=debug msg="stage accepted" stage.id=25718 stage.name=Base stage.number=1 thread=2
time="2021-11-30T15:31:06Z" level=debug msg="stage details fetched" build.id=18272 build.number=41 repo.id=196 repo.name=modbus-manager repo.namespace=TS stage.id=25718 stage.name=Base stage.number=1 thread=2
time="2021-11-30T15:31:06Z" level=debug msg="updated stage to running" build.id=18272 build.number=41 repo.id=196 repo.name=modbus-manager repo.namespace=TS stage.id=25718 stage.name=Base stage.number=1 thread=2
time="2021-11-30T15:31:09Z" level=debug msg="destroying the pipeline environment" build.id=18272 build.number=41 repo.id=196 repo.name=modbus-manager repo.namespace=TS stage.id=25718 stage.name=Base stage.number=1 thread=2
time="2021-11-30T15:31:10Z" level=debug msg="successfully destroyed the pipeline environment" build.id=18272 build.number=41 repo.id=196 repo.name=modbus-manager repo.namespace=TS stage.id=25718 stage.name=Base stage.number=1 thread=2
time="2021-11-30T15:31:10Z" level=debug msg="updated stage to complete" build.id=18272 build.number=41 duration=2 repo.id=196 repo.name=modbus-manager repo.namespace=TS stage.id=25718 stage.name=Base stage.number=1 thread=2
time="2021-11-30T15:31:10Z" level=debug msg="poller: request stage from remote server" thread=2
time="2021-11-30T15:31:10Z" level=trace msg="http: context canceled"
time="2021-11-30T15:31:10Z" level=debug msg="done listening for cancellations" build.id=18272 build.number=41 repo.id=196 repo.name=modbus-manager repo.namespace=TS stage.id=25718 stage.name=Base stage.number=1 thread=2

Some other topics hinted at missing secrets failing the pipeline, but this pipeline fails even though it references no secrets anywhere.

Other threads suggest restarting or updating Drone to fix the issue. Admittedly, that worked sometimes, but this morning I rebuilt the whole build cluster on drone/drone:2.6 and the same skipping issues still occur.

I'm at a loss now as to what else to try to fix this.
Any pointers?

Thanks

Support was quick to point out something new to me.

In the Drone UI, if you inspect the build's response JSON, it may expose an error that never makes it into the step logs.
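
(The same JSON is also available from the REST API; a sketch, assuming a server at https://drone.example.com and a personal token in $DRONE_TOKEN — the repo slug and build number are the ones from this post:)

curl -s -H "Authorization: Bearer $DRONE_TOKEN" \
  https://drone.example.com/api/repos/TS/modbus-manager/builds/41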

Mine was:

{"id":18272,"repo_id":196,"trigger":"@hook","number":41,"status":"error","event":"push","action":"","link":"","timestamp":0,"message":"ci: can it echo?!?","before":"0f3ae15e3ae2a7630177e4f4747044cf80dd45be","after":"6abae17094472737810e30050991d86bf79fda18","ref":"refs/heads/core","source_repo":"","source":"core","target":"core","started":1638286266,"finished":1638286269,"created":1638286266,"updated":1638286266,"version":3,"stages":[{"id":25718,"repo_id":196,"build_id":18272,"number":1,"name":"Base","kind":"pipeline","type":"docker","status":"error","error":"Error response from daemon: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network","errignore":false,"exit_code":255,"machine":"ace-bld-1-light","os":"linux","arch":"amd64","started":1638286266,"stopped":1638286268,"created":1638286266,"updated":1638286269,"version":4,"on_success":true,"on_failure":false,"steps":[{"id":232969,"step_id":25718,"number":1,"name":"clone","status":"skipped","exit_code":0,"started":1638286268,"stopped":1638286268,"version":2,"image":"drone/git:latest"},{"id":232970,"step_id":25718,"number":2,"name":"echo","status":"skipped","exit_code":0,"started":1638286268,"stopped":1638286268,"version":2,"depends_on":["clone"],"image":"docker.io/library/node:14-alpine"}]}]}

The stage's error field is the culprit: "Error response from daemon: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network". Each docker pipeline gets its own bridge network, and the daemon on the runner had exhausted its default address pools. So it was a matter of giving the Docker daemon more address pools to allocate pipeline networks from.
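
(A minimal sketch of the daemon-side fix, assuming the default /etc/docker/daemon.json location, no existing settings to merge, and that 10.10.0.0/16 is free on your network:)

# "size": 24 lets Docker carve 256 /24 networks out of the /16,
# far more than the built-in default pools allow
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
   "default-address-pools": [
      { "base": "10.10.0.0/16", "size": 24 }
   ]
}
EOF
sudo systemctl restart docker

If old pipeline networks were never cleaned up, pruning them (docker network prune) also frees pools without touching the daemon config.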

Hope this helps someone later on.