Debugging with tmate and autoscaler (ec2)

Whenever I try to debug a failed pipeline on a Drone system with the AWS autoscaler, it fails with

	/bin/sh: /usr/drone/bin/tmate: not found

I can’t set DRONE_TMATE_ENABLED=true because I don’t know how or where to set it. I don’t have drone-runner-docker set up manually, since the runners are instantiated by the autoscaler, and the autoscaler doesn’t seem to have a setting like DRONE_TMATE_ENABLED.

Any suggestions as to what I am doing or thinking wrong? Thanks.

I’m having the same problem using debug in Drone builds, with a pipeline whose only step is exit 78.

Here is my .drone.yml:

---
kind: pipeline
type: docker
name: default

platform:
  os: linux
  arch: amd64

clone:
  disable: true

steps:
- name: exit
  image: registry.cartesius.local/git:latest
  commands:
  - exit 78
...

DRONE_TMATE_ENABLED=true is not set in my configuration; drone and drone-runner-docker are both at their latest versions.

As an addition to this, when the step is running I observe this log

and then, after build finishes, this log

@bradrydzewski, any thoughts?

I changed the line

image: drone/drone-runner-docker:1

to this

image: drone/drone-runner-docker:latest

and the problem was gone.

Thank you for the tip, but I am running drone/autoscaler with AWS, so there is no drone-runner-docker involved in my docker-compose file.

When I log in to an EC2 instance spawned by the autoscaler, I see that a drone/drone-runner-docker:1 container is active, but I don’t know how to tell the autoscaler to use the latest version of drone-runner-docker.

I think I am missing a small piece of the setup, and then it will work. That would be so convenient.

@ceelian you should be able to override the default runner Docker image that the autoscaler uses with DRONE_AGENT_IMAGE=drone/drone-runner-docker:latest; see the DRONE_AGENT_IMAGE page in the Drone docs.
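
For anyone landing here later, a minimal sketch of how that override might sit in a docker-compose file. The service name `autoscaler` is hypothetical, and every other setting the autoscaler needs (server address, AWS credentials, and so on) is left out, so treat this as a fragment rather than a working file:

```yaml
# docker-compose.yml (fragment, not a complete configuration)
services:
  autoscaler:                      # hypothetical service name
    image: drone/autoscaler
    environment:
      # Runner image the autoscaler starts on each instance it spawns,
      # as suggested in this thread.
      - DRONE_AGENT_IMAGE=drone/drone-runner-docker:latest
      # ...remaining autoscaler settings omitted...
```

The setting presumably only affects instances created after the autoscaler is restarted with it, so an already running instance may still show the old runner image.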

It would be great to get to the bottom of the issue. There don’t seem to be any significant changes between version 1 (which should pull 1.8.0) and latest. I’ll see if I can find any clues.

@ceelian tmate needs to be enabled by passing DRONE_TMATE_ENABLED=true to the runner. Since you are launching runners using the autoscaler, you would need to configure the autoscaler to set the variable. I believe you can use the following autoscaler environment variable to set runner environment variables:

DRONE_AGENT_ENVIRON=DRONE_TMATE_ENABLED=true

Thank you for the tip. I tried adding the following environment entry to the docker-compose service that starts drone/autoscaler:1.8.0:

- DRONE_AGENT_ENVIRON=DRONE_TMATE_ENABLED:true

But again, no luck. The output from Drone is below (I trigger the error by calling a command that does not exist):

latest: Pulling from library/alpine
Digest: sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4f9454
Status: Image is up to date for alpine:latest
+ not_exsisting_command
/bin/sh: not_exsisting_command: not found
/bin/sh: /usr/drone/bin/tmate: not found

In this thread, Tmate not found error - #2 by csgit - Drone, @bradrydzewski said that tmate should be installed automatically in each step. But it looks like this does not happen with the autoscaler on AWS.

Right now, I’m really running out of ideas on how to make this work.

@drone I tried it once more with your suggested syntax, even though it differs from the syntax shown in the docs (DRONE_AGENT_ENVIRON | Drone), which looks like

DRONE_AGENT_ENVIRON=foo:bar,baz:qux

After adding the following line to the docker-compose.yaml, it finally worked after 8 months of trial and error. :partying_face:

- DRONE_AGENT_ENVIRON=DRONE_TMATE_ENABLED=true
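
To show where that line goes, here is a sketch of the surrounding docker-compose service, assuming a service named `autoscaler` (a hypothetical name) and leaving out all other autoscaler settings. The colon form is included only as a comment, because it is the variant reported above as not working:

```yaml
# docker-compose.yaml (fragment, not a complete configuration)
services:
  autoscaler:                      # hypothetical service name
    image: drone/autoscaler:1.8.0
    environment:
      # Passed through to the runners that the autoscaler launches.
      # The KEY=value form is the one reported as working in this thread.
      - DRONE_AGENT_ENVIRON=DRONE_TMATE_ENABLED=true
      # The colon form did not enable tmate in this thread:
      # - DRONE_AGENT_ENVIRON=DRONE_TMATE_ENABLED:true
      # ...remaining autoscaler settings omitted...
```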

Thank you very much!