I’m not sure exactly what’s going on, or why pushing to gcr
specifically fails completely.
Docker info:
Client:
 Version: 17.12.1-ce
 API version: 1.35
 Go version: go1.9.4
 Git commit: 7390fc6
 Built: Tue Feb 27 22:17:53 2018
 OS/Arch: linux/amd64
Server:
 Engine:
  Version: 18.03.0-ce-rc4
  API version: 1.37 (minimum version 1.12)
  Go version: go1.9.4
  Git commit: fbedb97
  Built: Thu Mar 15 07:42:54 2018
  OS/Arch: linux/amd64
  Experimental: false
Here are the error messages I’m getting:
From the dind container:
time="2018-03-18T23:38:34.929137948Z" level=error msg="Upload failed, retrying: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56880->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:38:45.697296074Z" level=info msg="Layer sha256:7bd9c31f8d447bdf187e1b4481dd47dead674782425e7f8a0f5c7b38291f36ea cleaned up"
time="2018-03-18T23:38:46.968848532Z" level=error msg="Upload failed, retrying: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56936->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:38:52.106884826Z" level=error msg="Upload failed, retrying: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56940->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:38:55.058103684Z" level=error msg="Upload failed: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56944->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:38:55.058470308Z" level=info msg="Attempting next endpoint for push after error: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56944->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:39:02.251787859Z" level=error msg="Upload failed, retrying: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56952->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:39:17.388838238Z" level=error msg="Upload failed, retrying: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56960->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:39:37.548286736Z" level=error msg="Upload failed: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56972->74.125.141.82:443: write: broken pipe"
time="2018-03-18T23:39:37.548575722Z" level=info msg="Attempting next endpoint for push after error: net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56972->74.125.141.82:443: write: broken pipe"
And from my build runner:
Basically a whole lot of these:
3dd242197f82: Retrying in 3 seconds
f5d9842d46d0: Retrying in 2 seconds
75645eedf26d: Retrying in 2 seconds
86b86df1dd4d: Retrying in 2 seconds
3dd242197f82: Retrying in 2 seconds
f5d9842d46d0: Retrying in 1 second
75645eedf26d: Retrying in 1 second
86b86df1dd4d: Retrying in 1 second
3dd242197f82: Retrying in 1 second
net/http: HTTP/1.x transport connection broken: write tcp 10.28.4.2:56474->74.125.141.82:443: write: broken pipe
Makefile:77: recipe for target 'deploy_docker' failed
and a whole lot of these:
16756a95e889: Retrying in 3 seconds
16756a95e889: Retrying in 2 seconds
16756a95e889: Retrying in 1 second
unexpected EOF
Here’s my setup:
I’m using a single GKE cluster with the Drone Helm chart, with dind enabled.
I’m using a custom build runner; it basically loops through listed sub-directories and runs an os.Exec.
In this case, it’s gcloud docker build (successful),
The relevant parts of my .drone.yml:
pipeline:
  deploy_docker:
    image: my-build-runner-image-with-docker-ce
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    # network_mode: host
    environment:
      # I've tried using the provided docker-in-docker with network_mode: host
      # - DOCKER_HOST=tcp://localhost:2375
      - DOCKER_REGISTRY_DOMAIN=registry.example.io
      # Enabling this flag tells the makefile to use "gcloud docker --" instead of "docker --"
      # - DOCKER_USE_GCP=true
    commands:
      - docker version
      - gcloud docker --authorize-only
      - jules -stage deploy_docker
So I can do this, and it will work in this environment. This is as close as I can get to reproducing the actual build process.
This uses the ci-drone-agent's dind container that’s deployed on my cluster, with the same image and options that I provided above.
I then clone the repository in the same location, and run the commands provided above.
kubectl exec -it ci-drone-agent-75f65b6f99-zbq6p -c ci-drone-dind docker run --rm -it -e GKE_CLUSTER_NAME=my-cluster-1 -e GKE_CLUSTER_ZONE=us-east1-b -e GCP_PROJECT=my-gcp-project -e DOCKER_USE_GCP=true --network host my-build-runner-image-with-docker-ce /bin/bash
And it works perfectly when I do it manually.
I’ve tried:
- Enabling host networking and using the dind docker daemon via tcp.
- Disabling dind, setting the pipeline container to privileged mode, and mounting /var/run/docker.sock from the node.
- Using various methods to authenticate with gcr.io, including gcloud docker -- push, gcloud docker --authorize-only, and using the _json_token user with the key provided to docker login.
- Switching the base image of my build runner jules from Debian to Ubuntu.
- Upgrading the dind image to :latest.
The weird thing about this issue?
It’s exclusive to gcr, but only when it’s run by Drone.
FYI, these are really small FROM scratch images with just a binary on them.
Any ideas or anything else I could try?