[solved] Issue - Kubernetes Runner - Cert validation issues with Plugin/Docker

I'm experiencing an issue with the Drone Kubernetes runner when using the plugins/docker plugin from plugins.drone.io.

When building a Docker image based on centos/python-36-centos7, Drone times out when trying to curl or fetch yum packages from https://packages.microsoft.com. The timeouts are reported with the message “timeout initializing nss with certpath”.

This seems to be a Docker-in-Docker issue, as the same image builds successfully locally and in Jenkins.

Inside the Drone docker container:

/drone/src # curl https://packages.microsoft.com -v
*   Trying 13.82.67.141:443...
* TCP_NODELAY set
* Connected to packages.microsoft.com (13.82.67.141) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=packages.microsoft.com
*  start date: Apr  5 23:12:30 2020 GMT
*  expire date: Apr  5 23:12:30 2022 GMT
*  subjectAltName: host "packages.microsoft.com" matched cert's "packages.microsoft.com"
*  issuer: C=US; ST=Washington; L=Redmond; O=Microsoft Corporation; OU=Microsoft IT; CN=Microsoft IT TLS CA 2
*  SSL certificate verify ok.
> GET / HTTP/1.1
> Host: packages.microsoft.com
> User-Agent: curl/7.67.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: nginx/1.16.1
< Date: Tue, 14 Jul 2020 18:04:50 GMT
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: keep-alive
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< X-Content-Type-Options: nosniff
< 
<html>
<head><title>Index of /</title></head>
<body>
<h1>Index of /</h1><hr><pre><a href="../">../</a>
<a href="centos/">centos/</a>                                            10-Mar-2020 15:06                   -
<a href="clamav/">clamav/</a>                                            10-Mar-2020 20:07                   -
<a href="config/">config/</a>                                            26-Feb-2020 01:35                   -
<a href="debian/">debian/</a>                                            25-Feb-2020 21:20                   -
<a href="fedora/">fedora/</a>                                            02-Apr-2020 01:13                   -
<a href="keys/">keys/</a>                                              26-Feb-2020 03:07                   -
<a href="opensuse/">opensuse/</a>                                          25-Feb-2020 19:22                   -
<a href="repos/">repos/</a>                                             09-Apr-2020 19:10                   -
<a href="rhel/">rhel/</a>                                              25-Feb-2020 19:12                   -
<a href="sles/">sles/</a>                                              25-Feb-2020 19:06                   -
<a href="ubuntu/">ubuntu/</a>                                            02-Apr-2020 02:12                   -
<a href="yumrepos/">yumrepos/</a>                                          09-Apr-2020 18:39                   -
<a href="index.nginx-debian.html">index.nginx-debian.html</a>                            16-Aug-2019 06:51                 612
</pre><hr></body>
</html>
* Connection #0 to host packages.microsoft.com left intact

Inside the CentOS container:

(app-root) sh-4.2# curl https://packages.microsoft.com -v
* About to connect() to packages.microsoft.com port 443 (#0)
*   Trying 40.117.131.251...
* Connected to packages.microsoft.com (40.117.131.251) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none

The same appears to happen when attempting a similar build from ubuntu:18.04:

The command '/bin/sh -c curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list' returned a non-zero code: 2
time="2020-07-14T19:47:04Z" level=fatal msg="exit status 2"

Note: I am running drone-kube-runner 1.0.0-beta.4 and drone-1.8.1

The issue is figured out, based on this article: https://medium.com/@liejuntao001/fix-docker-in-docker-network-issue-in-kubernetes-cc18c229d9e5 (hopefully this helps anyone else who hits it). The Drone build container has an MTU of 1500, while the overlay network and the Docker host have an MTU of 1450. Setting mtu to 1450 in the Drone file resolved it. It would be great to know if there is a way to avoid explicitly setting the MTU in the Drone file.
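
For anyone landing here later, the per-pipeline fix described above might look roughly like the sketch below. It assumes the plugins/docker `mtu` setting; the pipeline name, step name, and repository are hypothetical placeholders, and `dry_run` is only there so the illustration does not push anything.

```yaml
kind: pipeline
type: kubernetes
name: default              # hypothetical pipeline name

steps:
- name: build-image        # hypothetical step name
  image: plugins/docker
  settings:
    repo: example/my-image # hypothetical repository
    tags: latest
    dry_run: true          # build only, skip the push (illustration)
    mtu: 1450              # match the overlay network MTU (1450 in this thread)
```

The value 1450 comes from this cluster's overlay network; other clusters may need a different value, so check the MTU of your own pod network before copying it.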

When you configure your Kubernetes runner, you can set global environment variables (see the runner configuration reference). You can globally set PLUGIN_MTU, which will automatically set this plugin value without having to configure it in the YAML.

Thank you @ashwilliams1

@ashwilliams1, can you please provide a link to the docs? I don't see this under the kube runner or docker runner docs, or in the Helm chart.

Here is a link to all runner configuration options:
https://docs.drone.io/runner/kubernetes/configuration/reference/

There are two options in the list for globally setting environment variables.
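
As a rough sketch of the global approach, the fragment below sets PLUGIN_MTU for every pipeline step through the runner's own environment. It assumes the two options referred to are DRONE_RUNNER_ENVIRON and DRONE_RUNNER_ENV_FILE, and that DRONE_RUNNER_ENVIRON accepts comma-separated key:value pairs; please confirm both against the reference page linked above. The fragment is the `env` section of the runner container in a Kubernetes Deployment, so where it goes in your manifest or Helm values is also an assumption.

```yaml
# Fragment of the runner container spec (placement is an assumption).
env:
- name: DRONE_RUNNER_ENVIRON
  # Assumed format: key:value pairs, comma-separated.
  # Injects PLUGIN_MTU=1450 into every pipeline step.
  value: "PLUGIN_MTU:1450"
```

With this in place, the `mtu` setting should no longer be needed in each .drone.yml, since the plugin reads its settings from PLUGIN_* environment variables.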