We deploy Drone on Kubernetes, although we use the native Docker driver rather than the Kubernetes driver.
If for some reason the drone-server pod is deleted and recreated, I'm assuming the Let's Encrypt cert stored in /data/golang-autocert is also lost? We are logging errors and are unable to access the server:
http: TLS handshake error from 10.20.0.1:43912: acme/autocert: missing certificate
I would have assumed that with a missing cached cert, drone-server would request a new one? If not, what workflow should we employ? Having a mounted volume in Kubernetes is really not much fun once we start talking StatefulSets etc…
In my experience, the Go autocert package we use [1] automatically requests a new certificate if the cache is deleted. We have not observed any issues. If you are having problems with Let's Encrypt certs, you may want to check with the package authors for tips on how to triage further.
[1] https://godoc.org/golang.org/x/crypto/acme/autocert
Will do and report back. Thanks.
You may also want to check their mailing list.
Okay, to clear this up @bradrydzewski: I looked further into the logs and Go autocert. The issue was on our end; we were being rate limited by Let's Encrypt (the ACME CA) on our .drone subdomain. Our Drone server and agents run on a K8s cluster with preemptible nodes, and these nodes are randomly killed at least once a day by Google. We use them because they are about 25% the cost of normal nodes.
Each time a node dies it takes the certs along with it, and autocert requests another one. There is a rate limit of 5 renewals per week, and of course there are 7 days… We overcame this issue by using StatefulSets. These use persistent volumes backed by your cloud provider's network storage, and we mounted one at the cert path that Drone uses… Voilà.
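The arithmetic behind the rate limit can be sketched as follows. This is a simplified model, not how Let's Encrypt actually accounts for requests: it assumes every restart loses the cache and triggers exactly one certificate request, and it counts within a single fixed 7-day window. The function name `firstRefusedDay` is made up for illustration.

```go
package main

import "fmt"

// firstRefusedDay returns the 1-based day within a 7-day window on which a
// certificate request is first refused, or 0 if the weekly limit is never hit.
// Assumption: each restart loses the cache and triggers exactly one request.
func firstRefusedDay(restartsPerDay, weeklyLimit int) int {
	issued := 0
	for day := 1; day <= 7; day++ {
		for r := 0; r < restartsPerDay; r++ {
			if issued == weeklyLimit {
				return day
			}
			issued++
		}
	}
	return 0
}

func main() {
	// One preemption per day against a limit of 5 per week.
	fmt.Println(firstRefusedDay(1, 5)) // prints: 6
	// Two preemptions per day exhaust the limit much sooner.
	fmt.Println(firstRefusedDay(2, 5)) // prints: 3
}
```

With a persistent cert path the number of requests per restart drops to zero, so the limit is never reached in this model.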
We now have a pretty simple template that we can boot up in a GKE cluster using very cheap preemptible nodes with a fair amount of CPU. Note: we use the native Docker driver, not the Drone Kubernetes driver.