Migrating between drone instances

Hey folks,

We’re about to start a process of migrating our business from GitLab to GitHub Enterprise. This becomes a little tricky with Drone due to its tight coupling with the source control system for its authorization model.

An open question - how would you approach this? We’ve got some ideas but it would be great to hear suggestions (however crazy) from the community.

Thanks!

Some thoughts on how I would approach this below. Note that this is based on the following assumptions:

  • usernames are retained (e.g. gitlab.com/octocat => github.com/octocat)
  • repository names are retained (e.g. gitlab.com/octocat/hello-world => github.com/octocat/hello-world)
  • pull requests are migrated over to GitHub and retain the same pull request numbers

Step 1: Database Updates

I would go through the database and update the repository URLs.

UPDATE repos SET
  repo_link  = REPLACE(repo_link,  'https://gitlab.skyscanner.com/', 'https://github.skyscanner.com/'),
  repo_clone = REPLACE(repo_clone, 'https://gitlab.skyscanner.com/', 'https://github.skyscanner.com/');

Step 1.a: Diff Link Updates (optional)

The user interface has a deep link to commits in version control, stored in the build_link column in the builds table. You could also try to remap this field to the new location. I doubt a simple replace would work in this case, but you could probably still do a bulk update using a regex replace.
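
As a rough illustration of the regex remap described above, here is a minimal Go sketch. It assumes GitLab links look like .../merge_requests/N or .../commit/SHA (optionally with a /-/ prefix) and that the GitHub equivalents are .../pull/N and .../commit/SHA. The function name rewriteBuildLink and the exact URL shapes are assumptions for illustration, so check them against real values in your build_link column first.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hosts taken from the SQL example in Step 1; adjust for your instances.
const (
	oldHost = "https://gitlab.skyscanner.com/"
	newHost = "https://github.skyscanner.com/"
)

var (
	// Assumed GitLab merge request path, with or without the "/-/" prefix.
	mergeRequestPath = regexp.MustCompile(`/(?:-/)?merge_requests/(\d+)`)
	// Assumed GitLab commit path, with or without the "/-/" prefix.
	commitPath = regexp.MustCompile(`/(?:-/)?commit/([0-9a-f]+)`)
)

// rewriteBuildLink maps a GitLab deep link to the assumed GitHub equivalent.
// Links that do not start with the old host are returned unchanged.
func rewriteBuildLink(link string) string {
	if !strings.HasPrefix(link, oldHost) {
		return link
	}
	link = newHost + strings.TrimPrefix(link, oldHost)
	link = mergeRequestPath.ReplaceAllString(link, "/pull/${1}")
	link = commitPath.ReplaceAllString(link, "/commit/${1}")
	return link
}

func main() {
	fmt.Println(rewriteBuildLink(oldHost + "octocat/hello-world/merge_requests/42"))
	fmt.Println(rewriteBuildLink(oldHost + "octocat/hello-world/-/commit/abc123"))
}
```

You could run something like this over every row in the builds table and write the result back, or port the same patterns to your database's own regex replace function if it has one.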

Step 1.b: Ref Updates (optional)

The pull request refs will need to be changed in the event you want to re-run old pull request builds.

UPDATE builds SET
  build_ref = REPLACE(build_ref, 'refs/merge-requests/', 'refs/pull/')
WHERE build_event = 'pull_request';

Step 2: Token updates

In order to run builds you need an active token for each repository owner. This would require every user to log in to the new system again to refresh their token, which is probably not feasible in a large migration. So …

I would create a GitHub machine account that has access to all GitHub repositories, and generate a personal access token for this machine account. I would then overwrite all user tokens in the Drone database with this machine token, like this:

UPDATE users SET
user_token = ?

This will allow the system to continue running builds without interruption. The next time a user authenticates, the machine account token will be replaced with that user's own OAuth token.

Step 3: Repair Webhooks

Finally, you will need to add webhooks to GitHub. We have a REST endpoint that you can use to re-create webhooks (below). I would call this endpoint for every active repository in the database.

POST /api/repos/{owner}/{name}/repair

I would probably write a quick script to accomplish this, using the machine account token to authenticate and fetch all repositories.

package main

import (
	"context"
	"log"

	"github.com/drone/drone-go/drone"
	"golang.org/x/oauth2"
)

const (
	token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
	host  = "http://drone.company.com"
)

func main() {
	// build an http client that sends the token with every request
	config := new(oauth2.Config)
	auther := config.Client(
		context.Background(),
		&oauth2.Token{
			AccessToken: token,
		},
	)

	client := drone.NewClient(host, auther)

	// list the repositories visible to the token owner
	repos, err := client.RepoList()
	if err != nil {
		log.Fatalln(err)
	}

	// re-create the webhook for each repository, logging any failures
	for _, repo := range repos {
		err := client.RepoRepair(repo.Owner, repo.Name)
		if err != nil {
			log.Println(err)
		}
	}
}

Thanks for the reply Brad - gives us a great place to start from.

How epic would it be, do you think, for us to change the codebase to support multiple remotes within Drone (if we follow the same assumptions you’ve posted there) so we can migrate specific repos within the same instance? Do you see any gotchas, or would it just be about making sure we use the right remote for the right repo?

Since the remote is an interface, you could try to wrap both the GitHub and GitLab remotes into a single interface. Something like this:

type CombineRemote struct {
  gitlab remote.Remote
  github remote.Remote
}

func (r *CombineRemote) Repo(u *model.User, owner, name string) (*model.Repo, error) {
  // first find the repo in github
  repo, err := r.github.Repo(u, owner, name)
  if err != nil {
    // else fall back to gitlab
    repo, err = r.gitlab.Repo(u, owner, name)
  }
  return repo, err
}

and this:

func (r *CombineRemote) Hook(req *http.Request) (*model.Repo, *model.Build, error) {
  // gitlab identifies its webhooks with the X-Gitlab-Event header
  if req.Header.Get("X-Gitlab-Event") != "" {
    return r.gitlab.Hook(req)
  }
  return r.github.Hook(req)
}

You would want some functions, such as Login, to be GitHub only. You could probably make Activation and Deactivation GitHub only too, since you want to encourage people to move to GitHub …

You would need to make sure the RepoList function returns a combined list and de-dupes the list, in case the repository exists in both systems, in favor of the GitHub entry.
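
A minimal sketch of that merge step, kept self-contained by using a simplified Repo struct as a hypothetical stand-in for Drone's model.Repo. The helper name mergeRepos is also made up for illustration. In a real wrapper you would call both remotes' list functions and pass the two results in.

```go
package main

import "fmt"

// Repo is a simplified, hypothetical stand-in for Drone's model.Repo.
type Repo struct {
	Owner string
	Name  string
	Link  string
}

// mergeRepos combines both lists and de-dupes by owner/name.
// GitHub entries are processed first, so when a repository exists
// in both systems the GitHub entry is the one that is kept.
func mergeRepos(github, gitlab []*Repo) []*Repo {
	seen := make(map[string]bool, len(github)+len(gitlab))
	merged := make([]*Repo, 0, len(github)+len(gitlab))
	for _, list := range [][]*Repo{github, gitlab} {
		for _, repo := range list {
			key := repo.Owner + "/" + repo.Name
			if seen[key] {
				continue // already added from an earlier (preferred) list
			}
			seen[key] = true
			merged = append(merged, repo)
		}
	}
	return merged
}

func main() {
	github := []*Repo{
		{Owner: "octocat", Name: "hello-world", Link: "https://github.skyscanner.com/octocat/hello-world"},
	}
	gitlab := []*Repo{
		{Owner: "octocat", Name: "hello-world", Link: "https://gitlab.skyscanner.com/octocat/hello-world"},
		{Owner: "octocat", Name: "legacy", Link: "https://gitlab.skyscanner.com/octocat/legacy"},
	}
	for _, repo := range mergeRepos(github, gitlab) {
		fmt.Println(repo.Owner+"/"+repo.Name, repo.Link)
	}
}
```

The comparison here is case-sensitive. If owner or repository names differ in case between the two systems, you would probably want to normalize the key before comparing.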

Off the top of my head I can’t really think of any gotchas, but there have to be some edge cases that I’m not considering. But overall, I think having a custom remote wrapper should work.

Hi Brad

Do you have any solutions for dealing with auth tokens with this method? At the moment they are stored in the user model and passed to the methods that need them.

Liam