How We Migrated Our Internal Python Packages From GitLab to Gcloud Artifacts

This post walks through how we migrated our internal packages from the GitLab package registry to Google Cloud Artifact Registry.

At Docsumo, we recently transitioned our version control system from GitLab to GitHub. One challenge that emerged during this migration was the lack of native support for Python packages in GitHub. To address this, we opted to move our internal packages to Google Cloud Artifacts.

Here are the steps I followed to migrate around 250 versions of our internal packages from the GitLab Python registry to Google Cloud Artifact Registry.

Create a Python registry

First, create a Python repository in Google Cloud Artifact Registry. You can do this in the Google Cloud Console by selecting Python as the format and adjusting the other options as required.

Alternatively, you can create the registry using the following command:

gcloud artifacts repositories create python-packages \
    --repository-format=python \
    --location=us \
    --description="Python package repository"

You can change the repository name from python-packages to anything you prefer and adjust other settings as needed.

After creating the repository, your repository URL will look like this:

https://<PROJECT_LOCATION>-python.pkg.dev/<PROJECT_ID>/<REGISTRY_NAME>/

You can also copy the repository URL from the Google Cloud Console interface.

Next, set the repository URL in an environment variable, as it will be required later to upload the packages.

export REPO_URL=<REPOSITORY_URL>
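
As a rough illustration of the URL pattern above, the sketch below assembles the repository URL from its parts. The function name and the sample values (my-project, python-packages) are hypothetical. The extra simple/ suffix is the form pip expects when the same repository is later used as an index for installs.

```python
def artifact_registry_urls(location: str, project_id: str, repository: str) -> dict:
    """Build the upload URL and the pip index URL for a Python repository."""
    base = f"https://{location}-python.pkg.dev/{project_id}/{repository}/"
    return {
        "upload": base,             # value for REPO_URL / twine --repository-url
        "index": base + "simple/",  # value for pip --index-url
    }


# Hypothetical project and repository names, for illustration only
urls = artifact_registry_urls("us", "my-project", "python-packages")
print(urls["upload"])
print(urls["index"])
```

The "upload" value is what goes into REPO_URL in the export command above.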

Download the packages

The next step is to download the packages from GitLab. Below is the Python script that I used.

Before running it, set the GITLAB_USERNAME, GITLAB_TOKEN and GITLAB_ORG_NAME environment variables. You can generate a token from the access token settings of your GitLab account; make sure to give the token the read_registry permission.

After executing the script, all of your packages will be downloaded to the directory set in DOWNLOAD_DIR.
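
Since the script reads its settings with os.environ.get, an unset variable silently becomes None and only surfaces later as a failed request or a malformed URL. A minimal sketch of a pre-flight check, assuming the three variable names listed above (the helper name missing_variables is hypothetical):

```python
import os

REQUIRED_VARS = ("GITLAB_USERNAME", "GITLAB_TOKEN", "GITLAB_ORG_NAME")


def missing_variables(environ=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


missing = missing_variables()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set")
```

Running this before the download script makes a typo in a variable name obvious right away.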

import os
import subprocess

import requests

GITLAB_USERNAME = os.environ.get("GITLAB_USERNAME")
GITLAB_TOKEN = os.environ.get("GITLAB_TOKEN")
GITLAB_ORG_NAME = os.environ.get("GITLAB_ORG_NAME")
DOWNLOAD_DIR = "packages"


def get_all_group_packages(group: str, token: str):
    """Use the GitLab API to get all packages and their metadata."""
    url = f"https://gitlab.com/api/v4/groups/{group}/packages"
    params = {
        "per_page": 100,  # GitLab caps the page size at 100
        "page": 1,  # GitLab pages are 1-indexed
        "exclude_subgroups": False,
        "order_by": "project_path",
    }
    headers = {"PRIVATE-TOKEN": token}

    while True:
        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()  # fail loudly on auth or permission errors
        packages = response.json()

        # An empty page means we have walked past the last package
        if not packages:
            break

        yield from packages

        params["page"] += 1


os.makedirs(DOWNLOAD_DIR, exist_ok=True)

for package in get_all_group_packages(GITLAB_ORG_NAME, GITLAB_TOKEN):
    package_name = f"{package['name']}=={package['version']}"
    print(f"Downloading {package_name}")

    # The delete link points at .../projects/<id>/packages/<package_id>;
    # dropping the last segment leaves the project's package API root.
    package_path = package.get("_links", {}).get("delete_api_path")
    if not package_path:
        print(f"Skipping {package_name}: no delete_api_path in the response")
        continue
    package_path = package_path.replace("https://", "").rsplit("/", 1)[0]

    index_url = f"https://{GITLAB_USERNAME}:{GITLAB_TOKEN}@{package_path}/pypi/simple"

    # Pass the arguments as a list so nothing is interpreted by a shell
    subprocess.run(
        [
            "pip", "download", package_name,
            "--no-deps", "--no-build-isolation",
            "--dest", DOWNLOAD_DIR,
            "--index-url", index_url,
        ],
        check=False,  # keep going if a single version fails to download
    )
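
The least obvious step in the script is how it turns a package's delete_api_path into a per-project PyPI index URL. The sketch below isolates that step so it can be checked on its own. The function name and the sample path and credentials are hypothetical, and the percent-encoding of the credentials is an extra precaution that the script above does not take.

```python
from urllib.parse import quote


def pypi_index_url(delete_api_path: str, username: str, token: str) -> str:
    """Turn a package's delete_api_path into an authenticated PyPI index URL."""
    # Drop the trailing package id: .../projects/<id>/packages/<package_id>
    project_packages = delete_api_path.rsplit("/", 1)[0]
    host_and_path = project_packages.replace("https://", "", 1)
    # Percent-encode credentials so characters like "@" or ":" do not break the URL
    creds = f"{quote(username, safe='')}:{quote(token, safe='')}"
    return f"https://{creds}@{host_and_path}/pypi/simple"


# Hypothetical values, shaped like the _links entry the script reads
sample = "https://gitlab.com/api/v4/projects/123/packages/456"
index_url = pypi_index_url(sample, "alice", "glpat-example")
print(index_url)
```

Each package version is downloaded from the index of the project that owns it, which is why the URL is rebuilt inside the loop rather than once up front.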

Upload the packages

With all package wheels downloaded, the next step is to upload them to the newly created Google Cloud Artifact Registry.

Start by installing the required tools:

pip install twine keyrings.google-artifactregistry-auth

Then, navigate to the directory where the packages are downloaded and use twine to upload them to Google Cloud.

twine upload --repository-url $REPO_URL *.whl --verbose

After running the above command, the packages should start uploading to Google Cloud Artifact Registry.
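
The command above only matches *.whl files. pip download falls back to a source archive when a version has no compatible wheel, so it can be worth checking the download directory for .tar.gz files as well before uploading. A minimal sketch of that check, with hypothetical helper names and sample file names; it only builds the twine argument list and does not run the upload:

```python
import tempfile
from pathlib import Path


def collect_distributions(directory: str) -> list:
    """Return the wheel and sdist file names in a directory, sorted by name."""
    patterns = ("*.whl", "*.tar.gz")
    found = {path.name for pattern in patterns for path in Path(directory).glob(pattern)}
    return sorted(found)


def twine_upload_args(repo_url: str, files: list) -> list:
    """Build the argument list for a twine upload of the given files."""
    return ["twine", "upload", "--repository-url", repo_url, "--verbose", *files]


# Demonstrate on a throwaway directory with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ("pkg_a-1.0-py3-none-any.whl", "pkg_b-2.0.tar.gz", "notes.txt"):
        Path(tmp, name).touch()
    files = collect_distributions(tmp)

args = twine_upload_args("https://us-python.pkg.dev/my-project/python-packages/", files)
print(files)
print(args)
```

The resulting list could be passed to subprocess.run from inside the download directory once the file list looks right.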

I hope this guide helps you in making a similar transition smoothly. Feel free to reach out if you have any questions or need any assistance!

This post is licensed under CC BY 4.0 by the author.