Using GitHub or GitLab in Research

By C.Du @snail123815

GitHub and GitLab are powerful platforms for version control and collaboration on code. In our context, GitHub is often the practical choice for public repositories and for collaboration with people outside Leiden University, while the university-hosted GitLab service is the preferred option for projects involving personal data or other work that should stay within the university environment. The university GitLab is listed in the VRE service overview.

However, Git itself and these hosting platforms are designed to track code and text files, not every type of research output. File sizes are restricted (for example, GitHub blocks files over 100 MB), and neither Git nor the hosting platforms are suited to storing (temporary) results, large datasets, or binary files.
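To make the file size restriction concrete, the following is a minimal sketch of a check you could run before committing. The function name `find_large_files` and the constant `GITHUB_LIMIT_BYTES` are hypothetical names introduced here for illustration, and the sketch assumes the 100 MB limit mentioned above is counted as 100 × 1024 × 1024 bytes. It only reports files in the working directory; it does not inspect what is already in the Git history.

```python
from pathlib import Path

# Assumption: the 100 MB limit mentioned in the text, counted in binary units.
GITHUB_LIMIT_BYTES = 100 * 1024 * 1024


def find_large_files(root, limit_bytes=GITHUB_LIMIT_BYTES):
    """Return (relative path, size in bytes) for files above limit_bytes, largest first."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        # Skip Git's own internal storage; only working files matter here.
        if ".git" in relative.parts:
            continue
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                hits.append((relative.as_posix(), size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling `find_large_files(".")` in the repository root returns an empty list when nothing exceeds the limit. Any file it does report is a candidate for storage outside the repository, with a link recorded in the ELN.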

The university GitLab also does not allow repositories to be shared externally. That makes GitHub the better solution when you need external collaborators, public visibility, or GitHub-specific services such as the GitHub-to-Zenodo release workflow.

Because Git and its hosting platforms cannot hold all research output, an Electronic Lab Notebook (ELN) remains essential, even for bioinformatics students. The ELN captures the full research context that Git cannot: the reasoning behind decisions, experimental results, links to data files, and the relationship between code and outcomes.

What Git tracks and what it does not

| Tracked well by Git               | Not suited for Git                                      |
|-----------------------------------|---------------------------------------------------------|
| Source code and scripts           | Large data files and databases                          |
| Plain-text configuration files    | Binary result files (images, PDFs, compressed archives) |
| Documentation in Markdown/text    | Temporary or intermediate results                       |
| Commit history and diffs          | Repository-level operations (renames, transfers)        |

Because of these limitations, you should use Git and an ELN together as complementary tools.

Summary

| When                  | What to record in ELN                                                                                                                                          |
|-----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Repository created    | Purpose, link to repo, chosen platform and reasoning, access info                                                                                              |
| Major repo operations | Name changes, transfers, permission changes, with reasoning                                                                                                    |
| Each test run         | Description, code version (commit hash), link to results on Research Drive; if using sync-with-rclone, keep the generated git repository summary file         |
| Project completed     | Release version, platform, Zenodo DOI                                                                                                                          |
| Release archived      | Link to release zip on Research Drive                                                                                                                          |
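
The "Each test run" entry in the summary can be partly automated. The sketch below is one possible way to do so, not a prescribed workflow: it reads the current commit hash with `git rev-parse HEAD`, checks for uncommitted changes with `git status --porcelain`, and formats a short text block to paste into the ELN. The function names and the layout of the entry are assumptions made for illustration, and the results link is whatever path or URL you use on Research Drive.

```python
import subprocess
from datetime import date


def git_output(args, repo="."):
    """Run a git command in repo and return its trimmed standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def current_commit(repo="."):
    """Return the full hash of the commit currently checked out."""
    return git_output(["rev-parse", "HEAD"], repo)


def has_uncommitted_changes(repo="."):
    """Return True if the working tree differs from the last commit."""
    return git_output(["status", "--porcelain"], repo) != ""


def format_run_entry(description, commit_hash, results_link, dirty=False, run_date=None):
    """Build a plain-text ELN entry for one test run (hypothetical layout)."""
    run_date = run_date or date.today()
    version = commit_hash + (" (uncommitted changes present)" if dirty else "")
    lines = [
        f"Date: {run_date.isoformat()}",
        f"Description: {description}",
        f"Code version: {version}",
        f"Results: {results_link}",
    ]
    return "\n".join(lines)
```

A typical call from inside the repository would be `format_run_entry("Parameter sweep", current_commit(), results_link, dirty=has_uncommitted_changes())`, where `results_link` is your own link to the results. Flagging uncommitted changes matters because a commit hash alone does not describe the code that actually ran if the working tree was modified.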