Git and GitLab introduction

TL;DR Git(Lab) is cool, use Git(Lab)!

Version control

Versions

flowchart LR A(idea) --> B(code) --> C(program)

Typically, a developer (a.k.a. programmer) writes code to implement an idea. This code is then either interpreted or compiled into a program (or a package, script, library, etc.) that can be run on a computer.

Usually, the code is written in a programming language and goes through many iterations before it results in a working program. These iterations are called versions and can be visualized on a timeline as follows:

flowchart LR E(idea) --> A subgraph code A(version 0) --> B(version 1) --> C(version 2) --> D(version 3) end D --> F(program)

Being able to maintain these versions, review them, and even fall back to a previous one is a very powerful tool for developers!

Important

Be aware that a new software version can concern changes in just one file but will most often reflect changes in multiple files.

Branching and merging

During the development of code, many new ideas or additions often need to be explored and tested before they are either rejected or included in a final version. Such sidetracks can also be visualized on the timeline:

flowchart LR A(version 0) --> B(version 1) --> C(version 2) --> D(version 3) B --> E(version A.1) --> F(version A.2) --> C B --> G(version B.1) --> H(version B.2)

In the above example, two new ideas are introduced after version 1. They are developed further in two sidetracks (branches) that split off from the main timeline (trunk). The idea in the A-branch turns out to be a good addition and is finally included (merged) in the code in version 2. The idea in the B-branch does not work properly and is not added to the final code.

This concept of branching and merging allows the developer to keep focus on the main development in the trunk while in parallel trying out new ideas or different implementations in branches.

Continuing the above example, Git will aim to merge version 1 and version A.2 silently into a new version 2 when there is no ambiguity. If, however, the two versions contain overlapping changes to the same section of code, or other conflicts, Git marks the affected section in the resulting merged file with special indicators, as shown in the next example:

hello.c in version 1

#include <stdio.h>
int main() {
    printf("Hello, World!");
}

hello.c in version A.2

#include <stdio.h>
int main() {
    printf("Hello, Universe!");
}

hello.c in version 2; merge from version 1 and version A.2

#include <stdio.h>
int main() {
<<<<<<< version 2
    printf("Hello, World!");
=======
    printf("Hello, Universe!");
>>>>>>> version A.2
}

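Note the <<<<<<<, ======= and >>>>>>> markers. They indicate that Git found a merge conflict, which the developer needs to resolve by hand: choose one of the two variants (or combine them) and remove the markers.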
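The conflict markers in the example above can be reproduced with a strongly simplified merge routine. The sketch below is only an illustration, not Git's actual algorithm: it assumes that all three files have the same number of lines and compares them line by line, whereas Git works on computed differences. The function names `merge_lines` and `hello`, and the common starting point with the text "Hello!", are hypothetical and were chosen only for this illustration.

```python
def merge_lines(base, ours, theirs, ours_label="ours", theirs_label="theirs"):
    """Naive three-way merge of three equally long lists of lines."""
    if not len(base) == len(ours) == len(theirs):
        raise ValueError("this sketch only handles files with equal line counts")
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree, nothing to decide
            merged.append(o)
        elif o == b:      # only the other side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both sides changed the line differently: conflict
            merged.extend([f"<<<<<<< {ours_label}", o, "=======",
                           t, f">>>>>>> {theirs_label}"])
    return merged


def hello(greeting):
    """Build the lines of hello.c for a given greeting."""
    return ["#include <stdio.h>", "int main() {",
            f'    printf("{greeting}");', "}"]


common_ancestor = hello("Hello!")    # hypothetical shared starting point
trunk = hello("Hello, World!")       # change made on the trunk
branch = hello("Hello, Universe!")   # change made in the A-branch

conflicted = merge_lines(common_ancestor, trunk, branch,
                         "version 1", "version A.2")
print("\n".join(conflicted))
```

When only one side has changed a line, as in `merge_lines(common_ancestor, common_ancestor, branch)`, the sketch takes that change silently and produces no markers.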

Thus, Git enables version control:

flowchart TD A(Git) --> B(version control)

Collaboration

Git allows multiple developers to collaborate on code development. Each version is annotated with the id of the developer and a small description of the changes in the timeline. Therefore, all developers can keep track of the contributions by others.
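To make the idea of annotated versions concrete, the following minimal sketch models a version as a small record that holds the developer, a description and a link to the previous version. The class `Version`, the function `history` and the developer names are hypothetical. Git stores more information than this (for instance a timestamp and a checksum), so treat this as a picture of the concept only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    label: str                            # e.g. "version 1"
    author: str                           # id of the developer
    description: str                      # short description of the changes
    parent: Optional["Version"] = None    # previous version on the timeline


def history(version):
    """Walk back along the timeline, newest version first."""
    entries = []
    while version is not None:
        entries.append(f"{version.label} by {version.author}: {version.description}")
        version = version.parent
    return entries


v0 = Version("version 0", "alice", "first draft of hello.c")
v1 = Version("version 1", "bob", "print a greeting", parent=v0)

log = history(v1)
print("\n".join(log))
```

Because every version points to its predecessor, walking back from the newest version shows who contributed what along the timeline.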

This also allows developers either to work on separate parts of the code or to work simultaneously on the same code, in separate branches or even in the same branch. To limit merge conflicts, it is crucial that developers meet frequently to agree on how to collaborate!

As such, Git also facilitates collaboration:

flowchart TD A(Git) --> B(version control) A --> C(collaboration)

Centralized storage

In Git, all versions of the code, together with their annotations, are stored in a repository. Although Git can very well be used with only one local copy of a repository on the developer’s computer, it also allows for centralized storage of repositories. In that case, developers can retrieve (pull) the latest version of the repository from the Git server and, for instance after a day of programming, store (push) their changes back to the server.

Important

When a repository is retrieved from a server, by default a complete copy of all versions with all annotations is stored locally on the developer’s computer. Developers therefore have access to the entire history of the code’s development without a connection to a server, which allows them to work offline.

This implicitly also eases the sharing of code with multiple developers. At the same time it requires frequent (preferably once per day) pulling and pushing of the repository to ensure that discrepancies between the versions are not too large.
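Why frequent pulling and pushing matters can be illustrated with a toy model. In the sketch below a repository is reduced to an ordered list of version descriptions, and `pull` and `push` only succeed when one history is a continuation of the other. The names `Repository`, `is_prefix`, `pull` and `push`, and the developers alice and bob, are hypothetical. Real Git would offer to merge diverged histories rather than simply refuse.

```python
class Repository:
    """A repository reduced to an ordered list of version descriptions."""

    def __init__(self):
        self.versions = []

    def commit(self, description):
        self.versions.append(description)


def is_prefix(shorter, longer):
    """True if `longer` starts with all entries of `shorter`."""
    return longer[:len(shorter)] == shorter


def pull(local, server):
    """Copy the full history from the server into the local repository."""
    if is_prefix(local.versions, server.versions):
        local.versions = list(server.versions)
    elif not is_prefix(server.versions, local.versions):
        raise RuntimeError("histories have diverged; a merge is needed")


def push(local, server):
    """Store the local history on the server, if the server has nothing new."""
    if not is_prefix(server.versions, local.versions):
        raise RuntimeError("server has versions you lack; pull first")
    server.versions = list(local.versions)


server = Repository()
server.commit("version 0")

alice, bob = Repository(), Repository()
pull(alice, server)
pull(bob, server)

alice.commit("version 1")
push(alice, server)

bob.commit("version B.1")      # bob has not pulled alice's work yet
try:
    push(bob, server)
    outcome = "accepted"
except RuntimeError as error:
    outcome = str(error)
print(outcome)
```

In this sketch both developers keep a complete local history, so they can inspect it offline. Bob's push is refused because he has not yet pulled Alice's version 1, and the longer he waits, the larger that discrepancy becomes.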

The server we are using in this course is GitLab. You can access this server via a web interface. The Git and GitLab Usage section explains how to connect to this server. There are other, similar servers as well, including GitHub and Bitbucket.

In effect, our overview scheme now becomes:

flowchart TD E(GitLab) --> A A(Git) --> B(version control) A --> C(collaboration) A --> D(centralized storage)

Issue tracking

Having a central server also opens up the way for other useful development tools. One such service allows fellow developers or users of a program to report issues. These can be bugs, suggestions or questions regarding the usage of your program. The GitLab server has a very nice interface in which developers can assign issues to particular programmers, answer them or fix bugs in the code.

flowchart TD E(GitLab) --> A A(Git) --> B(version control) A --> C(collaboration) A --> D(centralized storage) E --> F(issue tracking)

Authentication and authorization

Having code available on a publicly accessible server requires some form of control over who can access this code in the first place and who is also allowed to modify it. GitLab enables maintainers to restrict access to the server with authentication and provides tools to grant specific access rights to repositories.

flowchart TD E(GitLab) --> A A(Git) --> B(version control) A --> C(collaboration) A --> D(centralized storage) E --> F(issue tracking) E --> G(authentication & authorization)

Other tools

GitLab provides many other useful tools, like wiki documentation or online editing, which fall outside the scope of this course.

It is nice, however, to point out one particular service named CI/CD (Continuous Integration & Continuous Delivery/Deployment). This allows for the automatic execution of test suites or deployment of the program after a certain action, for instance pushing a new version of the repository to the server. As a matter of fact, the website you are using now is built with CI/CD. The text is written in Markdown in a repository stored in GitLab. After each push of the repository to the server, CI/CD automatically renders the Markdown files to HTML and sends the resulting website to a web server.

More information about GitLab can be found here:


Last update: 2021-02-09