Developing SWIM with git/GitLab and recent code changes

Michel Wortmann & Stefan Liersch

June 19, 2018

Version control with git

Why bother with version control systems?

  • collaborate and share
  • storing and managing multiple and parallel versions
    • rather than just naming a directory swim_final_plus_extra_2010-11-11
  • tracking changes and restoring previous versions of files by giving them a history
  • understanding what was changed, when, and by whom
  • merging two versions

What is git?

  • git is a fast, distributed version control system
  • Almost all operations are local (i.e. fast)
  • Very well documented online and on the command line
  • Distributed: every user has a copy of the entire project repository
  • Not just for large software or project groups
    • personal configuration files
    • papers, documents (works best if plain text)
  • Allows and encourages easy branching
    • e.g. experiment with new features, bug fixes in your code
    • merging is easy
  • Most open-source software projects are now developed using git

(inspired by C. Linstead)

Commits and branches

GitLab at PIK

What is GitLab?

  • Open-source platform to host git repositories
  • Similar to GitHub, but can be self-hosted
  • Offers built-in help and documentation for most features
  • Personal and group (e.g. SWIM group) repositories

Features

  • Repository management
  • Merge requests and code reviews
  • Issue tracking
  • Activity feeds and wikis

PIK’s GitLab

The SWIM group

Recent SWIM code changes

Fully functional minimal test project

  • The Blankenstein (Saale) catchment, ~1000 km²
  • 2 subcatchments, 10 subbasins, 185 hydrotopes
  • synchronised with the test project of m.swim
code/
docs/
project/
    input/
      blank.bsn
      blank.str
      ...
    output/

Flexible landuse cover classes

  • a flexible number of classes rather than 15 static ones
  • cntab.dat replaced by a *.lut file

A more convenient Makefile

  • shorter and more flexible
  • some errors/warnings removed, more cleaning needed

Single .sub, .rte, .gw files to tables

  • combined single files into simple tables
  • b3SubFiles switch in .bsn file for transition
input/
  Sub/
    groundwater.tab
    subbasin.tab
    routing.tab

Work-in-progress: Full documentation in LaTeX

  • docs/ contains a LaTeX project documenting new developments
  • The SWIM Manual (Krysanova et al., 2000) needs to be converted from Word to LaTeX (started)
code/
docs/
  figures/
  CHANGES_SWIM_2015-11.txt
  SWIM_new_developments_documentation.pdf
  SWIM_new_developments_documentation.tex

Contributing to the SWIM repository

Methods

GitLab merge requests

  • for improvements, new features
  • should be required if input files change
  • code review
  • documentation & publication of change on GitLab

Direct push

  • for obvious bugs and minimal changes
  • only available to masters/owners
  • unreviewed changes lead to messy code (like in the past with SVN) 👎

Record issues

  • filing an issue to describe a problem or a feature on GitLab
  • fast and uncomplicated
  • ensures the problem is not forgotten
  • merge requests can later be added to the issue

Hands on git

Prerequisites

  • git installed on the command line (e.g. cluster console)
  • a local folder to keep source code repositories
    • e.g. source/
  • access to GitLab
  • your SSH key uploaded to GitLab
    • top right icon > settings > SSH keys
    • copy/paste the output of cat ~/.ssh/*.pub
  • Fork the swim/swim repository

Getting and creating a repository

Copy/download an existing one
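
A minimal sketch of cloning. The address of the SWIM repository is not given here, so the example builds a throwaway local repository (hypothetical name upstream) and clones that instead. With GitLab, the path would be replaced by the SSH URL shown on the project page.

```shell
# Sandbox: a throwaway repository stands in for the one on GitLab (assumption)
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Demo User"            # hypothetical identity for the demo
git config user.email "demo@example.org"
echo "demo" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# The actual step: clone copies the full history into a new directory.
# With GitLab, replace the path by the SSH URL from the project page.
cd "$work"
git clone -q "$work/upstream" swim_clone
cd swim_clone
git log --oneline      # the cloned history
git remote -v          # 'origin' points to where the clone came from
```

The clone is a complete repository in its own right; origin is only a stored reference to the source.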

Creating a new (local) repository
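
A minimal sketch, assuming a new, empty project directory (the name my_project and the temporary location are placeholders):

```shell
work=$(mktemp -d)            # throwaway location for the demo (assumption)
mkdir "$work/my_project"
cd "$work/my_project"
git init -q                  # turns the directory into a repository
ls -a                        # the hidden .git/ directory holds the whole history
git status                   # nothing committed yet
```

Any existing directory can be turned into a repository the same way; no server is involved.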

Inspection commands and help

  • help is available for every git command via -h (short) and --help (full help)
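
The most common inspection commands, sketched on a throwaway repository (the file notes.txt and the demo identity are hypothetical):

```shell
# Sandbox setup (assumption: any small repository will do)
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "second line" >> notes.txt     # an uncommitted change to inspect

git status               # which files are modified, staged or untracked
git diff                 # uncommitted changes, line by line
git log --oneline        # history, one line per commit
git commit -h || true    # short usage summary (exits with a non-zero status by design)
# git commit --help      # full manual page; needs the git documentation installed
git show HEAD            # message and changes of the last commit
```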

Recording changes

  • changes are only recorded in your own clone of the repository
  • undoing commits
  • file status
  • .gitignore file
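
A sketch of the points above on a throwaway repository. The file names (main.f95, main.o) and the ignore pattern are illustrative assumptions, not taken from the SWIM repository.

```shell
# Sandbox setup
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.org"

echo "*.o" > .gitignore            # ignore compiled object files
echo "program main" > main.f95     # hypothetical source file
touch main.o                       # matches the ignore pattern
git status --short                 # file status: main.o is not listed
git add .gitignore main.f95        # stage the files for the next commit
git commit -q -m "Add source and ignore rules"

echo "end program main" >> main.f95
git commit -q -a -m "Finish main program"   # -a stages all modified tracked files

# Undo the last commit but keep its changes in the working directory
git reset -q HEAD~1
git status --short                 # main.f95 shows up as modified again
```

git reset rewrites local history, so it is best used only on commits that have not been pushed yet.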

Branches
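
A minimal branching sketch: create a branch, commit on it, and merge it back. The branch name new_feature and the file model.txt are hypothetical; the name of the default branch (master or main) depends on the local git setup, so it is read from the repository.

```shell
# Sandbox setup
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "v1" > model.txt
git add model.txt
git commit -q -m "Initial version"
main_branch=$(git symbolic-ref --short HEAD)   # master or main

git checkout -q -b new_feature     # create the branch and switch to it
echo "experimental change" >> model.txt
git commit -q -a -m "Try new feature"
git branch                         # list branches, * marks the current one

git checkout -q "$main_branch"     # back to the default branch, model.txt is v1 again
git merge -q new_feature           # bring the feature in (a fast-forward here)
git branch -d new_feature          # delete the merged branch
```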

Hands on GitLab

Creating a new repository in GitLab

  • Create New Project from user home
    • call it <your name>s_test
  • Then link the local repository and push your project
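
Linking and pushing might look like the sketch below. A local bare repository (hypothetical name gitlab_project.git) stands in for the empty project created on GitLab; with GitLab, the path would be replaced by the SSH URL shown on the new project's page.

```shell
# Sandbox: a bare repository stands in for the empty GitLab project (assumption)
work=$(mktemp -d)
git init -q --bare "$work/gitlab_project.git"

# A local repository with one commit
git init -q "$work/local_project"
cd "$work/local_project"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"

git remote add origin "$work/gitlab_project.git"   # with GitLab: the project's SSH URL
git push -q -u origin HEAD      # publish the current branch, -u remembers the link
git branch -vv                  # shows which remote branch is tracked
```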

Linking with online repositories

  • A ‘remote’ is a reference to a repository URL/path
  • Every branch may be connected to any remote branch
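
A sketch of working with two remotes, assuming the usual fork setup: origin is your fork and trunk is the main repository. All paths are local stand-ins for GitLab URLs, and the remote name trunk is a convention, not a requirement.

```shell
# Sandbox: trunk repository, a fork of it, and a working clone of the fork
work=$(mktemp -d)
git init -q "$work/trunk_repo"
cd "$work/trunk_repo"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "v1" > model.txt
git add model.txt
git commit -q -m "Initial version"
git clone -q --bare "$work/trunk_repo" "$work/fork.git"
git clone -q "$work/fork.git" "$work/my_clone"     # 'origin' is set automatically

cd "$work/my_clone"
git remote add trunk "$work/trunk_repo"    # a second remote next to origin
git remote -v                              # list remotes with their URLs/paths
git fetch -q trunk                         # download trunk's branches
branch=$(git symbolic-ref --short HEAD)    # master or main
git branch --set-upstream-to="trunk/$branch"   # connect the local branch to trunk
git branch -vv                             # shows the tracked remote branch
```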

Merge requests

  1. Pull latest trunk version
  2. Create new branch
  3. Make and commit changes
  4. Push new branch to fork
  5. Submit merge request
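
The five steps can be sketched end to end as follows. Local repositories stand in for GitLab (trunk_repo for the main repository, fork.git for your fork); the branch name fix_typo and the file model.txt are hypothetical.

```shell
# Sandbox setup (assumption: local repositories stand in for GitLab)
work=$(mktemp -d)
git init -q "$work/trunk_repo"
cd "$work/trunk_repo"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "v1" > model.txt
git add model.txt
git commit -q -m "Initial version"
trunk_branch=$(git symbolic-ref --short HEAD)             # master or main
git clone -q --bare "$work/trunk_repo" "$work/fork.git"   # your fork
git clone -q "$work/fork.git" "$work/my_clone"            # your working copy
cd "$work/my_clone"
git config user.name "Demo User"
git config user.email "demo@example.org"
git remote add trunk "$work/trunk_repo"

# 1. Pull latest trunk version
git checkout -q "$trunk_branch"
git pull -q --ff-only trunk "$trunk_branch"

# 2. Create new branch
git checkout -q -b fix_typo

# 3. Make and commit changes
echo "v2" > model.txt
git commit -q -a -m "Update model to v2"

# 4. Push new branch to fork
git push -q -u origin fix_typo

# 5. Submit merge request: done in the GitLab web interface of the fork
```

Starting each change on a fresh branch keeps the default branch identical to trunk, which makes step 1 a simple fast-forward next time.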

1. Pull latest trunk version

-> for testing, fork your neighbour’s test project, clone it and add the ‘trunk’ remote

2. Create new branch

3. Make and commit changes

e.g.

  • don’t forget to implement and test changes in Blankenstein project
  • make as many commits as needed

4. Push new branch to fork

5. Submit merge request

Testing someone else’s merge request

Checkout merge request branch
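
One possible way to get a colleague's merge request branch into your own clone is to fetch it directly from their fork. In this sketch, colleague_fork is a local stand-in for the fork's GitLab URL, and new_feature and the local branch name colleague/new_feature are hypothetical.

```shell
# Sandbox: trunk, a colleague's fork with a proposed change, and your clone
work=$(mktemp -d)
git init -q "$work/trunk_repo"
cd "$work/trunk_repo"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "v1" > model.txt
git add model.txt
git commit -q -m "Initial version"
git clone -q "$work/trunk_repo" "$work/colleague_fork"
cd "$work/colleague_fork"
git config user.name "Colleague"
git config user.email "colleague@example.org"
git checkout -q -b new_feature
echo "v2" > model.txt
git commit -q -a -m "Proposed change"
git clone -q "$work/trunk_repo" "$work/my_clone"

# Fetch the merge request branch and check it out under a local name
cd "$work/my_clone"
git fetch -q "$work/colleague_fork" new_feature    # with GitLab: the fork's URL
git checkout -q -b colleague/new_feature FETCH_HEAD
git log --oneline -n 2                             # the proposed commit on top of trunk
```

The branch can now be compiled and tested like any other before the merge request is accepted.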

Compare differences to other branch or commit
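
A few ways to compare, sketched on a throwaway repository with a hypothetical branch new_feature:

```shell
# Sandbox setup
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "v1" > model.txt
git add model.txt
git commit -q -m "Initial version"
base_branch=$(git symbolic-ref --short HEAD)    # master or main
git checkout -q -b new_feature
echo "v2" > model.txt
git commit -q -a -m "Proposed change"

git diff "$base_branch" new_feature             # line-by-line differences between two branches
git diff --stat "$base_branch" new_feature      # summary: which files changed and by how much
git diff HEAD~1 -- model.txt                    # working directory vs. previous commit, one file only
git log --oneline "$base_branch..new_feature"   # commits on new_feature that the base lacks
```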

(GitLab will remind you what to do)

Conclusions and best practices

Conclusions

  • It will take some time and practice to become productive and efficient with git
  • We need to adopt some best practices for our group, e.g.:
    • always implement and test changes with the Blankenstein test project
    • document changes in docs/
    • write meaningful commit and merge request messages
  • We should agree on (or adopt existing) basic coding conventions

Coding conventions

An example

  1. Variable names should use lower case letters and (if necessary) digits separated by underscores, for example: ice_thickness. (from PISM)
  2. One statement per line. (universal)
  3. Functions should not have side effects (if at all possible). In particular, do not use and especially do not modify “global” objects… (from PISM)
  4. All ASCII input and output tables should be CSV and have a .csv extension. (personal recommendation)