Software and Model Guidelines

Preamble

Models and software are the basis and the heart of PIK’s scientific work and reputation. PIK’s scientific mandate is to understand the physical limits of the Earth system and to explore management options for sustainable and equitable development. This broad portfolio, encompassing the Earth system, climate impacts, socio-economic concepts, energy systems as well as general complex systems research, necessitates a broad basis of high-quality software and model development.

PIK is committed to upholding high standards of scientific practice. The Software, Data and Modelling Council (SDMC) coordinates the development and implementation of the Software and Modelling Guidelines that codify this commitment for its remit. The central aims of these guidelines are to help scientists achieve transparency, reproducibility and correctness of their scientific results.

To ensure transparency and foster collaboration, PIK is committed to Open Science and Open Source. PIK recommends that software developed at the institute be released as open source in accordance with the PIK Open Source and Open Data Policy.

PIK's commitment to the highest standards of scientific practice demands reproducibility of PIK's research results. In a software context this means that code that contributed to scientific insights must be not just available, but also understandable and runnable in a defined setting, possibly long after publication. Finally, best practices in code testing and review help ensure the correctness of scientific results.

Software Guidelines

GL 1 Documentation

All software and models must be documented adequately and at an appropriate level of detail. Documentation needs to ensure that the code can be understood, run and modified by other researchers (including the author at some future point). It is an integral part of transparency and reproducibility. Developers should keep the documentation synchronized with the code as it evolves. PIK recommends treating software documentation as part of the software design process. Different levels of documentation are appropriate for different software and models; the living document contains a further discussion of what may be appropriate in which context.
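
As an illustration only, the sketch below shows one common way to keep documentation next to the code it describes: a Python docstring that states purpose, parameters, units and return value. The function name degree_days, its parameters and the default threshold are hypothetical and chosen purely for this example; they are not taken from any PIK model.

```python
def degree_days(daily_mean_temps_c, base_temp_c=5.0):
    """Accumulate degree days above a base temperature.

    Hypothetical example function, used only to illustrate in-code
    documentation.

    Parameters
    ----------
    daily_mean_temps_c : iterable of float
        Daily mean air temperatures in degrees Celsius.
    base_temp_c : float, optional
        Threshold in degrees Celsius below which a day contributes
        nothing. The default of 5.0 is an arbitrary illustrative value.

    Returns
    -------
    float
        Sum over all days of max(0, temperature - base_temp_c).
    """
    return sum(max(0.0, t - base_temp_c) for t in daily_mean_temps_c)
```

Because the docstring lives in the same file as the implementation, a change to the function and the matching change to its documentation can be made and reviewed together.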

GL 2 Version Control

PIK requires the use of version control systems for software and model development. Version control systems allow other scientists to follow, maintain and contribute to the software development, and ensure that prior versions, as well as the reasoning underlying changes, are not lost. PIK recommends using Git, as it has become the de facto standard for open source collaboration, and provides an in-house GitLab instance.

GL 3 Software citation

For permanently citable versions (usually accompanying a journal publication), PIK recommends archiving the software under a digital object identifier (DOI, <http://www.doi.org/>). PIK also strongly encourages citing the software underlying scientific research, including foundational packages.

GL 4 Software Testing

Testing supports software quality through additional code that checks whether the scientific code performs as expected. A wide variety of testing practices and methods exist in professional software development. PIK requires contributors of scientific code to use appropriate tests where possible.
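
A minimal sketch of what such a test can look like is given below. It assumes Python and plain assert-based test functions of the kind that test runners such as pytest collect automatically; the function celsius_to_kelvin and both tests are hypothetical examples, not part of any PIK code base.

```python
import math


def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin."""
    if temp_c < -273.15:
        # Temperatures below absolute zero are unphysical.
        raise ValueError("temperature below absolute zero")
    return temp_c + 273.15


def test_known_value():
    # Check the code against a value known independently of the code.
    assert math.isclose(celsius_to_kelvin(0.0), 273.15)


def test_rejects_unphysical_input():
    # Check that invalid input fails loudly instead of propagating.
    try:
        celsius_to_kelvin(-300.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for unphysical input")


if __name__ == "__main__":
    test_known_value()
    test_rejects_unphysical_input()
    print("all tests passed")
```

The two tests cover different concerns: one compares against a known reference value, the other confirms that unphysical input is rejected. Which tests are appropriate for a given piece of scientific code remains a judgment for its developers.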

GL 5 Code review

Code reviews, e.g. carefully demonstrating your code to a colleague, are an important part of software quality assurance. New code developments must be reviewed before contributing to a publication. PIK encourages software developers to adopt workflows incorporating reviews, for example based on GitLab projects and merge requests, that require and document review before code is incorporated into the repository.

GL 6 Long Term Storage

In accordance with the rules for “Safeguarding Good Scientific Practice” of the German Research Foundation (DFG) and the “Rules for Ensuring Good Scientific Practice at PIK and Procedures for Dealing with Scientific Misconduct”, all relevant software, models and data must be securely stored for ten years to enable reproducibility of important model output. Code and data published under a DOI fulfil this requirement. In all other cases, the PIK archiving service should be used.

GL 7 Crediting developers and maintainers

Research department heads and working group leaders must ensure that code development and maintenance are properly credited, e.g. via (co)authorship or opportunities for publications in technical journals.

GL 8 Checklist

The guidelines given here are designed to help with maintaining high scientific standards. To assist scientists in ensuring that the guidelines are implemented, the SDMC will develop checklists.

Model Guidelines

Beyond these general guidelines, the specific challenges and requirements in the design and development of major complex models are subject to the following guidelines.

GL M1 Selection of models to be developed / used at PIK and maintenance requirements

Although the selection of models is primarily driven by scientific questions, the limited resources for model development and maintenance at PIK make this selection process an important part of model quality control. Decisions on modelling strategy are made at the level of the respective Research Departments. The Research Department Heads ensure that development and/or usage of a model will contribute to the research strategy of the institute. They also ensure long-term model maintenance so that other research groups from PIK or outside the institute can build on the results of a modelling project for their own research.

GL M2 Clearly defined responsibilities for each model

For each model under development / in use at PIK, a responsible scientist is nominated who ensures that good modelling practice and these guidelines are applied to the model implementation. This responsible person must be a senior scientist, ideally with a permanent position.

GL M3 Application of validation and uncertainty & sensitivity analyses for model implementation

A new or modified implementation of a model or its components has to be verified and validated. This goes beyond the need to test implementations covered in the guideline on testing. Validation of the model implementation should take into account all relevant sources, ranging from the abstract model to empirical data and to alternative model implementations and their output. Model inter-comparison studies/projects are encouraged, as well as uncertainty and sensitivity analyses of the model implementation.
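
To make the notion of a sensitivity analysis concrete, the following is a minimal sketch of a one-at-a-time analysis in Python. It is one simple technique among many and is not prescribed by these guidelines. The toy model, the parameter names a, b and c, the baseline values and the 10 % step are all assumptions made for illustration.

```python
def toy_model(params):
    """Hypothetical stand-in for a model: output = a * b**2 + c."""
    return params["a"] * params["b"] ** 2 + params["c"]


def oat_sensitivity(model, baseline, rel_step=0.1):
    """One-at-a-time sensitivity via central differences.

    Each parameter is perturbed by +/- rel_step (relative) while the
    others stay at their baseline values. The result is the relative
    change in output per relative change in the parameter. Assumes
    non-zero baseline parameter values and non-zero baseline output.
    """
    base_output = model(baseline)
    sensitivities = {}
    for name, value in baseline.items():
        up = dict(baseline, **{name: value * (1 + rel_step)})
        down = dict(baseline, **{name: value * (1 - rel_step)})
        sensitivities[name] = (model(up) - model(down)) / (
            2 * rel_step * base_output
        )
    return sensitivities


if __name__ == "__main__":
    baseline = {"a": 2.0, "b": 3.0, "c": 1.0}
    for name, s in oat_sensitivity(toy_model, baseline).items():
        print(f"{name}: {s:.3f}")
```

In this toy case the output responds most strongly to b, which enters quadratically. A one-at-a-time scan does not capture interactions between parameters, so for models with strong interactions a global method is usually more informative.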