The task you will most likely want to do is contribute annotations to the existing research software database at github.com/rseng/software. This tutorial walks you through how to do that using both the command line and a web interface (under development).

Clone the repository

Whether you use the interface or the command line, you need to start with a set of software to annotate! You might start your own software repository with rse init, but you likely want to annotate one that already exists. Let’s clone it now.

$ git clone https://github.com/rseng/software
$ cd software

The default database in the rse.ini is the filesystem, which means the research software encyclopedia knows how to annotate it. If you intend to contribute your annotations back to the repository (we hope that you do!) then it might be a good idea to check out a new branch:

$ git checkout -b annotation/user-vsoch

Environment

To annotate when you aren’t sitting in the root of the repository, you might want to export RSE_CONFIG_FILE so that it points to the rse.ini in the repository. For example:

$ export RSE_CONFIG_FILE=/path/to/rseng/software/rse.ini
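Before starting a session from another directory, it can help to confirm that the variable really points at a readable file. The sketch below is only an illustration: check_rse_config is a hypothetical helper written for this tutorial, not a command provided by rse.

```shell
# Hypothetical helper (not part of rse): report whether RSE_CONFIG_FILE
# is set and points at a file that can be read.
check_rse_config() {
    if [ -z "${RSE_CONFIG_FILE:-}" ]; then
        echo "RSE_CONFIG_FILE is not set"
        return 1
    fi
    if [ ! -r "$RSE_CONFIG_FILE" ]; then
        echo "cannot read $RSE_CONFIG_FILE"
        return 1
    fi
    echo "using $RSE_CONFIG_FILE"
}

# Example usage, after exporting the variable as shown above:
#   check_rse_config && rse annotate criteria
```

If the helper prints an error, fix the export before annotating; otherwise your annotations may not be written to the repository you expect.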

Command Line Annotation

To annotate from the command line, you can choose rse annotate and target either criteria or the taxonomy.

$ rse annotate taxonomy
$ rse annotate criteria

Since we use GitHub usernames to determine who has annotated what, if your GitHub username is not available via:

$ git config user.name

then you’ll need to provide that as an argument:

$ rse annotate taxonomy -u vsoch
$ rse annotate criteria -u vsoch
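If you script your sessions, you may want to settle on a username once and reuse it. The following is a minimal sketch under stated assumptions: resolve_annotator is a hypothetical helper, and its lookup order (explicit argument first, then git config user.name) is an illustration, not a description of how rse resolves the username internally.

```shell
# Hypothetical helper (not part of rse): print the username to annotate as.
# An explicit first argument wins; otherwise fall back to git config.
resolve_annotator() {
    if [ -n "${1:-}" ]; then
        echo "$1"
        return 0
    fi
    name="$(git config user.name 2>/dev/null)"
    if [ -n "$name" ]; then
        echo "$name"
        return 0
    fi
    echo "no username found; pass one as the first argument" >&2
    return 1
}

# Example usage:
#   rse annotate criteria -u "$(resolve_annotator vsoch)"
```

Note that git config user.name often holds a display name and not a GitHub username, which is one reason to pass the username explicitly.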

Annotate All Unseen Software Repositories

By default, an annotation session will walk through all software that has not yet been annotated under your GitHub username.

$ rse annotate taxonomy
$ rse annotate criteria

If you want to re-annotate repositories that you’ve seen, then specify that:

$ rse annotate taxonomy --all
$ rse annotate criteria --all

For each annotation, the repository is saved after you answer all questions for it. This means that if you press Control+C at any time, the repositories you’ve finished annotating will be saved.

Annotate Single Repository

It might be preferable for you to annotate a specific repository. In each of the examples below, you are given a URL to explore further if needed, along with a description of the repository.

criteria

$ rse annotate criteria -r github/singularityhub/sregistry
INFO:rse.main:Database: filesystem

https://github.com/singularityhub/sregistry [server for storage and management of singularity images]:
Would taking away the software be a detriment to research? [n]|y: y
Has the software been cited? [n]|y: y
Is the software intended for a particular domain? [n]|y: n
Was the software created with intention to solve a research question? [n]|y: n
Is the software intended for research? [n]|y: n
Has the software been used by researchers? [n]|y: y

After the annotation session, you can use git status to see which files have changed for the repositories and criteria you answered questions for:

$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	database/github/singularityhub/sregistry/criteria-RSE-absence.tsv
	database/github/singularityhub/sregistry/criteria-RSE-citation.tsv
	database/github/singularityhub/sregistry/criteria-RSE-domain-intention.tsv
	database/github/singularityhub/sregistry/criteria-RSE-question-intention.tsv
	database/github/singularityhub/sregistry/criteria-RSE-research-intention.tsv
	database/github/singularityhub/sregistry/criteria-RSE-usage.tsv
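All of the files listed by git status sit under the repository's folder in database/, so they can be staged together. The sketch below assumes that layout, as seen in the output above; stage_annotations is a hypothetical helper, and the commit message and branch name in the comments are only examples.

```shell
# Hypothetical helper (not part of rse): stage every annotation file for one
# repository. $1 is the identifier given to -r, for example
# github/singularityhub/sregistry.
stage_annotations() {
    if [ -z "${1:-}" ]; then
        echo "usage: stage_annotations <repository identifier>" >&2
        return 1
    fi
    git add "database/$1"
}

# Example usage, run from the root of the cloned repository:
#   stage_annotations github/singularityhub/sregistry
#   git commit -m "annotate singularityhub/sregistry"
#   git push origin annotation/user-vsoch
```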

In this example, we had not yet annotated the repository, so the files are new. We would then add the files, push to our branch, and open a pull request to the repository. Here is an asciinema that shows how annotation looks:

taxonomy

You might also want to annotate repositories for the taxonomy. This means that you will be shown a repository and a list of categories, and asked to place the software in one or more categories (up to you!).

$ rse annotate taxonomy -r github/singularityhub/sregistry

https://github.com/singularityhub/sregistry [server for storage and management of singularity images]:
How would you categorize this software? [enter one or more numbers]
[0] Domain-specific analysis software (SPM, fsl, afni for neuroscience)
[1] Application Programming Interfaces
[2] Communication tools or platforms (email, slack, etc.)
[3] Data collection (web-based experiments or portals)
[4] Databases
[5] Domain-specific hardware (software for physics to control lab equipment)
[6] Frameworks (to generate documentation, content management systems)
[7] Interactive development environments for research (Matlab, Jupyter)
[8] Numerical libraries (includes optimization, statistics, simulation, e.g., numpy)
[9] Operating systems
[10] Domain-specific optimized software (neuroscience software optimized for GPU)
[11] Personal scheduling and task management
[12] Provenance and metadata collection tools
[13] Text editors and integrated development environments
[14] Version control
[15] Visualization (interfaces to interact with, understand, and see data, plotting tools)
[16] Workflow managers
Please enter one or more numbers, separated by spaces
Please enter your choice [0:16] : 1 4
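The prompt expects one or more numbers in the displayed range, separated by spaces. If you prepare answers ahead of time, a check along these lines can catch typos. This is only a sketch: valid_choices is a hypothetical helper, and it does not reflect how rse itself validates input.

```shell
# Hypothetical helper (not part of rse): succeed only if every choice is a
# whole number between 0 and the given maximum.
# Usage: valid_choices <max> <choice> [<choice> ...]
valid_choices() {
    max="$1"
    shift
    # At least one choice is required.
    [ "$#" -gt 0 ] || return 1
    for n in "$@"; do
        case "$n" in
            # Reject empty strings and anything containing a non-digit.
            ''|*[!0-9]*) return 1 ;;
        esac
        [ "$n" -le "$max" ] || return 1
    done
    return 0
}

# Example usage for the 17 categories listed above (0 through 16):
#   valid_choices 16 1 4 && echo "selection looks valid"
```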

And after your session, you can see that the taxonomy file for the repository has been added or updated.

$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	database/github/singularityhub/sregistry/taxonomy.tsv

Here is a quick view of what this looks like interactively.