Getting Started

Requirements

To develop or run an analysis, you need the following:

  1. Data in the form of a .csv file

  2. Metadata in the form of a .yml file

  3. A directory to store your scripts (can be generated)

  4. A directory to store the outputs of the scripts (will be generated if it does not exist)
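
Before starting, the first three requirements can be checked with a few lines of Python. This is only a sketch: the helper name check_requirements and every path passed to it are hypothetical and not part of hifis-surveyval. The output directory is deliberately not checked, since it is generated when missing.

```python
from pathlib import Path


def check_requirements(data_file, metadata_file, scripts_dir):
    """Return a list of problems; an empty list means everything is in place.

    Hypothetical helper for illustration, not part of hifis-surveyval.
    """
    problems = []
    data_file = Path(data_file)
    metadata_file = Path(metadata_file)
    scripts_dir = Path(scripts_dir)

    # Requirement 1: the survey data as a .csv file.
    if data_file.suffix != ".csv" or not data_file.is_file():
        problems.append(f"no .csv data file at {data_file}")

    # Requirement 2: the metadata as a .yml file.
    if metadata_file.suffix != ".yml" or not metadata_file.is_file():
        problems.append(f"no .yml metadata file at {metadata_file}")

    # Requirement 3: the scripts directory (it can be generated later).
    if not scripts_dir.is_dir():
        problems.append(f"no scripts directory at {scripts_dir}")

    return problems
```

Calling check_requirements("data.csv", "meta.yml", "scripts") from the project folder returns an empty list when all three items are present; the file names here are placeholders.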

Setting up the Environment

Because the scripts use hifis-surveyval as a dependency, it is wise to set up a dedicated environment for your analysis. We show an example workflow using Poetry, but other solutions such as Pipenv also work.

First, we need to install Poetry on our system.

pip install poetry

Next, create a project folder or clone a Git repository in which everything will be stored. In the terminal, change to this directory.

Create a Project for the Analysis

If you are working in an existing Git repository that already contains a pyproject.toml, you can skip this step.

Run the following command, follow Poetry's prompts, and skip the interactive dependency selection.

poetry init

Install Dependencies

The most important dependency is hifis-surveyval.

poetry add hifis-surveyval

Other dependencies of your analysis scripts can be added later with the same command.

Initializing hifis-surveyval

If you are working in an existing Git repository that already contains a hifis-surveyval.yml and a directory for your scripts, you can skip this step.

We can create a config and an example script with the following command.

poetry run hifis-surveyval init

If you only need the config or only an example script, see:

poetry run hifis-surveyval init --help

Adding or Changing Analysis Scripts

Now you can freely add or edit .py files in the scripts folder specified in the config file. Please do not use subpackages there.
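
To catch scripts that ended up in a subfolder by accident, a small check like the following can help. It is a sketch only: find_nested_scripts is a hypothetical helper, and it reflects nothing more than the rule stated above, not how hifis-surveyval itself discovers scripts.

```python
from pathlib import Path


def find_nested_scripts(scripts_dir):
    """Return all .py files that sit below the top level of scripts_dir.

    Hypothetical helper for illustration, not part of hifis-surveyval.
    """
    root = Path(scripts_dir)
    # rglob walks the whole tree; keep only files whose parent is not the root.
    return sorted(path for path in root.rglob("*.py") if path.parent != root)
```

An empty result means every script lives directly in the scripts folder.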

You get access to the data and the tools via the arguments in the run function of the script. An example script is shown below.

from hifis_surveyval.data_container import DataContainer
from hifis_surveyval.hifis_surveyval import HIFISSurveyval


def run(hifis_surveyval: HIFISSurveyval, data: DataContainer):
    """Execute example script."""
    # Print the ID of every question collection found in the survey data.
    for question in data.question_collection_ids:
        print(question)
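
The logic of a run function can be tried out without real survey data by passing in a stand-in object. The following is a minimal sketch under stated assumptions: fake_data is hypothetical, it only mimics the question_collection_ids attribute used in the example, and the IDs are made up. It is not a substitute for a real DataContainer.

```python
from types import SimpleNamespace


def run(hifis_surveyval, data):
    """Print every question collection ID, as in the example script."""
    for question in data.question_collection_ids:
        print(question)


# Hypothetical stand-in for a DataContainer; the IDs are invented for illustration.
fake_data = SimpleNamespace(question_collection_ids=["Q001", "Q002"])

# This script never touches its first argument, so None suffices for a dry run.
run(None, fake_data)
```

Scripts that use more of the data or the tools need the real objects, which are supplied when the analysis is run through hifis-surveyval.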