2. Feed Qumin

Qumin is a tool for studying inflection systems. It works from full paradigm data in phonemic transcription.

Download some data

Qumin supports only the Paralex standard. Many datasets are available on the dedicated Zenodo community. This tutorial series uses the ParaKar dataset for Livvi Karelian noun inflection.

Install the paralex package, then download parakar into a local folder:

pip install paralex
paralex get 13736170 -o parakar

For Qumin to work, the Paralex dataset must contain at least a forms table and a sounds table. Fortunately, parakar has both.
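Before running Qumin on another dataset, it can help to confirm that both tables are declared in the package descriptor. The following is a minimal sketch, not part of Qumin or paralex. It assumes the descriptor follows the usual Frictionless layout (a top-level "resources" list whose entries carry a "name") and that the tables are named "forms" and "sounds". The function names are hypothetical.

```python
import json
from pathlib import Path

# Table names Qumin needs, per the requirement stated above (assumed resource names).
REQUIRED_TABLES = ("forms", "sounds")


def missing_tables(descriptor, required=REQUIRED_TABLES):
    """Return the required table names that the descriptor does not declare."""
    present = {res.get("name") for res in descriptor.get("resources", [])}
    return [name for name in required if name not in present]


def check_package(path):
    """Load a package descriptor from disk and report its missing tables."""
    descriptor = json.loads(Path(path).read_text(encoding="utf-8"))
    return missing_tables(descriptor)


if __name__ == "__main__":
    # Path produced by the download command above; skipped if the file is absent.
    path = Path("parakar/parakar.package.json")
    if path.exists():
        missing = check_package(path)
        print("missing tables:", ", ".join(missing) if missing else "none")
```

This only checks that the tables are declared. It does not validate their contents against Qumin's stricter constraints on sound definitions.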

Note

The sounds file of a Paralex lexicon usually requires editing, as Qumin imposes more constraints on sound definitions than Paralex does. See Write the sounds file. For the forms table, see Write the paradigms file.

Run Qumin

All Qumin scripts require a dataset to be passed. Datasets are referenced by their metadata descriptor, which contains all information Qumin needs. A standard Qumin command starts like this:

qumin action=<some_action> data=<dataset.package.json>

The available actions are listed in qumin.config.config.Actions. They compute alternation patterns, calculate predictability measures, and describe inflection classes. Further options control Qumin’s behaviour and are described in the CLI reference.

For instance, the following will run the predictability measures on parakar:

qumin action=pred data=parakar/parakar.package.json
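When Qumin is driven from a script rather than typed by hand, the same command can be assembled and launched with Python's standard subprocess module. This is a sketch under the assumption that the qumin executable is on the PATH. The helper names qumin_command and run_qumin are hypothetical and not part of Qumin, and only the action=... and data=... arguments shown above are used.

```python
import shutil
import subprocess
from pathlib import Path


def qumin_command(action, data, *overrides):
    """Build the argument list for a Qumin run.

    `overrides` are extra key=value options passed through unchanged;
    this sketch does not assume any particular option exists.
    """
    return ["qumin", f"action={action}", f"data={data}", *overrides]


def run_qumin(action, data, *overrides):
    """Run Qumin; raises CalledProcessError if it exits with an error."""
    return subprocess.run(qumin_command(action, data, *overrides), check=True)


if __name__ == "__main__":
    data = "parakar/parakar.package.json"
    print(" ".join(qumin_command("pred", data)))
    # Only launch the run when both the executable and the dataset are available.
    if shutil.which("qumin") and Path(data).exists():
        run_qumin("pred", data)
```

Passing the arguments as a list avoids shell quoting problems when the dataset path contains spaces.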

Outputs

Computation results are provided as a Frictionless DataPackage. In addition to the output files, Qumin writes a metadata.json file in the output directory, containing:

A description of each file in the output directory.
Timestamps for the beginning and the end of the run.
The command-line arguments passed to the script.
The description of the Paralex package used for the computations.

When passing computation results to downstream scripts, you should pass the path to the metadata descriptor.
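To see what a run produced, the metadata descriptor can be read like any other JSON file. The sketch below assumes metadata.json follows the usual Frictionless layout, with a top-level "resources" list whose entries have "name" and "path" keys. The keys that hold the timestamps and command-line arguments are not assumed here, so the sketch only lists the described files. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_metadata(output_dir):
    """Load the metadata.json descriptor found in a Qumin output directory."""
    path = Path(output_dir) / "metadata.json"
    return json.loads(path.read_text(encoding="utf-8"))


def list_outputs(descriptor):
    """Return (name, path) pairs for each file described in the descriptor.

    Missing keys yield None rather than raising, since the exact
    descriptor contents are an assumption of this sketch.
    """
    return [
        (res.get("name"), res.get("path"))
        for res in descriptor.get("resources", [])
    ]


# Usage, with the output directory of an earlier run passed as a placeholder:
#     for name, path in list_outputs(load_metadata(output_dir)):
#         print(name, path)
```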