Command reference
Run the command by launching
freeports
from the command line. For contextual help, use the --help option, which can be shortened to -h:
freeports -h
Options can be specified in three different ways, and later ways overwrite earlier ones:
configuration file
environment variables
command line options
Options that are not specified default to the values defined in the conf_parse submodule. Those defaults are overwritten by the options specified in the configuration file, then by the environment variables, then by the command line options, and finally, when in BATCH MODE, by the job contextual options. The options that can be overwritten, and how, are documented in the respective reference pages:
Reference config pages:
After being overwritten, the options are validated as described in the section about validation.
Each method of overwriting also has a specific validation mechanism, documented in the respective page and applied before
the validation of the resulting configuration.
Each way of specifying options sets one of the program options described on this page.
The options documented here affect the behaviour of the freeports call.
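The precedence order described above can be sketched as a layered lookup. This is only an illustration under stated assumptions: the function name `resolve_options` and the sample values are hypothetical and are not taken from the freeports source code.

```python
from collections import ChainMap


def resolve_options(defaults, config_file, environment, command_line, job=None):
    """Merge option layers; in a ChainMap the first layer listed wins."""
    layers = [command_line, environment, config_file, defaults]
    if job is not None:
        # Job contextual options only exist in BATCH MODE and win over everything.
        layers.insert(0, job)
    return dict(ChainMap(*layers))


# Hypothetical values, chosen only to show which layer wins.
defaults = {"VERBOSITY": 2, "SAVE_PDF": False}
config_file = {"VERBOSITY": 3}
environment = {"SAVE_PDF": True}
command_line = {"VERBOSITY": 4}

resolved = resolve_options(defaults, config_file, environment, command_line)
```

Here VERBOSITY comes from the command line layer and SAVE_PDF from the environment layer, because those are the highest-priority layers that define them.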
The options
VERBOSITY
This value ranges from 0 to 5. 0 indicates the minimum verbosity, called CRITICAL VERBOSITY, 4 indicates the maximum verbosity, also called DEBUG VERBOSITY, and 5 is the NOTSET VERBOSITY.
The meaning of the other levels is the one used by the Python logging package:
| freeports | Python logging level |
|---|---|
| 0 | CRITICAL |
| 1 | ERROR |
| 2 | WARNING |
| 3 | INFO |
| 4 | DEBUG |
| 5 | NOTSET |
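A minimal sketch of how such a verbosity value could be translated into a Python logging level. The levels for 0, 4 and 5 follow the description above; the levels for 1 to 3 are an assumption based on the order of the standard logging levels, and the names `VERBOSITY_TO_LOGGING` and `logging_level` are hypothetical.

```python
import logging

# 0, 4 and 5 are stated in this page; 1 to 3 are assumed from the
# standard ordering of the logging levels.
VERBOSITY_TO_LOGGING = {
    0: logging.CRITICAL,
    1: logging.ERROR,
    2: logging.WARNING,
    3: logging.INFO,
    4: logging.DEBUG,
    5: logging.NOTSET,
}


def logging_level(verbosity: int) -> int:
    """Return the logging level for a freeports verbosity value."""
    if verbosity not in VERBOSITY_TO_LOGGING:
        raise ValueError(f"VERBOSITY must be between 0 and 5, got {verbosity!r}")
    return VERBOSITY_TO_LOGGING[verbosity]


# Example: configure a logger at maximum verbosity.
logger = logging.getLogger("verbosity_example")
logger.setLevel(logging_level(4))
```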
URL, PDF and SAVE_PDF
When not in BATCH MODE, one of URL or PDF has to be specified.
If URL is specified, the program uses the pdf file found at that url.
If PDF is specified, it loads a pdf file from the local filesystem. If both are specified,
it tries to load the file from local storage first, then falls back to the url.
If both are specified and SAVE_PDF is True, and the file is not present locally, the program downloads it
and saves it on disk with the name indicated by the PDF option.
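The decision logic for URL, PDF and SAVE_PDF can be sketched as follows. This is an illustration, not the actual freeports implementation: the function name `choose_pdf_source` and the returned dictionary are hypothetical, and raising an error when only PDF is given and the file is missing is an assumption.

```python
from pathlib import Path
from typing import Optional


def choose_pdf_source(url: Optional[str], pdf: Optional[str], save_pdf: bool = False) -> dict:
    """Decide where the pdf is loaded from and whether the download is saved."""
    if url is None and pdf is None:
        raise ValueError("one of URL or PDF has to be specified")
    if pdf is not None and Path(pdf).is_file():
        # A local file is preferred whenever it is present.
        return {"load": "local", "save_to": None}
    if url is None:
        # Assumption: with PDF alone, a missing file is an error.
        raise FileNotFoundError(pdf)
    # Fall back to the url; keep a local copy only if PDF and SAVE_PDF are both set.
    save_to = pdf if (pdf is not None and save_pdf) else None
    return {"load": "url", "save_to": save_to}
```

The function only returns a plan, so the download and the file writing stay separate from the decision and the logic is easy to check in isolation.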
OUT_CSV
When not in BATCH MODE, it indicates the file where the resulting csv, parsed from the pdf document, is written.
Note
The OUT_CSV default on Windows systems is CON
FORMAT
It indicates which algorithm to use to parse the pdf. These algorithms are called the ‘formats’ of the pdf reports.
This option is mandatory if no URL is provided. If a URL is provided, the program tries to infer the format using
a mapping file that maps url regular expressions to formats. The file is called format_url_mapping.yaml in the source code.
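The inference of a format from a url can be sketched with a list of regular expressions. The structure of format_url_mapping.yaml is not described here, so the in-memory mapping below, the url patterns, the format names and the function name `infer_format` are all hypothetical. Letting an explicitly given FORMAT take precedence over inference is also an assumption.

```python
import re
from typing import Optional

# Hypothetical stand-in for the content of the mapping file:
# (url regular expression, format name) pairs, checked in order.
URL_FORMAT_MAPPING = [
    (r"^https://example\.org/reports/.*\.pdf$", "example-format-a"),
    (r"^https://example\.com/funds/", "example-format-b"),
]


def infer_format(url: Optional[str], explicit_format: Optional[str] = None) -> str:
    """Return the explicit format, or infer one from the url."""
    if explicit_format is not None:
        return explicit_format
    if url is None:
        raise ValueError("FORMAT is mandatory when no URL is provided")
    for pattern, fmt in URL_FORMAT_MAPPING:
        if re.search(pattern, url):
            return fmt
    raise ValueError(f"no format could be inferred for {url!r}")
```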
CONFIG_FILE
This option indicates the config file loaded to overwrite the default options. It can only be specified through an environment variable or a command line argument, and it is evaluated before any other option.
Validation of resulting configuration
Each way of specifying options has its own algorithm to validate the user's choice, but after those checks a consistency check is performed on the resulting configuration. Notably, the most important checks are:
In BATCH MODE, OUT_CSV is the name of an archive or of a directory
After the job contextual options are applied, at least one of PDF or URL is defined
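The two checks above can be sketched as a function that returns a list of error messages. The function name `validate_resolved` and the messages are hypothetical. How freeports decides that a name denotes a directory is not documented here, so the sketch assumes that any name is acceptable unless it points to an existing regular file that is not a .tar.gz archive.

```python
from pathlib import Path


def validate_resolved(config: dict, batch_mode: bool) -> list:
    """Return consistency errors for a configuration after all overwrites."""
    errors = []
    if batch_mode:
        out_csv = config.get("OUT_CSV")
        is_archive = isinstance(out_csv, str) and out_csv.endswith(".tar.gz")
        # Assumption: a non-archive name is invalid only if it is an existing file.
        if not out_csv or (not is_archive and Path(out_csv).is_file()):
            errors.append("OUT_CSV must name a directory or a .tar.gz archive in BATCH MODE")
    if not config.get("PDF") and not config.get("URL"):
        errors.append("at least one of PDF or URL must be defined")
    return errors
```

In BATCH MODE such a function would be called once per job, on the configuration obtained after that job's contextual options have been applied.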
BATCH MODE
This mode makes it possible to process several files at once, in parallel. It is characterized by the BATCH
variable set to a batch csv file, OUT_CSV set to a directory name or a .tar.gz archive, and
optionally BATCH_WORKERS set to a number (if not set, it defaults to the number of available CPUs).
The batch csv file is a csv file whose headers indicate which options of the
resulting configuration to overwrite. These options are called job contextual options, and each row of the csv file is called a job.
The options that can be overwritten are:
| Header | Overwritten option |
|---|---|
|  |  |
|  |  |
|  |  |
|  |  |
| prefix out | See below |
Headers are case insensitive, so for example url, URL and Url are considered the same header.
Boolean cells are cast to True if the csv value is one of (case insensitive)
true, on, yes, y, t, 1, and to False if it is one of false, off, no, n, f, 0.
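The header and boolean matching rules can be sketched as below. The function names `normalize_header` and `parse_bool` are hypothetical. Stripping surrounding whitespace and rejecting unknown values with an error are assumptions.

```python
TRUE_VALUES = {"true", "on", "yes", "y", "t", "1"}
FALSE_VALUES = {"false", "off", "no", "n", "f", "0"}


def normalize_header(header: str) -> str:
    """Make header comparison case insensitive."""
    return header.strip().lower()


def parse_bool(value: str) -> bool:
    """Cast a csv cell to a bool using the accepted spellings."""
    normalized = value.strip().lower()
    if normalized in TRUE_VALUES:
        return True
    if normalized in FALSE_VALUES:
        return False
    raise ValueError(f"cannot interpret {value!r} as a boolean")
```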
OUT_CSV and prefix out
When in BATCH MODE, OUT_CSV has to be a directory or a .tar.gz archive.
The prefix out cell adds a PREFIX_OUT_CSV option to the resulting configuration.
If it doesn’t exist, the program creates a directory named after OUT_CSV when it is not an archive,
or after the archive name without the .tar.gz extension, and for each job saves an output file
named {PREFIX_OUT_CSV}-{FORMAT}.csv. Then, if OUT_CSV was specified as an archive, the directory
is compressed into a .tar.gz. If the directory didn’t exist beforehand and an archive is created, the directory
is deleted from the filesystem after the archive is created.
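The output layout described above can be sketched with the standard library. The helper names `batch_output_dir`, `job_output_path` and `finalize_archive` are hypothetical, and the sketch does not claim to match the actual freeports code.

```python
import shutil
import tarfile
from pathlib import Path

ARCHIVE_SUFFIX = ".tar.gz"


def batch_output_dir(out_csv: str):
    """Return the working directory for OUT_CSV and whether it is an archive."""
    is_archive = out_csv.endswith(ARCHIVE_SUFFIX)
    name = out_csv[: -len(ARCHIVE_SUFFIX)] if is_archive else out_csv
    return Path(name), is_archive


def job_output_path(directory: Path, prefix_out_csv: str, fmt: str) -> Path:
    """Build the per-job file name {PREFIX_OUT_CSV}-{FORMAT}.csv."""
    return directory / f"{prefix_out_csv}-{fmt}.csv"


def finalize_archive(out_csv: str, directory: Path, existed_before: bool) -> None:
    """Compress the directory and remove it if the program created it."""
    with tarfile.open(out_csv, "w:gz") as archive:
        archive.add(str(directory), arcname=directory.name)
    if not existed_before:
        shutil.rmtree(directory)
```

A caller would record whether the directory exists before creating it, write one file per job, and call `finalize_archive` only when OUT_CSV ends with .tar.gz.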