Command line interface
The alexandria3k command can be invoked from the shell as follows.
alexandria3k: Relational interface to publication metadata
usage: alexandria3k [-h] [-a ATTACH_DATABASES [ATTACH_DATABASES ...]]
                    [-c COLUMNS [COLUMNS ...]] [-D DEBUG [DEBUG ...]]
                    [-d DATA_SOURCE [DATA_SOURCE ...]] [-E OUTPUT_ENCODING]
                    [-F FIELD_SEPARATOR] [-H] [-i [INDEX ...]]
                    [-L LIST_SCHEMA] [-l LINKED_RECORDS] [-o OUTPUT] [-P]
                    [-p POPULATE_DB_PATH] [-Q QUERY_FILE] [-q QUERY]
                    [-R ROW_SELECTION_FILE] [-r ROW_SELECTION] [-s SAMPLE]
                    [-x EXECUTE]
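As a rough illustration of how the options in this synopsis combine, the sketch below assembles a populate command as a Python argument list and prints it as a shell line without running it. The directory `crossref-data`, the database file `sample.db`, the column selection `works.*`, and the helper name `populate_command` are hypothetical placeholders, not values taken from this documentation.

```python
import shlex


def populate_command(crossref_dir, db_path, columns, sample=None):
    """Build an alexandria3k argument list that populates an SQLite database.

    Paths and column names are the caller's own; nothing here is checked
    against a real alexandria3k installation.
    """
    args = [
        "alexandria3k",
        "-d", "Crossref", crossref_dir,  # -d DATA_SOURCE [DATA_SOURCE ...]
        "-p", db_path,                   # -p POPULATE_DB_PATH
    ]
    if sample is not None:
        args += ["-s", sample]           # -s SAMPLE (a Python expression)
    # -c takes one or more values, so it is placed last to keep it unambiguous.
    args += ["-c", *columns]
    return args


command = populate_command(
    "crossref-data",                     # hypothetical Crossref container directory
    "sample.db",                         # hypothetical output database
    ["works.*"],                         # hypothetical column selection
    sample="random.random() < 0.0002",
)
# shlex.join quotes arguments that contain spaces or shell metacharacters.
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run` would avoid shell quoting altogether; printing it first is simply a convenient way to inspect the command before running it.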
Named Arguments
- -a, --attach-databases
Databases to attach for the row selection query
- -c, --columns
Columns to populate using table.column or table.*
- -D, --debug
Output debugging information as specified by the arguments.
files-read: Output counts of data files read; link: Record linking operations; log-sql: Output executed SQL statements; perf: Output performance timings; populated-counts: Dump counts of the populated database; populated-data: Dump the data of the populated database; populated-reports: Output query results from the populated database; progress: Report progress; stderr: Log to standard error; virtual-counts: Dump counts of the virtual database; virtual-data: Dump the data of the virtual database.
Default: []
- -d, --data-source
Specify the data set to be processed and its source.
The following data sets are supported: ASJC [<CSV-file> | <URL>] (defaults to internal table); Crossref <container-directory>; DOAJ [<CSV-file> | <URL>] (defaults to https://doaj.org/csv); funder-names [<CSV-file> | <URL>] (defaults to https://doi.crossref.org/funderNames?mode=list); journal-names [<CSV-file> | <URL>] (defaults to http://ftp.crossref.org/titlelist/titleFile.csv); ORCID <summaries.tar.gz-file>; ROR <zip-file>.
- -E, --output-encoding
Query output character encoding (use utf-8-sig for Excel)
Default: “utf-8”
- -F, --field-separator
Character to use for separating query output fields
Default: “,”
- -H, --header
Include a header in the query output
Default: False
- -i, --index
SQL expressions that select the populated rows
- -L, --list-schema
List the schema of the specified database. The following names are supported: Crossref, ORCID, ROR, other, all.
- -l, --linked-records
Only add ORCID records that link to existing <persons> or <works>
- -o, --output
Output file for query results
- -P, --partition
Run the query over partitioned data slices. (Warning: arguments are run per partition.)
Default: False
- -p, --populate-db-path
Populate the SQLite database in the specified path
- -Q, --query-file
File containing query to run on the virtual tables
- -q, --query
Query to run on the virtual tables
- -R, --row-selection-file
File containing SQL expression that selects the populated rows
- -r, --row-selection
SQL expression that selects the populated rows
- -s, --sample
Python expression to sample the Crossref tables (e.g. random.random() < 0.0002)
Default: “True”
- -x, --execute
Operation to execute on the data. This can be one of: link-aa-base-ror (link author affiliations to base-level research organizations); link-aa-top-ror (link author affiliations to top-level research organizations); link-works-asjcs (link works with Scopus All Science Journal Classification Codes — ASJCs).
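The `--sample` option above takes a Python expression such as `random.random() < 0.0002`. The sketch below shows only the statistical effect of such an expression: evaluated once per candidate, it keeps roughly that fraction of them, and the default `True` keeps everything. How alexandria3k evaluates the expression internally, and at what granularity, is not stated in this section, so the `eval`-based loop and the helper name `sampled` are assumptions made purely for illustration.

```python
import random


def sampled(items, expression, seed=None):
    """Keep each item for which the sampling expression evaluates truthy.

    Illustrative only: this mimics the effect of a sampling expression,
    not alexandria3k's own implementation.
    """
    if seed is not None:
        random.seed(seed)  # make the illustration repeatable
    code = compile(expression, "<sample>", "eval")
    # The expression sees only the `random` module (plus Python builtins).
    return [item for item in items if eval(code, {"random": random})]


everything = sampled(range(1000), "True")  # the documented default
subset = sampled(range(100_000), "random.random() < 0.01", seed=42)
print(len(everything), len(subset))
```

With a threshold of 0.01 the subset holds about one percent of the candidates; the exact count varies with the seed.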