iLund4u
Home page and documentation: https://art-egorov.github.io/ilund4u/
The Atkinson Lab 4U | AE
-------- HOTSPOTS MODE --------
-------------------------------
    COMMAND-LINE PARAMETERS
-------------------------------
[MANDATORY ARGUMENTS]
-gff <folder>
    Path to a folder containing extended gff files.
    Each gff file should contain a corresponding nucleotide sequence.
    (designed to handle pharokka/prokka produced annotation files).
-------------------------------
[OPTIONAL ARGUMENTS | DATA PROCESSING]
-ufid, --use-filename-as-id
    Use filename (wo extension) as contig id instead
    of the contig id written in a gff file.
-mps, --min-proteome-size <int>
    Minimal number of proteins in a genome to be taken in analysis.
    [default: 15]
-gct, --genome-circularity-table <path>
    Path to a table containing information for each genome about whether
    it is circular or not.
    Format: tab-separated; column names: id, circular. Values 1 or 0.
    For genomes which id is not listed, the default value will be used.
    [default: non circular; use -ncg/--non-circular-genomes to change]
--pclt, --protein-clusters-table <path>
    Path to a table with predefined protein clusters (to replace
    mmseqs clustering). Format: mmseqs-like cluster table
    (two columns, first - cluster_id; second - protein_id; without header)
-psc, --proteome-sim-cutoff <float>
    Minimal fraction of shared homologous proteins between two
    proteomes to be connected in proteome network by edge.
    [default: 0.65]
--pcot, --proteome-communities-table <path>
    Path to a table with predefined proteome communities (to replace network-based
    Leiden community detection algorithm). Format: mmseqs-like cluster table
    (two columns, first - community_id; second - proteome_id; without header)
-spc, --single-proteome-community
    Set the same single community for all proteomes. Could be considered
    as pangenome mode when input consists of the same or highly similar species.
-mpcs, --min-proteome-community-size <int>
   Minimal size of proteome community which then will be taken in
   hotspot search analysis [default: 10].
-vpc, --variable-protein-cutoff <float>
    Upper limit of fraction of proteomes in which a protein homology
    group should be found within a proteome community to be called
    "variable". [default: 0.25]
-cpc, --conserved-protein-cutoff <float>
    Lower limit of fraction of proteomes in which a protein homology
    group should be found within a proteome community to be called
    "conserved". [default: 0.75]
-cg, --circular-genomes
    Consider each input genome as circular. That means that first
    and last proteins will be considered as neighbours.
    [default: non circular]
-ncg, --non-circular-genomes
    Consider each input genome as non-circular. That means that first
    and last proteins won't be considered as neighbours.
    [default: non circular]
-hpc, --hotspot-presence-cutoff
    Fraction of proteomes within a proteome community in which clustered
    variable islands should be found to be considered as a hotspot.
    [default: 0.3]
-rnf, --report-not-flanked
    Report in results hotspots that have flanked conserved genes only on one
    side (located on the end of non-circular sequences) [default: False]
-------------------------------
[OPTIONAL ARGUMENTS | OTHERS]
-o <dir name>
    Output dir name. This will be created if it does not exist.
	[default: ilund4u_{current_date}; e.g. ilund4u_2022_07_25-20_41]
-o-db, --output-database <dirname>
    Output dir name for the database.
    If not specified then no database will be saved.
-c <standard|<file.cfg>
    Path to a configuration file or name of a premade config file
    [default: standard]
-------------------------------
[MISCELLANEOUS ARGUMENTS]
--debug
    Provide a detailed stack trace for debugging purposes.
--parsing-debug
    Provide detailed stack trace for debugging purposes
    for failed reading of gff files.
-q, --quiet
    Don't show progress messages.
-------------------------------