lanctools
Tools for working with phased local ancestry data stored in the .lanc file format, as defined by Admix-kit [Hou et al., 2024].
lanctools is designed to provide fast local ancestry queries and convenient conversion from external formats (e.g., FLARE [Browning et al., 2023] and RFMix [Maples et al., 2013]). It focuses on efficient access to .lanc data and is not intended to replace the full functionality of Admix-kit.
Features
- Efficient random access to phased local ancestry data
- Local ancestry-masked genotype queries
- Conversion from FLARE and RFMix output to .lanc format
- Python API and command-line interface (CLI)
Installation
pip install lanctools
Quickstart
Querying Local Ancestry Data
lanctools is primarily intended for fast queries of local ancestry and genotype data. Examples are provided below.
import numpy as np
from lanctools import LancData
ld = LancData(
    plink_prefix="chr1",
    lanc_file="chr1.lanc",
    ancestries=["YRI", "CEU"],
)
idx_var = np.arange(100, dtype=np.uint32)
# Get phased local ancestry: shape (N, 100, 2)
lanc = ld.get_lanc(idx_var)
# Get phased genotypes: shape (N, 100, 2)
geno = ld.get_geno(idx_var)
# Get ancestry-masked genotypes: shape (N, 100, len(ancestries))
lanc_geno = ld.get_lanc_geno(idx_var)
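To make the shape of the ancestry-masked output concrete, here is a minimal NumPy sketch of one way such a mask can be computed from phased genotype and local ancestry arrays. It is an illustration under stated assumptions, not the lanctools implementation: the helper `mask_genotypes_by_ancestry` is hypothetical, ancestries are assumed to be coded as integers `0..K-1`, and genotypes are assumed to be 0/1 allele indicators with no missing values.

```python
import numpy as np


def mask_genotypes_by_ancestry(geno, lanc, n_anc):
    """Count alleles per ancestry (hypothetical helper, not part of lanctools).

    geno, lanc: integer arrays of shape (N, M, 2), one entry per haplotype.
    Returns an array of shape (N, M, n_anc) where entry [i, j, k] is the
    number of alleles carried by sample i at variant j on haplotypes
    assigned to ancestry k.
    """
    geno = np.asarray(geno)
    lanc = np.asarray(lanc)
    if geno.shape != lanc.shape:
        raise ValueError("geno and lanc must have the same shape")
    # One-hot encode ancestry along a new trailing axis: (N, M, 2, n_anc)
    onehot = lanc[..., None] == np.arange(n_anc)
    # Keep each allele only under its haplotype's ancestry, then sum haplotypes
    return (geno[..., None] * onehot).sum(axis=2)


# Toy data: 1 sample, 3 variants, 2 ancestries (assumed integer coding)
geno = np.array([[[1, 0], [1, 1], [0, 1]]])
lanc = np.array([[[0, 1], [0, 1], [1, 1]]])
masked = mask_genotypes_by_ancestry(geno, lanc, n_anc=2)
print(masked.shape)  # (1, 3, 2)
```

Under these assumptions, summing the result over the last axis recovers the usual unphased allele count per sample and variant.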
Converting FLARE or RFMix Files to .lanc
lanctools also provides C++ code for converting RFMix2 .msp.tsv or FLARE .vcf.gz files into the .lanc file format. This can be called through the Python function convert_to_lanc.
from lanctools import convert_to_lanc
convert_to_lanc(
    file="chr1.anc.vcf.gz",
    file_fmt="FLARE",
    plink_prefix="chr1",
    output="chr1.lanc",
)
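When data are split by chromosome, the same call can be repeated in a loop. The sketch below only reuses the keyword arguments shown above; the helper names `flare_conversion_jobs` and `convert_all` are hypothetical, and the file naming pattern (`chrN.anc.vcf.gz`, PLINK prefix `chrN`) is an assumption extrapolated from the chr1 example.

```python
def flare_conversion_jobs(chromosomes):
    """Build one set of convert_to_lanc keyword arguments per chromosome.

    Hypothetical helper: assumes FLARE output is named chr<N>.anc.vcf.gz
    and the matching PLINK prefix is chr<N>.
    """
    jobs = []
    for chrom in chromosomes:
        prefix = f"chr{chrom}"
        jobs.append(
            {
                "file": f"{prefix}.anc.vcf.gz",
                "file_fmt": "FLARE",
                "plink_prefix": prefix,
                "output": f"{prefix}.lanc",
            }
        )
    return jobs


def convert_all(chromosomes):
    """Run the conversion for each chromosome (requires lanctools)."""
    # Imported here so the job list can be built and inspected without lanctools
    from lanctools import convert_to_lanc

    for kwargs in flare_conversion_jobs(chromosomes):
        convert_to_lanc(**kwargs)


# Inspect the planned conversions for chromosomes 1 to 3 without running them
jobs = flare_conversion_jobs(range(1, 4))
for job in jobs:
    print(job["file"], "->", job["output"])
```

Calling `convert_all(range(1, 4))` would then perform the conversions, provided the input files exist under the assumed names.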
Command-Line Interface
The same file format conversion is also available as a command-line utility:
lanctools convert-flare --file chr1.anc.vcf.gz --plink_prefix chr1 --output chr1.lanc
The CLI also provides a command for combining multiple .lanc files (e.g., across chromosomes) into a single .lanc file.
lanctools merge --file chr1.lanc,chr2.lanc,chr3.lanc --outfile chr1_3.lanc
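For pipelines driven from Python, the merge command above can be assembled programmatically. This is a small sketch that uses only the `--file` and `--outfile` flags shown in the example; the helpers `merge_command` and `merge_lanc` are hypothetical, and passing the files in chromosome order simply mirrors the example rather than a documented requirement.

```python
import subprocess


def merge_command(lanc_files, outfile):
    """Build the argument list for `lanctools merge` (hypothetical helper)."""
    if not lanc_files:
        raise ValueError("at least one .lanc file is required")
    # The CLI example passes input files as a single comma-separated value
    return ["lanctools", "merge", "--file", ",".join(lanc_files), "--outfile", outfile]


def merge_lanc(lanc_files, outfile):
    """Run the merge; raises CalledProcessError if the command fails."""
    subprocess.run(merge_command(lanc_files, outfile), check=True)


# Reproduce the command from the example without executing it
cmd = merge_command([f"chr{c}.lanc" for c in range(1, 4)], "chr1_3.lanc")
print(" ".join(cmd))
```

Building the argument list separately from running it makes the command easy to log or inspect before any files are touched.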
References
- Hou, K. et al. Admix-kit: an integrated toolkit and pipeline for genetic analyses of admixed populations. Bioinformatics 40, btae148 (2024).
- Browning, S. R., Waples, R. K. & Browning, B. L. Fast, accurate local ancestry inference with FLARE. Am J Hum Genet 110, 326–335 (2023).
- Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am J Hum Genet 93, 278–288 (2013).