Metadata-Version: 2.4
Name: cfin-umls-tools
Version: 0.3.3b1
Summary: Python tools for interacting with the UMLS
Author-email: Chris Finan <c.finan@ucl.ac.uk>
License-Expression: GPL-3.0-or-later
Project-URL: Homepage, https://cfinan.gitlab.io/umls-tools
Project-URL: Repository, https://gitlab.com/cfinan/umls-tools
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: <3.14,>=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: beautifulsoup4
Requires-Dist: biopython
Requires-Dist: cfin-merge-sort
Requires-Dist: lxml
Requires-Dist: nltk
Requires-Dist: numpy>=1.24
Requires-Dist: pandas
Requires-Dist: cfin-pyaddons
Requires-Dist: py2neo
Requires-Dist: pymysql
Requires-Dist: requests
Requires-Dist: sqlalchemy>=2
Requires-Dist: sqlalchemy-config
Requires-Dist: sqlalchemy-utils
Requires-Dist: stdopen
Requires-Dist: texttable
Requires-Dist: tqdm
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-dependency; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: bump2version; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# UMLS tools

__version__: `0.3.3b1`

The umls-tools package is a toolkit to build the UMLS data into a relational database. It also provides an SQLAlchemy object relational mapper and an API for using Metamap. In addition, there are scripts to extract the relationships and build them into a Neo4j graph database.

There is [online](https://cfinan.gitlab.io/umls-tools/index.html) documentation for umls-tools.

Please note that the code in this package is intended for research use only and not meant for any clinical use.

## Installation instructions
At present, umls-tools is undergoing development and no packages exist yet on PyPI. Therefore, it is recommended that you install in either of the two ways listed below.

### Installation using conda
I maintain a conda package in my personal conda channel. To install from this please run:

```
conda install -c cfin -c bioconda -c conda-forge umls-tools
```

There are currently builds for Python v3.8, v3.9 and v3.10 for Linux-64 and Mac-osx. Please keep in mind that all development is carried out on Linux-64 and Python v3.8/v3.9. I do not own a Mac, so I can't test on one; the conda build does run some import tests, but that is all.

### Installation using pip
You can install using pip from the root of the cloned repository. First, clone the repository and `cd` into its root:

```
git clone git@gitlab.com:cfinan/umls-tools.git
cd umls-tools
```

Install the dependencies:
```
python -m pip install --upgrade -r requirements.txt
```

Then install using pip:
```
python -m pip install .
```

Or, for an editable (developer) install, run the command below from the root of the repository. The difference is that you can just do a `git pull` to update, or switch branches, without re-installing:
```
python -m pip install -e .
```

### Conda dependencies
There are also conda YAML environment files in `./resources/conda/envs` that list the same prerequisites as `requirements.txt`, but as conda packages. I use these to install all the requirements via conda and then install the package as an editable pip install.

However, if you find these useful then please use them. There are Conda environments for Python v3.8, v3.9 and v3.10.

## Next steps...
You might want to [set up](https://cfinan.gitlab.io/umls-tools/setup_config.html) a database connection config file if you are using any RDBMS other than SQLite.

You will also want to [install](https://cfinan.gitlab.io/umls-tools/installing_umls_db.html) a copy of the UMLS database.

Although the `umls_tools.parse` module is deprecated, it does require the [GeniaTagger](https://www.nactem.ac.uk/GENIA/tagger/) to be installed. The path to the binary should be set in an environment variable called `GENIATAGGER` in your `~/.bashrc`. If you do not plan to use the `umls_tools.parse` module then this is optional.
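
As a quick sanity check of that setup, the sketch below reads `GENIATAGGER` and confirms that it points at an executable file. The helper name `find_geniatagger` is hypothetical and is not part of umls-tools; it only illustrates the environment variable convention described above.

```python
import os
from pathlib import Path


def find_geniatagger(environ=None):
    """Return the GeniaTagger binary path from GENIATAGGER, or None.

    Hypothetical helper for illustration; it is not part of umls-tools.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("GENIATAGGER")
    if not value:
        return None
    path = Path(value).expanduser()
    # The variable should point at the binary itself, not at its directory.
    if path.is_file() and os.access(path, os.X_OK):
        return path
    return None


if __name__ == "__main__":
    binary = find_geniatagger()
    if binary is None:
        print("GENIATAGGER is unset or does not point at an executable file")
    else:
        print(f"GeniaTagger found at {binary}")
```

If the check reports a problem, confirm that the `export` line in `~/.bashrc` has been picked up by the shell you are running Python from.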

If you plan to use [Metamap](https://www.nlm.nih.gov/research/umls/implementation_resources/metamap.html), you will also need to [install](https://lhncbc.nlm.nih.gov/ii/tools/MetaMap/documentation/Installation.html) it locally. You will need to log in to the NLM for that.

There is also an experimental [Neo4j build](https://cfinan.gitlab.io/umls-tools/installing_umls_db.html) script you can try but read below first.

In addition to the Python command-line scripts that are available when the package is installed, there are also some bash administrative scripts located in `./resources/bin`. Please note that these are not installed by either the clone-and-pip install or the conda install. If using conda, you will have to clone the repo. With either install method, you will need to add the `./resources/bin` directory to your PATH.

These scripts will require two bash libraries to be in your PATH.

1. `shflags` - [This](https://github.com/kward/shflags) is to manage bash command line arguments.
2. `bash-helpers` - [This](https://gitlab.com/cfinan/bash-helpers) wraps some handy bash functions.

For more information on what is available see the [bash script](https://cfinan.gitlab.io/umls-tools/scripts.html#bash-scripts) documentation.
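
To check the PATH requirement described above, the sketch below looks for a library file in each PATH directory. It assumes the scripts load the libraries with bash's `source`, which searches PATH and does not need the file to be executable. `find_on_path` is a hypothetical helper, not part of umls-tools, and `shflags` is assumed to be the library's file name; the bash-helpers file names are not listed here, so add the ones you need.

```python
import os
from pathlib import Path


def find_on_path(name, path=None):
    """Return the first file called ``name`` in a PATH-style string, or None.

    Hypothetical helper for illustration. Unlike shutil.which, it does not
    require the file to be executable, since a sourced library only needs
    to be readable.
    """
    path = os.environ.get("PATH", "") if path is None else path
    for directory in path.split(os.pathsep):
        if not directory:
            # Empty PATH entries are skipped in this sketch.
            continue
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # "shflags" is an assumed file name; extend the tuple with the
    # bash-helpers files that your scripts need.
    for library in ("shflags",):
        location = find_on_path(library)
        print(f"{library}: {location if location else 'not found on PATH'}")
```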


## Change log

### version `0.3.0a0`
* API - Add a generalisable index module (`umls_tools.admin.index`) for creating index tables in the UMLS and other databases. This also offers some basic index search options.
* API - Added an `umls_tools.orm_mixin` module, to generalise index table creation.
* API - Updated the ORM to add index tables to the UMLS schema.
* API - Deprecated the `umls_tools.parsers` module.
* SCRIPTS - Added a UMLS database index script to create index tables from the `MRCONSO.STR` fields.

### version `0.3.1a0`
* API - Updated to use SQLAlchemy 2 - This will cause some warnings when the ORM module is loaded. I am currently investigating this.
