***************************************************************************
*
* Title:        ChEMBL FTP Directory
*
* Release:      chembl_33
*
* Date:         31/05/2023
*
***************************************************************************

The data content in ChEMBL is licensed under a highly permissive Creative
Commons license, specifically the "Creative Commons Attribution-ShareAlike
3.0 Unported" license (see the LICENSE file). The required attribution
should contain the URL of the ChEMBL resource and the release version, e.g.:

ChEMBL data is from http://www.ebi.ac.uk/chembl - the version of ChEMBL is
chembl_33.

This attribution should be visible on the entry portal of any web resource
in which ChEMBL is integrated, or contained within the documentation of any
further distribution.


Contents of this directory:
---------------------------

README                               This file

LICENSE                              License under which the data is released:
                                     Creative Commons Attribution-ShareAlike
                                     3.0 Unported License

chembl_33_release_notes.txt          chembl_33 release details, which includes
                                     counts and schema change information

chembl_33_schema_documentation.html  Listing of all chembl_33 tables and
                                     columns in HTML format

chembl_33_schema_documentation.txt   Listing of all chembl_33 tables and
                                     columns in plain text format

chembl_33.fa.gz                      FASTA file of proteins in the chembl_33
                                     target_dictionary table

chembl_33_bio.fa.gz                  FASTA file of the sequences of proteins
                                     in the chembl_33 biotherapeutics table

chembl_33.sdf.gz                     SDF of chembl_33 compounds, includes
                                     chembl_id

chembl_33_chemreps.txt.gz            Tab-separated file containing different
                                     chemical representations (SMILES, InChI
                                     and InChIKey) of chembl_33 compounds,
                                     includes chembl_id

chembl_33_monomer_library.xml        chembl_33 HELM monomer library

chembl_33_schema.png                 Schema diagram of chembl_33 database

chembl_33_mysql.tar.gz               Files required to load chembl_33 data
                                     into a MySQL database

chembl_33_postgresql.tar.gz          Files required to load chembl_33 data
                                     into a PostgreSQL database

chembl_33_sqlite.tar.gz              Files required to load chembl_33 data
                                     into a SQLite database

chembl_uniprot_mapping.txt           Mapping between chembl_33 target
                                     chembl_ids and UniProt accessions

chembl_33.fps.gz                     2048-bit, radius-2 Morgan fingerprints of
                                     chembl_33 compounds in FPS format,
                                     includes chembl_id

chembl_33.h5                         2048-bit, radius-2 Morgan fingerprints
                                     file for FPSim2, includes molregno

checksums.txt                        SHA-256 checksums for the FTP files
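
Downloads can be checked against checksums.txt before use. The following is
a minimal Python sketch, not an official tool. It assumes that each line of
checksums.txt has the layout written by the sha256sum utility (a hex digest,
whitespace, then the file name); that layout is not stated in this README.
The function names parse_checksums, sha256_of and verify_directory are
hypothetical and chosen for illustration only.

```python
import hashlib
from pathlib import Path


def parse_checksums(text):
    """Map file name -> expected hex digest.

    ASSUMPTION: lines look like '<hex digest>  <file name>', as written
    by sha256sum. Lines that do not split into two fields are skipped.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        digest, name = parts
        # sha256sum marks binary mode with a leading '*' on the name.
        expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_directory(directory, checksum_file="checksums.txt"):
    """Return {file name: True/False} for listed files that are present."""
    directory = Path(directory)
    expected = parse_checksums((directory / checksum_file).read_text())
    results = {}
    for name, digest in expected.items():
        target = directory / name
        if target.is_file():  # files that were not downloaded are skipped
            results[name] = sha256_of(target) == digest
    return results
```

Calling verify_directory(".") in the download directory returns a dictionary
in which any file mapped to False did not match its listed checksum and
should be downloaded again.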

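The tab-separated chemical representations file can be read without
unpacking it first. The sketch below is illustrative only and assumes that
the first line of the file is a header row naming the columns; the exact
column names are not given in this README, so the code takes them from the
file itself. The function name iter_chemreps is hypothetical.

```python
import csv
import gzip


def iter_chemreps(path):
    """Yield one dict per compound from a gzipped tab-separated file.

    ASSUMPTION: the first line is a header row; its fields become the
    dictionary keys. No column names are hard-coded here.
    """
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps any quote characters in the data as literals.
        reader = csv.DictReader(handle, delimiter="\t",
                                quoting=csv.QUOTE_NONE)
        for row in reader:
            yield row
```

For example, next(iter_chemreps("chembl_33_chemreps.txt.gz")) returns the
first record, and its keys show the column names actually present in the
file.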