
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/advanced/plot_sca_with_own_metric.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_advanced_plot_sca_with_own_metric.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_advanced_plot_sca_with_own_metric.py:


========================================
Running Cocoatree with your own metric
========================================

Cocoatree's SCA pipeline can be run with a coevolution metric of your own.
In this example, we show how to pass a callable that computes the
coevolution matrix to ``cocoatree.perform_sca``.
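
As a minimal sketch of what such a callable can look like, the snippet below
applies the same one-hot encoding and linear kernel used later in this
example to a tiny, made-up alignment. The name ``toy_metric`` and the toy
sequences are illustrative only, and the calling convention (sequences plus
the ``seq_weights`` and ``freq_regul`` keywords, returning a
positions-by-positions matrix) is inferred from this example rather than
taken from the API reference.

.. code-block:: Python

    import numpy as np
    from sklearn.metrics import pairwise
    from sklearn.preprocessing import OneHotEncoder


    def toy_metric(sequences, seq_weights=None, freq_regul=0.08):
        # seq_weights and freq_regul are accepted but ignored in this sketch;
        # the signature simply mirrors the one used in this example.
        seqs = np.array([list(s) for s in sequences])  # (n_seqs, n_pos)
        # After transposing, each row is a position and each sequence is a
        # categorical feature that gets one-hot encoded.
        X = OneHotEncoder().fit_transform(seqs.T)
        # pairwise_kernels defaults to the linear kernel, i.e. X @ X.T.
        return pairwise.pairwise_kernels(X)


    # Hypothetical alignment: 4 sequences, 3 positions.
    toy_alignment = ["ACD", "ACE", "GCD", "AAD"]
    matrix = toy_metric(toy_alignment)

    # Reference computation with plain NumPy: for every pair of positions,
    # count the sequences that carry the same residue at both positions.
    cols = np.array([list(s) for s in toy_alignment]).T  # (n_pos, n_seqs)
    same_residue_counts = (cols[:, None, :] == cols[None, :, :]).sum(axis=-1)

    print(matrix.shape)
    print(np.allclose(matrix, same_residue_counts))

Under these assumptions, entry ``(i, j)`` of the returned matrix counts the
sequences in which positions ``i`` and ``j`` hold the same residue, so the
matrix is symmetric with one row and one column per alignment position.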

.. GENERATED FROM PYTHON SOURCE LINES 10-17

.. code-block:: Python

    import numpy as np

    import cocoatree.datasets as c_data
    import cocoatree
    from sklearn.metrics import pairwise
    from sklearn.preprocessing import OneHotEncoder








.. GENERATED FROM PYTHON SOURCE LINES 18-24

Load the dataset
----------------

We start by loading the dataset. In this case, we can directly load the S1A
serine protease dataset provided in :mod:`cocoatree`. To work on your own
dataset, you can use the `cocoatree.io.load_msa` function.

.. GENERATED FROM PYTHON SOURCE LINES 24-40

.. code-block:: Python


    serine_dataset = c_data.load_S1A_serine_proteases()
    loaded_seqs = serine_dataset["alignment"]
    loaded_seqs_id = serine_dataset["sequence_ids"]
    n_loaded_pos, n_loaded_seqs = len(loaded_seqs[0]), len(loaded_seqs)


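    def compute_correlation(sequences, seq_weights=None, freq_regul=0.08):
        # seq_weights and freq_regul are accepted but not used by this metric.
        # Split each sequence into residues: shape (n_seqs, n_pos).
        seqs = np.array([list(s) for s in sequences])
        # One row per position; each sequence is one-hot encoded as a feature.
        X = OneHotEncoder().fit_transform(seqs.T)
        # Linear kernel (the default) between positions: shape (n_pos, n_pos).
        return pairwise.pairwise_kernels(X)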


    coevol_matrix, results = cocoatree.perform_sca(
        loaded_seqs_id, loaded_seqs, n_components=3,
        coevolution_metric=compute_correlation)




.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    computing weight of seq 1/1376  
    computing weight of seq 101/1376        
    computing weight of seq 201/1376        
    computing weight of seq 301/1376        
    computing weight of seq 401/1376        
    computing weight of seq 501/1376        
    computing weight of seq 601/1376        
    computing weight of seq 701/1376        
    computing weight of seq 801/1376        
    computing weight of seq 901/1376        
    computing weight of seq 1001/1376       
    computing weight of seq 1101/1376       
    computing weight of seq 1201/1376       
    computing weight of seq 1301/1376       




.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.247 seconds)


.. _sphx_glr_download_auto_examples_advanced_plot_sca_with_own_metric.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/tree-timc/cocoatree/gh-pages?urlpath=lab/tree/notebooks/auto_examples/advanced/plot_sca_with_own_metric.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_sca_with_own_metric.ipynb <plot_sca_with_own_metric.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_sca_with_own_metric.py <plot_sca_with_own_metric.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_sca_with_own_metric.zip <plot_sca_with_own_metric.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
