
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/datasets/plot_rhomboid_proteases.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_datasets_plot_rhomboid_proteases.py>`
        to download the full example code or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_datasets_plot_rhomboid_proteases.py:


Rhomboid proteases
====================

Load the rhomboid protease multiple sequence alignment (MSA), filter it, and
compute sequence weights.

.. GENERATED FROM PYTHON SOURCE LINES 8-32




.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    The loaded MSA has 2767 sequences and 135 positions.
    After filtering, we have 135 remaining positions.
    After filtering, we have 2767 remaining sequences.
    computing weight of seq 1/2767  
    computing weight of seq 101/2767        
    computing weight of seq 201/2767        
    computing weight of seq 301/2767        
    computing weight of seq 401/2767        
    computing weight of seq 501/2767        
    computing weight of seq 601/2767        
    computing weight of seq 701/2767        
    computing weight of seq 801/2767        
    computing weight of seq 901/2767        
    computing weight of seq 1001/2767       
    computing weight of seq 1101/2767       
    computing weight of seq 1201/2767       
    computing weight of seq 1301/2767       
    computing weight of seq 1401/2767       
    computing weight of seq 1501/2767       
    computing weight of seq 1601/2767       
    computing weight of seq 1701/2767       
    computing weight of seq 1801/2767       
    computing weight of seq 1901/2767       
    computing weight of seq 2001/2767       
    computing weight of seq 2101/2767       
    computing weight of seq 2201/2767       
    computing weight of seq 2301/2767       
    computing weight of seq 2401/2767       
    computing weight of seq 2501/2767       
    computing weight of seq 2601/2767       
    computing weight of seq 2701/2767       
    Number of effective sequences 1792






|

.. code-block:: Python

    import numpy as np
    from cocoatree.datasets import load_rhomboid_proteases
    import cocoatree.msa as c_msa
    import cocoatree.statistics.position as c_pos


    dataset = load_rhomboid_proteases()

    # Extract the aligned sequences and their identifiers.
    loaded_seqs = dataset["alignment"]
    loaded_seqs_id = dataset["sequence_ids"]
    n_loaded_pos, n_loaded_seqs = len(loaded_seqs[0]), len(loaded_seqs)

    # Adjacent string literals avoid embedding the source indentation
    # in the printed message.
    print(f"The loaded MSA has {n_loaded_seqs} sequences and "
          f"{n_loaded_pos} positions.")

    # Filter the alignment using the given gap and sequence thresholds.
    sequences, sequences_id, positions = c_msa.filter_sequences(
        loaded_seqs, loaded_seqs_id, gap_threshold=0.4, seq_threshold=0.2)
    n_pos = len(positions)
    print(f"After filtering, we have {n_pos} remaining positions.")
    print(f"After filtering, we have {len(sequences)} remaining sequences.")

    # Compute per-sequence weights and the effective number of sequences.
    seq_weights, m_eff = c_pos.compute_seq_weights(sequences)
    print('Number of effective sequences %d' %
          np.round(m_eff))
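
The example above reports an effective number of sequences that is smaller than the number of sequences in the alignment. As a rough illustration of why that can happen, the sketch below uses a common similarity-based weighting scheme: each sequence is weighted by the inverse of the number of sequences that are at least ``identity_threshold`` identical to it, and the effective number of sequences is the sum of the weights. This is an assumption made for illustration only. It is not taken from ``cocoatree``, whose ``compute_seq_weights`` may use a different definition or threshold. The function name ``sketch_sequence_weights``, the ``identity_threshold`` parameter and its value, and the toy alignment are all hypothetical.

```python
def sketch_sequence_weights(sequences, identity_threshold=0.8):
    """Illustrative similarity-based weighting (not cocoatree's code).

    Each sequence gets the weight 1 / n_similar, where n_similar counts the
    sequences (itself included) whose fraction of identical positions is at
    least `identity_threshold`. The effective number of sequences is the
    sum of the weights.
    """
    n_pos = len(sequences[0])
    weights = []
    for seq_a in sequences:
        n_similar = 0
        for seq_b in sequences:
            n_identical = sum(a == b for a, b in zip(seq_a, seq_b))
            if n_identical / n_pos >= identity_threshold:
                n_similar += 1
        weights.append(1.0 / n_similar)
    return weights, sum(weights)


# Hypothetical toy alignment: three near-identical sequences and one outlier.
toy_msa = [
    "ACDEFGHIKL",
    "ACDEFGHIKL",
    "ACDEFGHIKV",
    "WWWWWWWWWW",
]
toy_weights, toy_m_eff = sketch_sequence_weights(toy_msa)
print(toy_weights)
print(toy_m_eff)
```

In this toy alignment the three near-identical sequences share a single unit of weight and the outlier keeps a full unit, so the effective number of sequences is about 2 rather than 4.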


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.245 seconds)


.. _sphx_glr_download_auto_examples_datasets_plot_rhomboid_proteases.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/tree-timc/cocoatree/gh-pages?urlpath=lab/tree/notebooks/auto_examples/datasets/plot_rhomboid_proteases.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_rhomboid_proteases.ipynb <plot_rhomboid_proteases.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_rhomboid_proteases.py <plot_rhomboid_proteases.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_rhomboid_proteases.zip <plot_rhomboid_proteases.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
