Metadata-Version: 2.4
Name: proleTRact
Version: 0.1.1
Summary: A user-friendly platform for interactive exploration, visualization, and analysis of tandem repeat findings from TandemTwister outputs
Author-email: Lion Ward Al Raei <lionward.alraei@gmail.com>
License: BSD 3-Clause Non-Commercial License
        
        Copyright (c) 2025, Lion Ward Al Raei; Max Planck Institute for Molecular Genetics
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted for NON-COMMERCIAL PURPOSES ONLY, provided that
        the following conditions are met:
        
        1. Redistributions of source code must retain the above copyright notice, this
           list of conditions and the following disclaimer.
        
        2. Redistributions in binary form must reproduce the above copyright notice,
           this list of conditions and the following disclaimer in the documentation
           and/or other materials provided with the distribution.
        
        3. Neither the name of the copyright holder nor the names of its
           contributors may be used to endorse or promote products derived from
           this software without specific prior written permission.
        
        4. COMMERCIAL USE IS PROHIBITED. This software may not be used for commercial
           purposes, including but not limited to:
           - Use in commercial products or services
           - Use by for-profit organizations for business operations
           - Distribution as part of commercial software
           - Any use that generates revenue or commercial benefit
           
           This software may be used for:
           - Academic research
           - Educational purposes
           - Personal/private use
           - Non-profit research and development
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
        AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
        SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
        OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
        OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
        
        For commercial licensing inquiries, please contact: lionward.alraei@gmail.com
        
Project-URL: Homepage, https://github.com/Lionward/ProleTRact
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: streamlit>=1.30
Requires-Dist: pandas>=2.0
Requires-Dist: numpy>=1.24
Requires-Dist: altair>=5.0
Requires-Dist: plotly>=5.0
Requires-Dist: pysam>=0.22
Requires-Dist: scikit-learn>=1.2
Dynamic: license-file

<br />
<div align="center">
  <img width="300" alt="ProleTRact logo" src="src/proletract/ProleTRact_logo.svg">
</div>
<br />
<p>This repository contains a <strong>Tandem Repeat Visualization Tool</strong>, the companion to <a href="https://github.com/Lionward/TandemTwister"><strong>TandemTwister</strong></a>. The tool processes Variant Call Format (VCF) files generated by TandemTwister and visualizes tandem repeats in an interactive format. Users can explore motifs, compare alleles to the reference sequence, and inspect the structure of tandem repeats, which helps when interpreting genomic variation.</p>

<h2>Why ProleTRact?</h2>
<p>Tandem repeats (TRs) are complex: alleles can differ by motif composition, length, and interrupted blocks. ProleTRact visualizes TR regions with color-coded motifs, highlights interruptions, and provides navigation across regions and samples, giving quick insight into potentially pathogenic expansions or atypical structures.</p>

<h2>Key Features</h2>
<ul>
  <li><strong>Individual and Cohort modes:</strong> Analyze a single VCF or an entire directory of VCFs.</li>
  <li><strong>Dynamic sequence visualization:</strong> Color-coded motifs, clear interruption highlighting, and side-by-side allele comparison.</li>
  <li><strong>Pathogenic TR reference overlay:</strong> Built-in <code>pathogenic_TRs.bed</code> provides context for known loci (disease, gene, thresholds).</li>
  <li><strong>Fast navigation:</strong> Move across TR records with Previous/Next controls or jump to a specific region.</li>
</ul>

<h2>Installation Options</h2>
<p>Pick the workflow that fits your environment:</p>

<h3>Option A &mdash; Install from PyPI (recommended)</h3>
<pre><code>pip install proleTRact
proleTRact  # launches the Streamlit app</code></pre>
<p>The launcher opens a browser locally. On headless machines set <code>STREAMLIT_SERVER_HEADLESS=true</code> before invoking <code>proleTRact</code>.</p>
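
<p>If you start the app from a Python script rather than a shell, the same setting can be passed through the child process environment. The following is a minimal sketch: <code>headless_env</code> and <code>launch_headless</code> are hypothetical helper names that are not part of ProleTRact, and the sketch assumes the <code>proleTRact</code> command is on your <code>PATH</code>.</p>

```python
import os
import subprocess


def headless_env(base=None):
    """Return a copy of the environment with Streamlit headless mode enabled.

    Hypothetical helper for illustration; it is not part of ProleTRact.
    """
    env = dict(os.environ if base is None else base)
    # Same variable the paragraph above sets in the shell.
    env["STREAMLIT_SERVER_HEADLESS"] = "true"
    return env


def launch_headless():
    """Start the app without opening a browser (assumes `proleTRact` is on PATH)."""
    return subprocess.run(["proleTRact"], env=headless_env(), check=True)
```

<p>Calling <code>launch_headless()</code> blocks until the app exits. Building the environment as a copy leaves the parent process settings untouched.</p>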

<h3>Option B &mdash; Clone and run locally (with conda)</h3>
<pre><code>git clone git@github.com:Lionward/ProleTRact.git
cd ProleTRact
conda create -n proletract python=3.9
conda activate proletract
pip install -r requirements.txt
pip install -e .
streamlit run src/proletract/app.py
</code></pre>

<h2>Quickstart</h2>
<ol>
  <li>Launch the app with one of the commands above.</li>
  <li>Open the browser tab. If you are running headless, Streamlit prints the URL to the terminal instead.</li>
  <li>Load an individual VCF or cohort folder from the sidebar and start exploring tandem repeats.</li>
</ol>

<h2>Usage</h2>
<h3>Individual mode 👤</h3>
<ol>
  <li>Select <strong>individual sample</strong> in the sidebar.</li>
  <li>Provide the absolute path to a bgzipped and tabix-indexed VCF (<code>.vcf.gz</code> with <code>.tbi</code>):
    <ul>
      <li>Enter the path in the sidebar input, then click <strong>Upload VCF File</strong>.</li>
      <li>The app will parse records and enable navigation across TR variants.</li>
    </ul>
  </li>
  <li>Use <strong>Previous</strong>/<strong>Next</strong> to step through records or jump to a region like <code>chr1:1000-2000</code>.</li>
  <li>Inspect motif blocks, interruptions, and per-allele differences.</li>
</ol>
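
<p>The region format used in the jump box (<code>chr1:1000-2000</code>) can be validated before you paste it into the app. The sketch below is one way to do so in plain Python. It is an illustration only: <code>Region</code> and <code>parse_region</code> are hypothetical names, and this is not necessarily how ProleTRact parses regions internally.</p>

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# chrom, a colon, then two non-negative integers separated by a hyphen.
_REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(text: str) -> Region:
    """Parse 'chrom:start-end' into a Region; raise ValueError on bad input."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"expected 'chrom:start-end', got {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return Region(match["chrom"], start, end)
```

<p>For example, <code>parse_region("chr1:1000-2000")</code> returns <code>Region(chrom='chr1', start=1000, end=2000)</code>, while a reversed or malformed region raises <code>ValueError</code>.</p>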

<h3>Cohort mode 👥👥</h3>
<ol>
  <li>Select <strong>Cohort</strong> in the sidebar and choose <em>Reads-based VCF</em> or <em>Assembly VCF</em> view.</li>
  <li>Provide the absolute path to a directory containing TandemTwister VCF files.</li>
  <li>Click <strong>Load Cohort</strong> to scan the directory and enable cohort navigation.</li>
  <li>Browse records and compare across samples.</li>
  <li>Use <strong>Previous</strong>/<strong>Next</strong> to step through records or jump to a region like <code>chr1:1000-2000</code>.</li>
  <li>Inspect motif blocks, interruptions, and per-allele differences.</li>
</ol>

<h2>Input Requirements</h2>
<ul>
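  <li><strong>VCF format:</strong> Standard VCF generated by TandemTwister. Individual mode expects the file to be bgzipped and tabix-indexed (<code>.vcf.gz</code> with <code>.tbi</code>).</li>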
  <li><strong>Cohort directory:</strong> A folder with multiple <code>.vcf.gz</code> files generated by TandemTwister is required for cohort mode.</li>
</ul>
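
<p>Before pointing the app at a folder, it can help to check what the folder actually contains. The sketch below lists the <code>.vcf.gz</code> files in a directory and reports which ones have a <code>.tbi</code> index next to them. <code>scan_cohort_dir</code> is a hypothetical helper, not part of ProleTRact. This README does not state whether cohort mode requires an index, so the sketch only reports the index status and does not enforce it.</p>

```python
from pathlib import Path


def scan_cohort_dir(directory):
    """Return (indexed, missing_index) lists of .vcf.gz paths in a directory.

    Hypothetical helper for illustration; it only inspects file names and
    does not open or validate the VCF contents.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"not a directory: {root}")
    indexed, missing_index = [], []
    for vcf in sorted(root.glob("*.vcf.gz")):
        # tabix writes its index next to the file as <name>.vcf.gz.tbi
        tbi = vcf.with_name(vcf.name + ".tbi")
        (indexed if tbi.is_file() else missing_index).append(vcf)
    return indexed, missing_index
```

<p>An empty result for both lists means the directory holds no <code>.vcf.gz</code> files at all, which is worth catching before loading a cohort.</p>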


<h2>Demo / Examples</h2>
<p>Example screenshots and short walkthrough GIFs will be added here. For now, you can open <code>example.svg</code> for a preview:</p>
<img src="src/proletract/assets/example.svg" alt="Tandem Repeat Visualization Example" style="max-width: 100%; height: auto; border: 1px solid #ccc; padding: 10px;">
<ul>
  <li><em>Planned:</em> Individual-mode walkthrough</li>
  <li><em>Planned:</em> Cohort-mode walkthrough</li>
</ul>


<h2>Contributing</h2>
<p>Contributions are welcome! Please <a href="https://github.com/Lionward/ProleTRact/issues">open an issue</a> to discuss changes.</p>

<h2>License</h2>
<p>This project is licensed under the BSD 3-Clause Non-Commercial License; see <code>LICENSE</code> for details. Commercial use is prohibited. The software may be used for academic research, educational purposes, personal/private use, and non-profit research and development. For commercial licensing inquiries, please contact the author.</p>

<h2>Citation</h2>
<p>If you use ProleTRact in your work, please cite this repository. A formal citation entry will be added once available.</p>
