=========================================================
Pikobs: CMC Observational Data Processing Toolkit
=========================================================

.. image:: https://img.shields.io/badge/python-3.8+-blue.svg
   :target: https://www.python.org/downloads/
   :alt: Python Version
.. image:: https://img.shields.io/badge/docs-Sphinx-green.svg
   :target: https://www.sphinx-doc.org/
   :alt: Built with Sphinx

Welcome to the **Pikobs** documentation. Pikobs is a Python toolkit for extracting, processing, and visualizing **CMC (Canadian Meteorological Centre)** observational data.

Its modules cover data analysis, statistical monitoring, and quality control for both conventional and satellite meteorological observations.

----

Overview
--------

Pikobs simplifies observational data management by acting as a bridge between raw assimilation cycle files and data-driven decision-making. The core workflow allows researchers and meteorologists to:

* **Extract:** Pull data directly from the CMC assimilation cycle outputs (monitoring/banco) used in the ``psmon`` system. This includes standard `RDB files <https://wiki.cmc.ec.gc.ca/wiki/Tables_Relationnelles_pour_les_observations>`_ from directories such as ``evalalt``, ``bgckalt``, and ``postalt``.
* **Filter:** Isolate specific observations using precise bitmask criteria defined by :doc:`assimilation flags <flags_criteria>` and specific :doc:`geographical regions <region>`.
* **Compute:** Generate statistical metrics from the extracted data, including O-B (observation minus background, ``omp``), O-A (observation minus analysis, ``oma``), observation density, and bias corrections.
* **Visualize:** Render high-quality results tailored to each metric. You can explore the specific visualizations generated by each module on their dedicated pages, such as :doc:`Scatter Plots <scatter>`, :doc:`Geographical Maps <mapobs>`, and :doc:`Histograms <histogram>`.
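
The filter and compute steps above can be pictured with a small, self-contained Python sketch. It is not Pikobs code: the record layout, the ``ASSIMILATED_BIT`` value, and the helpers ``select_by_mask`` and ``summarize`` are hypothetical, chosen only to illustrate bitmask filtering followed by a mean O-B and O-A computation.

.. code-block:: python

   from statistics import mean

   # Hypothetical flag bit marking an assimilated observation; the real
   # criteria are documented on the flags_criteria page.
   ASSIMILATED_BIT = 1 << 12

   # Hypothetical records: (flag, omp, oma), where omp = O-B and oma = O-A.
   observations = [
       (ASSIMILATED_BIT, 0.5, 0.25),
       (0, 3.0, 2.0),                      # bit not set, filtered out
       (ASSIMILATED_BIT | 1, -1.5, -0.75),
   ]


   def select_by_mask(records, mask):
       """Keep records whose flag has every bit of `mask` set."""
       return [rec for rec in records if (rec[0] & mask) == mask]


   def summarize(records):
       """Return the observation count, mean O-B, and mean O-A."""
       if not records:
           return {"nobs": 0, "omp": None, "oma": None}
       return {
           "nobs": len(records),
           "omp": mean(rec[1] for rec in records),
           "oma": mean(rec[2] for rec in records),
       }


   stats = summarize(select_by_mask(observations, ASSIMILATED_BIT))
   print(stats)

A record is kept only when every bit of the mask is set in its flag, which is a common way to test bitmask criteria. The criteria Pikobs actually applies are described in :doc:`flags_criteria`.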

----

Getting Started
---------------

Installation
~~~~~~~~~~~~

Install Pikobs inside a Python virtual environment to keep its dependencies isolated from other projects:

.. code-block:: bash

   pip install pikobs

.. note::
   Ensure all underlying system dependencies (such as specific libraries required by Cartopy or Matplotlib) are installed as outlined in your internal setup documentation.

Quick Start & HPC Usage
~~~~~~~~~~~~~~~~~~~~~~~

.. important::
   **Interactive Session Setup**

   For resource-intensive computations (especially global datasets), start an interactive session on the compute nodes before running Pikobs so that the login nodes are not overloaded:

.. code-block:: bash

   qsub -I -X -l select=4:ncpus=80:mpiprocs=80:ompthreads=1:mem=185gb -l place=scatter -l walltime=6:0:0

Once inside the session, modules can be executed directly from the command line. Below is an example of generating a geographical scatter plot:

.. code-block:: bash

   python -c 'import pikobs; pikobs.scatter.arg_call()' \
       --path_experience_files /home/dlo001/sites8/Data_pikobs/monitoring0/ \
       --experience_name GIC5DA1E22_DDT2 \
       --pathwork your_path \
       --datestart 2026020100 \
       --dateend 2026020200 \
       --region Monde \
       --family to_amsua_allsky \
       --flags_criteria assimilee \
       --fonction omp oma nobs obs dens bcorr \
       --boxsizex 2 \
       --boxsizey 2 \
       --projection cyl \
       --id_stn all \
       --channel all \
       --n_cpu 80

**Command Breakdown:**

* **Filters** data to include only :doc:`assimilated observations <flags_criteria>` from the :doc:`to_amsua_allsky <families>` family over the ``Monde`` :doc:`region <region>`.
* **Computes** the requested metrics (``omp``, ``oma``, ``nobs``, ``obs``, ``dens``, ``bcorr``) using **80 CPU cores** (``--n_cpu 80``) for parallel processing.
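
One way to picture what ``--boxsizex 2 --boxsizey 2`` implies is binning observations into 2 by 2 latitude/longitude boxes and averaging a metric per box. The sketch below illustrates that general technique only. It is not the Pikobs implementation, the reading of the box sizes as degrees is an assumption, and ``box_index`` and ``grid_mean`` are hypothetical helpers.

.. code-block:: python

   import math
   from collections import defaultdict


   def box_index(lat, lon, boxsizex=2.0, boxsizey=2.0):
       """Return the (row, column) of the box containing a point.

       Assumes boxsizex spans longitude and boxsizey spans latitude,
       both in degrees, with longitude given in [-180, 180).
       """
       col = math.floor((lon + 180.0) / boxsizex)
       row = math.floor((lat + 90.0) / boxsizey)
       return row, col


   def grid_mean(points, boxsizex=2.0, boxsizey=2.0):
       """Map each box to (mean value, count) for (lat, lon, value) points."""
       sums = defaultdict(float)
       counts = defaultdict(int)
       for lat, lon, value in points:
           key = box_index(lat, lon, boxsizex, boxsizey)
           sums[key] += value
           counts[key] += 1
       return {key: (sums[key] / counts[key], counts[key]) for key in sums}


   # Hypothetical (lat, lon, omp) samples.
   samples = [
       (45.5, -73.6, 0.5),
       (45.9, -73.1, 1.5),   # falls in the same box as the first sample
       (10.0, 20.0, -1.0),
   ]
   gridded = grid_mean(samples)
   for key, (avg, nobs) in sorted(gridded.items()):
       print(key, avg, nobs)

Larger boxes smooth the field and raise the observation count per box, while smaller boxes keep more spatial detail at the cost of noisier averages.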

----

Core Modules Overview
---------------------

Each Pikobs module targets one facet of data analysis and takes its own set of parameters. See the individual pages for detailed usage and examples:

* :doc:`scatter`: Generate 2D geographical scatter plots to visualize the spatial distribution of metrics (e.g., O-B, O-A).
* :doc:`timeserie`: Analyze the temporal evolution of meteorological statistics over continuous periods.
* :doc:`cardio`: Monitor system health and data assimilation flow at a high level.
* :doc:`vdedr`: Process and visualize vertical profiles and specific observational metrics.
* :doc:`zone`: Perform statistical aggregations and evaluations across specific regional zones.
* :doc:`flag`: Examine Quality Control (QC) in depth by analyzing rejection and assimilation flags.
* :doc:`mapobs`: Create detailed geographical maps displaying precise observation locations and densities.
* :doc:`histogram`: Calculate and plot statistical distributions and data frequencies.
* :doc:`profile`: Evaluate the vertical distribution of meteorological metrics (such as bias and standard deviation) across atmospheric pressure levels or satellite channels.
* :doc:`obscountdb`: Extract, compare, and visualize the volume of assimilated observations between two assimilation cycle runs.

Documentation Index
-------------------

.. toctree::
   :maxdepth: 2
   :caption: 🛠️ Core Modules

   scatter
   timeserie
   cardio
   vdedr
   zone
   flag
   mapobs
   histogram
   profile
   obscountdb

.. toctree::
   :maxdepth: 2
   :caption: ⚙️ Configuration

   region
   projection
   flags_criteria
   families
   varno

.. toctree::
   :maxdepth: 2
   :caption: 🧪 Development

   unittest

----

Resources & Contact
-------------------

For more information, feature requests, or bug reports:

* **Repository:** `Pikobs on GitLab <https://gitlab.science.gc.ca/dlo001/Pikobs>`_
* **Data Reference:** `RDB Files Documentation (Wiki) <https://wiki.cmc.ec.gc.ca/wiki/Tables_Relationnelles_pour_les_observations>`_
