Metadata-Version: 2.4
Name: datasentinel
Version: 0.1.2
Summary: Data Sentinel is a powerful tool to monitor data pipelines and ensure data quality.
Author: Sumz SAS
License: Apache Software License (Apache 2.0)
Project-URL: Homepage, https://github.com/SumzCol/datasentinel
Project-URL: Bug Tracker, https://github.com/SumzCol/datasentinel/issues
Keywords: data quality,data engineering,monitoring,data validation,data pipelines,pipelines,audit logging
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: python-ulid~=2.7.0
Requires-Dist: pydantic~=2.10
Requires-Dist: toolz
Requires-Dist: lazy_loader
Provides-Extra: pyspark-base
Requires-Dist: pyspark<4.0,>=3.4.0; extra == "pyspark-base"
Provides-Extra: deltatable-base
Requires-Dist: delta-spark<4.0,>=3.1.0; extra == "deltatable-base"
Provides-Extra: pandas-base
Requires-Dist: pandas<3.0,>=1.3; extra == "pandas-base"
Provides-Extra: pandas-cualleecheck
Requires-Dist: cuallee[pandas]~=0.15.2; extra == "pandas-cualleecheck"
Provides-Extra: pyspark-cualleecheck
Requires-Dist: cuallee[pyspark]~=0.15.2; extra == "pyspark-cualleecheck"
Provides-Extra: cualleecheck
Requires-Dist: datasentinel[pandas-cualleecheck,pyspark-cualleecheck]; extra == "cualleecheck"
Provides-Extra: pandas-rowlevelresultcheck
Requires-Dist: datasentinel[pandas-base]; extra == "pandas-rowlevelresultcheck"
Provides-Extra: pyspark-rowlevelresultcheck
Requires-Dist: datasentinel[pyspark-base]; extra == "pyspark-rowlevelresultcheck"
Provides-Extra: rowlevelresultcheck
Requires-Dist: datasentinel[pandas-rowlevelresultcheck,pyspark-rowlevelresultcheck]; extra == "rowlevelresultcheck"
Provides-Extra: pandas-checks
Requires-Dist: datasentinel[pandas-cualleecheck,pandas-rowlevelresultcheck]; extra == "pandas-checks"
Provides-Extra: pyspark-checks
Requires-Dist: datasentinel[pyspark-cualleecheck,pyspark-rowlevelresultcheck]; extra == "pyspark-checks"
Provides-Extra: all-checks
Requires-Dist: datasentinel[pandas-checks,pyspark-checks]; extra == "all-checks"
Provides-Extra: spark-deltatableresultstore
Requires-Dist: datasentinel[deltatable-base,pyspark-base]; extra == "spark-deltatableresultstore"
Provides-Extra: all-resultstores
Requires-Dist: datasentinel[spark-deltatableresultstore]; extra == "all-resultstores"
Provides-Extra: database-databaseauditstore
Requires-Dist: SQLAlchemy<3.0,>=1.4; extra == "database-databaseauditstore"
Provides-Extra: spark-deltatableauditstore
Requires-Dist: datasentinel[deltatable-base,pyspark-base]; extra == "spark-deltatableauditstore"
Provides-Extra: all-auditstores
Requires-Dist: datasentinel[database-databaseauditstore,spark-deltatableauditstore]; extra == "all-auditstores"
Provides-Extra: slack-slacknotifier
Requires-Dist: slack-sdk~=3.34.0; extra == "slack-slacknotifier"
Provides-Extra: all-notifiers
Requires-Dist: datasentinel[slack-slacknotifier]; extra == "all-notifiers"
Provides-Extra: email-templateemailmessagerenderer
Requires-Dist: openpyxl==3.1.5; extra == "email-templateemailmessagerenderer"
Requires-Dist: Jinja2==3.1.5; extra == "email-templateemailmessagerenderer"
Provides-Extra: all-renderers
Requires-Dist: datasentinel[email-templateemailmessagerenderer]; extra == "all-renderers"
Provides-Extra: test
Requires-Dist: pytest<9.0,>=7.2; extra == "test"
Requires-Dist: pytest-cov<7,>=3; extra == "test"
Requires-Dist: pendulum>=2.1.2; extra == "test"
Requires-Dist: coverage[toml]; extra == "test"
Requires-Dist: datasentinel[all-auditstores,all-checks,all-notifiers,all-renderers,all-resultstores]; extra == "test"
Provides-Extra: scripts
Requires-Dist: click==8.1.0; extra == "scripts"
Provides-Extra: lint
Requires-Dist: ruff==0.11.12; extra == "lint"
Requires-Dist: pre-commit<5.0,>=2.9.2; extra == "lint"
Requires-Dist: pyright==1.1.403; extra == "lint"
Provides-Extra: all
Requires-Dist: datasentinel[lint,scripts,test]; extra == "all"
Dynamic: license-file

# Data Sentinel

[![Python version](https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12%20%7C%203.13-blue.svg)](https://pypi.org/project/datasentinel/)
[![PyPI version](https://badge.fury.io/py/datasentinel.svg)](https://pypi.org/project/datasentinel/)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/SumzCol/datasentinel/blob/main/LICENSE)

## What is Data Sentinel?

Data Sentinel is a framework for data quality validation and monitoring in production data pipelines. It provides tools to check data accuracy, completeness, consistency, and integrity, with native support for PySpark and Pandas DataFrames.

Data Sentinel is designed with software engineering best practices to help you create robust, maintainable, and scalable data quality monitoring solutions.

## How do I install Data Sentinel?

To install Data Sentinel from the Python Package Index (PyPI), run:

```bash
pip install datasentinel
```

For specific use cases, you can install with optional dependencies:

```bash
# For PySpark-focused data validation
# (quotes keep shells such as zsh from expanding the brackets)
pip install "datasentinel[pyspark-checks]"

# For Pandas-focused data validation
pip install "datasentinel[pandas-checks]"

# For a complete installation with all features, plus the test, lint, and scripts tooling
pip install "datasentinel[all]"
```
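
To confirm what ended up in your environment, the following minimal sketch reads the installed version and the extras the package declares, using only the standard library's `importlib.metadata`. The helper name `describe_install` is hypothetical and is not part of Data Sentinel.

```python
from importlib.metadata import PackageNotFoundError, metadata, version


def describe_install(package: str = "datasentinel"):
    """Return the installed version and declared extras, or None if not installed.

    Hypothetical helper for illustration; not part of the Data Sentinel API.
    """
    try:
        installed_version = version(package)
    except PackageNotFoundError:
        return None
    # "Provides-Extra" is a standard core-metadata field; get_all returns
    # None when the package declares no extras.
    extras = metadata(package).get_all("Provides-Extra") or []
    return {"version": installed_version, "extras": sorted(extras)}


if __name__ == "__main__":
    print(describe_install())
```

The extras listed are those the package declares, not necessarily those you installed; pip does not record which extras were requested.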

## What are the main features of Data Sentinel?

| Feature                       | What is this?                                                                                                                            |
|-------------------------------|------------------------------------------------------------------------------------------------------------------------------------------|
| **Data Quality Validations**  | Execute comprehensive checks to ensure data accuracy, completeness, consistency, and integrity using industry-standard validation rules. |
| **Multi-DataFrame Support**   | Native support for PySpark and Pandas dataframes with consistent APIs.                                                                   |
| **Audit Stores**              | Comprehensive audit trail logging to multiple destinations including databases and Delta tables.                                         |
| **Notifications**             | Configurable notification system that alerts stakeholders when data quality issues are detected.                                         |
| **Validation Results Stores** | Store data quality validation results in various formats and destinations for reporting, analysis, and historical tracking.              |
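
Because PySpark and Pandas are optional dependencies, it can be useful to check which DataFrame backends are importable before choosing a validation path. The sketch below does this with the standard library's `importlib.util.find_spec`. The function name `available_backends` and the backend mapping are assumptions for illustration, not part of Data Sentinel.

```python
from importlib.util import find_spec

# Top-level import names of the optional DataFrame backends (assumed mapping).
BACKEND_MODULES = {"pandas": "pandas", "pyspark": "pyspark"}


def available_backends(modules=BACKEND_MODULES):
    """Return the sorted names of backends whose module can be located.

    find_spec locates a top-level module without importing it, so this
    check stays cheap even for heavy packages such as pyspark.
    """
    return sorted(
        name for name, module in modules.items() if find_spec(module) is not None
    )


if __name__ == "__main__":
    print(available_backends())
```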


## Why does Data Sentinel exist?

Data quality is critical for successful data-driven organizations, but implementing comprehensive data quality monitoring can be complex and time-consuming. Data Sentinel addresses this by providing:

- **Standardized approach** to data quality validation across different technologies.
- **Extensible architecture** that adapts to your specific requirements.
- **Best practices** built in for audit logging, notifications, and result management.

## Can I contribute?

We welcome contributions to Data Sentinel! Whether you're fixing bugs, adding features, improving documentation, or sharing feedback, your contributions help make Data Sentinel better for everyone.

Check out our [contribution guidelines](CONTRIBUTING.md) to get started.

## Where can I learn more?

- **Documentation**: [Coming Soon] - Comprehensive guides and API reference
- **GitHub Repository**: [https://github.com/SumzCol/datasentinel](https://github.com/SumzCol/datasentinel)
- **Issue Tracker**: [https://github.com/SumzCol/datasentinel/issues](https://github.com/SumzCol/datasentinel/issues)

## License

Data Sentinel is licensed under the [Apache Software License (Apache 2.0)](LICENSE).
