Metadata-Version: 2.4
Name: antigravity-lite
Version: 0.1.6
Summary: Community Version of the B2B Antigravity PySpark Framework. Essential utilities for AWS FinOps and Cloud cost optimization.
Author-email: Arquitecto B2B <arquitecto@antigravity.dev>
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: boto3>=1.26.0

<div align="center">
  <h1>🚀 Antigravity Lite (FinOps & AWS Glue Tools)</h1>
  <p><b>AWS Financial Auditor and Smart S3 Manager for PySpark Ecosystems</b></p>
</div>

---

## 🛑 The Silent AWS Glue Killer: Spark's Catalyst Optimizer
Have you ever wondered why your massive PySpark cluster hangs for hours, consuming 100% CPU without writing a single byte of data, when processing a <b>wide DataFrame</b>?

Many data engineers blame data skew or bad partitioning and upscale their AWS Glue workers to expensive `G.4X` or `G.8X` tiers. But the bottleneck is often on the driver: on a very wide DataFrame, the query plan can grow so large that Catalyst spends that time analyzing and optimizing it before a single task runs. **The architectural solution is not buying more RAM; it's isolating the math.**

## 📦 Installation

```bash
pip install antigravity-lite
```

## 🛠 Included Open-Source Tools

### 1. Smart S3 Renamer (`S3Finalizer` - Universal API)
Tired of PySpark polluting your data lake with `part-00000...` file names and empty `_SUCCESS` marker files?
`S3Finalizer` is a Boto3-based utility that scans raw outputs under a prefix and renames them sequentially, following a naming pattern you supply, without breaking cluster concurrency. It works with output from Apache Spark, AWS Glue DynamicFrames, and any other files stored in S3.

```python
from antigravity_lite.io.s3_finalizer import S3Finalizer

finalizer = S3Finalizer(bucket_name="my-corporate-datalake")

# Rename every matching object under the prefix to a sequential, patterned name
finalizer.sequence_files(
    s3_prefix="raw_zone/sales/",
    pattern="ENTERPRISE_REPORT_{seq:04d}.parquet",
    starts_with="",        # Optional: target specific outputs (e.g. "0000_part")
    ends_with=".parquet",  # Optional: ignore non-parquet files
    contains="part"        # Optional: only match names containing this substring
)
# Example output: ENTERPRISE_REPORT_0001.parquet
```
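
Amazon S3 has no rename operation, so a tool of this kind has to copy each object to its new key and then delete the original. The sketch below shows one way such a re-sequencing pass could work; it is not the `S3Finalizer` source. The helper names `plan_renames` and `apply_renames`, and the choice to match the filters against the file name rather than the full key, are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def plan_renames(
    keys: Iterable[str],
    s3_prefix: str,
    pattern: str,
    starts_with: str = "",
    ends_with: str = "",
    contains: str = "",
) -> List[Tuple[str, str]]:
    """Pure planning step: pair each matching key with its new sequential key."""
    matches = []
    for key in sorted(keys):  # sorted so sequence numbers are deterministic
        if not key.startswith(s3_prefix):
            continue
        name = key.rsplit("/", 1)[-1]  # assumption: filters apply to the file name
        if not name:
            continue  # skip "folder" placeholder keys that end with "/"
        if name.startswith(starts_with) and name.endswith(ends_with) and contains in name:
            matches.append(key)
    return [
        (old_key, s3_prefix + pattern.format(seq=seq))
        for seq, old_key in enumerate(matches, start=1)
    ]


def apply_renames(bucket: str, plan: List[Tuple[str, str]]) -> None:
    """Execute the plan: an S3 'rename' is copy_object followed by delete_object."""
    import boto3  # imported lazily so the planning step runs without AWS access

    s3 = boto3.client("s3")
    for old_key, new_key in plan:
        s3.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        s3.delete_object(Bucket=bucket, Key=old_key)


# Hypothetical listing of a Spark output folder
listing = [
    "raw_zone/sales/_SUCCESS",
    "raw_zone/sales/part-00001-b.parquet",
    "raw_zone/sales/part-00000-a.parquet",
]
for old_key, new_key in plan_renames(
    listing,
    s3_prefix="raw_zone/sales/",
    pattern="ENTERPRISE_REPORT_{seq:04d}.parquet",
    ends_with=".parquet",
    contains="part",
):
    print(old_key, "->", new_key)
```

Separating the plan from its execution makes it possible to review or dry-run the renames before touching the bucket. Note that a single `copy_object` call handles objects up to 5 GB; larger objects would need a multipart copy.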

### 2. AWS Glue FinOps Auditor (`AgAuditor`)
Run this standalone tool to scan your AWS CloudWatch telemetry and estimate how many thousands of dollars you are wasting each month on oversized AWS Glue workers provisioned just to keep Spark's Catalyst Optimizer from crashing.

```python
from antigravity_lite.auditor.finops import AgAuditor

# Scan the cluster using CloudWatch and AWS APIs.
# `dias_analisis` ("analysis days") is the look-back window in days.
AgAuditor.run_aws_audit(region="us-east-1", dias_analisis=7)
```
*Console Output:* The report maps your allocated *Workers* (G.1X, G.4X) against actual JVM heap usage to show how much provisioned capacity, and therefore money, goes unused.
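
The arithmetic behind such a report can be sketched as follows. This is not the `AgAuditor` implementation: the function name `estimate_rightsizing` is hypothetical, the worker sizes follow the AWS Glue documentation (16 GB of memory per DPU), the default price of 0.44 USD per DPU-hour is an assumption that varies by region, and the 85% ceiling mirrors the heap threshold this README mentions. The peak heap fraction is taken as an input, however it was measured.

```python
from typing import Tuple

# DPUs per AWS Glue worker type; each DPU provides 16 GB of memory.
WORKER_DPUS = {"G.1X": 1, "G.2X": 2, "G.4X": 4, "G.8X": 8}
GB_PER_DPU = 16
HEAP_CEILING = 0.85  # keep projected usage below 85% of the smaller worker


def estimate_rightsizing(
    worker_type: str,
    num_workers: int,
    hours_per_month: float,
    peak_heap_fraction: float,
    price_per_dpu_hour: float = 0.44,  # assumption: check your region's rate
) -> Tuple[str, float]:
    """Return (recommended worker type, estimated monthly savings in USD)."""
    current_dpus = WORKER_DPUS[worker_type]
    # Simplification: treat heap usage as a fraction of the worker's total memory.
    used_gb = peak_heap_fraction * current_dpus * GB_PER_DPU

    recommended = worker_type
    for candidate, dpus in sorted(WORKER_DPUS.items(), key=lambda kv: kv[1]):
        if dpus >= current_dpus:
            break  # only consider workers smaller than the current one
        if used_gb <= HEAP_CEILING * dpus * GB_PER_DPU:
            recommended = candidate  # smallest worker that still fits
            break

    saved_dpus = (current_dpus - WORKER_DPUS[recommended]) * num_workers
    return recommended, round(saved_dpus * hours_per_month * price_per_dpu_hour, 2)


# Ten G.4X workers running 100 hours a month with a 15% peak heap
print(estimate_rightsizing("G.4X", 10, 100, 0.15))  # ('G.1X', 1320.0)
```

The sketch only ever recommends a smaller worker; a job whose heap already exceeds the ceiling is left on its current type with zero savings.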

---

## 💎 Commercial Licensing (Antigravity PRO)

The *Lite* version can tell you you're burning thousands of dollars... **Purchasing the Antigravity PRO Enterprise License actually fixes it.**

If your `AgAuditor` report flags an **"⚠️ AST/OOM RISK"** or your heap usage spikes past 85%, you need the **DataFrameChunker** mathematical engine (exclusive to the PRO B2B Edition).
The enterprise version intercepts Spark's low-level planner and vertically slices the execution plan using **Logarithmic Binary Trees (Tree Reduce)** to forcibly truncate the AST *Lineage*. This shrinks the memory footprint enough to process half a billion operations on small `G.1X` clusters without `OutOfMemory` failures.
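
To illustrate why a tree-shaped reduction shortens lineage, the sketch below compares a left-to-right chain with a pairwise tree reduce on plain Python values. It is a generic illustration, not the proprietary `DataFrameChunker`: `tree_reduce` and `combine` are hypothetical names, and each `(depth, value)` pair stands in for a partial result together with the depth of the plan that produced it.

```python
from functools import reduce
from typing import Callable, List, Sequence, Tuple

Node = Tuple[int, int]  # (plan depth, accumulated value)


def combine(a: Node, b: Node) -> Node:
    """Merge two partial results; the merged plan is one level deeper."""
    return (max(a[0], b[0]) + 1, a[1] + b[1])


def tree_reduce(items: Sequence[Node], op: Callable[[Node, Node], Node]) -> Node:
    """Combine neighbours pairwise, level by level, until one item remains."""
    if not items:
        raise ValueError("tree_reduce() needs at least one item")
    level: List[Node] = list(items)
    while len(level) > 1:
        merged = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd item is carried up to the next level unchanged
            merged.append(level[-1])
        level = merged
    return level[0]


chunks = [(0, i) for i in range(1024)]

chained = reduce(combine, chunks)      # sequential fold: depth grows linearly
treed = tree_reduce(chunks, combine)   # pairwise merge: depth grows as log2(n)

print(chained)  # (1023, 523776)
print(treed)    # (10, 523776)
```

Both strategies produce the same value, but the chained fold builds a plan 1023 levels deep while the tree needs only 10 levels for 1024 chunks.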

💻 **Request a Proof-of-Concept or Live Architecture Demo for B2B deployment by connecting via LinkedIn.**
