Metadata-Version: 2.4
Name: hca-smart-sync
Version: 0.2.1
Summary: Intelligent S3 synchronization for HCA Atlas data
License: Apache-2.0
License-File: LICENSE
Keywords: HCA,S3,sync,bioinformatics,datasets,checksum,manifest
Author: HCA Team
Author-email: dave@clevercanary.com
Requires-Python: >=3.10,<4.0
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering
Requires-Dist: boto3 (>=1.34.0,<2.0.0)
Requires-Dist: botocore (>=1.34.0,<2.0.0)
Requires-Dist: natsort (>=8.4.0,<9.0.0)
Requires-Dist: pydantic (>=2.0.0,<3.0.0)
Requires-Dist: pydantic-settings (>=2.0.0,<3.0.0)
Requires-Dist: pyyaml (>=6.0.0,<7.0.0)
Requires-Dist: rich (>=13.0.0,<14.0.0)
Requires-Dist: typer (>=0.16.0,<0.17.0)
Project-URL: Homepage, https://github.com/clevercanary/hca-ingest-tools/tree/main/smart-sync
Project-URL: Issues, https://github.com/clevercanary/hca-ingest-tools/issues
Project-URL: Repository, https://github.com/clevercanary/hca-ingest-tools
Description-Content-Type: text/markdown

# HCA Smart-Sync

Intelligent S3 data synchronization for HCA Atlas source datasets and integrated objects.

## Installation

```bash
# From the root of the hca-ingest-tools repository
cd smart-sync
poetry install
```

## Usage

```bash
# Basic sync
poetry run hca-smart-sync sync gut-v1 --profile my-profile

# Dry run
poetry run hca-smart-sync sync gut-v1 --profile my-profile --dry-run

# Target the dev environment (bucket: hca-atlas-tracker-data-dev)
poetry run hca-smart-sync sync gut-v1 --profile my-profile --environment dev
```

## Development

```bash
# Install development dependencies
make dev

# Run tests
make test-all

# Run with coverage
make test-cov

# Run linting
make lint

# Format code
make format
```

## Features

- SHA256 checksum-based synchronization
- Manifest-driven uploads
- AWS CLI integration with progress display
- Environment-based bucket selection
- Interactive upload confirmation
- Research-grade data integrity verification

## Configuration

The target bucket is selected by environment, set with the `--environment` option:

- `prod` (default): `hca-atlas-tracker-data`
- `dev`: `hca-atlas-tracker-data-dev`
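
The mapping above can be pictured as a simple lookup. The following is a minimal sketch, not the tool's actual configuration code: the bucket names and the `prod` default come from the list above, while the `BUCKETS` constant and the `resolve_bucket` function are hypothetical names introduced for illustration.

```python
# Bucket names as documented above; the constant and function names are hypothetical.
BUCKETS = {
    "prod": "hca-atlas-tracker-data",
    "dev": "hca-atlas-tracker-data-dev",
}


def resolve_bucket(environment: str = "prod") -> str:
    """Return the bucket for an environment, defaulting to prod."""
    try:
        return BUCKETS[environment]
    except KeyError:
        valid = ", ".join(sorted(BUCKETS))
        raise ValueError(
            f"Unknown environment {environment!r}; expected one of: {valid}"
        ) from None
```

Rejecting unknown environment names, rather than falling back to `prod`, avoids writing to the production bucket because of a typo.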

## Requirements

- Python 3.10+
- AWS CLI configured with appropriate profiles
- Poetry for dependency management

