Metadata-Version: 2.4
Name: linearscript
Version: 3.1.2
Summary: SCRIPT: A deterministic, RDKit-independent molecular notation with 100% round-trip stereo parity, materials science extensions, biopolymer support, and formal LALR grammar.
Author-email: SCRIPT Development Team <script@example.com>
Maintainer-email: SCRIPT Development Team <script@example.com>
License: MIT
Project-URL: Homepage, https://github.com/sangeet01/script
Project-URL: Documentation, https://sangeet01.readthedocs.io
Project-URL: Repository, https://github.com/sangeet01/script.git
Project-URL: Bug Tracker, https://github.com/sangeet01/script/issues
Keywords: chemistry,cheminformatics,materials-science,smiles,molecular-notation,biopolymer,stereochemistry,alloys,crystallography
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: lark>=1.1.0
Provides-Extra: rdkit
Requires-Dist: rdkit>=2023.3.1; extra == "rdkit"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=22.0; extra == "dev"
Requires-Dist: flake8>=5.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Provides-Extra: all
Requires-Dist: rdkit>=2023.3.1; extra == "all"
Dynamic: license-file

# script 

**Structural Chemical Representation In Plain Text (SCRIPT)** is a deterministic, canonical molecular notation designed to address the ambiguity and complexity of SMILES. Its core engine does not depend on RDKit, and every molecule maps to exactly one valid string representation, with no post-hoc sanitization required.

## Installation

```bash
# Core engine (no RDKit required)
pip install linearscript

# Full suite with RDKit bridge for SMILES interoperability
# (quoted so that shells such as zsh do not try to expand the brackets)
pip install "linearscript[rdkit]"
```

## Why SCRIPT?

SMILES can represent a single molecule in dozens of valid ways (aspirin, for example, has more than 50 valid SMILES strings). SCRIPT instead uses a deterministic Sandhi state machine to ensure a 1-to-1 mapping between a molecular graph and its text representation.

```text
Aspirin in SMILES:  CC(=O)Oc1ccccc1C(=O)O  (one of many valid forms)
Aspirin in SCRIPT:  CC(=O)OC:C:C:C:C:C&6:C(=O)O  (always and only this)
```

### Key Capabilities
- **Canonical by Design**: A DFS traversal combined with Morgan ranking yields exactly one canonical string per molecule.
- **RDKit-Free Core**: Parse, validate, and manipulate molecular graphs without heavy cheminformatics dependencies.
- **Advanced Chemistry**: Full support for organometallics (dative/haptic/coordinate bonds), alloys with fractional occupancies (`<~0.9>`), and electronic/excited states (`<s:3>`, `<*>`).
- **Materials & Surfaces**: Natively encode crystallographic contexts (`[[Rutile]]`) and surface chemistry interfaces (`|`).
- **Biopolymers**: Built-in grammar for peptide and nucleic acid chains (`{A.G.S}`).
- **Chemical Reactions**: Robust multi-component reaction representation (`R>>P`).
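
To give a feel for the Morgan ranking mentioned under "Canonical by Design", here is a minimal sketch of the textbook Morgan extended-connectivity scheme on a bare adjacency list. It is not SCRIPT's implementation: the function name `morgan_ranks` and the graph encoding are hypothetical, and the sketch ignores element types, bond orders, stereo, and the tie-breaking that a real canonicalizer needs.

```python
from typing import Dict, List


def morgan_ranks(adjacency: Dict[int, List[int]]) -> Dict[int, int]:
    """Rank atoms by Morgan-style extended connectivity (illustrative only)."""
    # Start from each atom's degree (its number of neighbours).
    values = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    distinct = len(set(values.values()))

    while True:
        # Each atom's refined value is the sum of its neighbours' current values.
        refined = {
            atom: sum(values[n] for n in nbrs)
            for atom, nbrs in adjacency.items()
        }
        refined_distinct = len(set(refined.values()))
        # Stop once refinement no longer separates more atom classes,
        # keeping the values from the previous round.
        if refined_distinct <= distinct:
            break
        values, distinct = refined, refined_distinct

    # Dense ranks: atoms with equal values (symmetric positions) share a rank.
    ordered = sorted(set(values.values()))
    return {atom: ordered.index(value) for atom, value in values.items()}


# Carbon skeleton of n-pentane: a five-atom chain 0-1-2-3-4.
pentane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(morgan_ranks(pentane))  # {0: 0, 1: 1, 2: 2, 3: 1, 4: 0}
```

The symmetric chain ends share rank 0 and the central atom gets the highest rank. A canonicalizer can use such ranks to decide where a DFS traversal starts and which branch it visits first, which is how the atom order becomes independent of the input order.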

## Quick Start

### Parse and Canonicalize

```python
from script.parser import SCRIPTParser
from script.canonical import SCRIPTCanonicalizer

parser = SCRIPTParser()
# Parse from a SMILES-style input (SCRIPT is backwards compatible with basic SMILES)
result = parser.parse("CC(=O)Oc1ccccc1C(=O)O")

mol = result["molecule"]
print(f"Atoms: {len(mol.atoms)}, Bonds: {len(mol.bonds)}")

# Generate the 1-to-1 unique canonical SCRIPT string
canon = SCRIPTCanonicalizer().canonicalize_core(mol)
print(canon)  # Output: CC(=O)OC:C:C:C:C:C&6:C(=O)O
```

### RDKit Interoperability

```python
from rdkit import Chem
from script.rdkit_bridge import SCRIPTFromMol, MolFromSCRIPT

# Convert RDKit Mol to SCRIPT
mol = Chem.MolFromSmiles("CN1CCC[C@H]1c2cccnc2")
script_str = SCRIPTFromMol(mol)

# Convert SCRIPT back to RDKit Mol
mol_back = MolFromSCRIPT(script_str)
```

## Documentation & Source

For full documentation, advanced usage tutorials, materials science examples, and the complete grammar specification, please visit the [GitHub Repository](https://github.com/sangeet01/script).

## License

MIT with Commons Clause. Free for academic and non-commercial use.
Commercial licensing available separately.

---

Developed by the **SCRIPT Development Team**.

GitHub: [sangeet01/script](https://github.com/sangeet01/script.git)
