Metadata-Version: 2.4
Name: timenorm-py
Version: 0.1.0
Summary: Python-native temporal expression parser and normalizer
Author: Timenorm Python Contributors
License: Apache-2.0
Project-URL: Homepage, https://github.com/clulab/timenorm
Project-URL: Documentation, https://github.com/clulab/timenorm
Project-URL: Repository, https://github.com/clulab/timenorm
Project-URL: Issues, https://github.com/clulab/timenorm/issues
Keywords: nlp,temporal,time,normalization,parsing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: python-dateutil>=2.8.0
Requires-Dist: tensorflow<3.0.0,>=2.12.0
Requires-Dist: numpy>=1.24.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Dynamic: license-file

# Timenorm-Py

A Python-native temporal expression parser and normalizer based on the [timenorm](https://github.com/clulab/timenorm) library.

## Overview

Timenorm-Py finds and normalizes temporal expressions in natural language text using a neural network-based approach built on SCATE (Semantically Compositional Annotation of Time Expressions).

**Example:**
```python
from timenorm import TemporalParser
import datetime

parser = TemporalParser()
text = "I saw her last week and will meet her next Tuesday."
anchor = datetime.datetime(2024, 11, 15)

results = parser.parse(text, anchor)
# Returns temporal expressions with normalized intervals:
# - "last week" → Interval(2024-11-08, 2024-11-15)
# - "next Tuesday" → Interval(2024-11-19, 2024-11-20)
```

## Features

- 🧠 **Neural Parser**: Character-level RNN for accurate temporal expression identification
- 🔧 **Compositional Operators**: Build complex temporal expressions from simple operators (Last, Next, This, Before, After, etc.)
- 📅 **Python-Native**: Built with Python's `datetime` and `dateutil` for seamless integration
- ✅ **Well-Tested**: Comprehensive test suite matching the original Scala implementation

## Installation

### From GitHub

```bash
# Clone the repository
git clone https://github.com/dadhichgaurav1/temporalextractor-timenorm-py.git
cd temporalextractor-timenorm-py

# Install in development mode
pip install -e .
```

### From PyPI (Coming Soon)

```bash
pip install timenorm-py
```

### Requirements

- Python 3.10+
- `python-dateutil`
- `tensorflow` (declared as a dependency in the package metadata; used for neural network inference)
- `numpy`

## Quick Start

### Simple Parsing

```python
from timenorm import TemporalParser, Interval

# Create parser
parser = TemporalParser()

# Parse with anchor time (document creation time)
anchor = Interval.of(2024, 11, 19)
text = "I saw her last week and will meet her next Tuesday."

# Note: detection requires a trained TensorFlow model.
# Without one, parse() currently returns no results; the surrounding infrastructure is in place.
results = parser.parse(text, anchor=anchor)
```

### Using Direct Temporal Algebra

```python
from timenorm import Interval, Period, Last, Next, DAY, WEEK, MONTH
import datetime

# Create intervals
anchor = Interval.of(2024, 11, 19)
year_2024 = Interval.of(2024)
march_15 = Interval.of(2024, 3, 15)

# Period arithmetic
three_months = Period(MONTH, 3)
start = datetime.datetime(2024, 1, 1)
interval = start + three_months  # January 1 + 3 months = April 1

# Temporal operators
last_week = Last(anchor, Period(DAY, 7))
print(f"Last week: {last_week.start} to {last_week.end}")

next_weeks = Next(anchor, Period(WEEK, 3))
print(f"Next 3 weeks: {next_weeks.start} to {next_weeks.end}")
```

### Parsing from XML (Anafora Format)

```python
from timenorm import TemporalParser, Interval

parser = TemporalParser()
anchor = Interval.of(2024, 11, 19)

# Parse from Anafora XML file
results = parser.parse_xml("annotations.xml", anchor=anchor)

for expr in results:
    print(f"{expr}: {expr.start} to {expr.end}")
```
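
Anafora stores annotations as XML entities, each with an id, a character span, a type, and properties. The snippet below reads such entities with the standard library. The sample document is a hand-written, simplified illustration of the format (the ids and values are made up), and it does not show how `parse_xml` works internally:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified Anafora-style document for illustration only.
SAMPLE = """<data>
  <annotations>
    <entity>
      <id>1@e@sample@gold</id>
      <span>0,4</span>
      <type>Year</type>
      <properties><Value>2024</Value></properties>
    </entity>
  </annotations>
</data>"""

root = ET.fromstring(SAMPLE)
entities = []
for entity in root.iter("entity"):
    # A simple span is stored as "start,end" character offsets.
    start, end = (int(offset) for offset in entity.findtext("span").split(","))
    entities.append((entity.findtext("id"), entity.findtext("type"), start, end))

print(entities)
```

Real Anafora files may also contain discontinuous spans and links between entities, which this sketch does not handle.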

### Batch Processing

```python
from timenorm import TemporalParser, Interval

parser = TemporalParser()
text = "Monday meeting. Tuesday lunch. Wednesday presentation."
# Character offsets of each sentence, assuming end-exclusive spans (text[start:end])
spans = [(0, 15), (16, 30), (31, 54)]
anchor = Interval.of(2024, 11, 19)

results = parser.parse_batch(text, spans, anchor=anchor)
for i, batch_result in enumerate(results):
    print(f"Batch {i+1}: {batch_result}")
```
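
The character spans passed to `parse_batch` can be computed rather than counted by hand. A minimal sketch using only the standard library, assuming spans are end-exclusive `(start, end)` offsets as in Python slices and that every sentence ends with a period; `sentence_spans` is a hypothetical helper, not part of timenorm:

```python
import re

def sentence_spans(text: str) -> list[tuple[int, int]]:
    """Return end-exclusive (start, end) offsets of period-terminated sentences."""
    # Each match starts at the first non-space character and runs through the period.
    return [m.span() for m in re.finditer(r"[^\s.][^.]*\.", text)]

text = "Monday meeting. Tuesday lunch. Wednesday presentation."
spans = sentence_spans(text)
for start, end in spans:
    print((start, end), repr(text[start:end]))
```

Printing each slice next to its offsets makes off-by-one errors in the spans easy to spot.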

## Core Concepts

### Intervals
Temporal intervals on the timeline with start (inclusive) and end (exclusive) points:
```python
from timenorm import Interval

# Year 2024
year = Interval.of(2024)  # 2024-01-01 to 2025-01-01

# Specific day
day = Interval.of(2024, 11, 15)  # 2024-11-15 to 2024-11-16
```
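
Because the end point is exclusive, adjacent intervals share a boundary without overlapping. The sketch below illustrates that convention with plain `datetime` objects; `HalfOpenInterval` is a hypothetical stand-in for illustration, not timenorm's `Interval` class:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class HalfOpenInterval:
    """Illustrative [start, end) interval; not timenorm's Interval."""
    start: datetime
    end: datetime

    def __contains__(self, moment: datetime) -> bool:
        # The start is included, the end is excluded.
        return self.start <= moment < self.end

nov_15 = HalfOpenInterval(datetime(2024, 11, 15), datetime(2024, 11, 16))
nov_16 = HalfOpenInterval(datetime(2024, 11, 16), datetime(2024, 11, 17))

print(datetime(2024, 11, 15, 23, 59) in nov_15)  # True
print(datetime(2024, 11, 16) in nov_15)          # False: the end is exclusive
print(datetime(2024, 11, 16) in nov_16)          # True: midnight belongs to the next day
```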

### Periods
Amounts of time independent of the timeline:
```python
from timenorm import Period, MONTH, WEEK

three_months = Period(MONTH, 3)
two_weeks = Period(WEEK, 2)
```
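
A period only becomes a concrete span once it is added to a point on the timeline, and the result depends on the calendar: three months from January 1 is always April 1, but the number of days covered varies by year. The sketch below shows that arithmetic with the standard library. `add_months` is a hypothetical helper, and clamping the day to the length of the target month is an assumption of this sketch, not a statement about how `Period` behaves:

```python
import calendar
from datetime import datetime

def add_months(moment: datetime, months: int) -> datetime:
    """Add calendar months, clamping the day to the target month's length."""
    # Work with a zero-based month index so divmod handles year rollover.
    index = moment.year * 12 + (moment.month - 1) + months
    year, month = divmod(index, 12)
    last_day = calendar.monthrange(year, month + 1)[1]
    return moment.replace(year=year, month=month + 1, day=min(moment.day, last_day))

start = datetime(2024, 1, 1)
end = add_months(start, 3)
print(end.date())          # 2024-04-01
print((end - start).days)  # 91, because 2024 is a leap year
print((add_months(datetime(2023, 1, 1), 3) - datetime(2023, 1, 1)).days)  # 90
```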

### Operators
Compositional operators for building complex temporal expressions:
```python
from timenorm import Interval, Last, Next, Period, Repeating, DAY, WEEK

anchor = Interval.of(2024, 11, 15)

# "last 7 days"
last_week = Last(anchor, Period(DAY, 7))

# "next Tuesday"
next_tuesday = Next(anchor, Repeating(DAY, WEEK, value=1))  # Tuesday = 1
```
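
For intuition, the two operators above can be approximated with plain `datetime` arithmetic. This sketch assumes that "last N days" ends where the anchor starts and that "next Tuesday" is the first Tuesday beginning at or after the anchor's end. Those readings agree with the intervals shown in the Overview example, but the helper names and their logic are illustrative, not timenorm's implementation:

```python
from datetime import datetime, timedelta

TUESDAY = 1  # datetime.weekday() convention: Monday = 0

def last_days(anchor_start: datetime, days: int) -> tuple[datetime, datetime]:
    """Return [anchor_start - days, anchor_start), ending where the anchor begins."""
    return anchor_start - timedelta(days=days), anchor_start

def next_weekday(anchor_end: datetime, weekday: int) -> tuple[datetime, datetime]:
    """Return the first whole day with the given weekday starting at or after anchor_end."""
    # Round up to the next midnight if anchor_end is not on a day boundary.
    day = datetime(anchor_end.year, anchor_end.month, anchor_end.day)
    if day < anchor_end:
        day += timedelta(days=1)
    # Step forward to the requested weekday (zero steps if already there).
    day += timedelta(days=(weekday - day.weekday()) % 7)
    return day, day + timedelta(days=1)

# Anchor day 2024-11-15 as a half-open interval [2024-11-15, 2024-11-16)
anchor_start, anchor_end = datetime(2024, 11, 15), datetime(2024, 11, 16)
print(last_days(anchor_start, 7))         # 2024-11-08 to 2024-11-15
print(next_weekday(anchor_end, TUESDAY))  # 2024-11-19 to 2024-11-20
```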

## Requirements

- Python >= 3.10
- TensorFlow >= 2.12
- python-dateutil >= 2.8

## Credits

This is a Python-native reimplementation of [timenorm](https://github.com/clulab/timenorm), originally developed by Steven Bethard, Egoitz Laparra, and Dongfang Xu at the University of Arizona's Computational Language Understanding Lab (CLU Lab).

### Original Authors
- **Steven Bethard** - University of Arizona
- **Egoitz Laparra** - University of Arizona
- **Dongfang Xu** - University of Arizona

### Research Papers

The temporal expression normalization approach implemented in this library is based on:

1. Laparra, E., Xu, D., & Bethard, S. (2018). **From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations**. *Transactions of the Association for Computational Linguistics*, 6, 343-356.

2. Xu, D., Laparra, E., & Bethard, S. (2019). **Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: A Detailed Analysis**. *Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics*.

### Acknowledgments

This Python implementation:
- Maintains API compatibility with the original SCATE component
- Uses the same compositional semantics for temporal expressions
- Follows the architectural patterns from the original Scala implementation
- Includes resources (vocabularies, labels, schemas) from the original project

### License

Both the original timenorm and this Python implementation are licensed under the **Apache License 2.0**. See the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! This project aims to maintain compatibility with the original timenorm while providing a pure-Python implementation.

Areas for contribution:
- Neural network model integration
- Additional language support
- Performance optimizations
- Documentation improvements

## Citation

If you use this library in research, please cite the original papers:

```bibtex
@article{laparra2018characters,
  title={From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations},
  author={Laparra, Egoitz and Xu, Dongfang and Bethard, Steven},
  journal={Transactions of the Association for Computational Linguistics},
  volume={6},
  pages={343--356},
  year={2018}
}

@inproceedings{xu2019pre,
  title={Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: A Detailed Analysis},
  author={Xu, Dongfang and Laparra, Egoitz and Bethard, Steven},
  booktitle={Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics},
  year={2019}
}
```

## Contact

For questions about this Python implementation, please open an issue on GitHub.

For questions about the original timenorm project, see https://github.com/clulab/timenorm
