Metadata-Version: 2.4
Name: langchain-community-inspire-hep
Version: 0.1.0
Summary: INSPIRE HEP tools and API wrapper for LangChain.
Keywords: langchain,physics,research,inspirehep,hep
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: langchain-core>=0.2.0
Requires-Dist: pydantic<3,>=1.10
Requires-Dist: requests>=2.31.0
Provides-Extra: test
Requires-Dist: pytest>=7.0; extra == "test"

# INSPIRE HEP Tools for LangChain

Integration with [INSPIRE HEP](https://inspirehep.net), the trusted community hub for high energy physics research literature.

## Overview

This contribution adds three tools for searching and retrieving physics papers from the INSPIRE HEP database:

- **Search Literature**: Find physics papers by topic with flexible sorting options
- **Get Author Papers**: Retrieve an author's publications (requires INSPIRE identifiers)
- **Get Paper Details**: Fetch complete information about a specific paper by record ID

## Features

✅ Search by topic, author, or citation count  
✅ Sort results by most recent or most cited  
✅ Configurable result limits  
✅ Comprehensive error handling  
✅ Test suite with 15 unit tests and 5 integration tests

## Installation
```bash
pip install langchain-community
```

## Quick Start

### Basic Usage
```python
from langchain_community.tools.inspire_hep import INSPIRESearchLiteratureTool

tool = INSPIRESearchLiteratureTool()

# Search for papers
result = tool.invoke({"query": "quantum field theory"})
print(result)

# Search with sorting
result = tool.invoke({
    "query": "string theory",
    "sort": "mostcited"  # or "mostrecent"
})
```

### All Three Tools
```python
from langchain_community.tools.inspire_hep import (
    INSPIRESearchLiteratureTool,
    INSPIREGetAuthorPapersTool,
    INSPIREGetPaperDetailsTool,
)

# Search for papers on a topic
search_tool = INSPIRESearchLiteratureTool()
result = search_tool.invoke({
    "query": "quantum gravity",
    "sort": "mostrecent"
})

# Get an author's papers (requires INSPIRE identifier)
author_tool = INSPIREGetAuthorPapersTool()
result = author_tool.invoke({
    "author_name": "Witten.Edward.1",
    "sort": "mostcited"
})

# Get details of a specific paper
details_tool = INSPIREGetPaperDetailsTool()
result = details_tool.invoke({
    "record_id": "451647"  # Maldacena's AdS/CFT paper
})
```

## Using with AI Agents
```python
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.tools.inspire_hep import (
    INSPIRESearchLiteratureTool,
    INSPIREGetAuthorPapersTool,
)

# Create tools
tools = [
    INSPIRESearchLiteratureTool(),
    INSPIREGetAuthorPapersTool(),
]

# Create LLM (use models with good tool calling support)
llm = ChatOpenAI(model="gpt-4", temperature=0)

# Create prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a physics research assistant with access to INSPIRE HEP."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create agent
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Use the agent
result = agent_executor.invoke({
    "input": "What are the most cited papers on string theory?"
})
print(result['output'])
```

## Direct API Access (Without Agents)

For direct API access without LangChain agents:
```python
from langchain_community.utilities.inspire_hep import INSPIREHEPAPIWrapper

# Create wrapper with custom settings
wrapper = INSPIREHEPAPIWrapper(top_k_results=5)

# Search papers
papers = wrapper.search_literature("quantum gravity", sort="mostcited")
print(papers)

# Get author papers
author_papers = wrapper.get_author_papers("Witten.Edward.1", sort="mostrecent")
print(author_papers)

# Get paper details
details = wrapper.get_paper_details("451647")
print(details)
```

## Sorting Options

The search and author tools both accept a `sort` argument with two possible values:

- **`mostrecent`** (default for search): Newest papers first
- **`mostcited`** (default for author): Most cited papers first
```python
# Find recent breakthroughs
tool.invoke({"query": "quantum computing", "sort": "mostrecent"})

# Find influential papers
tool.invoke({"query": "supersymmetry", "sort": "mostcited"})
```
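
The two `sort` values match the `sort` query parameter of the INSPIRE REST API. As a rough illustration of how a search could translate into a request URL, here is a minimal sketch. The helper name `build_literature_url` and its defaults are hypothetical and not part of this package; the wrapper's actual request construction may differ.

```python
from urllib.parse import urlencode

INSPIRE_LITERATURE_URL = "https://inspirehep.net/api/literature"
VALID_SORTS = ("mostrecent", "mostcited")


def build_literature_url(query, sort="mostrecent", size=5):
    """Build an INSPIRE literature search URL (hypothetical helper; sends no request)."""
    if sort not in VALID_SORTS:
        raise ValueError(f"sort must be one of {VALID_SORTS}, got {sort!r}")
    # q, sort, and size are query parameters of the INSPIRE REST API
    params = {"q": query, "sort": sort, "size": size}
    return f"{INSPIRE_LITERATURE_URL}?{urlencode(params)}"


url = build_literature_url("supersymmetry", sort="mostcited", size=3)
print(url)
```

Rejecting unknown sort values before any request is sent keeps a typo such as `"most_cited"` from reaching the API.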

## Finding Author Identifiers

The author papers tool requires INSPIRE identifiers (format: `Lastname.Firstname.N`), not plain names:

1. Go to https://inspirehep.net/authors
2. Search for the author by name
3. Click on their profile
4. Use the identifier shown (e.g., `Witten.Edward.1`)

**Why?** Plain names are ambiguous (many physicists share the same name), while INSPIRE identifiers are unique.
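
Before calling the author tool, it can help to check whether a string at least has the shape of an identifier. The sketch below is a loose format check based only on the `Lastname.Firstname.N` pattern described above. The function name `looks_like_inspire_identifier` is hypothetical, and a match does not guarantee that the identifier exists on INSPIRE.

```python
import re

# Loose pattern: two or more dot-separated name parts, then a numeric suffix.
# This is an assumption drawn from the Lastname.Firstname.N format, not an
# official INSPIRE specification.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z][A-Za-z\-']*(?:\.[A-Za-z][A-Za-z\-']*)+\.\d+$")


def looks_like_inspire_identifier(value):
    """Return True if value has the shape of an INSPIRE author identifier."""
    return bool(_IDENTIFIER_RE.match(value))


for candidate in ("Witten.Edward.1", "Edward Witten"):
    print(candidate, looks_like_inspire_identifier(candidate))
```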

## Advanced Search Syntax

INSPIRE HEP supports advanced search queries:
```python
# Highly cited papers (1000+ citations)
tool.invoke({"query": "topcite 1000+"})

# Papers by specific author
tool.invoke({"query": "author:Witten"})

# Papers in date range
tool.invoke({"query": "date 2020->2024"})

# Combine criteria
tool.invoke({"query": "quantum gravity topcite 500+"})
```

See the [INSPIRE HEP search guide](https://help.inspirehep.net/knowledge-base/inspire-paper-search/) for more syntax.
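
When criteria come from user input, assembling the query string in one place can reduce mistakes. The following is a minimal sketch that uses only the syntax shown above. The `build_query` helper and its parameter names are hypothetical, and joining criteria with spaces simply mirrors the combined example; more complex combinations may need explicit operators described in the search guide.

```python
def build_query(topic=None, author=None, min_citations=None, date_range=None):
    """Compose an INSPIRE query string from optional criteria (hypothetical helper)."""
    parts = []
    if topic:
        parts.append(topic)
    if author:
        parts.append(f"author:{author}")
    if min_citations is not None:
        parts.append(f"topcite {min_citations}+")
    if date_range is not None:
        start, end = date_range
        parts.append(f"date {start}->{end}")
    if not parts:
        raise ValueError("at least one criterion is required")
    # Space-separated criteria, as in the "Combine criteria" example
    return " ".join(parts)


query = build_query(topic="quantum gravity", min_citations=500)
print(query)
```

The resulting string can then be passed as the `query` argument of the search tool.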

## API Rate Limiting

INSPIRE HEP enforces a rate limit of **15 requests per 5 seconds per IP address**. The wrapper handles basic rate limiting, but callers should still avoid sending bursts of rapid successive requests.
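
For scripts that issue many calls in a loop, a caller-side limiter is one way to stay under this limit. The sketch below is a sliding-window limiter that is independent of the wrapper's own handling. The class name `SlidingWindowLimiter` is hypothetical, and the `clock` and `sleep` parameters exist only to make the behaviour easy to test.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any period-second window (illustrative sketch)."""

    def __init__(self, max_calls=15, period=5.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call is allowed; return the number of seconds waited."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget calls that have left the window
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return waited
            # Window is full: wait until the oldest call expires
            delay = self.period - (now - self._calls[0])
            self._sleep(delay)
            waited += delay


limiter = SlidingWindowLimiter()
limiter.acquire()  # call this before each request to INSPIRE
```

Calling `limiter.acquire()` before each `tool.invoke(...)` or wrapper call spaces requests out without changing the tools themselves.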

## Testing

This contribution includes comprehensive test coverage:

- **15 unit tests**: Test wrapper and tools with mocked API responses
- **5 integration tests**: Test with real API calls

Run tests:
```bash
# Unit tests (fast, no internet required)
pytest tests/unit_tests/test_inspire_hep.py -v

# Integration tests (requires internet)
pytest tests/integration_tests/test_inspire_hep_integrations.py -v

# All tests
pytest tests/ -v
```
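
To illustrate what testing "with mocked API responses" can look like, here is a minimal self-contained sketch. It does not reproduce the tests shipped with this contribution: `fetch_titles` is a hypothetical helper, and the fake payload only imitates the `hits.hits[].metadata.titles` layout of INSPIRE literature responses.

```python
from unittest.mock import Mock


def fetch_titles(session, query):
    """Hypothetical helper: return paper titles for a literature search."""
    response = session.get(
        "https://inspirehep.net/api/literature",
        params={"q": query, "size": 5},
        timeout=10,
    )
    response.raise_for_status()
    hits = response.json()["hits"]["hits"]
    return [hit["metadata"]["titles"][0]["title"] for hit in hits]


def test_fetch_titles_with_mocked_response():
    # Fake response object standing in for a real HTTP response
    fake_response = Mock()
    fake_response.json.return_value = {
        "hits": {"hits": [{"metadata": {"titles": [{"title": "A mocked paper"}]}}]}
    }
    # Fake session, so no network access is needed
    session = Mock()
    session.get.return_value = fake_response

    assert fetch_titles(session, "quantum gravity") == ["A mocked paper"]
    session.get.assert_called_once()
```

Because the session is injected, the same function can run against a real `requests.Session` in integration tests.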

## Known Limitations

1. **Author identifiers required**: The author papers tool works reliably only with INSPIRE identifiers, not plain names. Users must look up identifiers at https://inspirehep.net/authors.

2. **LLM compatibility**: Agent performance depends on the LLM's tool-calling capabilities. The tools work best with OpenAI GPT-4, Anthropic Claude, and other models with strong structured output support.

## Example Use Cases

### Research Assistant
```text
"What are the most influential papers on the AdS/CFT correspondence?"
→ Uses search_literature with sort="mostcited"
```

### Literature Review
```text
"Find recent papers on quantum entanglement from the last year"
→ Uses search_literature with sort="mostrecent"
```

### Author Research
```text
"What are Edward Witten's most cited contributions?"
→ Uses get_author_papers with author identifier
```

### Paper Deep Dive
```text
"Tell me about INSPIRE record 451647"
→ Uses get_paper_details for full information
```

## Citation

If you use INSPIRE HEP in your research, please cite:
```bibtex
@article{Moskovic:2021zjs,
    author = "Moskovic, Micha",
    title = "{The INSPIRE REST API}",
    url = "https://github.com/inspirehep/rest-api-doc",
    doi = "10.5281/zenodo.5788550",
    month = "12",
    year = "2021"
}
```

## Contributing

This is an open-source contribution to LangChain. Future enhancements could include:

- Job search functionality
- Conference search
- Advanced filtering options
- Citation graph traversal
- Batch operations

## Resources

- [INSPIRE HEP Website](https://inspirehep.net)
- [INSPIRE API Documentation](https://github.com/inspirehep/rest-api-doc)
- [LangChain Documentation](https://python.langchain.com/)
- [LangChain Contributing Guide](https://github.com/langchain-ai/langchain/blob/master/CONTRIBUTING.md)

## License

This contribution follows LangChain's MIT License.
