Metadata-Version: 2.4
Name: ai-for-research
Version: 0.5.0
Summary: Install the ai-for-research Agent Skills into a coding agent's skills directory.
Author: UnaryLab
License: MIT
Project-URL: Homepage, https://github.com/UnaryLab/ai-for-research
Project-URL: Repository, https://github.com/UnaryLab/ai-for-research
Project-URL: Issues, https://github.com/UnaryLab/ai-for-research/issues
Keywords: agent-skills,research,llm,claude,codex,gemini
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Environment :: Console
Classifier: Operating System :: POSIX
Classifier: Intended Audience :: Science/Research
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# AI-for-Research Skills

A collection of **Agent Skills** that cover the research workflow end to end — from
first idea to polished paper. Each skill is a self-contained `SKILL.md` (YAML
frontmatter with `name` + `description`, then Markdown instructions) that teaches an
agent how to do one research task well.
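
To make the file shape concrete, here is a minimal sketch of splitting a `SKILL.md` into its frontmatter fields and Markdown body. It assumes the frontmatter is fenced by two `---` lines and holds flat, single-line `key: value` pairs; real frontmatter may use richer YAML, in which case a proper YAML parser is the better choice. The function name `split_skill_md` and the sample text are hypothetical, not part of this repo.

```python
from typing import Dict, Tuple


def split_skill_md(text: str) -> Tuple[Dict[str, str], str]:
    """Split a SKILL.md into (frontmatter fields, Markdown body).

    Assumption: flat, single-line `key: value` pairs between two `---`
    lines. Nested or multi-line YAML needs a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter delimiter")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("missing closing '---' frontmatter delimiter") from None

    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and YAML comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip().strip("\"'")

    # The README names these two fields as the required frontmatter.
    for required in ("name", "description"):
        if not fields.get(required):
            raise ValueError(f"frontmatter is missing '{required}'")

    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


# Hypothetical SKILL.md content, for illustration only.
EXAMPLE = """---
name: paper-read
description: Walk through a paper with a three-pass method.
---
# Paper read

Start with a bird's-eye pass.
"""

fields, body = split_skill_md(EXAMPLE)
print(fields["name"])        # paper-read
print(body.splitlines()[0])  # # Paper read
```

The body returned here is the part you would paste into an LLM as context; the frontmatter is what an agent runtime reads to decide when the skill applies.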

The skills use the open **Agent Skills** format, so they work with any agent that
supports it — [Claude Code](https://claude.com/claude-code), OpenAI Codex,
Gemini CLI, and others — and the instructions are plain Markdown, so you can also drop a
`SKILL.md` straight into any LLM as context. **Nothing in them is Claude-specific.**

Everything is **tuned for three venue families**:

- **Top AI/ML conferences** — NeurIPS, ICML, ICLR, CVPR, ACL, …
- **Computer architecture** — ISCA, MICRO, ASPLOS, HPCA — plus the related systems,
  security, and HPC venues where architecture work appears (MLSys, SC, PACT, ISPASS,
  IISWC, SOSP, OSDI, USENIX ATC, NSDI, IEEE S&P, NDSS).
- **Nature-family science** — Nature, Science, Nature [subject], …

## The research workflow

The skills group into six stages that flow into one another. The pipeline is
**iterative, not one-shot** — expect to loop back to read more, re-evaluate an idea,
or re-run experiments before you write anything up.

```
   1 · GROUND        literature-survey  ·  paper-read  ·  chalk-talk
        │
        ▼
   2 · IDEATE        idea-brainstorm  ·  idea-evaluate
        │
        ▼
   3 · EXPERIMENT    code-implement  ·  experiment-scaffold  ·  code-clean
        │
        ▼
   4 · ANALYZE       result-generate            (raw results → figures & tables)
        │
        ▼
   5 · WRITE         paper-draft  ·  paper-polish
        │
        ▼
   6 · RELEASE       artifact-create            (cleaned code → reproducible Docker artifact)
```

Within a stage, skills pair up: **survey the field** then **read a paper**;
**generate ideas** then **judge them**; build **your own harness** while **running
others' code** as baselines; **draft** then **polish**. Between experiments and
writing, **result-generate** turns raw outputs into the paper's figures and tables,
and **code-clean** tidies the experiment code (yours or others') without changing what
it does. Finally, **artifact-create** packages the cleaned code into a reproducible
Docker artifact for release and artifact evaluation. Reading (stage 1) also feeds every
later stage — revisit it whenever a gap shows up. **chalk-talk** sits in stage 1 next to
reading but is cross-cutting: turn any concept — one you just read or one you're proposing —
into a teaching diagram at any stage.
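
If you want to script against the pipeline (for example, to suggest a next skill), the diagram can be written down as plain data. The sketch below copies stage names and membership from the diagram above; the helper names `stage_of` and `next_stage` are hypothetical, and "next" means only the default forward direction, since the workflow is iterative.

```python
from typing import Dict, List, Optional

# Stage order and membership copied from the workflow diagram.
STAGES: Dict[str, List[str]] = {
    "ground": ["literature-survey", "paper-read", "chalk-talk"],
    "ideate": ["idea-brainstorm", "idea-evaluate"],
    "experiment": ["code-implement", "experiment-scaffold", "code-clean"],
    "analyze": ["result-generate"],
    "write": ["paper-draft", "paper-polish"],
    "release": ["artifact-create"],
}


def stage_of(skill: str) -> str:
    """Return the stage a skill belongs to."""
    for stage, skills in STAGES.items():
        if skill in skills:
            return stage
    raise KeyError(f"unknown skill: {skill}")


def next_stage(skill: str) -> Optional[str]:
    """Return the stage that follows the skill's stage, or None at the end."""
    order = list(STAGES)  # dicts preserve insertion order
    index = order.index(stage_of(skill))
    return order[index + 1] if index + 1 < len(order) else None


print(stage_of("result-generate"))    # analyze
print(next_stage("paper-polish"))     # release
print(next_stage("artifact-create"))  # None
```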

**Entry points** — start wherever you actually are:

- New to a topic / need background → **literature-survey**, **paper-read**
- Need to teach or visually explain a single concept → **chalk-talk** (cross-cutting)
- Have a topic but no idea yet → **idea-brainstorm**
- Have an idea, unsure if it's worth it → **idea-evaluate**
- Ready to test an idea → **experiment-scaffold** (with **code-implement** for baselines)
- Code got messy / cleaning up before release → **code-clean**
- Have raw results to turn into figures/tables → **result-generate**
- Have exhibits and need to write them up → **paper-draft**, then **paper-polish**
- Packaging code for artifact evaluation / a reproducible release → **artifact-create**

## The skills

| Skill | Goal | Reach for it when… |
|-------|------|--------------------|
| **[literature-survey](skills/literature-survey/SKILL.md)** | Run a multi-source, multi-phase survey (search → read → synthesize) and produce a cited Markdown report with **inline hyperlink citations**. | You need the state of the art on a topic, or a related-work section / bibliography. |
| **[paper-read](skills/paper-read/SKILL.md)** | Walk through a paper hierarchically with **Keshav's three-pass method** (bird's-eye → grasp → deep), stopping when it's not worth going deeper. | You need to read, understand, triage, or review a paper (for a junior researcher). |
| **[chalk-talk](skills/chalk-talk/SKILL.md)** | Generate a chalkboard-style **teaching diagram** for one concept — bubbles + arrows, handwritten chalk text — as a layout plan, a paste-ready image-generation prompt, narration, and alt text. *(Cross-cutting — usable at any stage, not strictly sequential.)* | You want to teach or visually explain a single concept to a class, labmate, or talk audience. |
| **[idea-brainstorm](skills/idea-brainstorm/SKILL.md)** | Generate a wide set of research ideas for a topic, then **tier them by novelty** (T1 paradigm-shift → T5 incremental), verifying novelty against prior work. | You want new directions, or to rank ideas by how genuinely novel they are. |
| **[idea-evaluate](skills/idea-evaluate/SKILL.md)** | Judge an idea through the **IEEE Micro Top Picks** lens — long-term impact above all — and return a narrative verdict (Top Pick / Honorable Mention / Below the bar). | You need to decide whether an idea is worth pursuing, submitting, or dropping. |
| **[code-implement](skills/code-implement/SKILL.md)** | Reproduce and run **someone else's** code (GitHub repo or paper artifact) from install to the exact command that generates the result of interest. | You need to get external code working or reproduce a paper's results. |
| **[experiment-scaffold](skills/experiment-scaffold/SKILL.md)** | **Plan first**, get sign-off, then scaffold a runnable experiment framework (a validated "walking skeleton") to test **your own** idea. | You're standing up experiments from scratch to validate a hypothesis. |
| **[code-clean](skills/code-clean/SKILL.md)** | Clean up existing code — dead code, debug cruft, style, hardcoded values — **without changing behavior**, behind a safety net (tests or a golden snapshot) with a tool-evidenced deletion log. | You need to tidy/refactor code or strip cruft without breaking it. |
| **[result-generate](skills/result-generate/SKILL.md)** | Turn raw experiment outputs into publication-quality **figures and tables** — with the analysis (aggregation, error bars, significance, normalization) and honesty (show variance, no cherry-picking) behind them, plus a script that regenerates each exhibit. | You have results and need the paper's figures/tables. |
| **[paper-draft](skills/paper-draft/SKILL.md)** | Draft **new** paper prose for any section, venue-aware — brainstorm the framing first, lead with the contribution, avoid AI-tells. | You have results and need to write them up. |
| **[paper-polish](skills/paper-polish/SKILL.md)** | Critique and polish **existing** paper prose like an area chair — diagnose first, then optional fixes — venue-aware. | You have a draft and want it reviewed, tightened, or de-AI-ed. |
| **[artifact-create](skills/artifact-create/SKILL.md)** | Package your code into a reproducible **Docker artifact** for artifact evaluation — runs `code-clean` first, bakes in a pinned environment, and verifies it reproduces the results in a clean room, with an Artifact Appendix. | You're preparing an AE submission or a reproducible release. |

> Pairs that are easy to confuse:
> **experiment-scaffold** builds *your own* harness; **code-implement** runs *others'* code.
> **paper-draft** generates *new* prose; **paper-polish** improves *existing* prose.
> **artifact-create** packages *your* code into a runnable artifact; **code-implement** runs *someone else's*.

**Looking for paper review?** This repo stops at `paper-polish` (prose-level, section-by-section
critique). For a full **reviewer-style review of a paper before submission**, one that surfaces
what reviewers are likely to flag so you can fix it first, see the companion repo
**[UnaryLab/ai-paper-review](https://github.com/UnaryLab/ai-paper-review)**.

## How to use these skills

### 1. Make them available to your agent

Install the `ai-for-research` CLI — from PyPI, or editable from a clone — then run it for your
agent. It symlinks every skill into that agent's skills directory and is safe to re-run:

```bash
pip install ai-for-research                       # from PyPI

# — or editable from a clone (tracks updates via `git pull`) —
git clone https://github.com/UnaryLab/ai-for-research.git
cd ai-for-research && pip install -e .
```

```bash
ai-for-research claude                 # → ~/.claude/skills
ai-for-research codex                  # → ~/.agents/skills
ai-for-research gemini                 # → ~/.gemini/skills
```

`ai-for-research <agent> [project-dir]` — pass a **project directory** to install the skills
into that project's agent subdir (e.g. `ai-for-research claude ~/my-project` →
`~/my-project/.claude/skills`); with no argument the skills go to the per-agent global default
shown above. (`$SKILLS_DIR` sets an exact directory; `$CLONE_DIR` and `$REPO_URL` are honored
too.) When installed editable (`-e`) from a clone, the command links that checkout, so a
`git pull` updates the installed skills through the symlinks; when installed any other way, it
clones the repo on first run.
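
The directory rules above can be sketched as a small function. This is an illustration of the documented behavior, not the CLI's actual code: the name `resolve_skills_dir` is hypothetical, and the precedence of `SKILLS_DIR` over a project directory is an assumption based on the phrase "sets an exact directory".

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Per-agent subdirectory, taken from the defaults listed above.
AGENT_SUBDIR = {
    "claude": ".claude/skills",
    "codex": ".agents/skills",
    "gemini": ".gemini/skills",
}


def resolve_skills_dir(
    agent: str,
    project_dir: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Pick the directory the skills would be installed into."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    if agent not in AGENT_SUBDIR:
        raise ValueError(f"unknown agent: {agent!r}")

    # ASSUMPTION: an explicit SKILLS_DIR overrides every other rule.
    override = env.get("SKILLS_DIR")
    if override:
        return Path(override).expanduser()

    # A project directory gets the agent subdir; otherwise use the home dir.
    base = Path(project_dir).expanduser() if project_dir else home
    return base / AGENT_SUBDIR[agent]


# Example with an explicit, made-up home directory so the output is stable.
demo_home = Path("/home/alice")
print(resolve_skills_dir("codex", env={}, home=demo_home).as_posix())
# /home/alice/.agents/skills
print(resolve_skills_dir("claude", "/work/my-project", env={}, home=demo_home).as_posix())
# /work/my-project/.claude/skills
```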

> **Codex** and **Gemini** also recognize the shared `~/.agents/skills` alias.

**Prefer not to `pip install`? Copy the skills in manually.** After cloning, each skill is just
a folder under `skills/` — copy (or symlink) them into your agent's skills directory:

```bash
git clone https://github.com/UnaryLab/ai-for-research.git
cd ai-for-research

mkdir -p ~/.claude/skills            && cp -r skills/* ~/.claude/skills/             # Claude Code
mkdir -p ~/.agents/skills            && cp -r skills/* ~/.agents/skills/             # OpenAI Codex
mkdir -p ~/.gemini/skills            && cp -r skills/* ~/.gemini/skills/             # Gemini CLI
```

Use `cp -r` to copy, or symlink (`ln -s "$PWD/skills/<name>" <dir>/<name>`) so a `git pull`
keeps them updated. Installation path per agent:

| Agent | Skills directory |
|-------|------------------|
| **Claude Code** | `~/.claude/skills/` (global) — or `<project>/.claude/skills/` |
| **OpenAI Codex** | `~/.agents/skills/` (global) — or `<project>/.agents/skills/` |
| **Gemini CLI** | `~/.gemini/skills/` (global) — or `<project>/.gemini/skills/` |
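
For a scripted version of the manual symlink route, the sketch below links every skill folder into a target directory and is safe to re-run. It is an assumption-laden illustration rather than the CLI's implementation: `link_skills` is a hypothetical helper, it treats any folder containing a `SKILL.md` as a skill, and it deliberately leaves existing non-symlink entries untouched.

```python
import tempfile
from pathlib import Path
from typing import List


def link_skills(skills_root: Path, target_dir: Path) -> List[str]:
    """Symlink each skill folder under skills_root into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    linked: List[str] = []
    for skill in sorted(skills_root.iterdir()):
        if not (skill / "SKILL.md").is_file():
            continue  # not a skill folder
        link = target_dir / skill.name
        if link.is_symlink():
            link.unlink()  # replace an old link so re-runs are safe
        elif link.exists():
            continue  # a real file or directory: leave the user's copy alone
        link.symlink_to(skill.resolve(), target_is_directory=True)
        linked.append(skill.name)
    return linked


# Demo in a throwaway directory with two made-up skill folders.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "skills"
    for name in ("paper-read", "code-clean"):
        (root / name).mkdir(parents=True)
        (root / name / "SKILL.md").write_text("placeholder\n")
    target = Path(tmp) / "agent" / "skills"
    print(link_skills(root, target))  # ['code-clean', 'paper-read']
    print(link_skills(root, target))  # same result on a second run
```

Because the links point at the checkout, a later `git pull` in the clone changes what the agent sees without re-running anything.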

**No runtime?** Each skill is just a folder with a `SKILL.md` — paste the relevant one into
any LLM as context (or prepend it to your prompt). It's plain Markdown instructions.

### 2. Invoke a skill

- **Automatically** — just ask in plain language. The agent matches your request against
  each skill's `description` and `When to use` section and runs the best fit
  (e.g. *"help me read this paper"* → `paper-read`).
- **Explicitly** — name the skill: in Claude Code as a slash command (`/literature-survey`,
  `/paper-polish`, …); in other tools however that runtime invokes skills — or simply say
  *"use the literature-survey skill."*

### 3. What to expect

- Some skills are **gated on your input**: `experiment-scaffold` won't write code until
  you approve its plan; `paper-read` goes one pass at a time and asks before going
  deeper; `literature-survey` writes each phase to disk before the next.
- Skills **hand off** to each other and will suggest the next one where it makes sense.
- A few have helper scripts (e.g. `literature-survey/scripts/` for paper search and the
  citation database); the rest are pure instructions.

## Conventions shared across all skills

- **Persona & ground rules.** Every skill opens with a tailored expert persona plus the
  same three rules: *don't be stupid, don't mansplain, don't ask questions that can be
  googled* (look it up; only ask for what's in the user's head).
- **Venue-aware.** Guidance, checklists, and examples adapt to the three venue families
  above.
- **No fabrication.** Citations link to real papers; reproduced numbers are reported
  as-is; results trace to the command that produced them.

## Repository layout

```
ai-for-research/
├── README.md                # this file
├── LICENSE                  # MIT
├── pyproject.toml           # Python packaging — `pip install -e .` adds the ai-for-research CLI
├── ai_for_research/         # the ai-for-research CLI package (cli.py)
├── .github/workflows/       # CI — publish.yml (build + publish to PyPI on release)
└── skills/                  # the Agent Skills — one folder per skill, each with a SKILL.md
    ├── literature-survey/   # also has references/ and scripts/
    ├── paper-read/
    ├── chalk-talk/          # cross-cutting: concept teaching diagrams
    ├── idea-brainstorm/
    ├── idea-evaluate/
    ├── code-implement/
    ├── experiment-scaffold/
    ├── code-clean/
    ├── result-generate/
    ├── paper-draft/         # also has references/
    ├── paper-polish/        # also has references/
    └── artifact-create/
```

Each skill folder contains a `SKILL.md`; some also carry `references/` (deeper guidance
read on demand) and/or `scripts/`.

## Acknowledgments

Several skills adapt **procedure-as-prompt** templates from Todd Austin's
[`toddmaustin/research-prompts`](https://github.com/toddmaustin/research-prompts) (Apache-2.0):

- **chalk-talk** — adapted from his `chalk-talk` prompt.
- **paper-read** — the optional **Heilmeier Catechism** lens (from `heilmeier-extractor`).
- **idea-brainstorm** — the **fail-fast** "cheapest decisive test" layer (from `fail-fast`).
- **paper-polish** — the `deadline-triage`, `fact-faithfulness`, `page-trim`, and `presentation`
  modes (from `brutal-review`, `hallucination-detector`, `orphan-finder`, and `tough-crowd`).
- **paper-draft** — the **VOICE CARD** voice-capture/match stage (from `writing-voice`).

## License

[MIT](LICENSE) © 2026 UnaryLab.
