Metadata-Version: 2.4
Name: my-dev-team
Version: 0.12.4
Summary: An autonomous, LangGraph-powered AI development agency.
Author-email: Alexander Bobrovsky <bobrovsky@seznam.cz>
License: Apache-2.0
Project-URL: Homepage, https://github.com/mydevteam-ai/my-dev-team
Project-URL: Bug Tracker, https://github.com/mydevteam-ai/my-dev-team/issues
Keywords: ai,agents,langgraph,llm,cli,tdd,code-generation,development,flask
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.14
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langgraph>=1.0.9
Requires-Dist: langgraph-checkpoint-sqlite>=3.0.3
Requires-Dist: langchain>=1.2.10
Requires-Dist: litellm>=1.82.1
Requires-Dist: aiosqlite>=0.22.1
Requires-Dist: pydantic>=2.12.5
Requires-Dist: python-dotenv>=1.2.1
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: docker>=7.1.0
Requires-Dist: rich>=14.3.3
Requires-Dist: rank-bm25>=0.2.2
Provides-Extra: ui
Requires-Dist: flask>=3.0.0; extra == "ui"
Provides-Extra: rag
Requires-Dist: mcp>=1.0.0; extra == "rag"
Requires-Dist: vectorize-me>=0.1.0; extra == "rag"
Provides-Extra: anthropic
Requires-Dist: langchain-anthropic>=0.3.0; extra == "anthropic"
Provides-Extra: google
Requires-Dist: langchain-google-genai>=2.0.0; extra == "google"
Provides-Extra: groq
Requires-Dist: langchain-groq>=0.3.0; extra == "groq"
Provides-Extra: mistral
Requires-Dist: langchain-mistralai>=0.2.0; extra == "mistral"
Provides-Extra: ollama
Requires-Dist: langchain-ollama>=0.3.0; extra == "ollama"
Provides-Extra: openai
Requires-Dist: langchain-openai>=0.3.0; extra == "openai"
Dynamic: license-file

# My Dev Team 🚀

[![PyPI version](https://badge.fury.io/py/my-dev-team.svg)](https://badge.fury.io/py/my-dev-team)
[![Python 3.14+](https://img.shields.io/badge/python-3.14+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

An autonomous, LangGraph-powered AI development agency. **My Dev Team** takes raw project requirements and processes them through a multi-agent workflow (Product Manager, System Architect, Developers and QA) to incrementally build, test and deliver production-ready code.

**Unlike third-party SaaS platforms, My Dev Team is a local-first orchestrator.** Your workspace, SQLite state database and review trails live 100% on your machine. You can run the entire crew locally for free using Ollama for zero data egress, or connect to cloud APIs (OpenAI, Groq) knowing your proprietary codebase is never stored on an external platform's servers.

## Core Features

* **Multi-Agent Architecture:** Specialized AI agents handle distinct phases of the software development lifecycle.
* **Local-First & Privacy-Focused:** You own your data. The orchestrator, memory checkpointer, and file system execute strictly on your local hardware. Your code and requirements never sit on a third-party dashboard.
* **Semantic Model Routing:** Automatically routes tasks to the most cost-effective or capable LLMs based on the agent's requested capabilities.
* **Strict Test-Driven Development (TDD):** Testing is never an afterthought. Tasks are generated with embedded testing criteria, and the Developer writes unit tests alongside implementation code for immediate QA validation.
* **State Recovery & Resiliency:** Powered by asynchronous SQLite checkpointing. If an API rate limit is hit or a workflow is interrupted, you can resume the exact thread without losing a single token of progress.
* **Telemetry & Cost Tracking:** Automatically tallies prompt and completion tokens across the entire workflow. Calculates exact USD costs dynamically using LiteLLM's live pricing registry, printing a detailed receipt at the end of every run.
* **Incremental Development:** The System Architect breaks down requirements into a manageable backlog of tasks with explicit dependency edges.
* **Self-Healing Code:** The Developer, Reviewer and QA Engineer agents continuously loop until unit tests pass and code meets specifications.
* **Developer Fan-out:** Optionally run two developers with different LLMs on the same task simultaneously. A Code Judge evaluates both implementations and selects the best one before code review. All subsequent revisions are handled by the winning developer.
* **Structured Outputs:** Powered by Pydantic and LangChain, ensuring zero "Markdown spillage" and robust state management.
* **Tool-Calling Agents:** All agents use LLM-native tool calling to submit their work, enabling free-form reasoning and thinking before structured output.
* **Extensible:** Easily add custom tools like `HumanInTheLoop` or `ConsoleLogger`.
* **Local Git Versioning:** Every line of AI-generated code is automatically version-controlled.
* **Cost & Token Optimization Analyzer:** Built-in telemetry tracks API costs down to the fraction of a cent and generates a diagnostic report at the end of every run, actively warning you if agents are stuck in loops or suffering from context bloat.
* **SKILLs System:** Uses SKILLs - modular, reusable agent instructions and domain knowledge files. SKILLs can be attached to agents or workflows to extend capabilities, enforce coding standards, or inject project-specific expertise.
* **RAG Knowledge Base:** Agents can retrieve context from an external knowledge base (documents, Jira tickets, Confluence pages, etc.) via any MCP-compatible vector store. A bundled Docker image starts the entire Qdrant + MCP stack in one command. Multi-source routing lets you combine Qdrant, Jira, Confluence and custom sources simultaneously.
* **Workspace Hydration:** Seed the workspace from an existing local directory or ZIP archive before agents run. Agents can extend, refactor or build on top of real code instead of starting from scratch.
* **Lazy Workspace Loading:** The Developer agent receives only a file listing instead of the full workspace dump. It reads files on demand via `ReadFile`, `GlobFiles` and `GrepFiles` tools, keeping prompts small and avoiding context window overflow on large projects.
* **BM25 Context Retrieval:** Task-scoped agents (QA Engineer, Equivalence Checker, Final QA, Migrator) receive only the most relevant workspace files in full via BM25 keyword retrieval, with the rest listed as paths for on-demand reading.
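The BM25 file-selection idea above can be sketched in a few lines. This is a dependency-free illustration of the ranking step (the framework itself bundles the `rank-bm25` package); the helper name and workspace contents are hypothetical:

```python
import math
from collections import Counter

def bm25_rank(files: dict[str, str], query: str, k: int = 2,
              k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank workspace files against a task description with BM25 (Okapi).

    Hypothetical helper - the real retriever uses the bundled rank-bm25 package.
    """
    docs = {path: text.lower().split() for path, text in files.items()}
    n = len(docs)
    avgdl = sum(len(toks) for toks in docs.values()) / n
    df = Counter()                      # document frequency per term
    for toks in docs.values():
        df.update(set(toks))

    def score(toks: list[str]) -> float:
        tf = Counter(toks)
        total = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
            total += idf * norm
        return total

    return sorted(docs, key=lambda p: score(docs[p]), reverse=True)[:k]

# Hypothetical workspace snapshot
workspace = {
    "app/models.py": "class User: email password hash sqlalchemy column",
    "app/routes.py": "flask route login logout session user request",
    "tests/test_auth.py": "def test_login asserts login succeeds for user password",
}
print(bm25_rank(workspace, "fix the login route session handling"))
# -> ['app/routes.py', 'tests/test_auth.py']
```

Only the top-`k` files would be injected in full; everything else stays a bare path the agent can open on demand.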

### AI Agents

1) **Product Manager:** Analyzes requirements, asks clarifying questions and writes detailed Technical Specifications.
2) **System Architect:** Breaks specifications down into a cohesive backlog of developer tasks.
3) **Senior Developer:** Incrementally writes code and unit tests for the current task.
4) **Code Reviewer:** Analyzes the generated code for security, style and logic issues.
5) **QA Engineer:** Evaluates code against task requirements using either LLM-based mental simulation or execution via a secure Docker sandbox.
6) **Final QA Engineer:** Performs a full-repository integration test once all tasks are complete.
7) **Reporter:** Generates a comprehensive final Markdown report for stakeholders.

## Getting Started

### Prerequisites

* **Python 3.14+**
* **API Keys** set in your environment (e.g., `OPENAI_API_KEY`, `GROQ_API_KEY`), OR a local instance of **Ollama** running for free local models.
* **LLM provider package** for your chosen backend - install only what you need:
  ```sh
  pip install "my-dev-team[ollama]"      # Ollama
  pip install "my-dev-team[groq]"        # Groq
  pip install "my-dev-team[anthropic]"   # Anthropic Claude
  pip install "my-dev-team[openai]"      # OpenAI / DeepSeek / xAI Grok
  pip install "my-dev-team[google]"      # Google Gemini
  pip install "my-dev-team[mistral]"     # Mistral
  ```

  Combine multiple extras in one command:
  ```sh
  pip install "my-dev-team[groq,ollama,openai]"
  ```

**Optional Dependencies:**

* **Docker Engine** required only if you intend to use the Sandboxed QA code execution features.
* **Flask** required only to launch the web dashboard (`pip install my-dev-team[ui]`).
* **Node.js 18+** required only if you installed from source (`pip install -e .`) and need to build the web dashboard frontend.
* **Git** required only if you want to use the `GitCommitter` extension for automatic local version control of the generated workspace.
* **RAG extras** required only if agents are configured with `rag: true` - install with `pip install "my-dev-team[rag]"`. This installs `mcp` (for agent context retrieval) and `vectorize-me` (the `mcp-ingest` CLI for document ingestion). See the [RAG setup guide](docs/rag.md) for full instructions.

### Installation

**Installing into a virtual environment is highly recommended.**

You can install the package directly via pip:

```sh
pip install my-dev-team
```

For local development, clone the repository and run `pip install -e .`.

## Usage

### CLI

```sh
# Start a new project (Ollama by default)
devteam project.txt

# Switch provider and cap request rate
devteam project.txt --provider groq --rpm 30

# Pause for human approval of spec and task plan before development
devteam project.txt --ask-approval

# Run without Docker (LLM-based QA only)
devteam project.txt --no-docker

# Run two developers on each task and let a judge pick the best
devteam project.txt --fanout
devteam project.txt --workflow development-fanout

# Fan-out also works with other workflows
devteam project.txt --workflow migration --fanout
devteam project.txt --workflow migration-fanout

# Resume an interrupted run
devteam --resume web_scraper_cli_20260312_083500

# Inject reviewer feedback into a paused run
devteam --resume web_scraper_cli_20260312_083500 --feedback "Add input validation"

# Print checkpoint timeline for a thread
devteam --history web_scraper_cli_20260312_083500

# Seed the workspace from an existing project before agents run
devteam project.txt --seed /path/to/existing/code
devteam project.txt --seed export.zip
```

See [docs/cli.md](docs/cli.md) for all arguments and options.

### Web Dashboard

```sh
pip install "my-dev-team[ui]"
devteam-ui
```

The dashboard provides a live execution feed, a workspace browser, human-in-the-loop approvals and resume-with-feedback.

## Architecture

### Multi-Agent Workflow

**My Dev Team** operates as a cyclic, self-healing state machine. Instead of a simple linear pipeline, agents pass context back and forth, iterating on code until it meets strict quality standards.

```mermaid
stateDiagram-v2
    pm : Product Manager
    human : Human in the Loop
    architect : System Architect
    officer : Project Officer
    dev : Senior Developer
    reviewer : Code Reviewer
    qa : QA Engineer
    final_qa : Final QA Engineer
    reporter : Reporter
    [*] --> pm
    pm --> human
    human --> pm
    pm --> architect
    architect --> officer
    officer --> dev
    dev --> reviewer
    reviewer --> dev
    reviewer --> qa
    qa --> dev
    qa --> officer
    officer --> final_qa
    final_qa --> reporter
    reporter --> [*]
```

**How the routing works:**

* **Requirements Gathering:** The **Product Manager** loops with a **Human** to refine requirements before development begins.

* **Task Orchestration:** The **System Architect** designs the system, and the **Project Officer** orchestrates the task backlog, routing individual tickets to the **Senior Developer**.

* **The Refinement Loop:** The **Senior Developer**, **Code Reviewer** and **QA Engineer** agents operate in a strict self-healing loop. Code is repeatedly analyzed and tested; if bugs or style issues are found, the state routes directly back to the **Senior Developer** for revisions.

* **Final Delivery:** Once the **Project Officer** confirms all tasks are complete, the **Final QA Engineer** runs full-repository integration tests before the **Reporter** generates the final documentation.
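The routing above boils down to small conditional-edge functions of the kind LangGraph workflows use. A schematic sketch - the state fields and return values here are illustrative, not the framework's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewState:
    """Illustrative slice of the workflow state (field names are hypothetical)."""
    review_issues: list[str] = field(default_factory=list)
    qa_passed: bool = False
    backlog_remaining: int = 0

def route_after_review(state: ReviewState) -> str:
    """Reviewer found problems -> back to the developer; clean -> on to QA."""
    return "dev" if state.review_issues else "qa"

def route_after_qa(state: ReviewState) -> str:
    """Failing tests -> back to the developer; passing -> report to the officer."""
    return "dev" if not state.qa_passed else "officer"

def route_at_officer(state: ReviewState) -> str:
    """Tickets left -> next developer task; backlog empty -> integration QA."""
    return "dev" if state.backlog_remaining else "final_qa"
```

Each function maps one node's outgoing edges in the diagram; the graph engine re-enters the developer node until both checks come back clean.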

### Semantic Model Routing (LLM Factory)

**My Dev Team** doesn't just use one model for everything. It uses a **capability scoring** architecture via `LLMFactory`.

Instead of hardcoding a specific model (like `gpt-5.3-codex`), each agent declares the capabilities it needs and the factory scores every model configured for the active provider, selecting the best match.

**Built-in capabilities:**

* `reasoning` - deep thinking, complex analysis (Product Manager, Code Judge)
* `planning` - task decomposition, architecture (System Architect)
* `code-generation` - writing implementation code (Senior Developer)
* `code-analysis` - reading, reviewing and testing code (Reviewer, QA Engineers)
* `fast-utility` - lightweight summarization tasks (Reporter)
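Conceptually, the factory's selection step reduces to scoring each configured model against the agent's requested capabilities. A minimal sketch with made-up model names and scores (the real scores live in `config/llms.yaml`):

```python
# Hypothetical capability table in the spirit of config/llms.yaml
MODELS = {
    "deep-reasoner": {"reasoning": 9, "planning": 8, "code-generation": 6},
    "code-expert":   {"code-generation": 9, "code-analysis": 8},
    "tiny-utility":  {"fast-utility": 9},
}

def pick_model(required: list[str]) -> str:
    """Select the model with the highest summed score for the requested capabilities."""
    return max(MODELS, key=lambda name: sum(MODELS[name].get(cap, 0) for cap in required))

print(pick_model(["code-generation", "code-analysis"]))  # -> code-expert
print(pick_model(["reasoning"]))                         # -> deep-reasoner
```

Because agents declare capabilities rather than model names, swapping providers or adding a new model never touches agent code.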

### Centralized Configuration

Code and configuration are strictly separated to make the framework maintainable and extensible.

* **Model Routing (`config/llms.yaml`):** All provider definitions (Groq, OpenAI, Ollama) and model capability scores are centralized in a single YAML file. To add a new model, declare its capabilities - no agent code needs to change.
* **Agent Prompts (`config/agents/**`):** Every agent's persona, system instructions and constraints are stored as clean Markdown files with YAML frontmatter. No massive, hardcoded prompt strings cluttering the Python logic!
* **Sandbox Environments (`config/sandbox.yaml`):** Docker base images and test execution commands for various runtimes (Python, Node.js) are completely decoupled. You can easily add support for entirely new programming languages by simply defining the image and test command in YAML, without touching the core Python engine.
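To make the capability-score idea concrete, a `config/llms.yaml` entry might look roughly like this (a hypothetical sketch - consult the shipped file for the actual schema):

```yaml
# Hypothetical sketch - the real config/llms.yaml schema may differ
providers:
  groq:
    models:
      - name: llama-3.3-70b-versatile
        capabilities:
          reasoning: 8
          planning: 7
          code-generation: 7
      - name: llama-3.1-8b-instant
        capabilities:
          fast-utility: 9
```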

### Sandboxed QA Execution

When sandboxing is enabled, the QA Engineer agent does not rely on LLM "guesswork" or mental simulation to test code. It executes the generated code for real.

* **Zero Hallucinations:** The QA node mounts the active workspace into a temporary directory and runs the actual test suite (e.g., `pytest`, `npm test`). It reads the exact `stdout`/`stderr` tracebacks to accurately report bugs back to the Developer.
* **Ephemeral Isolation:** Code is executed securely using the Docker SDK. Containers are strictly isolated, resource-limited (CPU/RAM), and immediately destroyed after the test run, ensuring your host machine is never at risk.
* **Universal Runtime Auto-Detection:** The sandbox dynamically inspects the workspace or takes explicit direction from the System Architect to pull the correct Docker image (Python, Node.js, etc.) on the fly.
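With the Docker SDK for Python, an ephemeral, resource-limited test run boils down to a handful of `containers.run` arguments. The helper below is a hypothetical sketch - the mount path and resource limits are illustrative defaults, not the framework's actual values:

```python
def sandbox_run_kwargs(workspace: str, image: str, command: str) -> dict:
    """Build arguments for an ephemeral, resource-limited test container.

    Hypothetical helper; limits shown are illustrative, not my-dev-team's defaults.
    """
    return {
        "image": image,
        "command": command,
        "volumes": {workspace: {"bind": "/workspace", "mode": "rw"}},
        "working_dir": "/workspace",
        "network_disabled": True,     # no network egress from the sandbox
        "mem_limit": "512m",          # cap RAM
        "nano_cpus": 1_000_000_000,   # cap at one CPU
        "remove": True,               # destroy the container after the run
        "stderr": True,               # capture tracebacks for the Developer
    }

# With the Docker SDK installed, a QA run would look roughly like:
#   import docker
#   client = docker.from_env()
#   logs = client.containers.run(**sandbox_run_kwargs("/tmp/ws", "python:3.14-slim", "pytest -q"))
```

`remove=True` plus the CPU/RAM caps gives the "ephemeral isolation" described above: the container exists only for the duration of the test run.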

### Telemetry & Optimization

Running multi-agent systems can get expensive quickly if models get stuck in loops or context windows grow out of control. **My Dev Team** includes a built-in `TelemetryTracker` that monitors every single LLM call.

At the end of every workflow, the framework prints a granular receipt and an optimization diagnostic report:

```text
========================================
📊 TELEMETRY & COST REPORT
========================================
Total API Requests:  12
Prompt Tokens:       45,200
Completion Tokens:   3,100
Total Tokens:        48,300
----------------------------------------
Estimated Cost:      $0.0145
========================================

========================================
🔍 TOKEN OPTIMIZATION DIAGNOSTICS
========================================
⚠️ Thrashing Detected: `qa` was called 8 times. The agent might be stuck in a failure loop.
📈 Context Bloat: `reviewer` input grew by 3.2x (Started: 1200, Ended: 3840).
========================================
```

This allows you to easily identify architectural token leaks, pinpoint which specific agent is struggling, and adjust your `llms.yaml` or prompt templates accordingly!
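The accounting behind the receipt reduces to a running per-call tally. A minimal sketch - the model name and prices here are made up for illustration, whereas the real tracker resolves per-model prices from LiteLLM's live registry:

```python
class CostTally:
    """Minimal sketch of token/cost accounting (prices are illustrative only)."""

    # ($ per prompt token, $ per completion token) - hypothetical example prices
    PRICES = {"example-model": (0.50 / 1_000_000, 1.50 / 1_000_000)}

    def __init__(self) -> None:
        self.requests = 0
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.cost = 0.0

    def record(self, model: str, prompt: int, completion: int) -> None:
        """Tally one LLM call's usage and its estimated dollar cost."""
        in_price, out_price = self.PRICES[model]
        self.requests += 1
        self.prompt_tokens += prompt
        self.completion_tokens += completion
        self.cost += prompt * in_price + completion * out_price

tally = CostTally()
tally.record("example-model", 45_200, 3_100)
print(f"Requests: {tally.requests}  Estimated Cost: ${tally.cost:.4f}")
```

The per-agent diagnostics work the same way: tracking input size per node across calls is what lets the report flag thrashing and context bloat.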

## Usage (Python API)

If you want to integrate the crew into your own application, customize the LLM Factory's routing table, or override specific agent behaviors, use the clean Python API:

```python
import asyncio
import aiosqlite
from pathlib import Path
from dotenv import load_dotenv

from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver

from devteam.crew import CrewFactory
from devteam.extensions import ConsoleLogger, HumanInTheLoop
from devteam.utils import LLMFactory

load_dotenv()

def my_extensions() -> list:
    return [
        ConsoleLogger(),
        HumanInTheLoop()
    ]

async def main():
    requirements = "Build a simple Python calculator CLI with basic arithmetic."
    thread_id = 'calc_run_01'
    project_folder = Path('./workspaces/calculator_app')
    project_folder.mkdir(parents=True, exist_ok=True)
    db_path = project_folder / 'state.db'
    llm_factory = LLMFactory(provider='groq')
    crew_factory = CrewFactory(llm_factory=llm_factory)
    try:
        async with aiosqlite.connect(db_path) as conn:
            checkpointer = AsyncSqliteSaver(conn)
            crew = crew_factory.create(project_folder, checkpointer=checkpointer, extensions=my_extensions())
            print("🚀 Starting the AI Dev Team...")
            final_state = await crew.execute(thread_id=thread_id, requirements=requirements)
        if final_state.abort_requested:
            print("❌ Workflow aborted by user or validation failure.")
        elif final_state.success:
            print("🎉 Project completed successfully!")
            if final_state.final_report:
                print(final_state.final_report)
        else:
            print("🚨 Release failed: Integration bugs found!")
            for bug in final_state.integration_bugs:
                print(f" - {bug}")
    except KeyboardInterrupt:
        print("\n\n🛑 Workflow interrupted by user (Ctrl+C).")
        print(f"💡 You can resume this exact state later by running:")
        print(f"   devteam --resume {thread_id}")
    finally:
        # NOTE: assumes the factory exposes the built-in TelemetryTracker;
        # adapt this accessor to however your installed version exposes it.
        llm_factory.telemetry.print_receipt()

if __name__ == "__main__":
    asyncio.run(main())
```

## Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

## License

Distributed under the Apache 2.0 license. See `LICENSE` for more information.
