Product Requirements Document — Meta Agent

⸻

1 · Overview

Meta Agent is a “developer of developers”: a Python application that, given a natural‑language specification, automatically produces a fully‑functional OpenAI Agents SDK agent (code + tests + guardrails). It eliminates hand‑coding boilerplate, accelerates prototyping, and enforces best practices from day one.
	•	Problem solved: Developers spend significant time translating English requirements into agent scaffolds, wiring tools, and writing safety checks.
	•	Target users: AI engineers, solutions architects, and power users who need bespoke agents quickly but can’t invest days in setup.
	•	Value proposition: Generate production‑ready agents in minutes, with built‑in validation, sandboxing, and documentation.

⸻

2 · Core Features

Feature	What it does	Why it matters	How it works (high‑level)
Natural‑language spec ingestion	Accepts structured or free‑form specs describing goals, I/O contracts, tools, and constraints.	Lowers the barrier to entry; no DSL to learn.	Uses o3 to parse spec into an internal schema and identify work chunks.
Agent planner (Meta Agent)	Decomposes the spec, orchestrates sub‑agents, assembles final artifact.	Central logic hub; enforces consistency & versioning.	Runs on o3 with minimal tools (web search, template library, python_interpreter for compilation).
Tool Designer sub‑agent	Generates runnable Python code for each required tool and its unit tests.	Eliminates stubs; ensures tools are functional from day one.	o4‑mini‑high + sandboxed interpreter; may call web search for API discovery.
Guardrail Designer sub‑agent	Creates validation logic (Pydantic, regex, policy checks) + guardrail tests.	Embeds safety & compliance early; prevents bad outputs.	Uses gpt‑4o; attaches to Agents‑SDK guardrail hooks.
Automated evaluation harness	Compiles generated code, executes unit tests, and surfaces results.	Guarantees “it actually runs.”	Pytest inside sandbox container; results returned in summary.
Artifact bundle & dependency lock	Outputs agent.py, tests/, requirements.txt, and optional diagram.	One‑command install & run; reproducible builds.	Meta Agent writes files to target directory or pushes to Git repo.
Cost & trace telemetry	Logs token usage, latency, and spend per generation.	Keeps surprises off the cloud bill; aids optimization.	Leverages Agents‑SDK tracing and OpenAI usage APIs.
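
The telemetry row can be illustrated with a minimal sketch of per-generation accounting. Everything here is an assumption made for illustration: the class names, the model labels, and above all the prices, which are placeholders and not real OpenAI pricing. A real implementation would read token counts from the SDK's traces and usage data rather than take them as arguments.

```python
from dataclasses import dataclass, field

# Hypothetical USD prices per million tokens. Placeholders only, not real pricing.
PRICE_PER_MTOK = {
    "planner-model": {"input": 10.0, "output": 40.0},
    "tool-model": {"input": 1.0, "output": 4.0},
}


@dataclass
class CallRecord:
    """One model call made during a generation (hypothetical shape)."""
    model: str
    input_tokens: int
    output_tokens: int
    latency_s: float

    def cost_usd(self) -> float:
        price = PRICE_PER_MTOK[self.model]
        return (self.input_tokens * price["input"]
                + self.output_tokens * price["output"]) / 1_000_000


@dataclass
class GenerationTelemetry:
    """Accumulates call records and reports totals for one generation."""
    calls: list[CallRecord] = field(default_factory=list)

    def record(self, call: CallRecord) -> None:
        self.calls.append(call)

    def summary(self) -> dict:
        return {
            "calls": len(self.calls),
            "total_tokens": sum(c.input_tokens + c.output_tokens for c in self.calls),
            "total_latency_s": sum(c.latency_s for c in self.calls),
            "total_cost_usd": round(sum(c.cost_usd() for c in self.calls), 6),
        }
```

Keeping the price table separate from the records means a price change only touches one dictionary, and the same records can be re-costed later.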



⸻

3 · User Experience

Personas
	1.	Rapid Prototyper (“Alice”) — AI startup engineer needing a demo‑ready agent by tomorrow.
	2.	Enterprise Solutions Architect (“Bob”) — integrates bespoke agents into client workflows; cares about compliance and audit trails.
	3.	Curious Hobbyist (“Charlie”) — explores AI agents but lacks deep coding expertise.

Key User Flows

Flow	Steps	Success signal
Create agent from scratch	(1) Run CLI → paste spec or select template. (2) Meta Agent generates files. (3) User runs python my_agent.py --demo.	Agent runs without error & passes autogenerated tests.
Iterate on existing agent	(1) Provide updated spec. (2) Meta Agent diffs the new spec against the previous one and regenerates only the affected parts.	Modified agent works; unchanged tests still pass.
Inspect guardrails	(1) Run --audit. (2) Tool lists guardrail coverage and edge cases.	Developer signs off or tweaks validators.
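
The "iterate on existing agent" flow depends on knowing which parts of a spec changed. A minimal sketch of that comparison follows, assuming the spec has already been parsed into a dictionary shaped like the YAML template in the appendix. The function name `diff_spec` and the keys of its result are hypothetical.

```python
def diff_spec(old: dict, new: dict) -> dict:
    """Report changed top-level sections and the tools that need regeneration."""
    changed_sections = sorted(
        key for key in old.keys() | new.keys() if old.get(key) != new.get(key)
    )
    # Index tools by name so a reordered list does not count as a change.
    old_tools = {tool["name"]: tool for tool in old.get("tools", [])}
    new_tools = {tool["name"]: tool for tool in new.get("tools", [])}
    return {
        "changed_sections": changed_sections,
        # New tools and tools whose definition differs both need regeneration.
        "tools_to_regenerate": sorted(
            name for name, tool in new_tools.items() if old_tools.get(name) != tool
        ),
        "tools_to_remove": sorted(old_tools.keys() - new_tools.keys()),
    }
```

Tools that appear in neither result list are unchanged, so their generated code and tests can be kept as they are.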

UI/UX Considerations
	•	CLI first, VS Code extension later (syntax‑highlighted diff view).
	•	Colored status banner (✔ build passed, ✖ tests failed, 🛡️ guardrail triggered).
	•	Optional Markdown/mermaid diagram auto‑rendered for README.
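
A CLI-first design with a colored status banner could start from something like the sketch below. Only the `--audit` flag comes from this document; the command name `meta-agent`, the positional `spec` argument, and the `--no-color` flag are assumptions for illustration.

```python
import argparse

# ANSI escape codes used to color the status banner.
GREEN, RED, YELLOW, RESET = "\033[32m", "\033[31m", "\033[33m", "\033[0m"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="meta-agent")  # hypothetical command name
    parser.add_argument("spec", nargs="?", help="path to a spec file (hypothetical)")
    parser.add_argument("--audit", action="store_true", help="list guardrail coverage")
    parser.add_argument("--no-color", action="store_true", help="plain banner (hypothetical)")
    return parser


def status_banner(build_ok: bool, tests_ok: bool, guardrail_triggered: bool,
                  color: bool = True) -> str:
    """Render the one-line build/test/guardrail status shown after a run."""
    def paint(code: str, text: str) -> str:
        return f"{code}{text}{RESET}" if color else text

    parts = [
        paint(GREEN, "✔ build passed") if build_ok else paint(RED, "✖ build failed"),
        paint(GREEN, "✔ tests passed") if tests_ok else paint(RED, "✖ tests failed"),
    ]
    if guardrail_triggered:
        parts.append(paint(YELLOW, "🛡️ guardrail triggered"))
    return "  ".join(parts)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(status_banner(True, True, False, color=not args.no_color))
```

A plain-text mode matters in CI logs, where raw escape codes tend to show up as noise.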

⸻

4 · Technical Architecture

Component	Responsibility	Tech / Integration
Meta Agent Orchestrator	Planning, sub‑agent delegation, artifact assembly.	openai-agents-python, model = o3
Tool Designer Agent	Code + unit tests for tools.	model = o4‑mini‑high, sandboxed python_interpreter
Guardrail Designer Agent	Validation logic & tests.	model = gpt‑4o, Agents‑SDK guardrails
Evaluation Harness	Compile, execute, report.	Pytest in Docker (no outbound net)
Template/Pattern Library	Store proven agent blueprints.	Local JSON/YAML files; search indexed by embeddings
Artifact Repository	Save generated bundles, metadata.	Git backend or S3 bucket
Telemetry & Tracing	Cost, latency, usage stats.	Agents‑SDK tracing hooks, OpenAI usage API
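
The orchestrator's plan, delegate, assemble loop can be sketched without any model calls. In this illustration the sub-agents are plain Python callables standing in for the model-backed designers; the `Task` type, the `plan` and `orchestrate` functions, and the output file layout are all hypothetical and do not use the Agents SDK.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    kind: str          # "tool" or "guardrail"
    name: str
    requirement: str


def plan(spec: dict) -> list[Task]:
    """Split a parsed spec into one task per tool and one per guardrail."""
    tasks = [Task("tool", t["name"], t["requirement"]) for t in spec.get("tools", [])]
    tasks += [
        Task("guardrail", f"guardrail_{i}", text)
        for i, text in enumerate(spec.get("guardrails", []))
    ]
    return tasks


def orchestrate(spec: dict, designers: dict[str, Callable[[Task], str]]) -> dict[str, str]:
    """Delegate each task to the designer for its kind; collect code by file path."""
    files = {}
    for task in plan(spec):
        files[f"{task.kind}s/{task.name}.py"] = designers[task.kind](task)
    return files
```

Passing the designers in as a dictionary keeps the loop testable with stubs and leaves room for the plugin sub-agents mentioned in the roadmap.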

Data Models
	•	SpecSchema — parsed user requirements (goal, I/O, tools[], guardrails[], constraints).
	•	ToolArtifact — {code:str, tests:str, metadata}.
	•	AgentBundle — {agent_code, tests, requirements, diagram, trace}.
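
One possible shape for these three models is sketched below with standard-library dataclasses so that it runs without extra dependencies; Pydantic models would add validation on top. Field types that the list above does not state are assumptions: the `ToolSpec` helper, the dictionary shape of `tests`, and the string-valued `constraints`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolSpec:
    """One entry of SpecSchema.tools (hypothetical helper type)."""
    name: str
    requirement: str


@dataclass
class SpecSchema:
    """Parsed user requirements."""
    goal: str
    io_contract: dict[str, str]
    tools: list[ToolSpec] = field(default_factory=list)
    guardrails: list[str] = field(default_factory=list)
    constraints: dict[str, str] = field(default_factory=dict)


@dataclass
class ToolArtifact:
    """Generated code and tests for a single tool."""
    code: str
    tests: str
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class AgentBundle:
    """Everything written to the target directory for one generated agent."""
    agent_code: str
    tests: dict[str, str]            # file name -> test source (assumed shape)
    requirements: list[str]
    diagram: Optional[str] = None    # Mermaid text, if a diagram was requested
    trace: dict = field(default_factory=dict)
```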

Infrastructure
	•	Dockerized sandbox with resource limits.
	•	Optional Kubernetes job runner for concurrent generations.
	•	Vector store (Chroma) for template retrieval & caching web search hits.
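
The resource-limited sandbox can be approximated by assembling a `docker run` command with networking disabled and a read-only filesystem. The sketch below only builds the argument list and does not start a container. The image name, the default limits, and the `/work` mount point are assumptions; the flags themselves are standard Docker CLI options.

```python
def sandbox_command(workdir: str,
                    image: str = "meta-agent-sandbox:latest",  # hypothetical image
                    memory: str = "512m",
                    cpus: str = "1.0",
                    pids: int = 128) -> list[str]:
    """Build the argv for running pytest on a generated bundle inside Docker."""
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no outbound network
        "--read-only",                          # root filesystem is read-only
        "--tmpfs", "/tmp",                      # scratch space for temporary files
        "--memory", memory,
        "--cpus", cpus,
        "--pids-limit", str(pids),
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        "--volume", f"{workdir}:/work:ro",      # generated code mounted read-only
        "--workdir", "/work",
        image,
        # The cache plugin is disabled because /work is not writable.
        "python", "-m", "pytest", "-q", "-p", "no:cacheprovider",
    ]
```

The list can be passed to `subprocess.run(cmd, capture_output=True, timeout=...)`, with the timeout acting as a wall-clock limit alongside the memory and CPU caps.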

⸻

5 · Development Roadmap (scope‑only)

Phase	Deliverables (no timelines)
0 · Foundations	CLI skeleton, SpecSchema, Meta Agent orchestrator stub, Docker sandbox.
1 · MVP	• Natural‑language ingestion → working agent.py • Single Tool Designer/Guardrail flow • Basic pytest harness • Requirements lockfile generation.
2 · Validation & Observability	• Guardrail test coverage report • Cost/latency telemetry dashboard • Compile‑time linting (ruff/pyright).
3 · Templates & UX polish	• Library of common agent archetypes • Mermaid diagram auto‑generation • Colored CLI feedback.
4 · Extensibility	• VS Code extension • Plugin system for custom sub‑agents • API endpoints for SaaS version.
5 · Enterprise Hardening	• RBAC, audit logs, SSO • Policy engine integration • Multi‑tenant artifact store.



⸻

6 · Logical Dependency Chain
	1.	SpecSchema & CLI skeleton (foundation).
	2.	Sandbox container — needed before executing any generated code.
	3.	Meta Agent orchestration (turns the spec into tasks).
	4.	Tool Designer + Guardrail Designer — unlock functional code generation.
	5.	Evaluation harness — proves MVP works end‑to‑end.
	6.	Artifact bundling & dependency lock — delivers usable output.
	7.	Telemetry & diagrams — polish after core loop is stable.
	8.	Extended UX layers (VS Code, SaaS API) — once stable backend exists.

The goal is to reach the earliest possible “hello‑world agent” that passes tests, then iterate outward.
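
The chain above can also be written down as data, which makes it easy to ask what must exist before a given milestone. This sketch assumes a strictly linear reading of the numbered list, with each step depending only on the one before it; the step identifiers are hypothetical. It uses `graphlib` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Each step maps to the steps it depends on (linear reading of the numbered list).
DEPENDS_ON = {
    "spec_schema_and_cli": [],
    "sandbox": ["spec_schema_and_cli"],
    "orchestration": ["sandbox"],
    "designers": ["orchestration"],
    "evaluation_harness": ["designers"],
    "bundling": ["evaluation_harness"],
    "telemetry_and_diagrams": ["bundling"],
    "extended_ux": ["telemetry_and_diagrams"],
}


def build_order(graph: dict[str, list[str]]) -> list[str]:
    """Return every step with its dependencies listed before it."""
    return list(TopologicalSorter(graph).static_order())


def prerequisites(graph: dict[str, list[str]], target: str) -> list[str]:
    """Return the target and everything it transitively needs, in build order."""
    needed, stack = set(), [target]
    while stack:
        step = stack.pop()
        if step not in needed:
            needed.add(step)
            stack.extend(graph[step])
    return [step for step in build_order(graph) if step in needed]
```

For example, `prerequisites(DEPENDS_ON, "evaluation_harness")` lists the five steps needed for a first agent that passes its tests. If later steps turn out to be independent of each other, only the dictionary changes.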

⸻

7 · Risks & Mitigations

Risk	Impact	Mitigation
LLM hallucination → non‑runnable code	Build fails	Automated compilation + retry loops; constrain generation via templates.
Sandbox escape / malicious code	Security breach	Docker seccomp/apparmor, read‑only FS, no outbound net unless allow‑listed.
OpenAI API cost spikes	Budget blow‑out	Telemetry guard; cost caps per generation; use cheaper models when possible.
Spec ambiguity	Rework, user frustration	Meta Agent auto‑asks clarifying questions; provide spec template wizard.
External API drift	Generated tools break over time	Health‑check tests; version pin APIs; regenerate tools on failure.
Scope creep delaying MVP	Missed delivery	Strict phase gating; prioritize “agent builds, runs, and passes tests” above all.
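
Two of the mitigations above, automated compilation with retries and a cost cap per generation, can be combined in one loop. The sketch below is an illustration under stated assumptions: `generate` is a hypothetical callable that takes the previous error message (or `None`) and returns the generated source plus the cost of that call in USD. The built-in `compile()` only checks syntax, so passing it does not mean the code behaves correctly.

```python
from typing import Callable, Optional


class BudgetExceeded(RuntimeError):
    """Raised when cumulative spend passes the per-generation cap."""


def syntax_error(source: str) -> Optional[str]:
    """Return None if the source compiles, otherwise a short error message."""
    try:
        compile(source, "<generated>", "exec")
        return None
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"


def generate_with_retries(generate: Callable[[Optional[str]], tuple[str, float]],
                          max_attempts: int = 3,
                          cost_cap_usd: float = 0.50) -> str:
    """Call generate(feedback) until the code compiles or a limit is reached."""
    spent, feedback = 0.0, None
    for _ in range(max_attempts):
        source, cost = generate(feedback)
        spent += cost
        if spent > cost_cap_usd:
            raise BudgetExceeded(f"spent ${spent:.2f}, cap is ${cost_cap_usd:.2f}")
        feedback = syntax_error(source)  # fed back into the next attempt
        if feedback is None:
            return source
    raise RuntimeError(f"no compilable code after {max_attempts} attempts: {feedback}")
```

Feeding the error message back into the next attempt gives the model something concrete to fix, and the cap bounds the worst case when it never converges.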



⸻

8 · Appendix
	•	Research links
		•	OpenAI Agents SDK docs (v0.0.7)
		•	Best‑practice guardrail frameworks (Guardian, Rebuff)
		•	Pytest sandboxing strategies
	•	Example Spec Template (YAML)

goal: "Summarize Slack threads into daily digest"
io_contract:
  input: "Channel history (JSON)"
  output: "Markdown summary"
tools:
  - name: slack_api
    requirement: "Fetch channel messages"
guardrails:
  - "No PII leakage"
constraints:
  max_latency: 30s
  max_cost: $0.50
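
A YAML loader would read the `constraints` values in the template above as plain strings (`30s`, `$0.50`), so they need normalizing before they can be enforced. A minimal sketch follows; the function names, the accepted units, and the output keys are assumptions for illustration.

```python
import re


def parse_latency_seconds(value: str) -> float:
    """Convert strings such as '30s', '500ms' or '2m' into seconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ms|s|m)\s*", value)
    if not match:
        raise ValueError(f"unrecognized latency: {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    return number * {"ms": 0.001, "s": 1.0, "m": 60.0}[unit]


def parse_cost_usd(value: str) -> float:
    """Convert strings such as '$0.50' into a dollar amount."""
    match = re.fullmatch(r"\s*\$(\d+(?:\.\d+)?)\s*", value)
    if not match:
        raise ValueError(f"unrecognized cost: {value!r}")
    return float(match.group(1))


def normalize_constraints(constraints: dict) -> dict:
    """Turn the spec's constraint strings into numeric limits."""
    return {
        "max_latency_s": parse_latency_seconds(str(constraints["max_latency"])),
        "max_cost_usd": parse_cost_usd(str(constraints["max_cost"])),
    }
```

Rejecting unrecognized values at parse time surfaces a mistyped constraint before any generation cost is incurred.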


	•	Mermaid Diagram Sample

graph TD
  A[User Spec] --> B[Meta Agent]
  B --> C[Tool Designer]
  B --> D[Guardrail Designer]
  C --> E[Code + Tests]
  D --> E
  E --> F[Evaluation Harness]
  F --> G[Agent Bundle]
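
Diagram auto-generation could be as simple as rendering a list of edges into Mermaid text. The sketch below reproduces the sample above; the function name `to_mermaid` and its input shape (a label per node id plus a list of edges) are assumptions.

```python
def to_mermaid(nodes: dict[str, str], edges: list[tuple[str, str]]) -> str:
    """Render labeled nodes and directed edges as a Mermaid 'graph TD' block."""
    lines = ["graph TD"]
    labeled = set()

    def ref(node_id: str) -> str:
        # Attach the label the first time a node appears, then use the bare id.
        if node_id in labeled:
            return node_id
        labeled.add(node_id)
        return f"{node_id}[{nodes[node_id]}]"

    for src, dst in edges:
        lines.append(f"  {ref(src)} --> {ref(dst)}")
    return "\n".join(lines)


if __name__ == "__main__":
    nodes = {
        "A": "User Spec", "B": "Meta Agent", "C": "Tool Designer",
        "D": "Guardrail Designer", "E": "Code + Tests",
        "F": "Evaluation Harness", "G": "Agent Bundle",
    }
    edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"),
             ("D", "E"), ("E", "F"), ("F", "G")]
    print(to_mermaid(nodes, edges))
```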