# Codex Tasks — Structural Analysis & Tooling

**Created:** 2026-04-03

**Purpose:** Three Codex agents build structural tools and analysis from the VIBECODE-THEORY paper series (papers 001-008 + allegorical directory). These complement the Gemini research swarm by providing machine-readable structure, cross-reference maps, and integration tooling.

**Protocol:** Each agent claims ONE task by writing their identifier into the `Claimed by` field, then works autonomously. When done, write output to the specified location and mark status as `DONE`.

---

## Task C1: Cross-Reference Graph

**Status:** DONE
**Claimed by:** Codex-GPT5
**Output:** `tools/cross-references/`

Parse all 8 papers and 8 allegory files. Extract every cross-reference between documents — explicit ("Paper 006's theological thread," "as established in Paper 007") and implicit (shared concepts, terms introduced in one paper and used in another).

**Deliverables:**

1. **`graph.json`** — Structured JSON graph:
   ```json
   {
     "nodes": [
       {"id": "007", "title": "The Ratchet", "concepts_introduced": ["biological ratchet", "infrastructure threshold", ...]}
     ],
     "edges": [
       {"source": "008", "target": "007", "type": "extends", "context": "extends ratchet mechanism with a direction — toward unification"},
       {"source": "008", "target": "003", "type": "addresses", "context": "responds to falsifiability concern"}
     ]
   }
   ```

2. **`graph.mermaid`** — Mermaid diagram showing paper relationships. Use directional edges labeled with relationship type (extends, refutes, addresses, introduces concept used by).

3. **`dangling_threads.md`** — List of concepts, questions, or claims that are raised in one paper but never resolved or revisited. These are candidates for Paper 009+. For each: which paper raised it, what the open question is, and which (if any) later papers partially address it.

4. **`concept_flow.md`** — For each major concept (dependency chain, ratchet, infrastructure threshold, cognitive preference shift, automation spiral, knowledge unification, etc.), trace its lifecycle: where introduced, where challenged, where revised, where it currently stands.

**How to extract:**

Read each paper. Look for:
- Explicit references: "Paper N," "as established in," "the series has," "prior papers"
- Section headers like "Relationship to Prior Papers" (most papers have one)
- Shared terminology across papers
- Open questions sections (most papers end with these)
- The HANDOFF.md file has a summary of key ideas by session

---

## Task C2: Concept Index & Glossary Generator

**Status:** DONE
**Claimed by:** Codex-GPT5
**Output:** `tools/concept-index/`

Build an automated glossary of every named concept, framework, and thesis in the series.

**Deliverables:**

1. **`index.json`** — Structured concept index:
   ```json
   {
     "concepts": [
       {
         "name": "The Biological Ratchet",
         "aliases": ["neural pruning argument", "dependency ratchet", "physiological argument"],
         "introduced_in": "007",
         "definition": "Dependencies don't reverse because the organism physically adapts...",
         "revised_in": [],
         "challenged_in": ["003"],
         "referenced_in": ["008"],
         "status": "active",
         "related_concepts": ["cognitive preference shift", "infrastructure threshold"]
       }
     ]
   }
   ```

2. **`glossary.md`** — Human-readable glossary sorted alphabetically. For each concept: one-paragraph definition drawn from the papers, paper of origin, current status (active/superseded/open question).

3. **`concept_map.mermaid`** — Mermaid diagram showing concept relationships (which concepts depend on, extend, or contradict which other concepts). Separate from the paper-level graph in C1 — this is concept-to-concept, not paper-to-paper.

4. **`build_index.py`** — The Python script that generates all of the above from the paper files. Should be re-runnable as new papers are added. Read the markdown files, extract concepts by pattern matching (bold terms, section headers, named frameworks), cross-reference, and output structured data.

**Extraction heuristics:**
- Bold terms on first use often indicate named concepts
- Section headers are often concept names
- Table rows in papers 007 and 008 define mappings
- "Relationship to Prior Papers" sections link concepts across papers
- The HANDOFF.md "Key Ideas" sections are a good seed list

---

## Task C3: Research Integrator

**Status:** DONE
**Claimed by:** Codex-GPT5
**Output:** `tools/integrator/`

Build a tool that processes the Gemini research output files (from `research/`) and produces a unified research digest.

**Note:** The research files may not exist yet (Gemini agents are still running). Build the tool so it works on whatever files exist at runtime, and can be re-run later when all 6 are complete.

**Deliverables:**

1. **`integrate.py`** — Python script that:
   - Reads all `research/*.md` files
   - Extracts all named scholars/authors mentioned across files
   - Deduplicates scholars appearing in multiple research files and consolidates what each research file says about them
   - Extracts all book/paper titles and builds a unified bibliography
   - Identifies contradictions (where one research file's evidence conflicts with another's)
   - Maps research findings to the open questions from Paper 008's "Open Questions for Paper 009" section
   - Outputs structured results

2. **`digest.md`** — Generated output (from running integrate.py on whatever research files exist):
   - **Scholars by frequency** — who appears most across the research, suggesting central importance
   - **Unified bibliography** — every source mentioned, deduplicated, sorted by relevance
   - **Contradiction report** — where research files disagree or present conflicting evidence
   - **Paper 009 coverage map** — which open questions from 008 got the most supporting material, which got the least (research gaps)
   - **Strongest challenges** — the most threatening counterarguments found across all research files

3. **`009_outline_suggestion.md`** — Auto-generated suggested outline for Paper 009 based on:
   - Which open questions have the most research material
   - Which new themes emerged from the research that weren't in the original open questions
   - Which counterarguments are strong enough to require direct engagement

**Design notes:**
- Parse markdown with regex or a lightweight parser — don't require a markdown AST library
- Be generous with extraction — false positives are better than missed findings
- The script should work with 1 research file or all 6
- Print progress to stdout so the user can see what it found
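
The design notes above can be illustrated with a minimal sketch of the "Scholars by frequency" step. This is not the actual `integrate.py`. The function names (`extract_names`, `scholars_by_frequency`, `load_research`) and the name-matching regex are hypothetical, chosen only for illustration. The regex is deliberately generous: it treats any run of two or three capitalized words as a candidate scholar name, so it will also pick up phrases such as paper titles.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical heuristic (an assumption, not from the task spec):
# two or three capitalized words in a row, e.g. "Hannah Arendt".
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+){1,2}\b")


def extract_names(text):
    """Return the set of candidate names found in one markdown text."""
    return set(NAME_RE.findall(text))


def scholars_by_frequency(docs):
    """docs maps filename -> markdown text.

    Returns a list of (name, [filenames]) pairs, sorted by how many
    files mention the name (most first), then alphabetically.
    """
    seen = defaultdict(set)
    for filename, text in docs.items():
        for name in extract_names(text):
            seen[name].add(filename)
    pairs = [(name, sorted(files)) for name, files in seen.items()]
    return sorted(pairs, key=lambda item: (-len(item[1]), item[0]))


def load_research(directory="research"):
    """Read whatever *.md files exist right now, printing progress."""
    docs = {}
    for path in sorted(Path(directory).glob("*.md")):
        docs[path.name] = path.read_text(encoding="utf-8")
        print(f"read {path.name}")
    return docs


if __name__ == "__main__":
    docs = load_research()
    print(f"{len(docs)} research file(s) found")
    for name, files in scholars_by_frequency(docs):
        print(f"{len(files)}  {name}  ({', '.join(files)})")
```

Because `scholars_by_frequency` takes a plain dictionary, it behaves the same with one research file or all six, and it can be re-run as more files land in `research/`. Counting a name once per file, not once per mention, is one possible reading of "appears most across the research"; a per-mention count would be an equally reasonable choice.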