Morph: AST-Level Refactoring Where the LLM Describes Intent, Not Code

When an LLM generates source code for a refactor, the output is a diff that a reviewer must either read line by line or trust blindly. Short of reading every line, there is no way to know whether the model missed a reference, broke an import, or introduced a subtle logic change.
Morph takes a different approach. Instead of asking the LLM to generate code, it asks the LLM to describe what to change as a structured plan of typed operations - RenameSymbol, MoveFunction, ExtractModule, and more. A reviewer reads ten structured operations in seconds and knows exactly what will change, why, and in what order. The transformation engine then validates the plan against the real codebase dependency graph, applies each operation atomically using tree-sitter AST manipulation, runs the test suite to confirm correctness, and stages clean changes for review. Failed transformations roll back automatically.
The LLM's job is intent declaration, not code writing. Morph's engine handles everything else.
Why Typed Plans Beat Source Code Generation
When a refactoring is expressed as a typed plan, every operation is verifiable before it runs. The plan validator checks file existence, symbol existence, dependency conflicts, and operation conflicts against a real dependency graph. The transformer applies operations in dependency order. The verifier runs pytest after every apply - any failure triggers automatic rollback.
Source code generation has none of these guarantees. A typed plan does.
The Pipeline
A natural language goal enters the LLM Planner, which outputs a validated TransformationPlan. The Plan Validator checks file existence, symbol existence, dependency conflicts, and operation conflicts against a NetworkX dependency graph. The Transformer applies operations in dependency order using tree-sitter AST manipulation, creating a file backup first. The Verifier runs pytest - any failure triggers automatic rollback. Clean changes are handed off to the Staging Manager via GitPython and summarised in a Report.
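The validation stage can be sketched as follows. This is a minimal illustration, not Morph's implementation: it uses the standard-library ast module as a stand-in for tree-sitter, covers only a rename operation, and the names RenameSymbol (as a dataclass), defined_symbols, and validate_plan, along with all field names, are assumptions made for the example.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RenameSymbol:
    """Hypothetical operation shape; the field names are assumptions."""
    file: str
    old_name: str
    new_name: str


def defined_symbols(source: str) -> set[str]:
    """Collect top-level function and class names from Python source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def validate_plan(root: Path, plan: list[RenameSymbol]) -> list[str]:
    """Return human-readable problems; an empty list means the plan passed."""
    errors: list[str] = []
    claimed: set[tuple[str, str]] = set()
    for index, op in enumerate(plan):
        path = root / op.file
        # File existence check.
        if not path.is_file():
            errors.append(f"op {index}: file not found: {op.file}")
            continue
        symbols = defined_symbols(path.read_text())
        # Symbol existence check.
        if op.old_name not in symbols:
            errors.append(f"op {index}: {op.old_name} is not defined in {op.file}")
        # Name collision check.
        if op.new_name in symbols:
            errors.append(f"op {index}: {op.new_name} already exists in {op.file}")
        # Operation conflict check: two operations targeting the same symbol.
        key = (op.file, op.old_name)
        if key in claimed:
            errors.append(f"op {index}: conflicts with an earlier operation on {op.old_name}")
        claimed.add(key)
    return errors
```

Nothing in this sketch writes to disk, which is the point of the stage: every problem is reported before any file is modified.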
Supported Operations
Each operation is a typed Pydantic model. The LLM populates the fields; Morph validates and executes them.
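To make the idea of a typed operation concrete, here is a rough sketch of how planner output might be parsed into typed objects. It uses standard-library dataclasses as a stand-in for Pydantic so that it runs without extra dependencies; a Pydantic BaseModel would express the same checks declaratively. The field names, the "type" discriminator key, and the parse_plan helper are all assumptions, not Morph's actual schema.

```python
from __future__ import annotations

import json
import keyword
from dataclasses import dataclass


def _check_identifier(name: str) -> None:
    """Reject names that could never be valid Python symbols."""
    if not name.isidentifier() or keyword.iskeyword(name):
        raise ValueError(f"not a valid Python identifier: {name!r}")


@dataclass(frozen=True)
class RenameSymbol:
    # Hypothetical fields for illustration.
    file: str
    old_name: str
    new_name: str

    def __post_init__(self) -> None:
        _check_identifier(self.old_name)
        _check_identifier(self.new_name)
        if self.old_name == self.new_name:
            raise ValueError("old_name and new_name are identical")


@dataclass(frozen=True)
class MoveFunction:
    # Hypothetical fields for illustration.
    source_file: str
    target_file: str
    function_name: str

    def __post_init__(self) -> None:
        _check_identifier(self.function_name)
        if self.source_file == self.target_file:
            raise ValueError("source_file and target_file are identical")


OPERATION_TYPES = {"RenameSymbol": RenameSymbol, "MoveFunction": MoveFunction}


def parse_plan(raw: str) -> list[RenameSymbol | MoveFunction]:
    """Turn the planner's JSON text into typed operations, rejecting unknown types."""
    operations = []
    for item in json.loads(raw):
        fields = dict(item)
        kind = fields.pop("type", None)
        if kind not in OPERATION_TYPES:
            raise ValueError(f"unknown operation type: {kind!r}")
        # Missing or unexpected fields raise TypeError from the dataclass constructor.
        operations.append(OPERATION_TYPES[kind](**fields))
    return operations
```

Because the model can only fill in fields of known operation types, malformed output fails at parse time instead of surfacing later as a broken edit.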
How the Dependency Graph Works
Before validating any plan, Morph parses the entire codebase with tree-sitter and builds a NetworkX dependency graph. This graph is used to:
Detect files that import the symbol being moved or renamed
Sort operations so dependencies are updated before dependents
Warn when a move will cascade across downstream files
Prevent circular dependency introduction from module extraction
This is what makes Morph safe to run on real codebases - the plan is validated against the actual dependency structure before a single file is touched.
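The four uses of the graph listed above can be approximated in a few lines. This sketch uses the standard-library graphlib module in place of NetworkX, and a hand-written import map in place of one built by parsing the codebase; the module names and the helpers dependents_of, update_order, and would_create_cycle are hypothetical.

```python
from __future__ import annotations

from graphlib import CycleError, TopologicalSorter

# Hypothetical import map: each module maps to the modules it imports.
IMPORTS = {
    "app.py": {"billing.py", "utils.py"},
    "billing.py": {"utils.py"},
    "utils.py": set(),
}


def dependents_of(module: str, imports: dict[str, set[str]]) -> set[str]:
    """All modules that import `module`, directly or transitively."""
    found: set[str] = set()
    frontier = [module]
    while frontier:
        current = frontier.pop()
        for importer, deps in imports.items():
            if current in deps and importer not in found:
                found.add(importer)
                frontier.append(importer)
    return found


def update_order(imports: dict[str, set[str]]) -> list[str]:
    """Modules ordered so each one appears after everything it imports."""
    return list(TopologicalSorter(imports).static_order())


def would_create_cycle(imports: dict[str, set[str]], importer: str, imported: str) -> bool:
    """Check whether adding the edge importer -> imported introduces a cycle."""
    trial = {module: set(deps) for module, deps in imports.items()}
    trial.setdefault(importer, set()).add(imported)
    trial.setdefault(imported, set())
    try:
        TopologicalSorter(trial).prepare()
    except CycleError:
        return True
    return False
```

With this map, dependents_of("utils.py", IMPORTS) returns both billing.py and app.py, which is the cascade a move or rename in utils.py would have to account for.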
Rollback Guarantee
Every non-dry-run apply call snapshots all affected files before touching them. If pytest reports failures after transformation, Morph restores from the snapshot automatically. The workspace is always left in a clean, known-good state.
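A snapshot-and-restore cycle of this kind might look like the sketch below. It is an illustration under stated assumptions rather than Morph's code: apply_with_rollback is a hypothetical helper, the transformation and the test run are passed in as callables, and only files that existed before the transformation are restored.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def apply_with_rollback(
    files: Iterable[Path],
    transform: Callable[[], object],
    verify: Callable[[], bool],
) -> bool:
    """Run `transform`; restore every file in `files` if it raises or `verify` fails."""
    targets = list(files)
    with tempfile.TemporaryDirectory() as backup_dir:
        # Snapshot every affected file before anything is modified.
        backups: dict[Path, Path] = {}
        for index, path in enumerate(targets):
            copy = Path(backup_dir) / str(index)
            shutil.copy2(path, copy)
            backups[path] = copy
        try:
            transform()
            ok = verify()
        except Exception:
            ok = False
        if not ok:
            # Roll back: overwrite each file with its snapshot.
            for path, copy in backups.items():
                shutil.copy2(copy, path)
        return ok
```

In practice the verify callable could wrap a test run, for example `lambda: subprocess.run(["pytest", "-q"]).returncode == 0`, since pytest exits with status 0 only when all collected tests pass. Files newly created by the transformation are not removed by this sketch and would need separate tracking.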
Live Results
In a real dry-run against anthropic/claude-haiku-4-5 via OpenRouter, the LLM parsed a natural language rename goal and produced a validated RenameSymbol plan in under 5 seconds. Full output and reproduction steps are in RESULTS.md.
Installation
pip install -e .
For local inference, install Ollama and pull a model:
ollama pull gemma4:e4b
For cloud backends, set the relevant environment variable:
OPENROUTER_API_KEY - OpenRouter (recommended)
OPENAI_API_KEY - OpenAI
ANTHROPIC_API_KEY - Anthropic
Usage
Describe what you want in plain English. Morph figures out the operations:
morph refactor --goal "rename calculate_total to compute_total" ./src
Preview the plan without touching any files:
morph refactor --goal "extract validation logic into validate_input()" ./src --dry-run
Generate and save the plan for inspection before applying:
morph plan --goal "add type annotations to all functions in utils.py" ./src --output plan.json
Apply a saved plan:
morph refactor --plan plan.json ./src
Verify the codebase passes its own test suite:
morph verify ./src
Generate a Markdown report of the last run:
morph report ./src --format markdown --output REFACTOR_REPORT.md
Supported Models
Morph works with any provider. OpenRouter is the recommended starting point - one API key routes to every supported model without separate accounts.
The planner uses temperature=0.1 - low randomness produces more consistent structured output. Unknown model strings are automatically routed through OpenRouter with no --backend flag required.
CLI Reference
morph refactor --goal "..." PATH - Generate plan from goal and apply it
morph refactor --plan FILE PATH - Apply a previously saved plan
morph refactor ... --dry-run - Show plan without modifying files
morph plan --goal "..." PATH - Generate and display plan only
morph verify PATH - Run the test suite and report pass/fail
morph report PATH - Generate Markdown/JSON report of last run
Key flags: --model, --backend, --dry-run, --no-rollback, --output
Development
Clone and install in editable mode with dev dependencies:
git clone https://github.com/dakshjain-1616/morph
cd morph
pip install -e ".[dev]"
Run the full test suite:
pytest tests/ -v
Lint and type-check:
ruff check morph/ && mypy morph/
Final Notes
Morph shifts refactoring from code generation to intent declaration. The LLM describes what to change in a structured, validated plan. The engine does the mechanical work. Tests confirm correctness. The result is refactoring that is auditable before it runs, verifiable after it runs, and automatically reversible if it breaks anything.
This project was built using NEO. NEO is a fully autonomous AI engineering agent that can write code and build solutions for AI/ML tasks, including AI model evals, prompt optimization, and end-to-end AI pipeline development.
The code is at https://github.com/dakshjain-1616/Morph
You can also build with NEO in your IDE using the VS Code extension or Cursor.
You can use NEO MCP with Claude Code: https://heyneo.com/claude-code





