# Mini-Coding-Agent

This folder contains a small standalone coding agent:

- code: `mini_coding_agent.py`
- CLI: `mini-coding-agent`

It is a minimal local agent loop with:

- workspace snapshot collection
- stable prompt plus turn state
- structured tools
- approval handling for risky tools
- transcript and memory persistence
- bounded delegation

The model backend is currently based on Ollama.

<img src="https://sebastianraschka.com/images/github/mini-coding-agent/1.webp" width="500px">

<br>

**Stay tuned for a more detailed tutorial to be linked here**

## Requirements

You need:

- Python 3.10+
- Ollama installed
- an Ollama model pulled locally

Optional:

- `uv` for environment management and the `mini-coding-agent` CLI entry point

This project has no Python runtime dependency beyond the standard library, so you can run it directly with `python mini_coding_agent.py` if you do not want to use `uv`.

## Install Ollama

Install Ollama on your machine so the `ollama` command is available in your shell.

Official installation link: [ollama.com/download](https://ollama.com/download)

Then verify:

```bash
ollama --help
```

Start the server:

```bash
ollama serve
```

In another terminal, pull a model. Example:

```bash
ollama pull qwen3.5:4b
```

Qwen 3.5 model library:

- [ollama.com/library/qwen3.5](https://ollama.com/library/qwen3.5)

The default in this project is `qwen3.5:4b`. If you have sufficient memory, it is worth trying a larger model such as `qwen3.5:9b` or another larger Qwen 3.5 variant. The agent just sends prompts to Ollama's `/api/generate` endpoint.
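As a rough illustration of what such a call can look like, the sketch below builds a non-streaming request for Ollama's `/api/generate` endpoint using only the standard library. The helper names `build_generate_request` and `generate` are hypothetical and are not taken from `mini_coding_agent.py`. The host, model, and sampling values mirror the defaults listed under Main CLI Flags, and mapping `--max-new-tokens` to Ollama's `num_predict` option is an assumption.

```python
import json
import urllib.request


def build_generate_request(prompt, model="qwen3.5:4b", host="http://127.0.0.1:11434",
                           temperature=0.2, top_p=0.9, max_new_tokens=512):
    """Build a POST request for Ollama's /api/generate endpoint (hypothetical helper)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream of chunks
        "options": {
            "temperature": temperature,
            "top_p": top_p,
            "num_predict": max_new_tokens,  # assumed mapping for --max-new-tokens
        },
    }
    return urllib.request.Request(
        host.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=300, **kwargs):
    """Send the prompt to a running Ollama server and return the generated text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False, the generated text is in the "response" field.
        return json.loads(response.read().decode("utf-8"))["response"]
```

With `ollama serve` running and the model pulled, calling `generate("Reply with the single word: ready")` should return the model's text. Keeping request construction separate from the network call makes the payload easy to inspect without a server.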
## Project Setup

Clone the repo or your fork and change into it:

```bash
git clone https://github.com/rasbt/mini-coding-agent.git
cd mini-coding-agent
```

If you forked it first, use your fork URL instead:

```bash
git clone https://github.com/<your-github-user>/mini-coding-agent.git
cd mini-coding-agent
```

## Basic Usage

Start the agent:

```bash
cd mini-coding-agent
uv run mini-coding-agent
```

Without `uv`, run the script directly:

```bash
cd mini-coding-agent
python mini_coding_agent.py
```

By default it uses:

- model: `qwen3.5:4b`
- approval: `ask`

For a concrete usage example, see [EXAMPLE.md](EXAMPLE.md).

## Approval Modes

Risky tools such as shell commands and file writes are gated by approval.

- `--approval ask` prompts before risky actions (default and recommended)
- `--approval auto` allows risky actions automatically (convenient but riskier)
- `--approval never` denies risky actions

Example:

```bash
uv run mini-coding-agent --approval auto
```

## Resume Sessions

The agent saves sessions under the target workspace root in:

```text
.mini-coding-agent/sessions/
```

Resume the latest session:

```bash
uv run mini-coding-agent --resume latest
```

Resume a specific session:

```bash
uv run mini-coding-agent --resume 20260401-144025-2dd0aa
```

## Interactive Commands

Inside the REPL, slash commands are handled directly by the agent instead of being sent to the model as a normal task.

- `/help` shows the list of available interactive commands
- `/memory` prints the distilled session memory, including the current task, tracked files, and notes
- `/session` prints the path to the current saved session JSON file
- `/reset` clears the current session history and distilled memory but keeps you in the REPL
- `/exit` exits the interactive session
- `/quit` exits the interactive session; alias for `/exit`

## Main CLI Flags

```bash
uv run mini-coding-agent --help
```

Without `uv`:

```bash
python mini_coding_agent.py --help
```

CLI flags are passed before the agent starts.
Use them to choose the workspace, model connection, resume behavior, approval mode, and generation limits.

Important flags:

- `--cwd` sets the workspace directory the agent should inspect and modify; default: `.`
- `--model` selects the Ollama model name, such as `qwen3.5:4b`; default: `qwen3.5:4b`
- `--host` points the agent at the Ollama server URL (usually not needed); default: `http://127.0.0.1:11434`
- `--ollama-timeout` controls how long the client waits for an Ollama response (usually not needed); default: `300` seconds
- `--resume` resumes a saved session by id or uses `latest`; default: start a new session
- `--approval` controls how risky tools are handled: `ask`, `auto`, or `never`; default: `ask`
- `--max-steps` limits how many model and tool turns are allowed for one user request; default: `6`
- `--max-new-tokens` caps the model output length for each step; default: `512`
- `--temperature` controls sampling randomness; default: `0.2`
- `--top-p` controls nucleus sampling for generation; default: `0.9`

## Example

See [EXAMPLE.md](EXAMPLE.md)

## Notes & Tips

- The agent expects the model to emit either `<tool>...</tool>` or `<final>...</final>`.
- Different Ollama models will follow those instructions with different reliability.
- If the model does not follow the format well, use a stronger instruction-following model.
- The agent is intentionally small and optimized for readability, not robustness.
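
The output contract in the first note can be checked with a small parser. The following is a minimal sketch, not the parsing code from `mini_coding_agent.py`: the function name `parse_model_output`, the rule that the first matching tag wins, and the `invalid` fallback for untagged text are all assumptions made for illustration.

```python
import re

# Matches <tool>...</tool> or <final>...</final>; DOTALL lets the body span lines.
_TAG_PATTERN = re.compile(r"<(tool|final)>(.*?)</\1>", re.DOTALL)


def parse_model_output(text):
    """Return (kind, body), where kind is "tool", "final", or "invalid"."""
    match = _TAG_PATTERN.search(text)
    if match is None:
        # Assumed fallback: the model did not follow the expected format.
        return "invalid", text.strip()
    # First matching tag wins (an assumption); surrounding whitespace is dropped.
    return match.group(1), match.group(2).strip()
```

A caller could, for example, re-prompt the model when the kind is `invalid`, which is one way to cope with models that follow the format less reliably.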