mirror of http://10.0.2.1:3031/sauer/claude-code.git
synced 2026-06-30 11:36:57 +10:00
update README: working local model setup with LOCAL_MODEL_BASE_URL
parent 8605b3e54a
commit 38089ceaf0
44
README.md
@@ -49,32 +49,36 @@ export ANTHROPIC_API_KEY=your-key
 node cli.js
 ```
 
-### With Local Models (Ollama, LM Studio)
+### With Local Models (Ollama + Qwen3-Coder)
 
-Claude Code uses the Anthropic Messages API format. To use local models, run [litellm](https://github.com/BerriAI/litellm) as a translation proxy:
+We patched the source to add `LOCAL_MODEL_BASE_URL` — routes only model API calls to your local proxy while letting auth/startup use Anthropic's servers normally.
+
+**Requirements:** [Ollama](https://ollama.com) + [litellm](https://github.com/BerriAI/litellm) + a Claude subscription (for auth)
 
 ```bash
-# Terminal 1: Start litellm proxy
-pip install litellm
-litellm --model ollama/llama3.1:8b --port 8080
+# Step 1: Pull a model with 128K+ context (required for Claude Code's system prompt)
+ollama pull qwen3-coder:30b
 
-# Terminal 2: Point Claude Code at the proxy
-export ANTHROPIC_BASE_URL=http://localhost:8080
-export ANTHROPIC_API_KEY=not-needed
-node cli.js
+# Step 2: Create litellm config that maps Claude's model name to your local model
+cat > litellm-config.yaml << 'CONF'
+model_list:
+  - model_name: "claude-sonnet-4-20250514"
+    litellm_params:
+      model: "ollama/qwen3-coder:30b"
+      num_ctx: 65536
+litellm_settings:
+  drop_params: true
+CONF
+
+# Step 3: Start litellm proxy (needs Python 3.10+)
+pip install 'litellm[proxy]'
+litellm --config litellm-config.yaml --port 8080
+
+# Step 4: Run Claude Code (in another terminal)
+LOCAL_MODEL_BASE_URL=http://localhost:8080 node cli.js
 ```
 
-Works with any model Ollama supports — llama3.1, codellama, deepseek-coder, mistral, etc.
+Claude Code authenticates with Anthropic normally (you need a subscription), but all model inference runs locally on Qwen3-Coder via Ollama. Works with any model that has 128K+ context — qwen3-coder, deepseek-r1, llama4, etc.
 
-### With OpenAI / GPT Models
-
-```bash
-# Via litellm proxy
-litellm --model openai/gpt-4o --port 8080
-
-# Or any OpenAI-compatible endpoint (Codex, GPT-5.4, etc.)
-litellm --model openai/o3 --port 8080
-```
-
 ---
 
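The routing behaviour the new README text describes (model API calls go to the local proxy, auth and startup traffic stay on Anthropic's servers) could look roughly like the sketch below. This is not the patched source, which this commit does not show: `resolveBaseUrl`, `buildMessagesRequest`, and the `kind` argument are hypothetical names chosen for illustration. The sketch also assumes the proxy accepts Anthropic-style requests at `/v1/messages`, and that the request uses the model name mapped in `litellm-config.yaml`.

```javascript
// Hypothetical sketch: these helper names do not come from the patched source.
const DEFAULT_BASE_URL = "https://api.anthropic.com";

// Choose the base URL for a request. Only "model" traffic is redirected;
// any other kind (auth, startup checks) keeps the default host.
function resolveBaseUrl(kind, env = process.env) {
  const local = env.LOCAL_MODEL_BASE_URL;
  if (kind === "model" && local) {
    return local.replace(/\/+$/, ""); // drop trailing slashes
  }
  return DEFAULT_BASE_URL;
}

// Build a minimal Messages-API-style request aimed at the proxy.
// The model name must match the `model_name` entry in litellm-config.yaml
// so that the proxy can map it to the local Ollama model.
function buildMessagesRequest(prompt, env = process.env) {
  return {
    url: `${resolveBaseUrl("model", env)}/v1/messages`,
    body: {
      model: "claude-sonnet-4-20250514",
      max_tokens: 64,
      messages: [{ role: "user", content: prompt }],
    },
  };
}
```

With `LOCAL_MODEL_BASE_URL=http://localhost:8080` set, `buildMessagesRequest("hello")` targets the local proxy, while `resolveBaseUrl("auth")` still returns the default Anthropic host. Unsetting the variable sends model calls back to the default host as well.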