Using Claude Code with Ollama

Running Large Language Models (LLMs) locally has become increasingly accessible, and combining this with Claude Code opens up powerful possibilities for AI-assisted development without relying on cloud services. In this post, I’ll walk through setting up Ollama to run LLMs locally and how to integrate it with Claude Code.

Installing Ollama

Ollama is a lightweight tool for running LLMs locally. Installation is straightforward on macOS:

brew install ollama

For other operating systems, refer to the official Ollama installation guide.
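A quick way to confirm the install worked is to check that the binary landed on your PATH (the exact version string will of course vary by machine; the guard is just so the snippet degrades gracefully):

```shell
# Confirm 'ollama' is on PATH after installation; the version string
# printed by 'ollama --version' will differ from machine to machine.
if command -v ollama >/dev/null 2>&1; then
  status="installed ($(ollama --version 2>/dev/null))"
else
  status="not found on PATH"
fi
echo "ollama: $status"
```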

Installing the Claude CLI

The Anthropic Claude CLI can be installed via npm:

npm install -g @anthropic-ai/claude-code

Running the Ollama Server

Once Ollama is installed, you can start the server. The wrapper script below handles this automatically, but you can also start it manually:

ollama serve

The server runs on port 11434 by default.
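You can probe whether the server is already listening without any extra tooling, using bash's built-in /dev/tcp redirection (a convenience sketch; lsof, as used in the wrapper script below, works just as well):

```shell
# Probe the default Ollama port with bash's /dev/tcp redirection.
# 11434 is the default; adjust if you have changed OLLAMA_HOST.
if (exec 3<>/dev/tcp/127.0.0.1/11434) 2>/dev/null; then
  server_state="up"
else
  server_state="down"
fi
echo "ollama server is $server_state"
```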

Pulling a Model

Before using a model, you need to pull it. For a local model like Qwen3.5:

ollama pull qwen3.5:9b

Or if you have an Ollama account and want to use cloud-hosted models:

ollama pull minimax-m2.7:cloud
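After a pull completes, the model should appear in `ollama list`. It's worth knowing the shape of that output (a header row, then one model per line with the name in the first column), because the wrapper script below verifies the pull by parsing it the same way:

```shell
# List installed model names: first column, skipping the header row.
# Guarded so the snippet degrades gracefully when ollama isn't installed.
if command -v ollama >/dev/null 2>&1; then
  models=$(ollama list | tail -n +2 | awk '{print $1}')
else
  models="(ollama not installed)"
fi
[ -n "$models" ] || models="(no models pulled yet)"
printf '%s\n' "$models"
```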

The Wrapper Script

I created a wrapper script that ties everything together. It starts the Ollama server if not already running, pulls the specified model, configures the required environment variables, and launches Claude Code with the selected model:

#!/bin/bash
set -e

MODEL=""
HELP=""

while getopts ":m:h" opt; do
  case $opt in
    m) MODEL="$OPTARG" ;;
    h) HELP="yes" ;;
    :) echo "Option -$OPTARG requires an argument" >&2; exit 1 ;;
    *) echo "Unknown option -$OPTARG" >&2; exit 1 ;;
  esac
done
shift $((OPTIND - 1))

if [ -n "$HELP" ] || [ -z "$MODEL" ]; then
    echo "Usage: $0 -m <model> [--dangerously-skip-permissions] [-p] <prompt>"
    echo "  -m <model>   Model to use (required)"
    echo "  -h           Show this help"
    # A missing -m is an error; an explicit -h is not.
    if [ -z "$HELP" ]; then exit 1; fi
    exit 0
fi

if ! lsof -i :11434 2>/dev/null | grep -q LISTEN; then
    ollama serve &
    OLLAMA_PID=$!
    # Give the freshly started server a moment to begin listening
    # before we talk to it.
    sleep 2
else
    OLLAMA_PID=""
fi

ollama pull "$MODEL" 2>/dev/null || true

# Verify model exists before starting Claude
if ! ollama list | tail -n +2 | awk '{print $1}' | grep -Fxq "$MODEL"; then
    echo "ERROR: Model $MODEL not found"
    [ -n "$OLLAMA_PID" ] && kill $OLLAMA_PID 2>/dev/null || true
    exit 1
fi

export ANTHROPIC_BASE_URL="http://127.0.0.1:11434"
export ANTHROPIC_AUTH_TOKEN="ollama"
export ANTHROPIC_API_KEY=""

claude --model "$MODEL" "${@}"

if [ -n "$OLLAMA_PID" ]; then
    kill $OLLAMA_PID 2>/dev/null || true
fi
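One caveat worth noting: because of set -e, a non-zero exit from claude aborts the script before the final kill, which can leave a server the script itself started still running. A trap on EXIT (a suggested refinement, not part of the script above) closes that gap:

```shell
#!/bin/bash
set -e

# cleanup runs on every exit path, including aborts caused by 'set -e',
# so a background server started by the script never outlives it.
cleanup() {
  if [ -n "${SERVER_PID:-}" ]; then
    kill "$SERVER_PID" 2>/dev/null || true
  fi
}
trap cleanup EXIT

sleep 300 &              # stand-in for: ollama serve &
SERVER_PID=$!

# Commands that might fail (e.g. the 'claude' invocation) would go here;
# cleanup still fires when they do.
```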

Usage

Make the script executable and run it with your chosen model:

chmod +x ollama-claude.sh

# For a local model
./ollama-claude.sh -m qwen3.5:9b "Explain this code for me"

# For a cloud model via Ollama
./ollama-claude.sh -m minimax-m2.7:cloud "Help me refactor this function"

Selecting the Right Model

  • Local models (e.g., qwen3.5:9b): Run entirely on your machine, no internet required, and free to use. Performance depends on your hardware.
  • Cloud models (e.g., minimax-m2.7:cloud): Require an Ollama account and internet connection, but typically offer better performance and larger context windows.

Summary

Combining Claude Code with Ollama gives you the best of both worlds: the powerful AI-assisted development experience of Claude Code running on top of locally-hosted or cloud-hosted LLMs through Ollama. Whether you prefer the privacy and cost benefits of local models or the raw power of cloud-hosted ones, this setup provides flexibility for different use cases.

The wrapper script automates the boilerplate, letting you focus on actually using the AI rather than managing infrastructure.

The full script is available here.