Ollama cheat sheet

Get started quickly with Ollama, a tool for running open-source large language models (LLMs) locally on your machine. Install Ollama on your preferred platform (even on a Raspberry Pi 5 with just 8 GB of RAM), download models, and customize them to your needs.

Quick Start Guide

Run and chat with Llama 2:

ollama run llama2

Model Library

Access a variety of models from the model library. Example commands to download and run specific models:

  • ollama run llama2
  • ollama run mistral
  • ollama run dolphin-phi

Customize a Model

Import Models

  • GGUF: Create a file named Modelfile with a FROM instruction pointing to the local GGUF file:
    FROM ./model.gguf
    Then create a model from the Modelfile and run it:
    ollama create mymodel -f Modelfile
    ollama run mymodel
  • PyTorch/Safetensors: Refer to the import guide.

Customize Prompt

  1. Pull the model:
    ollama pull llama2
  2. Create a Modelfile that builds on llama2 (FROM llama2) and sets custom parameters and a system message.
  3. Create and run the model:
    ollama create custommodel -f ./Modelfile
    ollama run custommodel
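
As a minimal sketch of step 2, the snippet below writes a Modelfile that builds on llama2 with one sampling parameter and a system message. The temperature value and the wording of the system message are illustrative assumptions, not recommended settings, and the snippet overwrites any Modelfile already in the current directory.

```shell
# Write a Modelfile that layers a parameter and a system message on llama2.
# The temperature value and the system message text are illustrative only.
cat > Modelfile <<'EOF'
FROM llama2
# Higher temperature gives more varied output; lower gives more focused output.
PARAMETER temperature 1
# System message applied to every conversation with the custom model.
SYSTEM """
You are a concise assistant. Answer in at most three sentences.
"""
EOF

# Show the result so it can be checked before building the model.
cat Modelfile
```

With the file in place, step 3 (ollama create custommodel -f ./Modelfile, then ollama run custommodel) builds and starts the customized model.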

Advanced Usage

CLI Reference

  • Create a model: ollama create mymodel -f ./Modelfile
  • Pull a model: ollama pull modelname
  • Remove a model: ollama rm modelname
  • Copy a model: ollama cp source_model new_model
  • List models: ollama list
  • Start Ollama (without GUI): ollama serve

Multimodal Input

  • Text: At the interactive prompt, wrap multiline input in """.
  • Images: Specify the image path directly in the prompt.
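
The two input styles above might look like the following sketch. The model name llava and the file ./photo.png are assumptions chosen for illustration; any multimodal model and any local image path would take their place. The multiline example appears as comments because it is typed inside the interactive session rather than in a script.

```shell
# Multiline text: inside an interactive `ollama run` session, open and close
# the block with """ (typed at the >>> prompt, so shown here as comments):
#   >>> """Hello,
#   ... world!
#   ... """

# Images: put the image path directly in the prompt text.
image_path="./photo.png"   # hypothetical image file
prompt="What is in this image? ${image_path}"
echo "$prompt"

# Run the prompt only if the ollama CLI is installed.
# llava is an assumed example of a multimodal model.
if command -v ollama >/dev/null 2>&1; then
  ollama run llava "$prompt" || echo "ollama run failed (is the server running?)"
fi
```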

REST API Examples

  • Generate a response:
    curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt":"Why is the sky blue?"}'
  • Chat with a model:
    curl http://localhost:11434/api/chat -d '{"model": "mistral", "messages": [{"role": "user", "content": "why is the sky blue?"}]}'

Refer to the API documentation for more details.
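
By default these endpoints stream the answer back as a sequence of JSON objects. As a hedged sketch, the variant below sets "stream" to false so that the server returns a single JSON object, which is easier to handle in scripts. The variable names are illustrative, and the simple printf templating assumes the prompt contains no double quotes or backslashes.

```shell
# Build a /api/generate request body with streaming disabled, so the server
# replies with one JSON object instead of a stream of partial responses.
model="llama2"
prompt="Why is the sky blue?"   # assumed to contain no quotes or backslashes
body=$(printf '{"model": "%s", "prompt": "%s", "stream": false}' "$model" "$prompt")
echo "$body"

# Send the request; print a note if no server is listening on the default port.
curl -s http://localhost:11434/api/generate -d "$body" \
  || echo "Ollama server not reachable on localhost:11434"
```

The generated text is in the "response" field of the returned object. For the chat endpoint, the same "stream" setting applies and the reply is in the "message" field.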