Quickstart: Chat

Start chatting with a local AI model in under a minute. No accounts, no API keys, no cloud.

Using the desktop app

1. Install Teale. Download from teale.com/download or run brew install --cask teale. See Install on Mac for details.
2. Click the Teale icon in your menu bar. It appears near the clock once you launch the app.
3. Wait for the model to download. Teale automatically selects and downloads a model based on your available RAM. For a Mac with 16 GB, this is typically Llama 3.1 8B (4-bit quantized, about 4.5 GB). The download happens only once.
4. Type a message and start chatting. Inference runs entirely on your Mac. Nothing is sent to the cloud.
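
The download size quoted above can be sanity-checked with simple arithmetic: parameter count times bits per weight, divided by 8 bits per byte. The sketch below is illustrative only. The function name estimate_model_gb is hypothetical, and the 4.5 bits-per-weight figure used in the example is an assumption (4-bit weights plus roughly half a bit of per-group scaling metadata, as is common in 4-bit formats), not a documented Teale value.

```shell
# Rough size of a quantized model in GB (1 GB = 10^9 bytes).
# $1 = parameters in billions, $2 = effective bits per weight.
# Hypothetical helper for illustration; not part of the teale CLI.
estimate_model_gb() {
  # billions of params * bits / 8 bits-per-byte = gigabytes
  LC_ALL=C awk -v params="$1" -v bits="$2" \
    'BEGIN { printf "%.1f\n", params * bits / 8 }'
}
```

Running estimate_model_gb 8 4 prints 4.0 for the raw 4-bit weights, and estimate_model_gb 8 4.5 prints 4.5, in line with the size quoted for Llama 3.1 8B.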

Using the CLI

Start the inference server and send a message:

teale up
teale chat "What is the meaning of life?"
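
If you plan to call these two commands from a script, a small wrapper can fail early when the CLI is not installed. This is a minimal sketch, not an official helper: the function name ask_teale is hypothetical, and it assumes that teale up returns once the server is ready, as the two-command sequence above suggests.

```shell
# Hypothetical wrapper: send a one-shot prompt, failing early if the CLI is missing.
# Assumption: `teale up` returns (exit status 0) once the server is running.
ask_teale() {
  if ! command -v teale >/dev/null 2>&1; then
    echo "teale not found on PATH; install it first" >&2
    return 127
  fi
  # Start the server, then send the prompt passed as the first argument.
  teale up && teale chat "$1"
}
```

With the function loaded in your shell, ask_teale "What is the meaning of life?" behaves like the two commands above, and returns status 127 with a message on stderr if teale is not on your PATH.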

For an interactive conversation:

teale chat

This opens a REPL where you can type messages back and forth. Press Ctrl+C to exit.

List available models, pull a different one, and chat with it:

teale models list
teale models pull qwen-2.5-7b-instruct-4bit
teale chat --model qwen-2.5-7b-instruct-4bit "Explain quantum computing"
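
To compare how different models answer the same prompt, the --model flag shown above can be wrapped in a loop. This is a sketch under stated assumptions: compare_models is a hypothetical helper, and it assumes every model you name has already been pulled with teale models pull.

```shell
# Hypothetical helper: run one prompt against several already-pulled models.
# $1 = prompt, remaining arguments = model names.
compare_models() {
  prompt=$1
  shift
  for model in "$@"; do
    # Print a header so each model's answer is easy to tell apart.
    printf '== %s ==\n' "$model"
    teale chat --model "$model" "$prompt"
  done
}
```

Call it as compare_models "Explain quantum computing" followed by one or more model names as printed by teale models list.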

When you send a message, Teale:

1. Loads the model into your Mac's unified memory (first message may take a few seconds).
2. Runs inference on Apple Silicon using Metal acceleration.
3. Streams tokens back as they are generated.

All processing stays on your machine. No data leaves your device unless you explicitly connect to the Teale network.
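
One way to observe the model-loading cost on the first message is to time the same one-shot prompt twice and compare the results. This is a rough sketch: time_prompt is a hypothetical helper, it measures whole seconds only, and it assumes the server keeps the model loaded between calls, which this page does not state.

```shell
# Hypothetical helper: print how many whole seconds a one-shot prompt takes.
# $1 = prompt. The model's reply is discarded; only the elapsed time is printed.
time_prompt() {
  start=$(date +%s)
  teale chat "$1" >/dev/null
  end=$(date +%s)
  echo $(( end - start ))
}
```

If the assumption holds, running time_prompt "Hello" right after teale up and then a second time should show a shorter duration on the second run.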