Local AI for private work.

Run supported GGUF models with your files in a native workspace. Chats, attachments, downloaded local models, and settings stay on your device.

Sensitive work
No cloud billing
Work offline

Download Aitopus

Explore local models

Run the local model families your work calls for.

Aitopus supports compatible GGUF models from these families, so private requests can stay on-device when a model is installed.

Qwen
LFM
GPT-OSS
Gemma
Granite
Phi
GLM

Why local models

Private by default

Keep your work on your machine. Local models let prompts, responses, and files be processed on-device instead of being sent to a remote provider.

Works offline

Once downloaded, local models are available without an internet connection. Use AI while traveling or working from restricted networks.

No per-token cost

Run local models on your own hardware without per-token charges, API keys, or cloud billing for local chats.

Device-level control

Choose which models run on your device and keep your conversations on your machine.

Manage models locally

Install, use, and delete local models from your device. Stay in control of storage, model choice, and your local setup.

How it works

Pick a local model

Browse compatible local models in Aitopus and compare each model's strengths, size, and fit for your device.

Choose the right download for your device

Many local models are offered in multiple quantizations. Smaller quantizations use less memory and storage, and usually run faster. Larger ones may preserve more quality, but need more capable hardware. Aitopus helps you pick the version that fits your device.

Select it from the model picker

Once the download finishes, the model appears in Aitopus alongside your other available models. Select it from the model picker to start chatting.

Start chatting locally

Use the Aitopus chat interface with a model running on your own machine.

Manage storage anytime

See which local models are installed, how much space they use, and remove them whenever you need the space back.
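As a rough illustration of what a storage view like this summarizes, the sketch below lists GGUF files in a folder with their sizes, largest first. The folder path and the function names are hypothetical, and this is not how Aitopus itself tracks installed models; it only assumes that local models are stored as `.gguf` files.

```python
from pathlib import Path


def list_gguf_models(models_dir):
    """Return (file name, size in GB) for each .gguf file, largest first."""
    files = Path(models_dir).glob("*.gguf")
    sized = [(f.name, f.stat().st_size / 1e9) for f in files if f.is_file()]
    return sorted(sized, key=lambda item: item[1], reverse=True)


def total_size_gb(models_dir):
    """Return the combined size in GB of all .gguf files in the folder."""
    return sum(size for _, size in list_gguf_models(models_dir))


if __name__ == "__main__":
    # "." is a placeholder; point this at whichever folder holds your models.
    for name, size in list_gguf_models("."):
        print(f"{size:8.2f} GB  {name}")
```

Sorting by size puts the models that free the most space at the top of the list.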

Pick a local model

Qwen 3.6 27B
Recommended, Powerful
Flagship reasoning model for coding and multilingual chat.
256K context, 2 quantizations, 1 downloading, Fits

Gemma 4 31B IT
Recommended, Powerful
Dense Google model with vision and broad language support.
256K context, 2 quantizations, Fits

Gemma 4 E2B IT
Fast
Lightweight model for quick local chat and reasoning.
128K context, 4 quantizations, 1 installed, Fits

GLM 4.7 Flash
Powerful, Fast
MoE model from Z.ai for reasoning, coding, and bilingual chat.
203K context, 2 quantizations, Fits

Choose the right download for your device

OpenAI GPT-OSS 20B (F16)
13.8 GB, Fits, F16, Max 128K context

OpenAI GPT-OSS 20B (Q4_K_M)
Fast
11.6 GB, Fits, Q4_K_M, Max 128K context

OpenAI GPT-OSS 20B (Q8_0)
Fast
12.1 GB, Fits, Q8_0, Max 128K context
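To make the trade-off between download size and device memory concrete, here is a minimal sketch that picks the largest quantization fitting a memory budget. The function name, the `headroom_gb` default, and the "largest that fits" rule are assumptions made for illustration, not Aitopus's actual selection logic; the sizes are the GPT-OSS 20B figures listed above.

```python
from typing import NamedTuple, Optional, Sequence


class Quantization(NamedTuple):
    name: str
    size_gb: float


def pick_quantization(
    options: Sequence[Quantization],
    available_gb: float,
    headroom_gb: float = 2.0,  # assumed reserve for context and other apps
) -> Optional[Quantization]:
    """Return the largest option that fits within the budget, or None."""
    budget = available_gb - headroom_gb
    fitting = [q for q in options if q.size_gb <= budget]
    return max(fitting, key=lambda q: q.size_gb) if fitting else None


# Download sizes as listed for OpenAI GPT-OSS 20B.
GPT_OSS_20B = [
    Quantization("F16", 13.8),
    Quantization("Q4_K_M", 11.6),
    Quantization("Q8_0", 12.1),
]

if __name__ == "__main__":
    for memory in (16.0, 14.5, 12.0):
        choice = pick_quantization(GPT_OSS_20B, available_gb=memory)
        print(memory, "GB available:", choice.name if choice else "nothing fits")
```

Preferring the largest file that fits reflects the idea that larger quantizations may preserve more quality. A picker that favors speed would prefer the smallest option instead.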

Select it from the model picker

Qwen 3.6 35B A3B (UD-Q4_K_M)

Apple Intelligence (Apple)
Granite 4.0 H 1B (Q4_K_M)
Qwen 3.6 27B (Q8_0)
Gemma 4 E4B IT (BF16)
Qwen 3.6 35B A3B (UD-Q4_K_M)

Start chatting locally

On device / llama.cpp

This chat is routed to the local model selected above.

Manage storage anytime

Qwen 3.6 27B (Q8_0)
28.6 GB, Tight fit, Q8_0, Max 256K context
Context length: 128K (adjustable from 4K to 256K)
Choose how much conversation history the model can use.

Qwen 3.6 27B (UD-IQ2_M)
10.8 GB, Fits, UD-IQ2_M, Max 256K context
Context length: 32K (adjustable from 4K to 256K)
Choose how much conversation history the model can use.
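The context length setting matters for memory because a longer context generally needs a larger key/value cache on top of the model file. The sketch below uses the standard estimate for a transformer with plain attention. The layer count, KV head count, and head size are hypothetical placeholders, not the real shape of any model listed here, and models with hybrid or sliding-window attention, or a quantized cache, will need less. It is not how Aitopus computes its fit labels.

```python
def kv_cache_gib(context_tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Estimate key/value cache size in GiB for a given context length."""
    # The leading 2 accounts for storing both keys and values per token.
    total_bytes = (
        2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value
    )
    return total_bytes / 1024**3


if __name__ == "__main__":
    # Hypothetical model shape, chosen only to show how the estimate scales.
    shape = dict(n_layers=48, n_kv_heads=8, head_dim=128)
    for label, tokens in [("4K", 4096), ("32K", 32768), ("128K", 131072), ("256K", 262144)]:
        print(f"{label:>4}: {kv_cache_gib(tokens, **shape):.2f} GiB")
```

Under these assumed numbers the cache grows linearly, from 0.75 GiB at 4K to 48 GiB at 256K, which is why a lower context length can help a large model fit.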