Open WebUI is the self-hosted ChatGPT — a web interface you point at any OpenAI-compatible endpoint (most often a local Ollama instance) that brings RAG, document upload, role-based access control, scheduled automations, and a Progressive Web App for mobile. The repo has crossed 124k stars and 282 million Docker pulls, making it the second-most-downloaded local-AI project after Ollama. The selling point is that it ships SSO, RBAC, and LDAP for free, in a category where ChatGPT Team charges $30/user/month and ChatGPT Enterprise costs more.
What is Open WebUI?
Open WebUI is a Python+SvelteKit web application that hosts a multi-user chat interface on top of an OpenAI-shaped backend. It speaks Ollama natively — auto-detecting installed models — and any other endpoint that exposes /v1/chat/completions, including the OpenAI API itself, Anthropic via a proxy, and local llama.cpp servers. The default deployment shape is a Docker container that mounts a single volume for chats and uploaded files; it scales horizontally to Kubernetes when one box stops being enough.
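To see what "OpenAI-shaped" means in practice: any backend that answers the following request can sit behind Open WebUI. A minimal check against a local Ollama daemon (assumes a llama3.2 model is already pulled; swap in whatever you have):

# Ollama exposes the OpenAI-compatible surface under /v1
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'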
The 2026 release line added a native desktop app for macOS, Windows, and Linux (system-wide push-to-talk, floating chat bar at Shift+Cmd+I), scheduled automations that run prompts on a cron, and built-in OAuth session management for MCP connections. Pair it with Ollama on the same host and you have a self-contained ChatGPT replacement that never ships a token off the box.
Install Open WebUI
Docker is the canonical install path. The image with the :ollama tag bundles Ollama in the same container; if you already run Ollama on the host (the recommended setup), use the slim :main image instead and point Open WebUI at your existing daemon.
# Assumes Ollama already running on http://localhost:11434
docker run -d \
--name open-webui \
-p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--restart always \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
ghcr.io/open-webui/open-webui:main
# Open http://localhost:3000 and create the first admin account.
# That account becomes the workspace owner; everyone after signs up as a regular user.

# Bundled alternative: the :ollama tag ships Ollama inside the same container (GPU passthrough shown).
docker run -d \
--name open-webui \
--gpus all \
-p 3000:8080 \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--restart always \
ghcr.io/open-webui/open-webui:ollama

Kubernetes manifests for kubectl, kustomize, and Helm live in the upstream repo. The desktop app — OpenWebUI.app on macOS, an MSI on Windows, an AppImage on Linux — runs the same backend locally without Docker; useful for laptop demos, less useful for a team server.
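For the Kubernetes path, the Helm chart is the lowest-friction option. A minimal sketch, assuming the chart repo at helm.openwebui.com from the upstream helm-charts repo (verify the repo URL and chart values against your version):

# Add the chart repo and install with default values
helm repo add open-webui https://helm.openwebui.com/
helm repo update
helm install open-webui open-webui/open-webui \
--namespace open-webui --create-namespace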
Connect to a non-Ollama backend
Open WebUI accepts any OpenAI-compatible endpoint. Add an OpenAI provider in Admin Panel → Settings → Connections, set the API base URL, and the model picker pulls the available models on the next request. Useful providers that work out of the box:
- OpenAI — official endpoint, paste an API key, all models appear in the picker.
- Anthropic via proxy — point at an OpenAI-compatible adapter such as litellm; Open WebUI does not call Anthropic's native API directly. A minimal proxy sketch follows this list.
- vLLM / TGI — production inference servers expose OpenAI-shaped endpoints; useful when one Ollama daemon is no longer enough.
- Multiple Ollama hosts — set OLLAMA_BASE_URLS to several URLs separated by semicolons; Open WebUI load-balances across them.
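A minimal sketch of the Anthropic-via-proxy route using litellm's OpenAI-compatible proxy; the model name, port, and file name are illustrative, and the config schema should be checked against litellm's docs:

# pip install 'litellm[proxy]' first
cat > litellm-config.yaml <<'EOF'
model_list:
  - model_name: claude-sonnet              # name Open WebUI's picker will show
    litellm_params:
      model: anthropic/claude-sonnet-4-5   # illustrative; use a model you have access to
      api_key: os.environ/ANTHROPIC_API_KEY
EOF
litellm --config litellm-config.yaml --port 4000
# In Open WebUI, add an OpenAI connection with base URL http://host.docker.internal:4000/v1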
RAG, documents, and web search
The RAG pipeline accepts file uploads (PDF, DOCX, TXT, Markdown, images via OCR) and has built-in support for 15+ web search providers (Google, Bing, DuckDuckGo, Brave, SearXNG, Tavily, Firecrawl) and 9+ vector databases (Chroma, Qdrant, Weaviate, Pinecone, Milvus). The default is Chroma running in the Open WebUI container, which is fine for one user; for a team, mount a Qdrant or Weaviate sidecar and switch the vector backend in Admin Panel → Documents.
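A sketch of the team-grade swap: Qdrant as a sidecar, selected through environment variables. VECTOR_DB and QDRANT_URI follow the current Open WebUI configuration docs, but verify the names against your version:

# Run Qdrant alongside Open WebUI and point the RAG pipeline at it
docker run -d --name qdrant -p 6333:6333 -v qdrant:/qdrant/storage qdrant/qdrant
docker run -d --name open-webui -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
-e VECTOR_DB=qdrant \
-e QDRANT_URI=http://host.docker.internal:6333 \
ghcr.io/open-webui/open-webui:main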
Web-search retrieval is enabled per-chat with the search-icon toggle; the model gets a tool-call surface that runs the configured search provider, fetches the page, and folds the result into context before the next token. The same mechanism powers custom tools — Python functions you upload that the model can invoke during a conversation.
Multi-user, RBAC, SSO
The first user to sign up becomes the workspace owner; subsequent signups land as regular users. Admins can assign permissions per group: which models are visible, which can be selected as the default, who can upload documents, who can configure tools. SSO is built in (OAuth, OIDC, SAML), and there is a self-hosted LDAP/AD path for shops that want to mirror their existing identity store. The combination of free SSO + RBAC + audit logs is the wedge against ChatGPT Team for cost-sensitive teams.
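A sketch of the generic OIDC path through environment variables; the variable names follow the current Open WebUI docs (verify against your version), and every value is a placeholder for your identity provider:

# Allow accounts to be created via the IdP rather than local signup
docker run -d --name open-webui -p 3000:8080 \
-v open-webui:/app/backend/data \
-e ENABLE_OAUTH_SIGNUP=true \
-e OAUTH_CLIENT_ID=open-webui \
-e OAUTH_CLIENT_SECRET=your-client-secret \
-e OPENID_PROVIDER_URL=https://idp.example.com/.well-known/openid-configuration \
ghcr.io/open-webui/open-webui:main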
Two operational notes worth getting right. First, WEBUI_AUTH=False disables auth entirely — useful for an air-gapped lab, dangerous on a network. Second, the open-webui Docker volume holds every chat, every uploaded document, and every embedding; back it up the way you back up a database, not the way you back up a stateless web app.
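Treating the volume like a database means taking consistent snapshots. A minimal sketch using a throwaway container, stopping the app first so the SQLite database and vector store are quiescent:

# Stop writes, snapshot the volume to a dated tarball, restart
docker stop open-webui
docker run --rm -v open-webui:/data -v "$PWD":/backup alpine \
tar czf "/backup/open-webui-$(date +%F).tar.gz" -C /data .
docker start open-webui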
Open WebUI vs ChatGPT Team — the cost story
ChatGPT Team is $30/user/month for OpenAI's frontier models behind a managed UI. Open WebUI charges nothing per seat and runs on infrastructure you own; the cost shifts to a small VPS or a workstation, plus whatever upstream tokens you spend if you use a hosted model under it. For a 10-developer team, the like-for-like comparison is roughly $300/month on ChatGPT Team versus $50–150/month on Open WebUI plus a pay-as-you-go OpenAI key (or zero recurring spend if you run local Ollama models for the bulk of usage). What you trade is operational responsibility — backups, upgrades, a sane reverse proxy with TLS, and the OAuth dance to plug it into your identity provider.
Open WebUI vs LibreChat
LibreChat is the closest direct alternative. The pragmatic split: Open WebUI wins for RAG-heavy use cases and small teams that want fewer moving parts; LibreChat wins for agent workflows and larger deployments where the Mongo-plus-Meilisearch architecture scales further than Open WebUI's default SQLite. If you are starting fresh and Ollama is already running, start with Open WebUI — the install story is one container.
Common pitfalls
- Connection refused to Ollama: the container cannot reach localhost:11434 on the host without --add-host=host.docker.internal:host-gateway and the matching OLLAMA_BASE_URL. On Linux, that flag is the easy fix; a host-networking alternative is sketched after this list.
- First admin account leaks: the first signup becomes the owner. Disable signups (ENABLE_SIGNUP=False) the moment your account exists, then invite users by email.
- Slow first response after upgrade: Open WebUI re-indexes embeddings on schema changes. Expect a one-time stall on the first chat after a major version bump.
- Docker volume size grows: uploaded documents are stored in full alongside their embeddings. Audit the volume monthly if users upload large PDFs.
- Web-search rate limits: the default DuckDuckGo provider has informal limits that bite under team usage. Move to Brave Search or Tavily once more than two users start searching daily.
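The host-networking alternative from the first pitfall, for Linux hosts where host.docker.internal needs the extra flag. Note that -p is ignored with --network=host; the app listens on 8080 by default:

# Container shares the host network, so localhost resolves to the host
docker run -d --name open-webui --network=host \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
ghcr.io/open-webui/open-webui:main
# UI is now on http://localhost:8080, not 3000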
Related reading
- Local LLMs for coding — model and hardware choices for the Ollama tier underneath Open WebUI.
- Ollama — the runtime that almost every Open WebUI install talks to.
- AI & LLM coding model comparison — the hosted side of the picker.