🤖 Setup AnythingLLM — Private AI on Your Documents
Deploy AnythingLLM on Ubuntu with Docker — a fully private ChatGPT-like platform that chats with your documents. Connect local Ollama models or cloud APIs. Multi-user workspaces, RAG pipelines, and complete data privacy.
📦 Resources & Setup Scripts
Grab the automated bash script from GitHub to follow along with the video.
Tutorial Steps
1 Download & Run the Script
The script installs Docker and deploys AnythingLLM. Make sure Ollama is already running on port 11434 if you want local models.
wget https://raw.githubusercontent.com/mhmdali94/Docker/main/ai/anythingllm/anythingllm-ubuntu.sh
chmod +x anythingllm-ubuntu.sh
sudo bash anythingllm-ubuntu.sh
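Before running the script, it can help to confirm that Ollama really is answering on port 11434. The following is a minimal sketch, not part of the install script: the `OLLAMA_URL` variable and the `ollama_is_up` helper are names made up for this example, and it assumes `curl` is installed. It queries Ollama's `/api/tags` endpoint, which lists locally available models.

```shell
# Pre-flight check (illustrative): is Ollama answering on the expected port?
# OLLAMA_URL is an assumption; change it if Ollama runs on another host.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

ollama_is_up() {
  # $1 = base URL. -f makes curl fail on HTTP errors, --max-time bounds the wait.
  curl -fsS --max-time 5 "$1/api/tags" >/dev/null 2>&1
}

if ollama_is_up "$OLLAMA_URL"; then
  echo "Ollama reachable at $OLLAMA_URL"
else
  echo "Ollama not reachable at $OLLAMA_URL (start it first if you want local models)"
fi
```

If you only plan to use cloud APIs, a "not reachable" result here is harmless.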
2 Complete the Setup Wizard
Open your browser at the address below, then follow the setup wizard to choose your LLM provider and create an admin account:
http://<your-server-ip>:3001
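If the page does not load, a quick check from the server itself shows whether anything is answering on port 3001. This is a rough sketch: `ANYTHINGLLM_HOST`, `ANYTHINGLLM_PORT`, `ui_url`, and `ui_http_status` are hypothetical names chosen for the example, and `curl` is assumed to be installed.

```shell
# Assumed values: replace localhost with your server's IP when checking remotely.
ANYTHINGLLM_HOST="${ANYTHINGLLM_HOST:-localhost}"
ANYTHINGLLM_PORT="${ANYTHINGLLM_PORT:-3001}"

ui_url() {
  # $1 = host, $2 = port
  printf 'http://%s:%s\n' "$1" "$2"
}

ui_http_status() {
  # Prints the HTTP status code curl received; curl reports 000 when it
  # could not connect at all.
  curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "$1" 2>/dev/null || true
}

url="$(ui_url "$ANYTHINGLLM_HOST" "$ANYTHINGLLM_PORT")"
echo "Checking $url"
ui_http_status "$url"
```

A status of 000 usually means the container is not running or the port is not published. In that case `docker ps` is the next place to look.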
3 Connect Your LLM Provider
In Settings → LLM Preference, select Ollama and enter its base URL, for example http://<your-server-ip>:11434. For cloud models, select OpenAI or Anthropic instead and enter the matching API key.
4 Create a Workspace & Upload Documents
Create a workspace, upload your PDFs, Word docs, or text files, then start chatting. AnythingLLM automatically indexes and embeds your documents for accurate retrieval.
Ports Used
| Port | Purpose |
|---|---|
| 3001 | AnythingLLM Web UI |
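To see how port 3001 is actually bound on the host, you can inspect the listening sockets. The sketch below assumes the `ss` utility from iproute2 is available. `bind_scope` is a helper invented for this example that labels an address from the "Local Address:Port" column.

```shell
# Classify a listening address as reported by `ss -ltn`.
bind_scope() {
  case "$1" in
    127.*:3001|"[::1]:3001") echo "loopback" ;;  # reachable from this host only
    *:3001)                  echo "public"   ;;  # bound on all or external interfaces
    *)                       echo "other"    ;;  # some other port
  esac
}

# Print every listener on port 3001 together with its classification.
ss -ltn 2>/dev/null | awk 'NR>1 {print $4}' | while read -r addr; do
  scope="$(bind_scope "$addr")"
  [ "$scope" = "other" ] || echo "$addr -> $scope"
done
```

A "public" result means the UI is listening on every interface, so it is reachable from outside unless something else filters the traffic.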
Overview
AnythingLLM is an open-source, all-in-one AI platform that lets you chat with your documents using any LLM — local via Ollama or cloud via OpenAI, Anthropic, or Gemini. It uses Retrieval-Augmented Generation (RAG) to index your uploaded files, find relevant passages, and inject them into the LLM prompt so answers are grounded in your actual content rather than training data. It supports multi-user workspaces, access controls, agent tools (web browsing, code execution), and an embeddable chat widget for your website.
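The retrieve-then-prompt flow described above can be illustrated with a deliberately simplified toy. This is not how AnythingLLM is implemented: real RAG ranks chunks by embedding similarity, while this sketch substitutes plain keyword matching with `grep`. The file names, sample text, and `build_prompt` helper are all invented for the example.

```shell
# Toy illustration of "find relevant passages, then inject them into the prompt".
workdir="$(mktemp -d)"
printf 'Invoices are due within 30 days of receipt.\n' > "$workdir/billing.txt"
printf 'VPN access requires a hardware token.\n'       > "$workdir/it.txt"

build_prompt() {
  # $1 = keyword to retrieve on, $2 = user question, $3 = directory of text chunks
  # Retrieval step: keep only the lines that mention the keyword.
  context="$(grep -i -h -- "$1" "$3"/*.txt || true)"
  # Prompt step: place the retrieved text in front of the question.
  printf 'Answer using only this context:\n%s\n\nQuestion: %s\n' "$context" "$2"
}

build_prompt "vpn" "What do I need for VPN access?" "$workdir"
rm -rf "$workdir"
```

Only the VPN sentence ends up in the prompt. The billing text is never shown to the model, which is the same grounding effect a vector search achieves at larger scale.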
Why Use It
AnythingLLM bridges the gap between raw LLM inference and practical document-aware AI. Unlike uploading files to ChatGPT, it stores your documents locally, processes them into a vector database on your server, and, when both the LLM and the embedder run locally (for example through Ollama), never sends document content to external services. This makes it a strong fit when privacy and data sovereignty are non-negotiable: a law firm chatting with client contracts, hospital staff querying internal clinical guidelines, or an IT team building a knowledge base over runbooks and documentation.
Ports and Firewall Notes
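AnythingLLM serves its web UI on port 3001. Do not expose that port directly to the internet: block it at the firewall and publish it only through a reverse proxy that terminates TLS. If AnythingLLM and Ollama run on the same server, point AnythingLLM at http://host.docker.internal:11434 (the Ollama base URL in the LLM settings, or the OLLAMA_BASE_PATH variable in the container's environment) so LLM traffic stays on the host. On a plain Docker Engine install on Linux, host.docker.internal resolves inside the container only if the container is started with --add-host=host.docker.internal:host-gateway.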
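One possible shape for the reverse proxy is sketched below using nginx. Everything specific here is an assumption rather than something the install script sets up: the `nginx_conf` helper, the `chat.example.com` hostname, the certbot-style certificate paths, and the container being published on 127.0.0.1:3001 (for example with `-p 127.0.0.1:3001:3001`). The function only prints a config so you can review it before installing anything.

```shell
# Print an nginx server block that terminates TLS and forwards to AnythingLLM.
nginx_conf() {
  # $1 = public hostname (hypothetical, e.g. chat.example.com)
  domain="$1"
  cat <<EOF
server {
    listen 443 ssl;
    server_name ${domain};

    # Certificate paths assume a certbot-style layout; adjust to your setup.
    ssl_certificate     /etc/letsencrypt/live/${domain}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/${domain}/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3001;
        proxy_http_version 1.1;
        # Forward WebSocket upgrades and keep streamed replies unbuffered.
        proxy_set_header Upgrade \$http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host \$host;
        proxy_set_header X-Forwarded-Proto \$scheme;
        proxy_buffering off;
    }
}
EOF
}

nginx_conf chat.example.com
```

To try it, redirect the output into your nginx site directory and validate it with `sudo nginx -t` before reloading. Docker manages its own iptables rules, so a ufw rule alone may not block a port that Docker publishes on all interfaces. Binding the published port to 127.0.0.1 avoids relying on the firewall for that.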
Alternatives
Dify offers a more developer-focused approach with visual workflow building, agent orchestration, and API publishing, which suits teams building AI applications rather than just document Q&A. PrivateGPT is a simpler, code-first alternative for technical users who want minimal infrastructure. Flowise provides a no-code pipeline builder for assembling LangChain components visually. For larger deployments, Cognita is a modular RAG framework aimed at production use, while Langfuse is not a replacement but an observability layer (tracing and evaluation) that can sit alongside whichever stack you choose. AnythingLLM's strength is its clean out-of-the-box experience for non-developer teams who need private document chat.
When Not to Use It
AnythingLLM is not ideal for high-volume production APIs that need to serve many concurrent requests — it is optimized for team-scale usage, not enterprise inference throughput. If you need to build complex AI workflows with conditional logic, external API calls, and multi-step agents, Dify or n8n with AI nodes are more appropriate. If your team is purely technical and prefers code over UI, LangChain or LlamaIndex give more control at the cost of more setup work.
Frequently Asked Questions
Can AnythingLLM work completely offline?
Yes, if you use Ollama as the LLM provider and keep the built-in local embedder. Network access is needed only during setup: once to pull the Docker image, and once for Ollama to download each model. After that, documents, embeddings, and conversations all stay on your server. Note that AnythingLLM sends anonymous usage telemetry by default; set DISABLE_TELEMETRY to "true" in the container environment if the instance must make no outbound calls at all.
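Because Ollama needs the internet to download models, it is worth pulling them before the server goes offline. The sketch below uses the standard `ollama pull` and `ollama list` commands. The model names in `MODELS` are only examples, and `prefetch_models` is a helper name invented here.

```shell
# Example model names (assumptions); replace with the models you intend to use.
MODELS="${MODELS:-llama3.1 nomic-embed-text}"

prefetch_models() {
  # Refuse to continue if the Ollama CLI is not installed on this machine.
  if ! command -v ollama >/dev/null 2>&1; then
    echo "ollama CLI not found; nothing pulled" >&2
    return 1
  fi
  # Download each model once; later runs work from the local copy.
  for m in $MODELS; do
    ollama pull "$m" || return 1
  done
  # Show what is now available locally.
  ollama list
}
```

Run `prefetch_models` while the server still has internet access, then confirm that the models you need appear in the list before disconnecting.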
What file types can I upload to AnythingLLM?
AnythingLLM supports PDF, DOCX, TXT, MD, CSV, JSON, XLSX, and web URLs (it scrapes the page text). It does not support scanned image PDFs out of the box — those require OCR preprocessing with a tool like Tesseract or Adobe Acrobat to extract the text layer before uploading. Video and audio files are not supported without a transcription preprocessing step.
How many documents can a workspace hold?
There is no hard document limit. Performance depends on your server resources. Large workspaces (1,000+ pages) need more RAM for embedding and are slower to search. In practice, keeping workspaces focused, with one topic per workspace, gives better answer quality than one giant workspace with everything. Split large document collections across multiple workspaces and let users query the relevant one.
Can I use AnythingLLM with OpenAI while keeping documents private?
Partially. With the default built-in embedder, document embeddings (the vector representations) are computed locally, so full documents are not sent to OpenAI for indexing. However, the retrieved document chunks are included in the chat prompt sent to OpenAI's API for answer generation, so the passages relevant to each question do leave your server. If documents are sensitive, use Ollama for fully local inference. For less sensitive content, mixing local embeddings with OpenAI generation is a reasonable middle ground.
How do I add new documents to an existing workspace?
Click the paperclip or document icon in the workspace chat view to open the document manager. Drag and drop new files or paste a URL. AnythingLLM will embed the new documents and add them to the workspace vector database automatically. Existing conversations are not retroactively updated — new documents are available for all new conversations after embedding completes.
Can I embed AnythingLLM as a chat widget on my website?
Yes. In workspace settings, enable the embedded chat option. AnythingLLM generates a script tag you add to your website. The widget appears as a chat bubble and connects back to your AnythingLLM server. You can customize the widget's appearance, welcome message, and suggested questions. Only the specific workspace you embed is accessible — users cannot navigate to other workspaces through the widget.
What is the difference between workspaces and users?
Workspaces are document collections with their own vector store, system prompt, and settings. Users are accounts that can be assigned to one or more workspaces. An Admin can access all workspaces. A Manager can manage assigned workspaces. A User can only chat in workspaces they are assigned to and cannot see documents or settings. This structure lets you run multiple isolated AI assistants on a single AnythingLLM instance.
How do I update AnythingLLM without losing my documents?
Run `docker compose pull && docker compose up -d` in your AnythingLLM directory. Documents, vector embeddings, conversation history, and user accounts are stored in a named Docker volume and are not affected by image updates. After the update, log in and verify that workspaces still load and document search works. Check the AnythingLLM GitHub changelog for breaking changes before updating major versions.
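A cautious variant of the update takes an archive of the data volume first. This is a sketch under stated assumptions: the `APP_DIR` path and the `anythingllm_storage` volume name are guesses (check `docker volume ls` for the real name), and the helper names are invented for the example. It uses a throwaway Alpine container to run `tar` against the volume.

```shell
# Assumptions: the compose project lives in APP_DIR and AnythingLLM's data is
# in a Docker volume called VOLUME. Verify both before relying on this.
APP_DIR="${APP_DIR:-$HOME/anythingllm}"
VOLUME="${VOLUME:-anythingllm_storage}"

backup_file_name() {
  # $1 = volume name, $2 = date stamp (YYYYMMDD)
  printf '%s-%s.tar.gz\n' "$1" "$2"
}

backup_volume() {
  # Archive the volume (mounted read-only) into the current directory.
  out="$(backup_file_name "$VOLUME" "$(date +%Y%m%d)")"
  docker run --rm -v "$VOLUME:/data:ro" -v "$PWD:/backup" alpine \
    tar czf "/backup/$out" -C /data .
}

update_anythingllm() {
  backup_volume || return 1      # stop if the backup fails
  cd "$APP_DIR" || return 1
  docker compose pull && docker compose up -d
}
```

Calling `update_anythingllm` runs the backup and then the same pull-and-restart commands as above. For a fully consistent archive, consider stopping the container before the backup step, since the database may be written to while it runs.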
