Self-Hosted LLM Server
Hosting a local LLM server with Ollama and OpenWebUI.
The LLM server is set up in the URseismo lab. It runs on a single NVIDIA RTX 3090 Ti GPU with 24 GB of VRAM and 256 GB of system RAM.
Prerequisites
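The compose file below assumes the host already has Docker Engine with the Compose plugin, a recent NVIDIA driver, and the NVIDIA Container Toolkit installed (the toolkit is required for the driver: nvidia GPU reservation). A quick sanity check that Docker can reach the GPU; the CUDA image tag is only an example:
# confirm the container runtime can see the GPU
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi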
Docker Compose Setup
# pull the latest container images
docker compose pull
# create the docker network and start the ollama and openwebui containers
docker compose up -d
# stop the stack
docker compose stop
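Once the stack is up, the standard Compose commands can be used to inspect it:
# list the running services and their published ports
docker compose ps
# follow the logs of a single service
docker compose logs -f ollama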
# docker-compose.yml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    networks:
      - llm-subnet

  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    ports:
      - "3000:8080"
    environment:
      - ENABLE_WEBSOCKET_SUPPORT=false
      - OLLAMA_API_BASE_URL=http://ollama:11434
      # - AUTOMATIC1111_BASE_URL=http://host.docker.internal:7860
      # - ENABLE_IMAGE_GENERATION=true
      # - IMAGE_GENERATION_MODEL=v1-5-pruned-emaonly
      # - IMAGE_SIZE=640x800
    depends_on:
      - ollama
    restart: unless-stopped
    volumes:
      - openwebui:/app/backend/data
    networks:
      - llm-subnet

volumes:
  ollama:
  openwebui:
  memory:

networks:
  llm-subnet:
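With both containers running, models can be pulled through the ollama container and the API checked on the published port; the model name below is only an example:
# download a model into the ollama volume
docker exec -it ollama ollama pull llama3
# list the models the Ollama API currently serves
curl http://localhost:11434/api/tags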
Service Exposure
To expose the services to the internet, we used the Synology NAS as a reverse proxy. A reverse SSH tunnel, ssh -N -R 3000:localhost:3000 <NAS>, forwards the OpenWebUI port to the NAS, and the Synology reverse proxy is configured to forward llm.<synology.address> to localhost:3000.
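A sketch of the tunnel, plus an optional autossh variant to keep it alive across network drops (autossh is an assumption, not part of the original setup):
# forward the local OpenWebUI port to the NAS
ssh -N -R 3000:localhost:3000 <NAS>
# optional: restart the tunnel automatically if it drops (assumes autossh is installed)
autossh -M 0 -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" -N -R 3000:localhost:3000 <NAS>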
The service is available at https://llm.repovibranium.synology.me.