Self-Hosted LLM Server
Hosting a local LLM server with Ollama and OpenWebUI.
The LLM server is set up in the URseismo lab. It runs on a single NVIDIA RTX 3090 Ti GPU with 24 GB of VRAM and 256 GB of system RAM.
Prerequisites
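The compose file below assumes the host already has Docker Engine with the Compose plugin, a recent NVIDIA driver, and the NVIDIA Container Toolkit installed (the toolkit is required for the driver: nvidia GPU reservation). A quick sanity check that Docker can reach the GPU; the CUDA image tag is only an example:
# confirm the container runtime can see the GPU
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi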
Docker Compose Setup
# pull the latest container images
docker compose pull
# create the docker network and start the ollama and openwebui containers
docker compose up -d
# stop the stack
docker compose stop
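Once the stack is up, the standard Compose commands can be used to inspect it:
# list the running services and their published ports
docker compose ps
# follow the logs of a single service
docker compose logs -f ollama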
# docker-compose.yml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    networks:
      - llm-subnet

  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    ports:
      - "3000:8080"
    environment:
      - ENABLE_WEBSOCKET_SUPPORT=false
      - OLLAMA_API_BASE_URL=http://ollama:11434
      # - AUTOMATIC1111_BASE_URL=http://host.docker.internal:7860
      # - ENABLE_IMAGE_GENERATION=true
      # - IMAGE_GENERATION_MODEL=v1-5-pruned-emaonly
      # - IMAGE_SIZE=640x800
    depends_on:
      - ollama
    restart: unless-stopped
    volumes:
      - openwebui:/app/backend/data
    networks:
      - llm-subnet

volumes:
  ollama:
  openwebui:
  memory:

networks:
  llm-subnet:
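With both containers running, models can be pulled through the ollama container and the API checked on the published port; the model name below is only an example:
# download a model into the ollama volume
docker exec -it ollama ollama pull llama3
# list the models the Ollama API currently serves
curl http://localhost:11434/api/tags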
Service Exposure
To expose the services to the internet, we used the Synology NAS as a reverse proxy. A reverse SSH tunnel, ssh -N -R 3000:localhost:3000 <NAS>, forwards the OpenWebUI port to the NAS, and the Synology reverse proxy is configured to forward llm.<synology.address> to localhost:3000.
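A sketch of the tunnel, plus an optional autossh variant to keep it alive across network drops (autossh is an assumption, not part of the original setup):
# forward the local OpenWebUI port to the NAS
ssh -N -R 3000:localhost:3000 <NAS>
# optional: restart the tunnel automatically if it drops (assumes autossh is installed)
autossh -M 0 -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" -N -R 3000:localhost:3000 <NAS>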
The service is available at https://llm.repovibranium.synology.me.