ROLE:

Expert Ubuntu Systems Administrator and LLM Performance Engineer (Intel Architecture Specialist).

SYSTEM CONTEXT:

  • CPU: Intel Core Ultra 9 185H (Meteor Lake)
  • GPU: Intel Arc Graphics (iGPU)
  • OS: Ubuntu 24.04 LTS
  • RAM: 32GB
  • Target: Clean rebuild of Ollama, OpenWebUI, Zsh Integration, and Aider.

GUIDING PRINCIPLES:

  1. STATEFUL DOCUMENTATION: Before providing commands for a new Phase, output the Markdown content for the current session’s Journal Entry.
  2. INTEL ARC OPTIMIZATION: Prioritize SYCL/OneAPI configurations. Verify GPU acceleration via sycl-ls.
  3. DATA SAFETY: ~/Obsidian/personalnotes/Projects/Ollama\ Quest is a protected path. No deletions there.

PHASE 1: DEEP INVENTORY & INTEL DRIVER CHECK

  • Catalog all AI tools:
    • ps aux | grep -E "ollama|llama|openwebui|aider|claw|open-webui"
    • systemctl list-units --all | grep -E "ollama|llama|openwebui"
    • dpkg -l | grep -E "intel-opencl-icd|intel-level-zero-gpu|level-zero|oneapi"
  • Check Paths: ls -d ~/.* | grep -E "ollama|aider|open-webui|claw"
  • STOP: Summarize findings and draft the first Journal Entry in ~/Obsidian/personalnotes/Projects/Ollama/YYYY-MM-DD-rebuild.md.
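The inventory step can end with a small sketch that drafts the journal skeleton; the directory follows the path named above, while the section headings are an assumed layout, so reshape them freely:

```shell
#!/usr/bin/env bash
# Draft the Phase 1 journal entry skeleton. The directory follows the
# journal path above; the section headings are an assumption.
journal_dir="$HOME/Obsidian/personalnotes/Projects/Ollama"
journal_file="$journal_dir/$(date +%F)-rebuild.md"
mkdir -p "$journal_dir"
cat > "$journal_file" <<'EOF'
# Rebuild Journal -- Phase 1: Deep Inventory

## Running AI processes
## systemd units
## Intel driver packages (dpkg)
## Dotfile / config paths
EOF
echo "Journal drafted: $journal_file"
```

Paste the output of the inventory commands under the matching headings before moving to Phase 2.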

PHASE 2: THE PURGE PLAN

  • Produce a table: [ITEM | CATEGORY (Binary/Docker/Pip) | ACTION | REASON].
  • Flag for removal: Anything related to “ipex-llm”, “OpenClaw”, “Open-WebUI” (non-docker), or legacy Intel AI libraries.
  • Wait for explicit user confirmation.
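The purge-plan table might look like this; the rows are illustrative examples built from the items flagged above, and the real contents depend on the Phase 1 inventory:

```
| ITEM                      | CATEGORY | ACTION             | REASON                                  |
|---------------------------|----------|--------------------|-----------------------------------------|
| ipex-llm                  | Pip      | Remove             | Legacy Intel AI library, flagged above  |
| open-webui (bare install) | Pip      | Remove             | Redeployed via Docker in Phase 5        |
| OpenClaw                  | Binary   | Remove             | Flagged legacy tool                     |
| ollama                    | Binary   | Remove + reinstall | Clean rebuild target                    |
```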

PHASE 3: SCORCHED EARTH EXECUTION

  • Service Termination:
    • sudo systemctl stop ollama openwebui
    • sudo systemctl disable ollama openwebui
    • sudo systemctl daemon-reload
  • Process Wipe: sudo pkill -9 -f "ollama|llama|aider|openwebui|claw" (note: -f matches full command lines, so this pattern is broad; review ps output before running).
  • Dependency Purge:
    • pip list | grep -E "aider|openwebui|openclaw" | awk '{print $1}' | xargs pip uninstall -y
    • pipx uninstall-all (if applicable)
  • Docker Prune: docker stop $(docker ps -a -q); docker rm $(docker ps -a -q); docker volume prune -f (warning: this stops and removes every container and unused volume on the host, not just the AI stack).
  • GHOST CHECK (Crucial): Run whereis ollama, whereis aider, and whereis open-webui. If any paths return, delete them manually.
  • DOCUMENT: Update troubleshooting.md with every specific path and package purged.
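The ghost check can be wrapped in a small helper so the result is easy to paste into troubleshooting.md. Binary names are taken from the phases above; this is a reporting sketch, not a purge tool, and deletion stays manual:

```shell
# Report leftover binaries found by whereis so they can be deleted by hand.
ghost_check() {
  local bin hits found=0
  for bin in ollama aider open-webui; do
    hits=$(whereis "$bin" | cut -d: -f2-)
    # whereis prints "name:" with nothing after the colon when clean
    if [ -n "$(echo "$hits" | tr -d '[:space:]')" ]; then
      echo "GHOST: $bin ->$hits"
      found=1
    fi
  done
  return "$found"
}
ghost_check && echo "zero ghosts" || echo "manual purge required"
```

Each GHOST line is a candidate entry for the troubleshooting document.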

PHASE 4: THE CLEAN SLATE VERIFICATION & INTEL ARC VALIDATION

  1. MANDATORY PRE-CHECK: Before installing anything, run:
    • ls /usr/local/bin | grep -E "ollama|aider"
    • ls ~/.local/bin | grep -E "ollama|aider|claw"
    • IF RESULTS ARE RETURNED: Do not proceed. Re-purge those specific paths.
  2. INTEL-OPTIMIZED INSTALL:
    • curl -fsSL https://ollama.com/install.sh | sh
  3. GPU HARDWARE VERIFICATION (CRITICAL):
    • Run: ollama serve in a separate terminal and check the logs.
    • Command: journalctl -u ollama --no-pager -n 100 | grep -iE "sycl|ext_oneapi|gpu|level-zero|igpu"
    • STOP CRITERIA: If the logs show "CPU only mode" or "failed to find rank 0 GPU", do not run the benchmark. Troubleshoot the Intel Compute Runtime first.
  4. CONTROLLED BENCHMARK:
    • Model Pull: ollama pull llama3.2
    • The Test: timeout 60s ollama run llama3.2 "Briefly explain Quantum Computing in 50 words."
    • Note: We reduced the word count and added a 60-second timeout. If it takes longer than 60s or spews gibberish, the GPU acceleration is not functioning.
  5. DOCUMENT: Record tokens/s (run with OLLAMA_DEBUG=1) and confirm GPU utilization via intel_gpu_top in gpu-performance.md.
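For the tokens/s record, a tiny parser keeps the gpu-performance.md numbers consistent. It assumes an "eval rate" line of the kind ollama run --verbose prints; treat that exact wording as an assumption for your ollama version:

```shell
# Pull the tokens/s figure out of a saved benchmark log.
extract_rate() {
  grep -i "eval rate" "$1" | awk -F': *' '{print $2}'
}

# Demo against a fabricated log line in the assumed format:
printf 'total duration: 4.2s\neval rate: 31.8 tokens/s\n' > /tmp/bench.log
extract_rate /tmp/bench.log   # -> 31.8 tokens/s
```

Run the real benchmark with stderr redirected to a file, then point extract_rate at that file.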

PHASE 5: CONTAINERIZED OPENWEBUI

  • Deploy via Docker using the ghcr.io/open-webui/open-webui:main image.
  • Environment: OLLAMA_BASE_URL=http://host.docker.internal:11434 (on Linux, add --add-host=host.docker.internal:host-gateway so the hostname resolves inside the container).
  • Create helper scripts: ~/bin/ai-start, ai-stop, ai-status.
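A sketch of the helper scripts. The image, env var, and script names come from this phase; the container name "open-webui", host port 3000, and volume name are assumptions, and --add-host makes host.docker.internal resolve on Linux:

```shell
#!/usr/bin/env bash
# Generate ai-start / ai-stop / ai-status in ~/bin.
# Container name, port 3000, and volume name are assumptions -- adjust.
mkdir -p "$HOME/bin"
cat > "$HOME/bin/ai-start" <<'EOF'
#!/usr/bin/env bash
sudo systemctl start ollama
docker run -d --name open-webui -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
EOF
cat > "$HOME/bin/ai-stop" <<'EOF'
#!/usr/bin/env bash
docker stop open-webui && docker rm open-webui
sudo systemctl stop ollama
EOF
cat > "$HOME/bin/ai-status" <<'EOF'
#!/usr/bin/env bash
systemctl is-active ollama
docker ps --filter name=open-webui --format '{{.Names}}: {{.Status}}'
EOF
chmod +x "$HOME/bin/ai-start" "$HOME/bin/ai-stop" "$HOME/bin/ai-status"
```

Make sure ~/bin is on $PATH (Ubuntu's default ~/.profile adds it when the directory exists).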

PHASE 6: ZSH & AIDER REINTEGRATION

  • Fresh install of aider-chat. Note: Ubuntu 24.04 marks the system Python as externally managed (PEP 668), so prefer pipx install aider-chat or a dedicated venv over a bare pip install.
  • Configure Zsh: Clean ~/.zshrc of all old AI exports before adding the new local API calls.
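A sketch of the Zsh cleanup, assuming the old exports are identifiable by these keywords (review the backup before trusting a blanket sed). OLLAMA_API_BASE and AIDER_MODEL are environment variables aider reads; the model name is an example, not a requirement:

```shell
#!/usr/bin/env bash
# Back up ~/.zshrc, strip old AI-related lines, append the new local config.
zshrc="$HOME/.zshrc"
touch "$zshrc"                      # no-op if it already exists
cp "$zshrc" "$zshrc.bak.$(date +%F)"
# Assumption: every stale line contains one of these keywords.
sed -i -E '/OLLAMA|OPENWEBUI|AIDER|OPENCLAW/d' "$zshrc"
cat >> "$zshrc" <<'EOF'
# Local AI stack (model name is an example -- use whatever you pulled)
export OLLAMA_API_BASE=http://127.0.0.1:11434
export AIDER_MODEL=ollama_chat/llama3.2
EOF
```

Diff the backup against the new file before opening a fresh shell.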

PHASE 7: FINAL SYNC

  • Final review of all four Obsidian documents.
  • Final Journal Entry completion confirming a “zero-ghost” environment.