====== How to Use Ollama ======

Ollama is a lightweight tool for running large language models locally. It handles model downloading, quantization, GPU detection, and serving behind a simple CLI and REST API. This guide covers installation, model management, API usage, custom configurations, and integration with other tools.

===== Installation =====

=== Linux ===

Run the official install script:

<code bash>
curl -fsSL https://ollama.com/install.sh | sh
</code>

For manual installation:

<code bash>
curl -L https://ollama.com/download/ollama-linux-amd64 -o ollama
chmod +x ollama
sudo mv ollama /usr/local/bin/
</code>

Set up a systemd service for automatic startup:

<code bash>
sudo tee /etc/systemd/system/ollama.service > /dev/null <
</code>
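As a preview of the REST API mentioned above: Ollama's server listens on ''localhost:11434'' by default, and ''POST /api/generate'' streams its reply as one JSON object per line, with the final object carrying ''"done": true''. The sketch below builds such a request body and shows how the streamed lines are reassembled; the model name is an assumption (substitute any model you have pulled), and the stream here is simulated so the snippet runs without a live server.

```python
import json

# Request body for POST http://localhost:11434/api/generate.
# "llama3.2" is an assumed model name -- use one you have actually pulled.
payload = {
    "model": "llama3.2",
    "prompt": "Why is the sky blue?",
    "stream": True,  # default: the server streams one JSON object per line
}
body = json.dumps(payload)

# Simulated streamed response lines, shaped like Ollama's output:
# intermediate lines carry a "response" fragment, the last has "done": true.
simulated_stream = [
    '{"response": "The sky ", "done": false}',
    '{"response": "is blue.", "done": false}',
    '{"done": true}',
]

# Concatenate the "response" fragments to recover the full reply.
text = "".join(json.loads(line).get("response", "") for line in simulated_stream)
print(text)  # The sky is blue.
```

Against a running server, the same body can be sent with ''curl http://localhost:11434/api/generate -d '{...}''' and the lines parsed exactly as above.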