Ollama - Local LLM Installation
Introduction
LLMs, or large language models, are becoming more and more streamlined and put the possibility of having your very own personal assistant at the tip of your fingers. If you would like, it can also run completely disconnected from the outside world, providing assistance with anything from simple searches to visual object understanding to helping you brainstorm ideas. While many models sit behind a subscription fee, there are some that are free to use and to customize to your liking.
What is Ollama?
Presenting Ollama, an open-source app that helps you organize, run and create LLMs locally on your computer, regardless of the OS platform (Linux, Windows, macOS). The process is very simple and intuitive, and within a few minutes you can have your very own AI model installed on your machine. PC requirements vary from a simple consumer PC with a mid-range graphics card up to models that will not run even on a $50,000 workstation, due to their size and complexity. We will focus on the ones which can be installed on most PCs.
Installation
To install Ollama, follow the steps for your operating system:
On Mac
Go to https://ollama.com/download/mac
download the client for Mac
install it on your Mac
On Linux
Go to https://ollama.com/download/linux
download and install the client for Linux using:
curl -fsSL https://ollama.com/install.sh | sh
On Windows
Go to https://ollama.com/download/windows
download the client for Windows
install it on your Windows PC
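Regardless of the platform, a quick way to verify that the client installed correctly is to open a terminal and check the version:
ollama --version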
After the client is installed on the respective OS, check the free disk space on your drive. Each model can take from a few GB to double digits, so make sure you have enough space for the download. Afterwards, you can open the Terminal (it is called the same on every platform) and run the following command:
ollama run <model-name>
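For example, to download and start Llama 3.2 (one of the models from my list below; at the time of writing, the tag on the Ollama site is llama3.2):
ollama run llama3.2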
Wait for the process to complete and you will automatically get a prompt from the model you have downloaded. You can find the list of LLMs available on your machine by running:
ollama list
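There is also a ps command which shows the models currently loaded in memory, handy when you are testing several of them:
ollama ps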
To download a different model, rerun "ollama run <model-name>" with the new name and a new download will start.
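If you only want to download a model without chatting with it right away, or remove one to free up disk space, the client also has pull and rm commands (the model name below is just an example):
ollama pull gemma2
ollama rm gemma2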
As a rough estimate, models below 10B parameters run smoothly on the following PC configuration:
- AMD Ryzen 9 3900X
- 32 GB RAM
- 1 TB NVMe storage
- Nvidia RTX 4070 Super
I also tried Gemma 2 with 27B parameters and it was simply too slow. Llama 3.3 failed with the error that there is not enough RAM available, so 32 GB is not sufficient to run that model. Of the ones I tested, the following models ran smoothly on my machine:
- Llama 3.2
- Gemma 2
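If you have an Nvidia card like mine and want to see whether a model actually fits into the GPU's memory, nvidia-smi shows the VRAM usage while the model is loaded:
nvidia-smi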
Configure WebGUI
To configure a web GUI, similar to what you get with ChatGPT, you need to set up an additional tool called Open WebUI. You can find the detailed description of the tool on GitHub, but for our purpose you can run the following command to get the Docker container running:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
After running the command, the container is set up and you can reach the interface at localhost:3000 (or the IP address of your machine on port 3000). The --add-host flag in the command above is what lets the container reach the Ollama server running on your host.
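If the page does not load, you can check that the container is actually running and inspect its startup logs (assuming the standard Docker CLI):
docker ps --filter "name=open-webui"
docker logs open-webui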
At first login, you must create an account, which will be your administrator account.
After providing the first username and password, you log in to the admin console, where you can configure additional users if required.
And there you have it: your very own assistant, ready for customization and for all the further ways it can help you. I cannot wait to play with it and configure it as I wish, and I hope you do too.
Have a great day, TFG