Save the llama.cpp files locally.
Open a terminal in the folder where you want the app.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
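A quick sanity check after building; this is a sketch assuming the classic Makefile build, which puts a ./main binary in the repo root (newer llama.cpp releases switched to CMake and renamed the binaries):
make -j$(nproc)   # rebuild using all CPU cores
./main --help     # prints the usage text if the build succeeded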
Download a model and place it in the 'models' subfolder.
For example:
https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin
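To fetch it from the terminal instead of the browser, a minimal sketch (assuming wget is installed and the Hugging Face link above is still live):
cd models
wget https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin
cd ..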
Notes: a better model generally means better results.
Yet there are also constraints. For example, the 65B model 'alpaca-lora-65B.ggml.q5_1.bin' (5-bit) takes 49 GB of disk space and requires 51 GB of RAM.
Hopefully in the future we'll find even better ones.
Also, different model files have different requirements: some run on the CPU only, others can also use a GPU (and the brand matters - AMD vs. NVIDIA).
To make the best use of your hardware, check the available models.
The 7B model 'ggml-alpaca-7b-q4.bin' works without any dedicated graphics card, so it's a light one to start with.
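Before grabbing one of the bigger models, it's worth checking what the machine actually has; these are standard Linux tools:
free -h    # total and available RAM
df -h .    # free disk space on the current partition
nproc      # CPU core count (llama.cpp inference runs on CPU threads by default)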
- Update the path, then run (question) prompts from the terminal:
/Documents/Llama/llama.cpp$
make -j && ./main -m ./models/ggml-alpaca-7b-q4.bin -p "What is the best gift for my wife?" -n 512
Result: the model's answer is printed in the terminal (output screenshot omitted here).
Source: https://github.com/ggerganov/llama.cpp
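For a back-and-forth chat instead of a one-shot answer, the same binary also had an interactive mode. A minimal sketch, with flags from the 2023-era builds (options change between versions, so verify against the usage text of your copy):
./main -m ./models/ggml-alpaca-7b-q4.bin --color -i -r "User:" -p "User: What is the best gift for my wife?" -n 256
Here -i keeps the session open, -r "User:" returns control to you whenever the model prints that string, and --color highlights your own input.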
It would be great to:
1. Check for a better (Web)GUI on top of the terminal.
2. Add a persona, like in
https://www.youtube.com/watch?v=nVC9D9fRyNU
from
https://discord.com/channels/1018992679893340160/1094185166060138547/threads/1094187855854719007
P.S. The easiest local AI installation is to download the 'one-click-installers' from
https://github.com/oobabooga/one-click-installers
(and follow the prompt messages).
For Ubuntu / terminal:
$ chmod +x start_linux.sh
$ ./start_linux.sh
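For completeness, the steps from a fresh terminal, as a sketch (assuming start_linux.sh sits at the top level of the repo, as it did at the time):
git clone https://github.com/oobabooga/one-click-installers
cd one-click-installers
chmod +x start_linux.sh   # make the launcher executable
./start_linux.sh          # then follow the on-screen prompts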
Still, it's not a perfect world. My failed attempts included:
- OobaBooga failed on my laptop hardware (no GPU found). Bug reported. And it looks like the model I selected cannot run without an NVIDIA graphics card (a quick check for this is sketched after this list).
- Dalai failed due to folder-permission restrictions and a few version-compatibility issues, so I skipped it, even though it looked promising. https://github.com/cocktailpeanut/dalai
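A quick way to check up front whether a GPU-dependent build even has a chance (nvidia-smi exists only when NVIDIA drivers are installed; lspci is standard on most distros):
nvidia-smi || echo "no usable NVIDIA GPU/driver found - stick to CPU-only builds"
lspci | grep -i vga   # lists whatever graphics hardware is present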