Local-llm-setup for running LLMs in my (MacBook) laptop.
  • Shell 90.9%
  • Python 9.1%
Find a file
2026-06-13 15:44:19 +02:00
scripts bug fixes 2026-06-13 15:44:19 +02:00
.gitignore bug fixes in openwebui 2026-06-13 15:16:59 +02:00
get_model_size.py added usage messages 2026-06-12 15:09:46 +02:00
README.md changed monitoring tool 2026-06-13 14:31:21 +02:00

Local LLM expeiments

Setpup

python3.14 -m venv llm-env
source llm-env/bin/activate
pip install mlx mlx-lm

Then add to the $PATH the variable the path to the scripts folder of this repo.

Choose an appriopriate model:

Use the get_model_size script to have an idea about the total occupance in memory of the model you are going to use:

# usage
./get_model_size.py <hugging_face_model_name>

# example:
./get_model_size.py mistralai/Mistral-7B-v0.1

Getting a model

model="mlx-community/Qwen3-30B-A3B-4bit"
hf download ${model}

Once downloaded, add it into the list of available models in the models.json file, and set it as current model with modelctx script:

Serving a model

mode="mlx-community/Qwen3-8B-8bit"
mlx_lm.server --model ${model}

And check the resource utilizage with

sudo macmon

List of donwloaded models

to keep track and avoid to have many GB of wasted space with things I don't use:

In general check the space occupancy inside:

du -h -d 2 $HOME/.cache/huggingface/

In particular models should be store as folder in the `$HOME/.cache/huggingface/hub** directory

Note after some experiments I would sudgest to check also $HOME/.lmstudio/models