Local-llm-setup for running LLMs in my (MacBook) laptop.
- Shell 90.9%
- Python 9.1%
| scripts | ||
| .gitignore | ||
| get_model_size.py | ||
| README.md | ||
Local LLM expeiments
Setpup
python3.14 -m venv llm-env
source llm-env/bin/activate
pip install mlx mlx-lm
Then add to the $PATH the variable the path to the scripts folder of this repo.
Choose an appriopriate model:
Use the get_model_size script to have an idea about the total occupance in memory of the model you are going to use:
# usage
./get_model_size.py <hugging_face_model_name>
# example:
./get_model_size.py mistralai/Mistral-7B-v0.1
Getting a model
model="mlx-community/Qwen3-30B-A3B-4bit"
hf download ${model}
Once downloaded, add it into the list of available models in the models.json file, and set it as current model with modelctx script:
Serving a model
mode="mlx-community/Qwen3-8B-8bit"
mlx_lm.server --model ${model}
And check the resource utilizage with
sudo macmon
List of donwloaded models
to keep track and avoid to have many GB of wasted space with things I don't use:
In general check the space occupancy inside:
du -h -d 2 $HOME/.cache/huggingface/
In particular models should be store as folder in the `$HOME/.cache/huggingface/hub** directory
Note after some experiments I would sudgest to check also $HOME/.lmstudio/models