How to use LLM locally at home

NeurAI Project
Feb 9, 2024


Running LLMs locally has traditionally been a complex task, requiring programming knowledge and comfort with the command line, using tools such as Ollama, llamafile, or localGPT.

Not everyone has these skills, so it may seem that the only way to try these models is through commercial web services like ChatGPT, Copilot (Bing), and similar platforms.

Among all the options for running LLMs locally, LM Studio stands out.

LM Studio

If we want a graphical option installable on Windows, Linux, and macOS, with free access to the largest model repository, Hugging Face, this is currently the best option.

Features

  • Multiplatform (Windows, Linux, and macOS).
  • Simple and powerful graphical interface.
  • Run LLMs offline.
  • Option to use uncensored models.
  • GPU and CPU acceleration.
  • Download compatible models from Hugging Face.
  • Browse LLM repositories, with download counts shown.
  • Configurable to suit the models you need.

Requirements for using LM Studio

Running LLMs requires a computer with good specifications, since they demand significant compute power and RAM. These are the minimum requirements:

  • Apple Silicon Mac (M1/M2/M3) with macOS 13.6 or newer.
  • Windows/Linux PC with a processor that supports AVX2 (typically newer PCs).
  • 16GB+ of RAM is recommended.
  • For PCs, 6GB+ of VRAM is recommended.
  • NVIDIA/AMD GPUs supported.
  • More than 50GB of disk space; an NVMe SSD is recommended.

Installation

For this part, we will walk through an example on Windows 11 with an Intel i9, 64 GB of RAM, and an NVIDIA RTX 2070.

  1. Go to the LM Studio website.
  2. Download the program for Windows.
  3. Install the downloaded file.
  4. Installation completed.

Using LM Studio

Once installed, you can start using LLMs in just a few steps.

Icons on the left side:

  • 🏠 Home: To return to the main screen.
  • 🔎 Search LLM: Area to search for LLM models in different repositories. It is possible to filter by downloads, likes, or recent.
  • 💬 Chat with LLM: To communicate with the selected LLM.
  • ↔️ Local Server: To set up your own LLM server with the downloaded models.
  • 📂 LLM folder: All downloaded models.
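The Local Server option exposes the loaded model through an OpenAI-compatible HTTP API, so any standard OpenAI client code works against it. Below is a minimal sketch using only the Python standard library; the address `http://localhost:1234/v1` is LM Studio's usual default, but check the Local Server tab for the exact host and port on your machine.

```python
import json
import urllib.request

# Default address of LM Studio's local server (verify in the Local Server tab).
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(prompt, temperature=0.7):
    """Build an OpenAI-style chat-completion payload for the local server.

    LM Studio uses whichever model is currently loaded, so no model
    name is strictly required in the payload.
    """
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
    }

def ask(prompt):
    """Send the prompt to the loaded model and return its reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server started from the ↔️ Local Server screen, calling `ask("What is an LLM?")` returns the model's answer, generated entirely on your own hardware.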

News

The main screen shows news and updates about the application, so you can stay informed about new releases and features.

Searching for an LLM model

Go to search and look for the Mixtral model and filter by the number of downloads:

Upon selecting it, different variants appear for download on the right side. File size depends on the quantization level: less aggressive quantization (more bits per weight) preserves more precision but takes up more space.

We download the highest-precision variant, which occupies 49.62 GB, but if you don’t have enough RAM, you can choose a smaller one.
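As a rule of thumb, a quantized model's file size is roughly its parameter count times the bits stored per weight. A minimal sketch of that arithmetic (the ~46.7B total parameter count for Mixtral 8x7B is an approximation, and real model files run a few percent larger):

```python
def approx_model_size_gb(n_params_billion, bits_per_weight):
    """Rough lower-bound estimate of a quantized model's file size.

    size ≈ parameters × bits per weight / 8, in gigabytes (10^9 bytes).
    Actual files are slightly larger due to metadata and layers kept
    at higher precision.
    """
    return n_params_billion * bits_per_weight / 8

# Mixtral 8x7B has roughly 46.7B parameters in total.
print(approx_model_size_gb(46.7, 8.0))  # ~46.7 GB at 8-bit
print(approx_model_size_gb(46.7, 4.0))  # ~23.4 GB at 4-bit
```

This explains why the 8-bit download weighs in near 50 GB, while a 4-bit variant of the same model fits in roughly half the space and RAM.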

With a 1 Gb/s connection, the download took me about 20 minutes to complete.

Selecting an LLM model to work with

On the 💬 CHAT screen, it is necessary to select the model you want to use.

  • Select a model to load: Choose the model you want and wait for it to load into memory; the screen should then look similar to the image below.

Using LLM models offline

And that would be all. On the chat screen, you can write to the model and, using your own hardware, get responses to your questions.

My advice is to try several models, as some are specialized for tasks like translation or programming. The number of downloads is usually a good indicator of a model’s quality.

If you have a reasonably powerful GPU, you can take advantage of it by enabling GPU offloading, which speeds up the LLM’s responses to your questions.

Conclusion

This has been a brief guide showing that using LLMs does not require extensive programming knowledge or command-line experience; the latest models can be run easily on your own hardware.

It also shows the potential that open-source models hold for NeurAI: models we can train for our own needs, distributing the compute across the network itself.



NeurAI Project

NeurAI is a Layer-1, ASIC-resistant blockchain with on-chain NFT/FT support, focused on IoT and AI.