Introduction to Local LLMs – by Iulian

LLMs are a very popular topic right now, so Iulian Arcus covered how to regain sovereignty over tools many of us may soon be using day to day.

Regardless of the quality of the tools, the discourse seems to be heading towards LLMs becoming more integrated into our daily tasks. But with models changing every month, sometimes without recourse, and funding for the big providers uncertain, having control over your tools becomes more important.

We covered how to install LM Studio, a simple interface backed by the open-source library llama.cpp, and discussed how to set it up with small models such as Llama 3.1 Instruct 8B. We broke down what the parts of that name mean (the model family and version, the instruction-tuned variant, and the 8-billion-parameter size) and what to look out for when downloading models to best fit your needs.
With a black box sitting next to his laptop, Iulian explained that even a laptop setup can see around a 5x speedup by using an external GPU connected over Thunderbolt 3/USB 4.
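
If you prefer to fetch model files outside the LM Studio interface, the sketch below uses the huggingface_hub Python library to download a quantised GGUF build. The repo id and file name are illustrative placeholders, not something from the session, so check which GGUF repository and quantisation actually exist for the model you want.

```python
# Sketch: download a quantised GGUF build of a small instruct model.
# The repo_id and filename are placeholders; browse Hugging Face for the
# exact GGUF repository and quantisation level you want before running this.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",  # assumed repo id
    filename="Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",    # assumed file name
)
print(f"Model downloaded to: {model_path}")
```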

Afterwards we looked at LM Studio’s ability to run as a local API server and connected it to VS Code via the Continue plugin.
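
Before pointing an editor plugin at it, you can sanity-check the server with a few lines of Python. LM Studio exposes an OpenAI-compatible API; the address, placeholder key, and model name below are assumptions based on the commonly cited defaults, so adjust them to match your local setup. Continue can then be configured against the same endpoint.

```python
# Sketch: query LM Studio's local server, which exposes an OpenAI-compatible
# API (assumed default address: http://localhost:1234/v1 once the server runs).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # assumed default LM Studio address
    api_key="lm-studio",                  # placeholder; no real key is needed locally
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",        # assumed identifier of the loaded model
    messages=[{"role": "user", "content": "Explain what a GGUF file is."}],
)
print(response.choices[0].message.content)
```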

The demos worked, but the output from the LLMs was not very usable. Iulian explained that models under 8B parameters generally don’t perform well, although they can still suggest good terms to search for when you’re unfamiliar with a topic; and remember to always double-check any commands before running them on your system. The coding examples also threw some errors, and feeding the errors back to the model didn’t fix the issue immediately. (The next day, once the fog of the live demo had cleared, it turned out the second error was just the code asking for a file as input, so the code actually works!)

The evening finished with a discussion of the participants’ experiences with LLMs, and the consensus was that the online models provide better results. Web technologies and Python seem to be well supported, while more difficult or niche languages suffer. Iulian made the analogy that you wouldn’t hire a Python developer for a C++ job, so why expect a model to be any more capable?

The local LLM space is exciting, but expectations currently run ahead of the results. The session ended on a good note: the techniques are getting better, so newer models of the same size are more capable. And now we have the foundations to evaluate them for ourselves.

Note: We attempted to record this session, but the setup failed to capture anything beyond the introduction. We may still upload that part.