Local models

Obsidian runs entirely locally, so why shouldn’t Enzyme work well with offline models, too? Let’s give it a try. This involves installing software that runs the model and exposes it through an OpenAI-compatible endpoint.

Currently, the best-supported configuration for locally-run Enzyme is as follows:

  • Model: Mistral-7B Instruct v0.2, running the Q8_0 quantization on my M1 Max 64GB machine (supports a 32k context)
  • Host: LM Studio, using a custom Mistral Instruct preset with specific logit biasing to coax the model into generating references (see the sketch after this list)
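
For reference, a request against such an endpoint looks roughly like the sketch below. This is a minimal illustration, not Enzyme’s actual request: it assumes LM Studio’s default server address (http://localhost:1234), and the token IDs in logit_bias are placeholders rather than the values shipped with the preset.

```ts
// Minimal sketch: a chat-completions call against a local OpenAI-compatible
// server. Assumes LM Studio's default address (http://localhost:1234).
async function testLocalCompletion(): Promise<string> {
  const response = await fetch("http://localhost:1234/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      // Use whatever identifier LM Studio shows for the loaded model
      model: "mistral-7b-instruct-v0.2",
      messages: [
        {
          role: "user",
          content: "Summarize these notes and cite them with block references.",
        },
      ],
      temperature: 0.2,
      // Biasing specific token IDs upward nudges the model toward emitting
      // reference markers; the IDs below are illustrative only.
      logit_bias: { "28792": 10 },
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}
```

If a call like this succeeds, the same Base URL and model identifier are what the Enzyme model config for LM Studio points at.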

Instructions for setting up offline Enzyme with LM Studio

  1. Download and install LM Studio (https://lmstudio.ai).

In LM Studio:

  1. Search for Mistral-7B Instruct v0.2 and download the desired model
  2. In the “Local Inference Server” tab, download and import the Mistral Instruct (Enzyme) preset.
  3. Set the context window to a desired value, e.g. 32768.
  4. Reload the model if required.
  5. Start the model server.
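
Before moving on to Obsidian, it can help to confirm the server is actually reachable. A minimal sketch, assuming LM Studio’s default port of 1234:

```ts
// Connectivity check: list the models the local server exposes.
// Assumes LM Studio's default address; adjust the port if you changed it.
async function listLocalModels(): Promise<void> {
  const response = await fetch("http://localhost:1234/v1/models");
  if (!response.ok) {
    throw new Error(`Local server responded with ${response.status}`);
  }
  const data = await response.json();
  console.log(data.data.map((m: { id: string }) => m.id));
}

listLocalModels();
```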

In Obsidian:

  1. In Obsidian’s settings for the Enzyme plugin, there should already be a model config for LM Studio (its Base URL should match LM Studio’s local server address). Click “Select” to activate that model config.

You should be all set!

Notes

Future work could streamline the setup process as the underlying tech improves:

  • The node-llama-cpp library would avoid the need for a separate application to act as an LLM server, though I’m not sure how well Obsidian plugins that require a binary package to be installed are supported. Integration of logit-biasing support within the library could help as well.
  • I’ll be watching WASM/WebGPU advancements and their support within Electron.
  • Currently, logit-biasing is pretty crucial to Enzyme working as intended. To my knowledge, Nitro and Ollama do not support it.