Get started with prompt engineering using local LLMs
Ollama is an application for running large language models locally on your computer. It gives you access to open-source LLMs that you can prompt directly from the command line or through a local HTTP endpoint.
To get started with local LLMs:
- Download and install Ollama: https://ollama.com/download
- When prompted, install the `ollama` CLI.
- Download and run your first LLM: `ollama run llama2`
- Send your first prompt: “What is the chief end of man?”
The response will be printed to the console.
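The CLI isn’t the only interface: Ollama also serves a local HTTP API on port 11434, which is what the scripting setup below builds on. As a quick illustration, here’s a minimal sketch that sends the same prompt to Ollama’s `/api/generate` endpoint using only Python’s standard library (the payload shape follows Ollama’s REST API; `"stream": False` asks for a single JSON response rather than a token stream):

```python
# A minimal sketch of prompting Ollama over HTTP, no extra packages needed.
# Assumes Ollama is running locally and llama2 has been pulled.
import json
import urllib.request

payload = {
    "model": "llama2",
    "prompt": "What is the chief end of man?",
    "stream": False,  # one JSON object instead of a token stream
}
request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.load(response)["response"])
```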
Using the CLI is nice, but a better option is to create and send prompts with a scripting language. I’m going to use Python and OpenAI’s chat completions API, since that’s a popular combination. For an example with JavaScript, see OpenAI’s documentation.
- Create a new Python file: `touch completions.py`
- Install the `openai` package: `pip3 install openai`
- Set up your OpenAI client, pointing it at Ollama’s OpenAI-compatible endpoint:

```python
# completions.py
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # 11434 looks like "llama"
    api_key="ollama",  # Unused, but required
)
```
- Create your first completion:

```python
# This function is not required, but it's nice to have
def get_completion(prompt, model="llama2", temperature=0.0):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )
    return response.choices[0].message.content

response = get_completion("What is the chief end of man?")
print(response)
```
- Run your script: `python3 completions.py`
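One thing the `get_completion` helper glosses over is that `messages` is a full conversation history, not just one question. Here’s a minimal sketch, appended to the same `completions.py`, that adds a system message to steer the model and then feeds its reply back in for a follow-up turn (the particular prompts are just examples):

```python
# The messages list is a conversation history: a system message steers the
# model, and prior turns give it context for follow-up questions.
messages = [
    {"role": "system", "content": "Answer in a single sentence."},
    {"role": "user", "content": "What is the chief end of man?"},
]
response = client.chat.completions.create(
    model="llama2",
    messages=messages,
    temperature=0.0,
)
reply = response.choices[0].message.content
print(reply)

# Append the model's reply, then ask a follow-up that depends on it
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Where does that phrasing come from?"})
follow_up = client.chat.completions.create(model="llama2", messages=messages)
print(follow_up.choices[0].message.content)
```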
That’s all it takes! For a good introduction to prompt writing, I recommend DeepLearning.AI’s ChatGPT Prompt Engineering for Developers course.
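One last knob worth playing with as you work through that course: the `temperature` parameter that `get_completion` exposes. Roughly, 0.0 keeps output close to deterministic, while higher values make it more varied. A quick sketch, again in `completions.py` (the prompt and values are arbitrary examples):

```python
# Rerun the same prompt at a few temperatures to see sampling variety.
for temp in (0.0, 0.7, 1.2):
    print(f"--- temperature={temp} ---")
    print(get_completion("Describe the sky in one sentence.", temperature=temp))
```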