Homework: Open-Source LLMs

In this homework, we'll experiment more with Ollama.

It's possible that your answers won't match exactly. If that's the case, select the closest one.

Solution: https://www.loom.com/share/f04a63aaf0db4bf58194ba425f1fcffa

Q1. Running Ollama with Docker

Let's run ollama with Docker. We will need to execute the same command as in the lectures:

docker run -it \
    --rm \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama

What's the version of the ollama client?

To find out, enter the container and execute ollama with the -v flag.
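For example, you can run the command through docker exec (assuming the container is named ollama, as in the command above):

docker exec -it ollama ollama -v

Alternatively, open a shell in the container with docker exec -it ollama bash and run ollama -v there.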

Q2. Downloading an LLM

We will download a smaller LLM: gemma:2b.

Again, let's enter the container and pull the model:

ollama pull gemma:2b

Inside the container, Ollama saves the results to /root/.ollama.

We're interested in the metadata about this model. You can find it in models/manifests/registry.ollama.ai/library.

What's the content of the file related to gemma?
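One way to inspect it is with docker exec; a sketch, assuming the manifest for the 2b tag lives directly under the gemma directory (the exact file name may differ in your setup):

docker exec -it ollama cat /root/.ollama/models/manifests/registry.ollama.ai/library/gemma/2b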

Q3. Running the LLM

Test the following prompt: "10 * 10". What's the answer?
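You can run the model interactively inside the container, for example:

docker exec -it ollama ollama run gemma:2b

Then type the prompt at the interactive prompt (use /bye to exit).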

Q4. Downloading the weights

We don't want to pull the weights every time we run a Docker container. Let's do it once and have them available every time we start a container.

First, we will need to change how we run the container.

Instead of mapping the /root/.ollama folder to a named volume, let's map it to a local directory:

mkdir ollama_files

docker run -it \
    --rm \
    -v ./ollama_files:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama

Now pull the model:

docker exec -it ollama ollama pull gemma:2b 

What's the size of the ollama_files/models folder?

  • 0.6G
  • 1.2G
  • 1.7G
  • 2.2G

Hint: on Linux, you can use du -h for that.
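For example, from the directory where you created ollama_files:

du -h ollama_files/models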

Q5. Adding the weights

Let's now stop the container and add the weights to a new image.

For that, let's create a Dockerfile:

FROM ollama/ollama

COPY ...

What do you put after COPY?

Q6. Serving it

Let's build it:

docker build -t ollama-gemma2b .

And run it:

docker run -it --rm -p 11434:11434 ollama-gemma2b

We can connect to it using the OpenAI client.

Let's test it with the following prompt:

prompt = "What's the formula for energy?"

Also, to make results reproducible, set the temperature parameter to 0:

response = client.chat.completions.create(
    #...
    temperature=0.0
)
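Below is a minimal sketch of the full request, assuming the container is exposed on localhost:11434 and we talk to Ollama's OpenAI-compatible endpoint (the api_key value is just a placeholder; Ollama doesn't validate it):

from openai import OpenAI

# point the OpenAI client at the local Ollama server
client = OpenAI(
    base_url="http://localhost:11434/v1/",
    api_key="ollama",  # placeholder; not checked by Ollama
)

prompt = "What's the formula for energy?"

response = client.chat.completions.create(
    model="gemma:2b",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.0,
)

print(response.choices[0].message.content)

The number of completion tokens is available in response.usage.completion_tokens.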

How many completion tokens did you get in response?

  • 304
  • 604
  • 904
  • 1204

Submit the results