Getting llama.cpp
We can get llama.cpp through a package manager, as a Docker image, or as prebuilt binaries, as explained in the documentation. But we are going with the more flexible, and slightly more complex, option: compiling from source.
We download the code:
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
We also have to make sure that we have all the basic tools we need:
sudo apt install -y build-essential cmake
build-essential includes the standard C and C++ compilers and build tools. We also need cmake because llama.cpp uses CMake as its build system.
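Before moving on, it can help to confirm that the tools are actually on the PATH. This is only a sketch: check_tools is a helper name I made up for this post, not something that ships with llama.cpp.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# git for cloning, gcc/g++/make from build-essential, cmake for configuring.
check_tools git gcc g++ make cmake || echo "Install the missing tools before building."
```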
Building llama.cpp
The build instructions are in the llama.cpp documentation. Since I have an Nvidia card, I’m using the CUDA option.
My first attempt failed because I used the CUDA toolkit from the Ubuntu repo:
sudo apt install nvidia-cuda-toolkit
To compile the CUDA backend, llama.cpp uses nvcc, the compiler driver that orchestrates the compilation of CPU and GPU code. Unfortunately, the version in the Ubuntu repo (when I tried it) was CUDA 12, which includes nvcc 12. That nvcc failed with my RTX 5060 Ti, so I had to go to the NVIDIA site and install the latest version of the toolkit.
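To catch this kind of mismatch before a long build, we can compare the installed nvcc release against a minimum. The sketch below makes a few assumptions: the helper names are my own, the comparison relies on GNU sort -V, and the 12.8 threshold is my understanding of the first toolkit that supports this GPU generation, so adjust it for your card.

```shell
# Hypothetical helper: pull "X.Y" out of the "release X.Y" line printed by `nvcc --version`.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# True if version $1 is at least version $2 (relies on GNU `sort -V`).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: minimum toolkit version for this GPU generation; adjust as needed.
MIN_CUDA=12.8

if command -v nvcc >/dev/null 2>&1; then
  installed=$(nvcc --version | nvcc_release)
  if version_at_least "$installed" "$MIN_CUDA"; then
    echo "nvcc $installed looks recent enough"
  else
    echo "nvcc $installed is older than $MIN_CUDA; install a newer toolkit from NVIDIA"
  fi
else
  echo "nvcc not found on PATH"
fi
```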
The next step is configuring the build (making sure we are in the llama.cpp folder we got from git):
cmake -B build -DGGML_CUDA=ON
This command tells CMake to generate the build files in the build folder and to enable CUDA support.
I got an error at this step because CMake couldn’t find nvcc in the path. I had to set the CUDACXX environment variable manually to the nvcc location, in my case /usr/local/cuda-13.0/bin/nvcc.
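For reference, this is roughly what the fix looks like in the shell. Treat it as a sketch: the path is the one from my machine, CUDA_HOME is just a convenience variable I introduce here, and the guard around cmake is only there so the snippet does nothing harmful when run from the wrong folder.

```shell
# Path from my machine; adjust to wherever your toolkit is installed.
CUDA_HOME=/usr/local/cuda-13.0

# CMake reads the CUDACXX environment variable to locate the CUDA compiler.
export CUDACXX="$CUDA_HOME/bin/nvcc"
# Optional: also put the toolkit's bin directory on PATH for this shell.
export PATH="$CUDA_HOME/bin:$PATH"

# Only configure when nvcc exists and we are in the llama.cpp source folder.
if [ -x "$CUDACXX" ] && [ -f CMakeLists.txt ]; then
  cmake -B build -DGGML_CUDA=ON
else
  echo "Skipping configure: check CUDACXX and the current directory" >&2
fi
```

An alternative that avoids touching the environment is passing the compiler path directly to CMake with -DCMAKE_CUDA_COMPILER.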
We can now build llama.cpp:
cmake --build build --config Release
It took over 16 minutes on my computer. To speed up the process, we can tell CMake to use several cores in parallel with the -j parameter:
cmake --build build --config Release -j 8
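Instead of hardcoding 8, we can ask the system how many cores are available. A small sketch, assuming nproc from GNU coreutils is installed (it is on a standard Ubuntu system); JOBS is just a variable name I picked.

```shell
# Number of CPU cores available to this process.
JOBS=$(nproc)
echo "Building with $JOBS parallel jobs"

# Only build if the configure step has already populated the build folder.
if [ -f build/CMakeCache.txt ]; then
  cmake --build build --config Release -j "$JOBS"
else
  echo "No configured build folder here; run the configure step first" >&2
fi
```

Each parallel compile job uses memory of its own, so on a machine with little RAM a lower number may be the safer choice.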
Running a model
To test llama.cpp, I’m going to use Qwen3 8B, a reasoning model with 8.19B parameters that is popular for general use. I’m going with the GGUF version from unsloth: unsloth/Qwen3-8B-GGUF
To download it, I’ll first install this HuggingFace tool:
pip install -U "huggingface_hub[cli]"
This tool manages authentication, caching, and versioning, which is useful if we start downloading lots of models.
Now just run
hf download unsloth/Qwen3-8B-GGUF --local-dir ./Qwen3-8B-GGUF
and we are good to go.
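Once the download finishes, a quick listing shows what arrived and how large each file is, which helps when choosing one that fits in GPU memory. This is a sketch with a helper name of my own (list_gguf), and it assumes GNU find, du and sort.

```shell
# Hypothetical helper: list the .gguf files in a folder with
# human-readable sizes, smallest first.
list_gguf() {
  dir=${1:-./Qwen3-8B-GGUF}
  # -L follows symlinks in case the files are links into a cache.
  find "$dir" -maxdepth 1 -name '*.gguf' -exec du -hL {} + | sort -h
}

if [ -d ./Qwen3-8B-GGUF ]; then
  list_gguf ./Qwen3-8B-GGUF
fi
```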
We’ll find several different .gguf files in the Qwen3-8B-GGUF folder, corresponding to the different quantization levels, so we pick one and run it like this:
./llama.cpp/build/bin/llama-cli -m ./Qwen3-8B-GGUF/Qwen3-8B-Q8_0.gguf
The client will log a lot of initialization info, followed by something like this:
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to the AI.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
- Not using system message. To change it, set a different value via -sys PROMPT
>
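The startup message mentions the -sys option for setting a system message. A minimal sketch of using it follows; the system prompt text is only an example of mine, and the paths are the ones used earlier in this post.

```shell
MODEL=./Qwen3-8B-GGUF/Qwen3-8B-Q8_0.gguf
LLAMA_CLI=./llama.cpp/build/bin/llama-cli
# Example system message; replace with whatever behavior you want.
SYSTEM_PROMPT="You are a concise assistant. Answer in one short paragraph."

# Start the interactive client only when both the binary and the model exist.
if [ -x "$LLAMA_CLI" ] && [ -f "$MODEL" ]; then
  "$LLAMA_CLI" -m "$MODEL" -sys "$SYSTEM_PROMPT"
else
  echo "llama-cli or the model file was not found; check the paths" >&2
fi
```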
Qwen3-8B has a built-in chat template, allowing it to start in conversation mode instead of just text completion mode. Let’s try it with a riddle:
A man is looking at a picture of someone. His friend asks who it is. The man replies, “Brothers and sisters, I have none. But that person’s father is my father’s son.” Who is in the picture?
The answer is quite verbose but correct: the person in the picture is the man’s child (strictly speaking, it could be a son or a daughter). I include the full answer at the end of the post.
The performance seems reasonable. To get specific numbers, llama.cpp includes a tool to measure tokens per second:
./llama.cpp/build/bin/llama-bench -m ./Qwen3-8B-GGUF/Qwen3-8B-Q8_0.gguf
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes
| model | size | params | backend | ngl | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen3 8B Q8_0 | 8.11 GiB | 8.19 B | CUDA | 99 | pp512 | 3302.48 ± 7.60 |
| qwen3 8B Q8_0 | 8.11 GiB | 8.19 B | CUDA | 99 | tg128 | 48.26 ± 0.03 |
build: 4b2dae38 (6740)
The model processes a 512-token prompt at more than three thousand tokens per second (the pp512 row) and generates output at 48 tokens per second (the tg128 row). Good enough.
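When comparing several models or quantizations, it is handy to reduce that table to just the test name and the mean tokens per second. The following is a sketch: bench_summary is a helper I made up, and it assumes the Markdown table layout shown above, with the test name in the sixth column and the t/s value in the seventh.

```shell
# Hypothetical helper: read llama-bench's Markdown table on stdin and
# print one "test<TAB>tokens-per-second" line per result row.
bench_summary() {
  awk -F'|' '
    NF >= 8 {
      # Field 8 holds "mean ± stddev"; keep only the mean.
      n = split($8, parts, " ")
      if (n >= 1 && parts[1] ~ /^[0-9]+(\.[0-9]+)?$/) {
        name = $7
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        printf "%s\t%s\n", name, parts[1]
      }
    }'
}

# Demo with the table from the run above.
bench_summary <<'EOF'
| model | size | params | backend | ngl | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen3 8B Q8_0 | 8.11 GiB | 8.19 B | CUDA | 99 | pp512 | 3302.48 ± 7.60 |
| qwen3 8B Q8_0 | 8.11 GiB | 8.19 B | CUDA | 99 | tg128 | 48.26 ± 0.03 |
EOF
```

The demo prints pp512 with 3302.48 and tg128 with 48.26, one pair per line. In real use, the output of llama-bench can be piped straight into bench_summary.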
In the next post, we’ll try to replicate the process with a different inference library.
The answer to the riddle
Thinking
Okay, so there’s this riddle: A man is looking at a picture of someone. His friend asks who it is. The man replies, “Brothers and sisters, I have none. But that person’s father is my father’s son.” Who is in the picture?
Hmm, let’s try to break this down. First, the man says he has no brothers or sisters. So, he’s an only child. That means his parents don’t have any other children besides him. Now, the second part is tricky: “that person’s father is my father’s son.”
Let me parse that sentence. The person in the picture has a father. And that father is the son of the man’s father. Wait, so the person’s father is the son of the man’s father. Let me think about the relationships here.
So, if the man’s father has a son, that would be the man himself, right? Because he has no brothers or sisters. So, the man’s father’s son is the man. Therefore, the person’s father is the man. Wait, so that means the person in the picture is the man’s son? Because the person’s father is the man.
But hold on, the man says he has no brothers or sisters, but he could have a son. So if the person in the picture is his son, then the person’s father is the man, which would be the man’s father’s son. Because the man is the son of his father. So that fits.
But wait, could there be another possibility? Let me think again. If the person in the picture is the man’s son, then the person’s father is the man, and the man’s father’s son is the man himself. So that works.
Alternatively, could the person in the picture be the man’s father? Let me check. If the person’s father is the man’s father’s son. So the person’s father would be the man’s father’s son. Since the man has no brothers or sisters, the man’s father’s son is only the man himself. Therefore, the person’s father is the man. Therefore, the person in the picture is the man’s son.
Wait, so the person in the picture is the man’s son. But another thought: what if the person is the man himself? But the man is looking at a picture of someone else, right? Because he says “who is it,” and he’s looking at a picture. So the person in the picture can’t be himself.
Wait, but maybe the person in the picture is the man’s father? Let me test that scenario. If the person in the picture is the man’s father, then the person’s father would be the man’s father’s father, right? But the man’s father’s son is the man himself. So that would mean that the person’s father (the man’s father’s father) is the man’s father’s son (which is the man). That can’t be, because the man’s father’s son is the man, not the man’s father’s father. So that doesn’t work.
Alternatively, if the person is the man’s son, then the person’s father is the man, and the man’s father’s son is the man. So that fits. Therefore, the person in the picture is the man’s son.
Wait, but maybe the person is the man’s father’s son? Which would be the man. But he’s looking at a picture of someone else, so that’s not possible.
Alternatively, could there be a case where the person is the man’s son? Let me confirm again. Let’s draw a family tree.
The man has no brothers or sisters. So his father has only one son, which is the man. So the man’s father’s son is the man. Therefore, the person’s father is the man. Therefore, the person is the man’s son.
So the answer would be the man’s son. But wait, the riddle is a bit tricky. Sometimes these riddles have unexpected answers. Let me think again.
Another angle: the man says “that person’s father is my father’s son.” So, the person’s father is the son of the man’s father. So the person’s father is the man. Because the man is the son of his father. Therefore, the person is the man’s son.
Yes, that seems to fit. So the person in the picture is the man’s son.
But wait, could the person be the man’s father? Let me check. If the person is the man’s father, then the person’s father is the man’s father’s father. But the man’s father’s son is the man. So unless the man is the father of the person, which would make the person the man’s son.
So I think the answer is the man’s son.
But sometimes, these riddles have a twist. Like, the person in the picture is the man himself. But he’s looking at a picture, so it’s someone else. So the answer must be his son.
Alternatively, maybe the person is his nephew? Wait, but the man has no brothers or sisters, so he can’t have a nephew. Because a nephew would be the son of his brother or sister. Since he has none, no nephews.
Alternatively, could the person be his father? Let’s check again. If the person is his father, then that person’s father is the man’s father’s father. But the man’s father’s son is the man. Therefore, the person’s father (the man’s father’s father) is not the same as the man. So that doesn’t fit.
So the only possibility is that the person is the man’s son. Therefore, the answer is the man’s son.
Wait, but sometimes these riddles play with the fact that the person could be the man’s father’s son, which is the man himself. But since the man is looking at a picture, the person can’t be himself. So maybe the answer is the man’s son.
Yes, I think that’s it. Let me just make sure there’s no other possibility.
Another way to think: the man’s father’s son. Since the man has no brothers, that son is the man. So the person’s father is the man. Therefore, the person is the man’s son.
Yes, that’s correct. So the answer should be the man’s son.
The riddle presents a seemingly complex family relationship, but it can be unraveled by carefully analyzing the statements.
End thinking
Step-by-Step Breakdown:
- The man has no brothers or sisters. This means that the man is an only child. Therefore, his father has only one son, the man himself.
- The key statement: “that person’s father is my father’s son.”
Let’s break this down:
- “My father’s son” refers to the man himself (since he’s the only son).
- So, the person in the picture has a father, and that father is the man.
- Therefore, the person in the picture is the man’s son. This is the only logical conclusion that fits all the given conditions.
Why Other Possibilities Don’t Work:
- Could it be the man’s father? If the person were the man’s father, then the person’s father would be the man’s father’s father, which is not the same as the man. This contradicts the statement.
- Could it be the man himself? The riddle says the man is looking at a picture of someone, so the person in the picture cannot be himself.
- Could it be the man’s nephew or cousin? Since the man has no brothers or sisters, he has no nephews or cousins either. So this is not a valid possibility.
Final Conclusion:
The only person who fits the description and satisfies all the conditions in the riddle is the man’s son.
\[\boxed{\text{The person in the picture is the man's son.}}\]