Add pipeline guide and enhance CMake configuration for llama integration

Author: Aaron Po
Date: 2026-03-28 14:16:31 -04:00
parent ad1adfeb62
commit 7f1ca2050c
4 changed files with 651 additions and 76 deletions

pipeline/README.md — new file (128 lines)

@@ -0,0 +1,128 @@
# Pipeline Guide
This guide documents the end-to-end pipeline workflow for:
- Building the C++ pipeline executable
- Installing a lightweight GGUF model for llama.cpp
- Running the pipeline with either the default or an explicit model path
- Re-running from a clean build directory
## Prerequisites
- CMake 3.20+
- A C++ compiler (Apple Clang on macOS works)
- Internet access to download model files
- Hugging Face CLI (`hf`) from `huggingface_hub`
## Build
From repository root:
```bash
cmake -S pipeline -B pipeline/dist
cmake --build pipeline/dist -j4
```
Expected executable:
- `pipeline/dist/biergarten-pipeline`
## Install Hugging Face CLI
Recommended on macOS:
```bash
brew install pipx
pipx ensurepath
pipx install huggingface_hub
```
If your shell cannot find `hf`, use the full path:
- `~/.local/bin/hf`
## Install a Lightweight Model (POC)
The recommended proof-of-concept model is:
- `Qwen/Qwen2.5-0.5B-Instruct-GGUF`
- File: `qwen2.5-0.5b-instruct-q4_k_m.gguf`
From `pipeline/dist`:
```bash
cd pipeline/dist
mkdir -p models
~/.local/bin/hf download Qwen/Qwen2.5-0.5B-Instruct-GGUF qwen2.5-0.5b-instruct-q4_k_m.gguf --local-dir models
```
## Run
### Option A: Explicit model path (recommended)
```bash
cd pipeline/dist
./biergarten-pipeline --model models/qwen2.5-0.5b-instruct-q4_k_m.gguf
```
### Option B: Default model path
To use the default startup behavior, place a model at:
- `pipeline/dist/models/llama-2-7b-chat.gguf`
Then run:
```bash
cd pipeline/dist
./biergarten-pipeline
```
## Output Files
The pipeline writes output to:
- `pipeline/dist/output/breweries.json`
- `pipeline/dist/output/beer-styles.json`
- `pipeline/dist/output/beer-posts.json`
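All three files should be well-formed JSON. A quick validity check (a minimal sketch, assuming `python3` is on `PATH` and you are in `pipeline/dist`; the file list mirrors the outputs above):

```shell
# check_outputs: report whether each expected output file parses as JSON.
# Paths are relative to pipeline/dist, matching the run instructions above.
check_outputs() {
  for f in output/breweries.json output/beer-styles.json output/beer-posts.json; do
    if python3 -m json.tool "$f" > /dev/null 2>&1; then
      echo "ok: $f"
    else
      echo "invalid or missing: $f"
    fi
  done
}
check_outputs
```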
## Clean Re-run Process
To rebuild and re-run from a clean `dist` state:
```bash
rm -rf pipeline/dist
cmake -S pipeline -B pipeline/dist
cmake --build pipeline/dist -j4
cd pipeline/dist
mkdir -p models
~/.local/bin/hf download Qwen/Qwen2.5-0.5B-Instruct-GGUF qwen2.5-0.5b-instruct-q4_k_m.gguf --local-dir models
./biergarten-pipeline --model models/qwen2.5-0.5b-instruct-q4_k_m.gguf
```
## Troubleshooting
### `zsh: command not found: huggingface-cli`
Recent versions of `huggingface_hub` install the CLI entry point as `hf`, not `huggingface-cli`.
Use:
```bash
~/.local/bin/hf --help
```
### `Model file not found ...`
- Confirm you are running from `pipeline/dist`.
- Confirm the file path passed to `--model` exists.
- If not using `--model`, ensure the default file exists at `models/llama-2-7b-chat.gguf` relative to the current working directory.
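The lookup logic above can be made explicit with a small wrapper (a sketch for debugging, not shipped with the pipeline; it assumes the default path documented above):

```shell
# resolve_model: print the model path the pipeline will use, or fail loudly.
# An explicit argument wins; otherwise fall back to the default
# models/llama-2-7b-chat.gguf relative to the current working directory.
resolve_model() {
  model="${1:-models/llama-2-7b-chat.gguf}"
  if [ ! -f "$model" ]; then
    echo "Model file not found: $model (cwd: $(pwd))" >&2
    return 1
  fi
  echo "$model"
}
```

For example, `resolve_model models/qwen2.5-0.5b-instruct-q4_k_m.gguf` either echoes the path (safe to pass to `--model`) or tells you exactly which path was checked and from where.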
### CMake cache/path mismatch
Use explicit source/build paths:
```bash
cmake -S /absolute/path/to/pipeline -B /absolute/path/to/pipeline/dist
cmake --build /absolute/path/to/pipeline/dist -j4
```