# Pipeline Guide

This guide documents the end-to-end pipeline workflow for:

- Building the C++ pipeline executable
- Installing a lightweight GGUF model for llama.cpp
- Running the pipeline with either the default or an explicit model path
- Re-running from a clean build directory

## Prerequisites

- CMake 3.20+
- A C++ compiler (Apple Clang on macOS works)
- Internet access to download model files
- Hugging Face CLI (`hf`) from `huggingface_hub`

## Build

From the repository root:

```bash
cmake -S pipeline -B pipeline/dist
cmake --build pipeline/dist -j4
```

Expected executable:

- `pipeline/dist/biergarten-pipeline`

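A quick way to confirm the build step succeeded is to check for that executable from the repository root (the path below matches the Build step above):

```bash
# Check that the build produced an executable at the expected path.
BIN=pipeline/dist/biergarten-pipeline
if [ -x "$BIN" ]; then
  echo "pipeline executable is ready: $BIN"
else
  echo "pipeline executable missing; re-run the build" >&2
fi
```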
## Install Hugging Face CLI

Recommended on macOS:

```bash
brew install pipx
pipx ensurepath
pipx install huggingface_hub
```

If your shell cannot find `hf`, use the full path:

- `~/.local/bin/hf`

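Alternatively, you can put pipx's bin directory on `PATH` for the current shell session. This is a sketch; `~/.local/bin` is pipx's usual default, but your install may use a different directory:

```bash
# Put pipx's default bin directory on PATH for this shell session.
export PATH="$HOME/.local/bin:$PATH"
```

`pipx ensurepath` (run above) makes this permanent for new shells; the `export` only covers the current session.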
## Install a Lightweight Model (POC)

The recommended proof-of-concept model is:

- `Qwen/Qwen2.5-0.5B-Instruct-GGUF`
- File: `qwen2.5-0.5b-instruct-q4_k_m.gguf`

From the repository root:

```bash
cd pipeline/dist
mkdir -p models
~/.local/bin/hf download Qwen/Qwen2.5-0.5B-Instruct-GGUF qwen2.5-0.5b-instruct-q4_k_m.gguf --local-dir models
```

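Before running, it can be worth confirming the download completed. This sketch, run from `pipeline/dist`, just checks that the file landed where the run commands expect it:

```bash
# Verify the GGUF file exists where --model will look for it.
MODEL=models/qwen2.5-0.5b-instruct-q4_k_m.gguf
if [ -f "$MODEL" ]; then
  ls -lh "$MODEL"   # the q4_k_m quant of the 0.5B model is a few hundred MB
else
  echo "download incomplete: $MODEL not found" >&2
fi
```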
## Run

### Option A: Explicit model path (recommended)

```bash
cd pipeline/dist
./biergarten-pipeline --model models/qwen2.5-0.5b-instruct-q4_k_m.gguf
```

### Option B: Default model path

To use the default startup behavior, place a model at:

- `pipeline/dist/models/llama-2-7b-chat.gguf`

Then run:

```bash
cd pipeline/dist
./biergarten-pipeline
```

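If you want Option B's zero-argument startup but have only the Qwen POC model, one possible workaround is a symlink at the default path. This assumes the pipeline resolves only the file path and does not care about the model family; verify against your build before relying on it:

```bash
# Hypothetical workaround: satisfy the default path with the Qwen model.
# Run from the repository root; the link target is relative to models/.
mkdir -p pipeline/dist/models
ln -sf qwen2.5-0.5b-instruct-q4_k_m.gguf pipeline/dist/models/llama-2-7b-chat.gguf
```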
## Output Files

The pipeline writes output to:

- `pipeline/dist/output/breweries.json`
- `pipeline/dist/output/beer-styles.json`
- `pipeline/dist/output/beer-posts.json`

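After a run, a quick sanity check is to confirm each output file parses as JSON. A minimal sketch, assuming `python3` is available; the helper name `check_outputs` is illustrative, not part of the pipeline:

```bash
# Report which output files contain well-formed JSON.
check_outputs() {
  for f in "$1"/*.json; do
    [ -f "$f" ] || continue                  # glob may not match before a run
    if python3 -m json.tool "$f" > /dev/null 2>&1; then
      echo "$f: valid JSON"
    else
      echo "$f: INVALID JSON" >&2
    fi
  done
}
check_outputs pipeline/dist/output
```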
## Clean Re-run Process

To redo everything from a clean `dist` state:

```bash
rm -rf pipeline/dist
cmake -S pipeline -B pipeline/dist
cmake --build pipeline/dist -j4
cd pipeline/dist
mkdir -p models
~/.local/bin/hf download Qwen/Qwen2.5-0.5B-Instruct-GGUF qwen2.5-0.5b-instruct-q4_k_m.gguf --local-dir models
./biergarten-pipeline --model models/qwen2.5-0.5b-instruct-q4_k_m.gguf
```

## Troubleshooting

### `zsh: command not found: huggingface-cli`

The CLI entry point installed by `huggingface_hub` is `hf`, not `huggingface-cli`.

Use:

```bash
~/.local/bin/hf --help
```

### `Model file not found ...`

- Confirm you are running from `pipeline/dist`.
- Confirm the file path passed to `--model` exists.
- If not using `--model`, ensure the default file exists at `models/llama-2-7b-chat.gguf` relative to the current working directory.

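A quick way to see what the pipeline could actually load (run from `pipeline/dist`):

```bash
# List downloaded GGUF files, or say so if there are none.
ls -lh models/*.gguf 2>/dev/null || echo "no .gguf files found under models/"
```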
### CMake cache/path mismatch

Use explicit source/build paths:

```bash
cmake -S /absolute/path/to/pipeline -B /absolute/path/to/pipeline/dist
cmake --build /absolute/path/to/pipeline/dist -j4
```