
# Biergarten Pipeline

Biergarten Pipeline is a C++23 command-line tool that reads a local city list, looks up context for each sampled city through an injected enrichment service, and generates brewery names and descriptions. Each run samples up to four locations, then produces the output with either Gemma 4 or the mock generator.

## Tested Hardware & OS

### x86-64 Linux, NVIDIA RTX 2000

- **Host**: ThinkPad P1 Gen 7 (Fedora 43)
- **CPU**: Intel Core Ultra 7 155H
- **GPU**: NVIDIA RTX 2000 Ada Generation
- **Memory**: 32 GB
- **Model**: Gemma 4 E4B: efficient local reasoning; released Apr 2, 2026.
- **Inference**: llama.cpp with CUDA 12.x support

### ARM macOS, M1 Pro

- **Host**: MacBook Pro 14" (2021)
- **CPU**: Apple M1 Pro (8-core)
- **GPU**: Apple M1 Pro (14-core) [Integrated]
- **Memory**: 16 GB
- **Model**: Gemma 4 E4B: efficient local reasoning; released Apr 2, 2026.
- **Inference**: llama.cpp with Metal support

## Pipeline

| Stage    | What happens                                                            |
| -------- | ----------------------------------------------------------------------- |
| Load     | Reads `locations.json` and picks up to four city/country pairs.         |
| Enrich   | Calls the injected enrichment service for each sampled city.            |
| Generate | Passes the city, country, and gathered context to the active generator. |
| Log      | Writes the generated breweries and any warnings through `spdlog`.       |

If an enrichment lookup throws, the pipeline skips that city and keeps going. If the lookup returns an empty string, the city stays in the pipeline and is still passed to the generator.

## Core Components

| Component               | Role                                                                   |
| ----------------------- | ---------------------------------------------------------------------- |
| BiergartenDataGenerator | Orchestrates loading, enrichment lookup, generation, and logging.      |
| IEnrichmentService      | Abstraction for location-context providers.                            |
| WikipediaService        | Default enrichment provider backed by Wikipedia and in-memory caching. |
| LlamaGenerator          | Runs local GGUF inference and validates output.                        |
| MockGenerator           | Produces deterministic fallback data without a model.                  |
| JsonLoader              | Parses the local `locations.json` file.                                |
| CURLWebClient           | Handles HTTP requests to Wikipedia.                                    |
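As a rough illustration of how an enrichment abstraction and in-memory caching can fit together, the sketch below wraps any provider in a caching decorator. The method name `GetContext`, its signature, and the `CachingEnrichmentService` class are assumptions made for this example; the real `IEnrichmentService` interface may look different, and `WikipediaService` may handle caching internally instead.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Assumed shape of the abstraction; the actual interface may differ.
class IEnrichmentService {
public:
    virtual ~IEnrichmentService() = default;
    virtual std::string GetContext(const std::string& city,
                                   const std::string& country) = 0;
};

// Hypothetical decorator that memoizes results per city/country pair.
class CachingEnrichmentService final : public IEnrichmentService {
public:
    explicit CachingEnrichmentService(std::unique_ptr<IEnrichmentService> inner)
        : inner_(std::move(inner)) {}

    std::string GetContext(const std::string& city,
                           const std::string& country) override {
        const std::string key = city + "|" + country;
        if (auto it = cache_.find(key); it != cache_.end()) {
            return it->second;  // cache hit: no call to the wrapped provider
        }
        // Cache miss: ask the wrapped provider. If it throws, nothing is cached.
        std::string context = inner_->GetContext(city, country);
        cache_.emplace(key, context);
        return context;
    }

private:
    std::unique_ptr<IEnrichmentService> inner_;
    std::unordered_map<std::string, std::string> cache_;
};
```

Because callers depend only on the interface, a test double or a different provider can be injected without touching the orchestration code.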

## Build

| Requirement          | Notes                                                                      |
| -------------------- | -------------------------------------------------------------------------- |
| C++23 compiler       | GCC 13+ or Clang 16+ is a good starting point.                             |
| CMake                | Version 3.24 or newer.                                                     |
| libcurl              | Required for Wikipedia requests.                                           |
| Optional GPU tooling | CUDA on NVIDIA, HIP/ROCm on supported AMD systems, Metal on Apple Silicon. |

Boost, Boost.DI, spdlog, and llama.cpp are fetched by CMake. On Apple Silicon, Metal is enabled automatically. On Linux, the build enables CUDA or HIP/ROCm when the matching toolkit is found. There are no plans to support Windows.

```bash
cmake -S . -B build
cmake --build build
```

If the dependency build fails on macOS, check the repo build notes.

## Model

Create a `models/` directory and download the GGUF file there before running the app.

```bash
mkdir -p models
# Quote the URL so the shell does not treat "?" as a glob character (zsh does).
curl -L \
  -o models/google_gemma-4-E4B-it-Q6_K.gguf \
  "https://huggingface.co/bartowski/google_gemma-4-E4B-it-GGUF/resolve/main/google_gemma-4-E4B-it-Q6_K.gguf?download=true"
```

## Run

Run the executable from the build directory so the copied `locations.json` and `prompts/` directory are available.

```bash
./biergarten-pipeline --mocked
./biergarten-pipeline --model models/google_gemma-4-E4B-it-Q6_K.gguf --temperature 1.0 --top-p 0.95 --top-k 64 --n-ctx 8192 --seed -1
```

| Flag            | Purpose                                                                      |
| --------------- | ---------------------------------------------------------------------------- |
| `--mocked`      | Uses the mock generator instead of a model.                                  |
| `--model, -m`   | Path to a GGUF model file, such as `models/google_gemma-4-E4B-it-Q6_K.gguf`. |
| `--temperature` | Sampling temperature. Default: `1.0`.                                        |
| `--top-p`       | Nucleus sampling parameter. Default: `0.95`.                                 |
| `--top-k`       | Top-k sampling parameter. Default: `64`.                                     |
| `--n-ctx`       | Context window size. Default: `8192`.                                        |
| `--seed`        | Random seed. Default: `-1`.                                                  |
| `--help, -h`    | Prints usage.                                                                |

`--mocked` and `--model` are mutually exclusive, and exactly one of them is required: if neither is set, the program exits with an error. The sampling flags only take effect when a model is loaded. Enrichment currently runs sequentially, one city at a time, and an empty context string is allowed.
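The mode rules above amount to an exactly-one-of check. The sketch below shows one way to express it; the `Options` struct, its field names, `validate_mode`, and the error strings are all hypothetical and are not taken from the project's argument parser.

```cpp
#include <optional>
#include <string>

// Hypothetical parsed-options struct; field names are illustrative only.
struct Options {
    bool mocked = false;                    // set by --mocked
    std::optional<std::string> model_path;  // set by --model / -m
};

// Returns an error message for an invalid combination, std::nullopt otherwise.
std::optional<std::string> validate_mode(const Options& opts) {
    if (opts.mocked && opts.model_path) {
        return "--mocked and --model are mutually exclusive";
    }
    if (!opts.mocked && !opts.model_path) {
        return "one of --mocked or --model is required";
    }
    return std::nullopt;
}
```

A caller would print the returned message and exit with a non-zero status when the result holds a value, and otherwise continue to construct the mock or model-backed generator.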

## Layout

| Path             | Use                                         |
| ---------------- | ------------------------------------------- |
| `includes/`      | Public headers.                             |
| `src/`           | Implementation files.                       |
| `locations.json` | Input city list copied into the build tree. |
| `prompts/`       | Prompt text used by the model path.         |