Add ethics document, edit diagrams

2026-07-16 17:47:22 +00:00 · 2026-04-22 04:55:06 -04:00
parent a8e0ced8ba
commit d40ce34363
14 changed files with 471 additions and 161 deletions
--- a/pipeline/ETHICS-AND-KNOWN-ISSUES.md
+++ b/pipeline/ETHICS-AND-KNOWN-ISSUES.md
@@ -0,0 +1,330 @@
 # Ethics, Bias, and Known Issues
 This document covers the ethical context of the Biergarten Pipeline's output,
 the model's biases, and known issues including hallucinated brewing science and
 low-resource language failures.
 > Note that all testing was done using `google_gemma-4-E4B-it-Q6_K.gguf`.
 ## Table of Contents
 - [What This Dataset Is](#what-this-dataset-is)
 - [What This Dataset Is Not](#what-this-dataset-is-not)
 - [Model Bias and Language Quality](#model-bias-and-language-quality)
 - [Western and Eurocentric Lens](#western-and-eurocentric-lens)
 - [Wikipedia Enrichment](#wikipedia-enrichment)
 - [The "Avoid AI Phrases" Prompt Instruction](#the-avoid-ai-phrases-prompt-instruction)
 - [Known Issues](#known-issues)
  - [Hallucinated Brewing Techniques](#hallucinated-brewing-techniques)
  - [Low-Resource Language Hallucination](#low-resource-language-hallucination)
 ---
 ## What This Dataset Is
 This is AI-generated fixture data for a proof-of-concept version of The
 Biergarten App. Anyone who interacts with an application seeded from this
 pipeline must be told upfront that the content is AI-generated.
 ---
 ## What This Dataset Is Not
 The pipeline is not intended to produce accurate brewing science, faithful
 cultural representation, or reliable local-language text. Hallucinations such as
 invented fermentation techniques, or incoherent local-language prose, are
 expected, observed, and partially documented in [Known Issues](#known-issues)
 below.
 Human control sits at the context layer (i.e. prompt design, Wikipedia
 enrichment). Statistical output shapes in future pipeline stages (check-in
 distributions, rating skews, activity profiles) will be handled the same way.
 **Treat this data as an exercise in prompt engineering and model behaviour, not
 as a source of truth for brewing techniques or cultural representation.**
 **Natural language processing, although a powerful tool for data analysis and
 generation, should be treated with scrutiny. Human language is not simply a set
 of data points to be analyzed; it carries deep cultural and human meaning that
 artificial intelligence is incapable of grasping.**
 ---
 ## Model Bias and Language Quality
 The underlying model's training biases surface measurably in this pipeline.
 Output quality tracks with how well a language is represented in the training
 corpus: standard French (`fr-FR`) produces coherent text; regional variants like
 `fr-CD` and `fr-CI` are noticeably weaker; low-resource languages like Welsh,
 Māori, and Sicilian produce output that is syntactically plausible but often
 semantically broken.
 This is a property of the training distribution, not something that prompt
 design can fix, and it is a well-documented characteristic of large language
 models trained predominantly on English-language material.[^llm-bias]
 Mitigations are documented in
 [Known Issues: Low-Resource Language Hallucination](#low-resource-language-hallucination).
 ### Western and Eurocentric Lens
 The model's training data skews heavily Western and North American. When
 generating brewery descriptions for Kinshasa, Abidjan, or Osaka, for example, it
 defaults to framing and cultural reference points drawn from that perspective
 rather than from the lived context of those cities. Wikipedia enrichment grounds
 some generation in city-specific material, but it does not eliminate the skew.
 **Output should be read with an understanding of this bias.**
 ---
 ## Wikipedia Enrichment
 City and beer context is fetched from the Wikipedia API. Wikipedia text is
 co-licensed under the **Creative Commons Attribution-ShareAlike 4.0
 International License (CC BY-SA 4.0)** and the **GNU Free Documentation License
 (GFDL)**.[^wp-license]
 Wikipedia's own accuracy limitations and editorial biases can propagate into
 generated descriptions.
 ---
 ## The "Avoid AI Phrases" Prompt Instruction
 The system prompt instructs the model to avoid common AI-generated phrasing
 patterns. This is a prompt engineering experiment:
 > How far can a model be pushed against its own stylistic defaults?
 This is not an attempt to disguise the content as human-written. All downstream
 consumers are informed of the AI-generated origin before engagement.
 ---
 ## Known Issues
 ### Hallucinated Brewing Techniques
 When forced by the system prompt to generate a "highly specialized technical
 brewing detail," the model frequently hallucinates fermentation science and
 brewing chemistry. While the resulting text sounds confident, it reads as
 nonsensical to a reader with brewing or scientific expertise.
 Small-parameter models such as Gemma 4 E4B cannot reliably satisfy prompts
 calling for specialist brewing detail. This is consistent with the **CHOKE**
 failure mode (Certain Hallucinations Overriding Known Evidence),[^llm-choke]
 in which a model produces confident, coherent, and factually wrong output;
 here it is triggered by a prompt that demands specific-sounding detail.
 #### Example — Osaka, produced using Gemma 4 E4B
 ```log
 [2026-04-21 15:04:40.258] [info] 11. city="Osaka" country="Japan" state="Osaka" iso3166_2=JP-27 lat=34.6937 lon=135.5023
 [2026-04-21 15:04:40.258] [info]    brewery_name_en="Kani-no-Kuni Brewing"
 [2026-04-21 15:04:40.258] [info]    brewery_description_en="The humid scent of takoyaki and savory grilled squid always settles over the neighborhood early in the morning, mingling strangely with the metallic tang of spent grain outside our drying shed. We decided to build this place here because Osaka is a city that never pauses, a place built around the constant exchange of goods and tastes, and that is what we want our beer to reflect. Many people here are accustomed to the dependable, clean taste of the major pale lagers, but we are interested in exploring the deeper, more textured expressions of fermentation. Our base malts are specifically crafted by incorporating strains of local rice, like Yamada Nishiki, which we malt and use not for sake, but for its complex, buttery flavor profile during the mash. This combination, followed by a slow, anaerobic aging process, gives our beers a profound, satisfying depth that stands apart from the hurried mass-market style. The heavy, deep red brick of the original warehouse wall has absorbed decades of Kansai humidity and seasonal rains, creating a patina that tells the exact story of this district's tireless movement. We chose this structure not for its charm, but for its resilience and the sheer density of the local history held within its mortar. Our goal is simply to serve a drink worthy of this powerful trading city. If you are looking for a quiet spot away from the main thoroughfare, look for us just off the side street near Shinsekai."
 [2026-04-21 15:04:40.258] [info]    brewery_name_local="カニの国ブルワリー"
 [2026-04-21 15:04:40.258] [info]    brewery_description_local="早朝の、たこ焼きや香ばしいイカ焼きの湿った匂いは、いつも乾燥小屋の外にある使用済み麦芽の金属的な匂いと奇妙に混ざり合って近隣に漂います。私たちはこの場所に店を構えることを決めたのです。なぜなら、大阪は決して止まることのない都市であり、商品と味が絶え間なく交換されることで築かれた場所だからです。地元の多くの方々は、信頼できる大規模な淡麗ラガーの味が習慣になっていますが、私たちは発酵の、より深く、より複雑な表現を探求することに関心があります。私たちのベースモルトは、山田錦のような地元の米の品種を意図的に組み込んで作られています。この米を酒ではなく、麦芽として、仕込みの最中にその複雑でバターのような風味を引き出すために使用しています。この組み合わせを、ゆっくりとした嫌気的な熟成プロセスに続けることで、私たちのビールは、慌ただしい市場のスタイルとは一線を画す、深みのある、満足感のある複雑さを持っています。オリジナルの倉庫の重く深紅のレンガ壁は、関西特有の湿気と季節の雨を何十年も吸収し、この地区の絶え間ない動きの正確な物語を語るような古色を帯びています。私たちはこの構造物を、その魅力のためではなく、その回復力とモルタルに込められた地域の歴史の密度ゆえに選びました。私たちの目標は、ただこの力強い交易都市に値する飲み物を提供することだけです。もしメインの通りから離れた静かな場所をお探しなら、新世界近くの脇道にある私たちを探してください。"
 ```
 I could not review the local-language version because I am not proficient in
 Japanese. A review of the brewing claims in the English text above reveals
 several inaccuracies:
 #### 1. "Buttery flavours" framed as a desirable malt-derived flavour
 **Incorrect.**
 The buttery flavour the description attributes to the rice malt is
 characteristic of diacetyl, which is a fermentation byproduct of yeast
 metabolism, not a malt-derived compound.[^diacetyl-source] Diacetyl produces a
 buttery or butterscotch off-flavour and is carefully managed in many beer
 styles, lighter beers in particular, through a _diacetyl rest_: fermentation
 temperature is briefly raised so that the yeast can reabsorb the compound
 before packaging.[^diacetyl-rest]
 The Oxford Companion to Beer claims that, while low levels are tolerable in some
 ales and stouts, diacetyl is considered undesirable at any perceptible
 concentration when it results from bacterial contamination or stressed
 fermentation.[^oxford-beer]
 #### 2. Yamada Nishiki sake rice described as a self-saccharifying base malt
 **Incorrect.**
 Yamada Nishiki (_山田錦_) is a short-grain Japanese rice bred specifically for
 sake production.[^yn-wiki] Its value lies in its large starchy core
 (_shinpaku_), low protein content, and amenability to _koji_ mold penetration
 during saccharification.[^yn-sakestreet] Sake brewing does not use the grain's
 own enzymatic activity for saccharification; it relies on _Aspergillus oryzae_
 (koji mold) grown on a portion of the steamed rice to convert starches to
 fermentable sugars.[^yn-sakeonline] The generated claim that this rice is
 malted and used as a base malt therefore does not match how the grain is
 actually processed.
 #### 3. "Anaerobic aging" presented as a differentiating technique
 **Misleading.**
 Anaerobic conditions during packaging and aging are not a differentiating
 technique; they are the standard baseline for all commercial beer production.
 Breweries treat oxygen exclusion as a top priority for packaging and shelf
 stability, and published research in _Microbiology Spectrum_ confirms that
 packaged beer constitutes an anaerobic environment by definition.[^anaerobic]
 Professional packaging lines use CO₂ purges and closed transfers specifically
 to maintain this state.[^packaging] Framing anaerobic aging as a distinctive
 practice is misleading and suggests hallucinated output.
 ### Low-Resource Language Hallucination
 The generation pipeline passes local language codes to the model to retrieve a
 translated `description_local`. Output quality is reliable for high-resource
 languages such as French, though the model may still struggle with regional
 variants and idiomatic phrasing.
 For languages such as Welsh (Wales), Māori (Aotearoa/New Zealand), or Sicilian
 (Sicily, Italy), the model can generate text that looks syntactically plausible
 but is semantically incoherent. This comes from limited training-data coverage
 rather than prompt engineering. The resource-tier gradient is illustrated by
 this set of French-speaking cities, where `fr-FR` (Paris) produces coherent
 output while `fr-CD` (Kinshasa) and `fr-CI` (Abidjan) do not:
 ```json
 [
  {
    "city": "Kinshasa",
    "state_province": "Kinshasa",
    "iso3166_2": "CD-KN",
    "country": "Democratic Republic of the Congo",
    "iso3166_1": "CD",
    "latitude": -4.4419,
    "longitude": 15.2663,
    "local_languages": ["fr-CD", "ln"]
  },
  {
    "city": "Paris",
    "state_province": "Île-de-France",
    "iso3166_2": "FR-IDF",
    "country": "France",
    "iso3166_1": "FR",
    "latitude": 48.8566,
    "longitude": 2.3522,
    "local_languages": ["fr-FR"]
  },
  {
    "city": "Abidjan",
    "state_province": "Abidjan",
    "iso3166_2": "CI-AB",
    "country": "Ivory Coast",
    "iso3166_1": "CI",
    "latitude": 5.36,
    "longitude": -4.0083,
    "local_languages": ["fr-CI"]
  },
  {
    "city": "Montreal",
    "state_province": "Quebec",
    "iso3166_2": "CA-QC",
    "country": "Canada",
    "iso3166_1": "CA",
    "latitude": 45.5017,
    "longitude": -73.5673,
    "local_languages": ["fr-CA"]
  },
  {
    "city": "Brussels",
    "state_province": "Brussels-Capital Region",
    "iso3166_2": "BE-BRU",
    "country": "Belgium",
    "iso3166_1": "BE",
    "latitude": 50.8503,
    "longitude": 4.3517,
    "local_languages": ["fr-BE", "nl-BE"]
  }
 ]
 ```
 Output sample:
 [./out-sample/french-cities.example](out-sample/french-cities.example)
 #### Proposed Mitigations
 - **Prevention via allowlist:** introduce a high-resource language allowlist. If
  a location's code is unlisted, skip `description_local` generation and fall
  back to English.
 - **Upstream sanitization:** strip known low-resource language codes from the
  `locations.json` payload before generation.
 - **Downstream flagging:** add a `description_local_confidence` column to the
  SQLite schema so downstream applications can filter or flag potentially
  hallucinated text by language tier.
 ---
 ## Footnotes
 [^llm-choke]:
     See: "CHOKE: Certain Hallucinations Overriding Known Evidence," a term
     introduced by Simhi et al. (2025) for hallucinations that a model produces
     with high certainty even though it has the knowledge to answer correctly.
     Source:
     [Trust Me, I'm Wrong: LLMs Hallucinate with Certainty Despite Knowing the Answer](https://arxiv.org/abs/2502.12964)
 [^llm-bias]:
    e.g., Blasi et al. (2022), "Systematic Inequalities in Language Technology
    Performance across the World's Languages," _ACL Anthology_. The pattern is
    consistent with models trained predominantly on English-language web
    corpora.
 [^wp-license]:
    Source:
    [Wikipedia:FAQ/Copyright](https://en.wikipedia.org/wiki/Wikipedia:FAQ/Copyright).
 [^cc-sa]:
    Creative Commons CC BY-SA 4.0 deed: "If you remix, transform, or build upon
    the material, you must distribute your contributions under the same license
    as the original." Source:
    [creativecommons.org/licenses/by-sa/4.0](https://creativecommons.org/licenses/by-sa/4.0/deed.en).
 [^diacetyl-source]:
    White Labs: "Diacetyl is a natural byproduct of fermentation… produced by
    brewer's yeast… generally considered undesirable at any perceived level."
    Source:
    [whitelabs.com — Compound Spotlight: Diacetyl](https://www.whitelabs.com/news-update-detail?id=54).
 [^diacetyl-rest]:
    Brewing Science Institute: diacetyl "is produced during the fermentation
    process, primarily as a byproduct of yeast metabolism… generally considered
    a flaw in most beer styles." Source:
    [brewingscience.com — Diacetyl: Understanding Its Role as an Off-Flavor in Beer](https://brewingscience.com/diacetyl-understanding-its-role-as-an-off-flavor-in-beer/).
 [^oxford-beer]:
    Oxford Companion to Beer via _Beer & Brewing_: "At low to moderate levels,
    diacetyl can be perceived as a positive flavor characteristic in some ales
    and stouts" but "particularly unwelcome in lager-style beers." Source:
    [beerandbrewing.com — diacetyl](https://www.beerandbrewing.com/dictionary/48TDqQibPi).
 [^yn-wiki]:
    Wikipedia: "Yamada Nishiki (山田錦) is a short-grain Japanese rice famous
    for its use in high-quality sake." Source:
    [en.wikipedia.org/wiki/Yamada_Nishiki](https://en.wikipedia.org/wiki/Yamada_Nishiki).
 [^yn-sakestreet]:
    Sake Street: Yamadanishiki's large _shinpaku_ allows koji mold to penetrate
    to the centre of the rice grain, making it "particularly suitable for
    producing good koji." Source:
    [sakestreet.com — What is Yamadanishiki?](https://sakestreet.com/en/media/what-is-yamadanishiki).
 [^yn-sakeonline]:
    Sake Online: "Steamed rice is added to make koji (rice malt) and yeast
    starter, which promotes alcohol fermentation." Source:
    [sakeonline.com.au — Types of Sake Rice: Yamada Nishiki](https://sakeonline.com.au/blogs/news/types-of-sake-rice-yamada-nishiki-and-its-characteristics).
 [^anaerobic]:
    Pai et al. (2022): "Breweries have recognized oxygen exclusion as a top
    priority for the proper packaging and aging of beer… packaged beer is an
    anaerobic environment." _Microbiology Spectrum._ Source:
    [journals.asm.org](https://journals.asm.org/doi/10.1128/spectrum.02656-22).
 [^packaging]:
    Beer Production Processes (oboe.com): Professional packaging lines use
     double CO₂ pre-evacuation cycles and closed transfers "so the beer moves in
    a completely anaerobic environment." Source:
    [oboe.com — Flavor Quality Control](https://oboe.com/learn/beer-production-processes-308lmf/flavor-quality-control-4).
--- a/pipeline/README.md
+++ b/pipeline/README.md
@@ -1,34 +1,42 @@
 # Biergarten Pipeline
-A C++20 command-line pipeline that samples city records from local JSON, enriches each with Wikipedia context, and generates bilingual brewery names and descriptions via a local GGUF model or a deterministic mock.
+A C++20 command-line pipeline that samples city records from local JSON,
 enriches each with Wikipedia context, and generates bilingual brewery names and
 descriptions via a local GGUF model or a deterministic mock.
 > **This pipeline produces AI-generated data.** It is not a source of truth for
 > brewing techniques, cultural representation, or local-language accuracy. See
 > [ETHICS-AND-KNOWN-ISSUES.md](ETHICS-AND-KNOWN-ISSUES.md) for full
 > documentation of limitations, hallucination patterns, and bias.
 ---
 ## Table of Contents
 - [How It Fits The Main App](#how-it-fits-the-main-app)
- [Tech Stack](#tech-stack)
+- [Quick Start](#quick-start)
- [Build](#build)
+  - [Build](#build)
- [Model](#model)
+  - [Model](#model)
- [Run](#run)
+  - [Run](#run)
 - [Architecture](#architecture)
  - [Pipeline Stages](#pipeline-stages)
  - [Key Components](#key-components)
  - [Runtime Behaviour](#runtime-behaviour)
 - [Generated Output](#generated-output)
- [Language Generation Quality](#language-generation-quality)
+- [Tech Stack](#tech-stack)
  - [Known Issues](#known-issues)
 - [Tested Hardware](#tested-hardware)
 - [Fixture Strategy](#fixture-strategy)
 - [Repo Layout](#repo-layout)
 - [Code Tour](#code-tour)
 - [Fixture Strategy](#fixture-strategy)
 - [Next Steps](#next-steps)
 ---
 ## How It Fits The Main App
-The pipeline is a data ingestion layer. It sits outside the web app runtime and produces seed records the app imports at startup or during a dedicated seed step.
+The pipeline is a data ingestion layer. It sits outside the web app runtime and
 produces seed records the app imports at startup or during a dedicated seed
 step.
 | Planned app area                 | Pipeline contribution                                              |
 | -------------------------------- | ------------------------------------------------------------------ |
@@ -39,35 +47,20 @@ The pipeline is a data ingestion layer. It sits outside the web app runtime and
 ---
-## Tech Stack
+## Quick Start
- C++20
+### Build
 - CMake 3.24+
 - Boost.JSON, Boost.ProgramOptions, Boost.DI
 - spdlog
 - libcurl
 - SQLite amalgamation fetched and compiled via CMake FetchContent
 - llama.cpp
-The build fetches Boost.DI, spdlog, llama.cpp, and SQLite via CMake. Metal is enabled on Apple Silicon; CUDA or HIP/ROCm is detected on Linux when the toolkit is present.
+Requirements: C++20 compiler, CMake 3.24+, libcurl, Boost (JSON and
-
+ProgramOptions). SQLite is fetched from the upstream amalgamation, so no system
-> **Code Style:** Modern C++20 throughout - RAII for ownership, `std::unique_ptr` for injected dependencies, `std::optional` for parse outcomes, `std::span` for read-only views over generated city data, structured bindings in pipeline loops. Formatting follows the Google C++ Style Guide via `.clang-format` with a narrow column limit and two-space indentation.
+SQLite package is required.
 ---
 ## Build
 Requirements: C++20 compiler, CMake 3.24+, libcurl, Boost (JSON and ProgramOptions).
 SQLite is fetched from the upstream amalgamation, so no system SQLite package is required.
 ```bash
 cmake -S . -B build
 cmake --build build
 ```
---
+### Model
 ## Model
 > Skip this step if you only need `--mocked`.
@@ -78,18 +71,18 @@ curl -L \
  https://huggingface.co/bartowski/google_gemma-4-E4B-it-GGUF/resolve/main/google_gemma-4-E4B-it-Q6_K.gguf?download=true
 ```
---
+### Run
-## Run
+Run from `build/` so the copied `locations.json` and `prompts/` are available.
-
+Each run also writes a fresh dated SQLite file such as
-Run from `build/` so the copied `locations.json` and `prompts/` are available. Each run also writes a fresh dated SQLite file such as `biergarten_seed_2026-04-19T15-30-45.123456Z.sqlite` into the working directory.
+`biergarten_seed_2026-04-19T15-30-45.123456Z.sqlite` into the working directory.
 ```bash
 ./biergarten-pipeline --mocked
 ./biergarten-pipeline --model models/google_gemma-4-E4B-it-Q6_K.gguf --temperature 1.0 --top-p 0.95 --top-k 64 --n-ctx 8192 --seed -1
 ```
-### CLI Flags
+#### CLI Flags
 | Flag            | Purpose                                                 |
 | --------------- | ------------------------------------------------------- |
@@ -102,9 +95,12 @@ Run from `build/` so the copied `locations.json` and `prompts/` are available. E
 | `--seed`        | Random seed. Default: `-1` (random at runtime).         |
 | `--help, -h`    | Print usage and exit.                                   |
-`--mocked` and `--model` are mutually exclusive. Omitting both exits with an error before the pipeline starts. Sampling flags are ignored when `--mocked` is set.
+`--mocked` and `--model` are mutually exclusive. Omitting both exits with an
 error before the pipeline starts. Sampling flags are ignored when `--mocked` is
 set.
-The post-build step copies `prompts/` into `build/prompts/`. Rebuild after editing `prompts/system.md`.
+The post-build step copies `prompts/` into `build/prompts/`. Rebuild after
 editing `prompts/system.md`.
 ---
@@ -121,41 +117,58 @@ The post-build step copies `prompts/` into `build/prompts/`. Rebuild after editi
 | Store    | `SqliteExportService` writes each successful brewery into a fresh dated `.sqlite` database with normalized location and brewery tables. |
 | Log      | `spdlog` writes results and warnings to the console.                                                                                    |
-If enrichment or generation fails for a city, that city is skipped and the pipeline continues.
+If enrichment or generation fails for a city, that city is skipped and the
 pipeline continues.
 ### Key Components
- `src/main.cc` - argument parsing and Boost.DI composition root.
+- `src/main.cc` — argument parsing and Boost.DI composition root.
- `JsonLoader` - validates curated location input.
+- `JsonLoader` — validates curated location input.
- `WikipediaService` - queries Wikipedia extracts, caches results, returns empty context on failure.
+- `WikipediaService` — queries Wikipedia extracts, caches results, returns empty
- `LlamaGenerator` - formats prompts for Gemma 4, validates JSON output, retries malformed responses up to three times. If output looks truncated, the retry raises the token budget before trying again.
+  context on failure.
- `MockGenerator` - stable hash-based output so the same city input always produces the same brewery.
+- `LlamaGenerator` — formats prompts for Gemma 4, validates JSON output, retries
- `SqliteExportService` - creates a dated SQLite file per run and persists each successful brewery into normalized tables.
+  malformed responses up to three times. If output looks truncated, the retry
- Brewery payloads include English and local-language name and description fields.
+  raises the token budget before trying again.
 - `MockGenerator` — stable hash-based output so the same city input always
  produces the same brewery.
 - `SqliteExportService` — creates a dated SQLite file per run and persists each
  successful brewery into normalized tables.
 - Brewery payloads include English and local-language name and description
  fields.
 ### Runtime Behaviour
-`WikipediaService` queries city, country, and beer-related Wikipedia extracts using its configured lookup, then caches the first successful response per query string. The fetched extract text is included in the prompt as context for generation.
+`WikipediaService` queries city, country, and beer-related Wikipedia extracts
 using its configured lookup, then caches the first successful response per query
 string. The fetched extract text is included in the prompt as context for
 generation.
-`GetLocationContext()` returns an empty string when the web client is unavailable or when lookup/parsing fails.
+`GetLocationContext()` returns an empty string when the web client is
 unavailable or when lookup/parsing fails.
-`LlamaGenerator` validates model output as structured JSON. The retry path exists as a safety hatch for cases where the reasoning block consumes available token budget and compresses the JSON output space. All runs to date have produced valid output on the first pass; the path is kept for resilience.
+`LlamaGenerator` validates model output as structured JSON. The retry path
 exists as a safety hatch for cases where the reasoning block consumes available
 token budget and compresses the JSON output space. All runs to date have
 produced valid output on the first pass; the path is kept for resilience.
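The retry path described above can be sketched as follows. This is an illustration under stated assumptions, not `LlamaGenerator`'s actual code: `Attempt`, `GenerateWithRetry`, the callback signatures, and the budget-doubling policy are all hypothetical, and "up to three times" is read here as one initial attempt plus at most three retries.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical result of a single generation attempt.
struct Attempt {
  std::string text;
  bool truncated = false;  // true when the token budget ran out mid-output
};

// `generate(max_tokens)` runs one attempt and `validate(text)` reports whether
// the text is acceptable JSON. Both are stand-ins supplied by the caller.
std::optional<std::string> GenerateWithRetry(
    const std::function<Attempt(int)>& generate,
    const std::function<bool(const std::string&)>& validate,
    int max_tokens, int max_retries = 3) {
  for (int attempt = 0; attempt <= max_retries; ++attempt) {
    const Attempt result = generate(max_tokens);
    if (validate(result.text)) {
      return result.text;
    }
    if (result.truncated) {
      max_tokens *= 2;  // assumed policy: double the budget after truncation
    }
  }
  return std::nullopt;  // the caller skips this city and continues
}
```

Passing the generator and validator as callbacks keeps the loop testable without a model: a fake generator that reports truncation below some budget is enough to confirm the path is reachable.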
-`MockGenerator` uses stable hashes for repeatable output in demos and Storybook runs.
+`MockGenerator` uses stable hashes for repeatable output in demos and Storybook
 runs.
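A minimal sketch of what stable hash-based output can mean in practice is shown below. The hash `MockGenerator` actually uses is not stated here, so FNV-1a is swapped in as an assumption; `Fnv1a64`, `MockBreweryName`, and the word lists are hypothetical. A fixed-constant hash is used because `std::hash` is not guaranteed to return the same values across standard library implementations.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <string_view>

// FNV-1a, 64-bit. The constants are fixed by the algorithm, so the same input
// gives the same value on every platform and in every run.
constexpr std::uint64_t Fnv1a64(std::string_view text) {
  std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
  for (const char c : text) {
    hash ^= static_cast<unsigned char>(c);
    hash *= 0x100000001b3ULL;  // FNV prime
  }
  return hash;
}

// Hypothetical name fragments, for illustration only.
constexpr std::array<std::string_view, 4> kPrefixes = {
    "Old", "North", "Copper", "River"};
constexpr std::array<std::string_view, 4> kSuffixes = {
    "Brewing", "Brewhouse", "Ales", "Beer Co."};

// Derives a brewery name from the city and country alone, so the same
// location always maps to the same name.
std::string MockBreweryName(std::string_view city, std::string_view country) {
  std::string key(city);
  key += '|';
  key += country;
  const std::uint64_t h = Fnv1a64(key);
  std::string name(kPrefixes[h % kPrefixes.size()]);
  name += ' ';
  name += kSuffixes[(h / kPrefixes.size()) % kSuffixes.size()];
  return name;
}
```

Because the name depends only on the input strings, fixtures and screenshots stay identical between runs without any seed having to be stored.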
 ### Process Flow - Activity Diagram
-![An activity diagram](./diagrams/activity-diagram.svg)
+![An activity diagram](./diagrams/current/output/activity.svg)
 ### Architectural Overview - Class Diagram
-![A class diagram](./diagrams/class-diagram.svg)
+![A class diagram](./diagrams/current/output/class.svg)
 ---
 ## Generated Output
-Each successful run stores a `GeneratedBrewery` pair with the source location and a `BreweryResult` payload. The same generated records are also written to a fresh SQLite export file named with the current UTC timestamp.
+Each successful run stores a `GeneratedBrewery` pair with the source location
 and a `BreweryResult` payload. The same generated records are also written to a
 fresh SQLite export file named with the current UTC timestamp.
 | Field               | Meaning                                    |
 | ------------------- | ------------------------------------------ |
@@ -164,7 +177,8 @@ Each successful run stores a `GeneratedBrewery` pair with the source location an
 | `name_local`        | Brewery name in the local language.        |
 | `description_local` | Brewery description in the local language. |
-The log dump also includes city, country, state or province, ISO subdivision code, latitude, and longitude for each entry.
+The log dump also includes city, country, state or province, ISO subdivision
 code, latitude, and longitude for each entry.
 ### Consumer Data Shape
@@ -180,80 +194,25 @@ The log dump also includes city, country, state or province, ISO subdivision cod
 ---
-## Language Generation Quality
+## Tech Stack
-The generation pipeline passes local language codes to the model to retrieve a translated `description_local`.
+- C++20
 - CMake 3.24+
 - Boost.JSON, Boost.ProgramOptions, Boost.DI
 - spdlog
 - libcurl
 - SQLite amalgamation fetched and compiled via CMake FetchContent
 - llama.cpp
-Output quality is reliable for high-resource languages such as French, though it may struggle with regional variants and idiomatic phrasing. This can be seen with these data points:
+The build fetches Boost.DI, spdlog, llama.cpp, and SQLite via CMake. Metal is
 enabled on Apple Silicon; CUDA or HIP/ROCm is detected on Linux when the toolkit
 is present.
-```json
+> **Code Style:** Modern C++20 throughout — RAII for ownership,
-[
+> `std::unique_ptr` for injected dependencies, `std::optional` for parse
-  {
+> outcomes, `std::span` for read-only views over generated city data, structured
-    "city": "Kinshasa",
+> bindings in pipeline loops. Formatting follows the Google C++ Style Guide via
-    "state_province": "Kinshasa",
+> `.clang-format` with a narrow column limit and two-space indentation.
    "iso3166_2": "CD-KN",
    "country": "Democratic Republic of the Congo",
    "iso3166_1": "CD",
    "latitude": -4.4419,
    "longitude": 15.2663,
    "local_languages": ["fr-CD", "ln"]
  },
  {
    "city": "Paris",
    "state_province": "Île-de-France",
    "iso3166_2": "FR-IDF",
    "country": "France",
    "iso3166_1": "FR",
    "latitude": 48.8566,
    "longitude": 2.3522,
    "local_languages": ["fr-FR"]
  },
  {
    "city": "Abidjan",
    "state_province": "Abidjan",
    "iso3166_2": "CI-AB",
    "country": "Ivory Coast",
    "iso3166_1": "CI",
    "latitude": 5.36,
    "longitude": -4.0083,
    "local_languages": ["fr-CI"]
  },
  {
    "city": "Montreal",
    "state_province": "Quebec",
    "iso3166_2": "CA-QC",
    "country": "Canada",
    "iso3166_1": "CA",
    "latitude": 45.5017,
    "longitude": -73.5673,
    "local_languages": ["fr-CA"]
  },
  {
    "city": "Brussels",
    "state_province": "Brussels-Capital Region",
    "iso3166_2": "BE-BRU",
    "country": "Belgium",
    "iso3166_1": "BE",
    "latitude": 50.8503,
    "longitude": 4.3517,
    "local_languages": ["fr-BE", "nl-BE"]
  }
 ]
 ```
 Output sample: [./out-sample/french-cities.example](out-sample/french-cities.example)
 ### Known Issues
 #### Low-Resource Language Hallucination
 For languages such as Welsh (Wales), Maori (Aotearoa/New Zealand), or Sicilian (Sicily, Italy), the model can generate text that looks syntactically plausible but is semantically incoherent. This comes from limited training-data coverage rather than prompt engineering.
 #### Proposed Mitigations
 - **Prevention via allowlist:** introduce a high-resource language allowlist. If a location's code is unlisted, skip `description_local` generation and fall back to English.
 - **Upstream sanitization:** strip known low-resource language codes from the `locations.json` payload before generation.
 - **Downstream flagging:** add a `description_local_confidence` column to the SQLite schema so downstream applications can filter or flag potentially hallucinated text by language tier.
 ---
@@ -283,62 +242,83 @@ For languages such as Welsh (Wales), Maori (Aotearoa/New Zealand), or Sicilian (
 ---
 ## Fixture Strategy
 - `--mocked` for stable fixtures, repeatable screenshots, and Storybook runs.
 - `--model` when geographically grounded content matters for demos.
 - Keep `locations.json` structured enough to support discovery and future
  filtering.
 - Treat SQLite output as seed material for the app's brewery domain, not
  production data.
 ---
 ## Repo Layout
 | Path                         | Purpose                                            |
-| ---------------- | ---------------------------------------------- |
+| ---------------------------- | -------------------------------------------------- |
 | `includes/`                  | Public headers and shared models.                  |
 | `src/`                       | Implementation files.                              |
 | `locations.json`             | Curated city input copied into the build tree.     |
 | `prompts/`                   | System prompt used by the model-backed path.       |
 | `diagrams/`                  | Architecture and pipeline diagrams.                |
 | `ETHICS-AND-KNOWN-ISSUES.md` | Ethics, bias, hallucination analysis, mitigations. |
 ---
 ## Code Tour
- `src/main.cc` - argument parsing and DI composition root.
+- `src/main.cc` — argument parsing and DI composition root.
- `src/biergarten_data_generator/` - orchestration, sampling, logging, and export.
+- `src/biergarten_data_generator/` — orchestration, sampling, logging, and
- `src/services/wikipedia/` - enrichment service and cache.
+  export.
- `src/services/sqlite/` - SQLite export implementation.
+- `src/services/wikipedia/` — enrichment service and cache.
- `src/data_generation/llama/` - local inference, prompt loading, output validation.
+- `src/services/sqlite/` — SQLite export implementation.
- `src/data_generation/mock/` - deterministic fallback.
+- `src/data_generation/llama/` — local inference, prompt loading, output
-
+  validation.
---
+- `src/data_generation/mock/` — deterministic fallback.
 ## Fixture Strategy
 - `--mocked` for stable fixtures, repeatable screenshots, and Storybook runs.
 - `--model` when geographically grounded content matters for demos.
 - Keep `locations.json` structured enough to support discovery and future filtering.
 - Treat SQLite output as seed material for the app's brewery domain, not production data.
 ---
 ## Next Steps
-The pipeline currently produces city-aware brewery records and dated SQLite exports. The next passes add additional fixture types so the app can exercise the full brewery domain without live data.
+The pipeline currently produces city-aware brewery records and dated SQLite
 exports. The next passes add additional fixture types so the app can exercise
 the full brewery domain without live data.
-### Testing _(Very High Importance)_
+### Testing — Very High Priority
- Unit test JSON validation and retry logic against malformed, truncated, and empty model outputs.
- Integration test the enrichment pipeline with missing context, short context, and fake context inputs.
- Adversarial context tests: feed plausible but geographically incorrect Wikipedia extracts and verify the model does not silently blend them with training data.
- Verify bilingual enrichment behaviour when only an English extract is available versus when both extracts are present.
- Confirm the retry path is reachable when the reasoning block consumes available token budget.
+- Unit test JSON validation and retry logic against malformed, truncated, and
+  empty model outputs.
+- Integration test the enrichment pipeline with missing context, short context,
+  and fake context inputs.
+- Adversarial context tests: feed plausible but geographically incorrect
  Wikipedia extracts and verify the model does not silently blend them with
  training data.
 - Verify bilingual enrichment behaviour when only an English extract is
  available versus when both extracts are present.
 - Confirm the retry path is reachable when the reasoning block consumes
  available token budget.
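
The first bullet above could be exercised with something like the sketch below.
It is an illustration only, not the pipeline's actual validator: the names
`LooksLikeCompleteJsonObject` and `GenerateWithRetry` are hypothetical, and the
structural check stands in for whatever JSON validation the real code performs.
A unit test can pass a fake generator that returns an empty string, then a
truncated object, then a valid one, and assert that the retry loop accepts only
the last.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical structural check, not a full JSON parser: the text must be
// non-empty, start with '{', end with '}', and have balanced braces outside
// of string literals. Truncated and empty model outputs fail this check.
bool LooksLikeCompleteJsonObject(const std::string& text) {
  if (text.empty() || text.front() != '{' || text.back() != '}') {
    return false;
  }
  int depth = 0;
  bool in_string = false;
  bool escaped = false;
  for (char c : text) {
    if (in_string) {
      if (escaped) {
        escaped = false;  // Character after a backslash is taken literally.
      } else if (c == '\\') {
        escaped = true;
      } else if (c == '"') {
        in_string = false;
      }
      continue;
    }
    if (c == '"') {
      in_string = true;
    } else if (c == '{') {
      ++depth;
    } else if (c == '}') {
      if (--depth < 0) return false;
    }
  }
  return depth == 0 && !in_string;
}

// Calls `generate` up to `max_attempts` times and returns the first output
// that passes the structural check, or std::nullopt if none does.
std::optional<std::string> GenerateWithRetry(
    const std::function<std::string()>& generate, int max_attempts) {
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    std::string output = generate();
    if (LooksLikeCompleteJsonObject(output)) {
      return output;
    }
  }
  return std::nullopt;
}
```

Injecting the generator as a `std::function` keeps the retry loop testable
without loading a model.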
 ### Beer Generation
-Generate catalog entries with style, ABV, IBU, color, aroma notes, and food pairing hints. Link beers back to breweries and cities. Keep style coverage wide enough to exercise search, sort, and category filters.
+Generate catalog entries with style, ABV, IBU, color, aroma notes, and food
 pairing hints. Link beers back to breweries and cities. Keep style coverage wide
 enough to exercise search, sort, and category filters.
 ### User Generation
-Generate user profiles with stable names, bios, locale hints, and preference signals. Include stable IDs for downstream fixture joins. Keep output deterministic for screenshots while allowing larger randomized batches.
+Generate user profiles with stable names, bios, locale hints, and preference
 signals. Include stable IDs for downstream fixture joins. Keep output
 deterministic for screenshots while allowing larger randomized batches.
 ### Check-In System
-Produce timestamped check-in events between users and breweries. Use a J-curve activity profile - a small set of users accounts for most check-ins, the rest appear occasionally. Add bursty behaviour around weekends and travel periods.
+Produce timestamped check-in events between users and breweries. Use a J-curve
 activity profile — a small set of users accounts for most check-ins, the rest
 appear occasionally. Add bursty behaviour around weekends and travel periods.
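
One possible way to get the J-curve activity profile is to weight users by
rank with a power law and sample check-ins from those weights. The sketch
below assumes that approach; the function names, the exponent, and the seed
parameter are all hypothetical, and weekend or travel bursts are not modelled.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Assumed weighting: the user at rank r (1-based) gets weight 1 / r^exponent,
// so a few top-ranked users dominate and the rest form a long tail.
std::vector<double> BuildJCurveWeights(std::size_t num_users, double exponent) {
  std::vector<double> weights(num_users);
  for (std::size_t rank = 0; rank < num_users; ++rank) {
    weights[rank] = 1.0 / std::pow(static_cast<double>(rank + 1), exponent);
  }
  return weights;
}

// Returns the number of check-ins assigned to each user index. A fixed seed
// gives repeatable output on a given standard library implementation.
std::vector<std::size_t> SampleCheckInCounts(std::size_t num_users,
                                             std::size_t num_check_ins,
                                             double exponent,
                                             std::uint32_t seed) {
  std::vector<std::size_t> counts(num_users, 0);
  if (num_users == 0) {
    return counts;
  }
  const std::vector<double> weights = BuildJCurveWeights(num_users, exponent);
  std::discrete_distribution<std::size_t> pick_user(weights.begin(),
                                                    weights.end());
  std::mt19937 rng(seed);
  for (std::size_t i = 0; i < num_check_ins; ++i) {
    ++counts[pick_user(rng)];
  }
  return counts;
}
```

With 100 users and an exponent around 1.2, the ten highest-ranked users
receive well over half of the sampled check-ins, which matches the intended
shape.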
 ### Beer Ratings
-Generate rating events with a strong positive skew and a long tail of lower scores. Avoid uniform distributions. Attach timestamps and user IDs so the app can compute averages, trends, and per-style comparisons.
+Generate rating events with a strong positive skew and a long tail of lower
 scores. Avoid uniform distributions. Attach timestamps and user IDs so the app
 can compute averages, trends, and per-style comparisons.
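
A minimal sketch of a non-uniform rating sampler follows. It reads "positive
skew" as "most scores are high, with a thin tail of low ones" and assumes an
integer 1 to 5 scale; the scale, the weights, and the name `SampleRatings` are
assumptions for illustration. Timestamps and user IDs are left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Samples `count` ratings on an assumed 1..5 scale. The weights below are
// hypothetical: they favour 4 and 5 and leave a small tail of low scores.
std::vector<int> SampleRatings(std::size_t count, std::uint32_t seed) {
  const std::array<double, 5> weights = {0.03, 0.07, 0.15, 0.35, 0.40};
  std::discrete_distribution<int> pick_score(weights.begin(), weights.end());
  std::mt19937 rng(seed);
  std::vector<int> ratings;
  ratings.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    ratings.push_back(pick_score(rng) + 1);  // Shift index 0..4 to score 1..5.
  }
  return ratings;
}
```

Under these weights the expected mean rating is about 4.0, so averages and
per-style comparisons in the app have a realistic, top-heavy distribution to
work with.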
--- a/pipeline/diagrams/activity-diagram.svg
+++ b/pipeline/diagrams/activity-diagram.svg
--- a/pipeline/diagrams/class-diagram.svg
+++ b/pipeline/diagrams/class-diagram.svg
--- a/pipeline/diagrams/current/activity.puml
+++ b/pipeline/diagrams/current/activity.puml
--- a/pipeline/diagrams/current/class.puml
+++ b/pipeline/diagrams/current/class.puml
--- a/pipeline/diagrams/current/output/activity.svg
+++ b/pipeline/diagrams/current/output/activity.svg
--- a/pipeline/diagrams/current/output/class.svg
+++ b/pipeline/diagrams/current/output/class.svg
--- a/pipeline/diagrams/future_possible_activity.svg
+++ b/pipeline/diagrams/future_possible_activity.svg
--- a/pipeline/diagrams/future_possible_architecture.svg
+++ b/pipeline/diagrams/future_possible_architecture.svg
--- a/pipeline/diagrams/future-activity-diagram.puml
+++ b/pipeline/diagrams/future-activity-diagram.puml
--- a/pipeline/diagrams/future-class-diagram.puml
+++ b/pipeline/diagrams/future-class-diagram.puml
--- a/pipeline/diagrams/planned/output/biergarten_activity.svg
+++ b/pipeline/diagrams/planned/output/biergarten_activity.svg
--- a/pipeline/diagrams/planned/output/future_possible_architecture.svg
+++ b/pipeline/diagrams/planned/output/future_possible_architecture.svg