Aaron Po
2026-04-20 23:56:27 -04:00
parent 6657015ee3
commit bbe8970bf6
2 changed files with 285 additions and 218 deletions

@@ -1,18 +1,26 @@
@startuml future_possible_activity @startuml biergarten_activity
skinparam defaultFontName "DM Sans" skinparam defaultFontName "DM Sans"
skinparam defaultFontSize 13 skinparam defaultFontSize 13
skinparam titleFontName "Volkhov" skinparam titleFontName "Volkhov"
skinparam titleFontSize 20 skinparam titleFontSize 20
skinparam backgroundColor #FAFCF9 skinparam backgroundColor #FCFCF7
skinparam defaultFontColor #28342A skinparam defaultFontColor #14180C
skinparam titleFontColor #28342A skinparam titleFontColor #14180C
skinparam ArrowColor #628A5B skinparam ArrowColor #656F33
skinparam ActivityBackgroundColor #EAF0E8 skinparam activityStartColor #EBECE3
skinparam ActivityBorderColor #547461 skinparam activityEndColor #4A5837
skinparam ActivityDiamondBackgroundColor #DCE8D8 skinparam activityStopColor #4A5837
skinparam ActivityDiamondBorderColor #547461 skinparam ActivityBackgroundColor #EBECE3
skinparam NoteBackgroundColor #EAF0E8 skinparam ActivityBorderColor #4A5837
skinparam NoteBorderColor #547461 skinparam ActivityDiamondBackgroundColor #CBD2B5
skinparam ActivityDiamondBorderColor #4A5837
skinparam NoteBackgroundColor #DBEEDD
skinparam NoteFontColor #14180C
skinparam NoteBorderColor #4A5837
skinparam SwimlaneBorderColor #4A5837
skinparam SwimlaneBorderThickness 1
skinparam monochrome reverse
title The Biergarten Data Pipeline — Activity Diagram title The Biergarten Data Pipeline — Activity Diagram
@@ -27,6 +35,25 @@ endif
:Init CurlGlobalState & LlamaBackendState; :Init CurlGlobalState & LlamaBackendState;
:Build DI injector; :Build DI injector;
:Initialize SqliteExportService;
note right
Opens SQLite connection.
Begins a single transaction
covering all five fixture types.
end note
:Create BoundedChannel<LogEntry> log_ch;
:Spawn Log Worker thread;
note right
Log worker drains log_ch for the
entire pipeline lifetime.
All workers emit LogEntry structs
via PipelineLogger — never spdlog directly.
end note
:BiergartenPipelineOrchestrator::Run();
|BiergartenPipelineOrchestrator::Run()|
:JsonLoader::LoadLocations("locations.json"); :JsonLoader::LoadLocations("locations.json");
:JsonLoader::LoadBeerStyles("beer-styles.json"); :JsonLoader::LoadBeerStyles("beer-styles.json");
:JsonLoader::LoadPersonas("personas.json"); :JsonLoader::LoadPersonas("personas.json");
@@ -42,17 +69,10 @@ end note
:EnrichmentService::PreWarmPersonaCache(personas); :EnrichmentService::PreWarmPersonaCache(personas);
note right note right
Persona descriptions do not need location context. Persona descriptions do not need location context.
All persona Wikipedia/description lookups are All persona lookups are resolved and cached
resolved and cached globally at startup. globally at startup.
end note end note
:Initialize SqliteExportService;
note right
Opens SQLite connection.
Begins a single transaction
covering all five fixture types.
end note
:BiergartenPipelineOrchestrator::Run();
' ═══════════════════════════════════════════ ' ═══════════════════════════════════════════
' PHASE 0 — USER GENERATION ' PHASE 0 — USER GENERATION
@@ -73,24 +93,27 @@ fork again
:IPersonaSelectionStrategy::SelectPersona(\n personas_palette_); :IPersonaSelectionStrategy::SelectPersona(\n personas_palette_);
note right note right
Guaranteed cache hit from startup. Guaranteed cache hit from startup.
Returns a Persona struct with style_affinities, Returns a Persona struct carrying
abv_range, ibu_preference, checkin_weight. style_affinities, abv_range,
ibu_preference, checkin_weight.
end note end note
:NamesByCountry::SampleName(\n location.iso3166_1); :NamesByCountry::SampleName(\n location.iso3166_1);
note right note right
Deterministic lookup — no LLM involved. Deterministic lookup — no LLM involved.
Name is selected from a pre-keyed table Name selected from pre-keyed table
and passed into the generation prompt. and passed into the generation prompt.
end note end note
:GenerateUser(location, persona, sampled_name)\nvia DataGenerator; :GenerateUser(location, persona, sampled_name)\nvia DataGenerator;
note right note right
LLM receives: Location fields + persona description LLM receives: Location fields + persona
+ sampled name. Generates bio and preference description + sampled name. Generates
signals grounded in both. bio and preference signals grounded
in locale and persona.
end note end note
:PipelineLogger::Log(Info, UserGeneration,\n city, user_id, "llm");
:Send GeneratedUser → llm_ch; :Send GeneratedUser → llm_ch;
endwhile (no) endwhile (no)
:Close llm_ch; :Close llm_ch;
@@ -99,6 +122,7 @@ fork again
while (llm_ch has items?) is (yes) while (llm_ch has items?) is (yes)
:Receive GeneratedUser; :Receive GeneratedUser;
:ProcessUser(user) → sqlite3_int64; :ProcessUser(user) → sqlite3_int64;
:PipelineLogger::Log(Info, UserGeneration,\n city, user_id, "sqlite");
:Append → user_pool_; :Append → user_pool_;
endwhile (no) endwhile (no)
end fork end fork
@@ -108,7 +132,6 @@ end fork
' ═══════════════════════════════════════════ ' ═══════════════════════════════════════════
' PHASE 1 — BREWERY & BEER GENERATION ' PHASE 1 — BREWERY & BEER GENERATION
' Combined into a single dependent unit of work.
' ═══════════════════════════════════════════ ' ═══════════════════════════════════════════
:RunBreweryAndBeerPhase(sampled_locations); :RunBreweryAndBeerPhase(sampled_locations);
:Create BoundedChannels\n(loc_ch, llm_ch, exp_ch); :Create BoundedChannels\n(loc_ch, llm_ch, exp_ch);
@@ -122,6 +145,7 @@ fork again
while (loc_ch has items?) is (yes) while (loc_ch has items?) is (yes)
:Receive Location; :Receive Location;
:GetLocationContext(location,\nBreweryContextStrategy); :GetLocationContext(location,\nBreweryContextStrategy);
:PipelineLogger::Log(Info,\n BreweryAndBeerGeneration,\n city, nullopt, "enrichment");
:Send EnrichedCity → llm_ch; :Send EnrichedCity → llm_ch;
endwhile (no) endwhile (no)
|Orchestrator| |Orchestrator|
@@ -145,12 +169,8 @@ fork again
:Attach GeneratedBeer to Brewery bundle; :Attach GeneratedBeer to Brewery bundle;
endwhile (done) endwhile (done)
:PipelineLogger::Log(Info,\n BreweryAndBeerGeneration,\n city, brewery_id, "llm");
:Send BreweryWithBeers Bundle → exp_ch; :Send BreweryWithBeers Bundle → exp_ch;
note right
The next generation of a brewery is
entirely dependent on the current
brewery and its beers completing.
end note
endwhile (no) endwhile (no)
:Close exp_ch; :Close exp_ch;
fork again fork again
@@ -165,6 +185,8 @@ fork again
:ProcessBeer(beer) → sqlite3_int64; :ProcessBeer(beer) → sqlite3_int64;
:Append → beer_pool_; :Append → beer_pool_;
endwhile (done) endwhile (done)
:PipelineLogger::Log(Info,\n BreweryAndBeerGeneration,\n city, brewery_id, "sqlite");
endwhile (no) endwhile (no)
end fork end fork
@@ -177,16 +199,13 @@ end note
' ═══════════════════════════════════════════ ' ═══════════════════════════════════════════
' PHASE 2 — CHECKIN GENERATION ' PHASE 2 — CHECKIN GENERATION
' Sequential now that Breweries/Beers are done.
' ═══════════════════════════════════════════ ' ═══════════════════════════════════════════
:RunCheckinPhase(); :RunCheckinPhase();
:ICheckinDistributionStrategy::\nAssignActivityWeights(user_pool_); :ICheckinDistributionStrategy::\nAssignActivityWeights(user_pool_);
note right note right
Weights are seeded from each user's Weights seeded from each user's
persona.checkin_weight — high-activity persona.checkin_weight. J-curve profile
personas (craft enthusiasts) check in more, emerges from persona distribution.
casual personas less. J-curve profile
emerges from the persona distribution.
end note end note
while (For each GeneratedUser in user_pool_?) is (remaining) while (For each GeneratedUser in user_pool_?) is (remaining)
@@ -196,6 +215,7 @@ while (For each GeneratedUser in user_pool_?) is (remaining)
:Select brewery from brewery_pool_; :Select brewery from brewery_pool_;
:GenerateCheckin(user, brewery, timestamp)\nvia DataGenerator; :GenerateCheckin(user, brewery, timestamp)\nvia DataGenerator;
:ProcessCheckin(checkin) → sqlite3_int64; :ProcessCheckin(checkin) → sqlite3_int64;
:PipelineLogger::Log(Info, CheckinGeneration,\n nullopt, checkin_id, "sqlite");
:Append → checkin_pool_; :Append → checkin_pool_;
endwhile (done) endwhile (done)
endwhile (done) endwhile (done)
@@ -205,11 +225,9 @@ endwhile (done)
' ═══════════════════════════════════════════ ' ═══════════════════════════════════════════
:RunRatingPhase(); :RunRatingPhase();
note right note right
Beer selection during rating is biased by Beer selection biased by
user.persona.style_affinities and abv_range user.persona.style_affinities and abv_range.
users are more likely to rate beers matching Rating skew modulated per persona.
their persona profile. Rating skew (positive
with long tail) is also modulated per persona.
end note end note
while (For each GeneratedCheckin in checkin_pool_?) is (remaining) while (For each GeneratedCheckin in checkin_pool_?) is (remaining)
@@ -217,7 +235,9 @@ while (For each GeneratedCheckin in checkin_pool_?) is (remaining)
if (Beer exists for brewery?) then (yes) if (Beer exists for brewery?) then (yes)
:GenerateRating(user, beer, checkin_id)\nvia DataGenerator; :GenerateRating(user, beer, checkin_id)\nvia DataGenerator;
:ProcessRating(rating); :ProcessRating(rating);
:PipelineLogger::Log(Info, RatingGeneration,\n nullopt, rating_id, "sqlite");
else (no) else (no)
:PipelineLogger::Log(Warn, RatingGeneration,\n nullopt, brewery_id, "sqlite");
:Skip — brewery has no beers; :Skip — brewery has no beers;
endif endif
endwhile (done) endwhile (done)
@@ -230,6 +250,12 @@ endwhile (done)
note right note right
COMMIT covers all five fixture types. COMMIT covers all five fixture types.
end note end note
:Close log_ch;
:Join Log Worker;
note right
Drain guarantees no LogEntry is
dropped at shutdown.
end note
:spdlog::info "Pipeline complete in X ms"; :spdlog::info "Pipeline complete in X ms";
stop stop
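
Reviewer sketch (not part of the commit): the activity diagram above creates `log_ch`, spawns a log worker that drains it for the whole pipeline lifetime, then closes the channel and joins the worker so that no `LogEntry` is dropped at shutdown. The C++ below is a minimal illustration of that close-then-join pattern under stated assumptions. Only `Receive() : std::optional<T>` and `Close() : void` appear in the architecture diagram; the `Send` signature, the capacity of 4, the one-field `LogEntry`, the `sink` vector standing in for spdlog, and the `DrainDemo` helper are all hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded channel. Receive() and Close() mirror the diagram;
// Send() and its bool return value are assumptions made for this sketch.
template <typename T>
class BoundedChannel {
 public:
  explicit BoundedChannel(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);  // a zero-capacity channel could never accept an item
  }

  // Blocks while the channel is full (back-pressure on fast producers).
  // Returns false if the channel was closed before the item was queued.
  bool Send(T item) {
    std::unique_lock<std::mutex> lock(mu_);
    not_full_.wait(lock, [this] { return closed_ || queue_.size() < capacity_; });
    if (closed_) return false;
    queue_.push(std::move(item));
    not_empty_.notify_one();
    return true;
  }

  // Blocks until an item is available. Returns std::nullopt only once the
  // channel is closed AND empty, so queued items are drained before exit.
  std::optional<T> Receive() {
    std::unique_lock<std::mutex> lock(mu_);
    not_empty_.wait(lock, [this] { return closed_ || !queue_.empty(); });
    if (queue_.empty()) return std::nullopt;
    T item = std::move(queue_.front());
    queue_.pop();
    not_full_.notify_one();
    return item;
  }

  // Termination signal: wakes every blocked sender and receiver.
  void Close() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      closed_ = true;
    }
    not_empty_.notify_all();
    not_full_.notify_all();
  }

 private:
  std::size_t capacity_;
  std::mutex mu_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
  std::queue<T> queue_;
  bool closed_ = false;
};

// Reduced stand-in for the diagram's LogEntry (the real struct has more fields).
struct LogEntry {
  std::string message;
};

// Hypothetical helper: one producer, one log worker, close-then-join shutdown.
std::vector<std::string> DrainDemo(std::size_t n_entries) {
  BoundedChannel<LogEntry> log_ch(4);
  std::vector<std::string> sink;  // stands in for the spdlog call in the worker

  std::thread log_worker([&] {
    while (auto entry = log_ch.Receive()) {
      sink.push_back(std::move(entry->message));
    }
  });

  for (std::size_t i = 0; i < n_entries; ++i) {
    log_ch.Send(LogEntry{"entry " + std::to_string(i)});
  }

  log_ch.Close();     // ":Close log_ch;"
  log_worker.join();  // ":Join Log Worker;" returns after the drain completes
  return sink;
}
```

In this sketch the drain guarantee comes from `Receive()` returning `std::nullopt` only when the channel is both closed and empty, so the worker loop ends after the last queued entry rather than at the moment `Close()` is called.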

@@ -1,40 +1,52 @@
@startuml future_possible_architecture @startuml future_possible_architecture
skinparam style strictuml
' ==========================================
' CONFIGURATION & STYLING
' ==========================================
left to right direction
skinparam linetype ortho
' --- Typography ---
skinparam defaultFontName "DM Sans" skinparam defaultFontName "DM Sans"
skinparam defaultFontSize 14 skinparam defaultFontSize 14
skinparam titleFontName "Volkhov" skinparam titleFontName "Volkhov"
skinparam titleFontSize 20 skinparam titleFontSize 20
skinparam backgroundColor #FAFCF9
skinparam defaultFontColor #28342A
skinparam titleFontColor #28342A
skinparam ArrowColor #628A5B
skinparam linetype ortho
skinparam class {
BackgroundColor #FAFCF9
HeaderBackgroundColor #EAF0E8
BorderColor #547461
ArrowColor #628A5B
FontColor #28342A
}
skinparam note { ' --- Global Colors ---
BackgroundColor #EAF0E8 skinparam backgroundColor #FCFCF7
BorderColor #547461 skinparam defaultFontColor #14180C
FontColor #28342A skinparam titleFontColor #14180C
skinparam ArrowColor #656F33
skinparam class {
BackgroundColor #EBECE3
HeaderBackgroundColor #CBD2B5
BorderColor #4A5837
ArrowColor #656F33
FontColor #14180C
} }
skinparam package { skinparam package {
BackgroundColor #F2F6F0 BackgroundColor #DBEEDD
BorderColor #547461 BorderColor #4A5837
FontColor #28342A FontColor #14180C
} }
skinparam note {
BackgroundColor #DBEEDD
BorderColor #4A5837
FontColor #14180C
}
skinparam monochrome reverse
title The Biergarten Data Pipeline — Planned Architecture title The Biergarten Data Pipeline — Planned Architecture
left to right direction ' ==========================================
' DOMAIN MODELS
' ==========================================
package "Domain Models" { package "Domain Models" {
class Location { class Location {
+ city : std::string + city : std::string
+ state_province : std::string + state_province : std::string
@@ -70,15 +82,6 @@ package "Domain Models" {
+ min_ibu : int + min_ibu : int
+ max_ibu : int + max_ibu : int
} }
note right of BeerStyle
Loaded once at startup from
beer-styles.json via JsonLoader.
Passed as std::span<const BeerStyle>
to IBeerSelectionStrategy.
Generator receives the selected
style as a parameter — it never
reads the palette directly.
end note
class BreweryResult { class BreweryResult {
+ name_en : std::string + name_en : std::string
@@ -102,13 +105,6 @@ package "Domain Models" {
+ bio : std::string + bio : std::string
+ activity_weight : float + activity_weight : float
} }
note right of UserResult
activity_weight assigned by
ICheckinDistributionStrategy
after the full user pool is
committed. Drives J-curve
checkin volume per user.
end note
class CheckinResult { class CheckinResult {
+ checked_in_at : std::string + checked_in_at : std::string
@@ -143,11 +139,6 @@ package "Domain Models" {
+ user : UserResult + user : UserResult
+ generated_at : std::string + generated_at : std::string
} }
note right of GeneratedUser
user_id populated after SQLite
insert. Live FK carried in pool
for checkin and rating references.
end note
class GeneratedCheckin { class GeneratedCheckin {
+ checkin_id : sqlite3_int64 + checkin_id : sqlite3_int64
@@ -172,38 +163,93 @@ package "Domain Models" {
+ n_ctx : uint32_t = 8192 + n_ctx : uint32_t = 8192
+ seed : int = -1 + seed : int = -1
} }
note right of SamplingOptions
Ignored when GeneratorOptions::
use_mocked = true.
end note
class GeneratorOptions { class GeneratorOptions {
+ model_path : std::string + model_path : std::filesystem::path
+ use_mocked : bool = false + use_mocked : bool = false
+ sampling : SamplingOptions + sampling : SamplingOptions
} }
class PipelineOptions { class PipelineOptions {
+ output_path : std::filesystem::path
+ log_path : std::filesystem::path
} }
note right of PipelineOptions
Reserved for future config:
n_locations, concurrency,
output_path, etc.
end note
class ApplicationOptions { class ApplicationOptions {
+ generator : GeneratorOptions + generator : GeneratorOptions
+ pipeline : PipelineOptions + pipeline : PipelineOptions
} }
' --- Domain Model Relationships ---
ApplicationOptions *-- GeneratorOptions ApplicationOptions *-- GeneratorOptions
ApplicationOptions *-- PipelineOptions ApplicationOptions *-- PipelineOptions
GeneratorOptions *-- SamplingOptions GeneratorOptions *-- SamplingOptions
LocationContext *-- Completeness
} }
' ==========================================
' LOGGING
' ==========================================
package "Logging" {
enum LogLevel {
Debug
Info
Warn
Error
}
enum PipelinePhase {
Startup
UserGeneration
BreweryAndBeerGeneration
CheckinGeneration
RatingGeneration
Teardown
}
class LogEntry {
+ timestamp : std::chrono::system_clock::time_point
+ level : LogLevel
+ phase : PipelinePhase
+ message : std::string
+ city : std::optional<std::string>
+ entity_id : std::optional<std::string>
+ worker : std::optional<std::string>
}
interface Logger <<interface>> {
+ Log(level, phase, message,\n city, entity_id, worker) : void
}
class PipelineLogger {
- log_ch_ : BoundedChannel<LogEntry>&
+ Log(level, phase, message,\n city, entity_id, worker) : void
}
class LogWorker {
- log_ch_ : BoundedChannel<LogEntry>&
+ Run() : void
- FormatTimestamp(tp) : std::string
- ToSpdlogLevel(level) : spdlog::level::level_enum
- ToString(phase) : std::string
}
' --- Logging Relationships ---
LogEntry *-- LogLevel
LogEntry *-- PipelinePhase
PipelineLogger ..> LogEntry : emits
LogWorker ..> LogEntry : consumes
}
' ==========================================
' DOMAIN POLICY
' ==========================================
package "Domain Policy" { package "Domain Policy" {
interface IContextStrategy <<interface>> { interface ContextStrategy <<interface>> {
+ QueriesFor(loc : const Location&) : std::vector<std::string> + QueriesFor(loc : const Location&) : std::vector<std::string>
+ MaxContextChars() : size_t + MaxContextChars() : size_t
} }
@@ -218,7 +264,7 @@ package "Domain Policy" {
+ MaxContextChars() : size_t + MaxContextChars() : size_t
} }
interface ISamplingStrategy <<interface>> { interface SamplingStrategy <<interface>> {
+ Sample(locations : const std::vector<Location>&) : std::vector<Location> + Sample(locations : const std::vector<Location>&) : std::vector<Location>
} }
@@ -227,16 +273,9 @@ package "Domain Policy" {
+ Sample(locations : const std::vector<Location>&) : std::vector<Location> + Sample(locations : const std::vector<Location>&) : std::vector<Location>
} }
interface IBeerSelectionStrategy <<interface>> { interface BeerSelectionStrategy <<interface>> {
+ SelectStyles(brewery : const GeneratedBrewery&,\n palette : std::span<const BeerStyle>) : std::vector<BeerStyle> + SelectStyles(brewery : const GeneratedBrewery&,\n palette : std::span<const BeerStyle>) : std::vector<BeerStyle>
} }
note right of IBeerSelectionStrategy
Decides how many beers a brewery
gets and which styles are selected.
Count distribution and style
deduplication logic live here,
not in the orchestrator or generator.
end note
class RandomBeerSelectionStrategy { class RandomBeerSelectionStrategy {
- rng_ : std::mt19937 - rng_ : std::mt19937
@@ -244,24 +283,12 @@ package "Domain Policy" {
- max_beers_ : size_t - max_beers_ : size_t
+ SelectStyles(brewery : const GeneratedBrewery&,\n palette : std::span<const BeerStyle>) : std::vector<BeerStyle> + SelectStyles(brewery : const GeneratedBrewery&,\n palette : std::span<const BeerStyle>) : std::vector<BeerStyle>
} }
note right of RandomBeerSelectionStrategy
Draws a random count in [min, max].
Samples without replacement from
palette to avoid duplicate styles
per brewery.
end note
interface ICheckinDistributionStrategy <<interface>> { interface CheckinDistributionStrategy <<interface>> {
+ AssignActivityWeights(users : std::vector<GeneratedUser>&) : void + AssignActivityWeights(users : std::vector<GeneratedUser>&) : void
+ CheckinsForUser(user : const GeneratedUser&,\n brewery_count : size_t) : size_t + CheckinsForUser(user : const GeneratedUser&,\n brewery_count : size_t) : size_t
+ TimestampFor(user : const GeneratedUser&,\n index : size_t) : std::string + TimestampFor(user : const GeneratedUser&,\n index : size_t) : std::string
} }
note right of ICheckinDistributionStrategy
Owns all statistical policy:
J-curve weight assignment,
bursty weekend timestamps,
per-user checkin volume.
end note
class JCurveCheckinStrategy { class JCurveCheckinStrategy {
- rng_ : std::mt19937 - rng_ : std::mt19937
@@ -273,17 +300,28 @@ package "Domain Policy" {
} }
' ==========================================
' ORCHESTRATION
' ==========================================
package "Orchestration" { package "Orchestration" {
interface DataPreloader <<interface>> {
+ LoadLocations(filepath : const std::filesystem::path&) : std::vector<Location>
+ LoadBeerStyles(filepath : const std::filesystem::path&) : std::vector<BeerStyle>
+ LoadPersonas(filepath : const std::filesystem::path&) : std::vector<Persona>
+ LoadNamesByCountry(filepath : const std::filesystem::path&) : NamesByCountry
}
class BiergartenPipelineOrchestrator { class BiergartenPipelineOrchestrator {
- enrichment_service_ : std::unique_ptr<IEnrichmentService> - preloader_ : std::unique_ptr<DataPreloader>
- enrichment_service_ : std::unique_ptr<EnrichmentService>
- generator_ : std::unique_ptr<DataGenerator> - generator_ : std::unique_ptr<DataGenerator>
- exporter_ : std::unique_ptr<IExportService> - logger_ : std::unique_ptr<Logger>
- brewery_context_strategy_ : std::unique_ptr<IContextStrategy> - exporter_ : std::unique_ptr<ExportService>
- sampling_strategy_ : std::unique_ptr<ISamplingStrategy> - brewery_context_strategy_ : std::unique_ptr<ContextStrategy>
- beer_selection_strategy_ : std::unique_ptr<IBeerSelectionStrategy> - sampling_strategy_ : std::unique_ptr<SamplingStrategy>
- checkin_strategy_ : std::unique_ptr<ICheckinDistributionStrategy> - beer_selection_strategy_ : std::unique_ptr<BeerSelectionStrategy>
- checkin_strategy_ : std::unique_ptr<CheckinDistributionStrategy>
- beer_style_palette_ : std::vector<BeerStyle> - beer_style_palette_ : std::vector<BeerStyle>
- options_ : ApplicationOptions - options_ : ApplicationOptions
-- --
@@ -298,33 +336,39 @@ package "Orchestration" {
- RunCheckinPhase() : void - RunCheckinPhase() : void
- RunRatingPhase() : void - RunRatingPhase() : void
} }
class JsonLoader {
+ {static} LoadLocations(filepath : const std::filesystem::path&) : std::vector<Location>
+ {static} LoadBeerStyles(filepath : const std::filesystem::path&) : std::vector<BeerStyle>
+ {static} LoadPersonas(filepath : const std::filesystem::path&) : std::vector<Persona>
+ {static} LoadNamesByCountry(filepath : const std::filesystem::path&) : NamesByCountry
}
} }
' ==========================================
' INFRASTRUCTURE: PRELOADING
' ==========================================
package "Infrastructure: Preloading" {
class JsonLoader {
+ LoadLocations(filepath : const std::filesystem::path&) : std::vector<Location>
+ LoadBeerStyles(filepath : const std::filesystem::path&) : std::vector<BeerStyle>
+ LoadPersonas(filepath : const std::filesystem::path&) : std::vector<Persona>
+ LoadNamesByCountry(filepath : const std::filesystem::path&) : NamesByCountry
}
}
' ==========================================
' INFRASTRUCTURE: ENRICHMENT
' ==========================================
package "Infrastructure: Enrichment" { package "Infrastructure: Enrichment" {
interface IEnrichmentService <<interface>> { interface EnrichmentService <<interface>> {
+ GetLocationContext(loc : const Location&,\n strategy : const IContextStrategy&) : LocationContext + GetLocationContext(loc : const Location&,\n strategy : const ContextStrategy&) : LocationContext
} }
class WikipediaService { class WikipediaService {
- client_ : std::unique_ptr<WebClient> - client_ : std::unique_ptr<WebClient>
- extract_cache_ : std::unordered_map<std::string, std::string> - extract_cache_ : std::unordered_map<std::string, std::string>
+ GetLocationContext(loc : const Location&,\n strategy : const IContextStrategy&) : LocationContext + GetLocationContext(loc : const Location&,\n strategy : const ContextStrategy&) : LocationContext
- FetchExtract(query : std::string_view) : std::string - FetchExtract(query : std::string_view) : std::string
} }
note right of WikipediaService
extract_cache_ keyed by query string.
Beer pass gets near-100% cache hits
since locations were already fetched
during the brewery pass.
end note
interface WebClient <<interface>> { interface WebClient <<interface>> {
+ Get(url : const std::string&) : std::string + Get(url : const std::string&) : std::string
@@ -338,6 +382,10 @@ package "Infrastructure: Enrichment" {
} }
' ==========================================
' INFRASTRUCTURE: GENERATION
' ==========================================
package "Infrastructure: Generation" { package "Infrastructure: Generation" {
interface DataGenerator <<interface>> { interface DataGenerator <<interface>> {
@@ -347,12 +395,6 @@ package "Infrastructure: Generation" {
+ GenerateCheckin(user : const GeneratedUser&,\n brewery : const GeneratedBrewery&,\n timestamp : const std::string&) : CheckinResult + GenerateCheckin(user : const GeneratedUser&,\n brewery : const GeneratedBrewery&,\n timestamp : const std::string&) : CheckinResult
+ GenerateRating(user : const GeneratedUser&,\n beer : const GeneratedBeer&,\n checkin_id : sqlite3_int64) : RatingResult + GenerateRating(user : const GeneratedUser&,\n beer : const GeneratedBeer&,\n checkin_id : sqlite3_int64) : RatingResult
} }
note right of DataGenerator
GenerateBeer receives BeerStyle
as a parameter. Style selection
and count decisions live in
IBeerSelectionStrategy, not here.
end note
class MockGenerator { class MockGenerator {
+ GenerateBrewery(...) : BreweryResult + GenerateBrewery(...) : BreweryResult
@@ -366,7 +408,7 @@ package "Infrastructure: Generation" {
class LlamaGenerator { class LlamaGenerator {
- model_ : ModelHandle - model_ : ModelHandle
- context_ : ContextHandle - context_ : ContextHandle
- prompt_formatter_ : std::unique_ptr<IPromptFormatter> - prompt_formatter_ : std::unique_ptr<PromptFormatter>
- rng_ : std::mt19937 - rng_ : std::mt19937
+ GenerateBrewery(...) : BreweryResult + GenerateBrewery(...) : BreweryResult
+ GenerateBeer(...) : BeerResult + GenerateBeer(...) : BeerResult
@@ -377,15 +419,8 @@ package "Infrastructure: Generation" {
- Infer(system_prompt, user_prompt,\n max_tokens, grammar) : std::string - Infer(system_prompt, user_prompt,\n max_tokens, grammar) : std::string
- ValidateModelArchitecture() : void - ValidateModelArchitecture() : void
} }
note right of LlamaGenerator
Constructed from GeneratorOptions.
SamplingOptions fields are applied
during Load(). LlamaConfig removed —
GeneratorOptions is the sole
configuration surface.
end note
interface IPromptFormatter <<interface>> { interface PromptFormatter <<interface>> {
+ Format(system_prompt : std::string_view,\n user_prompt : std::string_view) : std::string + Format(system_prompt : std::string_view,\n user_prompt : std::string_view) : std::string
+ ExpectedArchitecture() : std::string_view + ExpectedArchitecture() : std::string_view
} }
@@ -397,6 +432,10 @@ package "Infrastructure: Generation" {
} }
' ==========================================
' INFRASTRUCTURE: PIPELINE CHANNEL
' ==========================================
package "Infrastructure: Pipeline Channel" { package "Infrastructure: Pipeline Channel" {
class "BoundedChannel<T>" as BoundedChannel { class "BoundedChannel<T>" as BoundedChannel {
@@ -410,19 +449,16 @@ package "Infrastructure: Pipeline Channel" {
+ Receive() : std::optional<T> + Receive() : std::optional<T>
+ Close() : void + Close() : void
} }
note right of BoundedChannel
Back-pressure via capacity_ bound.
Stalls fast producers (enrichment ×N)
when the LLM worker cannot keep up.
Close() is the termination signal —
workers drain remaining items then exit.
end note
} }
' ==========================================
' INFRASTRUCTURE: EXPORT
' ==========================================
package "Infrastructure: Export" { package "Infrastructure: Export" {
interface IExportService <<interface>> { interface ExportService <<interface>> {
+ Initialize() : void + Initialize() : void
+ ProcessBrewery(brewery : const GeneratedBrewery&) : sqlite3_int64 + ProcessBrewery(brewery : const GeneratedBrewery&) : sqlite3_int64
+ ProcessBeer(beer : const GeneratedBeer&) : sqlite3_int64 + ProcessBeer(beer : const GeneratedBeer&) : sqlite3_int64
@@ -433,7 +469,7 @@ package "Infrastructure: Export" {
} }
class SqliteExportService { class SqliteExportService {
- date_time_provider_ : std::unique_ptr<IDateTimeProvider> - date_time_provider_ : std::unique_ptr<DateTimeProvider>
- db_handle_ : SqliteDatabaseHandle - db_handle_ : SqliteDatabaseHandle
- insert_location_stmt_ : SqliteStatementHandle - insert_location_stmt_ : SqliteStatementHandle
- insert_brewery_stmt_ : SqliteStatementHandle - insert_brewery_stmt_ : SqliteStatementHandle
@@ -456,15 +492,8 @@ package "Infrastructure: Export" {
- RollbackAndCloseNoThrow() : void - RollbackAndCloseNoThrow() : void
- FinalizeStatements() : void - FinalizeStatements() : void
} }
note right of SqliteExportService
Single writer — no lock contention.
location_cache_ deduplicates city rows.
brewery_cache_ resolves beer FK without
re-querying. Single long-running
transaction committed in Finalize().
end note
interface IDateTimeProvider <<interface>> { interface DateTimeProvider <<interface>> {
+ GetUtcTimestamp() : std::string + GetUtcTimestamp() : std::string
} }
@@ -475,43 +504,55 @@ package "Infrastructure: Export" {
} }
' ==========================================
' GLOBAL RELATIONSHIPS
' ==========================================
' Orchestration ' --- Orchestration Aggregations (Services & Strategies) ---
BiergartenPipelineOrchestrator *-- IEnrichmentService BiergartenPipelineOrchestrator *-- DataPreloader
BiergartenPipelineOrchestrator *-- EnrichmentService
BiergartenPipelineOrchestrator *-- DataGenerator BiergartenPipelineOrchestrator *-- DataGenerator
BiergartenPipelineOrchestrator *-- IExportService BiergartenPipelineOrchestrator *-- ExportService
BiergartenPipelineOrchestrator *-- ICheckinDistributionStrategy BiergartenPipelineOrchestrator *-- CheckinDistributionStrategy
BiergartenPipelineOrchestrator *-- ISamplingStrategy BiergartenPipelineOrchestrator *-- SamplingStrategy
BiergartenPipelineOrchestrator *-- IBeerSelectionStrategy BiergartenPipelineOrchestrator *-- BeerSelectionStrategy
BiergartenPipelineOrchestrator *-- ApplicationOptions BiergartenPipelineOrchestrator *-- ApplicationOptions
BiergartenPipelineOrchestrator ..> JsonLoader BiergartenPipelineOrchestrator *-- Logger
' Policy implementations ' --- Orchestration Aggregations (Data Pools) ---
IContextStrategy <|.. BreweryContextStrategy BiergartenPipelineOrchestrator *-- "0..*" GeneratedUser : user_pool_
IContextStrategy <|.. BeerContextStrategy BiergartenPipelineOrchestrator *-- "0..*" GeneratedBrewery : brewery_pool_
ISamplingStrategy <|.. UniformSamplingStrategy BiergartenPipelineOrchestrator *-- "0..*" GeneratedBeer : beer_pool_
IBeerSelectionStrategy <|.. RandomBeerSelectionStrategy BiergartenPipelineOrchestrator *-- "0..*" GeneratedCheckin : checkin_pool_
ICheckinDistributionStrategy <|.. JCurveCheckinStrategy
' Enrichment ' --- Interfaces & Implementations ---
IEnrichmentService <|.. WikipediaService DataPreloader <|.. JsonLoader
WikipediaService *-- WebClient Logger <|.. PipelineLogger
WikipediaService ..> IContextStrategy ContextStrategy <|.. BreweryContextStrategy
ContextStrategy <|.. BeerContextStrategy
SamplingStrategy <|.. UniformSamplingStrategy
BeerSelectionStrategy <|.. RandomBeerSelectionStrategy
CheckinDistributionStrategy <|.. JCurveCheckinStrategy
EnrichmentService <|.. WikipediaService
WebClient <|.. CURLWebClient WebClient <|.. CURLWebClient
' Generation
DataGenerator <|.. MockGenerator DataGenerator <|.. MockGenerator
DataGenerator <|.. LlamaGenerator DataGenerator <|.. LlamaGenerator
LlamaGenerator *-- IPromptFormatter PromptFormatter <|.. Gemma4JinjaPromptFormatter
ExportService <|.. SqliteExportService
DateTimeProvider <|.. SystemDateTimeProvider
' --- Service Compositions & Dependencies ---
WikipediaService *-- WebClient
WikipediaService ..> ContextStrategy
LlamaGenerator *-- PromptFormatter
LlamaGenerator ..> GeneratorOptions LlamaGenerator ..> GeneratorOptions
IPromptFormatter <|.. Gemma4JinjaPromptFormatter SqliteExportService *-- DateTimeProvider
' Export ' --- Cross-Component Aggregations (Held References) ---
IExportService <|.. SqliteExportService PipelineLogger o-- BoundedChannel : logs to
SqliteExportService *-- IDateTimeProvider LogWorker o-- BoundedChannel : drains from
IDateTimeProvider <|.. SystemDateTimeProvider
' Domain containment ' --- Domain Containment ---
EnrichedCity *-- Location EnrichedCity *-- Location
EnrichedCity *-- LocationContext EnrichedCity *-- LocationContext
GeneratedBrewery *-- Location GeneratedBrewery *-- Location