Aaron Po
c8db2ed06c
Integrate SQLite export functionality
2026-04-19 12:01:43 -04:00
Aaron Po
1b242e86b5
Improve type safety, update logging, remove unused paths
2026-04-18 19:18:21 -04:00
Aaron Po
8a6cbe5efd
Fix stale/inaccurate documentation
2026-04-18 19:00:13 -04:00
Aaron Po
056fb47b93
documentation updates
2026-04-18 18:23:30 -04:00
Aaron Po
88527f7709
make prompt formatter unique ptr
2026-04-18 18:21:00 -04:00
Aaron Po
49f4ed6787
Add activity diagram
2026-04-18 16:01:53 -04:00
Aaron Po
4d4b897d02
add activity diagram
2026-04-18 15:59:25 -04:00
Aaron Po
f71e4ddc83
refactor prompt placeholders for consistency
2026-04-18 15:49:58 -04:00
Aaron Po
212077793e
add example to readme
2026-04-18 15:45:31 -04:00
Aaron Po
e6d1954506
update readme/prompts
2026-04-18 15:27:27 -04:00
Aaron Po
ce56532728
Update readme
2026-04-18 12:56:34 -04:00
Aaron Po
9649c993e8
Add local language handling
2026-04-18 01:38:50 -04:00
Aaron Po
f782fdb51d
Add localized name/description to data models
2026-04-17 22:08:26 -04:00
Aaron Po
fcc7a5dc8b
Enhance ValidateBreweryJson to include reasoning output and update GenerateBrewery to use user_prompt
Add gemma parser
2026-04-17 16:41:14 -04:00
Aaron Po
44a74ed2ad
update chatprompt and llama prompt handling
2026-04-16 15:34:47 -04:00
Aaron Po
6682b5de01
fix llama grammar
2026-04-15 23:28:27 -04:00
Aaron Po
62dfb5e14a
Add llama grammar to ensure proper json output
2026-04-15 13:39:01 -04:00
Aaron Po
ddf4bcb981
cleanup
2026-04-15 00:22:15 -04:00
Aaron Po
15853c62fd
remove const to enable use of std::move
2026-04-13 22:02:31 -04:00
Aaron Po
ff4b7f2578
Use unique_ptr with custom deleter for llama
2026-04-13 21:45:00 -04:00
Aaron Po
3c70c46957
fix include order
2026-04-13 10:03:23 -04:00
Aaron Po
c7abc808ea
Fix naming violations, use of magic numbers in web client get
2026-04-13 00:33:48 -04:00
Aaron Po
ef4f47d415
Update all .cpp files to use .cc extension (google style)
2026-04-13 00:14:20 -04:00
Aaron Po
035b30abba
updates
2026-04-13 00:14:20 -04:00
Aaron Po
1cd30488eb
Code format updates
2026-04-11 23:51:08 -04:00
Aaron Po
823599a96f
Fix style guide errors
2026-04-11 23:46:16 -04:00
Aaron Po
56ec728ba7
Refactor Llama generator, helpers, and build assets
make Gemma 4 the default model, enable thinking mode
style updates
2026-04-11 23:35:17 -04:00
Aaron Po
7ca651a886
updates for gemma-4-E4B-it-Q6_K.gguf
2026-04-09 23:59:38 -04:00
Aaron Po
b53f9e5582
fix: llama backend lifetime, Wikipedia enrichment depth, and misc cleanup
2026-04-09 21:59:46 -04:00
Aaron Po
824f5b2b4f
Refactor BiergartenDataGenerator to use dependency injection container
2026-04-09 20:46:20 -04:00
Aaron Po
5d93d76e99
Refactor data generator constructor and update web client handling; enhance README with detailed pipeline overview and class diagram
2026-04-09 18:19:12 -04:00
Aaron Po
028786b8b5
updates
2026-04-09 17:26:49 -04:00
Aaron Po
d7a31b5264
Create one method per file
2026-04-09 17:19:04 -04:00
Aaron Po
b31be494d7
Update documentation
2026-04-08 22:24:23 -04:00
Aaron Po
7807f0bc2a
Add beer styles json
2026-04-08 21:26:35 -04:00
Aaron Po
772ef0cdfb
Update CMakeLists.txt
2026-04-08 21:25:11 -04:00
Aaron Po
a6e2ea21d0
fix include
2026-04-08 21:24:59 -04:00
Aaron Po
a7cbf7507f
fix location.h
2026-04-08 21:07:28 -04:00
Aaron Po
3c7e74e3c1
update readme
2026-04-08 11:27:37 -04:00
Aaron Po
b1ac3a6068
fix: remove outdated data source information from help message
2026-04-07 18:02:21 -04:00
Aaron Po
06d329cac5
refactor
2026-04-07 17:55:15 -04:00
Aaron Po
54c403526b
fix: improve error handling and logging in data generation pipeline
2026-04-07 13:36:59 -04:00
Aaron Po
b8e96a6d45
replace SQLite geo pipeline with curated in-memory locations
2026-04-07 02:28:15 -04:00
Aaron Po
60ee2ecf74
add prompts
2026-04-03 15:53:04 -04:00
Aaron Po
e4e16a5084
fix: address critical correctness, reliability, and design issues in pipeline
CORRECTNESS FIXES:
- json_loader: Add RollbackTransaction() and call it on exception instead of
  CommitTransaction(). Prevents partial data corruption on parse/disk errors.
- wikipedia_service: Fix invalid MediaWiki API parameter explaintext=true ->
  explaintext=1. Now returns plain text instead of HTML markup in contexts.
- helpers: Fix ParseTwoLineResponse filter to only remove known thinking tags
  (<think>, <reasoning>, <reflect>) instead of any <...> pattern. Prevents
  silently removing legitimate output like <username>content</username>.
RELIABILITY & DESIGN IMPROVEMENTS:
- load/main: Make n_ctx (context window size) configurable via --n-ctx flag
  (default 2048, range 1-32768) to support larger models like Qwen3-14B.
- generate_brewery: Prevent retry prompt growth by extracting location context
  into constant and using compact retry format (error + schema + location only).
  Avoids token truncation on final retry attempts.
- database: Fix data representativeness by changing QueryCities from
  ORDER BY name (alphabetic bias) to ORDER BY RANDOM() for unbiased sampling.
  Convert all SQLITE_STATIC to SQLITE_TRANSIENT to prevent use-after-free risks.
POLISH:
- infer: Advance sampling seed between generation calls to improve diversity
  across brewery and user generation.
- data_downloader: Remove unnecessary commit hash truncation; use full hash.
- json_loader: Fix misleading log message from "RapidJSON" to "Boost.JSON".
2026-04-03 11:58:00 -04:00
Aaron Po
8d306bf691
Update documentation for llama
2026-04-02 23:24:06 -04:00
Aaron Po
077f6ab4ae
edit prompt
2026-04-02 22:56:18 -04:00
Aaron Po
534403734a
Refactor BiergartenDataGenerator and LlamaGenerator
2026-04-02 22:46:00 -04:00
Aaron Po
3af053f0eb
format codebase
2026-04-02 21:46:46 -04:00
Aaron Po
ba165d8aa7
Separate llama generator class src file into method files
2026-04-02 21:37:46 -04:00