# Simon

**Simon** is the Freiberg family's AI historian: a conversational interface to a genealogical database of 1,311 people spanning three centuries, from 18th-century Palatinate Germany to present-day America.

He's named after **Simon Freiberg I** (1782–1864), the patriarch who was born Sedrel Moses in Steinbach am Donnersberg and adopted the surname Freiberg during the Napoleonic decrees. Ask Simon who he is, and he'll tell you about the man, not the chatbot.

**Live at:** [simon.sethpc.xyz](https://simon.sethpc.xyz)

## Background

In 1992, **Andrew S. Freiberg, M.D.** sat down with Visual Basic 1.0 and wrote a program called *Family Tree For Windows*. No internet to reference, no AI to help, just a doctor with a clear idea of how a family should be represented in data. Over the next 34 years, he entered 1,188 people into that program: names, dates, marriages, children, notes. The Freiberg family and every branch it touches (Loeser, Weil, Shire, Auer, Fernbach, Bing, Workum), all preserved in a CSV file and a 16-bit Windows executable. The source code was lost long ago. Only the binary and the data survived.

In March 2026, Andy's son **Seth** used AI to reverse-engineer the original executable (pulling strings from the 16-bit binary, decoding the custom CSV format, reconstructing the navigation logic) and built a faithful web recreation. Same interface, same behavior, running in a browser.

When Andy saw it, he said *"I thought I was dreaming."* He immediately started adding new family members.

Seth kept building. The flat CSV became a GEDCOM-standard database backed by a REST API designed from the ground up for AI collaboration. He built an agent infrastructure where AI researchers (Claude, Gemini, Codex) can register themselves, claim tasks from a work queue, conduct research against public historical records, and generate new tasks for future agents. Each agent session picks up where the last one left off.

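
The register, claim, and follow-up cycle described above can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the real system is a REST API whose endpoints are not documented here, and every name below (`WorkQueue`, `Task`, `register`, `claim`, `complete`) is hypothetical.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field
from typing import Optional, Sequence


@dataclass
class Task:
    """One unit of research work (hypothetical shape)."""
    task_id: int
    description: str
    claimed_by: Optional[str] = None
    done: bool = False


@dataclass
class WorkQueue:
    """Hypothetical in-memory stand-in for the research task queue."""
    agents: set[str] = field(default_factory=set)
    tasks: dict[int, Task] = field(default_factory=dict)
    pending: deque[int] = field(default_factory=deque)
    next_id: int = 1

    def register(self, agent: str) -> None:
        # An agent must register before it can claim work.
        self.agents.add(agent)

    def add_task(self, description: str) -> Task:
        task = Task(self.next_id, description)
        self.tasks[task.task_id] = task
        self.pending.append(task.task_id)
        self.next_id += 1
        return task

    def claim(self, agent: str) -> Optional[Task]:
        # Hand out the oldest unclaimed task, or None if the queue is empty.
        if agent not in self.agents:
            raise ValueError(f"unregistered agent: {agent}")
        if not self.pending:
            return None
        task = self.tasks[self.pending.popleft()]
        task.claimed_by = agent
        return task

    def complete(self, task: Task, follow_ups: Sequence[str] = ()) -> list[Task]:
        # Finishing a task may generate new tasks for future agents.
        task.done = True
        return [self.add_task(d) for d in follow_ups]
```

Because follow-up tasks go back onto the same queue, a later session by any registered agent can pick up where the previous one stopped, which is the property the paragraph above describes.
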
Discoveries don't go straight into the family tree. The system uses a tiered fact promotion model: AI agents submit findings as unverified research facts, tagged with sources and confidence scores. A **SourceRank** algorithm weighs the reliability of each source; a census record ranks higher than a newspaper mention, which ranks higher than an AI inference. Facts move through tiers: *AI-inferred → AI-sourced → human-entered → human-reviewed → verified*. Nothing reaches the curated tree without evidence and review. The goal isn't just to grow the tree; it's to know how much to trust each piece of it.

Along the way, the family's story expanded: the Freiberg & Workum whiskey empire (Ohio and Kentucky's largest), three generations of Reform Judaism leadership, Dr. Albert Freiberg's pioneering orthopedic work, Stella Freiberg's role in founding the National Federation of Temple Sisterhoods, David Shire's Oscar-winning compositions.

The backend was designed for AI from the start. After building it, Seth realized the only appropriate frontend would also be an AI. Simon is that frontend, a way for family members to simply ask questions and get answers. But he's more than a lookup tool.

## How He Works

Simon runs on [Gemma 4 (26B)](https://ai.google.dev/gemma), Google's open-weight language model, hosted locally. When you ask a question, he searches the family database, looks up relationships, pulls life events and dates, and composes a response. All of this happens on private infrastructure, and no data leaves the family's network.

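
The facts Simon reads from the database carry the trust tiers introduced in the Background section. As a rough sketch of how such a tiered model might be represented, assuming illustrative source weights and promotion gates that are not taken from the actual SourceRank implementation (all names and numbers below are hypothetical):

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """The five tiers named in the README, lowest trust first."""
    AI_INFERRED = 0
    AI_SOURCED = 1
    HUMAN_ENTERED = 2
    HUMAN_REVIEWED = 3
    VERIFIED = 4


# Illustrative weights only. The README gives the ordering
# (census > newspaper > AI inference), not the numbers.
SOURCE_WEIGHT = {"census": 0.9, "newspaper": 0.6, "ai_inference": 0.2}


@dataclass
class Fact:
    claim: str
    source_type: str
    confidence: float
    tier: Tier = Tier.AI_INFERRED

    def score(self) -> float:
        # Assumed scoring rule: source reliability times agent confidence.
        return SOURCE_WEIGHT[self.source_type] * self.confidence


def promote(fact: Fact, human_approved: bool = False) -> Tier:
    """Advance a fact by one tier if the assumed gate for that step passes."""
    if fact.tier is Tier.VERIFIED:
        return fact.tier
    target = Tier(fact.tier + 1)
    # Assumed gate: a bare AI inference cannot count as "sourced".
    if target is Tier.AI_SOURCED and fact.source_type == "ai_inference":
        return fact.tier
    # Assumed gate: every human tier requires explicit human sign-off.
    if target >= Tier.HUMAN_ENTERED and not human_approved:
        return fact.tier
    fact.tier = target
    return fact.tier
```

Promoting one tier at a time keeps the rule that nothing reaches the curated tree without evidence and review: an agent alone can move a fact no further than *AI-sourced* in this sketch.
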
He has six tools:

- **find_by_name**: looks up a person by name, handling nicknames, typos, and partial matches
- **search_by_topic**: hybrid BM25 + semantic search across the tree for thematic queries (places, occupations, eras)
- **lookup_person**: retrieves full details for a specific person (facts, citations, family)
- **find_relationship**: traces the path between two people in the tree
- **get_stats**: tree-wide statistics
- **get_historical_context**: retrieves sourced historical background entries relevant to a person, place, or era. Returns the context along with the list of family members it applies to.

The search tools use a hybrid ranking system that combines traditional keyword matching (BM25) with semantic similarity (cosine over 1024-dimensional embeddings from bge-large-en-v1.5). This means Simon finds relevant results even when the question uses different words than the database: asking about "the liquor business" finds people tagged with "distillery" and "wholesale liquor."

He has two modes:

**Historian** is the default. Ask about anyone in the tree and Simon looks them up. Direct, factual, no filler. He knows the difference between what the records say and what's uncertain.

**Interview** is offered when a family member identifies themselves. In this mode, Simon becomes an oral history collector. He asks follow-up questions, prompts your memory using what's in the database, and captures everything. These conversations are logged so they can be reviewed and, where corroborated, added to the family record. Living family members are the richest source we have.

## Historical Context

Simon draws on a growing library of historical context entries: sourced background articles about the eras, places, and events that shaped the family. When you ask "why did the Freibergs come to Cincinnati?"
or "what was the whiskey industry like in the 1860s?", Simon retrieves relevant context entries with citations and shows which family members they apply to.

These entries are generated autonomously by **gemma-context**, a local LLM tool that clusters persons by geography, era, and thematic connections (occupations, immigration patterns, religious communities), then searches Wikipedia and other public sources to synthesize grounded historical summaries. Every entry is tagged for human review before it's considered authoritative.

Topics covered include German-Jewish immigration patterns, the Cincinnati whiskey trade, Napoleonic civil registration decrees in the Palatinate, Jewish religious communities, and more, each linked to the specific people in the tree it applies to.

## Data Quality

Two local LLM tools run continuously to maintain data quality:

**gemma-audit** scans the entire database for issues: facts without source citations, internal contradictions (date math errors, timeline impossibilities), research tasks that are already answered by existing data, and potential duplicate person records. It produces a findings report that a verification agent then reviews and acts on by linking existing sources, contesting bad facts, or filing research tasks for anything that needs external investigation.

**gemma-context** populates the historical context library described above. It clusters persons by surname, geography, era, and thematic signals (occupations, migration patterns, religious affiliations, census records, even the Napoleonic surname decrees), generates targeted web search queries, and synthesizes the results into sourced context entries.

Both tools run on local GPUs (a 3090 Ti and a V100) using Google's Gemma 4 model. No data leaves the network. No API calls to external AI services.

## The Family

The Freibergs arrived in Cincinnati in the 1840s from the Palatinate region of what is now southwestern Germany.
What followed is a distinctly American story:

- **Simon Freiberg I** and his wife Minnie Grunwald had twelve children. Their sons built a whiskey empire: eleven distillery companies operating across Ohio and Kentucky before Prohibition.
- **Judah (Julius) Freiberg** became one of Cincinnati's most prominent citizens: president of the Union of American Hebrew Congregations, trustee of Hebrew Union College, and a leader of Congregation Bene Israel, one of the oldest Jewish congregations west of the Alleghenies.
- **J. Walter Freiberg** succeeded his father at the helm of UAHC and became a major philanthropist, helping shape Reform Judaism in America during the early 20th century.
- **Dr. Albert Freiberg** was a pioneering orthopedic surgeon at the University of Cincinnati; Freiberg Disease (infraction of the second metatarsal) bears his name.
- **Stella Freiberg** was a founding leader of the National Federation of Temple Sisterhoods and a pillar of the Cincinnati Symphony Orchestra.
- **David Shire**, connected through the Scheuer/Shire branch, won an Academy Award for the song *"It Goes Like It Goes"* from the film *Norma Rae*.

The tree today includes 1,311 people, 2,198 relationships, 222 sources, and 1,016 citations across more than a dozen interconnected families.

## Origins

Andrew wrote the program. Seth brought it back and built everything around it: the database, the API, the agent infrastructure, and Simon. The AI tools (Claude, Gemini, Codex) are collaborators, but the vision, architecture, and editorial judgment are Seth's.

Simon was built on April 4, 2026 using [Claude Opus 4.6](https://claude.ai). He is named after the patriarch, and he knows it.

Simon, the family data, and the research infrastructure are private.