Reorganize docs by project and archive legacy context files
@@ -0,0 +1,5 @@
# Archive

This folder stores superseded context docs.

- `legacy-context-2026-03-13/`: initial ungrouped context/runbook files moved during project-based reorganization.
@@ -0,0 +1,31 @@
# Architecture

## Request flow

1. User query enters SearXNG (`search.sethpc.xyz`).
2. SearXNG's `json_engine` calls the search endpoint of the SethSearch API.
3. SethSearch queries the local SQLite FTS5 index and returns normalized results.
4. SearXNG merges SethSearch results with those of other engines and renders the result page.

## Data plane

- Index DB: `/opt/sethsearch/articles.db`
- Tables:
  - `documents` (canonical indexed records)
  - `documents_fts` (FTS5 virtual table)
- Source-level scoring and matching occur in SethSearch.
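
As a rough illustration of how the two tables could fit together, the sketch below builds an in-memory index and runs a ranked FTS5 query. Only the table names `documents` and `documents_fts` come from this document; the column names, the external-content layout, and the sample rows are assumptions, and the real schema in `articles.db` may differ.

```python
import sqlite3

# Hypothetical schema: only the two table names are documented,
# every column name below is an assumption.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    source TEXT NOT NULL,
    url TEXT NOT NULL,
    title TEXT NOT NULL,
    body TEXT NOT NULL
);
CREATE VIRTUAL TABLE documents_fts USING fts5(
    title, body, content='documents', content_rowid='id'
);
"""

SAMPLE_ROWS = [
    ("gitea", "https://example.invalid/repo", "Backup scripts",
     "Shell scripts for nightly backups"),
    ("wikijs", "https://example.invalid/wiki", "Network notes",
     "VLAN layout and DNS records"),
]


def build_demo_index() -> sqlite3.Connection:
    """Create an in-memory index and load a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for source, url, title, body in SAMPLE_ROWS:
        cur = conn.execute(
            "INSERT INTO documents (source, url, title, body) VALUES (?, ?, ?, ?)",
            (source, url, title, body),
        )
        # External-content FTS tables must be kept in sync by hand.
        conn.execute(
            "INSERT INTO documents_fts (rowid, title, body) VALUES (?, ?, ?)",
            (cur.lastrowid, title, body),
        )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 10) -> list:
    """Return matching records, best match first (bm25 ranks lower as better)."""
    sql = """
        SELECT d.source, d.url, d.title
        FROM documents_fts
        JOIN documents AS d ON d.id = documents_fts.rowid
        WHERE documents_fts MATCH ?
        ORDER BY bm25(documents_fts)
        LIMIT ?
    """
    rows = conn.execute(sql, (query, limit))
    return [{"source": s, "url": u, "title": t} for s, u, t in rows]
```

Keeping `documents` as the canonical table and pointing the FTS5 table at it avoids storing the text twice, at the cost of having to update both tables on every write.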

## Source adapters

- Caddy snapshot parser: domain discovery and tag generation.
- Gitea adapter: public repo metadata via REST.
- Wiki.js adapter: public crawl with fallback records.
- WordPress adapter: public posts/pages via `/wp-json/wp/v2/...`.
- Emby adapter: media index using server API token and deep links.
- FreshRSS adapter: GReader API article ingest.
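
To make the adapter idea concrete, here is a minimal sketch of how one adapter might map source data to a common record shape, using the WordPress case. The `Record` fields and the function names are hypothetical; the `link`, `title.rendered`, and `excerpt.rendered` keys follow the WordPress REST API post schema. Fetching is left out so the mapping can be read on its own.

```python
import html
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    """Normalized record shape (hypothetical field names)."""
    source: str
    url: str
    title: str
    summary: str


_TAG = re.compile(r"<[^>]+>")


def _plain(fragment: str) -> str:
    """Strip tags and decode entities from a rendered HTML fragment."""
    return html.unescape(_TAG.sub("", fragment)).strip()


def wordpress_records(posts: List[Dict]) -> List[Record]:
    """Map parsed `/wp-json/wp/v2/posts` items to normalized records."""
    return [
        Record(
            source="wordpress",
            url=post["link"],
            title=_plain(post["title"]["rendered"]),
            summary=_plain(post["excerpt"]["rendered"]),
        )
        for post in posts
    ]
```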

## Reliability model

- SethSearch syncs sources independently.
- If one source fails, others continue and commit.
- Service runs under systemd with restart policy.
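
The per-source isolation described above could look roughly like the following. This is a sketch, not the code in `sethsearch.py`: the function name, the `fetch`/`commit` callables, and the status strings are all assumptions.

```python
from typing import Callable, Dict, List


def sync_all(
    sources: Dict[str, Callable[[], List[dict]]],
    commit: Callable[[str, List[dict]], None],
) -> Dict[str, str]:
    """Sync each source on its own so one failure never blocks the rest."""
    status = {}
    for name, fetch in sources.items():
        try:
            records = fetch()
            commit(name, records)  # commit per source, not once at the end
            status[name] = f"ok ({len(records)} records)"
        except Exception as exc:  # contain the failure to this source
            status[name] = f"failed: {exc}"
    return status
```

Committing inside the loop is what lets healthy sources land their records even when a later source raises.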
@@ -0,0 +1,56 @@
# SearXNG Context

Last updated: 2026-03-13 05:27:14 UTC

## Homelab placement

- Cluster: `sethpc`
- SearXNG:
  - CT: `119`
  - Node: `pve173`
  - URL: `https://searxng.sethpc.xyz` and `https://search.sethpc.xyz`
  - Config: `/etc/searxng/settings.yml`
- SethSearch API:
  - CT: `620`
  - Node: `pve173`
  - URL: `https://sethsearch.sethpc.xyz`
  - Service: `sethsearch.service`
  - App path: `/opt/sethsearch/sethsearch.py`
  - Config: `/opt/sethsearch/config.json`
- Caddy:
  - CT: `600`
  - Node: `pve241`
  - Config: `/etc/caddy/Caddyfile`

## Search engines in use

- `sethsearch` (`shortcut: ss`, category: `general`)
  - URL: `https://sethsearch.sethpc.xyz/search?q={query}&source=general&limit=40`
  - Weight: `5.0`
- `sethflix` (`shortcut: sfx`, category: `videos`)
  - URL: `https://sethsearch.sethpc.xyz/search?q={query}&source=sethflix&limit=40`
  - Weight: `5.0`
- `libretranslate` (`shortcut: lt`)
  - Base URL: `https://translate.sethpc.xyz`

## SethSearch sources

- `sites`: Caddy host/domain catalog with tags.
- `gitea`: public repositories.
- `wikijs`: public crawl/fallback page catalog.
- `wordpress`: public pages/posts from `sethfreiberg.com`.
- `emby`: media discovery index (links require account session).
- `freshrss`: article index with stricter matching and lower weight.

## Matching policy

- General (`source=general`): includes Emby with stricter matching.
- Sethflix (`source=sethflix`): Emby only with liberal matching.
- FreshRSS: strict term matching and lower source weight.
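
This document does not define what "strict" and "liberal" mean in practice. One plausible reading, shown below purely as an assumption, is that strict matching requires every query term while liberal matching accepts any of them. The sketch expresses that as an FTS5 query string; `fts_query` and the mode names are hypothetical.

```python
def fts_query(user_query: str, mode: str) -> str:
    """Build an FTS5 MATCH expression for the assumed matching modes."""
    terms = user_query.split()
    if not terms:
        raise ValueError("empty query")
    # Quote each term as an FTS5 string; embedded quotes are doubled.
    quoted = ['"' + term.replace('"', '""') + '"' for term in terms]
    if mode == "strict":
        joiner = " AND "  # assumption: every term must be present
    elif mode == "liberal":
        joiner = " OR "   # assumption: any single term is enough
    else:
        raise ValueError(f"unknown mode: {mode}")
    return joiner.join(quoted)
```

Quoting the terms also keeps user input such as `AND` or `NEAR` from being read as FTS5 operators.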

## API endpoints

- Health: `GET /health`
- Search: `GET /search?q=<query>&source=<group|source>&limit=<n>`
- Stats: `GET /stats`
- Manual sync: `POST /sync`
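
For scripting against these endpoints, a small helper can keep the methods and paths in one place. The sketch below only builds `urllib.request.Request` objects and does not send them; `ENDPOINTS` and `build_request` are illustrative names, and whether `/sync` needs authentication is not stated here.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://sethsearch.sethpc.xyz"

# (method, path) pairs copied from the endpoint list above.
ENDPOINTS = {
    "health": ("GET", "/health"),
    "search": ("GET", "/search"),
    "stats": ("GET", "/stats"),
    "sync": ("POST", "/sync"),
}


def build_request(name: str, **params) -> Request:
    """Return an unsent request for one of the documented endpoints."""
    method, path = ENDPOINTS[name]
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)  # handles spaces and special characters
    return Request(url, method=method)
```

A request built this way can be passed to `urllib.request.urlopen(req, timeout=10)` when a live call is wanted.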
@@ -0,0 +1,36 @@
# Operations Runbook

## Common commands

- SethSearch service status:
  - `ssh pve173 "pct exec 620 -- systemctl status sethsearch --no-pager"`
- SethSearch logs:
  - `ssh pve173 "pct exec 620 -- journalctl -u sethsearch -n 100 --no-pager"`
- SearXNG service status:
  - `ssh pve173 "pct exec 119 -- systemctl status searxng --no-pager"`
- SearXNG logs:
  - `ssh pve173 "pct exec 119 -- journalctl -u searxng -n 100 --no-pager"`
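
All four commands share the same `ssh <node> "pct exec <ct> -- ..."` shape, so a script that wraps them only needs one builder. The helper below is a hypothetical convenience, not part of the repository; it returns an argv list and leaves execution to the caller.

```python
import shlex
from typing import List


def pct_exec(node: str, ct: int, *args: str) -> List[str]:
    """Build the argv for: ssh <node> "pct exec <ct> -- <args...>"."""
    # Quote each argument so the remote shell sees it as one word.
    remote = f"pct exec {ct} -- " + " ".join(shlex.quote(a) for a in args)
    return ["ssh", node, remote]
```

For example, `subprocess.run(pct_exec("pve173", 620, "systemctl", "status", "sethsearch", "--no-pager"))` would run the first command in the list.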

## Verify behavior

- General search endpoint:
  - `curl -s "https://sethsearch.sethpc.xyz/search?q=home&source=general&limit=5"`
- Sethflix endpoint:
  - `curl -s "https://sethsearch.sethpc.xyz/search?q=always%20sunny&source=sethflix&limit=5"`
- Stats:
  - `curl -s "https://sethsearch.sethpc.xyz/stats"`

## Config touchpoints

- SethSearch config: `/opt/sethsearch/config.json`
- SethSearch code: `/opt/sethsearch/sethsearch.py`
- SearXNG config: `/etc/searxng/settings.yml`
- Caddy config: `/etc/caddy/Caddyfile`

## Change protocol

1. Edit SethSearch code/config.
2. Restart SethSearch and verify `/health` and `/stats`.
3. Edit SearXNG engines (if needed).
4. Restart SearXNG and verify `/config` engine list.
5. Validate top query use-cases.
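
If the protocol is ever scripted, the ordering suggests stopping as soon as a verification fails, so that SearXNG is not touched while SethSearch is unhealthy. The protocol itself only gives the order, so the stop-on-failure behaviour is an assumption, and `run_steps` is a hypothetical helper.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_steps(steps: List[Step]) -> List[str]:
    """Run checks in order and stop at the first one that fails."""
    log = []
    for name, check in steps:
        ok = check()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            break  # later steps depend on this one, so do not continue
    return log
```

Each step's callable would wrap a real action, such as a service restart or a request to `/health`, and return `True` on success.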
@@ -0,0 +1,24 @@
# SethSearch API Layer

## Live deployment

- Host CT: 620 (`sethsearch-api`)
- URL: `https://sethsearch.sethpc.xyz`
- App: `/opt/sethsearch/sethsearch.py`
- Config: `/opt/sethsearch/config.json`

## Source groups

- `source=general`: sites, gitea, wikijs, wordpress, freshrss, emby (strict)
- `source=sethflix`: emby (liberal)
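
One way to picture the `source` parameter is as a lookup that expands a group name into its member sources and matching modes. The mapping below restates the groups listed above; the `"default"` mode for sources without a stated policy, the single-source fallback, and the `resolve` function are assumptions.

```python
from typing import Dict

ALL_SOURCES = ["sites", "gitea", "wikijs", "wordpress", "freshrss", "emby"]

# Group membership as documented; "default" marks sources whose
# matching mode is not specified (an assumption of this sketch).
GROUPS: Dict[str, Dict[str, str]] = {
    "general": {
        "sites": "default",
        "gitea": "default",
        "wikijs": "default",
        "wordpress": "default",
        "freshrss": "strict",
        "emby": "strict",
    },
    "sethflix": {"emby": "liberal"},
}


def resolve(source: str) -> Dict[str, str]:
    """Expand a group or single source name into {source: matching mode}."""
    if source in GROUPS:
        return dict(GROUPS[source])
    if source in ALL_SOURCES:
        return {source: "default"}
    raise ValueError(f"unknown source or group: {source}")
```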

## Weighting overview

- Higher: sites, gitea, wikijs, wordpress, emby
- Lower + strict: freshrss
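
The effect of source weighting can be sketched as a multiplier applied to each hit's relevance before sorting. The numeric weights here are invented for illustration (the real values live in `config.json` and are not listed), and the multiply-then-sort scheme is an assumption about how the weighting is applied.

```python
from typing import List, Tuple

# Hypothetical weights: only the higher/lower split is documented.
SOURCE_WEIGHTS = {
    "sites": 1.0,
    "gitea": 1.0,
    "wikijs": 1.0,
    "wordpress": 1.0,
    "emby": 1.0,
    "freshrss": 0.5,
}

Hit = Tuple[str, str, float]  # (source, title, relevance; higher is better)


def rank(hits: List[Hit]) -> List[Hit]:
    """Scale each hit's relevance by its source weight and sort best first."""
    scored = [(src, title, rel * SOURCE_WEIGHTS[src]) for src, title, rel in hits]
    return sorted(scored, key=lambda hit: hit[2], reverse=True)
```

With these made-up numbers, a FreshRSS article needs roughly twice the raw relevance of a Gitea repository to outrank it.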

## Maintenance

- Manual re-index: `POST /sync`
- Health check: `GET /health`
- Index summary: `GET /stats`
@@ -0,0 +1,19 @@
# SearXNG Layer

This folder documents SearXNG-side integration with SethSearch.

## Active custom engines

- `sethsearch` (general, highest weight)
- `sethflix` (videos, Emby-only)
- `libretranslate` (translate)

## Live config location

- `/etc/searxng/settings.yml` in CT 119 on `pve173`

## Important notes

- SearXNG blocks plain HTTP in engine requests; use HTTPS endpoints.
- Engine names should be lowercase to avoid startup warnings.
- `use_default_settings: true` lets `settings.yml` contain only the overrides, keeping the file small.
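
The first two notes lend themselves to an automated check before SearXNG is restarted. The sketch below assumes the engine entries have already been loaded from `settings.yml` into plain dicts and that URLs live under `search_url` or `base_url`; `lint_engines` is a hypothetical helper, not something SearXNG provides.

```python
from typing import Dict, List


def lint_engines(engines: List[Dict[str, str]]) -> List[str]:
    """Report engines with non-lowercase names or non-HTTPS URLs."""
    problems = []
    for engine in engines:
        name = engine.get("name", "")
        if name != name.lower():
            problems.append(f"{name}: engine name is not lowercase")
        for key in ("search_url", "base_url"):
            url = engine.get(key)
            if url is not None and not url.startswith("https://"):
                problems.append(f"{name}: {key} is not HTTPS")
    return problems
```

An empty list means both rules hold for every engine that was passed in.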