Roadmap

▸ what shipped, what's next, what's parked, what we won't ship

SOURCE · PLANNED.md + git log · LAST AUDIT 2026-06-04 · v10.38.31


Segmentation roadmap

▸ CLIPSeg → SAM + GroundingDINO

SeekDeep's inpaint pipeline currently uses CLIPSeg (under 200 MB, fast startup, fits in a typical VRAM budget). It works, but has hard limits on resolution, semantic overlap, and small-object detection. A future upgrade path (SAM + GroundingDINO) should improve segmentation quality substantially, but adds 1.5-3 GB+ of models and complex install dependencies.

▸ Current behavior · CLIPSeg

Model: CIDAS/clipseg-rd64-refined. Takes a text query (e.g. "the wizard") and returns a low-resolution heatmap of pixel probabilities. The heatmap is upscaled to the image size and thresholded (> 0.3) into a binary mask. The mask is then grown with MaxFilter(21) dilation and softened with GaussianBlur(8) feathering.
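The post-processing chain above can be sketched with Pillow. This is a minimal illustration, not SeekDeep's actual code: the function name heatmap_to_mask, the bilinear upscale, and the assumption that the heatmap arrives as an 8-bit "L" image (probability 1.0 stored as 255) are all hypothetical. Only the 0.3 threshold, MaxFilter(21), and GaussianBlur(8) come from the description.

```python
from PIL import Image, ImageFilter


def heatmap_to_mask(heatmap, size, threshold=0.3):
    """Turn a low-res probability heatmap into a feathered inpaint mask.

    Assumption: `heatmap` is a mode "L" image where 255 means probability 1.0.
    `size` is the (width, height) of the image being inpainted.
    """
    # Upscale the coarse heatmap to the target resolution.
    upscaled = heatmap.resize(size, Image.BILINEAR)

    # Threshold: pixels with probability > threshold become white (255).
    cutoff = round(threshold * 255)
    binary = upscaled.point(lambda p: 255 if p > cutoff else 0)

    # Dilate: a 21x21 max filter grows the mask by about 10 px per side.
    dilated = binary.filter(ImageFilter.MaxFilter(21))

    # Feather: blur the hard edge so the inpaint blends into its surroundings.
    return dilated.filter(ImageFilter.GaussianBlur(8))
```

Dilating before blurring means the soft falloff sits outside the detected object rather than eating into it, which is presumably why the two steps run in that order.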

Limitations

Internal resolution is 64×64 — jagged edges, missing thin features (fingers, poles). Often segments multiple similar objects together. Very small objects fail to excite the text encoder and return empty masks.

Future · SAM + GroundingDINO

GroundingDINO generates high-quality bounding boxes from text. Segment Anything Model (SAM) uses those boxes as prompts to produce pixel-perfect instance masks. Expected: sharp edges, distinct objects, robust small-object detection.

▸ DEFERRED RATIONALE — SAM + GroundingDINO combined add 1.5-3 GB+ of models, complex dependency chains (torchvision, supervision, custom CUDA extensions), and routinely exceed consumer GPU VRAM when running alongside SDXL + chat + vision. Mask preview (@SeekDeep mask preview <target>) is a stepping stone — debug CLIPSeg locally before spending diffusion cycles.

Persistent memory · shipped

▸ on-disk JSON · 9 /memory/* endpoints · v10.35.x

No longer parked. Persistent memory shipped in v10.35.x — data/user-facts.json is the on-disk store, atomic-written by both the Discord bot and the /memory/* HTTP endpoints. The web playground uses web-owner as the default user-id so single-user setups work without auth.

▸ How it works now

Facts live in data/user-facts.json. Writes are atomic (temp-then-rename), so concurrent writers from the bot and the GUI degrade to last-writer-wins instead of corrupting the file. Memory injection into prompts is plain inclusion, with no semantic search yet, capped at 25 facts / 10 KB per user with oldest-first LRU pruning.
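A rough sketch of the two mechanisms described above, temp-then-rename writes and cap-based pruning. The function names, the idea that each user's facts are a list of strings ordered oldest first, and measuring the 10 KB cap as UTF-8 bytes of fact text are assumptions made for illustration; the real schema of data/user-facts.json is not specified here.

```python
import json
import os
import tempfile

MAX_FACTS = 25          # per-user fact cap from the roadmap
MAX_BYTES = 10 * 1024   # per-user size cap (assumed: UTF-8 bytes of fact text)


def prune_facts(facts):
    """Drop the oldest facts (front of the list) until both caps are met."""
    kept = list(facts)
    while kept and (
        len(kept) > MAX_FACTS
        or sum(len(f.encode("utf-8")) for f in kept) > MAX_BYTES
    ):
        kept.pop(0)
    return kept


def atomic_write_json(path, data):
    """Write JSON to a temp file in the same directory, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic when source and target share a filesystem,
        # so readers see either the old file or the new one, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The temp file is created next to the target so the final rename stays on one filesystem. Two writers racing each other still lose one update (last writer wins), but neither can leave a half-written file behind.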

API · live

GET /memory/users
GET /memory/user/<id>
POST /memory/user/<id>/fact
PATCH /memory/user/<id>/fact/<n>
DELETE /memory/user/<id>/fact/<n>
Plus /memory/presets/<id> for the four KNOWN_PRESETS quick-toggles. Same routes power the Discord /remember /recall /forget commands.
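For orientation, here is a small sketch of how a client might build requests for the per-user routes listed above, using only the standard library. The base URL, the helper name memory_request, the {"fact": ...} JSON body, and treating <n> as a numeric index are all assumptions; only the route shapes and HTTP methods come from the list.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: adjust to your deployment


def memory_request(method, user_id, index=None, fact=None):
    """Build (but do not send) a request for one of the /memory/user routes.

    GET    /memory/user/<id>
    POST   /memory/user/<id>/fact
    PATCH  /memory/user/<id>/fact/<n>
    DELETE /memory/user/<id>/fact/<n>
    """
    path = "/memory/user/" + quote(user_id, safe="")
    if method != "GET":
        path += "/fact"
        if index is not None:
            path += "/%d" % index

    body = None
    headers = {}
    if fact is not None:
        # Hypothetical payload shape; the real endpoint may expect other keys.
        body = json.dumps({"fact": fact}).encode("utf-8")
        headers["Content-Type"] = "application/json"

    return Request(BASE_URL + path, data=body, headers=headers, method=method)
```

A request built this way can be sent with urllib.request.urlopen(req). For a single-user setup, the user id would be the playground default, web-owner.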

Still open

Semantic search / summarization on retrieval is not yet implemented: prompt injection is "include them all" up to the cap. For under ~25 short facts this is fine; for richer profiles we'd want a small local embedding model. Filed under QoL wishlist, not blocking.