Your AI agent is smart but forgetful. Each new session starts from zero: no memory of who you met, what you read, or what you decided last Tuesday. GBrain is an open-source fix for that. Built by Garry Tan (President and CEO of Y Combinator) to power his own OpenClaw and Hermes deployments, it's a markdown-first, Postgres-backed knowledge layer that ingests meetings, emails, tweets, and notes, then auto-wires a typed knowledge graph on top, with zero LLM calls for the graph extraction. The production brain behind Garry's actual agents currently holds 146,646 pages, 24,585 people, 5,339 companies, and 66 autonomous cron jobs. On its own benchmark (BrainBench, a 240-page rich-prose corpus), GBrain hits P@5 49.1% and R@5 97.9%, a +31.4-point P@5 lead over the same codebase with the graph layer disabled.
This is a hands-on tutorial. You'll install GBrain locally, import a small notes folder, run a real search, watch the knowledge graph wire itself, and connect it to Claude Code via MCP. About 20 minutes start to finish. All terminal outputs below were captured from a live install of GBrain v0.38.2.0. The repository (MIT-licensed) lives at github.com/garrytan/gbrain.
What you're building
By the end of the tutorial, you'll have:
A local ~/.gbrain/brain.pglite database: embedded Postgres 17 (via WASM) with pgvector, zero server config.
A small "brain repo" of markdown notes about people, companies, and concepts.
A working hybrid-search CLI that combines vector + BM25 keyword + Reciprocal Rank Fusion (RRF), with a ZeroEntropy reranker on top by default.
A typed knowledge graph (works_at, founded, invested_in, attended, advises, mentions) auto-extracted from your notes.
An MCP server exposing 74 tools so Claude Code, Cursor, and Windsurf can read and write to the brain directly.
Prerequisites
macOS or Linux (Windows users: use WSL2).
A code editor.
Bun ≥ 1.3.10 (the runtime GBrain ships on; the repo's package.json declares this as the minimum engine). We'll install it in Step 1.
An embedding API key from one of: ZeroEntropy (default), OpenAI, or Voyage. Without one, you can still install and run keyword search, but gbrain query (hybrid + vector) will return no results.
Optional: an Anthropic API key for multi-query expansion during search.
Step 1: Install Bun and GBrain
GBrain is written in TypeScript and runs on Bun. Install Bun first:
curl -fsSL https://bun.sh/install | bash
exec $SHELL # reload shell so `bun` is on PATH
bun --version
Now install GBrain. As of v0.38, the canonical install path is a single global Bun install:
gbrain --version
# gbrain 0.38.2.0
Step 2: Initialize your brain
gbrain init --pglite provisions a local PGLite database in ~/.gbrain/. PGLite is full Postgres compiled to WASM: no server, no Docker, ready in roughly two seconds.
For this tutorial we'll defer the embedding provider so you can follow along without an API key right away. We'll wire it up in Step 6 when we run hybrid search:
(If you'd rather configure embeddings now, set one of OPENAI_API_KEY, ZEROENTROPY_API_KEY, or VOYAGE_API_KEY in your environment before running plain gbrain init --pglite.)
Real output captured from a fresh install (truncated for brevity; there are 81 migrations from schema v1 → v85):
Schema version 1 → 85 (81 migration(s) pending)
[2] slugify_existing_pages…
[2] ✓ slugify_existing_pages
[3] unique_chunk_index…
[3] ✓ unique_chunk_index
…
Brain ready at /home/you/.gbrain/brain.pglite
0 pages. Engine: PGLite (local Postgres).
You now have an empty brain. Confirm:
# Pages: 0
# Chunks: 0
# Embedded: 0
# Links: 0
# Tags: 0
# Timeline: 0
Step 3: Create a tiny brain repo
The brain repo is just a directory of markdown files. Each file follows GBrain's compiled truth + timeline pattern: a current best-understanding section on top, an append-only evidence trail below.
Important: wikilinks must use the full slug path (e.g., [[people/alice-chen]], not just [[alice-chen]]) for the graph extractor to resolve them. This is a real gotcha. I tested both forms; the short form silently produces zero links.
mkdir -p ~/my-brain/people ~/my-brain/companies ~/my-brain/concepts
cd ~/my-brain
Create a person page:
cat > people/alice-chen.md <<'EOF'
---
type: person
title: Alice Chen
tags: [founder, ai-infra]
---
Founder and CEO of [[companies/acme-ai]]. Previously staff engineer at
Google Brain. Focus area: inference optimization for small language models.
---
- 2024-03-12: Met at AI Engineer Summit. Discussed sparse MoE routing.
- 2024-09-04: Announced $12M seed led by Sequoia.
- 2025-01-18: Shipped open-source inference router on GitHub.
EOF
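To make the compiled truth + timeline layout concrete, here is a minimal Python sketch that splits a page like the one above into its three parts. It is not GBrain's parser: the `Page` class and `split_page` function are hypothetical names, the frontmatter handling is a naive `key: value` split rather than real YAML, and it assumes the exact layout used in this tutorial (a `---` fenced frontmatter block, prose, a `---` rule, then `- ` bullets).

```python
from dataclasses import dataclass


@dataclass
class Page:
    frontmatter: dict[str, str]
    compiled_truth: str
    timeline: list[str]


def split_page(text: str) -> Page:
    """Split a page into frontmatter, compiled truth, and timeline entries.

    Assumption: sections are separated by lines consisting solely of '---'.
    """
    sections: list[list[str]] = [[]]
    for line in text.strip().splitlines():
        if line.strip() == "---":
            sections.append([])  # a rule starts the next section
        else:
            sections[-1].append(line)

    parts = sections[1:]  # drop the (normally empty) text before the first rule
    front = parts[0] if len(parts) > 0 else []
    body = parts[1] if len(parts) > 1 else []
    tail = parts[2] if len(parts) > 2 else []

    frontmatter: dict[str, str] = {}
    for line in front:
        key, sep, value = line.partition(":")
        if sep:  # naive key/value split; values stay as raw strings
            frontmatter[key.strip()] = value.strip()

    # Timeline entries are the '- ' bullets below the second rule.
    timeline = [l.strip()[2:] for l in tail if l.strip().startswith("- ")]
    return Page(frontmatter, "\n".join(body).strip(), timeline)
```

Pages without a timeline (like the concept page later in this step) simply come back with an empty `timeline` list.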
A company page:
cat > companies/acme-ai.md <<'EOF'
---
type: company
title: Acme AI
tags: [startup, inference]
---
YC W24 inference-optimization startup. Founded by [[people/alice-chen]].
Building latency-aware routing for sub-7B models.
---
- 2024-09-04: $12M seed, led by Sequoia.
- 2025-01-18: Open-sourced their inference router.
EOF
And a concept page:
cat > concepts/inference-optimization.md <<'EOF'
---
type: concept
title: Inference Optimization
tags: [ml-systems]
---
Techniques to reduce latency and cost when serving language models:
quantization, speculative decoding, KV-cache reuse, and request batching.
EOF
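Because short-form wikilinks fail silently, a quick pre-import check can save some head-scratching. The sketch below is a standalone Python helper, not part of GBrain: `short_form_links` and `lint_repo` are hypothetical names, and the regex assumes the common `[[target]]` / `[[target|label]]` wikilink syntax (whether GBrain accepts the label form is an assumption here).

```python
import re
from pathlib import Path

# Matches [[target]] and [[target|label]]; captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def short_form_links(markdown: str) -> list[str]:
    """Return wikilink targets that lack a directory prefix such as 'people/'."""
    targets = (m.group(1).strip() for m in WIKILINK.finditer(markdown))
    return [t for t in targets if "/" not in t]


def lint_repo(root: str) -> dict[str, list[str]]:
    """Map each markdown file under `root` to its short-form wikilinks, if any."""
    report: dict[str, list[str]] = {}
    for path in sorted(Path(root).rglob("*.md")):
        bad = short_form_links(path.read_text(encoding="utf-8"))
        if bad:
            report[str(path)] = bad
    return report
```

Calling `lint_repo` on your brain repo's directory before importing should return an empty dict for the three pages above, since every link uses the full slug path.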
Step 4: Import the repo
gbrain import is idempotent (content-hash deduplicated). We'll pass --no-embed so this step is deterministic for readers who don't have an embedding key set yet; embeddings get backfilled in Step 6. Real output:
[gbrain phase] import.collect_files finished 2ms files=3
Found 3 markdown files
[import.files] 3/3 (100%) imported=3 skipped=0 errors=0
Import complete (0.3s):
3 pages imported
0 pages skipped (0 unchanged, 0 errors)
3 chunks created
Confirm:
# companies/acme-ai company 2026-05-22 Acme AI
# concepts/inference-optimization concept 2026-05-22 Inference Optimization
# people/alice-chen person 2026-05-22 Alice Chen
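If "content-hash deduplicated" sounds abstract, the idea fits in a few lines. This is a toy Python illustration only: the `PageStore` class is hypothetical, it keeps hashes in memory rather than in Postgres, and SHA-256 is an assumption (the tutorial does not say which hash GBrain uses).

```python
import hashlib


class PageStore:
    """Toy in-memory store that skips pages whose content hash is unchanged."""

    def __init__(self) -> None:
        # slug -> hex digest of the content last imported for that slug
        self._hashes: dict[str, str] = {}

    def import_page(self, slug: str, content: str) -> str:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self._hashes.get(slug) == digest:
            return "skipped"  # identical content: re-running the import is a no-op
        self._hashes[slug] = digest
        return "imported"  # new slug or changed content
```

Importing the same page twice yields `imported` and then `skipped`; only an edit to the file changes the digest and triggers a fresh import, which is what makes repeated runs safe.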
Step 5: Wire the knowledge graph
For a first-time import, run the link extractor explicitly to backfill the graph from your wikilinks. This is pure regex + typed inference, with zero LLM calls.
Real output:
Links: created 2 from 3 pages (db source)
Done: 2 links, 0 timeline entries from 3 pages
Two typed edges were inferred from the wikilinks: alice-chen --works_at--> acme-ai (from "Founder and CEO of …") and acme-ai --founded--> alice-chen (from "Founded by …"). The inference cascade fires in order: FOUNDED → INVESTED → ADVISES → WORKS_AT → MENTIONS. No model in the loop.
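To see how an ordered regex cascade can produce typed edges without a model, here is a rough Python sketch. Everything in it is an assumption made for illustration: the patterns in `CASCADE` are invented, the sentence-level context window is a guess, and `infer_links` is a hypothetical name. Only the order of the cascade and the fall-through to mentions come from the description above.

```python
import re

WIKILINK = re.compile(r"\[\[([^\]|]+)\]\]")

# Hypothetical patterns, checked in the cascade order named in the text.
# The first pattern that matches the surrounding sentence wins.
CASCADE = [
    ("founded", re.compile(r"\bfounded\b", re.I)),
    ("invested_in", re.compile(r"\binvest(?:ed|or|s)?\b", re.I)),
    ("advises", re.compile(r"\badvis(?:es|ed|or|er)\b", re.I)),
    ("works_at", re.compile(r"\b(?:ceo|cto|engineer|works)\s+(?:of|at)\b", re.I)),
]


def infer_links(from_slug: str, text: str) -> list[tuple[str, str, str]]:
    """Return (from_slug, link_type, to_slug) for every wikilink in `text`."""
    edges = []
    # Assumption: the sentence containing the link is the inference context.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for match in WIKILINK.finditer(sentence):
            link_type = next(
                (name for name, pattern in CASCADE if pattern.search(sentence)),
                "mentions",  # nothing matched: fall through to the generic edge
            )
            edges.append((from_slug, link_type, match.group(1)))
    return edges
```

With these toy patterns, "Founder and CEO of [[companies/acme-ai]]" yields a works_at edge (the noun "Founder" does not match the verb pattern, but "CEO of" does), while "Founded by [[people/alice-chen]]" yields a founded edge, mirroring the two edges reported above.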
Inspect the graph directly:
# [depth 0] people/alice-chen
# --works_at-> companies/acme-ai (depth 1)
# [
# {
# "from_slug": "people/alice-chen",
# "to_slug": "companies/acme-ai",
# "link_type": "works_at",
# "context": "Founder and CEO of [[companies/acme-ai]]…",
# "link_source": "markdown",
# …
# }
# ]
This is the difference between vector search and structured retrieval. "Who works at Acme AI?" is now a one-hop typed-edge traversal, not a similarity score. That structural channel is what drives the +31.4-point P@5 lift over the graph-disabled variant on BrainBench.
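A one-hop typed-edge traversal is easy to picture in code. The Python sketch below uses a hypothetical in-memory `LinkGraph` class over (from_slug, link_type, to_slug) tuples; GBrain stores its edges in Postgres, so treat this purely as an illustration of the lookup, not of the real storage or query path.

```python
from collections import defaultdict

Edge = tuple[str, str, str]  # (from_slug, link_type, to_slug)


class LinkGraph:
    """Index edges by (target, type) so 'who <type> <target>?' is one lookup."""

    def __init__(self, edges: list[Edge]) -> None:
        self._incoming: dict[tuple[str, str], list[str]] = defaultdict(list)
        for src, link_type, dst in edges:
            self._incoming[(dst, link_type)].append(src)

    def who(self, link_type: str, target: str) -> list[str]:
        """Pages with a `link_type` edge pointing at `target` (one hop)."""
        return sorted(self._incoming.get((target, link_type), []))


graph = LinkGraph([
    ("people/alice-chen", "works_at", "companies/acme-ai"),
    ("companies/acme-ai", "founded", "people/alice-chen"),
])
print(graph.who("works_at", "companies/acme-ai"))  # ['people/alice-chen']
```

The answer is exact: a page either has the edge or it does not, so there is no similarity threshold to tune.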
Step 6: Run a search
GBrain ships two search verbs. gbrain search is keyword-only (BM25 on Postgres tsvector) and works without embeddings:
# [0.3648] companies/acme-ai -- YC W24 inference-optimization startup…
# [0.3648] people/alice-chen -- Founder and CEO of [[companies/acme-ai]]…
gbrain query is the full hybrid pipeline: vector (HNSW on pgvector) + BM25 + Reciprocal Rank Fusion + optional multi-query expansion (Anthropic Haiku) + an optional ZeroEntropy reranker. It needs embeddings, which we deferred in Step 2, so wire them up now:
export OPENAI_API_KEY=sk-…
gbrain config set embedding_model openai:text-embedding-3-large
gbrain embed --all # one-time backfill against your embedding provider
gbrain query "who works on small-model inference?"
Three search modes ship out of the box (conservative, balanced, tokenmax), bundling the cost/quality knobs into one config key. Default is balanced with the ZeroEntropy reranker on. RRF formula: score = sum(1 / (60 + rank)).
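The RRF formula is simple enough to try by hand. Here is a minimal Python sketch of the fusion step, assuming 1-based ranks and one ranked list of slugs per retriever; `rrf_fuse` is a hypothetical helper name, and the tie-break by slug is added here only to keep the output deterministic.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists with Reciprocal Rank Fusion: score = sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):  # assumption: ranks start at 1
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically by slug.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["people/alice-chen", "companies/acme-ai", "concepts/inference-optimization"]
keyword_hits = ["companies/acme-ai", "people/alice-chen"]
for slug, score in rrf_fuse([vector_hits, keyword_hits]):
    print(f"{score:.4f}  {slug}")
```

A page that appears in both lists collects two terms, so it outranks a page that only one retriever found. Because the constant 60 dominates small rank differences, agreement between retrievers matters more than a single top placement.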
Step 7: Connect to Claude Code via MCP
The brain is more useful when an AI agent can read and write to it directly. GBrain exposes 74 tools over the Model Context Protocol via stdio. The canonical setup is one command (not a hand-edited JSON file):
claude mcp add gbrain -- gbrain serve
Verify the install:
# gbrain stdio gbrain serve
Now ask Claude Code something like "search the brain for inference optimization" and it'll route through the search tool and return your indexed results. The exact MCP tool names are plain snake_case: get_page, put_page, delete_page, list_pages, search, query, add_link, get_backlinks, add_tag, and 65 more.
Cursor and Windsurf use the standard MCP JSON config in their respective settings UIs. The server spec is identical:
{
  "mcpServers": {
    "gbrain": { "command": "gbrain", "args": ["serve"] }
  }
}
Claude Desktop uses claude_desktop_config.json for local stdio MCP servers with the same JSON spec. Remote HTTP MCP servers must be added through Settings → Integrations with a bearer token. See docs/mcp/CLAUDE_DESKTOP.md in the repo for the GUI walkthrough.
If you want remote access from any machine, swap stdio for HTTP:
# Bearer auth, default-deny CORS, two-bucket rate limit, per-request audit log.
# Postgres-only by design (PGLite is local-only).
Step 8: Let the brain run itself
GBrain ships an autopilot loop. As of v0.36.4, one command computes a dependency-ordered remediation plan, submits each step as a Minion job, re-checks the brain's health score between steps, and refuses to spend past your cost cap:
Or run it as a daemon:
Healthy brains sleep for 60 minutes between ticks. Unhealthy ones get the full overnight cycle: sync, extract, embed, consolidate, synthesize. Three phases (synthesize, patterns, consolidate) are protected so an MCP-connected agent can't silently burn API credits.
For ad-hoc background work, the Minions queue takes shell jobs and LLM subagent jobs side by side:
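The control flow of such a loop (dependency order, a health re-check between steps, a hard cost cap) can be sketched in a few lines of Python. This is a conceptual illustration and not GBrain's autopilot: `run_plan`, the dependency map, the per-step cost estimates, and the `healthy_at` threshold are all hypothetical, and only the phase names are borrowed from the cycle described above.

```python
from graphlib import TopologicalSorter
from typing import Callable


def run_plan(
    deps: dict[str, set[str]],      # step -> steps that must run before it
    cost: dict[str, float],         # hypothetical estimated cost per step
    run_step: Callable[[str], None],
    health: Callable[[], float],    # hypothetical health score in [0, 1]
    cost_cap: float,
    healthy_at: float = 0.9,
) -> list[str]:
    """Run steps in dependency order until healthy or the cost cap would be hit."""
    spent = 0.0
    done: list[str] = []
    for step in TopologicalSorter(deps).static_order():
        if health() >= healthy_at:
            break  # re-check between steps: stop once the brain has recovered
        if spent + cost[step] > cost_cap:
            break  # refuse to spend past the cap
        run_step(step)
        spent += cost[step]
        done.append(step)
    return done


plan = {"sync": set(), "extract": {"sync"}, "embed": {"extract"}}
costs = {"sync": 0.0, "extract": 0.0, "embed": 1.0}
ran = run_plan(plan, costs, run_step=print, health=lambda: 0.0, cost_cap=0.5)
print(ran)  # ['sync', 'extract']
```

In this toy run the embed step is skipped because its estimated cost would exceed the cap, which is the behavior you want from anything that runs unattended overnight.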
gbrain jobs stats
gbrain jobs work --queue default
One PGLite caveat: gbrain jobs supervisor (the auto-restarting worker daemon) is Postgres-only. PGLite's exclusive file lock blocks the separate worker process, and the CLI rejects with a clear error if config.engine === 'pglite'. If you're on PGLite, stick to inline --follow jobs for the tutorial, or run gbrain migrate --to supabase before standing up a persistent worker.
Routing rule: deterministic work (pull tweets, parse JSON, write a page) goes to Minions; judgment work (triage an inbox, assess priority) goes to LLM sub-agents.
What just happened, in one diagram
(your repo,              (hybrid retrieval +      (HOW to use the brain;
 source of truth)         typed graph)             RESOLVER.md routes intent)
        ▲                                                   │
        └────────────── agent reads/writes ─────────────────┘
The markdown repo is the system of record. GBrain is the retrieval + graph layer over it. The agent reads and writes through both, and humans can always open any .md file and edit it directly; gbrain sync picks up the change.
Where to go next
One-line capture (new in v0.38): gbrain capture "the thought I want to remember" lands directly in inbox/YYYY-MM-DD-. Also accepts --file, --stdin, and webhook ingestion via gbrain serve --http /ingest.
Migrate to Supabase when your brain outgrows local (PGLite is good up to ~50K pages): gbrain migrate --to supabase.
Ingest real data with one of the recipes: voice (Twilio + OpenAI Realtime), email + calendar, 16 embedding providers, credential gateway.
Run the benchmarks in the sibling repo gbrain-evals: BrainBench (synthetic) and gbrain eval longmemeval (the public LongMemEval benchmark).
Author your own skills. A skill is a fat markdown file that encodes a workflow: triggers, checks, quality gate. gbrain check-resolvable validates the skill tree for reachability / MECE / DRY.
The deeper bet behind GBrain is that thin harness, fat skills beats thin skills behind a fat agent. The runtime stays small; the intelligence lives in markdown files the agent reads at decision time. Each commit you make to your brain repo is permanent context your agent inherits the next time it wakes up. The longer you run it, the smarter it gets.
Key Takeaways
GBrain (v0.38.2.0) gives AI agents a persistent, markdown-first memory layer, built by Garry Tan to power his own OpenClaw/Hermes deployments holding 146,646 pages and 24,585 people.
Install runs locally in ~30 minutes on PGLite (Postgres 17 compiled to WASM, zero server) and scales to Supabase or self-hosted Postgres when needed.
Each wikilink is parsed by a regex inference cascade (FOUNDED → INVESTED → ADVISES → WORKS_AT) that writes typed graph edges with zero LLM calls.
Hybrid search (vector + BM25 + RRF + ZeroEntropy reranker) hits P@5 49.1% / R@5 97.9% on BrainBench, a +31.4-point P@5 lift over the graph-disabled variant.
Exposes 74 tools over MCP: wire it into Claude Code with a single claude mcp add gbrain -- gbrain serve and your agent can read/write the brain directly.