Ember Insights · The Blog

Notes from inside real deployments.

How we think about retrieval; what we've learned shipping Ember into PE firms, top-quartile consultancies, and venture funds; and the occasional opinion the rest of the industry won't have for another year.

Retrieval

Apr 22, 2026 · 8 min read

Why Graph-RAG eats vector search for breakfast in the enterprise

Chunks-and-cosine works for blogs. It quietly falls apart on a deal room, where the answer lives across a memo, three appendices, and a footnote that cites them.

Sam Chen 2026-04-22

Architecture

Apr 08, 2026 · 5 min read

Two files, any agent: how Ember Native plugs into your stack in an afternoon

A pointer file and a manifest. That's the whole integration surface. Here's the design rationale, and what we deliberately left out.

Krishna T 2026-04-08

Field Notes

Mar 19, 2026 · 12 min read

Inside a PE deployment: what a deal team's brain actually looks like

Six weeks at a $40B AUM firm. The agents the partners already use, the documents they wouldn't share with each other, and the structure that finally made the agents useful.

Dela L 2026-03-19

Graph

Mar 02, 2026 · 9 min read

Document graphs, citations, and why chunking lies to you

If a memo references an exhibit and the exhibit references a model, your chunker has no idea. The graph does — and it changes which sources get retrieved.

Sam Chen 2026-03-02

Security

Feb 14, 2026 · 6 min read

The cross-tenant problem nobody talks about

Most enterprise RAG vendors will, given the right prompt, leak across customers. Here's the boring, low-glamour fix — and why it has to live below the retriever.

Krishna T 2026-02-14

Strategy

Jan 28, 2026 · 4 min read

What we actually mean when we say "institutional knowledge"

It's not the Confluence wiki. It's not even the deal memos. It's the way the partners reason about them — and that's the thing agents need next.

Dela L 2026-01-28


Ember Insights has been picked up by

The Information

Stratechery

Bloomberg

TechCrunch

a16z · future

Lenny's Newsletter

Ember Output · Technical Papers

Research from the team building the knowledge layer.

Peer-reviewed work, preprints, and internal technical reports. Most of what we publish is a direct consequence of something we hit in production at a customer — written up so the rest of the field can build on it.

NeurIPS '26 Workshop · Apr 2026

GraphRAG-X: Citation-Aware Retrieval over Heterogeneous Enterprise Corpora

S. Chen, K. Thiagarajan, D. Lin, et al.

A retrieval architecture that treats citations as first-class edges. Outperforms dense baselines by 17.4 points of nDCG@10 on a held-out PE deal-room benchmark.

PDF arXiv BibTeX 14 pp

arXiv preprint · Mar 2026 · 2603.11421

Cross-Tenant Retrieval Without Cross-Tenant Leakage

K. Thiagarajan, S. Chen

A permission-aware retriever that enforces tenant boundaries below the embedding layer. Zero cross-tenant hits across 1.2M adversarial probes; 0.3% latency overhead.

PDF arXiv BibTeX 22 pp

VLDB '26 · Feb 2026 · in submission

Document Graphs as a First-Class Index

D. Lin, S. Chen, K. Thiagarajan

Treating the document graph as a primary access path — not a post-hoc rerank — yields a 4× reduction in tokens-to-correct-answer on multi-hop financial queries.

PDF arXiv BibTeX 18 pp

Tech Report · Jan 2026 · TR-2026-01

Benchmarking Retrieval over Real Private-Equity Deal Rooms

D. Lin, S. Chen, K. Thiagarajan, M. Park

A 12,400-query benchmark across 47 anonymized deal rooms. We release the evaluation harness; the underlying corpora remain under NDA.

PDF Harness BibTeX 31 pp

ICML '25 Workshop · Jul 2025 · R2-FM

Long-Context Models Forget Earlier Citations: A Failure Mode We Should Stop Hiding

S. Chen, K. Thiagarajan

Frontier 200k-context models degrade by 28% on attribution faithfulness past 60k tokens, even when raw recall stays flat. Implications for long-document RAG.

PDF arXiv BibTeX 9 pp

Tech Report · Oct 2025 · TR-2025-04

Ingestion at Audit Standards: A Pipeline for Regulated Verticals

M. Park, D. Lin

How we extract, normalize, and provenance-track 14 document types — including handwritten margin notes on legacy memos — for audit-grade retrieval.

PDF Harness BibTeX 17 pp
