Mar 22, 2026 Why reasoning models break confidence-based hallucination filters When we tested o3-mini against our gating layer, the filter had exactly zero effect. Here is why that result is not a fluke.
Mar 15, 2026 Running a controlled experiment on LLM output gating: methodology, results, and what we'd do differently We built a gate to filter LLM answers before they reach users. Then we ran a controlled experiment across three models to test whether it worked.
Mar 8, 2026 Sanity ships a built-in semantic search API. I had no idea it was this capable. I was about to set up a separate vector store. Then I checked Sanity's docs more carefully.
Mar 1, 2026 I tested Sanity's search APIs on a real LLM pipeline. Not what I expected. I needed a retrieval layer for an LLM benchmark. Sanity had more built in than I thought, and one feature I had not considered changed how I think about auditing.
Feb 10, 2026 Shipping Momentum Without Burnout A simple cadence for shipping weekly without losing the thread.
Nov 8, 2025 Designing Systems That Scale With Teams A few defaults that keep systems readable as headcount grows.