Hrishi
4,189 posts
Hrishi
@hrishioa
Trying to build systems of lasting value at Southbridge.ai. Previously CTO, Greywing (YC W21). Chop wood carry water.
Long form thoughts 🫱
olickel.com
Joined June 2013
2,767 Following · 11.6K Followers

  • Pinned
    Hrishi
    @hrishioa
    Mar 20
    I wasn't sure if we were going to share this, because knowing what doesn't work is often more valuable than seeing what worked. That - and being nervous about sharing your failures. Here's a technical retrospective on our 2025:
    RISC won: Building towards Data AGI
    From southbridge.ai
    4.3K
  • Hrishi
    @hrishioa
    Jul 12, 2025
    Kimi K2 is genuinely impressive. On the same tasks and the same agentic harness, it beats Grok 4 one-on-one. It also looks like it does this without CoT or thinking tokens.
    GitHub - MoonshotAI/Kimi-K2: Kimi K2 is the large language model series developed by Moonshot AI...
    From github.com
    278K
  • Hrishi
    @hrishioa
    Jul 13, 2025
    Kimi is the real deal. Unless it's really Sonnet in a trench coat, this is the best agentic open-source model I've tested - BY A MILE. Here's a slice* of a 4 HOUR run (~1 second per minute) with not much more than 'keep going' from me every 90 minutes or so. The task involved
    211K
  • Hrishi
    @hrishioa
    Dec 11, 2023
    A week ago one of our customers handed us 1000 pages of this (10,000 more to come), and asked us for a RAG solution. We said yes - because we said yes before we saw the document. But we've solved it - and there's a chance it's a strong improvement on all RAG SoTA.
    455K
  • Hrishi
    @hrishioa
    Dec 13, 2023
    No more waiting - we finally have a demo for multi-modal, 'walking' RAG! Still blows my mind - this is an AI that's reading complex diagrams in a document like a human, 'looking' at pages, then 'walking' to more relevant pages until it's found an answer. More details below
    392K
  • Hrishi
    @hrishioa
    Dec 10, 2023
    How many years until open source models take over? Is it never? I've been manually testing 20-30 different models all claiming impressive scores on benchmarks against OpenAI and Anthropic. What I've found: * Super tiny models are becoming insanely good * Medium models are
    470K
  • Hrishi
    @hrishioa
    Dec 29, 2023
    How exactly do Language Models perceive time? This is one of the best papers I've read this year (from Kai Nylund, @ssgrn, @nlpnoah), and here's what it suggests (IMO) 👇
    222K
  • Hrishi
    @hrishioa
    Nov 28, 2023
    This is genuinely blowing my mind - four years of everything we've done at Greywing, finished in 60 seconds. The rest is just me fooling around. Before you ask, it's not the Assistants API - that's why we have interactive charts, abort, <200ms latency.
    453K
  • Hrishi
    @hrishioa
    Jun 1, 2025
    I decompiled Claude Code from just the minified code. Took me 8-10 hours, multiple subagents, and every flagship model from every provider. Holy shit there's a lot in there. Claude Code is NOT just Claude in a loop - there's so much to learn from.
    Claude Code: An analysis | Notion
    From southbridge-research.notion.site
    130K
  • Hrishi
    @hrishioa
    Jan 19, 2024
    This is scary - ETL pipelines and ORMs are likely going away - or at least I shouldn't be getting paid for doing them anymore. This is AI generating thousands of lines of typespecs and DDLs (with no more context than the dataset), and somehow it's all 100% correct. Rant?👇
    166K
  • Hrishi
    @hrishioa
    Mar 24, 2024
    github.com/hrishioa/lumen… leave this here
    142K
  • Hrishi
    @hrishioa
    Jun 21, 2025
    Another way to make Claude Code a 10x engineer for a complex change: 1. Make a plan for the change (if you need it) with Gemini. 2. Open a new branch. 3. Ask Claude to implement the change and maintain a scratchpad.md that is an APPEND-ONLY log with gotchas, judgement
    140K
  • Hrishi
    @hrishioa
    Nov 3, 2023
    Mindblown - this is a 7B local model with 128K context combining Metamorphosis with The Last Question to write a new story using just 10 GB of RAM. Even two months ago this would be unfathomable. Next is to try 20k tokens of SQL DDLs, for complex data (Model below)
    134K
  • Hrishi
    @hrishioa
    Oct 15, 2024
    Turns out I was wrong. Gemini is 30x cheaper for transcription (same quality) if you prompt right and segment to stay under 128k. So how good is it? It's crazy for clean audio (source+code in 🧵) AssemblyAI: 92.06% ($0.21) Flash-002: 92.68% ($0.00679) 🤯 Let me say more 👇
    137K