Vasu Shyam
632 posts
Vasu Shyam
@vasud3vshyam
Currently working as a machine learning researcher at a Silicon Valley startup. Former physics postdoc at Stanford and Branco Weiss fellow.
San Francisco CA
Joined January 2023
404 Following
1,210 Followers
  • Pinned
    Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Ever looked at the attention operation and said "hang on, that's a one-point function!"?
    Image
    631K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    Well, if the answer to any of these questions was "no", then consider reading arxiv.org/pdf/2408.04093, which I co-authored with @J_Pilault, @nshepperd1, @BerenMillidge, and @QuentinAnthon15
    Jonathan Pilault
    @J_Pilault
    Aug 12, 2024
    Zyphra is proud to release Tree Attention, a fast inference method for extremely large sequence lengths
      • 8x faster inference speed vs. Ring Attention
      • 2x less peak memory
      • low data communication volumes
    Paper: arxiv.org/abs/2408.04093
    Code: github.com/Zyphra/tree_at…
    A 🧵
    Image
    17K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    Having noticed that, did you then write down the generating function?
    Image
    15K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    And then did you see how much faster than Ring Attention this method ends up being for decoding?
    Image
    12K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    Then, did you happen to recall that, thanks to automatic differentiation (timvieira.github.io/blog/post/2016…), the time complexity of computing the gradient of a function is roughly the same as that of computing the function itself?
    Image
    14K
  • Vasu Shyam
    @vasud3vshyam
    Jan 10, 2024
    Finally managed to retire early from my professional physics research career (I know, I know, in time it would have retired me). Eagerly looking forward to going full crackpot as an amateur.
    11K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    Consequently, did you realize that the efficient tree reductions here can be done on a DGX cluster via NCCL Allreduce in a topology-aware, efficiently overlapped manner?
    Image
    12K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    And how much lower the peak memory is?
    Image
    11K
  • Vasu Shyam
    @vasud3vshyam
    Mar 9, 2024
    youtu.be/ZCIho8geEfI?si… Podcast is back! Thanks @quantum_geoff and Suvrat Raju for participating!
    27K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    If you thought this was interesting, wait till you see what else my incredible colleagues at @ZyphraAI are up to!
    12K
  • Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Replying to @vasud3vshyam
    OK, that was a fun 15 minutes of Twitter fame. Now for the downfall: the tweet at the top has an embarrassing typo. Bonus points for whoever catches it.
    8.4K
  • Vasu Shyam
    @vasud3vshyam
    Aug 23, 2023
    John Donoghue @JFdonoghue1033 kindly joined my podcast yesterday and clearly explained in what sense we already have quantum gravity at low energies:
    4.7K
  • Vasu Shyam
    @vasud3vshyam
    Aug 14, 2024
    Replying to @ylecun
    Thanks for sharing! Another little trick that might amuse you: we identified a function whose minimization produces the forward pass of the attention block:
    Image
    2.1K
  • Vasu Shyam
    @vasud3vshyam
    Jan 26, 2023
    Replying to @CburgesCliff
    Thanks for this, Cliff. What's a good reference that you'd recommend to learn about the current status of the Baryon Asymmetry problem?
    1.1K
