Vaishnavh Nagarajan
843 posts
Vaishnavh Nagarajan
@_vaishnavh
Foundations of AI. I like simple & minimal examples and creative ideas. I also like thinking about going beyond the next token 🧮🧸 Google DeepMind | PhD, CMU
New York, NY
vaishnavh.github.io
Joined June 2017
748 Following · 3,770 Followers
  • Pinned
    Vaishnavh Nagarajan
    @_vaishnavh
    Jun 19
    In my next blogpost, I write about how I view technical communication: it's like trying to communicate an escape route to someone without a map but with a catch: you're not with them. You only have a walkie-talkie. Also, they're in panic.
  • Vaishnavh Nagarajan
    @_vaishnavh
    Mar 12, 2024
    🗣️ “Next-token predictors can’t plan!” ⚔️ “False! Every distribution is expressible as product of next-token probabilities!” 🗣️ In work w/ @GregorBachmann1, we carefully flesh out this emerging, fragmented debate & articulate a key new failure. 🔴
    arxiv.org
    The pitfalls of next-token prediction
    Can a mere next-token predictor faithfully model human intelligence? We crystallize this emerging concern and correct popular misconceptions surrounding it, and advocate a simple multi-token...
  • Vaishnavh Nagarajan
    @_vaishnavh
    Dec 9, 2019
    Thrilled that our paper w/ @zicokolter on generalization in deep learning has been selected for the Outstanding New Directions Paper Award at #NeurIPS2019. Extremely grateful to the selection committee, reviewers & many others who provided useful feedback to improve our paper.
    Hugo Larochelle
    @hugo_larochelle
    Dec 8, 2019
    Want to know which NeurIPS papers were selected for an award, and how the selection was done? Check out our latest blog post on the subject: medium.com/@NeurIPSConf/n…
  • Vaishnavh Nagarajan
    @_vaishnavh
    Oct 22, 2021
    I guess it's time to let Twitterverse know that I successfully defended my thesis! arxiv.org/abs/2110.08922 It was deeply rewarding to put this document together as it made me reflect on many aspects of my PhD journey, both technical & personal. Really happy to share it!
    arxiv.org
    Explaining generalization in deep learning: progress and fundamental limits
    This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized and fitting the training data...
  • Vaishnavh Nagarajan
    @_vaishnavh
    Jul 9, 2024
    Looking forward to presenting our #ICML paper advocating multi-token prediction and correcting what it really means to say "next-token prediction cannot do what humans do" --- which is often argued poorly. @GregorBachmann1 and I just updated the camera ready version on arxiv.
  • Vaishnavh Nagarajan
    @_vaishnavh
    Oct 30, 2020
    “Understanding the failure modes of out-of-distribution generalization”, new paper w/ @bneyshabur and @AJAndreassen at Google arxiv.org/abs/2010.15775 We explain why classifiers rely on spurious correlations (e.g. bkgd.) that hold only in training. 1/
  • Vaishnavh Nagarajan
    @_vaishnavh
    Jun 29, 2021
    New work w/ @yidingjiang @_christinabaek & @zicokolter on estimating the generalization error of neural networks by looking at "disagreement rates" on unlabeled data + a theory of why this works really well arxiv.org/abs/2106.13799 1/
  • Vaishnavh Nagarajan
    @_vaishnavh
    Jul 9, 2019
    Excited to share our new blog post **w/ code** (downloadable as Jupyter notebook) locuslab.github.io/2019-07-09-uni… highlighting why current approaches to deriving generalization bounds in deep learning may be severely limited. arxiv.org/abs/1902.04742
  • Vaishnavh Nagarajan
    @_vaishnavh
    Dec 5, 2021
    I've always felt uncomfortable seeing criteria like "(highly) motivated/passionate", "(exceptionally) talented" and "strong" [background in X] appear in calls for PhD/postdoc applications. (1/)
  • Vaishnavh Nagarajan
    @_vaishnavh
    Dec 11, 2023
    Is your student a bit disobedient? 🙅 This may be a good thing! Our new paper on knowledge distillation argues why rebellious students are not just good, but can even be better than the teacher! 🧑‍🎓>>> 🧑‍🏫 arxiv.org/abs/2301.12923 #neurips 1/
    Screenshot of NeurIPS paper titled "On student-teacher deviations in distillation: does it pay to disobey?" The image also mentions the authors and the abstract. These details can be found in the link mentioned in the tweet.
  • Vaishnavh Nagarajan
    @_vaishnavh
    Jun 2, 2025
    📢 New paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue: → LLMs are limited in creativity since they learn to predict the next token → creativity can be improved via multi-token learning & injecting noise ("seed-conditioning" 🌱) 1/ 🧵
    Screenshot of paper title (Roll the dice and look before you leap: going beyond the creative limits of next-token prediction) and authors, with a poster describing the overarching motivation and the type of tasks that are studied.
  • Vaishnavh Nagarajan
    @_vaishnavh
    Jan 8, 2019
    Uploaded our (with @zicokolter) NeurIPS 17(!) workshop spotlight paper "Generalization in Deep Networks: The Role of Distance from Initialization". We argued why it's important to take into account the initialization to explain generalization. arxiv.org/abs/1901.01672
  • Vaishnavh Nagarajan
    @_vaishnavh
    Mar 3, 2021
    If you're worried about spurious correlations in ML models, check out my CMU AI seminar talk based on ICLR'21 work w/ @AJAndreassen and @bneyshabur on "understanding the failure modes of out-of-distribution generalization". arxiv.org/abs/2010.15775 1/3 youtube.com/watch?v=DhPMq_…
    arxiv.org
    Understanding the Failure Modes of Out-of-Distribution Generalization
    Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only during training time, resulting in poor...
  • Vaishnavh Nagarajan
    @_vaishnavh
    Jun 13, 2025
    Wrote my first blog post! I wanted to share a powerful yet under-recognized way to develop emotional maturity as a researcher: making it a habit to read about the ✨past ✨ and learn from it to make sense of the present
    Screenshot from the blog enumerating patterns in the history of science.
