Taco Cohen
1,619 posts
Taco Cohen
@TacoCohen
Slop janitor & post-trainologer at Meta / FAIR. Into codegen, RL, equivariance. Spent time at Qualcomm, Scyfer (acquired), UvA, DeepMind, OpenAI.
tacocohen.wordpress.com
Joined March 2013
3,739
Following
30.1K
Followers
  • Taco Cohen
    @TacoCohen
    Jan 1, 2023
    Surprisingly little AI progress in 2023 so far. What’s going on??
    649K
  • Taco Cohen
    @TacoCohen
    Nov 22, 2023
    An interesting aspect of this discussion is the fact that LLMs will soon start affecting our thoughts, beliefs, mental & linguistic habits, and culture. The idea that we could select a handful of "trustworthy" institutions with the "correct" set of values and beliefs to shape LLM
    Andrej Karpathy
    @karpathy
    Nov 21, 2023
    Thinking a lot about centralization and decentralization these few days.
    714K
  • Taco Cohen
    @TacoCohen
    May 25, 2019
    An easy guide to Gauge Equivariant Convolutional Networks. (I finally get it!) medium.com/@kayzaks/an-ea…
  • Taco Cohen
    @TacoCohen
    Oct 23, 2025
    Exactly. I learned a ton of math during my PhD, and it was fun and easy *because I had a goal* to use it in my research. Coding it up is also a great way to detect gaps in your understanding. Totally different from learning in class. Another common fallacy is that you need to
    Jeremy Howard
    @jeremyphoward
    Oct 22, 2025
    This is empirically incorrect. Hundreds of thousands of fast.ai students have learned the required math for ML as they go. By *far* the biggest problem we've seen is from people who try to learn the math first. They learn the wrong stuff & have not context.
    163K
  • Taco Cohen
    @TacoCohen
    May 26, 2025
    Nobody wants to hear it, but working on data is more impactful than working on methods or architectures.
    Katie Everett
    @_katieeverett
    May 22, 2025
    1. We often observe power laws between loss and compute: loss = a * flops ^ b + c
    2. Models are rapidly becoming more efficient, i.e. use less compute to reach the same loss
    But: which innovations actually change the exponent in the power law (b) vs change only the constant (a)?
    161K
  • Taco Cohen
    @TacoCohen
    May 28, 2021
    Rumor has it that I don't even have a PhD yet. This is in fact true... 😏 BUT! I am happy to report that I will be graduating before any of the PhD students I'm advising. The thesis is now online and I will be defending Jun 9th, 16.00 CET! Check it out: dare.uva.nl/search?identif…
    Image
  • Taco Cohen
    @TacoCohen
    Apr 30, 2022
    8 years of progress in generative modelling. What a time to be alive
    Image
    Image
  • Taco Cohen
    @TacoCohen
    Jan 21, 2024
    Two weeks ago I joined Meta / FAIR, and I couldn't be more excited about this new chapter. Meta is indeed the only place left that supports highly ambitious long-term oriented & fundamental research projects and has a strong commitment to open science and open source. (and has
    Yann LeCun
    @ylecun
    Jan 19, 2024
    There is literally no other company doing this today:
    - open research towards human-level AI
    - open source AI platform enabling a huge AI ecosystem
    - wearable device to interact with always-on AI assistants
    269K
  • Taco Cohen
    @TacoCohen
    Sep 25, 2024
    🚨 Attention aspiring PhD students: Meta / FAIR is looking for candidates for a joint academic/industry PhD! 🚨 Among others, the CodeGen team is looking for candidates to work on world models for code, discrete search & continuous optimization methods for long-term planning,
    Image
    141K
  • Taco Cohen
    @TacoCohen
    Apr 11, 2018
    Best paper award for our ICLR paper, "Spherical CNNs"! Read it while it's hot 🔥 arxiv.org/abs/1801.10130 🔥
    Image
    GIF
  • Taco Cohen
    @TacoCohen
    Aug 12, 2024
    Fascinating paper, showing that transformers are energy-based models in disguise .. And this insight leads to an efficient decoding algorithm
    Vasu Shyam
    @vasud3vshyam
    Aug 12, 2024
    Ever looked at the attention operation and said "hang on, that's a one-point function!"?
    Image
    63K
  • Taco Cohen
    @TacoCohen
    Jul 18, 2023
    Llama-2 is coming to your phone:
    Image
    Qualcomm Works with Meta to Enable On-device AI Applications Using Llama 2 | Qualcomm
    From qualcomm.com
    464K
  • Taco Cohen
    @TacoCohen
    Oct 12, 2021
    So these "Multi-Headed Vision Transformers", are they in the room with us right now?
    Image
  • Taco Cohen
    @TacoCohen
    May 1, 2019
    Interested in geometric and equivariant deep learning? Check out our latest paper on Gauge Equivariant CNNs, where we show how gauge theory makes it possible to build CNNs on general manifolds: arxiv.org/abs/1902.04615
    Image