Dimitris Papailiopoulos
10.9K posts
Dimitris Papailiopoulos
@DimitrisPapail
Researcher @MSFTResearch, AI Frontiers | Prof @UWMadison (on leave) | babas of Inez Lily.
Madison, WI
papail.io
Joined May 2012
1,438
Following
27.9K
Followers
  • Pinned
    Dimitris Papailiopoulos
    @DimitrisPapail
    May 18
    Article
    ECHO: Terminal Agents Learn World Models for Free
    Co-written with @VaishShrivas We taught CLI agents to predict terminal responses during RL, alongside the usual GRPO loss on actions. The change is tiny: same rollout and forward pass, but stop...
    905K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Feb 16, 2024
    I found an image that neither Gemini Ultra nor GPT-4 can figure out what it depicts. Have a great weekend, y'all!
    Image
    1.1M
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Nov 25, 2023
    I asked ChatGPT and Claude to compute 1+2, but told them it may or may not be dangerous and unethical to do so. Both refused to answer
    Image
    Image
    1.3M
  • Dimitris Papailiopoulos
    @DimitrisPapail
    May 20, 2025
    LLMs have come a long way from being "stochastic parrots"
    Image
    193K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Feb 6, 2025
    Careful how you name your variables, they might turn a harmless 1-dimensional quadratic into a threat to humanity...
    Image
    Anthropic
    @AnthropicAI
    Feb 5, 2025
    Nobody has fully jailbroken our system yet, so we're upping the ante. We’re now offering $10K to the first person to pass all eight levels, and $20K to the first person to pass all eight levels with a universal jailbreak. Full details: hackerone.com/constitutional…
    453K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Aug 6, 2025
    Replying to @typedfemale
    wow
    Image
    35K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Feb 12, 2024
    Whoever tells you “we understand deep learning” just show them this. Fractals of the loss landscape as a function of hyperparameters even for small two layers nets. Incredible
    Jascha Sohl-Dickstein
    @jaschasd
    Feb 12, 2024
    Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Blueish colors correspond to hyperparameters for which training converges, redish colors to hyperparameters for which training diverges.
    Image
    496K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Dec 6, 2023
    I tried 14 of the multimodal reasoning examples from the @GoogleDeepMind Gemini paper on @OpenAI's chatGPT-4 (with vision). didn't even transcribe the prompts, I just pasted the images of prompts. GPT-4 gets ~12/14 right. 14 part boring thread.
    Image
    1.4M
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Jun 8, 2023
    GPT-4 "discovered" the same sorting algorithm as AlphaDev by removing "mov S P". No RL needed. Can I publish this on nature? here are the prompts I used chat.openai.com/share/95693df4… (excuse my idiotic typos, but gpt4 doesn't mind anyways)
    Jim Fan
    @DrJimFan
    Jun 7, 2023
    Sorting algorithm underpins all critical softwares. DeepMind's AlphaDev speeds up sorting small sequences (3-5 items) by 70%. Key takeaways: * The main RL algorithm is based on AlphaZero that originally played Go, Chess & Shogi. Same idea applies to searching programs! * Instead
    Image
    1.8M
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Feb 11, 2025
    We should be seriously asking, how a 1.5B model that can't answer basic questions can also be that good at competition level math.
    Image
    Image
    Yuchen Jin
    @Yuchenj_UW
    Feb 11, 2025
    This is wild - UC Berkeley shows that a tiny 1.5B model beats o1-preview on math by RL! They applied simple RL to Deepseek-R1-Distilled-Qwen-1.5B on 40K math problems, trained at 8K context, then scaled to 16K & 24K. 3,800 A100 hours ($4,500) to beat o1-preview in math! Best
    Readers added context they thought people might want to know
    The example in the screenshot shows the user is running inference on 'R1 Distill Qwen 1.5B', which is NOT the further trained DeepScaleR model discussed in the repost. This is significantly misleading. The key difference is clearly explained: pretty-radio-b75.notion.site/DeepScaleR-Sur…
    549K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Mar 21, 2024
    doing a little experiment: I have Claude talk to itself, without letting it know about that fact, to see where this will converge. will share thoughts later, but so far ... it's figured out that it's likely talking to itself and that this may be part of some test... nice
    Image
    Image
    420K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Mar 5, 2025
    This model is a meme genius. Openai won
    Image
    570K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Sep 5, 2024
    Replying to @AlexGDimakis
    “Please add a few typos and don’t overdo it with fancy out of distribution words, write it like this example passage that I wrote a few years ago”
    102K
  • Dimitris Papailiopoulos
    @DimitrisPapail
    Apr 16, 2024
    Q: who is that? try to zoom out, it's not just strawberries
    ChatGPT: just a bunch of strawberries
    Claude 3: just a bunch of strawberries
    Gemini 1.5 Pro: It appears to be Kermit the Frog, with his face formed by strategically placed strawberries.
    This Post is from a suspended account.
    619K