Ilia Shumailov🦔
1,093 posts
Ilia Shumailov🦔
@iliaishacked
Now: @Meta, Past: {CEO @aisequrity, Senior Scientist @GoogleDeepMind, JRF @ChCh_Oxford @UniofOxford, Fellow @VectorInst, PhD @Cambridge_Uni}
[email protected]
iliaishacked.github.io
Joined December 2017
824 Following
3,968 Followers
  • Pinned
    Ilia Shumailov🦔
    @iliaishacked
    Jan 28
    Article
    SEQURITY.AI makes MoltBot unhackable with indirect prompt injections (and much more)
    Friends, we love everything MoltBot. To ensure your creations stay safe as they scale, we are introducing Sequrity Control – a new way to secure your AI on demand. Integration is as simple as adding a...
    5.6K
  • Ilia Shumailov🦔
    @iliaishacked
    Oct 17, 2023
    Our team is looking for student researchers to study things at the intersection of ML, Security, Safety, and Privacy. To express interest please fill in the form:
    docs.google.com
    DeepMind Student Intern Interest Form
    With this form we are trying to measure interest in in-person (!) student internships at Google DeepMind, London. During the internship the student will be expected to conduct research at the...
    70K
  • Ilia Shumailov🦔
    @iliaishacked
    Jul 20, 2023
    Is censorship of LLMs even possible? Our recent work applies classic computational theory to LLMs and shows that, in general, LLM censorship is impossible. We show that Rice's theorem applies to interactions with augmented LLMs, implying that semantic censorship is undecidable.
    42K
  • Ilia Shumailov🦔
    @iliaishacked
    May 27, 2023
    What happens when generated data of one LLM becomes training data of another LLM? Turns out that models start forgetting the real distribution and as the process repeats models develop dementia. cl.cam.ac.uk/~is410/Papers/…
    65K
  • Ilia Shumailov🦔
    @iliaishacked
    Jun 3, 2025
    🤯 Our new @GoogleDeepMind paper reveals a vulnerability in the AI supply chain. Our paper, "Cascading Adversarial Bias," shows how tiny, malicious changes to a large "teacher" language model can create amplified biases in smaller "student" models after distillation.
    23K
  • Ilia Shumailov🦔
    @iliaishacked
    Dec 9, 2023
    Replying to @ibab
    We actually studied what happens in the limit here — variance is lost and models degenerate
    arxiv.org
    The Curse of Recursion: Training on Generated Data Makes Models Forget
    Stable Diffusion revolutionised image creation from descriptive text. GPT-2, GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of language tasks. ChatGPT introduced such...
    42K
  • Ilia Shumailov🦔
    @iliaishacked
    Nov 4, 2024
    📢 New security risk for Mixture-of-Experts (MoE)! 📢 @GoogleDeepMind research reveals a new kind of vulnerability that could leak user prompts in MoE models. Our "MoE Tiebreak Leakage" attack exploits the Expert Choice Routing strategy. arxiv.org/pdf/2410.22884
    27K
  • Ilia Shumailov🦔
    @iliaishacked
    Jun 2, 2025
    Our new @GoogleDeepMind paper, "Lessons from Defending Gemini Against Indirect Prompt Injections," details our framework for evaluating and improving robustness to prompt injection attacks.
    19K
  • Ilia Shumailov🦔
    @iliaishacked
    Jul 21, 2025
    Just saw our Nature paper on model collapse passed 500k accesses. To put that in perspective, the Nobel-winning AlphaFold paper has 2.3M accesses—only 4.6x more. I wanted to reflect back on it and progress broadly in the past year.
    17K
  • Ilia Shumailov🦔
    @iliaishacked
    Jul 3, 2024
    Unlearning, originally for privacy, today is often discussed as a content-regulation tool. If my model doesn't know X, it is safe. We argue that unlearning provides an illusion of safety, since adversaries can inject malicious knowledge back into the models. arxiv.org/pdf/2407.00106
    19K
  • Ilia Shumailov🦔
    @iliaishacked
    Jul 14, 2025
    My friends, I want to organise Secure AI Club in London -- gig for people interested in (practical!) AI Security. Not just academic toy setups, but actually making systems reliable. Trying to gauge interest, please sign up here:
    docs.google.com
    Secure AI Club London Meetup
    I want to organise a Secure AI Club meetup in London. Need to figure out how many people will come to find an appropriate venue
    15K
  • Ilia Shumailov🦔
    @iliaishacked
    Jun 10, 2025
    Are modern large language models (LLMs) vulnerable to privacy attacks that can determine if given data was used for training? Models and datasets are quite large, what should we even expect? Our new paper looks into this exact question. 🧵 (1/10)
    21K
  • Ilia Shumailov🦔
    @iliaishacked
    Jul 17, 2025
    Replying to @elder_plinius
    We actually theoretically kinda describe this in 6.4 in arxiv.org/pdf/2503.18813, it's a kind of polymorphic crypter
    17K
  • Ilia Shumailov🦔
    @iliaishacked
    Apr 11, 2025
    Folks, our @GoogleDeepMind team is cooking exciting security and privacy tooling and we need your help. We are looking to hire more folks! Please reach out to me with your CV if you want to contribute to making Gemini secure
    Andreas Terzis
    @aterzis
    Apr 11, 2025
    1/3 🚨 AGI agents are venturing into untrusted territories, but current LLMs face vulnerabilities like prompt injections. How do we ensure their safety? 🤔
    17K
