I have been playing around with CLIP and StyleGAN. Apparently, CLIP can not only guide StyleGAN to generate images, but can also edit existing ones! Here are a few examples of edits that I have tried on myself and a few celebs. The driving text appears below/above the result.
Or Patashnik
Assistant Professor @ Tel-Aviv University
Joined May 2019
- Happy to share our latest work, “Cross-Image Attention”! 🔀🖼️🔍 We show how we can perform zero-shot appearance transfer by building on the self-attention layers of image diffusion models 😲 Great collaboration led by @yuvalalaluf and @DanielGaribi garibida.github.io/cross-image-at…
- 📢 Today I begin my first semester as faculty in Computer Science at @TelAvivUni! Excited to start this new journey, and grateful to teach & research where my own journey began 🩵
- Ever wondered how a SINGLE token represents all subject regions in personalization? Many methods use this token in cross-attention, meaning all semantic parts share the same single attention value. We present Nested Attention, a mechanism that generates localized attention values
- Finally, StyleCLIP is up at @replicateai ! 🎊 Replicate makes it easy to deploy ML models so that users do not need to install anything or write a single line of code 😲 Enjoy! 😊 replicate.ai/orpatashnik/st…
- Happy to share that my two papers were accepted to #ICCV2021😁. Here are the two papers in action together!😂 Here are me and my colleagues @yuvalalaluf and @zongze_wu inverted with ReStyle and edited with StyleCLIP using the text: "A face of a person whose paper got accepted"
- Just released the code for my final first-author PhD project 😮 I'll be presenting it next week at #SIGGRAPH2025! Not just another attention technique, it's a 𝐍𝐞𝐬𝐭𝐞𝐝 𝐀𝐭𝐭𝐞𝐧𝐭𝐢𝐨𝐧. Built on SDXL, so quality isn't best, but it's super fun to try. Code and demo below👇
- Check out our NFSD! We reexamine SDS and introduce an interpretation that demystifies the necessity for large CFG. With this interpretation, we propose the NFSD process, requiring minimal modifications to SDS, and yet achieving more effective distillation. orenkatzir.github.io/nfsd/
- When I saw Flux Kontext, I thought: "is image editing done?" The wave continued with Qwen-Image, NanoBanana, Reve, etc. But these backbones opened lots of new avenues for image editing research! Our Kontinuous Kontext adds a new level of control with continuous edit strength. “Make it red.” “No! More red!” “Ughh… slightly less red.” “Perfect!” ♥️ 🎚️ Kontinuous Kontext adds slider-based control over edit strength to instruction-based image editing, enabling smooth, continuous transformations!
- Great news! Delighted to share that our paper has been accepted at #ICCV2023! 🎉 Grateful for the enjoyable collaboration with @DanielGaribi, @IdanAzuri, @ElorHadar, and @DanielCohenOr1. Our project page is available at orpatashnik.github.io/local-prompt-m…. Details below 🧵 Localizing Object-level Shape Variations with Text-to-Image Diffusion Models @Gradio demo is out on @huggingface demo: huggingface.co/spaces/orpatas…
- Replying to @OPatashnik: I am sharing my code with you. github.com/orpatashnik/St… Try it, it is fun. I hope you can be more creative.
- To all SDS lovers, especially those who are intrigued by it, our NFSD paper was accepted to @iclr_conf! See you in Vienna 🤩 @KatzirOren @DaniLischinski @DanielCohenOr1
- Today at #SIGGRAPHAsia2024 I will be presenting our work "Consolidating Attention Features for Multi-view Image Editing"! In this work, we use a diffusion model to edit multi-view image sets, consolidating attention features during denoising for consistency.
- Today at 14:30 @DanielGaribi and I are presenting our work in Foyer Sud, poster number 103. Come to say hi! 👋 #ICCV2023















