Today marks the official end of my PhD life at MIT. So grateful for this journey.
Coincidentally, we posted a paper to arXiv today: arxiv.org/abs/2306.00984. It shows the potential of learning from synthetic data.
This coincidence nicely concludes my PhD life in an academic manner.
Yonglong Tian
118 posts
Boston, MA
Joined June 2019
- HNY! Excited to share SynCLR, which rivals CLIP and DINO v2 but uses purely synthetic data. The interesting part: it can outperform models (e.g. CLIP) trained directly on LAION-2B, which was the dataset used to train SD 1.5, the model we used to generate images. arxiv.org/abs/2312.17742
- How does contrastive learning work on large-scale uncurated data? Want to significantly improve it in such a scenario? Check the new "Divide and Contrast (DnC)" work jointly with @avdnoord and @olivierhenaff ArXiv: arxiv.org/abs/2105.08054
- Check this new paper: what makes for good views for contrastive learning? pdf: arxiv.org/pdf/2005.10243… code: github.com/HobbitLong/PyC… - If you like analysis, there are fun experiments and intuitive theory. - If you like SoTA, there are best-performing models to play with.
- MIT is a place for serious research.
- Finally arXived my intern project at @DeepMind from last summer. Though this internship was completely remote due to COVID-19, it was a fantastic journey, thanks to my host @avdnoord (who is super responsive), @olivierhenaff, and many other colleagues who were very helpful. Quoted post: How contrastive learning works on large-scale uncurated data? Want to significantly improve it in such a scenario? Check the new "Divide and Contrast (DnC)" work jointly with @avdnoord and @olivierhenaff ArXiv: arxiv.org/abs/2105.08054
- Denoising Vision Transformers paper page: huggingface.co/papers/2401.02… The paper identifies crucial artifacts in ViTs caused by positional embeddings and proposes a two-stage approach to remove these artifacts, which significantly improves the feature quality of different pre-trained ViTs.
- Contrastive learning discriminates between samples from p(x,y) and samples from p(x)p(y). Cross Entropy: x = input image, y = label. SupCon: x = image from class A, y = another image from class A. IMO, CE is a kind of CL. Quoted post: Contrastive learning is an #ML technique typically used only in the self-supervised setting. Today we present SupCon, a method that bridges the gap between self- and fully supervised learning and consistently performs well on image classification tasks. goo.gle/2TGGWfQ
- Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency paper page: huggingface.co/papers/2310.03… Current vision-language generative models rely on expansive corpora of paired image-text data to attain optimal performance and generalization capabilities.
- It's somewhere connecting *invariance* and *equivariance* representation learning. Quoted post: SSL contrastive works focus on HOW to create pos/neg pairs, but never use this info again. Why not ALSO encode the pair generation method? *New* paper #iccv2021! Composable Augmentation Encoding for Video Representation Learning (w/ @jesu9, @YonglongT, @CordeliaSchmid)! @GoogleAI
- In diffusion models, samplers are primarily ODE-centric, overlooking slower stochastic methods. However, we show that a stochastic sampler can outperform previous samplers on Stable Diffusion, if we use stochasticity correctly! Check out Restart Sampling: arxiv.org/abs/2306.14878
- What will happen if we build views that only share label info in the contrastive learning framework? Check out this new work... arxiv.org/pdf/2004.11362… Quoted post, replying to @dilipkay: It shows clear benefits in top-1 accuracy and robustness; and is more stable across a range of hyperparameters. Joint work with @PrannayKhosla @YonglongT @phillip_isola and other colleagues.
- Replying to @TheGradient and @farajtabar: Congrats Hossein! Our CVPR submission used self-distillation, and one reviewer asked for a theoretical analysis of this technique. Now here it is...
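
The remark in the feed above that cross-entropy (CE) is a kind of contrastive learning (CL) can be checked numerically. The following is a minimal sketch, not code from any of the linked papers: it assumes a bias-free linear classifier, no temperature, and no feature normalization, and it treats each class weight vector as a "key". The function names `cross_entropy` and `info_nce` are hypothetical, chosen only for this illustration. Under those assumptions, the CE loss for a sample matches an InfoNCE-style loss whose positive key is the true class vector and whose negatives are the other class vectors.

```python
import math
import random


def cross_entropy(features, class_weights, label):
    """Softmax cross-entropy of one sample under a bias-free linear classifier."""
    logits = [sum(f * w for f, w in zip(features, row)) for row in class_weights]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]


def info_nce(query, positive, negatives):
    """InfoNCE-style loss: identify the positive key among positive + negatives."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    pos = math.exp(dot(query, positive))
    neg = sum(math.exp(dot(query, k)) for k in negatives)
    return -math.log(pos / (pos + neg))


# Illustrative random data (assumed sizes, fixed seed for repeatability).
rng = random.Random(0)
dim, num_classes, label = 8, 5, 2
features = [rng.gauss(0, 1) for _ in range(dim)]
class_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_classes)]

ce = cross_entropy(features, class_weights, label)
nce = info_nce(
    features,
    class_weights[label],  # positive key: the true class vector
    [w for c, w in enumerate(class_weights) if c != label],  # negative keys
)
print("CE:", ce, "InfoNCE:", nce, "match:", abs(ce - nce) < 1e-9)
```

In this reading, the "y" paired with each image is a learned class vector rather than another image. SupCon, as described in the feed, instead draws y as another image from the same class, which this sketch does not implement.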