The neural network architecture showcased at @Tesla AI Day is a perfect example of Deep Learning at its finest. Mix and match all the greatest innovations to do something drastic and super ambitious. Congrats!
Exciting times, welcome Gemini (and MMLU>90)! State-of-the-art on 30 out of 32 benchmarks across text, coding, audio, images, and video, with a single model 🤯
Co-leading Gemini has been my most exciting endeavor, fueled by a very ambitious goal. And that is just the beginning!
Hello Gemini 2.5 Flash-Lite! So fast, it codes *each screen* on the fly (Neural OS concept 👇).
The frontier isn't always about large models and beating benchmarks. In this case, a super fast & good model can unlock drastic use cases.
Read more: blog.google/products/gemin…
Distillation has been in the news (!) due to @deepseek_ai. The paper arxiv.org/abs/1503.02531 was actually rejected from NeurIPS 2014 due to lack of novelty 🧐 (true-ish), and lack of impact 🙃.
Thanks reviewer#2 (literally), and thanks for @arxiv!
@geoffreyhinton @JeffDean
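For anyone new to the idea, here is a minimal sketch of the soft-target part of distillation: the student is trained to match the teacher's output distribution after both sets of logits are softened by a temperature. The function names, example logits, and temperature value are illustrative assumptions, and the sketch leaves out the hard-label term and the T² scaling the paper suggests when combining the two.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Illustrative logits only (not taken from any real model).
teacher = [4.0, 1.0, 0.5]
matched = distillation_loss([4.0, 1.0, 0.5], teacher)
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher)
print(matched < mismatched)
```

A student whose logits agree with the teacher gets a lower loss than one that ranks the classes the other way round, which is the signal the student learns from.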
Best GAN samples yet? Very impressive ICLR submission! BigGAN improves Inception Scores by >100.
Paper: openreview.net/pdf?id=B1xsqj0…
Lots more samples: goo.gl/FJoEiG
The "Deep Learning Toolbox" has greatly expanded in the last decade thanks to our wonderful research community. Also, important progress has been made to make our community more inclusive and less toxic. Still, there's LOTS to do, and I plan to keep focusing on advancing both.
Many people mistakenly think that large language models that generate one word at a time are the end game.
Connecting LLMs with tools, e.g. search engines, Python interpreters, etc., is super exciting, and leverages tools' power, robustness & more!
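A toy sketch of what "connecting an LLM with a tool" can look like: the model emits a tool request, the host program runs the tool, and the result is fed back into the prompt. Everything here is an assumption made for illustration: `fake_model` is a hard-coded stand-in rather than a real LLM, and the `CALL name: argument` convention is invented for this example, not any actual product's protocol.

```python
import ast
import operator

# Arithmetic operators the calculator tool is allowed to apply.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression):
    """Evaluate a basic arithmetic expression without using eval()."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)


TOOLS = {"calculator": calculator}


def fake_model(prompt):
    """Hypothetical stand-in for an LLM: asks for a tool, then uses the result."""
    if "RESULT:" in prompt:
        return "The answer is " + prompt.split("RESULT:")[-1].strip()
    return "CALL calculator: 1234 * 5678"


def answer(question, model=fake_model, max_steps=3):
    """Loop: query the model, run any requested tool, feed the result back."""
    prompt = question
    reply = ""
    for _ in range(max_steps):
        reply = model(prompt)
        if not reply.startswith("CALL "):
            return reply  # plain text means the model is done
        name, _sep, argument = reply[len("CALL "):].partition(":")
        result = TOOLS[name.strip()](argument.strip())
        prompt = f"{question}\nRESULT: {result}"
    return reply


print(answer("What is 1234 * 5678?"))  # prints "The answer is 7006652"
```

The exact arithmetic comes from the tool rather than from token-by-token generation, which is the robustness gain in miniature.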
Welcome back, gradients! This method is orders of magnitude faster than state-of-the-art non-differentiable techniques.
DARTS: Differentiable Architecture Search by Hanxiao Liu, Karen Simonyan, and Yiming Yang.
Paper: arxiv.org/abs/1806.09055
Code: github.com/quark0/darts
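A rough sketch of why gradients are back: the discrete choice between candidate operations is relaxed into a softmax-weighted mixture, so the architecture parameters can be trained by gradient descent and the strongest operation picked at the end. This is a toy under stated assumptions, not the authors' implementation: the three scalar operations, the squared-error objective, and the hyperparameters are made up for illustration, and the sketch skips the network weights and the bilevel train/validation optimization of the full method.

```python
import math

# Hypothetical candidate operations on a single scalar input.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, alphas):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidates."""
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    mixed = sum(w * o for w, o in zip(weights, outputs))
    return mixed, weights, outputs


def alpha_gradient(x, target, alphas):
    """Analytic gradient of (mixed - target)**2 with respect to each alpha."""
    mixed, weights, outputs = mixed_op(x, alphas)
    return [2.0 * (mixed - target) * w * (o - mixed)
            for w, o in zip(weights, outputs)]


def search(x=1.0, target=2.0, steps=200, lr=0.5):
    """Gradient descent on alpha, then keep the highest-scoring operation."""
    alphas = [0.0] * len(CANDIDATE_OPS)
    for _ in range(steps):
        grads = alpha_gradient(x, target, alphas)
        alphas = [a - lr * g for a, g in zip(alphas, grads)]
    names = list(CANDIDATE_OPS)
    best = names[max(range(len(alphas)), key=alphas.__getitem__)]
    return best, alphas


best_op, final_alphas = search()
print(best_op)  # prints "double"
```

Because the mixture is differentiable in alpha, the search is ordinary gradient descent instead of sampling and evaluating many discrete architectures.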