Log(😅) = 💧Log(😄)
Hossein Mobahi
Rεsεαrch Sciεητisτ @GoogleDeepMind. I ∈ Optimization ∩ Machine Learning. Here to discuss research 🤓. Like heavy music🤘.Origin=🇮🇷 Citizen=🇺🇸.
- Today @GoogleAI officially launched the Google Research YouTube channel 🚀 On the channel, we have a range of content including shows such as (1) Meet a Researcher and (2) ResearchBytes, along with (3) Spotlights.
- 1/5 Self-Distillation loop (feeding predictions as new target values & retraining) improves test accuracy. But why? We show it induces a regularization that progressively limits # of basis functions used to represent the solution. bit.ly/2HnOACo w/@farajtabar P.Bartlett
- My PhD adviser Yi Ma and academic brother John Wright just put their new & free (pre-production) book online "High-Dimensional Data Analysis with Low-Dimensional Models: Principles, Computation, and Applications" book-wright-ma.github.io. Organized, easy read, w/ lots of visuals.
- (0/17) Grab your🍿 for a thread on some mysteries and explanations connecting flat minima, second order optimization, weight noise, gradient norm penalty, and activation functions😱 There is also a video presentation if you prefer:
- 1/5 In July 2016, Jitendra Malik gave an inspiring talk at Google, with a slide showing a block diagram of the visual pathway in a primate. He said "there are a lot of feedback loops as you see". He then stressed that, despite this, current deep neural architectures are mainly feedforward.
- 1/10 Heard about implicit regularization of SGD (i.e. bias toward certain solutions not explicitly stated in the objective function) and wondered why it happens? This thread provides some introductory analysis on why SGD prefers solutions with small norm.
- One of the most touching thesis dedications: "To all the students who had to discontinue their PhD because of toxic work environments, and to all the kind and humble researchers who are striving to make academia a better place." from @_vaishnavh's PhD thesis @SCSatCMU Aug. 2021.
- Are neural networks learning or memorizing? It must be learning; otherwise, how would they generalize so well? But hey, maybe memorization is not against generalization, and maybe it is necessary when there's little training data for some classes. arxiv.org/pdf/2008.03703… by @vitalyFM & Zhang.
- One of the most comprehensive studies of generalization to date; ≈40 complexity measures over ≈10K deep models. Surprising observations worthy of further investigation. Fantastic Generalization Measures: bit.ly/34TqKZs w/ @yidingjiang @bneyshabur @dilipkay S. Bengio
- In case you missed the news: @GoogleAI's Student Researcher Program (i.e. internship, read below though) for 2024 is now live and you can apply here: google.com/about/careers/… Note: Intern is the same as Student Researcher. The only difference is that the former relates to
- 1/11 Earlier this year, I promised to write an introductory thread on calculus of variations and its uses in machine learning. The time has arrived! I am not going to give a rigorous treatment, just a minimal intuition. In one line: it is a formalism for seeking optimal functions.
- After several wonderful years at @GoogleAI, I joined @GoogleDeepMind today. I look forward to continuing my work on foundational ML research with this exceptional team. Big thanks to @hugo_larochelle and @ZoubinGhahrama1 for invaluable support.