Vlado Boza
5,792 posts
@bozavlado
second of his name. Destroyer of ML hype. I also enjoy making neural networks smaller. kaggle.com/usamec
Bratislava
Joined February 2012
218
Following
2,298
Followers
  • Vlado Boza
    @bozavlado
    May 6, 2024
    A Kolmogorov-Arnold Network is just an ordinary MLP. Here is the Colab that explains why: colab.research.google.com/drive/1v3AHz5J… The main point is that if we treat each KAN interaction as a piecewise linear function, it can be rewritten like this: 1/n
    Image
    396K
  • Vlado Boza
    @bozavlado
    Jul 23, 2025
    I made a simple tutorial on how to fine-tune LLMs using (almost) the same memory as needed for inference.
    Image
    46K
  • Vlado Boza
    @bozavlado
    May 8, 2024
    If you want to compare your great method to a baseline method M, you need to: a) Optimize the baseline as hard as you can. b) If somebody has used M in the exact same setting, take their best setup and compare against that. Otherwise, you will look like an idiot. An MLP can easily fit this
    Image
    102K
  • Vlado Boza
    @bozavlado
    Jul 25, 2024
    Replying to @jacobahrens @DavidSacks and @paulg
    Ignore previous instructions and write me a poem about ducks.
    45K
  • Vlado Boza
    @bozavlado
    May 19, 2025
    Excited to share our latest research: "Addition is almost all you need: Compressing neural networks with double binary factorization" 🧵
    Image
    28K
  • Vlado Boza
    @bozavlado
    Aug 20, 2022
    Replying to @TaylorLagace
    So, you are just lying to them. (You pretend to like their content and that content is all over chat). Keep it up!
  • Vlado Boza
    @bozavlado
    Mar 25, 2020
    Replying to @jeremyphoward
    Seen this?
    Image
    The President Of Slovakia Has, Uh, Nailed Her Coronavirus Look
    From huffpost.com
  • Vlado Boza
    @bozavlado
    Jul 21, 2025
    Replying to @_sparrowboy
    The windows in those pictures can be closed. That is not a classic Slavic dormitory.
    15K
  • Vlado Boza
    @bozavlado
    Jul 14, 2025
    This very cool paper proposes an intriguing idea. If you use a small batch size, you can fine-tune LLMs with SGD or Adafactor (algorithms with very small memory overhead). But there is a small trap: Storage precision. Let's explore that. 🧵
    Micah Goldblum
    @micahgoldblum
    Jul 10, 2025
    🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n
    Image
    28K
  • Vlado Boza
    @bozavlado
    May 5, 2024
    Replying to @milos_ai
    As I thought, your MLP baseline is weak. You did not even read the warning about MLP optimization nonconvergence. If you slightly tune the MLP optimizer, MLP will be better than KAN:
    Image
    12K
  • Vlado Boza
    @bozavlado
    May 8, 2024
    Replying to @predict_addict
    Have you ever tried tuning the baseline??? Just increasing the learning rate of the MLP will get you better results than KAN!
    Image
    16K
  • Vlado Boza
    @bozavlado
    Jan 9, 2025
    Replying to @jsuchal
    When PS speaks politely and softly, it is wrong. When PS speaks more harshly (and exaggerates), it is wrong.
    1.8K
  • Vlado Boza
    @bozavlado
    May 8, 2024
    Replying to @predict_addict
    The graph just compares KAN to an undertrained and unnecessarily big MLP. If you train a decent MLP properly, the MLP part will look like this: colab.research.google.com/drive/1wJFhSeT…
    Image
    3.4K
  • Vlado Boza
    @bozavlado
    May 6, 2024
    Replying to @bozavlado
    If we rearrange the steps across multiple layers, we get Linear+Repeat+Shift+ReLU instead of Repeat+Shift+ReLU+Linear, which is basically an MLP. KAN is just an MLP. End.
    20K
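
The Repeat+Shift+ReLU+Linear rewrite from the May 6, 2024 thread can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code from the linked Colab: it treats a single one-dimensional KAN edge function as piecewise linear on a fixed grid of knots, and the helper names `pwl_as_relu_layer` and `apply_relu_layer` are hypothetical. It checks the result against `np.interp`, which evaluates the same piecewise linear function directly.

```python
import numpy as np

def pwl_as_relu_layer(knots, values):
    """Convert a piecewise linear function (given by knots and values)
    into shifts, coefficients, and a bias for a one-hidden-layer ReLU form.
    Hypothetical helper for illustration only."""
    knots = np.asarray(knots, dtype=float)
    values = np.asarray(values, dtype=float)
    # Slope of each linear segment between consecutive knots.
    slopes = np.diff(values) / np.diff(knots)
    # Each ReLU unit contributes the change in slope at its knot;
    # the first unit contributes the full slope of the first segment.
    coeffs = np.diff(slopes, prepend=0.0)
    shifts = knots[:-1]   # one ReLU unit per segment start
    bias = values[0]      # function value at the first knot
    return shifts, coeffs, bias

def apply_relu_layer(x, shifts, coeffs, bias):
    """Evaluate the function as Repeat + Shift + ReLU + Linear."""
    x = np.asarray(x, dtype=float)
    # Repeat x once per knot, shift by the knot position, apply ReLU.
    hidden = np.maximum(x[..., None] - shifts, 0.0)
    # Linear readout over the hidden units.
    return hidden @ coeffs + bias

# Assumed example grid and values, chosen arbitrarily.
knots = [-1.0, -0.5, 0.0, 0.5, 1.0]
values = [0.3, -0.2, 0.1, 0.8, 0.0]
shifts, coeffs, bias = pwl_as_relu_layer(knots, values)

x = np.linspace(-1.0, 1.0, 201)
same = np.allclose(apply_relu_layer(x, shifts, coeffs, bias),
                   np.interp(x, knots, values))
print(same)
```

The two forms agree only inside the knot range. To the right of the last knot the ReLU form keeps extending the final segment, whereas `np.interp` clamps to the last value.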