Introducing soarXiv ✈️, the most beautiful way to explore human knowledge
Take any paper's URL and replace arxiv with soarxiv (show in video) to teleport to its place in the universe
I've embedded all 2.8M papers up until April 2025
Try it at: soarxiv dot org
how do we get truly novel outputs from AI?
major labs including DeepMind and OpenAI have had teams dedicated to open-endedness.
makes sense given the impact it could have on enabling discovery and self-improvement.
doing today's deep dive thread on this field:
It started as a project for mapping the the embeddings of every arXiv paper
Now, soarXiv turned into a truly wonderful tool for discovering papers. I've spent so long just flying around and seeing what niche subjects I can find (so much on physics!)
I built a map of all the papers on arXiv
With my database of semantic paper embeddings, I created an interactive 3D visualization of the papers on arXiv. The result is this beautiful (and fast!) map.
reply if you'd wanna try this and what I should add next
I'm working on a more detailed blog post for how I built it, but here are some essential parts of the stack:
@runpod for embedding all of arXiv in just a few hours
@threejs for visualization
@leland_mcinnes 's UMAP for building the 3D map
I built a map of all the papers on arXiv
With my database of semantic paper embeddings, I created an interactive 3D visualization of the papers on arXiv. The result is this beautiful (and fast!) map.
reply if you'd wanna try this and what I should add next
I don't think he's comparing the relative information density of text vs images, but instead highlighting how general the transformer architecture is for many use-cases beyond text.
See papers like MeshGPT or those that use transformers as GNNs.
Let me know what I should add! Also, if you're a cloud provider, I'll happily accept hosting credits and shill your platform 😂
Some things I'm already thinking about:
- adding many more open science databases (med/bio/chem-rxiv)
- adding new papers daily
- multiplayer! imagine
Introducing soarXiv ✈️, the most beautiful way to explore human knowledge
Take any paper's URL and replace arxiv with soarxiv (show in video) to teleport to its place in the universe
I've embedded all 2.8M papers up until April 2025
Try it at: soarxiv dot org
soarXiv now loads faster than your vibe-coded B2B SaaS.
Most of you should see load times <2s.
This also makes expanding the soarXiv universe much more feasible.
How? Some tech details in thread:
Introducing soarXiv ✈️, the most beautiful way to explore human knowledge
Take any paper's URL and replace arxiv with soarxiv (show in video) to teleport to its place in the universe
I've embedded all 2.8M papers up until April 2025
Try it at: soarxiv dot org