Honored to share a major thread of my PhD research, out now in PNAS. We address a core issue with how models are used for scientific discovery.
Models are so important that they define the entire scientific process... 1/n
My team at genbio.ai is hiring PhD interns this summer to work on sequence-based foundation models (DNA, RNA, protein) as part of an AI-Driven Digital Organism (AIDO).
To apply, please send your CV, GitHub, and website (if applicable) to [email protected]
Pho Tran.
Thanh Vi.
Pho Shizzle.
Spring Kitchen.
Long ago, the four nations lived together in harmony. Then, everything changed when Pho Shizzle attacked.
Good software should be fast, reliable, reusable, and maintainable. A lot of BioML benchmarking is uh… not.
But biology doesn’t standardize to a few data types like language, audio, or images. We’re constantly inventing new ways to measure life... 1/n
Attending my first conference next week -- and I'll be giving a talk! If you'll be at CSHL Biological Data Science, look for our work on inferring sample-specific gene regulatory networks for 7000 tumors. Lots to come, but what a fun start to this project.
#biodata22
Beyond excited to announce the first release of Contextualized (v0.1.0), a statistical machine learning toolbox for estimating models, distributions, and functions with context-dependent behavior. Check out our demos for examples with code!
NEW PREPRINT
Human decisions are nuanced and hard to quantify. Accurate models are too complex to interpret, and interpretable models are too simple to be accurate. But decision models must be both accurate and interpretable to support real decisions!
arxiv.org/abs/2310.07918
I'm a big admirer of what @manntis4 and @CorinWagen have built at rowansci.com in such a short time. Now they've released their first FMs, just as NNPs seem to be at an inflection point. Really excited to dig in on Egret-1 and the future of NNPs. Come learn with us!
📡🧬 Join us for the next Foundation Models for Biology Seminar Series (#FM4Bio) session on June 18 at 9 AM PT, featuring @manntis4.
Eli will present "Egret-1: Pretrained Neural Network Potentials for Efficient and Accurate Bioorganic Simulation." Simulating atomic systems with
The Contextualized Machine Learning White Paper
arxiv.org/abs/2310.11340
w/ @ben_lengerich
Intuition, applications, algorithms, and extensions for contextualized models: models that understand heterogeneity in real data, adapt to new environments, and are explainable by design.
AIDO models are getting bigger, stronger, and more multi-modal, and AIDO.ModelGenerator is making it all possible.
Today's release includes the new AIDO.Tissue, AIDO.StructurePrediction, and AIDO.Protein-RAG models, plus new tutorials, benchmarks, and open-source integrations.
1/ We’ve updated AIDO.ModelGenerator, our open-source framework, which enables scientists and engineers to integrate their own data with all publicly available data using state-of-the-art (SOTA) foundation models. AIDO.ModelGenerator can be used for adapting, benchmarking, and
Tomorrow at NeurIPS I'll be presenting my work on sample-specific graphical models, and using them to model the heterogeneous and patient-specific pathology of cancer with observational data.
Come find me at the Generative AI and Biology workshop, or ping me to chat