Unstructured (@UnstructuredIO) / X

Unstructured

1,569 posts

@UnstructuredIO

Stop dilly-dallying. Get your data. 👉🏼 Get Started: unstructured.io

San Francisco, CA

Joined August 2022

Unstructured
@UnstructuredIO
Aug 8, 2025
We're excited to announce that Unstructured has joined the @PalantirTech FedStart program to accelerate our path to FedRAMP High and IL-5 compliance. Through FedStart, we will leverage Palantir’s proven security and accreditation expertise to fast-track the deployment of secure,
20K
Unstructured
@UnstructuredIO
Sep 10, 2024
🚀 Improve your RAG results by adding metadata pre-filtering with @MongoDB and LangGraph by @langchain. Read this blog post by @joshaaayyyy and @mariakhalusova and learn how to add custom metadata extraction to your unstructured data preprocessing pipeline, and how you can
54K
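As a rough illustration of the metadata pre-filtering idea mentioned in the post above: narrow the candidate set by metadata first, then rank only the survivors by vector similarity. This is a toy in-memory sketch, not the MongoDB or LangGraph API and not the blog post's code. The document layout (`text`, `embedding`, `metadata`), the `prefiltered_search` helper, and the two-dimensional embeddings are all assumptions made for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prefiltered_search(query_vec, docs, metadata_filter, k=2):
    """Hypothetical helper: filter on exact metadata matches, then rank by similarity."""
    # Step 1: keep only documents whose metadata matches every filter key.
    candidates = [
        d for d in docs
        if all(d["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]
    # Step 2: rank the remaining candidates by similarity to the query vector.
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["embedding"]), reverse=True)
    return ranked[:k]


# Toy corpus with made-up embeddings and a single metadata field.
docs = [
    {"text": "2023 report", "embedding": [1.0, 0.0], "metadata": {"year": 2023}},
    {"text": "2024 report", "embedding": [0.9, 0.1], "metadata": {"year": 2024}},
    {"text": "2024 memo", "embedding": [0.0, 1.0], "metadata": {"year": 2024}},
]

filtered_top = prefiltered_search([1.0, 0.0], docs, {"year": 2024}, k=1)
unfiltered_top = prefiltered_search([1.0, 0.0], docs, {}, k=1)
```

With the filter applied, the closest 2023 document is never considered, even though it is the nearest vector overall. In a real vector store the filter would typically be pushed into the query itself rather than applied in application code.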
Unstructured
@UnstructuredIO
Apr 18, 2024
🚀 Llama 3 is here! Who's excited? We are! Let's play with the model and build a RAG system for chatting with your PDF files! In this quick tutorial we use the Unstructured API for preprocessing PDF files, FAISS for vector storage, @langchain to bring everything together, and
7.9K
Unstructured
@UnstructuredIO
Apr 16, 2024
🎥 A gentle introduction to preprocessing PDF, HTML and email files into a normalized format for your RAG or other LLM applications with the Unstructured open-source library, by @1littlecoder: Video: youtube.com/watch?v=iPiYVC… Notebook: colab.research.google.com/drive/1U8VCjY2…
5.6K
Unstructured
@UnstructuredIO
Feb 7, 2024
⚡️ We are excited to announce that our new no-code Enterprise Platform is NOW available in private beta! As RAG apps advance from prototype to production, we’ve been overwhelmed by requests for an enterprise-grade solution to provide these applications with the data they need.
22K
Unstructured
@UnstructuredIO
Jul 17, 2024
⏰ New blog post: “Chunking for RAG: best practices” Learn about the importance of chunking, common methods, and smart chunking strategies. Bring your RAG system's performance to the next level with our Serverless API that is free to get started.
unstructured.io
Chunking Strategies for RAG: Best Practices and Key Methods | Unstructured
Chunking strategies for RAG directly affect retrieval precision and LLM response quality. Compare fixed-size, recursive, and structure-aware approaches.
23K
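For readers unfamiliar with the fixed-size approach named in the link preview above, here is a minimal sketch of character-based chunking with overlap. It is not taken from the blog post or from the Unstructured library. The `chunk_fixed` name and the default `size` and `overlap` values are assumptions chosen for illustration.

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once the window has reached the end, so no redundant tail chunk is emitted.
        if start + size >= len(text):
            break
        start += step
    return chunks


# Example: a 500-character string split into 200-character windows with 50 shared characters.
sample = "".join(chr(97 + i % 26) for i in range(500))
sample_chunks = chunk_fixed(sample, size=200, overlap=50)
```

Fixed-size windows ignore sentence and section boundaries, which is the main motivation for the recursive and structure-aware approaches the preview also mentions.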
Unstructured
@UnstructuredIO
Dec 5, 2023
RAG with unstructured and semi-structured data presents unique challenges (and opportunities). In this blog we worked on a real-world healthcare use case with @langchain to test emerging approaches related to Multi-Vector Retrieval. The challenge here was how to empower
9.7K
Unstructured
@UnstructuredIO
Mar 14, 2024
🚀 Exciting News! We just raised $40 million in Series B funding to help move more #RAG prototypes into production. 🌟 “Nobody is passionate about getting data ready, everyone's passionate about the models themselves. Our vision is to connect human generated data with foundation
Brian Raymond
@_Brian_Raymond
Mar 14, 2024
Big news! Unstructured has just secured a $40M Series B (businesswire.com/news/home/2024…) led by Menlo Ventures with support from Madrona, Bain Capital Ventures, Mango Capital and industry leaders including NVentures at NVIDIA, Databricks Ventures, and IBM Ventures. This moment marks a
4.6K
Unstructured
@UnstructuredIO
Oct 10, 2023
A simple recipe for building a local #RAG with your private data: 🚀Unstructured.io + 🦙#Llama2 + ❤️ your favorite #VectorDB +🦜 @langchain medium.com/unstructured-i…
unstructured.io
Unstructured Data Platform for GenAI | Unstructured
Transform complex, unstructured data into clean, AI-ready inputs. Connect to any source, process 64+ file types, and power your GenAI projects. Start now.
8.4K
Unstructured
@UnstructuredIO
Nov 16, 2023
🎉Congratulations to the @weaviate_io team on the launch of the Verba #RAG system. 📑Integrating our PDF extraction significantly advances the framework's ability to accurately process and structure complex PDF formats and tables, addressing key technical challenges in data
Weaviate AI Database
@weaviate_io
Nov 16, 2023
Verba, our open-source RAG framework 🐕 Keep your data private and 100% free using open-source models like SentenceTransformers and run Weaviate locally. Repo: github.com/weaviate/Verba Learn more about the modular architecture that powers you to build in the thread!
7.9K
Unstructured
@UnstructuredIO
Apr 30, 2024
📧 Build a local RAG app for your emails with Unstructured, @langchain and @ollama in a few steps. 🧵
6.6K
Unstructured
@UnstructuredIO
Oct 30, 2024
🤔 Every now and then, the debate resurfaces: are long context models making RAG obsolete? We’re sharing our thoughts on why we think RAG is here to stay. Check out our latest blog post to learn more:
unstructured.io
RAG vs. Long-Context Models. Do we still need RAG? | Unstructured
Explore the ongoing debate on RAG vs long-context models. Discover why RAG remains essential for enhancing LLMs, even as context windows grow.
6.7K
Unstructured
@UnstructuredIO
Oct 11, 2023
Are you struggling with varying data quality and hallucinations from your #LLM? Not anymore! Use @MongoDB + @UnstructuredIO to enhance the accuracy of your #LLMs with the power of #metadata, making it better at understanding and generating *accurate* text.
mongodb.com
MongoDB Vector Search Overview - MongoDB Vector Search - MongoDB Docs
Use MongoDB Vector Search to create vector indexes and perform vector search, including semantic search and hybrid search, on your vector embeddings in MongoDB.
18K
Unstructured
@UnstructuredIO
Jun 6, 2024
Check out the new notebook in the amazing @huggingface cookbook: “Building RAG with Custom Unstructured Data” Learn to preprocess multiple different types of unstructured data to use in your RAG application.
Building RAG with Custom Unstructured Data · Hugging Face
From huggingface.co
2.8K