We're excited to announce that Unstructured has joined the @PalantirTech FedStart program to accelerate our path to FedRAMP High and IL-5 compliance.
Through FedStart, we will leverage Palantir’s proven security and accreditation expertise to fast-track the deployment of secure,
Unstructured
Stop dilly-dallying. Get your data.
👉🏼 Get Started: unstructured.io
- 🚀 Improve your RAG results by adding metadata pre-filtering with @MongoDB and LangGraph by @langchain. Read this blog post by @joshaaayyyy and @mariakhalusova and learn how to add custom metadata extraction to your unstructured data preprocessing pipeline, and how you can
- 🚀 Llama 3 is here! Who's excited? We are! Let's play with the model and build a RAG system for chatting with your PDF files! In this quick tutorial we use Unstructured API for preprocessing PDF files, FAISS for vector storage, @langchain to bring everything together, and
- 🎥 A gentle introduction to preprocessing PDF, HTML and email files into a normalized format for your RAG or other LLM applications with the Unstructured open source library, by @1littlecoder. Video: youtube.com/watch?v=iPiYVC… Notebook: colab.research.google.com/drive/1U8VCjY2…
- ⚡️We are excited to announce that our new no-code Enterprise Platform is NOW available in private beta! As RAG apps advance from prototype to production we’ve been overwhelmed by requests for an enterprise grade solution to provide these applications with the data they need.
- ⏰ New blog post: “Chunking for RAG: best practices” Learn about the importance of chunking, common methods, and smart chunking strategies. Bring your RAG system's performance to the next level with our Serverless API that is free to get started.
- RAG with unstructured and semi-structured data presents unique challenges (and opportunities). In this blog we worked on a real-world healthcare use case with @langchain to test emerging approaches related to Multi-Vector Retrieval. The challenge here was how to empower
- 🚀 Exciting News! We just raised $40 million in Series B funding to help move more #RAG prototypes into production. 🌟 “Nobody is passionate about getting data ready, everyone's passionate about the models themselves. Our vision is to connect human generated data with foundation
  Big news! Unstructured has just secured a $40M Series B (businesswire.com/news/home/2024…) led by Menlo Ventures with support from Madrona, Bain Capital Ventures, Mango Capital and industry leaders including NVentures at NVIDIA, Databricks Ventures, and IBM Ventures. This moment marks a
- A simple recipe for building a local #RAG with your private data: 🚀Unstructured.io + 🦙#Llama2 + ❤️ your favorite #VectorDB +🦜 @langchain medium.com/unstructured-i…
- 🎉 Congratulations to the @weaviate_io team on the launch of the Verba #RAG system. 📑 Integrating our PDF extraction significantly advances the framework's ability to accurately process and structure complex PDF formats and tables, addressing key technical challenges in data
  Verba, our open-source RAG framework 🐕 Keep your data private and 100% free using open-source models like SentenceTransformers and run Weaviate locally. Repo: github.com/weaviate/Verba Learn more about the modular architecture that powers you to build in the thread!
- 🤔 Every now and then, the debate resurfaces: are long context models making RAG obsolete? We’re sharing our thoughts on why we think RAG is here to stay. Check out our latest blog post to learn more:
- Are you struggling with varying data quality and hallucinations from your #LLM? Not anymore! Use @MongoDB + @UnstructuredIO to enhance the accuracy of your #LLMs with the power of #metadata, making them better at understanding and generating *accurate* text.
- Check out the new notebook in the amazing @huggingface cookbook: “Building RAG with Custom Unstructured Data” Learn to preprocess multiple different types of unstructured data to use in your RAG application.








