Density Labs

Why most AI engineer placements fail at week 8 (and how we fixed it)

2026-04-30T00:00:00+00:00

There’s a pattern we noticed about a decade ago. It was so consistent that we eventually built our entire delivery model around it.

Most engineering placements that fail don’t fail at week 1. They don’t fail at week 4. They fail somewhere between week 6 and week 10, the period after the engineer has been onboarded but before they’re truly integrated. The team has moved past the early curiosity phase. The novelty is gone. The engineer is now expected to contribute at the level the JD promised, and a small but cumulative set of frictions starts to compound.

By the time anyone notices, it’s too late. The team has formed an opinion. The engineer has formed an opinion of the team. The relationship is already structurally broken, even if everyone is still polite in standup.

What’s frustrating about this is that almost every signal you need to predict it is visible by week 5. You just have to be looking.

This post is about what we look for, and the framework we built to make sure someone is always looking.

The shape of the failure

When a placement fails at week 8, the postmortem usually surfaces one of a small number of patterns. We’ve cataloged them across 12+ active engagements and a decade of embedded delivery. The pattern is almost always one of these:

The hidden context gap. The engineer was technically capable on paper. The codebase walkthrough went fine. But there’s a layer of institutional knowledge, why a service was named what it was named, why a test suite is structured the way it is, what an internal acronym means, that nobody documented and nobody thought to explain. Six weeks in, the engineer is making technically correct decisions that violate unwritten conventions, and the team is starting to think of them as someone who “doesn’t get how we work.”

The feedback vacuum. No one on the client team has been told they’re responsible for giving the engineer feedback. The engineer’s PRs get approved without comment. Their questions in Slack get reactji’d, not answered. By week 6 they have no idea whether they’re meeting expectations, and they start to either overdeliver in unhelpful ways or pull back. Both are signals the team eventually reads as “this isn’t working.”

The expectation drift. The engineer was hired for one role and is being asked to operate in another. This usually isn’t malicious; the team’s needs simply evolved between the JD being written and the engineer arriving. By week 7 the engineer is being judged against work they were never told they would be doing. They don’t know it. The team doesn’t quite know it either. Everyone is just vaguely dissatisfied.

The integration ceiling. The engineer was treated as a contractor. They weren’t invited to architecture discussions. They weren’t included in the Slack channel where the real decisions get made. They got handed tickets and shipped them. Then in week 8 the team complained that the engineer “wasn’t proactive”, but proactivity requires context, and they were structurally denied it.

Notice what these have in common. None of them are technical. None of them would be solved by hiring “a better engineer.” They’re all design failures of the embed itself.

What we changed

About six years ago we stopped treating placement as the deliverable and started treating it as week zero of a structured 90 day process. The framework is deliberately boring. It is not innovative. What’s innovative is that we actually run it on every single placement, and we still run it six years later with no exceptions.

Here is the structure.

Days 0 to 30: Orientation and context

The first thirty days are not about productivity. They are about building the model of how the team works. The engineer reads the codebase, but more importantly they sit through enough standups, planning meetings, and 1:1s to understand the political topology of the team. Who actually decides what gets built. Who has historical context that isn’t documented. Who is the unofficial reviewer that everything has to pass through.

Concrete deliverables in this window:

A codebase walkthrough led by the team’s most senior engineer, recorded.
A team introduction round where each team member spends 15 minutes explaining what they own and what they care about.
An expectations document, signed by both the engineer and the client lead, that names what success looks like in 30, 60, and 90 days.
Day one access to our internal AI playbook, the same one we update across 12+ engagements, so the engineer is not learning RAG patterns or eval design from first principles on company time.

The expectations document is the single most important artifact in the entire framework. It is the thing that prevents expectation drift. It gets revisited every 30 days.

Days 30 to 60: Contribution and feedback

By day 30 the engineer should be shipping production code. Not heroic work, that comes later, but real, merged, deployed contributions. The point is to surface friction early, when it can still be corrected.

This is also when we run our weekly tech lead check ins. Once a week, a Density tech lead, not the client’s lead, ours, has a 30 minute private conversation with the placed engineer. The goal of that call is specifically to surface things the client team can’t see: confusion about expectations, friction with a particular team member, a sense that the role is drifting from what was scoped.

We’ve heard things in those calls that nobody on the client team knew. A backend engineer who was being pulled into design decisions and didn’t feel qualified. A senior who realized at week 5 that the team’s tech lead was ignoring her PRs. A junior who was being given too much autonomy and was too proud to ask for more guidance.

In every one of those cases, the issue was solvable at week 5. By week 12 it would have been a placement failure.

Days 60 to 90: Integration and ownership

By day 60 the engineer should be getting handed a feature or surface to own. Not a ticket, not a sprint of tickets, a coherent piece of the product that they are accountable for, end to end. This is the test of whether the embed has worked. An engineer who has truly integrated takes the surface and runs with it. An engineer who hasn’t, even one who is technically excellent, will hesitate, ask too many clarifying questions, or quietly stay in execution mode.

This is also when the engineer starts being included in design discussions as a peer, not as a service provider. The shape of the engagement past day 90 is set in this window. Engineers who become peers stay for years. Engineers who stay in execution mode, in our experience, leave within 18 months.

What this costs us

The framework is expensive to run. Each engineer we place gets four hours per week of senior Density attention for the first 90 days, between the weekly check in, async check ins, expectation reviews, and the work of resolving frictions when they surface.

That’s economically indefensible if you treat staff augmentation as an arbitrage business. You can’t run this framework and also charge the lowest hourly rate in the market. We don’t try. We charge what the framework costs to run, plus a margin that lets us keep doing this for the next decade.

The economic argument is that this is cheaper, not more expensive, for our clients. A failed placement costs the client roughly six months of velocity, a quarter of leadership attention, and the cultural cost of a public failure. Avoiding two of those over the life of a five year engagement pays for the framework many times over.

That’s why the metric we report on is forced replacements. We’ve had zero of them in the last six years. That’s the framework working.

What it doesn’t fix

There are things this process does not solve.

It doesn’t fix a misalignment between the client’s stated needs and their actual needs. If the client said they wanted a senior backend engineer but actually needed a tech lead, the framework will surface the gap, but the gap is theirs to close.

It doesn’t compensate for a fundamentally hostile team culture. We’ve had two cases over the years where a client team was structurally not ready to integrate an outside engineer. The framework surfaced the problem clearly within 60 days. We ended both engagements. Neither was the placed engineer’s fault.

It doesn’t replace the client’s responsibility to lead. The check ins surface friction; they don’t resolve it. Resolution requires someone on the client side to act on what we surface. Most of the time they do. When they don’t, the framework gives us the data to have an honest conversation early enough to do something about it.

The real lesson

The single most important thing we’ve learned over six years of running this framework is that most placement failures are visible by week 5 and almost always avoidable by week 7, but only if someone is paying attention. Most providers stop paying attention the moment the offer letter is signed. We start paying attention at exactly that moment, and we don’t stop until day 90.

That is, in the end, the whole pitch. The engineers we place are good. So are the engineers a lot of other providers place. The difference isn’t the engineer. It’s whether anyone is in the room when something starts to go wrong at week 8.

We are.

If you’re considering an embedded engineering engagement and you want to talk about how this would work for your team, book a 30 minute discovery call. We’ll tell you honestly whether we can help, and if not, who to talk to instead.

Why the AWS outage took 16 hours, not 1

2025-10-25T00:00:00+00:00

October 20, 2025. AWS went dark for 16 hours.

A DNS resolution issue. The kind of problem AWS engineers have seen and fixed in under an hour, dozens of times across the years. Snapchat down. Fortnite offline. Banking apps unreachable. Smart doorbells dumb again. Even ChatGPT silent. Across 2,000 businesses, billions in lost productivity.

The official explanation is a DNS issue. The unofficial one is more interesting.

Months earlier, Amazon laid off hundreds of AWS engineers as part of a broader restructure to “do more with AI.” The cloud computing unit, the team responsible for the systems the rest of the internet runs on, lost a meaningful slice of its experienced staff.

Corey Quinn, a cloud computing analyst, put it cleanly:

“You can hire brilliant people who understand DNS at a technical level. What you can’t easily replace is the person who remembers that when DNS starts acting weird, you need to check that seemingly unrelated system in the corner, because it’s caused problems before.”

That is tribal knowledge. The institutional memory that comes from being in the room when things broke, repeatedly, for years.

You cannot buy it. You cannot transfer it through documentation. You cannot synthesize it with an LLM trained on Stack Overflow. It is the layered context an engineer accumulates by being in the system long enough to remember the weird thing that happened in 2019 that led to the workaround that is still load bearing in 2025.

Density compounds. So does its absence.

The four densities, and what Amazon lost

We use an internal framework for what makes engineering partnerships actually work. We call it operational density. Four dimensions that accumulate over time and cannot be shortcut.

Context density. How much of the system’s domain, codebase, and culture lives inside the engineer’s working knowledge. This is what Quinn was describing. AWS engineers who left were not interchangeable with the new ones. The new ones were brilliant. They had not yet seen DNS behave the way it behaves at AWS scale, when this specific service interacts with that specific load balancer config.

Trust density. How many small bets the engineer and the team have run together that paid off. When something goes wrong, who do you call first? In an outage, the right answer is the person we trust, not the person on call by the rotation. Lay off a tenured engineer and you do not just lose their skills. You lose every team’s first phone call.

Cadence density. How tight the feedback loops between teams have become. Senior engineers do not escalate in formal channels. They DM, they grab a Zoom, they get the right people in the room in two minutes. New engineers escalate by ticket and wait. In a 16 hour outage, those minutes compound.

Stake density. How much each side has invested beyond the contract. Engineers who feel ownership of the system show up differently than engineers who feel like cost centers. After a layoff round, the remaining engineers know which side of that line their employer thinks they sit on.

Why mid market AI buyers should pay attention

This was Amazon. Unlimited budget, unlimited talent, a decade of tooling. They still lost 16 hours because the four densities had been hollowed out.

If your team is building anything in production, the same dynamics apply. Your seniors have context that did not get written down. Your incidents get resolved fast because someone remembers. Your AI initiative will land or stall not on the model you pick, but on whether the people running it have been in the system long enough to know the unwritten conventions.

The mistake most companies are making in 2026 is treating AI as a substitute for tenure. It is not. AI is a multiplier on tenure. Engineers with deep context plus AI ship faster. Engineers without context plus AI ship features that break in production at week 8.

What we do differently

This is the entire reason we structure engagements around tenure, not utilization.

When we place an engineer, we are not selling you hours. We are selling you the four densities, accumulating over years. Our longest engagement at Ooma is now in its tenth year. The engineer who started in 2016 is not 4x more productive than the one who started in 2024. They are 10x or 50x more productive, because all four densities have compounded.

The 16 hour AWS outage is the case study for what happens when you optimize for cost over density. We optimize for density. That is the entire difference.

If you are building AI that has to ship to production this quarter, you do not want bodies. You want engineers who have been in your codebase long enough to know which DNS quirk is going to bite you in week 8.

Start with the AI Roadmap →

Or read more about how we work: The Density Method.

Choosing the right Knowledge base app

2025-03-30T00:00:00+00:00

What’s a Knowledge base app? A central hub for teams to organize, document, and access work information easily through pages, subpages, categories, links, permissions, and tags.

🕵🏻‍♂️ Overview of apps

When comparing popular collaboration tools, Notion, Loop, and Confluence each stand out for their unique strengths. Notion is highly flexible and user-friendly, making it an excellent choice for small to medium-sized teams. Loop, on the other hand, offers simplicity and seamless integration with Microsoft 365, making it ideal for users already embedded in the Microsoft ecosystem. Confluence, known for its robust features and scalability, is particularly well-suited for large enterprises with complex collaboration needs. Each tool caters to different team sizes and workflows, ensuring there’s a solution for various organizational requirements.

Breaking it down we could say:

Notion : Flexible and user-friendly, great for small to medium-sized teams.
Loop : Simple and integrates seamlessly with Microsoft 365, ideal for Microsoft users.
Confluence : Powerful and scalable, best suited for large enterprises.

💪 Strength of each tool

Notion:

Easy page and subpage management : Notion’s intuitive interface allows users to effortlessly create, organize, and nest pages and subpages, making it ideal for structuring projects or knowledge bases.
Drag-and-drop sorting : Users can easily rearrange content, tasks, or sections with a simple drag-and-drop feature, enhancing flexibility and workflow efficiency.
Great linking and collaboration features : Notion excels at connecting related content through internal links and supports real-time collaboration, making it a strong choice for team-based projects.

Loop:

Simple and intuitive : Loop’s clean and straightforward design ensures a low learning curve, enabling users to get started quickly without extensive training.
Works seamlessly with Microsoft tools : As part of the Microsoft 365 ecosystem, Loop integrates effortlessly with apps like Teams, Outlook, and OneDrive, streamlining workflows for Microsoft-centric teams.

Confluence:

Robust page hierarchies : Confluence offers advanced structuring capabilities, allowing users to create complex page hierarchies that are perfect for organizing large amounts of information.
Advanced permissions and linking : With granular permission settings and powerful linking features, Confluence ensures secure and efficient content management, even in large organizations.
Perfect for technical documentation : Its robust formatting options, version control, and support for macros make Confluence a top choice for creating and maintaining detailed technical documentation.

👁️‍🗨️ What they can improve

Notion:

Offline editing : Introducing offline access would greatly enhance usability for users who need to work without an internet connection.
Better performance for large teams : Optimizing the platform to handle large datasets and high user activity would improve efficiency for bigger teams.
More advanced permissions : Adding granular permission settings, especially for hiding or restricting access to specific links, would enhance security and control.

Loop:

Better tagging and organization : Improving tagging systems and organizational features would make it easier to manage and locate content.
More customization options : Expanding customization capabilities would allow teams to tailor the tool to their unique workflows and branding needs.
Integration with non-Microsoft tools : Adding integrations with tools outside the Microsoft ecosystem would make Loop more versatile for diverse teams.

Confluence:

Simplified interface : Streamlining the interface to make it more user-friendly for non-technical users would reduce the learning curve.
Improved mobile experience : Enhancing the mobile app’s functionality and usability would support productivity on the go.
More modern templates : Offering updated and visually appealing templates would make content creation more engaging and efficient.

📝 Which Tool is Right for You?

Notion : Best for small to medium teams who want flexibility, ease of use, and a highly customizable workspace for collaboration and project management.
Loop : Ideal for teams already embedded in the Microsoft 365 ecosystem, offering seamless integration and simplicity for everyday workflows.
Confluence : Great for large enterprises, particularly those using Atlassian tools, as it provides robust features, scalability, and advanced functionality for complex documentation and collaboration needs.

✍🏼 Final thoughts

Each tool has its unique strengths and weaknesses, making them suitable for different use cases.
The best choice depends on your team’s size, technical expertise, and existing tool ecosystem.
Before making a decision, carefully consider your team’s specific needs and try out a demo to see which tool aligns best with your workflow.

Ultimately, the best tool for your team depends on your specific requirements, existing tools, and long-term goals. Don’t hesitate to explore demos or free trials to see which one feels like the perfect fit. Whichever you choose, embracing the right collaboration platform can transform the way your team works together.

Thanks for reading, and here’s to smoother workflows and better teamwork! 🚀

Why You Should Implement QA in Your Next Project

2025-03-27T00:00:00+00:00

By: Cristian Cuéllar, Mar 27, 2025

Introduction

I have been working on software projects for over 15 years, and people often ask me, “What does your job entail?” I usually answer something like this:

“Each time we use an electronic device, a mobile application, or a software program for a certain purpose or need, we want it to work properly, the way it was designed. That’s what a QA verifies.”

I want to explain the main purpose of QA and the impact and benefits that this role brings to a software project.

What is “QA”?

A software project is carried out by following different methodologies and processes. Those processes are run by people, however, so unexpected situations can arise, such as:

Unclear requirements.
Different conceptions or interpretations from the team.
Unforeseen failures and/or technical issues.

That’s where QA, or “Quality Assurance”, comes in: it ensures that the final product meets the minimum expected requirements and delivers the expected functionality. That means a better experience for users, making it more likely they will continue using the product.

A tester is not the same as a QA Engineer

A tester is a person who follows test cases with a predefined set of instructions, analyzing obtained results vs. expected behavior and reporting them as successful or failed depending on each case. A QA engineer, on the other hand, needs to have more skills than just “running” test cases linearly.

A QA engineer needs to stay in the loop and get involved in different areas of the organization and in different phases of the project’s life, staying in constant communication with end users, Project Managers, Product Owners, and Developers. This is mainly because a QA engineer is the midpoint between administrators, product owners, and the development team implementing the technical solution.

What does a “QA Engineer” do?

The QA engineer is in charge of conducting all the necessary tests of an application to ensure it complies with the requirements requested by the customer, sometimes “forcing” the system in controlled environments and/or scenarios with different situations to see how the application behaves.

The QA engineer needs to analyze and understand the requirements and the business process behind the system being validated, through operation manuals, business flows, diagrams, etc., in order to understand the urgency and need for a specific software product to be released. The QA engineer is also responsible for planning, designing, executing, and reporting the necessary tests during the testing phase of the software project. This requires constant communication with the experts, product owners, and end users of the system, as well as with the programmers, to clarify doubts, specifications, and requirements before, during, and after the test stage.

That is why adding a QA engineer, or implementing at least a basic test phase while your product is being developed, will give you a big edge over your competition. Developers can only do so much, and their workload doesn’t give them the time or focus to test like a QA engineer can.

In Conclusion

Every time we buy a product, use an online service, or pay for an “automated” system, think about all the effort someone had to put in to ensure that it happened successfully and without errors.

Now that you know what a QA engineer does, you should consider having quality assurance done for your next project.

Get a free consultation and let’s talk about your needs to find you a QA engineer, developer, or designer tailored to your team’s needs.

Retrieval Augmented Generation using Langchain + MongoVDB

2025-03-07T00:00:00+00:00

Retrieval-augmented generation (RAG) is a technique for giving a large language model (LLM) additional context during a conversation. That context helps the model generate more specific answers, drawing on knowledge that is particular to our project and that the model was not trained on.

For example, say we want to add an assistant to our platform that supports customers with a product or service. On its own, the assistant only has the context it was trained with. Our product documentation is a resource it can use to give a better answer to the problem at hand, automatically running searches against our database whenever necessary.

First of all, let’s review some key concepts to facilitate the understanding of the code.

What is Langchain?

Langchain is a Python framework with tools for communicating with LLMs and simplifying basic interactions with the different models. With just a few lines of code it lets us connect to a wide variety of available models and perform tasks such as holding conversations, creating agents, configuring tasks, or using embeddings to process text, which will be useful for configuring our RAG.

What is a vector database?

A vector database stores vectors (embeddings): lists of numbers in which each element represents a characteristic or attribute of the data. These values are assigned by the embedding model that creates them, and they can vary from one model to another depending on what it is optimized to describe; some focus more on characteristics such as colors, tones, or size.

This array of numerical values is the field stored in our vector database. Our search engine applies a proximity algorithm over it to get the records most relevant to a search.
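To make the idea of a proximity search concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the proximity measure (the one configured in the index later in this post) and uses made-up 4-dimensional vectors; real embedding models return hundreds or thousands of dimensions, and the record names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical stored records: name -> embedding (values are invented).
stored = {
    "refund policy": [0.9, 0.1, 0.0, 0.2],
    "shipping times": [0.1, 0.8, 0.3, 0.0],
    "password reset": [0.0, 0.2, 0.9, 0.1],
}

# Hypothetical embedding of the user's question.
query_vector = [0.8, 0.2, 0.1, 0.1]

# Rank stored records from most to least similar to the query.
ranked = sorted(
    stored,
    key=lambda name: cosine_similarity(query_vector, stored[name]),
    reverse=True,
)
print(ranked)
```

A vector database does the same ranking, but over an index built for approximate search so it does not have to compare the query against every record.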

Let’s check the Python implementation

First of all, we have to create a collection in Mongo in which we will store our documents, as we do with any kind of record. In this example we will create a base class to handle interactions with the data, whether storing or consuming it.

class BaseRAG:
    # Shared across instances: one embedding model and one pooled Mongo client.
    embedding = embedding
    mongo_uri = (
        f"mongodb+srv://{MONGO_USER}:{MONGO_PASSWORD}@{MONGO_SPACE}.yng1j.mongodb.net"
    )
    mongo_client = MongoClient(
        mongo_uri,
        server_api=ServerApi("1"),
        maxPoolSize=50,
    )

    def __init__(self) -> None:
        self.db = self.mongo_client[MONGO_DB_NAME]
        # Collection that holds the text chunks and their embeddings.
        self.collection = self.db[EMBEDDINGS_COLLECTION]
        self.vector_store = MongoDBAtlasVectorSearch(
            collection=self.collection,
            embedding=embedding,
            index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
            relevance_score_fn="cosine",
        )

Once the collection is configured, we have to configure an index through Atlas Search, either in the MongoDB Atlas cloud platform itself or by writing some code. Create your index model, then create the search index.

    def create_index(self, index_name):
        search_index_model = SearchIndexModel(
            definition={
                "fields": [
                    {
                        "numDimensions": 1536,
                        "path": "embedding",
                        "similarity": "cosine",
                        "type": "vector",
                    },
                    {"type": "filter", "path": "metadata.category"},
                ]
            },
            name=index_name,
            type="vectorSearch",
        )
        return self.collection.create_search_index(model=search_index_model)
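The numDimensions value in the index has to match the length of the vectors the embedding model returns; text-embedding-3-small, the default model in this post's configuration, returns 1536-dimensional vectors by default. A small guard along these lines could catch a mismatch before any data is written. Note that check_dimensions is a hypothetical helper, not part of pymongo or Langchain.

```python
# Hypothetical helper: compares a vector's length with the index definition.
def check_dimensions(vector, index_definition):
    """Raise ValueError if the vector length differs from numDimensions."""
    vector_fields = [
        field for field in index_definition["fields"] if field["type"] == "vector"
    ]
    expected = vector_fields[0]["numDimensions"]
    if len(vector) != expected:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, index expects {expected}"
        )
    return True


# Same definition as the one passed to SearchIndexModel in this post.
definition = {
    "fields": [
        {
            "numDimensions": 1536,
            "path": "embedding",
            "similarity": "cosine",
            "type": "vector",
        },
        {"type": "filter", "path": "metadata.category"},
    ]
}

print(check_dimensions([0.0] * 1536, definition))
```

If the embedding model is swapped for one with a different output size, the index definition would need to change with it.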

Once our Mongo configuration is finished, we can set up the embedding model, which converts each text chunk into a vector that our search will later compare against the query. In this example we will use the OpenAI embedding.

embedding = OpenAIEmbeddings(model=EMBEDDING_MODEL, chunk_size=1000)

Now we have to load the content of our file using a loader that, with just one line, turns it into documents. PyPDFLoader returns one document per page of the PDF. Note that chunk_size on OpenAIEmbeddings sets how many texts are sent to the API per batch (here up to 1000), not the size of each text chunk.

Our documents can include metadata, which later lets us limit the scope of the data being ranked. This is very useful for separating data into groups, such as documents by project or by user.

    def get_embedding(self, data):
        return self.embedding.embed_query(data)

    def prepare_documents(self, documents, metadata=None):
        # Merge the loader's metadata with any caller-supplied metadata.
        return [
            {
                "metadata": {
                    **doc.metadata,
                    **(metadata or {}),
                    "created_at": datetime.utcnow(),
                },
                "text": doc.page_content,
                "embedding": self.get_embedding(doc.page_content),
            }
            for doc in documents
        ]

    def ingest_file(self, file_path, metadata=None):
        try:
            file_content = PyPDFLoader(file_path=file_path).load()

            if not file_content:
                return []

            docs = self.prepare_documents(file_content, metadata)
            return self.save_embeddings(docs)
        except Exception as e:
            print(f"Error loading file: {e}")
            return []
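PyPDFLoader yields one document per page, so a long page becomes a single large chunk. If smaller pieces are wanted, the text can be split before it is embedded. Langchain ships its own text splitters, which would be the natural choice in a real project; the sketch below is a plain-Python stand-in that splits by characters with a small overlap. The function name and its parameters are assumptions made for illustration.

```python
# Hypothetical helper: fixed-size character chunks with overlap.
def split_text(text, chunk_size=1000, overlap=100):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of them.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap smaller than it")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_text("abcdefghij", chunk_size=4, overlap=1))
```

Each resulting chunk could then be embedded and stored as its own record, with the page's metadata copied onto it.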

This will result in a document with the following structure.
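Going by prepare_documents above, each stored record has three top-level keys: metadata, text, and embedding. A sketch with made-up values follows; the file path, the text, and the numbers are invented, and a real embedding would hold 1536 floats rather than three.

```python
from datetime import datetime, timezone

# Illustrative record only; all values are invented for the example.
example_document = {
    "metadata": {
        "source": "docs/product-manual.pdf",  # set by the PDF loader (hypothetical path)
        "page": 0,                            # set by the PDF loader
        "category": "real-state",             # caller-supplied metadata
        "created_at": datetime.now(timezone.utc),
    },
    "text": "Plain text of the first page of the PDF",
    "embedding": [0.0123, -0.0456, 0.0789],   # shortened for readability
}

print(sorted(example_document))
```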

We then save these documents to MongoDB, in the collection created earlier; Atlas adds each record to our index automatically.

    def save_embeddings(self, data):
        return self.collection.insert_many(data)

Finally, the only thing left to do is to start consuming data.

    def retrieve(self, query):
        # The filter path must match the one declared in the index
        # ("metadata.category"), since metadata is stored as a nested field.
        results = self.vector_store.similarity_search(
            query, k=100, pre_filter={"metadata.category": "real-state"}
        )

        if not results:
            print("No matches found on documents")
            return ""

        print(f"Retrieve results from {len(results)} documents")
        return "\n\n".join(chunk.page_content for chunk in results)
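The string returned by retrieve is meant to be handed to the model as extra context. One way to do that, sketched here in plain Python, is to place the context ahead of the user's question in the prompt. build_prompt and its wording are assumptions for illustration, not something Langchain provides.

```python
# Hypothetical helper: combine retrieved context and the user's question.
def build_prompt(context, question):
    """Return the prompt text to send to the model."""
    if not context:
        # Nothing was retrieved, so fall back to the bare question.
        return question
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "Refunds are accepted within 30 days.",  # made-up retrieved chunk
    "Can I return my order?",
)
print(prompt)
```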

We can also tune the retriever with different parameters, depending on our needs.

# Retrieve more documents with higher diversity
# Useful if your dataset has many similar documents
self.vector_store.as_retriever(
    search_type="mmr",
    search_kwargs={'k': 6, 'lambda_mult': 0.25}
)

# Fetch more documents for the MMR algorithm to consider
# But only return the top 5
self.vector_store.as_retriever(
    search_type="mmr",
    search_kwargs={'k': 5, 'fetch_k': 50}
)

# Only retrieve documents that have a relevance score
# Above a certain threshold
self.vector_store.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={'score_threshold': 0.8}
)

# Only get the single most similar document from the dataset
self.vector_store.as_retriever(search_kwargs={'k': 1})

# Use a filter to only retrieve documents from a specific paper
self.vector_store.as_retriever(
    search_kwargs={'filter': {'paper_title':'GPT-4 Technical Report'}}
)
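The "mmr" search type refers to maximal marginal relevance: results are picked one at a time, trading relevance to the query against similarity to what has already been picked. The plain-Python sketch below illustrates the idea with made-up 2-dimensional vectors. It is a simplified stand-in, not Langchain's implementation. The candidates list plays the role of the fetch_k documents, and lambda_mult follows the same convention as above (1 favors relevance, 0 favors diversity).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query_vec, candidates, k=2, lambda_mult=0.5):
    """Select k items from candidates, a list of (name, vector) pairs."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < k:

        def score(item):
            relevance = cosine(query_vec, item[1])
            # Highest similarity to anything already selected (0 if none yet).
            redundancy = max(
                (cosine(item[1], chosen[1]) for chosen in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [name for name, _ in selected]


query = [1.0, 0.0]
candidates = [
    ("doc_a", [1.0, 0.10]),
    ("doc_b", [1.0, 0.12]),  # near duplicate of doc_a
    ("doc_c", [0.6, 0.80]),  # less relevant, but different
]

print(mmr(query, candidates, k=2, lambda_mult=1.0))
print(mmr(query, candidates, k=2, lambda_mult=0.25))
```

With lambda_mult at 1.0 the near duplicate is kept because only relevance counts; at 0.25 the second pick shifts to the document that adds something new.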

Some common use cases are:

Provide context about a product to improve customer service.
Create a chat to talk with a model about a local document.
Business analysis and decision-making.

Resources:

https://www.mongodb.com/resources/basics/databases/vector-databases?msockid=1ff5be6fd99d6b2405faaa3fd8e46a6a
https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html#langchai

Final code:

import os
from pymongo.mongo_client import MongoClient
from pymongo.server_api import ServerApi
from pymongo.operations import SearchIndexModel
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from datetime import datetime

MONGO_USER = os.getenv("MONGO_USER", "teamx")
# Never hardcode credentials; the password must come from the environment.
MONGO_PASSWORD = os.getenv("MONGO_PASSWORD", "")
MONGO_SPACE = os.getenv("MONGO_SPACE", "personax")
MONGO_DB_NAME = os.getenv("MONGO_DB_NAME", "personax-dev")
EMBEDDINGS_COLLECTION = os.getenv("EMBEDDINGS_COLLECTION", "data_embeddings_test")
ATLAS_VECTOR_SEARCH_INDEX_NAME = os.getenv(
    "ATLAS_VECTOR_SEARCH_INDEX_NAME", "vector_index_dev_test"
)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME", "gpt-4o")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "text-embedding-3-small")

embedding = OpenAIEmbeddings(model=EMBEDDING_MODEL, chunk_size=1000)

class BaseRAG:
    embedding = embedding
    mongo_uri = (
        f"mongodb+srv://{MONGO_USER}:{MONGO_PASSWORD}@{MONGO_SPACE}.yng1j.mongodb.net"
    )
    mongo_client = MongoClient(
        mongo_uri,
        server_api=ServerApi("1"),
        maxPoolSize=50,
    )

    def __init__(self) -> None:
        self.db = self.mongo_client[MONGO_DB_NAME]
        self.collection = self.db[EMBEDDINGS_COLLECTION]
        self.vector_store = MongoDBAtlasVectorSearch(
            collection=self.collection,
            embedding=embedding,
            index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
            relevance_score_fn="cosine",
        )

    def create_index(self, index_name):
        search_index_model = SearchIndexModel(
            definition={
                "fields": [
                    {
                        "numDimensions": 1536,
                        "path": "embedding",
                        "similarity": "cosine",
                        "type": "vector",
                    },
                    {"type": "filter", "path": "metadata.category"},
                ]
            },
            name=index_name,
            type="vectorSearch",
        )
        return self.collection.create_search_index(model=search_index_model)

    def save_embeddings(self, data):
        return self.collection.insert_many(data)

    def retrieve(self, query):
        # The filter path must match the indexed field ("metadata.category" in create_index)
        results = self.vector_store.similarity_search(
            query, k=100, pre_filter={"metadata.category": "real-state"}
        )

        if not results:
            print("No matches found on documents")
            return ""

        print(f"Retrieve results from {len(results)} documents")
        return "\n\n".join(chunk.page_content for chunk in results)

    def get_embedding(self, data):
        return self.embedding.embed_query(data)

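    def prepare_documents(self, documents, metadata=None):
        # Avoid a mutable default argument; a missing value means "no extra metadata"
        extra_metadata = metadata or {}
        return [
            {
                "metadata": {
                    **doc.metadata,
                    **extra_metadata,
                    "created_at": datetime.utcnow(),
                },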
                "text": doc.page_content,
                "embedding": self.get_embedding(doc.page_content),
            }
            for doc in documents
        ]

    def ingest_file(self, file_path, metadata=None):
        try:
            file_content = PyPDFLoader(file_path=file_path).load()

            if not file_content:
                return []

            docs = self.prepare_documents(file_content, metadata or {})
            return self.save_embeddings(docs)
        except Exception as e:
            print(f"Error loading file: {e}")
            return []

if __name__ == "__main__":
    file_to_ingest = "./real-state-data.pdf"
    rag = BaseRAG()
    # rag.create_index("example_index_name")
    # rag.ingest_file(file_to_ingest, {"category": "real-state"})
    query = "What is the..."
    result = rag.retrieve(query)
    print(result)

The Importance of Streams in Node.js: Composability, Spatial Efficiency, and Gzipping

2025-02-12T00:00:00+00:00

When working with Node.js, one of the most powerful yet often underappreciated features is streams. Streams are a core concept that underpins many Node.js functionalities and offer an efficient way to handle I/O operations. Whether you’re working with file uploads, data processing, or web servers, understanding and leveraging streams can significantly improve the performance and maintainability of your applications.

In this blog post, we’ll explore the importance of streams, focusing on their composability, spatial efficiency, and the practical benefits of gzipping data on the fly.

What Are Streams in Node.js?

Streams are objects in Node.js that allow you to read data from a source or write data to a destination in a continuous, asynchronous manner. They are particularly suited for handling large amounts of data efficiently, without loading everything into memory.

There are four types of streams in Node.js:

1. Readable: Used to read data (e.g., file streams or HTTP requests).

2. Writable: Used to write data (e.g., file writes or HTTP responses).

3. Duplex: Can read and write (e.g., TCP sockets).

4. Transform: A special type of duplex stream that can modify or transform the data as it is read or written (e.g., gzipping data).

Why Streams Matter

1. Composability

Streams are inherently composable, which means you can chain them together to create powerful data pipelines. This composability aligns with Node.js’s modular philosophy and allows you to build complex data workflows with ease.

For example, consider processing a large file:
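A minimal sketch of such a pipeline is shown below. The helper name compressFile and the file paths are hypothetical, and a sample file is written to the temp directory only so the snippet runs on its own.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');

// Chain three streams: read -> gzip -> write.
// Resolves once the compressed output has been fully written.
function compressFile(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    const source = fs.createReadStream(inputPath);
    const gzip = zlib.createGzip();
    const destination = fs.createWriteStream(outputPath);

    // .pipe() does not forward errors, so each stage gets its own handler.
    source.on('error', reject);
    gzip.on('error', reject);
    destination.on('error', reject);
    destination.on('finish', resolve);

    source.pipe(gzip).pipe(destination);
  });
}

// Hypothetical sample input, created here so the sketch is self-contained.
const inputPath = path.join(os.tmpdir(), 'streams-demo-input.txt');
const outputPath = `${inputPath}.gz`;
const sampleText = 'a line of repetitive text\n'.repeat(50000);
fs.writeFileSync(inputPath, sampleText);

const compressed = compressFile(inputPath, outputPath).then(() => {
  const { size } = fs.statSync(outputPath);
  console.log(`Compressed ${sampleText.length} bytes down to ${size} bytes`);
  return size;
});
```

If you would rather have errors forwarded and streams cleaned up for you, Node.js also ships stream.pipeline, which accepts the same stages plus a completion callback.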

Here, each stream performs a specific task (reading, gzipping, writing), and the pipe method elegantly connects them. This not only simplifies your code but also improves readability and maintainability.

2. Spatial Efficiency

One of the key advantages of streams is their ability to handle data incrementally, consuming only a small amount of memory at any given time. This is especially important when dealing with large datasets, such as video files, logs, or data from APIs.

For instance, when reading a file using streams:
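The following is a small sketch rather than a definitive recipe. The file name is hypothetical, the 200 KB sample file stands in for a much larger one, and the 64 KB highWaterMark is set explicitly just to make the chunk size visible.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sample file (~200 KB), created so the sketch is self-contained.
const filePath = path.join(os.tmpdir(), 'streams-demo-chunks.txt');
fs.writeFileSync(filePath, 'x'.repeat(200 * 1024));

let chunkCount = 0;
let totalBytes = 0;

const finished = new Promise((resolve, reject) => {
  fs.createReadStream(filePath, { highWaterMark: 64 * 1024 })
    .on('data', (chunk) => {
      // Only this chunk (at most 64 KB) is held in memory at this point.
      chunkCount += 1;
      totalBytes += chunk.length;
    })
    .on('end', resolve)
    .on('error', reject);
});

finished.then(() => {
  console.log(`Read ${totalBytes} bytes in ${chunkCount} chunks`);
});
```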

In this example, the file is read in chunks, meaning you don’t need to load the entire file into memory, which is crucial for performance in resource-constrained environments. Contrast this with reading the entire file using fs.readFile, which is also asynchronous but buffers the whole file in memory before your callback runs; for large files that can consume significant memory (and its synchronous counterpart, fs.readFileSync, blocks the event loop as well).

3. Gzipping Data On the Fly

Compression is an essential part of modern web development. Serving compressed files reduces bandwidth usage and speeds up data transfer. Streams shine here by allowing you to gzip data on the fly as it is being read or written.

Here’s an example of setting up a Node.js server that serves gzipped responses:
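The sketch below is one way this could look, not a production setup. The sample file is hypothetical, and the last part of the snippet is only a demo harness: it listens on an OS-assigned port, requests the page once, and shuts the server down so the script exits.

```javascript
const fs = require('fs');
const http = require('http');
const os = require('os');
const path = require('path');
const zlib = require('zlib');

// Hypothetical file to serve, created here so the sketch is self-contained.
const filePath = path.join(os.tmpdir(), 'gzip-server-demo.txt');
const fileText = 'Hello, streams!\n'.repeat(1000);
fs.writeFileSync(filePath, fileText);

const server = http.createServer((req, res) => {
  const acceptsGzip = /\bgzip\b/.test(req.headers['accept-encoding'] || '');
  const file = fs.createReadStream(filePath);

  if (acceptsGzip) {
    res.writeHead(200, { 'Content-Type': 'text/plain', 'Content-Encoding': 'gzip' });
    // read -> gzip -> response; no compressed file is ever written to disk
    file.pipe(zlib.createGzip()).pipe(res);
  } else {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    file.pipe(res);
  }
});

// Demo harness: request the page once, then close the server.
const demo = new Promise((resolve, reject) => {
  server.listen(0, '127.0.0.1', () => {
    const options = {
      host: '127.0.0.1',
      port: server.address().port,
      agent: false, // one-off connection, so the process can exit afterwards
      headers: { 'Accept-Encoding': 'gzip' },
    };
    http.get(options, (res) => {
      const chunks = [];
      res.on('data', (chunk) => chunks.push(chunk));
      res.on('end', () => {
        server.close();
        const compressed = Buffer.concat(chunks);
        resolve({
          encoding: res.headers['content-encoding'],
          compressedBytes: compressed.length,
          body: zlib.gunzipSync(compressed).toString(),
        });
      });
    }).on('error', reject);
  });
});

demo.then(({ compressedBytes, body }) => {
  console.log(`Received ${compressedBytes} gzipped bytes, ${body.length} bytes after decompression`);
});
```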

With this approach, you don’t need to create intermediate compressed files, saving both disk space and time. Streams allow you to directly process and serve the compressed data as it flows through the pipeline.

Key Benefits of Using Streams

Scalability: Streams can process data piece by piece, making them suitable for handling large datasets without overloading memory.
Performance: By avoiding intermediate storage and unnecessary data copying, streams provide a faster and more efficient way to process data.
Modularity: The composability of streams encourages modular, reusable code.
Flexibility: Streams work seamlessly with various Node.js modules, such as zlib for compression, crypto for encryption, or HTTP for server responses.

Conclusion

Streams are a cornerstone of Node.js, enabling developers to handle data in an efficient, modular, and scalable way. Their composability allows you to build complex pipelines, their spatial efficiency makes them ideal for processing large datasets, and their integration with tools like zlib provides practical benefits such as gzipping data on the fly.

If you’re not already using streams in your Node.js projects, now is the time to dive in. They’re not just a tool—they’re a paradigm that can elevate your applications to new levels of performance and maintainability.

Happy streaming! 🚀

Piping Patterns in Node.js: A Deep Dive with Examples

2025-01-28T00:00:00+00:00

Node.js provides an efficient way to handle streams using the pipe() method. This functionality is pivotal for working with data streams, allowing you to transfer data from a readable stream to a writable stream with ease. In this post, we’ll explore two examples of piping patterns that demonstrate the power and flexibility of Node.js streams.

What Is Piping in Node.js?

Piping is a mechanism in Node.js that allows you to connect a readable stream to a writable stream. This means you can direct the flow of data from one stream to another without manually handling chunks of data.

The general syntax is:
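In its simplest form the call is readable.pipe(writable). The sketch below uses an in-memory Readable and a small custom Writable (both chosen for illustration, not taken from a real application) so the data flow is easy to see.

```javascript
const { Readable, Writable } = require('stream');

const received = [];

// Source: a readable stream that emits three small chunks.
const readableStream = Readable.from(['a', 'b', 'c']);

// Destination: a writable stream that collects whatever it is given.
const writableStream = new Writable({
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    callback();
  },
});

// General form: readable.pipe(writable)
readableStream.pipe(writableStream);

writableStream.on('finish', () => {
  console.log('Received:', received.join(''));
});
```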

You can also chain multiple streams together, creating a pipeline for data processing.

Example 1: Reading a File and Compressing It

In this example, we’ll read a file, compress its contents using the zlib module, and write the compressed data to a new file.

Code Example
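A minimal sketch of this example follows. The file names are hypothetical, and the input file is created in the temp directory only so the snippet can run as-is.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');

// Hypothetical input file, created here so the sketch is runnable.
const inputPath = path.join(os.tmpdir(), 'piping-example-input.txt');
const outputPath = path.join(os.tmpdir(), 'piping-example-input.txt.gz');
const content = 'Hello, piping!\n'.repeat(1000);
fs.writeFileSync(inputPath, content);

const readStream = fs.createReadStream(inputPath);    // reads the file in chunks
const gzipStream = zlib.createGzip();                 // compresses each chunk
const writeStream = fs.createWriteStream(outputPath); // writes the compressed chunks

// Connect the three streams into one pipeline.
readStream.pipe(gzipStream).pipe(writeStream);

writeStream.on('finish', () => {
  console.log('File successfully compressed:', outputPath);
});
```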

How It Works

1. The fs.createReadStream() reads the file in chunks.

2. The zlib.createGzip() compresses each chunk of data.

3. The fs.createWriteStream() writes the compressed chunks to a new file.

4. The pipe() method seamlessly connects these streams.

This pattern is efficient because it avoids loading the entire file into memory, making it suitable for handling large files.

Example 2: Streaming an HTTP Response with Transformation

In this example, we’ll create a basic HTTP server that streams a file to the client while transforming its contents to uppercase.

Code Example
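A minimal sketch of this example follows. The file name and the createUpperCaseStream helper are hypothetical, and the last part is a demo harness that requests the page once on an OS-assigned port and then closes the server so the script exits.

```javascript
const fs = require('fs');
const http = require('http');
const os = require('os');
const path = require('path');
const { Transform } = require('stream');

// Hypothetical file to serve, created here so the sketch is self-contained.
const filePath = path.join(os.tmpdir(), 'uppercase-demo.txt');
const fileText = 'hello from a stream\n'.repeat(100);
fs.writeFileSync(filePath, fileText);

// Transform stream that upper-cases each chunk as it passes through.
// Assumes plain ASCII text, so a character is never split across chunks.
function createUpperCaseStream() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  // file -> uppercase transform -> HTTP response
  fs.createReadStream(filePath).pipe(createUpperCaseStream()).pipe(res);
});

// Demo harness: request the page once, then close the server.
const demo = new Promise((resolve, reject) => {
  server.listen(0, '127.0.0.1', () => {
    const options = { host: '127.0.0.1', port: server.address().port, agent: false };
    http.get(options, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        server.close();
        resolve(body);
      });
    }).on('error', reject);
  });
});

demo.then((body) => console.log(body.split('\n')[0]));
```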

How It Works

1. The server listens for incoming requests.

2. The file is read using fs.createReadStream().

3. The Transform stream modifies the data by converting it to uppercase.

4. The transformed data is sent directly to the client using res (an HTTP writable stream).

This approach demonstrates how to apply real-time transformations to a stream, which is a common requirement in web servers.

Benefits of Using Piping Patterns

1. Memory Efficiency: Streams handle data in chunks, avoiding memory bloat.

2. Composability: You can chain multiple streams together, creating powerful data pipelines.

3. Ease of Use: The pipe() method simplifies the process of connecting streams.

4. Real-Time Processing: Streams enable processing of data as it arrives, making them ideal for time-sensitive applications.

Conclusion

Node.js piping patterns offer a robust way to handle streaming data. Whether you’re compressing files, transforming HTTP responses, or chaining complex operations, the pipe() method makes your code cleaner and more efficient.

Try out the examples above and explore how streams can simplify your Node.js applications!

Happy coding! 🚀

Defend your design decisions

2024-12-04T00:00:00+00:00

Part of being a good designer is being able to convince people why you chose a certain approach. Some decisions take more time than others, and some need to be shared and reviewed with multiple people before moving forward. Weighing the opinions of others and reaching an agreement is also part of the job. In this post, we’ll go over some things to consider when defending your designs, and how to find the best course of action to deliver the best solution possible.

Learning to listen

Usually, when working on designs, there are stakeholders to check in with about how progress is going. From clients to project managers to people in positions of power and influence, they all want to see how things are developing and will offer their input and perspective. Some of their ideas could work and others probably won’t, but it’s still important to listen and avoid dismissing them outright. People generally want to be heard and to have their opinions valued, so a sudden dismissal could hurt both the relationship and the development of the product being built. Whenever stakeholders offer input, listen to what they say, try to understand the thought process behind their reasoning, and let them know that you appreciate their ideas and will evaluate them. Even if you don’t use their feedback in the design, the next time you meet you can explain that you reviewed their ideas and share your reasoning for not moving forward with them. The stakeholders will see that you valued their opinions and, even if you didn’t design based on their approach, will appreciate the transparency and willingness to listen.

Anticipate the outcome

Once you learn how your stakeholders think, you can start to anticipate the outcome of design meetings. Whenever you’re in charge of presenting a design demo, review the list of people who will have input or questions and prepare talking points that explain the reasoning behind your decisions. You can also create quick alternate versions of the designs that resemble what the stakeholders might propose, and explain why that approach would be less beneficial than yours. Getting to know who you’re presenting to and anticipating these outcomes makes it easier to defend your proposal. Going through this process could also change your mind: you may find that your proposal is not the best option compared to one a stakeholder might propose, so stay open to change where it applies.

Research and connect

You can defend your decisions and explain your reasoning, but if you don’t have anything to back it up, it gets harder to convince your stakeholders to move forward with the proposed approach. Always do research before designing, and use that research to back up your designs. Whether it’s patterns previously used in the industry or in the product itself, or data that suggests going in a certain direction, these elements are essential not only to build an optimal design for the project but also to advocate for it. Research can also come from reviewing decisions with fellow team members. If you can, gather feedback from your colleagues, review the designs together to understand their strengths and weaknesses, and iterate on that feedback until you have an approach that can be presented to the stakeholders. You can present your designs to your colleagues the way you would present them to the stakeholders and identify key areas to improve in the presentation. Don’t forget to do the same for your teammates when they come looking for help and feedback. A healthy environment where people are willing to help each other benefits not only the company but also your own growth and knowledge, which leads to better designs.

Conclusion

Defending your designs is a skill that is developed over time and with experience, not only by building designs but also by gathering feedback, listening to opinions, and researching and connecting with team members and colleagues. If you make a mistake during a presentation, don’t let it get to you. Review what went wrong and what you can improve so you can keep moving forward and continue learning. Don’t forget to check in with your colleagues, and even your stakeholders, whenever it’s possible and appropriate, to learn more about their thought processes and how they perceive things. You may be surprised by how similar you are to them in some ways. Continue to learn from everyone, and this will guide you to be a better designer.

What is the Event Demultiplexer in Node.js?

2024-11-28T00:00:00+00:00

The event demultiplexer is a crucial component in Node.js that allows it to efficiently handle multiple I/O operations at once. It works behind the scenes as part of the Node.js runtime to manage asynchronous tasks like file reading, network requests, or database operations. While it’s closely related to the event loop, the event demultiplexer is the actual mechanism that waits for I/O events to complete and signals the event loop when they are ready to be processed.

In essence, the event demultiplexer serves as a gatekeeper: it monitors various I/O operations happening in the system and tells Node.js when an operation is ready to be processed, allowing the event loop to pick it up.

How Does the Event Demultiplexer Work?

To understand the event demultiplexer, it’s important to break down its role in handling asynchronous operations. The demultiplexer is a mechanism that waits for multiple events from various I/O sources and notifies the application when one or more events are ready to be processed.

Here’s how the event demultiplexer fits into Node.js’s architecture:

An I/O Operation is Initiated: When Node.js makes an asynchronous call (e.g., reading from a file or making a network request), it doesn’t block the main thread while waiting for the result. Instead, it delegates the task to the event demultiplexer.
The Event Demultiplexer Watches: The event demultiplexer monitors various file descriptors (which represent things like network connections, open files, or pipes). This includes knowing when these descriptors are ready for actions like reading, writing, or when a network packet has arrived. Importantly, the demultiplexer waits efficiently, consuming minimal CPU resources.
Demultiplexing Events: When one or more I/O operations complete, the event demultiplexer wakes up. It notifies the event loop that specific tasks are ready to proceed. This notification includes which file descriptor is ready and what type of operation can be performed (e.g., read, write).
Event Loop Processes the Callbacks: After the event demultiplexer signals the event loop, the loop picks up the corresponding callback and processes it. The callback is then executed with the result of the I/O operation, such as a file’s contents or the response from an API call.

The Event Demultiplexer in Action

Let’s walk through an example of how the event demultiplexer works in a real-world scenario.

Starting an Asynchronous File Read:
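A minimal sketch of such a read is shown below. The example.txt file is written to the temp directory first so the snippet is self-contained, and the order array is only there to make the execution order visible.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a small example.txt so the sketch runs on its own.
const filePath = path.join(os.tmpdir(), 'example.txt');
fs.writeFileSync(filePath, 'Hello from example.txt');

const order = [];

order.push('before readFile');

// Start the asynchronous read; the callback runs later, once the data is ready.
fs.readFile(filePath, 'utf8', (err, data) => {
  if (err) throw err;
  order.push('callback');
  console.log('File contents:', data);
  console.log('Execution order:', order.join(' -> '));
});

// This line runs immediately; the program did not wait for the read to finish.
order.push('after readFile');
```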

In this example, the fs.readFile function doesn’t block the execution of the program. Instead, the file read operation is delegated to the event demultiplexer.

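Monitoring the File Descriptor:
The event demultiplexer now monitors the file descriptor associated with the example.txt file and waits for a signal that the data is ready. (Strictly speaking, libuv services reads of regular files on its thread pool rather than through epoll or kqueue, which are used for sockets and pipes; either way, completion is reported back to the event loop.)
Notifying the Event Loop:
Once the read completes, the event loop is notified that the result for example.txt is ready.
Processing the Callback:
The event loop then runs the callback passed to fs.readFile, handing it the file’s contents (or an error).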

Under the Hood: How the Event Demultiplexer Works on Different Platforms

Node.js uses libuv, a library that abstracts the system’s I/O mechanisms to provide a consistent non-blocking I/O API across different operating systems. Under the hood, the event demultiplexer uses different platform-specific system calls depending on the OS:

epoll (Linux): Epoll is a highly efficient event notification system available on Linux. It allows the event demultiplexer to track large numbers of file descriptors with minimal overhead.
kqueue (macOS, FreeBSD): Kqueue is the equivalent of epoll on macOS and FreeBSD systems. It also enables efficient monitoring of I/O events.
IOCP (Windows): Windows uses I/O Completion Ports (IOCP), which work slightly differently but achieve the same goal of non-blocking I/O.

These system-specific implementations allow the event demultiplexer to manage multiple I/O operations simultaneously without creating performance bottlenecks, regardless of the underlying operating system.

The Role of the Event Demultiplexer in Node.js Performance

The event demultiplexer is what enables Node.js to efficiently handle multiple connections and asynchronous operations using a single thread. It avoids the overhead of managing multiple threads and context switching, which can be costly in terms of memory and CPU usage.

Efficient Resource Management: By delegating I/O operations to the event demultiplexer, Node.js can continue processing other tasks without waiting for I/O-bound operations to complete. This is particularly important for web servers that need to handle thousands of requests simultaneously.
Scalability: The event demultiplexer allows Node.js to scale well under high load. It can handle thousands of concurrent I/O operations (like database queries or file system reads) without spawning a new thread for each operation. This makes it ideal for applications like APIs or real-time communication platforms, where I/O operations dominate CPU-bound tasks.

Why Was the Event Demultiplexer Chosen for Node.js?

The event demultiplexer was chosen as a core component of Node.js for several reasons:

1. Non-blocking I/O:

Node.js is designed to handle asynchronous I/O operations efficiently. The event demultiplexer allows the system to wait for multiple I/O events simultaneously without blocking the main thread. This aligns perfectly with the non-blocking, event-driven nature of JavaScript.

2. Single-threaded Simplicity:

Instead of managing complex multithreaded systems with shared memory and synchronization issues, Node.js uses a single-threaded approach. The event demultiplexer fits this model well, as it offloads waiting for I/O events and wakes up the event loop only when necessary.

3. Scalability Without the Overhead of Threads:

Multithreaded systems can become complex and resource-intensive, especially when handling a large number of connections or I/O operations. The event demultiplexer, paired with the event loop, allows Node.js to scale efficiently without consuming additional resources for each connection.

Conclusion

The event demultiplexer is one of the key reasons Node.js is so efficient at handling asynchronous I/O operations. By monitoring file descriptors and signaling the event loop when tasks are ready to be processed, the demultiplexer enables Node.js to perform at high concurrency without the need for multiple threads.

This design choice allows Node.js to handle thousands of simultaneous connections and I/O operations with minimal overhead, making it the ideal platform for real-time applications, web servers, and microservices. Understanding the event demultiplexer helps explain why Node.js is so scalable, performant, and popular in the world of asynchronous, non-blocking applications.

Algorithmic Problem Solving: Frequency Counter, Multiple Pointers, and Divide & Conquer Patterns

2024-11-07T00:00:00+00:00

When it comes to solving algorithmic problems, efficiency is key. Writing code that is readable and functional is important, but as data grows, you need strategies that can handle large inputs in an optimal way. Enter algorithmic patterns. These are battle-tested approaches that help us structure our solutions, minimize complexity, and maximize performance.

In this post, we’ll explore three fundamental patterns: Frequency Counter, Multiple Pointers, and Divide and Conquer. Understanding these will not only improve your problem-solving skills but also allow you to tackle common coding challenges in a smart and efficient manner.

1. The Frequency Counter Pattern

The Frequency Counter pattern is an approach often used to compare different pieces of data by tallying their occurrences. Instead of using nested loops to compare elements, this pattern allows us to convert operations with O(n^2) time complexity into something more manageable, typically O(n).

When to Use

You need to compare data elements (arrays, strings, etc.) based on frequency of occurrence.
You’re asked to detect duplicates or patterns in a dataset.

Example Problem

Suppose you’re asked to write a function that checks if two strings are anagrams of each other. An anagram is a word or phrase formed by rearranging the letters of another. For example, “cinema” and “iceman” are anagrams.

Naive Solution
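A minimal sketch of the sorting-based approach; the function name isAnagramSorted is our own choice for illustration:

```javascript
// Two strings are anagrams if their sorted characters are identical.
function isAnagramSorted(first, second) {
  if (first.length !== second.length) return false;

  const sortChars = (str) => str.split('').sort().join('');
  return sortChars(first) === sortChars(second);
}

console.log(isAnagramSorted('cinema', 'iceman'));
console.log(isAnagramSorted('hello', 'world'));
```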

This solution works but it uses sorting, which takes O(n log n) time. Can we do better? Yes, with a frequency counter!

Optimized Solution Using Frequency Counter
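One possible implementation, sketched with a Map as the counter (a plain object would work too); the name isAnagram is illustrative:

```javascript
// Tally the characters of the first string, then "spend" them with the second.
function isAnagram(first, second) {
  if (first.length !== second.length) return false;

  const counts = new Map();
  for (const char of first) {
    counts.set(char, (counts.get(char) || 0) + 1);
  }

  for (const char of second) {
    const remaining = counts.get(char);
    // Missing (undefined) or already used up (0): not an anagram.
    if (!remaining) return false;
    counts.set(char, remaining - 1);
  }

  return true;
}

console.log(isAnagram('cinema', 'iceman'));
console.log(isAnagram('aab', 'abb'));
```

Each string is walked once, and Map lookups are constant time on average, so the work grows linearly with the input length.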

By using frequency counters, we reduced time complexity to O(n), which is more efficient than the sorting approach.

2. The Multiple Pointers Pattern

The Multiple Pointers pattern is particularly useful for solving problems involving arrays or strings where you need to work with two or more elements simultaneously. Instead of using nested loops to compare items, we use multiple pointers to traverse and compare data, often in O(n) time.

When to Use

You need to compare two elements in a sorted array or string.
You are dealing with problems that involve finding pairs or subsets that meet certain criteria (e.g., sums, differences).

Example Problem

Write a function that accepts a sorted array and finds the first pair where the sum is zero.

Solution Using Multiple Pointers
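A minimal sketch, assuming the input array is sorted in ascending order; the name sumZero and the choice to return undefined when no pair exists are ours:

```javascript
// Returns the first pair (working from the outside in) that sums to zero,
// or undefined if there is none.
function sumZero(sortedNumbers) {
  let left = 0;
  let right = sortedNumbers.length - 1;

  while (left < right) {
    const sum = sortedNumbers[left] + sortedNumbers[right];

    if (sum === 0) return [sortedNumbers[left], sortedNumbers[right]];

    if (sum > 0) {
      right -= 1; // sum too large: move the right pointer to a smaller value
    } else {
      left += 1; // sum too small: move the left pointer to a larger value
    }
  }

  return undefined;
}

console.log(sumZero([-4, -3, -1, 0, 1, 2, 5]));
console.log(sumZero([1, 2, 3]));
```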

In this case, we use two pointers: one starting at the beginning and the other at the end of the array. By shifting these pointers towards each other, we avoid the need for a nested loop, reducing time complexity to O(n).

3. The Divide and Conquer Pattern

The Divide and Conquer pattern is a powerful technique for solving problems by breaking them down into smaller, more manageable sub-problems. It’s often associated with recursive approaches and is the backbone of efficient algorithms like Merge Sort and Binary Search.

When to Use

You need to break a problem into smaller pieces and solve each one individually.
The problem can naturally be divided into two or more parts (e.g., searching or sorting algorithms).

Example Problem

Let’s implement Binary Search, which searches for a target element in a sorted array in O(log n) time.

Binary Search Using Divide and Conquer
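A minimal iterative sketch, assuming an array sorted in ascending order; returning -1 when the target is missing is a convention we picked for the example:

```javascript
// Returns the index of target in sortedNumbers, or -1 if it is not present.
function binarySearch(sortedNumbers, target) {
  let low = 0;
  let high = sortedNumbers.length - 1;

  while (low <= high) {
    const middle = Math.floor((low + high) / 2);
    const value = sortedNumbers[middle];

    if (value === target) return middle;

    if (value < target) {
      low = middle + 1; // target can only be in the right half
    } else {
      high = middle - 1; // target can only be in the left half
    }
  }

  return -1;
}

console.log(binarySearch([1, 3, 5, 7, 9, 11], 7));
console.log(binarySearch([1, 3, 5, 7, 9, 11], 4));
```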

Here, the array is continually split in half, and we narrow down the search to the correct segment. This results in a time complexity of O(log n), making it much more efficient than linear search (O(n)).

Conclusion

Each of these patterns—the Frequency Counter, Multiple Pointers, and Divide and Conquer—provides a different way to approach common algorithmic challenges. By understanding and applying these patterns, you can write more efficient code and solve problems faster, both in technical interviews and real-world applications.

Next time you’re faced with a coding problem, think about whether any of these patterns might help you optimize your solution. Happy coding!