LLMs

Why we haven’t gotten rid of Langchain yet

Nina Holm-Jensen
Senior Data Scientist

Based on a conversation between Todai’s data scientists.

Prerequisites:

A basic understanding of how a RAG system works (if not, you can read any article on the topic, like this one, and come back)

Why would you want to get rid of Langchain?

Langchain is a long-standing framework for working with LLMs. It is conceptually built around “chains”: a way of composing different flows, agents, and other functionality into a single pipeline. It is designed to be very plug-and-play, with standardised interfaces between the different modules. In principle, you should be able to call the same functions for OpenAI as you do for AWS’s Bedrock suite, which is nice and simple.
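For illustration, a minimal chain might look something like the sketch below (a hedged example: exact module paths and model names shift between Langchain releases and providers):

    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser

    # A prompt, a model and an output parser, composed into one "chain".
    prompt = ChatPromptTemplate.from_template("Answer briefly: {question}")
    llm = ChatOpenAI(model="gpt-4o-mini")  # in principle, swap in another provider's chat model here
    chain = prompt | llm | StrOutputParser()

    print(chain.invoke({"question": "What is a vector database?"}))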

But the major problem with Langchain is exactly this plug-and-play mentality.

Langchain invites you not to think too deeply about the implementation details of your work. Everything is abstracted away into these chain “links”, and you just have to configure a few things. But under the hood, the implementation details ARE very different and very complicated. LLMs change at a breakneck pace, and every provider has its own quirks. Langchain is not exactly stringent about following software engineering principles: configuration, execution logic, database connections and data transformations often all happen in one big bowl of spaghetti code.

To fit the chain structure, the under-the-hood code often looks something like the caricature below (not actual Langchain source, but the shape will be familiar to anyone who has gone digging):
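    DEFAULT_TEMPLATE = "Answer using this context:\n{context}\n\nQuestion: {question}"

    class RetrievalChainLink:
        """One 'link' that quietly mixes configuration, I/O and data transformation."""

        def __init__(self, config: dict, vector_store):
            # Configuration, connection handling and defaults all live in the constructor.
            self.k = config.get("k", 25)
            self.store = vector_store  # the database connection rides along with the config
            self.template = config.get("template", DEFAULT_TEMPLATE)

        def __call__(self, inputs: dict) -> dict:
            # Retrieval, data transformation and prompt-building happen in one method.
            docs = self.store.similarity_search(inputs["question"], k=self.k)
            context = "\n".join(doc.page_content for doc in docs)
            inputs["prompt"] = self.template.format(context=context, question=inputs["question"])
            return inputs  # and if an upstream link returned nothing useful, it flows silently onward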

This laissez-faire approach leads to fun surprises. One time, a colleague of mine couldn’t understand why she kept getting a weird bug in her chain, until she went on a code deep dive and realised that, upstream in her LLM chain, Langchain had implemented a function which simply returned None. No errors. No NotImplementedError. Just an empty return value, which was propagated further down the chain’s modules until it eventually caused an error.

The concept of “chains” also adds another level of abstraction, which can be hard to explain to non-data scientists.

It is worth noting that Thoughtworks’s Tech Radar even moved Langchain to “Hold” last year, meaning they recommend against using it.

And, as a senior team member provocatively asked us all: if Langchain really abstracts away all the details, what is the point of us? Couldn’t any software engineer do exactly what we do?

So why do we still stick by it?

Because speed.

Langchain still shines in the prototyping stages, specifically because it allows you to ignore the details for now. The time from idea to execution is, frankly, exceptional.

Say I have built a RAG system which is running happily in production. It is configured to pull the 25 most relevant document chunks in the retrieval stage, and these are used when it writes an answer.

But now, my project manager asks me to improve the quality of the answers.

I know, through reading and research, that one way (among many) to improve quality is to improve the context sent to the chatbot. Maybe I could query 100 chunks and then rerank them to 25, turning my context retrieval into a two-step process with limited loss of performance.
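Conceptually, the two-step retrieval is nothing more than the sketch below, with a stand-in scoring function playing the part of whichever reranker I end up choosing:

    def retrieve_with_rerank(query, vector_store, rerank_score, fetch_k=100, top_k=25):
        """Fetch a wide candidate set, then keep only the best-scoring chunks."""
        # Step 1: cheap, broad retrieval from the vector store.
        candidates = vector_store.similarity_search(query, k=fetch_k)
        # Step 2: a more expensive, more precise scoring of every candidate.
        candidates.sort(key=lambda doc: rerank_score(query, doc), reverse=True)
        return candidates[:top_k]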

Wanting to pursue this route, I need to decide on a reranking algorithm. As a good, up-to-date data scientist, I know that such algorithms are legion. A few of them are:

  • Maximal marginal relevance (MMR)
  • Cross-encoder models
  • Cohere’s reranker API
  • LLM-as-reranker
  • An entirely different embedding model and vector similarity score

Each of these options would be very time-consuming to implement from scratch. The worst part is, I cannot know in advance which method will outperform the others (nor whether any of them will outperform my baseline). None of them are objectively best – it all depends on the quirks of my specific data and implementation. I also cannot know how the change will influence my solution. For example, MMR is known to be fast, while LLM-as-reranker is slow. Maybe it is prohibitively slow on my specific dataset?

Langchain has a ready-to-go implementation for each of these options. It also has reranker modules designed to be easy to plug into my chain. With Langchain, I can implement and test each of these options quickly, and go back to my project manager within the week with a concrete plan for better-quality answers.
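As a sketch of how little code that swap requires (a hedged example: module paths vary between Langchain releases, the cross-encoder model name is just one possible choice, and vector_store stands in for the store my RAG already uses), plugging a cross-encoder reranker onto the existing retrieval step might look like this:

    from langchain.retrievers import ContextualCompressionRetriever
    from langchain.retrievers.document_compressors import CrossEncoderReranker
    from langchain_community.cross_encoders import HuggingFaceCrossEncoder

    # Cast a wide net of 100 candidate chunks from the existing vector store...
    base_retriever = vector_store.as_retriever(search_kwargs={"k": 100})

    # ...then let a cross-encoder rerank them down to the 25 best ones.
    reranker = CrossEncoderReranker(
        model=HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-base"),
        top_n=25,
    )
    retriever = ContextualCompressionRetriever(
        base_compressor=reranker,
        base_retriever=base_retriever,
    )

    docs = retriever.invoke("How do I renew my parking permit?")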

Why is it so fast?

Simple. Langchain is still one of the biggest frameworks around.

This means that the internet is bursting with guides, implementations, code examples and things of that nature. Whatever I want to test, Langchain and its huge community can provide.

It has plug-and-play modules for almost every provider you need, including database providers.

It is still one of the first to implement new algorithms, methods and other state-of-the-art designs.

In this case, big really does mean fast.

So couldn’t anyone just do the data scientist’s job?

Eventually, maybe.

But the entire field of LLMs is moving so fast these days that it is still a full-time job to keep abreast of the news cycle, not to mention tinkering with the new tools and building up the hands-on experience necessary to develop an intuition around real-life challenges.

While any engineer can pick up Langchain and implement anything, it still takes an expert to know exactly WHAT to implement to accommodate someone’s specific needs and challenges.

What are the Langchain alternatives?

I know a colleague of mine is working on an article comparing all the bigger LLM frameworks, so stay tuned for that.

Long-term, I have no doubt that we will see better, more mature frameworks take over the marketplace. Once the initial gold rush is over, we will have a diverse toolbox capable of covering most use cases. In that sense, LLMs are no different from other new, shiny technologies.

Today, however, we will venture the claim that the real alternative to Langchain is to not use a big framework at all. All the major providers have APIs for their services, and if you’re self-hosting your LLM, you probably have very specific needs anyway. LLM flows are easy to implement (once you know exactly how you want to configure them) and much less error-prone in production.
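To make that concrete, a bare-bones flow against a provider API can be as small as the sketch below (using the OpenAI Python SDK as an example; the retrieval function is whatever your own vector store already exposes):

    from openai import OpenAI

    client = OpenAI()  # reads the API key from the environment

    def answer(question: str, retrieve) -> str:
        """A minimal RAG flow: retrieve chunks, build a prompt, call the model."""
        chunks = retrieve(question, k=25)  # your own retrieval function
        context = "\n\n".join(chunks)
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Answer using only the provided context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return response.choices[0].message.content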

Custom-writing your LLM flow has another, less obvious benefit. To custom-write, you need to really, truly understand exactly what you are implementing. This increases the demand for data scientists, but it also seriously decreases bugs and unintended system behaviour. Non-deterministic systems like LLMs are already fraught with unexplainable behaviours – we want to reduce those as much as possible.

This is probably where our short-term future lies. Move fast and break things in Langchain, then rewrite to production-friendly code once you have tested a gazillion complex options and have moved the project into consolidation/maintenance mode.

But is there really an “after” the prototype stage?

The senior team members leaned back, arms crossed, at this point. Because we know: nothing is more permanent than a temporary solution.

To be entirely candid, it is very probable that we will never rewrite our existing solutions. Or at least, we will never rewrite until Langchain introduces one bug too many, or one performance issue too many, and management decides it is worth prioritizing.

As long as the entire field is running headfirst towards the gold rush, no one will be able to consolidate for a while yet. Everything we implement, we can soon improve tremendously. Everything can change at a moment’s notice. If we leave Langchain completely behind and some new research paper revolutionises reranking, we need to test the new approach – and without all the Langchain benefits I mentioned. The data scientist will then have to either reconstruct our entire LLM flow in Langchain, or implement the cool new thing from scratch. Both are time-consuming.

So, like almost every other decision in life, it is a cost-benefit consideration. Do we need low-bug, high-performance production code more than we need to move fast?

Only the project manager can answer that.

But for now, Langchain absolutely still has its defensible place in a field that is all about moving fast.