Langchain is probably the easiest way to build LLM-based applications.
According to Andrej Karpathy, LLMs are like operating systems that allow developers to build apps using their broad-ranging capabilities. Building on that analogy, Langchain would be analogous to a framework like .NET, Django, or Express.
As per Langchain’s State of AI 2023 report:
- 42% of LLM applications involve some kind of retrieval system
- 17% involve an agentic system.
There is a huge push towards agentic systems from many giants of AI, including people like Andrew Ng and Andrej Karpathy.
What does Langchain do?
In a nutshell, Langchain:
- makes it easier to interact with LLMs
- makes it easier to provide additional context
- makes it easier to fetch external data and store it in an easy-to-query format.
Let’s look at the simplest possible example:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
llm.invoke("What do you know about SWE.today?")
"""
SWE.today is a website that provides resources
and information related to software engineering.
It offers articles, tutorials, job listings, and
other content aimed at software engineers and
individuals interested in the field of technology.
"""
While we have an answer, it lays bare the biggest issue with LLMs.
Hallucinations!!
SWE.today doesn’t do job listings. At least not yet!
In fact, the entire description is imaginary. GPT didn’t know what SWE.today was, so it made up the best possible answer it could. Here are some other answers it generated.
# Response 1
"""
SWE.today is a website that provides resources and
information for women in the field of software
engineering. It was founded in 2018 by a group of
female software engineers with the goal of promoting
diversity and inclusion in the tech industry.
"""
# Response 2
"""
SWE.today is not a known website or platform.
It is possible that it is a personal website or
a company website for a specific organization or
group. Without more information, it is difficult
to determine the exact purpose or content of SWE.today.
"""
# Sad, but true.
While this is fine for fooling around, imagine having this issue in a business app. One way to solve it is to provide context and force the LLM to use only the information available in that context.
For example, consider the following prompt:
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don’t know the answer, just say that you don’t know. Use three sentences maximum and keep the answer concise. Question: {question}
Context: {context}
Answer:
We could fetch the context from a database, pass it in along with the question, and hopefully get a better answer. But this brings us to another problem.
Managing all the prompts, managing the context, and keeping everything updated can quickly become nightmarish. The people at Langchain think hard about the best ways to interact with LLMs and build tooling around them.
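For example, Langchain’s prompt templates take care of the placeholder bookkeeping for us. Here is a minimal sketch (the template text and values are my own illustration, not a Langchain-provided prompt):
# Build a reusable prompt template
from langchain_core.prompts import ChatPromptTemplate
template = ChatPromptTemplate.from_template(
    "Use the given context to answer the question.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)
# The template validates and fills in the placeholders
prompt_value = template.invoke({"question": "What is SWE.today?", "context": "..."})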
Let’s look at a slightly more complicated example to understand the core concepts of Langchain.
Loaders
# Load docs
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()
Loaders are used to, well, load data from an external source. Langchain makes it easy to load data from a wide variety of sources, including websites, PDFs, Word documents, CSVs, and many others.
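As a quick illustration, here is how loading a PDF might look (the file path is hypothetical, and PyPDFLoader requires the pypdf package):
# Load a local PDF instead of a web page
from langchain_community.document_loaders import PyPDFLoader
pdf_loader = PyPDFLoader("my_document.pdf")
pdf_docs = pdf_loader.load()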
Splitters
# Split the source documents
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
LLMs have a fixed context window, and the computational complexity of attention grows as O(n^2) with input length. Moreover, if you are using an LLM API, you pay by the number of tokens used. It is therefore not a good idea to send the entire context to the LLM even if it fits.
Langchain provides tools to split, vectorize and store the available data to make it easier to send just the relevant context.
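To get a feel for what the splitter produced, we can peek at the chunks (a quick sketch using the splits from above):
# Inspect the chunks
print(len(splits))                   # number of chunks
print(splits[0].page_content[:200])  # start of the first chunk
print(splits[0].metadata)            # source metadata carried over from the loader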
Vector store
# Vectorize the source document chunks
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()
The data chunks can be vectorized and stored for easy retrieval. Why vectorized? Because similarity search over embedding vectors is a great way to find semantically related text.
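As a sketch, the retriever can be queried directly to see which chunks come back for a question (the query text here is just an example):
# Fetch the chunks most similar to a query
relevant_docs = retriever.invoke("What is Task Decomposition?")
for doc in relevant_docs:
    print(doc.page_content[:100])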
Community Prompts
# RAG prompt
from langchain import hub
prompt = hub.pull("rlm/rag-prompt")
"""
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""
The good people at Langchain have created prompts for many different use cases. The prompt that we saw earlier can be loaded from the Langchain hub.
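The pulled prompt is itself a template, so we can fill it in directly (the context and question here are made up for illustration):
# Fill in the community prompt
prompt_value = prompt.invoke({
    "context": "SWE.today is a blog about software engineering.",
    "question": "What is SWE.today?",
})
print(prompt_value.to_messages())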
Models
# Make sure the OpenAI API key is available in the env
import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # replace with your actual key
# Load the LLM Model
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
Langchain provides wrappers over almost all the popular models. The models fall into two categories:
- Text models (or just LLMs)
- Chat models
Text models take the prompt text as input and return model-generated text as output. Chat models take a list of AI, Human, and System messages and return an AI message.
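To make the chat-model interface concrete, here is a minimal sketch using the llm defined above (the message contents are just examples):
# Chat models consume a list of typed messages and return an AI message
from langchain_core.messages import HumanMessage, SystemMessage
messages = [
    SystemMessage(content="You are a concise assistant."),
    HumanMessage(content="What is a vector store?"),
]
response = llm.invoke(messages)  # response is an AIMessage
print(response.content)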
Chains and LCEL
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
We can compose all of the above components into a callable chain. Langchain has a concept of a Runnable, a base class that overrides the bitwise-or operator (|) and allows us to use pipes to chain runnables together.
def __or__(self, other) -> RunnableSerializable[Input, Other]:
    """Compose this runnable with another object to create a RunnableSequence."""
    return RunnableSequence(self, coerce_to_runnable(other))
# Invoke the chain with a question
rag_chain.invoke("What is Task Decomposition?")
We can now call the invoke method with a question and get a response from the LLM. Behind the scenes, Langchain does the following (see the sketch after this list):
- Uses the retriever to find the chunks relevant to the question and builds the context from them
- Passes the context along with the question to the LLM and returns the generated output
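To make that concrete, here is roughly the same flow written out step by step, reusing the components defined above (a sketch of the data flow, not Langchain internals):
# The chain, unrolled by hand
question = "What is Task Decomposition?"
# 1. Retrieve relevant chunks and format them into a context string
relevant_docs = retriever.invoke(question)
context = format_docs(relevant_docs)
# 2. Fill the prompt, call the LLM, and parse the output
prompt_value = prompt.invoke({"context": context, "question": question})
answer = llm.invoke(prompt_value)
print(StrOutputParser().invoke(answer))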
Conclusion
All this barely scratches the surface. In future posts we will get more hands-on and create a Q&A RAG system using Langchain. For more details on Langchain, visit their awesome documentation.
Bonus
If you are curious how Langchain Runnables are able to handle the pipe operator, stick around for a minute longer.
In Python we have these things called dunder methods. Dunder methods begin and end with two underscores; you might have seen the __init__() method.
Among other things, dunder methods let us override the behaviour of common operators like +, -, |, and ^.
Let’s create a Pipeable class that overrides the bitwise-or operator.
class Pipeable:
    def __init__(self, value=None):
        self.value = value

    def __or__(self, other):
        """Overrides the bitwise-or operator."""
        return other(self)

    def __str__(self):
        """Overrides the string representation used by print()."""
        return f"Pipeable(value: {self.value})"
We have used three dunder methods above: __init__, __or__, and __str__. Let’s create a Pipeable object.
one = Pipeable(1)
print(one)
# Output:
# Pipeable(value: 1)
This is good, but we can’t do much with just this one object. Let’s create some things that can interact with this Pipeable object.
class PlusOne(Pipeable):
    def __call__(self, pipeable):
        return Pipeable(pipeable.value + 1)

class MinusTwo(Pipeable):
    def __call__(self, pipeable):
        return Pipeable(pipeable.value - 2)
Here we see another dunder method, __call__. The __call__ method allows an object to be called like a function.
plus_one = PlusOne()
two = one | plus_one
print(two)
minus_two = MinusTwo()
zero = two | minus_two
print(zero)
three = one | plus_one | plus_one
print(three)
# Output:
# Pipeable(value: 2)
# Pipeable(value: 0)
# Pipeable(value: 3)
The Pipeable class is very basic and not exactly how Langchain handles its Runnables. But it should give you an idea of how you can use dunder methods to override default operator behavior.
In a future post we will explore the use of dunder methods extensively.