{"id":8215,"date":"2023-11-30T06:18:30","date_gmt":"2023-11-30T14:18:30","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/?p=8215"},"modified":"2025-04-24T17:04:07","modified_gmt":"2025-04-24T17:04:07","slug":"using-advanced-retrievers-in-langchain","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/","title":{"rendered":"Using Advanced Retrievers in LangChain"},"content":{"rendered":"\n<section class=\"section section--body\">\n<div class=\"section-divider\"><\/div>\n<div class=\"section-content\">\n<div class=\"section-inner sectionLayout--insetColumn\">\n<h2 class=\"graf graf--h4\">More Techniques to Improve Retrieval Quality<\/h2>\n<figure class=\"graf graf--figure\">\n<\/figure><\/div><\/div><\/section>\n\n\n\n<figure class=\"wp-block-image alignnone graf-image\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/0*A0armiPUzxdEUT1A\" alt=\"Advanced retrievers in LangChain with Comet ML and CometLLM\"\/><figcaption class=\"wp-element-caption\">Photo by <a href=\"https:\/\/unsplash.com\/@designbytholen?utm_source=medium&amp;utm_medium=referral\">Regine Tholen<\/a> on\u00a0<a href=\"http:\/\/Unsplash.com\">Unsplash<\/a><\/figcaption><\/figure>\n\n\n\n<p class=\"graf graf--p\">If you\u2019ve ever hit the wall with basic retrievers, it\u2019s time to gear up with some \u201cadvanced\u201d retrievers from LangChain.<\/p>\n\n\n\n<p class=\"graf graf--p\">This isn\u2019t just an upgrade; it\u2019s a new way to think about digging through data. Picture this: instead of a single line of inquiry, you deploy a squad of queries tailored to scout out a broader intel landscape. You\u2019re not just searching; you\u2019re launching a multi-pronged investigation into your database. 
This blog will arm you with everything you need to master advanced retrievers, helping you outsmart traditional search limitations and get closer to the heart of the matter.<\/p>\n\n\n\n<p class=\"graf graf--p\">For anyone looking to sharpen their retrieval results, let\u2019s dive into how these retrievers work and how they can change your game.<\/p>\n\n\n\n<h3 class=\"wp-block-heading graf graf--h3\">MultiQueryRetriever<\/h3>\n\n\n\n<p class=\"graf graf--p\">The <code class=\"markup--code markup--p-code\">MultiQueryRetriever<\/code> addresses the limitations of distance-based similarity search by generating multiple alternative &#8220;perspectives&#8221; on your query\/question.<\/p>\n\n\n\n<p class=\"graf graf--p\">By generating alternative questions and retrieving documents based on those questions, you can cover a broader range of information and increase the chances of finding the most relevant content. To determine whether you need the MultiQueryRetriever, consider the following:<\/p>\n\n\n\n<ul class=\"postList\">\n<li class=\"graf graf--li\">If you use a distance-based similarity search and want to improve retrieval accuracy, the MultiQueryRetriever can be a valuable tool.<\/li>\n<li class=\"graf graf--li\">If you want to provide a more comprehensive set of results to the user by considering different perspectives on their question, the MultiQueryRetriever is a suitable choice.<\/li>\n<li class=\"graf graf--li\">If you have a vector database with many documents and want to retrieve the most relevant ones, the MultiQueryRetriever can help you achieve that.<\/li>\n<\/ul>\n\n\n\n<p class=\"graf graf--p\">In short, the MultiQueryRetriever retrieves relevant documents based on multiple generated queries, improving retrieval accuracy, result diversity, and performance over large vector databases.<\/p>\n\n\n\n<section class=\"section section--body\">\n<div class=\"section-divider\">\n<hr class=\"section-divider\">\n<\/div>\n<div class=\"section-content\">\n<div class=\"section-inner sectionLayout--insetColumn\">\n<blockquote class=\"graf graf--pullquote\"><p>Want to learn how to build modern software with LLMs using the newest tools and techniques in the field? <a class=\"markup--anchor markup--pullquote-anchor\" href=\"https:\/\/www.comet.com\/production\/site\/llm-course\/?utm_source=Heartbeat&amp;utm_medium=referral&amp;utm_content=Medium&amp;utm_campaign=Heartbeat_LangChain_Series_HS\" target=\"_blank\" rel=\"noopener ugc nofollow\" data-href=\"https:\/\/www.comet.com\/production\/site\/llm-course\/?utm_source=Heartbeat&amp;utm_medium=referral&amp;utm_content=Medium&amp;utm_campaign=Heartbeat_LangChain_Series_HS\">Check out this free LLMOps course<\/a> from industry expert Elvis Saravia of&nbsp;DAIR.AI!<\/p><\/blockquote>\n<\/div>\n<\/div>\n<\/section>\n\n\n\n<section class=\"section section--body\">\n<div class=\"section-divider\">\n<hr class=\"section-divider\">\n<\/div>\n<div class=\"section-content\">\n<div class=\"section-inner sectionLayout--insetColumn\">\n<p class=\"graf graf--p\">Consider using the MultiQueryRetriever when you want to overcome the limitations of distance-based similarity search and provide a more comprehensive set of results to the user.<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"1\" data-code-block-lang=\"python\"><span class=\"pre--content\">from langchain.chat_models import ChatOpenAI\nfrom langchain.retrievers.multi_query import MultiQueryRetriever\n\nquestion = \"Why are all great things slow of growth?\"\n\nllm = ChatOpenAI(temperature=0)\n\n# we instantiated the retriever (db) above\nretriever_from_llm = MultiQueryRetriever.from_llm(\n    retriever=db.as_retriever(), llm=llm\n)\n\n# Set logging for the queries\nimport logging\n\nlogging.basicConfig()\nlogging.getLogger(\"langchain.retrievers.multi_query\").setLevel(logging.INFO)\n\nunique_docs = 
retriever_from_llm.get_relevant_documents(query=question)<\/span><\/pre>\n<p class=\"graf graf--p\">And you can see the queries that were generated:<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"1\" data-code-block-lang=\"vbnet\"><span class=\"pre--content\">INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the reasons behind the slow growth of all great things?', '2. Can you explain why great things tend to have a slow growth rate?', '3. What factors contribute to the slow pace of growth in all great things?']<\/span><\/pre>\n<p class=\"graf graf--p\">To see the actual documents retrieved, just display them:<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"2\" data-code-block-lang=\"python\"><span class=\"pre--content\">unique_docs<\/span><\/pre>\n<p class=\"graf graf--p\">You can also supply a prompt along with an output parser to split the results into a list of queries:<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"2\" data-code-block-lang=\"python\"><span class=\"pre--content\">from typing import List\nfrom langchain import LLMChain\nfrom pydantic import BaseModel, Field\nfrom langchain.prompts import PromptTemplate\nfrom langchain.output_parsers import PydanticOutputParser\n\n\n# Output parser will split the LLM result into a list of queries\nclass LineList(BaseModel):\n    # \"lines\" is the key (attribute name) of the parsed output\n    lines: List[str] = Field(description=\"Lines of text\")\n\n\nclass LineListOutputParser(PydanticOutputParser):\n    def __init__(self) -&gt; None:\n        super().__init__(pydantic_object=LineList)\n\n    def parse(self, text: str) -&gt; LineList:\n        lines = text.strip().split(\"\\n\")\n        return LineList(lines=lines)\n\n\noutput_parser = LineListOutputParser()\n\nQUERY_PROMPT = PromptTemplate(\n    input_variables=[\"question\"],\n    template=\"\"\"You are a modern Stoic philosopher who has grown up in the\n    hoods of South Sacramento, California. The teachings of Epictetus helped\n    you overcome ups and downs in life. You are a student and teacher of his\n    teachings.\n    Original question: {question}\"\"\",\n)\n\nllm = ChatOpenAI(temperature=0)\n\n# Chain\nllm_chain = LLMChain(llm=llm, prompt=QUERY_PROMPT, output_parser=output_parser)\n\n# Run\nretriever = MultiQueryRetriever(\n    retriever=db.as_retriever(), llm_chain=llm_chain, parser_key=\"lines\"\n)  # \"lines\" is the key (attribute name) of the parsed output\n\n# Results\nunique_docs = retriever.get_relevant_documents(\n    query=\"How can I live a meaningful life?\"\n)\n\nlen(unique_docs)<\/span><\/pre>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"1\" data-code-block-lang=\"vbnet\"><span class=\"pre--content\">INFO:langchain.retrievers.multi_query:Generated queries: ['As a modern Stoic philosopher who has experienced the challenges of growing up in the hoods of South Sacramento, I can understand the desire to seek a meaningful life. The teachings of Epictetus can indeed provide valuable guidance in this pursuit. Here are some principles that can help you live a meaningful life:', '', '1. Focus on what you can control: Epictetus emphasized the importance of distinguishing between what is within our control and what is not. By focusing our energy on the things we can control, such as our thoughts, actions, and attitudes, we can find meaning in our ability to shape our own lives.', '', '2. Cultivate virtue: According to Stoicism, the ultimate goal in life is to live in accordance with virtue. Virtue encompasses qualities such as wisdom, courage, justice, and temperance. By striving to cultivate these virtues in our daily lives, we can find purpose and meaning in our actions.', '', '3. Embrace adversity: Stoicism teaches us to view adversity as an opportunity for growth and self-improvement. 
Rather than being overwhelmed by the challenges we face, we can choose to embrace them as valuable lessons and opportunities to develop resilience and character.', '', '4. Practice gratitude: Epictetus emphasized the importance of gratitude in finding contentment and meaning in life. By cultivating a mindset of gratitude, we can learn to appreciate the simple joys and blessings that surround us, even in the midst of difficult circumstances.', '', '5. Serve others: Stoicism encourages us to live a life of service to others. By helping and supporting those around us, we can find purpose and fulfillment in making a positive impact on the lives of others.', '', '6. Live in accordance with nature: Stoicism teaches us to align our lives with the natural order of the universe. By accepting the impermanence of things and embracing the present moment, we can find meaning in living in harmony with the flow of life.', '', '7. Seek wisdom: Epictetus believed that the pursuit of wisdom is essential for living a meaningful life. By continuously seeking knowledge, reflecting on our experiences, and learning from others, we can deepen our understanding of ourselves and the world around us.', '', 'Remember, living a meaningful life is a personal journey, and it may take time to fully integrate these principles into your daily life. 
But by embracing the teachings of Epictetus and applying them in your own unique circumstances, you can find purpose, fulfillment, and meaning in your life, regardless of your background or upbringing.']\n17<\/span><\/pre>\n<h3 class=\"graf graf--h3\">Contextual compression<\/h3>\n<p class=\"graf graf--p\">Contextual compression in LangChain is a technique used to compress and filter documents based on their relevance to a given query.<\/p>\n<p class=\"graf graf--p\">It aims to extract only the relevant information from documents, reducing the need for expensive language model calls and improving response quality.<\/p>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Contextual compression is achieved by using a base retriever and a document compressor.<\/strong><\/p>\n<p class=\"graf graf--p\">The base retriever retrieves the initial set of documents based on the query, and the document compressor processes these documents to extract the relevant content. You can use contextual compression when you have a document storage system and want to improve retrieval performance by returning only the most relevant information. 
It is particularly useful when the relevant information is buried within documents containing a lot of irrelevant text.<\/p>\n<p class=\"graf graf--p\">To determine if you need to use contextual compression, consider the following factors:<\/p>\n<ul class=\"postList\">\n<li class=\"graf graf--li\">If your document storage system contains a large amount of data with potentially irrelevant information.<\/li>\n<li class=\"graf graf--li\">If you want to reduce the cost and response time of language model calls by extracting only the relevant content.<\/li>\n<li class=\"graf graf--li\">If you want to improve the overall retrieval performance and quality of your application.<\/li>\n<\/ul>\n<p class=\"graf graf--p\">By using contextual compression, you can enhance the efficiency and effectiveness of your document retrieval process, resulting in better user experiences and optimized resource utilization.<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"1\" data-code-block-lang=\"python\"><span class=\"pre--content\">from langchain.text_splitter import CharacterTextSplitter\nfrom langchain.embeddings import OpenAIEmbeddings\nfrom langchain.document_loaders import TextLoader\nfrom langchain.vectorstores import Chroma\n\n# Helper function for printing docs\ndef pretty_print_docs(docs):\n    print(f\"\\n{'-' * 100}\\n\".join([f\"Document {i+1}:\\n\\n\" + d.page_content for i, d in enumerate(docs)]))\n\ndocuments = TextLoader('\/content\/golden_hymns_of_epictetus.txt').load()\n\ntext_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)\n\ntexts = text_splitter.split_documents(documents)\n\nretriever = Chroma.from_documents(texts, OpenAIEmbeddings()).as_retriever()\n\ndocs = retriever.get_relevant_documents(\"What do the Stoics say of Socrates?\")\n\npretty_print_docs(docs)<\/span><\/pre>\n<h4 class=\"graf graf--h4\">Adding contextual compression with <code class=\"markup--code 
markup--h4-code\">LLMChainExtractor<\/code><\/h4>\n<p class=\"graf graf--p\">We\u2019ll add an <code class=\"markup--code markup--p-code\">LLMChainExtractor<\/code>, which will iterate over the initially returned documents and extract from each only the content that is relevant to the query.<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"2\" data-code-block-lang=\"python\"><span class=\"pre--content\">from langchain.llms import OpenAI\nfrom langchain.retrievers import ContextualCompressionRetriever\nfrom langchain.retrievers.document_compressors import LLMChainExtractor\n\nllm = OpenAI(temperature=0)\n\ncompressor = LLMChainExtractor.from_llm(llm)\n\ncompression_retriever = ContextualCompressionRetriever(base_compressor=compressor,\n                                                       base_retriever=retriever)\n\ncompressed_docs = compression_retriever.get_relevant_documents(\"What do the Stoics say of Socrates?\")\n\npretty_print_docs(compressed_docs)<\/span><\/pre>\n<h3 class=\"graf graf--h3\">LLMChainFilter<\/h3>\n<p class=\"graf graf--p\">The <code class=\"markup--code markup--p-code\">LLMChainFilter<\/code> in LangChain is a component used for filtering and processing documents based on their relevance to a given query.<\/p>\n<p class=\"graf graf--p\">It is a simpler but more robust compressor that uses an LLM chain to decide which of the initially retrieved documents to filter out and which ones to return, without manipulating the document contents.<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"2\" data-code-block-lang=\"python\"><span class=\"pre--content\">from langchain.retrievers.document_compressors import LLMChainFilter\n\nllm = OpenAI(temperature=0)\n\n_filter = LLMChainFilter.from_llm(llm)\n\nfilter_retriever = ContextualCompressionRetriever(base_compressor=_filter,\n                                                       
base_retriever=retriever)\n\ncompressed_docs = filter_retriever.get_relevant_documents(\"What do the Stoics say of Socrates?\")\n\npretty_print_docs(compressed_docs)<\/span><\/pre>\n<h3 class=\"graf graf--h3\">EmbeddingsFilter<\/h3>\n<p class=\"graf graf--p\">Making an extra LLM call over each retrieved document is expensive and slow.<\/p>\n<p class=\"graf graf--p\">The <code class=\"markup--code markup--p-code\">EmbeddingsFilter<\/code> provides a cheaper and faster option by embedding the documents and query and only returning those documents which have sufficiently similar embeddings to the query.<\/p>\n<p class=\"graf graf--p\">It follows the same pattern we\u2019ve seen in the last two examples.<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"2\" data-code-block-lang=\"python\"><span class=\"pre--content\">from langchain.retrievers.document_compressors import EmbeddingsFilter\n\nembeddings = OpenAIEmbeddings()\n\nembeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)\n\ncompression_retriever = ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)\n\ncompressed_docs = compression_retriever.get_relevant_documents(\"What does Epictetus say about being mindful of the company you keep?\")\n\npretty_print_docs(compressed_docs)<\/span><\/pre>\n<h3 class=\"graf graf--h3\">DocumentCompressorPipeline<\/h3>\n<p class=\"graf graf--p\">The <code class=\"markup--code markup--p-code\">DocumentCompressorPipeline<\/code> is a feature in LangChain that allows you to combine multiple compressors and document transformers in sequence.<\/p>\n<p class=\"graf graf--p\">It helps in compressing and transforming documents in a contextual manner. 
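The embedding-based filtering shown above ultimately reduces to a similarity cutoff against the query embedding. Here is a minimal, LangChain-free sketch of that idea; the helper names and toy two-dimensional vectors are illustrative assumptions, and only the 0.76 threshold mirrors the snippet above:

```python
import math

# Illustrative sketch of embedding-similarity filtering (not LangChain's
# internal EmbeddingsFilter): keep only documents whose embedding is
# sufficiently close to the query embedding.

def cosine_similarity(a, b):
    # cosine similarity = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def filter_by_similarity(query_vec, doc_vecs, threshold=0.76):
    """Return the indices of documents that pass the similarity threshold."""
    return [i for i, vec in enumerate(doc_vecs)
            if cosine_similarity(query_vec, vec) >= threshold]

# Toy 2-D "embeddings"; real embeddings have hundreds of dimensions.
query = [1.0, 0.0]
docs = [[0.9, 0.1],   # nearly parallel to the query -> kept
        [0.0, 1.0],   # orthogonal -> dropped
        [0.7, 0.7]]   # similarity ~0.71, below 0.76 -> dropped
print(filter_by_similarity(query, docs))  # [0]
```

Because no LLM call is involved, this check costs only a few vector operations per document, which is why this style of filter is the cheap, fast option.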
The pipeline can include compressors like <code class=\"markup--code markup--p-code\">EmbeddingsRedundantFilter<\/code> to remove redundant documents based on embedding similarity, and <code class=\"markup--code markup--p-code\">EmbeddingsFilter<\/code> to filter documents based on their similarity to the query. Document transformers like <code class=\"markup--code markup--p-code\">TextSplitter<\/code> can be used to split documents into smaller pieces.<\/p>\n<p class=\"graf graf--p\">You may need to use the <code class=\"markup--code markup--p-code\">DocumentCompressorPipeline<\/code> when you want to perform multiple compression and transformation steps on your documents.<\/p>\n<p class=\"graf graf--p\">It provides a flexible way to customize the compression process according to your specific requirements.<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"2\" data-code-block-lang=\"python\"><span class=\"pre--content\">from langchain.document_transformers import EmbeddingsRedundantFilter\nfrom langchain.retrievers.document_compressors import DocumentCompressorPipeline, EmbeddingsFilter\nfrom langchain.text_splitter import CharacterTextSplitter\n\nsplitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=50, separator=\". \")\n\nredundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)\n\nrelevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)\n\npipeline_compressor = DocumentCompressorPipeline(\n    transformers=[splitter, redundant_filter, relevant_filter]\n)\n\ncompression_retriever = ContextualCompressionRetriever(base_compressor=pipeline_compressor, base_retriever=retriever)\n\ncompressed_docs = compression_retriever.get_relevant_documents(\"What were the characteristics of Socrates?\")\n\npretty_print_docs(compressed_docs)<\/span><\/pre>\n<h3 class=\"graf graf--h3\">Ensemble Retriever<\/h3>\n<p class=\"graf graf--p\">The <code class=\"markup--code markup--p-code\">EnsembleRetriever<\/code> in LangChain is a retrieval algorithm that combines the results of multiple retrievers and reranks them using the Reciprocal Rank Fusion algorithm.<\/p>\n<p class=\"graf graf--p\">It is used to improve retrieval performance by leveraging the strengths of different algorithms. You may need to use the <code class=\"markup--code markup--p-code\">EnsembleRetriever<\/code> when you want to achieve better retrieval performance than any single algorithm can provide. It is particularly useful when combining a sparse retriever (e.g., BM25) with a dense retriever (e.g., embedding similarity) because their strengths are complementary.<\/p>\n<p class=\"graf graf--p\">The sparse retriever is good at finding relevant documents based on keywords, while the dense retriever is good at finding relevant documents based on semantic similarity.<\/p>\n<p class=\"graf graf--p\">To use the <code class=\"markup--code markup--p-code\">EnsembleRetriever<\/code>, you need to initialize it with a list of retrievers and their corresponding weights. The retrievers can be instances of different retrieval algorithms, such as <code class=\"markup--code markup--p-code\">BM25Retriever<\/code> and a vector store-backed retriever (e.g., FAISS or Chroma). 
The weights determine the importance of each retriever in the ensemble. The <code class=\"markup--code markup--p-code\">EnsembleRetriever<\/code> then combines the results of the retrievers and reranks them based on the Reciprocal Rank Fusion algorithm.<\/p>\n<p class=\"graf graf--p\">If you have multiple retrievers that perform well on different aspects of the task, combining them using the <code class=\"markup--code markup--p-code\">EnsembleRetriever<\/code> can lead to improved performance.<\/p>\n<p class=\"graf graf--p\">Additionally, if you have a combination of sparse and dense retrievers, the <code class=\"markup--code markup--p-code\">EnsembleRetriever<\/code> can help leverage their complementary strengths.<\/p>\n<pre class=\"graf graf--pre graf--preV2\" spellcheck=\"false\" data-code-block-mode=\"2\" data-code-block-lang=\"python\"><span class=\"pre--content\">!pip install rank_bm25\n\nfrom langchain.retrievers import BM25Retriever, EnsembleRetriever\nfrom langchain.vectorstores import Chroma\n\n# BM25Retriever.from_texts expects a list of strings, so load the file\n# and split it into passages on blank lines\nwith open(\"\/content\/golden_hymns_of_epictetus.txt\") as f:\n    doc_list = f.read().split(\"\\n\\n\")\n\n# initialize the bm25 retriever and chroma retriever\nbm25_retriever = BM25Retriever.from_texts(doc_list)\nbm25_retriever.k = 2\n\nembedding = OpenAIEmbeddings()\n\nvectorstore = Chroma.from_texts(doc_list, embedding)\n\nretriever = vectorstore.as_retriever(search_kwargs={\"k\": 2})\n\n# initialize the ensemble retriever\nensemble_retriever = EnsembleRetriever(retrievers=[bm25_retriever, retriever], weights=[0.5, 0.5])\n\ndocs = ensemble_retriever.get_relevant_documents(\"Socrates\")\ndocs<\/span><\/pre>\n<h3 class=\"graf graf--h3\">Other retrievers<\/h3>\n<p class=\"graf graf--p\">This blog is already quite lengthy, but I want you to know there are a number of other retrievers you can use. 
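Before moving on, the Reciprocal Rank Fusion reranking that the EnsembleRetriever relies on is simple enough to sketch in plain Python. This is an illustration of the algorithm, not LangChain's internal implementation; the document ids are invented, and the smoothing constant k=60 follows the common RRF convention:

```python
def reciprocal_rank_fusion(ranked_lists, weights=None, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each list contributes weight / (k + rank) per document, so items that
    sit near the top of multiple lists accumulate the highest scores.
    """
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    scores = {}
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A sparse (keyword) ranking and a dense (embedding) ranking, equally weighted
sparse = ["doc_a", "doc_b", "doc_c"]
dense = ["doc_b", "doc_d", "doc_a"]
print(reciprocal_rank_fusion([sparse, dense], weights=[0.5, 0.5]))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Note that doc_b wins because it ranks highly in both lists, even though it tops only one of them; that is the behavior the sparse-plus-dense ensemble exploits.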
These retrievers all follow the same pattern we\u2019ve seen here:<\/p>\n<ul class=\"postList\">\n<li class=\"graf graf--li\"><a class=\"markup--anchor markup--li-anchor\" href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/multi_vector\" target=\"_blank\" rel=\"nofollow noopener\" data-href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/multi_vector\">MultiVector Retriever<\/a><\/li>\n<li class=\"graf graf--li\"><a class=\"markup--anchor markup--li-anchor\" href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/parent_document_retriever\" target=\"_blank\" rel=\"nofollow noopener\" data-href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/parent_document_retriever\">Parent Document Retriever<\/a><\/li>\n<li class=\"graf graf--li\"><a class=\"markup--anchor markup--li-anchor\" href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/time_weighted_vectorstore\" target=\"_blank\" rel=\"nofollow noopener\" data-href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/time_weighted_vectorstore\">Time-weighted vector store<\/a><\/li>\n<li class=\"graf graf--li\"><a class=\"markup--anchor markup--li-anchor\" href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/vectorstore\" target=\"_blank\" rel=\"nofollow noopener\" 
data-href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/vectorstore\">Vector store-backed retriever<\/a><\/li>\n<li class=\"graf graf--li\"><a class=\"markup--anchor markup--li-anchor\" href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/web_research\" target=\"_blank\" rel=\"nofollow noopener\" data-href=\"https:\/\/python.langchain.com\/docs\/modules\/data_connection\/retrievers\/web_research\">WebSearchRetriever<\/a><\/li>\n<\/ul>\n<h3 class=\"graf graf--h3\">Conclusion<\/h3>\n<p class=\"graf graf--p\">As we wrap up, remember that these advanced retrievers aren\u2019t just another tool in your arsenal; they\u2019re your ace in the hole for tackling the complexity of information retrieval.<\/p>\n<p class=\"graf graf--p\">We\u2019ve walked through their strategic approach to broadening search perspectives and how they can finesse your search results with precision. Whether you\u2019re managing a vast repository of documents or seeking nuanced answers, these retrievers are helpful for pushing the boundaries of what\u2019s possible in data retrieval. So, put them to the test and watch as they transform your quest for information into a multi-faceted discovery journey.<\/p>\n<p class=\"graf graf--p\">Happy querying!<\/p>\n<\/div>\n<\/div>\n<\/section>\n","protected":false},"excerpt":{"rendered":"<p>More Techniques to Improve Retrieval Quality If you\u2019ve ever hit the wall with basic retrievers, it\u2019s time to gear up with some \u201cadvanced\u201d retrievers from LangChain. This isn\u2019t just an upgrade; it\u2019s a new way to think about digging through data. 
Picture this: instead of a single line of inquiry, you deploy a squad of [&hellip;]<\/p>\n","protected":false},"author":68,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[65,7],"tags":[70,71,52,31,16,34],"coauthors":[166],"class_list":["post-8215","post","type-post","status-publish","format-standard","hentry","category-llmops","category-tutorials","tag-langchain","tag-language-models","tag-llm","tag-llmops","tag-ml-experiment-management","tag-prompt-engineering"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Using Advanced Retrievers in LangChain - Comet<\/title>\n<meta name=\"description\" content=\"If you\u2019ve ever hit the wall with basic retrievers, it\u2019s time to gear up with some \u201cadvanced\u201d retrievers from LangChain.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Using Advanced Retrievers in LangChain\" \/>\n<meta property=\"og:description\" content=\"If you\u2019ve ever hit the wall with basic retrievers, it\u2019s time to gear up with some \u201cadvanced\u201d retrievers from LangChain.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta 
property=\"article:published_time\" content=\"2023-11-30T14:18:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-24T17:04:07+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/0*A0armiPUzxdEUT1A\" \/>\n<meta name=\"author\" content=\"Harpreet Sahota\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Harpreet Sahota\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Using Advanced Retrievers in LangChain - Comet","description":"If you\u2019ve ever hit the wall with basic retrievers, it\u2019s time to gear up with some \u201cadvanced\u201d retrievers from LangChain.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/","og_locale":"en_US","og_type":"article","og_title":"Using Advanced Retrievers in LangChain","og_description":"If you\u2019ve ever hit the wall with basic retrievers, it\u2019s time to gear up with some \u201cadvanced\u201d retrievers from LangChain.","og_url":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2023-11-30T14:18:30+00:00","article_modified_time":"2025-04-24T17:04:07+00:00","og_image":[{"url":"https:\/\/cdn-images-1.medium.com\/max\/1600\/0*A0armiPUzxdEUT1A","type":"","width":"","height":""}],"author":"Harpreet 
Sahota","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Harpreet Sahota","Est. reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/"},"author":{"name":"Harpreet Sahota","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/46036ab474aa916e2873daece26a28d6"},"headline":"Using Advanced Retrievers in LangChain","datePublished":"2023-11-30T14:18:30+00:00","dateModified":"2025-04-24T17:04:07+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/"},"wordCount":1216,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/#primaryimage"},"thumbnailUrl":"https:\/\/cdn-images-1.medium.com\/max\/1600\/0*A0armiPUzxdEUT1A","keywords":["LangChain","Language Models","LLM","LLMOps","ML Experiment Management","Prompt Engineering"],"articleSection":["LLMOps","Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/","url":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/","name":"Using Advanced Retrievers in LangChain - Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/#primaryimage"},"thumbnailUrl":"https:\/\/cdn-images-1.medium.com\/max\/1600\/0*A0armiPUzxdEUT1A","datePublished":"2023-11-30T14:18:30+00:00","dateModified":"2025-04-24T17:04:07+00:00","description":"If 
you\u2019ve ever hit the wall with basic retrievers, it\u2019s time to gear up with some \u201cadvanced\u201d retrievers from LangChain.","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/#primaryimage","url":"https:\/\/cdn-images-1.medium.com\/max\/1600\/0*A0armiPUzxdEUT1A","contentUrl":"https:\/\/cdn-images-1.medium.com\/max\/1600\/0*A0armiPUzxdEUT1A"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/using-advanced-retrievers-in-langchain\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Using Advanced Retrievers in LangChain"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, 
Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/46036ab474aa916e2873daece26a28d6","name":"Harpreet Sahota","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/2d21512be19ba7e19a71a803309e2a88","url":"https:\/\/secure.gravatar.com\/avatar\/a6ca5a533fc9f143a0a7428037ff652aa0633d66bf27e76ae89b955ae72a0f2d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a6ca5a533fc9f143a0a7428037ff652aa0633d66bf27e76ae89b955ae72a0f2d?s=96&d=mm&r=g","caption":"Harpreet 
Sahota"},"url":"https:\/\/www.comet.com\/site\/blog\/author\/theartistsofdatasciencegmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/8215","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/68"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=8215"}],"version-history":[{"count":1,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/8215\/revisions"}],"predecessor-version":[{"id":15434,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/8215\/revisions\/15434"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=8215"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=8215"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=8215"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=8215"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}