Stuff chain

The most common combine-documents chain: it stuffs the documents directly into the prompt as context for the LLM to answer against, which makes it suitable for small-document scenarios.



from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.document_loaders import PyPDFLoader
from langchain.chat_models import ChatOpenAI

# Load the PDF as a list of Document objects
loader = PyPDFLoader("loader.pdf")
#print(loader.load())

prompt_template = """对以下文字做简洁的总结:
{text}
简洁的总结:"""

prompt = PromptTemplate.from_template(prompt_template)
llm = ChatOpenAI(
    temperature=0,
    model="gpt-4-1106-preview",
)
llm_chain = LLMChain(llm=llm, prompt=prompt)

# Stuff every document into the single {text} variable of the prompt
stuff_chain = StuffDocumentsChain(
    llm_chain=llm_chain,
    document_variable_name="text",
)
docs = loader.load()
print(stuff_chain.run(docs))

# Use the pre-packaged load_summarize_chain
from langchain.document_loaders import PyPDFLoader
from langchain.chat_models import ChatOpenAI
from langchain.chains.summarize import load_summarize_chain

loader = PyPDFLoader("loader.pdf")
docs = loader.load()
llm = ChatOpenAI(
    temperature=0,
    model="gpt-4-1106-preview",
)
chain = load_summarize_chain(
    llm=llm,
    chain_type="stuff",
    verbose=True,
)

print(chain.run(docs))

Refine

Feeds the document chunks to the LLM in a loop, refining an intermediate answer at each step. This suits documents whose logic builds on the preceding context; it is not a good fit for documents full of cross-references.



from langchain.prompts import PromptTemplate
from langchain.document_loaders import PyPDFLoader
from langchain.chat_models import ChatOpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain

# Load the PDF
loader = PyPDFLoader("loader.pdf")
docs = loader.load()
# Split into chunks so each refine pass stays within the context window
text_split = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=1000,
    chunk_overlap=0,
)
split_docs = text_split.split_documents(docs)

prompt_template = """对以下文字做简洁的总结:
{text}
简洁的总结:"""

prompt = PromptTemplate.from_template(prompt_template)

refine_template = (
    "你的任务是产生最终摘要\n"
    "我们已经提供了一个到某个特定点的现有回答:{existing_answer}\n"
    "我们有机会通过下面的一些更多上下文来完善现有的回答(仅在需要时使用).\n"
    "------------\n"
    "{text}\n"
    "------------\n"
    "根据新的上下文,用中文完善原始回答.\n"
    "如果上下文没有用处,返回原始回答."
)

refine_prompt = PromptTemplate.from_template(refine_template)
llm = ChatOpenAI(
    temperature=0,
    model="gpt-3.5-turbo",
)

chain = load_summarize_chain(
    llm=llm,
    chain_type="refine",
    question_prompt=prompt,
    refine_prompt=refine_prompt,
    return_intermediate_steps=True,
    input_key="documents",
    output_key="output_text",
)

# Run the chain on the split chunks and print the final refined summary
result = chain({"documents": split_docs}, return_only_outputs=True)
print(result["output_text"])

Map reduce

Each document (or chunk) is first fed to the LLM separately to produce a set of partial results (the Map step); a combine-documents chain then merges those results into a single output (the Reduce step).
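
The map-reduce variant is also available through the pre-packaged load_summarize_chain. Below is a minimal sketch, assuming the same loader.pdf and tiktoken-based splitter as in the refine example; the map_prompt and combine_prompt wording here is illustrative, not taken from the original examples.

from langchain.prompts import PromptTemplate
from langchain.document_loaders import PyPDFLoader
from langchain.chat_models import ChatOpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain

# Load and split, same as in the refine example
loader = PyPDFLoader("loader.pdf")
docs = loader.load()
text_split = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=1000,
    chunk_overlap=0,
)
split_docs = text_split.split_documents(docs)

# Map step: summarize each chunk independently (illustrative prompt)
map_prompt = PromptTemplate.from_template(
    "对以下文字做简洁的总结:\n{text}\n简洁的总结:"
)
# Reduce step: merge the per-chunk summaries into one final summary (illustrative prompt)
combine_prompt = PromptTemplate.from_template(
    "将以下多段摘要合并为一个最终的简洁总结:\n{text}\n最终总结:"
)

llm = ChatOpenAI(
    temperature=0,
    model="gpt-3.5-turbo",
)

chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=map_prompt,
    combine_prompt=combine_prompt,
    verbose=True,
)

print(chain.run(split_docs))

Because the Map step treats every chunk independently, chunk order does not matter; the trade-off is that the final Reduce pass may drop details that the refine chain would have carried forward.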

Map re-rank

Each document (or chunk) is fed to the LLM separately, and the answer generated for each one is given a score; the answer from the highest-scoring document (or chunk) is returned as the final result.
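
In the legacy LangChain API, map re-rank is exposed through the question-answering chains rather than load_summarize_chain. A minimal sketch with load_qa_chain, reusing the same PDF and splitter as above; the question string is only a placeholder.

from langchain.chains.question_answering import load_qa_chain
from langchain.document_loaders import PyPDFLoader
from langchain.chat_models import ChatOpenAI
from langchain.text_splitter import CharacterTextSplitter

loader = PyPDFLoader("loader.pdf")
docs = loader.load()
text_split = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=1000,
    chunk_overlap=0,
)
split_docs = text_split.split_documents(docs)

llm = ChatOpenAI(
    temperature=0,
    model="gpt-3.5-turbo",
)

# Each chunk is answered and scored independently (Map),
# then the highest-scoring answer is returned (re-rank)
chain = load_qa_chain(
    llm=llm,
    chain_type="map_rerank",
)

result = chain(
    {"input_documents": split_docs, "question": "这份文档的主要内容是什么?"},
    return_only_outputs=True,
)
print(result["output_text"])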

