[Advanced] Building Your Own RAG AI Application in Python (with a Gradio Interface)

2025-02-15T21:33:26+08:00 | 5 min read | Updated 2025-02-15T21:33:26+08:00

Macro Zhao

In this chapter we use Gradio to build a Python application that talks to a locally running DeepSeek-R1 model in order to query and analyze PDF documents.

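Prerequisites

Before starting, make sure the following tools and libraries are installed:

Python 3.8+
Langchain: a framework for building applications powered by large language models (LLMs), with support for retrieval, reasoning, and tool integration.
Chromadb: a high-performance vector database for efficient similarity search and embedding storage.
Gradio: a library for creating user-friendly web interfaces.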

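Step 1: Install the Python libraries

Run the following command to install the required dependencies (langchain-community is included because the imports below come from the langchain_community package):

pip install langchain langchain-community chromadb gradio ollama pymupdf

Once the dependencies are installed, create the main file and import the following libraries: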

import gradio as gr
from langchain_community.document_loaders import PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
import ollama

Step 2: Process the PDF document

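def process_pdf(pdf_bytes):
    # Despite its name, pdf_bytes is expected to be a file path: PyMuPDFLoader
    # takes a path, and recent Gradio versions pass gr.File uploads as a path string.
    if pdf_bytes is None:
        return None, None, None

    # Load the PDF; by default each page becomes one Document
    loader = PyMuPDFLoader(pdf_bytes)
    data = loader.load()

    # Split the pages into overlapping chunks of roughly 500 characters
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=500, chunk_overlap=100
    )
    chunks = text_splitter.split_documents(data)

    # Embed each chunk with a local Ollama model and store the vectors in Chroma
    embeddings = OllamaEmbeddings(model="deepseek-r1")
    vectorstore = Chroma.from_documents(
        documents=chunks, embedding=embeddings, persist_directory="./chroma_db"
    )
    retriever = vectorstore.as_retriever()

    return text_splitter, vectorstore, retriever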

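The process_pdf function:

Loads and prepares the PDF content for retrieval-based answering.
Checks whether a PDF was uploaded.
Extracts the text with PyMuPDFLoader.
Splits the text into chunks with RecursiveCharacterTextSplitter.
Generates vector embeddings with OllamaEmbeddings.
Stores the embeddings in a Chroma vector store for efficient retrieval.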
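
To make the chunk_size=500 and chunk_overlap=100 settings more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification: RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs and line breaks, while the hypothetical sliding_chunks helper below just slides a window over the characters.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=100):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# Made-up sample text of 1200 characters
sample = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
```

The last 100 characters of each chunk reappear at the start of the next one, so a sentence that straddles a chunk boundary is still fully contained in at least one chunk.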

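Step 3: Combine the retrieved document chunks

Once the relevant chunks have been retrieved, they need to be joined together. The combine_docs() function merges multiple retrieved document chunks into a single string.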

def combine_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

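Because a retrieval-based pipeline extracts relevant excerpts rather than the whole document, this function keeps the extracted content readable and properly formatted before it is passed to DeepSeek-R1.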
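
As a quick illustration of what combine_docs() produces, the sketch below runs it on two made-up excerpts. FakeDoc is a hypothetical stand-in for a LangChain Document; the function only relies on the page_content attribute.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document; only page_content is used."""
    page_content: str


def combine_docs(docs):
    # Join the excerpts with a blank line between them
    return "\n\n".join(doc.page_content for doc in docs)


docs = [FakeDoc("First excerpt."), FakeDoc("Second excerpt.")]
combined = combine_docs(docs)
print(combined)
```

The blank line between excerpts gives the model a visible boundary between chunks that may come from different pages.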

Step 4: Query DeepSeek-R1 with Ollama

The model input is now ready. Let's set up DeepSeek-R1 with Ollama.

import re

def ollama_llm(question, context):
    formatted_prompt = f"Question: {question}\n\nContext: {context}"

    response = ollama.chat(
        model="deepseek-r1",
        messages=[{"role": "user", "content": formatted_prompt}],
    )

    response_content = response["message"]["content"]

    # Remove content between <think> and </think> tags to remove thinking output
    final_answer = re.sub(r"<think>.*?</think>", "", response_content, flags=re.DOTALL).strip()

    return final_answer

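The ollama_llm() function formats the user's question and the retrieved document context into a structured prompt. This formatted input is then sent to DeepSeek-R1 via ollama.chat(), which processes the question within the given context and returns a relevant answer. DeepSeek-R1 wraps its reasoning in <think>...</think> tags; re.sub() removes that block, and strip() trims the leftover whitespace, so only the final answer is returned.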
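
The tag-stripping step can be tried on its own, without a running model. In the sketch below, strip_think is a hypothetical helper name and raw is a made-up response string, not real model output.

```python
import re


def strip_think(text):
    # DOTALL lets "." match newlines, so multi-line reasoning is removed too
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


raw = (
    "<think>\nThe user asks about page 3.\nLet me check.\n</think>\n\n"
    "The report covers Q3 revenue."
)
answer = strip_think(raw)
print(answer)
```

The non-greedy .*? stops at the first closing tag, so text after the reasoning block is left untouched.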

Step 5: Build the RAG pipeline

With all the necessary components in place, let's build the RAG pipeline for our demo.

def rag_chain(question, text_splitter, vectorstore, retriever):
    retrieved_docs = retriever.invoke(question)
    formatted_content = combine_docs(retrieved_docs)
    return ollama_llm(question, formatted_content)

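The function above first calls retriever.invoke(question) to search the vector store and return the most relevant document excerpts. combine_docs then joins these excerpts into a single context string, which is sent to ollama_llm so that DeepSeek-R1 generates an answer grounded in the retrieved content.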
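
Conceptually, the retriever embeds the question and returns the chunks whose vectors lie closest to it. The sketch below shows that idea with cosine similarity over made-up 3-dimensional vectors. It is only an illustration: top_k and the sample data are hypothetical, and Chroma's actual index and distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up vectors; a real embedding model produces far longer ones
indexed_chunks = [
    ("Revenue grew in Q3.", [0.9, 0.1, 0.0]),
    ("The office moved to Berlin.", [0.0, 0.2, 0.9]),
    ("Profit margins improved.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of a question about finances
results = top_k(query_vec, indexed_chunks)
print(results)
```

Only the two finance-related chunks are returned; the unrelated one ranks last because its vector points in a different direction.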

Step 6: Create the Gradio interface

The RAG pipeline is now in place. Next, we build a local Gradio interface around the DeepSeek-R1 model that accepts a PDF and answers questions about it.

def ask_question(pdf_bytes, question):
    text_splitter, vectorstore, retriever = process_pdf(pdf_bytes)

    if text_splitter is None:
        return None  # No PDF uploaded

    result = rag_chain(question, text_splitter, vectorstore, retriever)
    # Return the answer string itself; {result} would build a set
    return result

interface = gr.Interface(
    fn=ask_question,
    inputs=[
        gr.File(label="Upload a PDF document"),
        gr.Textbox(label="Ask a question"),
    ],
    outputs="text",
    title="Ask questions about your PDF document",
    description="Use DeepSeek-R1 to answer questions about the content of your PDF document.",
)

interface.launch()

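The Interface class has three core parameters:
fn: the function to wrap a user interface (UI) around.
inputs: the Gradio component(s) used for input. The number of components should match the number of arguments of the function.
outputs: the Gradio component(s) used for output. The number of components should match the number of values the function returns.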

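So far, we have carried out the following steps:

Check whether a PDF was uploaded.
Process the PDF with the process_pdf function to extract the text and generate document embeddings.
Pass the user's query and the retriever to the rag_chain() function to retrieve relevant information and generate a context-grounded response.
Set up a Gradio-based web interface that lets users upload a PDF and ask questions about its content.
Define the layout with gr.Interface(), which accepts a PDF file and a text query as inputs.
Launch the application with interface.launch() for interactive, document-based question answering in the web browser.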
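
One thing to be aware of: ask_question calls process_pdf on every question, so the same PDF is split and embedded again each time. A possible refinement, sketched below under assumptions, is to cache the retriever per file path. get_retriever, the cache dictionary, and fake_build are all hypothetical names; in the real app, build_retriever would wrap process_pdf. If Gradio hands over a new temporary path for each upload, a re-uploaded file would still be processed again.

```python
_retriever_cache = {}


def get_retriever(pdf_path, build_retriever):
    """Return a cached retriever for pdf_path, building it only on first use."""
    if pdf_path not in _retriever_cache:
        _retriever_cache[pdf_path] = build_retriever(pdf_path)
    return _retriever_cache[pdf_path]


calls = []


def fake_build(path):
    # Stand-in for the expensive load/split/embed work done by process_pdf
    calls.append(path)
    return f"retriever-for-{path}"


first = get_retriever("report.pdf", fake_build)
second = get_retriever("report.pdf", fake_build)
print(first, len(calls))
```

The second call returns the cached object without invoking the builder again, which is the behavior you would want for follow-up questions on the same document.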

PS:

Gradio documentation

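Conclusion

Running DeepSeek-R1 locally with Ollama allows for fast, private, and cost-effective model inference.

With a simple installation process, CLI interaction, API support, and Python integration, you can use DeepSeek-R1 for a wide range of AI applications, from general queries to complex retrieval-based tasks.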

Complete program example


import gradio as gr
from langchain_community.document_loaders import PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
import ollama


def process_pdf(pdf_bytes):
    if pdf_bytes is None:
        return None, None, None

    loader = PyMuPDFLoader(pdf_bytes)
    data = loader.load()

    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=500, chunk_overlap=100
    )
    chunks = text_splitter.split_documents(data)

    embeddings = OllamaEmbeddings(model="bge-m3:latest")
    vectorstore = Chroma.from_documents(
        documents=chunks, embedding=embeddings, persist_directory="./chroma_db"
    )
    retriever = vectorstore.as_retriever()

    return text_splitter, vectorstore, retriever


def combine_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


import re


def ollama_llm(question, context):
    formatted_prompt = f"Question: {question}\n\nContext: {context}"

    response = ollama.chat(
        model="deepseek-r1:7b",
        messages=[{"role": "user", "content": formatted_prompt}],
    )

    response_content = response["message"]["content"]

    # Remove content between <think> and </think> tags to remove thinking output
    final_answer = re.sub(r"<think>.*?</think>", "", response_content, flags=re.DOTALL).strip()

    return final_answer

def rag_chain(question, text_splitter, vectorstore, retriever):
    retrieved_docs = retriever.invoke(question)
    formatted_content = combine_docs(retrieved_docs)
    return ollama_llm(question, formatted_content)


def ask_question(pdf_bytes, question):
    text_splitter, vectorstore, retriever = process_pdf(pdf_bytes)

    if text_splitter is None:
        return None  # No PDF uploaded

    result = rag_chain(question, text_splitter, vectorstore, retriever)
    return result  # return the answer string itself, not a set

interface = gr.Interface(
    fn=ask_question,
    inputs=[
        gr.File(label="Upload a PDF document"),
        gr.Textbox(label="Ask a question"),
    ],
    outputs="text",
    title="Ask questions about your PDF document",
    description="Use DeepSeek-R1 to answer questions about the content of your PDF document.",
)

interface.launch()
