LangGraph
[N_LangChain]
[[N_LangServe]]
LangGraph project page (GitHub)
Official site - overview
Official site - tutorials
LangGraph is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. Compared to other LLM frameworks, it offers these core benefits: cycles, controllability, and persistence. LangGraph allows you to define flows that involve cycles, essential for most agentic architectures, differentiating it from DAG-based solutions. As a very low-level framework, it provides fine-grained control over both the flow and state of your application, crucial for creating reliable agents. Additionally, LangGraph includes built-in persistence, enabling advanced human-in-the-loop and memory features.
LangGraph is inspired by Pregel and Apache Beam. The public interface draws inspiration from NetworkX. LangGraph is built by LangChain Inc, the creators of LangChain, but can be used without LangChain.
Key Features
https://langchain-ai.github.io/langgraph/#key-features
Cycles and Branching: Implement loops and conditionals in your apps.
Persistence: Automatically save state after each step in the graph. Pause and resume the graph execution at any point to support error recovery, human-in-the-loop workflows, time travel and more.
Human-in-the-Loop: Interrupt graph execution to approve or edit the next action planned by the agent.
Streaming Support: Stream outputs as they are produced by each node (including token streaming).
Integration with LangChain: LangGraph integrates seamlessly with LangChain and LangSmith (but does not require them).
A Simple Graph
https://langchain-ai.github.io/langgraph/tutorials/customer-support/customer-support/#prerequisites
https://langchain-ai.github.io/langgraph/#example
https://github.com/aneasystone/weekly-practice/blob/main/notes/week057-create-agents-with-langgraph/README.md
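State is one of the core concepts in LangGraph. Each time a graph is executed, a state is created and passed between the nodes of the graph. As nodes execute, each node updates this internal state with its return value. How the graph updates its internal state depends on the type of graph chosen, or it can be defined by a custom function.
This mechanism lets different parts of the graph share and pass information, so a run is not just a fixed sequence of node executions but a dynamic process with memory and feedback. Passing and updating state between nodes keeps the parts of the execution flow coherent and the data consistent.
For example, suppose you are building a multi-step data processing pipeline in which each step is a node in the graph. The first node might clean the data, the second might extract features, and the third might run a machine learning model's prediction. The state can then hold all the intermediate results needed from the first step through the third, so that each later node can read the output of the previous node and build on it.
In addition, because state updates can be controlled by custom functions, LangGraph is flexible and extensible: users can tailor the state management logic to their specific needs.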
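The state-update idea above can be sketched in plain Python. This is a toy model that uses only the standard library, not LangGraph's implementation: `PipelineState`, `apply_update`, and the three node functions are hypothetical names made up for this note. It mimics the convention of attaching a reducer to a state key with `Annotated`; keys without a reducer are simply overwritten.

```python
import operator
from typing import Annotated, Any, TypedDict, get_type_hints


class PipelineState(TypedDict):
    # Annotated[..., reducer]: new values are combined with the old value.
    steps: Annotated[list[str], operator.add]
    # No reducer: a node's return value overwrites the old value.
    data: list[float]


def apply_update(state: dict[str, Any], update: dict[str, Any], schema: type) -> dict[str, Any]:
    """Merge one node's return value into the state, key by key."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        if reducer is not None and key in state:
            merged[key] = reducer(state[key], value)
        else:
            merged[key] = value
    return merged


def clean(state: dict) -> dict:
    # Step 1: drop missing values.
    return {"steps": ["clean"], "data": [x for x in state["data"] if x is not None]}


def extract_features(state: dict) -> dict:
    # Step 2: a stand-in "feature" (doubling each value).
    return {"steps": ["features"], "data": [x * 2 for x in state["data"]]}


def predict(state: dict) -> dict:
    # Step 3: a stand-in "model" (sum of the features).
    return {"steps": ["predict"], "data": [sum(state["data"])]}


def run_pipeline(initial: dict) -> dict:
    state = initial
    for node in (clean, extract_features, predict):
        state = apply_update(state, node(state), PipelineState)
    return state


result = run_pipeline({"steps": [], "data": [1.0, None, 2.5]})
print(result)
```

Each node only returns the keys it wants to change; `steps` accumulates across nodes because it has a reducer, while `data` always holds the latest value.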
Code implementation
'''
Author: yangfh
Date: 2024-07-13 13
LastEditors: yangfh
LastEditTime: 2024-07-23 14
Description:
Based on the official tutorial
https://langchain-ai.github.io/langgraph/#step-by-step-breakdown
Tongyi Qianwen (Qwen)
https://help.aliyun.com/zh/dashscope/developer-reference/compatibility-of-openai-with-dashscope/?spm=a2c4g.11186623.0.i1
'''
import os
from typing import Annotated, Literal, TypedDict

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import (
    AIMessage,
    BaseMessage,
    HumanMessage,
    SystemMessage,
    ToolMessage,
)
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
# MemorySaver is imported from langgraph.checkpoint.memory, the path used by current releases
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph, MessagesState
from langgraph.prebuilt import ToolNode
# The underlying HTTP calls go through the httpx library; raise the log level to see more detail
# import logging
# logging.basicConfig(level=logging.DEBUG)
############################################### LLM model definition #######################################################
# LLM: the qwen-turbo model
def get_llm():
    os.environ["OPENAI_API_KEY"] = 'sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    llm_model = ChatOpenAI(model="qwen-turbo", base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")
    return llm_model
# Chat model: qwen-turbo through Bailian
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_bailian import Bailian
def get_chat_model():
    os.environ["ACCESS_KEY_ID"] = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    os.environ["ACCESS_KEY_SECRET"] = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    os.environ["AGENT_KEY"] = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    os.environ["APP_ID"] = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

    access_key_id = os.environ.get("ACCESS_KEY_ID")
    access_key_secret = os.environ.get("ACCESS_KEY_SECRET")
    agent_key = os.environ.get("AGENT_KEY")
    app_id = os.environ.get("APP_ID")
    llm = Bailian(access_key_id=access_key_id,
                  access_key_secret=access_key_secret,
                  agent_key=agent_key,
                  app_id=app_id)
    return llm
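############################################### Tool definitions #######################################################
from typing import Optional

import requests
from langchain_core.tools import StructuredTool
# On langchain-core < 0.3, import BaseModel and Field from langchain_core.pydantic_v1 instead
from pydantic import BaseModel, Field
from Tools_Basic import tool_get_list  # local module, not shown in these notes
from langgraph.prebuilt import ToolNode

# Base URL of the backend service. g_serve_ctx is never defined in these notes;
# this value is a placeholder assumption.
g_serve_ctx = "http://localhost:8080"

################################ Query the list of all my bases
def get_list():
    response = requests.get(g_serve_ctx + "/queryAll")
    json_str = response.content.decode()
    # print("queryAll response: ", json_str)
    return json_str

################################ Update a base's information
class update_schema(BaseModel):
    id: int = Field(..., title="ID", description="ID of the base")
    # name and address default to None to match the update_info signature
    name: Optional[str] = Field(None, title="Name", description="Name of the base")
    address: Optional[str] = Field(None, title="Address", description="Address of the base")

def update_info(id: int, name: Optional[str] = None, address: Optional[str] = None):
    # Build the request body
    data = {"id": id, "name": name, "address": address}
    response = requests.post(g_serve_ctx + "/update", json=data)
    json_str = response.content.decode()
    # print("/update response: ", json_str)
    return json_str

tool_update_base_info = StructuredTool.from_function(
    func=update_info,
    name="tool_update_base_info",
    args_schema=update_schema,
    description="`Base management`: update a base's information; returns JSON data",
)
tool_get_base_list = StructuredTool.from_function(
    func=get_list,
    name="tool_get_base_list",
    description="`Base management`: query the list of all my bases; returns JSON data",
)

def tool_nodes():
    tools = [tool_get_list]
    tool_node = ToolNode(tools)
    return tool_node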
############################################### Agent definition #######################################################
def create_agent(llm, tools, system_message: str):
    """Generic helper that creates an agent."""
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "“XXX Cloud Platform” is a comprehensive service platform built on the cloud with AI at its core. It is open, senses in every dimension, coordinates across all domains, judges precisely, and keeps evolving. It provides the following functions: "
                "1. `Base management`. Base data includes: base ID, base name, base address. "
                "If you cannot fully answer, that is fine; you can use the following tools: {tool_names}.\n{system_message} Do your best to cooperate and reach a result.",
            ),
            MessagesPlaceholder(variable_name="messages"),
        ]
    )
    # Fill in the prompt template parameters
    prompt = prompt.partial(system_message=system_message)
    prompt = prompt.partial(tool_names=", ".join([t.name for t in tools]))
    return prompt | llm.bind_tools(tools)

def create_baisc_expert_agent():
    llm = get_llm()
    tools_ = [tool_get_list]
    return create_agent(llm, tools_, "")

g_baisc_expert_agent = create_baisc_expert_agent()

def call_baisc_expert_agent(state: MessagesState):
    messages = state['messages']
    response = g_baisc_expert_agent.invoke(messages)
    # We return a list, because this will get added to the existing list
    return {"messages": [response]}
############################################### Graph definition #######################################################
# Routing function of the `Graph`
# Define the function that determines whether to continue or not
def should_continue(state: MessagesState) -> Literal["tools", "__end__"]:
    messages = state['messages']
    # Take the most recent message
    last_message = messages[-1]
    # If the LLM requested tool calls, return "tools" so the tools node runs next
    if last_message.tool_calls:
        return "tools"
    # Otherwise, we stop (reply to the user)
    return END
############################################### Graph nodes
# agent node
# call_baisc_expert_agent is initialized above
# tool node
tool_nodes_ = tool_nodes()
# Create the `Graph`
workflow = StateGraph(MessagesState)
# Define the two nodes we will cycle between
workflow.add_node("agent", call_baisc_expert_agent)
workflow.add_node("tools", tool_nodes_)
# Set the entry point: this node is called first
workflow.set_entry_point("agent")
############################################### Graph conditional edges
# Add a conditional edge: after the "agent" node finishes, "should_continue" is called to decide which node runs next.
# For example, after the agent requests a tool call, the result is returned to that agent.
# An optional mapping dict (path_map) can also be passed; see the official API.
# We now add a conditional edge
workflow.add_conditional_edges(
    # First, we define the start node. We use `agent`.
    # This means these are the edges taken after the `agent` node is called.
    "agent",
    # Next, we pass in the function that will determine which node is called next.
    should_continue,
)
# After the `tools` node runs, the `agent` node is called again.
# With only two nodes this edge can be hard-coded: `tools` always leads back to `agent`.
workflow.add_edge("tools", 'agent')
############################################### Graph persistence
# Persistence strategy: keep checkpoints in memory
# Initialize memory to persist state between graph runs
checkpointer = MemorySaver()
# Compile the `graph`
app = workflow.compile(checkpointer=checkpointer)
############################################### Runnable #######################################################
# Use the Runnable
final_state = app.invoke(
    {"messages": [HumanMessage(content="How many bases do I have?")]},
    config={"configurable": {"thread_id": 44}},  # acts like a session ID; the graph state is saved under it
)
print("graph final_state = ", final_state)
print(final_state["messages"][-1].content)  # after the graph finishes, read the last message
Key API notes
Graph
https://langchain-ai.github.io/langgraph/reference/graphs/
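Graph is the core abstraction of LangGraph. Each StateGraph implementation is used to create a graph workflow. Once compiled, the resulting CompiledGraph can be run to execute the application.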
add_conditional_edges
https://langchain-ai.github.io/langgraph/tutorials/customer-support/customer-support/#part-3-conditional-interrupt
https://langchain-ai.github.io/langgraph/reference/graphs/#langgraph.graph.message.MessageGraph.add_conditional_edges
add_conditional_edges(source, path, path_map=None, then=None)
Parameters:
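source (str) – The starting node.
path (Union[Callable, Runnable]) – The callable that determines the next node or nodes. If path_map is not specified, it should return one or more node names. If it returns END, the graph stops executing.
path_map (Optional[dict[Hashable, str]], default: None ) – Optional mapping (a dict) from the return values of path to node names.
then (Optional[str], default: None ) – The name of a node to execute after the nodes selected by path.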
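How path and path_map work together can be sketched with a small standard-library model. This is not LangGraph code: `resolve_next` and `has_tool_calls` are hypothetical helpers, and `END` here is only a stand-in string for `langgraph.graph.END`. The point is that path_map lets the path function return any hashable value (here a bool) instead of a node name.

```python
from typing import Callable, Hashable, Optional

END = "__end__"  # stand-in for langgraph.graph.END in this toy model


def resolve_next(
    path: Callable[[dict], Hashable],
    state: dict,
    path_map: Optional[dict[Hashable, str]] = None,
) -> str:
    """Return the name of the next node for the given state."""
    result = path(state)
    if path_map is None:
        # Without a mapping, the path function must return a node name itself.
        return str(result)
    # With a mapping, the return value is translated into a node name.
    return path_map[result]


def has_tool_calls(state: dict) -> bool:
    """Path function that returns a bool rather than a node name."""
    return bool(state["messages"][-1].get("tool_calls"))


path_map = {True: "tools", False: END}

wants_tool = {"messages": [{"content": "", "tool_calls": [{"name": "tool_get_list"}]}]}
plain_reply = {"messages": [{"content": "You have 3 bases."}]}

print(resolve_next(has_tool_calls, wants_tool, path_map))
print(resolve_next(has_tool_calls, plain_reply, path_map))
```

In the real API the equivalent call would pass the mapping as the third argument, for example `workflow.add_conditional_edges("agent", has_tool_calls, {True: "tools", False: END})`.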
add_edge
Think of this as a hard-coded (unconditional) edge: after the tools node runs, control always returns to agent.
workflow.add_edge("tools", 'agent')
Conditional Interrupt
For example, to pause before a tool is executed so that a human can confirm it, see: https://langchain-ai.github.io/langgraph/tutorials/customer-support/customer-support/#part-3-conditional-interrupt
part_3_graph = builder.compile(
    checkpointer=memory,
    # NEW: The graph will always halt before executing the "tools" node.
    # The user can approve or reject (or even alter the request) before
    # the assistant continues
    interrupt_before=["sensitive_tools"],  # interrupt before sensitive_tools runs
)
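The pause-and-resume behaviour of interrupt_before can be illustrated with a toy runner built only on the standard library. `ToyGraph`, `plan`, and `run_tool` are hypothetical names for this sketch and do not reflect how LangGraph is implemented. The sketch only mirrors the usage pattern: the first call stops before the named node and saves a checkpoint per thread, and calling again with `None` as input resumes from that checkpoint.

```python
from typing import Callable, Optional

Node = Callable[[dict], dict]


class ToyGraph:
    """Runs nodes in order and pauses before any node listed in interrupt_before."""

    def __init__(self, nodes: list[tuple[str, Node]], interrupt_before: list[str]):
        self.nodes = nodes
        self.interrupt_before = set(interrupt_before)
        # thread_id -> (index of the next node, saved state)
        self.checkpoints: dict[str, tuple[int, dict]] = {}

    def invoke(self, state: Optional[dict], thread_id: str) -> dict:
        resuming = state is None
        if resuming:
            # Resume from the checkpoint saved at the interrupt.
            index, state = self.checkpoints.pop(thread_id)
        else:
            index = 0
        while index < len(self.nodes):
            name, fn = self.nodes[index]
            if name in self.interrupt_before and not resuming:
                self.checkpoints[thread_id] = (index, state)
                return state  # hand control back to the human
            resuming = False
            state = {**state, **fn(state)}
            index += 1
        return state

    def next_node(self, thread_id: str) -> Optional[str]:
        if thread_id not in self.checkpoints:
            return None
        index, _ = self.checkpoints[thread_id]
        return self.nodes[index][0]


def plan(state: dict) -> dict:
    return {"planned_action": "update_info(id=1)"}


def run_tool(state: dict) -> dict:
    return {"result": f"executed {state['planned_action']}"}


graph = ToyGraph([("assistant", plan), ("sensitive_tools", run_tool)],
                 interrupt_before=["sensitive_tools"])

paused = graph.invoke({"request": "rename base 1"}, thread_id="44")
pending = graph.next_node("44")          # the node waiting for approval
final = graph.invoke(None, thread_id="44")  # human approved: resume
print(pending, final["result"])
```

With the real library the corresponding steps are, as in the linked tutorial, inspecting the saved state for the thread and then invoking the compiled graph with `None` and the same config to continue.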
Debugging
ChatOpenAI sends its network requests in C:\Users\yang\AppData\Local\Programs\Python\Python312\Lib\site-packages\openai\_base_client.py :: def _request( , which uses the httpx library
formatted_headers = []
for header in request.headers.raw:
    header_key, header_value = header
    formatted_header = f"{header_key.decode('utf-8').lower()}: {header_value.decode('utf-8')}"
    formatted_headers.append(formatted_header)
    print(formatted_header)
request.content.decode('utf-8')
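The same header and body dump can be wrapped in a reusable helper instead of being pasted into the installed package. A minimal sketch, assuming only that the request object exposes `headers.raw` (a list of byte pairs) and `content` (bytes) as in the snippet above; `format_request` is a hypothetical name, and the demo uses a stand-in object rather than a real httpx request so that it runs without any dependency.

```python
from types import SimpleNamespace


def format_request(request) -> str:
    """Render "key: value" header lines followed by the decoded request body."""
    lines = [
        f"{key.decode('utf-8').lower()}: {value.decode('utf-8')}"
        for key, value in request.headers.raw
    ]
    body = request.content.decode("utf-8")
    return "\n".join(lines + ["", body])


# Stand-in object with the two attributes the helper reads.
fake_request = SimpleNamespace(
    headers=SimpleNamespace(raw=[
        (b"Content-Type", b"application/json"),
        (b"Authorization", b"Bearer sk-xxx"),
    ]),
    content=b'{"model": "qwen-turbo"}',
)

dump = format_request(fake_request)
print(dump)
```

One possible wiring, not tested in these notes: httpx clients accept `event_hooks={"request": [...]}`, and ChatOpenAI takes an `http_client` argument, so a hook that prints `format_request(request)` should allow the same inspection without editing site-packages.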
Quick Start
Setup (dependencies)
https://langchain-ai.github.io/langgraph/tutorials/introduction/
In this tutorial, we will build a support chatbot in LangGraph that can:
Answer common questions by searching the web
Maintain conversation state across calls
Route complex queries to a human for review
Use custom state to control its behavior
Rewind and explore alternative conversation paths
pip install -U langgraph
First, install the required packages:
%%capture --no-stderr
%pip install -U langgraph langsmith
# Used for this tutorial; not a requirement for LangGraph
%pip install -U langchain_anthropic
Optionally, we can set up LangSmith for best-in-class observability.
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY=lsv2_sk_...
pip install -U langgraph
pip install -U langsmith
pip install -U tiktoken langchain-cohere langchainhub chromadb langgraph tavily-python
pip install -U langchain_anthropic
pip install -U langchain-huggingface
pip install -U langgraph-checkpoint-sqlite
RAG
%%capture --no-stderr
pip install -U langchain_community tiktoken langchain-openai langchain-cohere langchainhub chromadb langchain langgraph tavily-python
What is RAG?
https://python.langchain.com/v0.2/docs/tutorials/rag/
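One of the most powerful applications of LLMs is the sophisticated question-answering chatbot. These applications can answer questions about specific source information, using a technique called Retrieval Augmented Generation (RAG).
LLMs can reason about a wide range of topics, but their knowledge is limited to the public data, up to a specific point in time, on which they were trained. To build AI applications that work with private data or with data created after the model's training cutoff, the model's knowledge has to be augmented with the specific information it needs. The process of bringing in the appropriate information and inserting it into the model prompt is called Retrieval Augmented Generation (RAG).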
Load: First we need to load our data. This is done with Document Loaders.
Split: Text splitters break large Documents into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won't fit in a model's finite context window.
Store: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a VectorStore and Embeddings model.
Retrieve: Given a user input, relevant splits are retrieved from storage using a Retriever.
Generate: A ChatModel / LLM produces an answer using a prompt that includes the question and the retrieved data
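In short: 1. split the documents into small chunks; 2. compute an embedding vector for each chunk; 3. compute an embedding vector for the question in the same way; 4. use vector search to pick out the related chunks and pass only those to the model, which filters out a large amount of irrelevant content.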
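The four steps can be tried end to end with a toy example that needs only the standard library. Everything here is a simplification for illustration: the "embedding" is a bag-of-words count vector rather than a real embedding model, the "vector store" is a plain list, and the names `split_text`, `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 20) -> list[str]:
    """Split: break a document into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, store: list[tuple[str, Counter]], k: int = 1) -> list[str]:
    """Retrieve: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


documents = [
    "LangGraph saves a checkpoint of the graph state after every step.",
    "Tavily is a search API that agents can call to look things up on the web.",
    "Chroma is a vector store used to index and search embedded text chunks.",
]

# Load + Split + Store: keep each chunk next to its vector.
store = [(chunk, embed(chunk)) for doc in documents for chunk in split_text(doc)]

question = "Which vector store can index text chunks?"
context = retrieve(question, store, k=1)

# Generate: only the retrieved chunk is placed in the prompt sent to the model.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```

Swapping `embed` for a real embedding model and the list for a vector store such as Chroma keeps the same shape: index once, then retrieve per question.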
Embeddings