I. Overview of MCP (Model Context Protocol)
https://mcp-docs.cn/introduction
https://docs.mcpcn.org/introduction
MCP (Model Context Protocol) is a protocol that standardizes how models interact with context, with the goal of improving a model's ability to understand and carry out complex tasks. Through structured data exchange and dynamic context management, it improves model performance in multi-turn dialogue, task decomposition, and adaptation to different environments.
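To make this concrete, here is a rough illustration (my own sketch, not taken from the docs linked above) of what an MCP tools/call request looks like on the wire; MCP frames its messages as JSON-RPC 2.0, shown here as a Python dict. The tool name and arguments are purely illustrative.

# Hypothetical MCP "tools/call" request, framed as JSON-RPC 2.0
tools_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "add",                   # a tool registered by some MCP server
        "arguments": {"a": 3, "b": 5},   # arguments matching that tool's input schema
    },
}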
II. Installing and Configuring Ollama 0.6.8
Ollama is an open-source project focused on running and deploying large language models (LLMs) locally. It simplifies model management and interaction, letting users download, run, and chat with a range of pre-trained models from the command line, without complex configuration or any dependency on cloud services.
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | bash
# Allow remote access: add the following line under the [Service] section
vim /etc/systemd/system/ollama.service
Environment="OLLAMA_HOST=0.0.0.0"
# Reload systemd and restart the service
sudo systemctl daemon-reload
sudo systemctl restart ollama
# Pull and run the qwen3:8b model
ollama run qwen3:8b
# Visiting http://localhost:11434/ should return "Ollama is running"
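Beyond opening that URL in a browser, a quick sanity check from Python looks like this (a minimal sketch using the requests library; the /api/tags endpoint lists the models Ollama has pulled locally):

import requests

# Ask the local Ollama server which models are available
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
print([m["name"] for m in resp.json()["models"]])  # e.g. ['qwen3:8b']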
III. Integrating Ollama with LangChain
LangChain is an open-source framework for building applications powered by language models (LLMs). It provides modular components and tool chains that streamline LLM application development and support complex tasks such as data-aware and interactive decision-making workflows. Its core strength lies in integrating external data sources, APIs, and other tools to make LLMs more capable and practical.
Example code
# https://python.langchain.com/docs/integrations/llms/ollama/
pip install -U langchain-ollama
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
template = """Question: {question}
Answer: Let's think step by step."""
prompt = ChatPromptTemplate.from_template(template)
model = OllamaLLM(model="llama3.1")  # any locally pulled model works, e.g. qwen3:8b
chain = prompt | model
print(chain.invoke({"question": "What is LangChain?"}))
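OllamaLLM is the plain text-completion interface; langchain_ollama also provides ChatOllama, the chat interface relied on for the tool-calling examples later in this post. A minimal sketch:

from langchain_ollama import ChatOllama

chat_model = ChatOllama(model="qwen3:8b")
# invoke() returns an AIMessage; .content holds the generated text
print(chat_model.invoke("What is LangChain?").content)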
Streaming output
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
import asyncio
model = OllamaLLM(model="qwen3:8b")
prompt = ChatPromptTemplate.from_template("告诉我一个关于 {topic} 的笑话")
parser = StrOutputParser()
chain = prompt | model | parser
async def run_agent():
    async for chunk in chain.astream({"topic": "鹦鹉"}):
        print(chunk, end="|", flush=True)

if __name__ == "__main__":
    asyncio.run(run_agent())
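If you don't need asyncio, the same chain can also be consumed synchronously with stream() (a short sketch reusing the chain defined above):

# Synchronous counterpart of the astream() loop above
for chunk in chain.stream({"topic": "鹦鹉"}):
    print(chunk, end="|", flush=True)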
IV. Deploying with LangServe
LangServe is a tool developed by the LangChain team for quickly deploying LangChain applications. It simplifies packaging LangChain pipelines (chains, agents, or retrieval-augmented generation systems) as APIs, supporting REST and WebSocket endpoints so they are easy to integrate into production environments.
# https://python.langchain.com/docs/langserve/#overview
pip install "langserve[all]"
from fastapi import FastAPI
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
from langserve import add_routes

app = FastAPI(
    title="LangChain Server",
    version="1.0",
    description="A simple api server using Langchain's Runnable interfaces",
)

model = OllamaLLM(model="qwen3:8b")
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")

add_routes(
    app,
    prompt | model,
    path="/qwen3",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
After starting the LangServe service, visit the following URL:
http://localhost:8000/qwen3/playground/
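Besides the playground, the route can also be called programmatically; a minimal sketch using langserve's RemoteRunnable (REST endpoints such as /qwen3/invoke and /qwen3/stream are generated automatically as well):

from langserve import RemoteRunnable

# Talk to the chain mounted at /qwen3 as if it were a local runnable
remote_chain = RemoteRunnable("http://localhost:8000/qwen3/")
print(remote_chain.invoke({"topic": "parrots"}))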
V. Using MCP in LangChain
Docs: https://github.langchain.ac.cn/langgraphjs/agents/mcp/#custom-mcp-servers
1. Calling a local MCP server
pip install langchain-mcp-adapters
Create a local MCP server that exposes add and multiply tools:
# server.py 创建服务器
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("Math")
@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

if __name__ == "__main__":
    mcp.run(transport="stdio")
Create an MCP client that loads the add and multiply tools:
# client.py 创建客户端
# Create server parameters for stdio connection
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
from langchain_ollama import ChatOllama
import asyncio
model = ChatOllama(model='qwen3:8b')
server_params = StdioServerParameters(
    command="python",
    # Make sure to update to the full absolute path to your server.py file
    args=["server.py"],
)

async def run_agent():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # Get tools
            tools = await load_mcp_tools(session)

            # Create and run the agent
            agent = create_react_agent(model, tools)
            agent_response = await agent.ainvoke({"messages": "what's (3 + 5) x 12?"})
            return agent_response

# Run the async function
if __name__ == "__main__":
    result = asyncio.run(run_agent())
    print(result)
Sample output:
... calculation of (3 + 5) x 12 is as follows:\n\n1. **Addition**: 3 + 5 = 8 \n2. **Multiplication**: 8 x 12 = 96 \n\nFinal result: **96**" ...
2. Calling remote MCP servers
pip install langchain-mcp-tools
Configure the MCP servers via mcp_configs:
# Standard library imports
import asyncio
import logging
import os
import sys
# Third-party imports
try:
    from dotenv import load_dotenv
    from langchain.chat_models import init_chat_model
    from langchain.schema import HumanMessage
    from langchain_ollama import ChatOllama
    from langgraph.prebuilt import create_react_agent
except ImportError as e:
    print(f'\nError: Required package not found: {e}')
    print('Please ensure all required packages are installed\n')
    sys.exit(1)

# Local application imports
from langchain_mcp_tools import convert_mcp_to_langchain_tools

# A very simple logger
def init_logger() -> logging.Logger:
    logging.basicConfig(
        level=logging.INFO,  # logging.DEBUG,
        format='\x1b[90m[%(levelname)s]\x1b[0m %(message)s'
    )
    return logging.getLogger()

async def run() -> None:
    # Be sure to set ANTHROPIC_API_KEY and/or OPENAI_API_KEY as needed
    # load_dotenv()

    # Check the api key early to avoid showing a confusing long trace
    # if not os.environ.get('ANTHROPIC_API_KEY'):
    #     raise Exception('ANTHROPIC_API_KEY env var needs to be set')
    # if not os.environ.get('OPENAI_API_KEY'):
    #     raise Exception('OPENAI_API_KEY env var needs to be set')

    cleanup = None  # initialise so the finally block is safe even if setup fails
    try:
        mcp_configs = {
            'filesystem': {
                'command': 'npx',
                'args': [
                    '-y',
                    '@modelcontextprotocol/server-filesystem',
                    '.'  # path to a directory to allow access to
                ]
            },
            'fetch': {
                'command': 'uvx',
                'args': [
                    'mcp-server-fetch'
                ]
            },
            'weather': {
                'command': 'npx',
                'args': [
                    '-y',
                    '@h1deya/mcp-server-weather'
                ]
            },
        }

        tools, cleanup = await convert_mcp_to_langchain_tools(
            mcp_configs,
            init_logger()
        )

        llm = ChatOllama(model='qwen3:8b')

        agent = create_react_agent(
            llm,
            tools
        )

        # query = 'Read the news headlines on bbc.com'
        # query = 'Read and briefly summarize the LICENSE file'
        query = "Tomorrow's weather in San Francisco 37°47′N 122°25′W?"

        print('\x1b[33m')  # color to yellow
        print(query)
        print('\x1b[0m')  # reset the color

        messages = [HumanMessage(content=query)]

        result = await agent.ainvoke({'messages': messages})

        # the last message should be an AIMessage
        response = result['messages'][-1].content

        print('\x1b[36m')  # color to cyan
        print(response)
        print('\x1b[0m')  # reset the color

    finally:
        if cleanup is not None:
            await cleanup()

def main() -> None:
    asyncio.run(run())

if __name__ == '__main__':
    main()
Sample output:
As expected, the agent calls the provided weather tool and folds the tool result into its final response.
3. SSE transport (deprecated)
The SSE (Server-Sent Events) transport in the MCP protocol is an HTTP-based communication mechanism, used mainly to stream data from the server to the client.
# remote_server.py
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("Weather")
@mcp.tool()
async def get_weather(location: str) -> str:
    """Get weather for location."""
    return "It's always sunny in Dong Guan"

if __name__ == "__main__":
    mcp.run(transport="sse")
The server selects SSE with transport="sse"; the corresponding client configuration is:
mcp_configs = {
    # ...
    'weather': {
        "url": "http://localhost:8000/sse"
    },
}
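If you prefer the session-style API used in the stdio example, the mcp SDK also ships an SSE client; a sketch under the assumption that the server above is running on port 8000:

from mcp import ClientSession
from mcp.client.sse import sse_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
from langchain_ollama import ChatOllama
import asyncio

async def run_sse_agent():
    # Connect to the remote MCP server over SSE and expose its tools to the agent
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await load_mcp_tools(session)
            agent = create_react_agent(ChatOllama(model="qwen3:8b"), tools)
            return await agent.ainvoke({"messages": "what is the weather in Dong Guan?"})

if __name__ == "__main__":
    print(asyncio.run(run_sse_agent()))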
The run behaves the same way as before: the agent calls the get_weather tool and returns its result.
4. Streamable HTTP (recommended)
- https://github.com/langchain-ai/langchain-mcp-adapters?tab=readme-ov-file#streamable-http
- https://github.com/langchain-ai/langchain-mcp-adapters/tree/main/examples/servers/streamable-http-stateless
The LangChain team shipped support for the MCP protocol on May 9, so let's git clone the official repository:
git clone https://github.com/langchain-ai/langchain-mcp-adapters.git
Let's take a look at the official StreamableHTTP stateless server implementation:
import contextlib
import logging
from collections.abc import AsyncIterator
import anyio
import click
import mcp.types as types
from mcp.server.lowlevel import Server
from mcp.server.streamable_http_manager import StreamableHTTPSessionManager
from starlette.applications import Starlette
from starlette.routing import Mount
from starlette.types import Receive, Scope, Send
logger = logging.getLogger(__name__)
@click.command()
@click.option("--port", default=3000, help="Port to listen on for HTTP")
@click.option(
    "--log-level",
    default="INFO",
    help="Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)",
)
@click.option(
    "--json-response",
    is_flag=True,
    default=False,
    help="Enable JSON responses instead of SSE streams",
)
def main(
    port: int,
    log_level: str,
    json_response: bool,
) -> int:
    # Configure logging
    logging.basicConfig(
        level=getattr(logging, log_level.upper()),
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    )

    app = Server("mcp-streamable-http-stateless-demo")

    @app.call_tool()
    async def call_tool(
        name: str, arguments: dict
    ) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
        if name == "add":
            return [
                types.TextContent(
                    type="text",
                    text=str(arguments["a"] + arguments["b"])
                )
            ]
        elif name == "multiply":
            return [
                types.TextContent(
                    type="text",
                    text=str(arguments["a"] * arguments["b"])
                )
            ]
        else:
            raise ValueError(f"Tool {name} not found")

    @app.list_tools()
    async def list_tools() -> list[types.Tool]:
        return [
            types.Tool(
                name="add",
                description="Adds two numbers",
                inputSchema={
                    "type": "object",
                    "required": ["a", "b"],
                    "properties": {
                        "a": {
                            "type": "number",
                            "description": "First number to add",
                        },
                        "b": {
                            "type": "number",
                            "description": "Second number to add",
                        },
                    },
                },
            ),
            types.Tool(
                name="multiply",
                description="Multiplies two numbers",
                inputSchema={
                    "type": "object",
                    "required": ["a", "b"],
                    "properties": {
                        "a": {
                            "type": "number",
                            "description": "First number to multiply",
                        },
                        "b": {
                            "type": "number",
                            "description": "Second number to multiply",
                        },
                    },
                },
            )
        ]

    # Create the session manager with true stateless mode
    session_manager = StreamableHTTPSessionManager(
        app=app,
        event_store=None,
        json_response=json_response,
        stateless=True,
    )

    async def handle_streamable_http(
        scope: Scope, receive: Receive, send: Send
    ) -> None:
        await session_manager.handle_request(scope, receive, send)

    @contextlib.asynccontextmanager
    async def lifespan(app: Starlette) -> AsyncIterator[None]:
        """Context manager for session manager."""
        async with session_manager.run():
            logger.info("Application started with StreamableHTTP session manager!")
            try:
                yield
            finally:
                logger.info("Application shutting down...")

    # Create an ASGI application using the transport
    starlette_app = Starlette(
        debug=True,
        routes=[
            Mount("/mcp", app=handle_streamable_http),
        ],
        lifespan=lifespan,
    )

    import uvicorn

    uvicorn.run(starlette_app, host="0.0.0.0", port=port)

    return 0
Run it:
cd examples/servers/streamable-http-stateless/
uv run mcp-simple-streamablehttp-stateless --port 3000
If downloads are slow (uv fetches a standalone Python build from GitHub), you can point pyproject.toml at a domestic mirror:
[tool.uv]
python-install-mirror = "https://mirror.nju.edu.cn/github-release/indygreg/python-build-standalone"
The MCP server exposes add and multiply tools for addition and multiplication. Next, let's see how to consume this MCP service from LangChain:
# Use server from examples/servers/streamable-http-stateless/
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client
from langgraph.prebuilt import create_react_agent
from langchain_mcp_adapters.tools import load_mcp_tools
from langchain_ollama import ChatOllama
import asyncio
async def main():
    async with streamablehttp_client("http://localhost:3000/mcp/") as (read, write, _):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()
            # Get tools
            tools = await load_mcp_tools(session)
            agent = create_react_agent(ChatOllama(model='qwen3:8b'), tools)
            math_response = await agent.ainvoke({"messages": "what's (3 + 5) x 12?"})
            print(math_response)

if __name__ == "__main__":
    result = asyncio.run(main())
{'messages': [HumanMessage(content="what's (3 + 5) x 12?", additional_kwargs={}, response_metadata={}, id='22337a14-75ed-4a61-a02c-1486fa8974c5'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'qwen3:8b', 'created_at': '2025-06-19T08:33:22.695597268Z', 'done': True, 'done_reason': 'stop', 'total_duration': 26858898158, 'load_duration': 10999225461, 'prompt_eval_count': 228, 'prompt_eval_duration': 670528285, 'eval_count': 230, 'eval_duration': 15184855426, 'model_name': 'qwen3:8b'}, id='run--70310f6f-ec20-44c1-9050-8cf17d32e68a-0', tool_calls=[{'name': 'multiply', 'args': {'a': 8, 'b': 12}, 'id': 'bfd69092-437f-48ef-adfc-5c75c32cd8f8', 'type': 'tool_call'}], usage_metadata={'input_tokens': 228, 'output_tokens': 230, 'total_tokens': 458}), ToolMessage(content='96', name='multiply', id='06688efe-1c64-4d69-a278-05979f2a867a', tool_call_id='bfd69092-437f-48ef-adfc-5c75c32cd8f8'), AIMessage(content="<think>\nOkay, let's see. The user asked for (3 + 5) x 12. First, I need to calculate the part inside the parentheses. So 3 plus 5 is 8. Then, multiply that result by 12. The function calls earlier added 3 and 5 using the add function, which gave 8. Then the multiply function was called with 8 and 12, resulting in 96. The final answer should be 96. I should present that clearly.\n</think>\n\nThe result of (3 + 5) x 12 is **96**.", additional_kwargs={}, response_metadata={'model': 'qwen3:8b', 'created_at': '2025-06-19T08:33:31.548524938Z', 'done': True, 'done_reason': 'stop', 'total_duration': 8843978237, 'load_duration': 8652049, 'prompt_eval_count': 266, 'prompt_eval_duration': 183717735, 'eval_count': 132, 'eval_duration': 8630192086, 'model_name': 'qwen3:8b'}, id='run--55834732-e92e-41c2-9e0f-10fee4047d97-0', usage_metadata={'input_tokens': 266, 'output_tokens': 132, 'total_tokens': 398})]}
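The raw result printed above is the full LangGraph message state; if you only want the model's final answer, pick the last message inside main(), e.g.:

# Inside main(), after agent.ainvoke(...):
final_answer = math_response["messages"][-1].content
print(final_answer)  # e.g. "... The result of (3 + 5) x 12 is **96**."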
Here we used streamablehttp_client to integrate the MCP service; an alternative is MultiServerMCPClient, as shown below:
# Use server from examples/servers/streamable-http-stateless/
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
client = MultiServerMCPClient(
    {
        "math": {
            "transport": "streamable_http",
            "url": "http://localhost:3000/mcp/"
        },
    }
)
# Note: the awaits below must run inside an async function / event loop
tools = await client.get_tools()
agent = create_react_agent("openai:gpt-4.1", tools)
math_response = await agent.ainvoke({"messages": "what's (3 + 5) x 12?"})
Besides the example from the repo above, we can also build a custom MCP server with FastMCP:
# math_server.py
...
# weather_server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Weather")

@mcp.tool()
async def get_weather(location: str) -> str:
    """Get weather for location."""
    return "It's always sunny in New York"

if __name__ == "__main__":
    mcp.run(transport="streamable-http")
python weather_server.py
INFO: Started server process [30224]
INFO: Waiting for application startup.
[06/19/25 16:37:14] INFO     StreamableHTTP session manager started    streamable_http_manager.py:109
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
Then use MultiServerMCPClient to integrate both MCP services:
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_ollama import ChatOllama
import asyncio
async def main():
    client = MultiServerMCPClient(
        {
            "math": {
                "command": "python",
                # Make sure to update to the full absolute path to your math_server.py file
                "args": ["/path/to/math_server.py"],
                "transport": "stdio",
            },
            "weather": {
                # make sure you start your weather server on port 8000
                "url": "http://localhost:8000/mcp/",
                "transport": "streamable_http",
            }
        }
    )
    tools = await client.get_tools()
    agent = create_react_agent(ChatOllama(model='qwen3:8b'), tools)
    math_response = await agent.ainvoke({"messages": "what's (3 + 5) x 12?"})
    print(math_response)
    weather_response = await agent.ainvoke({"messages": "what is the weather in nyc?"})
    print(weather_response)

if __name__ == "__main__":
    result = asyncio.run(main())
{'messages': [HumanMessage(content="what's (3 + 5) x 12?", additional_kwargs={}, response_metadata={}, id='dde18208-c9c1-4d90-a58d-a897ae1b6080'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'qwen3:8b', 'created_at': '2025-06-19T08:38:30.70719712Z', 'done': True, 'done_reason': 'stop', 'total_duration': 16285607307, 'load_duration': 8801381, 'prompt_eval_count': 256, 'prompt_eval_duration': 269246118, 'eval_count': 239, 'eval_duration': 16005904597, 'model_name': 'qwen3:8b'}, id='run--fcd8dc57-e774-4f8b-a35c-409ec9d715ad-0', tool_calls=[{'name': 'add', 'args': {'a': 3, 'b': 5}, 'id': 'b2d3196b-c2db-4475-b2d8-fd221718e873', 'type': 'tool_call'}, {'name': 'multiply', 'args': {'a': 8, 'b': 12}, 'id': '0dce672f-79ba-4d41-9f5d-22462e31cfbe', 'type': 'tool_call'}], usage_metadata={'input_tokens': 256, 'output_tokens': 239, 'total_tokens': 495}), ToolMessage(content='8', name='add', id='54112f89-2f0f-4982-a6b4-d4ade13a8ba0', tool_call_id='b2d3196b-c2db-4475-b2d8-fd221718e873'), ToolMessage(content='96', name='multiply', id='6099fcc4-9d79-4223-adad-6b5ea2c9312b', tool_call_id='0dce672f-79ba-4d41-9f5d-22462e31cfbe'), AIMessage(content='<think>\nOkay, let me see. The user asked for (3 + 5) x 12. First, I need to calculate the addition inside the parentheses. 3 plus 5 is 8. Then, multiply that result by 12. So 8 times 12 equals 96. The tool calls were correct: adding 3 and 5 gives 8, then multiplying by 12 gives 96. The final answer should be 96.\n</think>\n\nThe result of (3 + 5) x 12 is **96**.', additional_kwargs={}, response_metadata={'model': 'qwen3:8b', 'created_at': '2025-06-19T08:38:39.268130105Z', 'done': True, 'done_reason': 'stop', 'total_duration': 8278964925, 'load_duration': 9629508, 'prompt_eval_count': 314, 'prompt_eval_duration': 180112026, 'eval_count': 124, 'eval_duration': 8078692315, 'model_name': 'qwen3:8b'}, id='run--85119c6f-9fe8-4b78-b1eb-942d477a44a4-0', usage_metadata={'input_tokens': 314, 'output_tokens': 124, 'total_tokens': 438})]}
{'messages': [HumanMessage(content='what is the weather in nyc?', additional_kwargs={}, response_metadata={}, id='5cf42beb-e267-4aae-8d8e-87cf10bb8fdd'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'qwen3:8b', 'created_at': '2025-06-19T08:38:46.236066744Z', 'done': True, 'done_reason': 'stop', 'total_duration': 6964833975, 'load_duration': 8413825, 'prompt_eval_count': 251, 'prompt_eval_duration': 120079606, 'eval_count': 105, 'eval_duration': 6823059476, 'model_name': 'qwen3:8b'}, id='run--3018d8b4-0bde-4fa7-b82c-a079b8f29780-0', tool_calls=[{'name': 'get_weather', 'args': {'location': 'nyc'}, 'id': 'd85cdedc-ba53-47b4-8d48-82716b80041d', 'type': 'tool_call'}], usage_metadata={'input_tokens': 251, 'output_tokens': 105, 'total_tokens': 356}), ToolMessage(content="It's always sunny in New York", name='get_weather', id='d60f8d31-7da7-444d-8e95-7b51b14cd555', tool_call_id='d85cdedc-ba53-47b4-8d48-82716b80041d'), AIMessage(content='<think>\nOkay, the user asked, "what is the weather in nyc?" I used the get_weather function with location set to "nyc". The response came back as "It\'s always sunny in New York". Now I need to present this information clearly.\n\nFirst, I should confirm the location they asked about. They mentioned "nyc", which I assume is New York City. The response from the tool says it\'s sunny. I should state that directly. Maybe add a friendly note about the weather being nice. Keep it simple and conversational. Make sure there\'s no markdown and the response is natural.\n</think>\n\nThe weather in New York City is currently sunny! 🌞 Enjoy the nice weather!', additional_kwargs={}, response_metadata={'model': 'qwen3:8b', 'created_at': '2025-06-19T08:38:55.983704295Z', 'done': True, 'done_reason': 'stop', 'total_duration': 9720316915, 'load_duration': 21684064, 'prompt_eval_count': 291, 'prompt_eval_duration': 167254030, 'eval_count': 146, 'eval_duration': 9523544486, 'model_name': 'qwen3:8b'}, id='run--640d7ad1-ac0f-4d5c-a4fe-b704d55facf8-0', usage_metadata={'input_tokens': 291, 'output_tokens': 146, 'total_tokens': 437})]}
With this setup you can migrate directly from a legacy SSE service to a streamable HTTP service, which is very convenient.
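Concretely, the switch usually comes down to two small edits; a sketch (names and ports follow the weather example above):

# Client side: the old SSE entry vs. the streamable HTTP entry
sse_style = {
    "weather": {"url": "http://localhost:8000/sse", "transport": "sse"},
}
streamable_http_style = {
    "weather": {"url": "http://localhost:8000/mcp/", "transport": "streamable_http"},
}
# Server side: the only change is
#   mcp.run(transport="streamable-http")  instead of  mcp.run(transport="sse")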