DeepSeek LLM: Putting Answers Within Reach
Technical Background
Large language models are advancing rapidly in natural language processing. DeepSeek LLM is an advanced language model with 67 billion parameters, trained from scratch on a 2-trillion-token corpus of mixed English and Chinese text. To support further research, the 7B/67B base and chat models have been open-sourced.
Implementation Steps
Model Download
- Huggingface: the DeepSeek LLM 7B/67B base and chat models can be downloaded from Huggingface (a download sketch follows the S3 commands below).
- Intermediate checkpoints: intermediate training checkpoints can be downloaded from AWS S3 using the AWS CLI, for example:
# DeepSeek-LLM-7B-Base
aws s3 cp s3://deepseek-ai/DeepSeek-LLM/DeepSeek-LLM-7B-Base <local_path> --recursive --request-payer
# DeepSeek-LLM-67B-Base
aws s3 cp s3://deepseek-ai/DeepSeek-LLM/DeepSeek-LLM-67B-Base <local_path> --recursive --request-payer
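For the Huggingface route, one convenient option is the huggingface_hub client. The snippet below is a minimal sketch rather than part of the official instructions: it assumes the huggingface_hub package is installed, uses the public repo id deepseek-ai/deepseek-llm-7b-base (the other sizes follow the same naming pattern), and the local directory is purely illustrative.
from huggingface_hub import snapshot_download

# Download the full 7B base model repository into a local directory.
# repo_id is the public Huggingface repo; local_dir is an illustrative path.
snapshot_download(
    repo_id="deepseek-ai/deepseek-llm-7b-base",
    local_dir="./deepseek-llm-7b-base",
)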
Quick Start
Install Dependencies
In a Python >= 3.8 environment, run the following command to install the dependencies:
pip install -r requirements.txt
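As a quick sanity check after installation (a minimal sketch, not part of the official instructions), you can confirm that torch and transformers import correctly and whether a GPU is visible:
import torch
import transformers

# Print library versions and GPU availability to verify the environment.
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())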
Inference with Huggingface's Transformers
- Text completion:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "deepseek-ai/deepseek-llm-67b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the weights in bfloat16 and let accelerate place them across the available devices.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
model.generation_config = GenerationConfig.from_pretrained(model_name)
model.generation_config.pad_token_id = model.generation_config.eos_token_id

text = "An attention function can be described as mapping a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors. The output is"
inputs = tokenizer(text, return_tensors="pt")
# Generate a completion (capped here at 100 new tokens) and decode it back to text.
outputs = model.generate(**inputs.to(model.device), max_new_tokens=100)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)