Datawhale knowledge share
Author: 刘俊君, Datawhale member
Two sampling parameters to set: temperature & top_p (temperature controls the randomness of the output, top_p controls nucleus sampling; in practice it is usually recommended to tune one of the two rather than both).
Many advanced prompting techniques are designed to improve performance on complex tasks:
Few-shot prompts
Few-shot prompting allows us to provide examples in the prompt to steer the model toward better performance.
Q1: The odd numbers in this group add up to an even number: 4, 8, **9**, **15**, 12, 2, **1**.
A1: **The answer is False.**
Q2: The odd numbers in this group add up to an even number: **17**, 10, **19**, 4, 8, 12, 24.
A2: **The answer is True.**
Q3: The odd numbers in this group add up to an even number: 16, **11**, 14, 4, 8, **13**, 24.
A3: **The answer is True.**
Q4: The odd numbers in this group add up to an even number: **17**, **9**, 10, 12, **13**, 4, 2.
A4: **The answer is False.**
Q5: The odd numbers in this group add up to an even number: **15**, 32, **5**, **13**, 82, **7**, **1**.
A5: **The answer is True. [Wrong Answer!!!]**
Chain-of-Thought (CoT) prompting
Prompts can be further improved by instructing the model to reason about the task before responding.
Q1: The odd numbers in this group add up to an even number: 4, 8, **9**, **15**, 12, 2, **1**.
A1: Adding all the odd numbers (9, 15, 1) gives 25. **The answer is False. [Right Answer!]**
Q2: The odd numbers in this group add up to an even number: **15**, 32, **5**, **13**, 82, **7**, **1**.
A2: Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. **The answer is False. [Right Answer!]**
Zero-Shot CoT
Involves adding "Let's think step by step" to the original prompt.
Q1: I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman.
I then went and bought 5 more apples and ate 1. How many apples did I remain with?
**A1: 11 apples. [Wrong Answer!!!]**
Q2: I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman.
I then went and bought 5 more apples and ate 1. How many apples did I remain with? **Let's think step by step!**
**A2:** First, you started with 10 apples. You gave away 2 apples to the neighbor and 2 to the repairman, so you had 6 apples left.
Then you bought 5 more apples, so now you had 11 apples. Finally, you ate 1 apple, so **you would remain with 10 apples. [Right Answer!]**
Self-Consistency
The idea is to sample several diverse reasoning paths via few-shot CoT and then pick the most consistent (majority) answer; the original English explanation is more precise, and the steps below follow it:
Step 1: First execution (gets the wrong answer)
Q1: When I was 6 my sister was half my age. Now I'm 70 how old is my sister?
**A1: 35 [Wrong Answer!!!]**
Step 2: Provide guiding example questions to the AI
Q2: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees.
How many trees did the grove workers plant today?
A2: We start with 15 trees. Later we have 21 trees. The difference must be the number of trees they planted. So, they must have planted 21 - 15 = 6 trees.
**The answer is 6. [Right Answer!]**
Q3: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?
A3: There are 3 cars in the parking lot already. 2 more arrive. Now there are 3 + 2 = 5 cars.
**The answer is 5. [Right Answer!]**
Q4: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
A4: She bought 5 bagels for $3 each. This means she spent 5 × $3 = $15, so she has $23 - $15 = $8 left.
**The answer is 8. [Right Answer!]**
Step 3: Ask the first question again to get the correct answer
Q5: When I was 6 my sister was half my age. Now I'm 70 how old is my sister?
A5: When I was 6, my sister was half my age, so she was 3. Now I am 70, so she is 70 - 3 = 67.
**The answer is 67. [Right Answer!]**
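As a rough illustration of the voting idea, here is a minimal sketch that samples the same CoT prompt several times at a non-zero temperature and keeps the majority answer. It reuses the set_open_params / get_completion helpers defined later in this article, and extract_number is a hypothetical stand-in for parsing the final answer out of the model's reasoning:

```python
import re
from collections import Counter

cot_prompt = """Q: When I was 6 my sister was half my age. Now I'm 70 how old is my sister?
A: Let's think step by step."""

def extract_number(text):
    """Hypothetical helper: take the last number in the reasoning as the final answer."""
    numbers = re.findall(r"\d+", text)
    return numbers[-1] if numbers else None

params = set_open_params(temperature=0.7)  # temperature > 0 yields diverse reasoning paths
samples = [get_completion(params, cot_prompt).choices[0].text for _ in range(5)]
answers = [extract_number(s) for s in samples]
votes = Counter(a for a in answers if a is not None)
answer, count = votes.most_common(1)[0]
print(f"Majority answer: {answer} ({count}/{len(samples)} samples)")
```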
Knowledge Generation Prompting
The knowledge samples are then used to build knowledge-augmented questions, from which answer proposals are obtained.
**Input:** Greece is larger than mexico.
**Knowledge:** Greece is approximately 131957 sq km, while Mexico is approximately 1964375 sq km, making Mexico 1389% larger than Greece.
**Input:** A rock is the same size as a pebble.
**Knowledge:** A pebble is a clast of rock with a particle size of 4 to 64 millimetres based on the Udden-Wentworth scale of sedimentology.
Pebbles are generally considered larger than granules (2 to 4 millimetres diameter) and smaller than cobbles (64 to 256 millimetres diameter).
**Input:** Part of golf is trying to get a higher point total than others.
**Knowledge1:** The objective of golf is to play a set of holes in the least number of strokes.
A round of golf typically consists of 18 holes. Each hole is played once in the round on a standard golf course.
Each stroke is counted as one point, and the total number of strokes is used to determine the winner of the game.
**Knowledge2:** Golf is a precision club-and-ball sport in which competing players (or golfers)
use many types of clubs to hit balls into a series of holes on a course using the fewest number of strokes.
The goal is to complete the course with the lowest score, which is calculated by adding up the total number of strokes taken on each hole.
The player with the lowest score wins the game.
**Question:** Part of golf is trying to get a higher point total than others. Yes or No?
**Knowledge:** The objective of golf is...
**Explain and Answer:** **No,** the objective of golf is not to get a higher point total than others.
Rather, the objective is to play a set of holes in the least number of strokes.
The total number of strokes is used to determine the winner of the game, not the total number of points. **[High-confidence prediction]**
**Question:** Part of golf is trying to get a higher point total than others. Yes or No?
**Knowledge:** Golf is a precision club-and-ball sport in which competing players (or golfers) use many types of clubs to hit balls into...
**Explain and Answer:** **Yes,** part of golf is trying to get a higher point total than others.
Each player tries to complete the course with the lowest score, which is calculated by adding up the total number of strokes taken on each hole.
The player with the lowest score wins the game. **[Low-confidence prediction]**
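To make the two-stage flow concrete, here is a minimal sketch, again assuming the set_open_params / get_completion helpers defined later in this article: the first call asks the model to produce background knowledge about the statement, and the second call feeds that knowledge back in together with the question.

```python
params = set_open_params(temperature=0)
statement = "Part of golf is trying to get a higher point total than others."

# Stage 1: generate knowledge about the concepts in the statement
knowledge_prompt = f"Generate some knowledge about the concepts in the input.\n\nInput: {statement}\nKnowledge:"
knowledge = get_completion(params, knowledge_prompt).choices[0].text.strip()

# Stage 2: answer the question with the generated knowledge prepended
answer_prompt = f"Question: {statement} Yes or No?\n\nKnowledge: {knowledge}\n\nExplain and Answer:"
answer = get_completion(params, answer_prompt).choices[0].text.strip()
print(answer)
```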
🥳 Combining a notebook from the GitHub repository, ChatGPT Adversarial Prompting, with the video's explanation of this part, the summarized notes are as follows:
Tactic 1: Use delimiters
Triple quotes: """
Triple backticks: ```
Triple dashes: ---
Angle brackets: <>
XML tags: <tag></tag>
Tactic 2: Ask for structured output
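A small illustration of the structured-output tactic, assuming the single-argument get_completion(prompt) chat helper defined later in this article (the book-list task is just an illustrative placeholder):

```python
prompt = """
Generate a list of three made-up book titles along with their authors and genres.
Provide them in JSON format with the following keys: book_id, title, author, genre.
"""
response = get_completion(prompt)
print(response)
```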
Tactic 1: Specify the steps required to complete a task.
Step 1:...
Step 2:...
...
Step N:...
Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion.
In the video, a passage about how to make tea is given; using the framework above, a prompt is written to instruct the model to summarize the passage into a block of sequential steps, as sketched below.
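A sketch of such a prompt, assuming the single-argument get_completion(prompt) helper defined later in this article; the tea passage is a paraphrased placeholder for the text shown in the video:

```python
text = """
Making a cup of tea is easy. First, boil some water. While the water heats,
put a tea bag in a cup. Pour the hot water over the tea bag and let it steep
for a few minutes. Remove the tea bag, then add sugar or milk to taste.
"""

prompt = f"""
You will be provided with text delimited by <text> tags.
If it contains a sequence of instructions, rewrite those instructions in the following format:
Step 1 - ...
Step 2 - ...
...
Step N - ...
If the text does not contain a sequence of instructions, simply write "No steps provided."
<text>{text}</text>
"""
response = get_completion(prompt)
print(response)
```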
In the Guidelines video, the author shows how to get started with prompt engineering (PE); the environment-setup code given is as follows:
%%capture
# update or install the necessary libraries
!pip install --upgrade openai
!pip install --upgrade langchain
!pip install --upgrade python-dotenv
import openai
import os
import IPython
from langchain.llms import OpenAI
from dotenv import load_dotenv
The code above imports the components needed to work with the OpenAI API for PE.
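Note that the cell above only installs and imports the libraries; the API key still has to be configured before any call will succeed. A minimal sketch, continuing from the imports above and assuming the key is stored in a local .env file under the name OPENAI_API_KEY:

```python
# Load environment variables from .env and configure the OpenAI client (openai 0.x style)
load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
```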
Next, two functions are defined for the subsequent PE operations.
def set_open_params(
    model="text-davinci-003",
    temperature=0.7,
    max_tokens=256,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
):
    """Set OpenAI parameters."""
    openai_params = {}
    openai_params['model'] = model
    openai_params['temperature'] = temperature
    openai_params['max_tokens'] = max_tokens
    openai_params['top_p'] = top_p
    openai_params['frequency_penalty'] = frequency_penalty
    openai_params['presence_penalty'] = presence_penalty
    return openai_params
def get_completion(params, prompt):
    """Get a completion from the OpenAI API."""
    response = openai.Completion.create(
        engine=params['model'],
        prompt=prompt,
        temperature=params['temperature'],
        max_tokens=params['max_tokens'],
        top_p=params['top_p'],
        frequency_penalty=params['frequency_penalty'],
        presence_penalty=params['presence_penalty'],
    )
    return response
The set_open_params function defines a set of OpenAI API parameters, including the model, temperature, maximum tokens, top_p, frequency penalty, and presence penalty, stores them in a dictionary, and returns that dictionary.
The get_completion function takes the parameters set above plus an input prompt and calls the OpenAI API's Completion.create() method to obtain the API response: given the prompt and the other parameters, the method has the OpenAI API generate text, which is then returned.
The main purpose of these functions is to make it easier to set parameters and fetch results when working with the OpenAI API.
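Putting the two helpers together, a minimal usage sketch looks like this (the prompt text is only an illustrative placeholder):

```python
params = set_open_params(temperature=0.7)
prompt = "The sky is"
response = get_completion(params, prompt)
print(response.choices[0].text)
```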
Section 1.6 is explained in detail as follows:
"Reasoning" is the process of using logic and analysis to solve a problem. In prompt engineering, reasoning is applied to all kinds of problems, such as math or logic problems. It usually involves multiple steps and requires suitable algorithms and models. In the example above, the reasoning process decomposes the problem into separate steps and uses logic and algorithms to solve it.
For example, in the sample code given in this article:
prompt = """The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
Solve by breaking the problem into steps. First, identify the odd numbers, add them, and indicate whether the result is odd or even."""
response = get_completion(params, prompt)
IPython.display.Markdown(response.choices[0].text)
In the example above, the problem is broken down into three steps: first, identify the odd numbers; second, add them up; third, state whether the result is odd or even.
These steps help us understand the problem better and solve it more effectively. By decomposing a problem into separate steps, we can organize our thinking and make sure nothing is missed. In prompt engineering, this way of decomposing problems helps us use the APIs more effectively and better understand their outputs.
Part 2 mainly introduces several advanced prompting techniques, including few-shot prompts, Chain-of-Thought (CoT) prompting, Zero-shot CoT, Self-Consistency, Generate Knowledge Prompting, and PAL - Code as Reasoning. A brief summary of each follows:
This part demonstrates how to use LangChain to combine an LLM with external tools in order to answer questions that require web access or specific tools. OpenAI serves as the LLM, and the serpapi and llm-math tools are used to answer the question "Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?"
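A minimal sketch of what this agent setup looked like with the LangChain version current at the time of the course (it assumes OPENAI_API_KEY and SERPAPI_API_KEY are set in the environment; newer LangChain releases have since moved or renamed these imports):

```python
from langchain.agents import load_tools, initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
# serpapi gives the agent web search; llm-math gives it a calculator
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?")
```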
This part demonstrates how to use LangChain for data-augmented generation to get richer and more interesting results. First a dataset is needed; here the text of the State of the Union address is used. Chroma turns the text into vectors, which can then be used for similarity search. Next, the QAWithSourcesChain tool generates the questions and answers. In this example a custom template is used, with PromptTemplate and input_variables specifying the template and the variables to substitute. Finally, the chain is run to obtain the answer to the question.
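A rough sketch of that pipeline with the same era of LangChain, using the load_qa_with_sources_chain helper rather than building the QAWithSourcesChain directly, and omitting the custom PromptTemplate step described above for brevity (the file name and question are placeholders):

```python
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.chains.qa_with_sources import load_qa_with_sources_chain

# Split the State of the Union text into chunks and index them with Chroma
with open("state_of_the_union.txt") as f:
    raw_text = f.read()
texts = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0).split_text(raw_text)
docsearch = Chroma.from_texts(
    texts, OpenAIEmbeddings(), metadatas=[{"source": str(i)} for i in range(len(texts))]
)

# Retrieve the most relevant chunks and answer the question, citing sources
query = "What did the president say about Justice Breyer?"
docs = docsearch.similarity_search(query)
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
print(chain({"input_documents": docs, "question": query}, return_only_outputs=True))
```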
Reference link for this part:
https://python.langchain.com/en/latest/modules/agents/getting_started.html
From this 1.5-hour course it is easy to conclude that, at the current stage, prompt engineering is built on the API provided by OpenAI and on GPT-3.5-Turbo. Prompt-writing code can follow a template of the following form to implement the prompting process:
text = f"""
在这里输入原文语句,并用"\\"来进行分隔每一条语句
"""
prompt = f"""
在这里编写你的提示语句,并用"\\"来进行分隔每一条语句
```{text}```
"""
response = get_completion(prompt)
print(response)
Note: the delimiter (here, triple backticks ```) marks a clear boundary, preventing the model from mistakenly executing instruction-like text inside the text to be processed.
Once this is done, you can press Shift+Enter to run the cell and check whether the model's output meets your requirements.
First, in addition to the get_completion function defined above, one more function needs to be defined; the two functions are shown together below:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]
# The function below additionally exposes a temperature parameter, initialized to 0:
def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature,  # this is the degree of randomness of the model's output
    )
    # print(str(response.choices[0].message))
    return response.choices[0].message["content"]
Once these functions are defined, we can specify parameters in the call and print the response generated with the added parameters:
messages = [
    {'role': 'system', 'content': 'You are an assistant that speaks like Shakespeare.'},
    {'role': 'user', 'content': 'tell me a joke'},
    {'role': 'assistant', 'content': 'Why did the chicken cross the road'},
    {'role': 'user', 'content': "I don't know"},
]
response = get_completion_from_messages(messages, temperature=1)
print(response)
Note: this video uses building a chatbot as its example when showing the code that adds parameters such as temperature; in other scenarios, which parameters to add depends on the situation.
Video with Chinese subtitles:
https://www.bilibili.com/video/BV1Bo4y1A7FU/